author: dim <dim@FreeBSD.org> 2016-12-26 20:36:37 +0000
committer: dim <dim@FreeBSD.org> 2016-12-26 20:36:37 +0000
commit: 06210ae42d418d50d8d9365d5c9419308ae9e7ee
tree: ab60b4cdd6e430dda1f292a46a77ddb744723f31 /contrib/llvm/tools/clang/lib/Format
parent: 2dd166267f53df1c3748b4325d294b9b839de74b
MFC r309124:

Upgrade our copies of clang, llvm, lldb, compiler-rt and libc++ to 3.9.0
release, and add lld 3.9.0. Also completely revamp the build system for
clang, llvm, lldb and their related tools.

Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
support to build; see UPDATING for more information.

Release notes for llvm, clang and lld are available here:
<http://llvm.org/releases/3.9.0/docs/ReleaseNotes.html>
<http://llvm.org/releases/3.9.0/tools/clang/docs/ReleaseNotes.html>
<http://llvm.org/releases/3.9.0/tools/lld/docs/ReleaseNotes.html>

Thanks to Ed Maste, Bryan Drewery, Andrew Turner, Antoine Brodin and Jan
Beich for their help.

Relnotes: yes

MFC r309147:

Pull in r282174 from upstream llvm trunk (by Krzysztof Parzyszek):

  [PPC] Set SP after loading data from stack frame, if no red zone is
  present

  Follow-up to r280705: Make sure that the SP is only restored after
  all data is loaded from the stack frame, if there is no red zone.

  This completes the fix for
  https://llvm.org/bugs/show_bug.cgi?id=26519.

  Differential Revision: https://reviews.llvm.org/D24466

Reported by: Mark Millard
PR: 214433

MFC r309149:

Pull in r283060 from upstream llvm trunk (by Hal Finkel):

  [PowerPC] Refactor soft-float support, and enable PPC64 soft float

  This change enables soft-float for PowerPC64, and also makes
  soft-float disable all vector instruction sets for both 32-bit and
  64-bit modes. This latter part is necessary because the PPC backend
  canonicalizes many Altivec vector types to floating-point types, and
  so soft-float breaks scalarization support for many operations. Both
  for embedded targets and for operating-system kernels desiring
  soft-float support, it seems reasonable that disabling hardware
  floating-point also disables vector instructions (embedded targets
  without hardware floating point support are unlikely to have Altivec,
  etc. and operating system kernels desiring not to use floating-point
  registers to lower syscall cost are unlikely to want to use vector
  registers either). If someone needs this to work, we'll need to
  change the fact that we promote many Altivec operations to act on
  v4f32. To make it possible to disable Altivec when soft-float is
  enabled, hardware floating-point support needs to be expressed as a
  positive feature, like the others, and not a negative feature,
  because target features cannot have dependencies on the disabling of
  some other feature. So +soft-float has now become -hard-float.

  Fixes PR26970.

Pull in r283061 from upstream clang trunk (by Hal Finkel):

  [PowerPC] Enable soft-float for PPC64, and +soft-float -> -hard-float

  Enable soft-float support on PPC64, as the backend now supports it.
  Also, the backend now uses -hard-float instead of +soft-float, so set
  the target features accordingly.

  Fixes PR26970.

Reported by: Mark Millard
PR: 214433

MFC r309212:

Add a few missed clang 3.9.0 files to OptionalObsoleteFiles.

MFC r309262:

Fix packaging for clang, lldb and lld 3.9.0

During the upgrade of clang/llvm etc to 3.9.0 in r309124, the PACKAGE
directive in the usr.bin/clang/*.mk files got dropped accidentally.

Restore it, with a few minor changes and additions:
* Correct license in clang.ucl to NCSA
* Add PACKAGE=clang for clang and most of the "ll" tools
* Put lldb in its own package
* Put lld in its own package

Reviewed by: gjb, jmallett
Differential Revision: https://reviews.freebsd.org/D8666

MFC r309656:

During the bootstrap phase, when building the minimal llvm library on
PowerPC, add lib/Support/Atomic.cpp. This is needed because upstream
llvm revision r271821 disabled the use of std::call_once, which causes
some fallback functions from Atomic.cpp to be used instead.

Reported by: Mark Millard
PR: 214902

MFC r309835:

Tentatively apply https://reviews.llvm.org/D18730 to work around gcc PR
70528 (bogus error: constructor required before non-static data member).
This should fix buildworld with the external gcc package.

Reported by: https://jenkins.freebsd.org/job/FreeBSD_HEAD_amd64_gcc/

MFC r310194:

Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to
3.9.1 release.

Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
support to build; see UPDATING for more information.

Release notes for llvm, clang and lld will be available here:
<http://releases.llvm.org/3.9.1/docs/ReleaseNotes.html>
<http://releases.llvm.org/3.9.1/tools/clang/docs/ReleaseNotes.html>
<http://releases.llvm.org/3.9.1/tools/lld/docs/ReleaseNotes.html>

Relnotes: yes

Diffstat (limited to 'contrib/llvm/tools/clang/lib/Format')
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/AffectedRangeManager.cpp | 150
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/AffectedRangeManager.h | 67
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/ContinuationIndenter.cpp | 72
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/ContinuationIndenter.h | 5
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/Encoding.h | 1
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/Format.cpp | 1642
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/FormatToken.cpp | 1
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/FormatToken.h | 59
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/FormatTokenLexer.cpp | 597
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/FormatTokenLexer.h | 97
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/SortJavaScriptImports.cpp | 442
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/SortJavaScriptImports.h | 36
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/TokenAnalyzer.cpp | 138
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/TokenAnalyzer.h | 108
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/TokenAnnotator.cpp | 231
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/TokenAnnotator.h | 22
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/UnwrappedLineFormatter.cpp | 8
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/UnwrappedLineParser.cpp | 186
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/UnwrappedLineParser.h | 1
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/WhitespaceManager.cpp | 32
-rw-r--r-- contrib/llvm/tools/clang/lib/Format/WhitespaceManager.h | 4
21 files changed, 2817 insertions, 1082 deletions
diff --git a/contrib/llvm/tools/clang/lib/Format/AffectedRangeManager.cpp b/contrib/llvm/tools/clang/lib/Format/AffectedRangeManager.cpp
new file mode 100644
index 0000000..5d4df19
--- /dev/null
+++ b/contrib/llvm/tools/clang/lib/Format/AffectedRangeManager.cpp
@@ -0,0 +1,150 @@
+//===--- AffectedRangeManager.cpp - Format C++ code -----------------------===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// \brief This file implements AffectRangeManager class.
+///
+//===----------------------------------------------------------------------===//
+
+#include "AffectedRangeManager.h"
+
+#include "FormatToken.h"
+#include "TokenAnnotator.h"
+
+namespace clang {
+namespace format {
+
+bool AffectedRangeManager::computeAffectedLines(
+ SmallVectorImpl<AnnotatedLine *>::iterator I,
+ SmallVectorImpl<AnnotatedLine *>::iterator E) {
+ bool SomeLineAffected = false;
+ const AnnotatedLine *PreviousLine = nullptr;
+ while (I != E) {
+ AnnotatedLine *Line = *I;
+ Line->LeadingEmptyLinesAffected = affectsLeadingEmptyLines(*Line->First);
+
+ // If a line is part of a preprocessor directive, it needs to be formatted
+ // if any token within the directive is affected.
+ if (Line->InPPDirective) {
+ FormatToken *Last = Line->Last;
+ SmallVectorImpl<AnnotatedLine *>::iterator PPEnd = I + 1;
+ while (PPEnd != E && !(*PPEnd)->First->HasUnescapedNewline) {
+ Last = (*PPEnd)->Last;
+ ++PPEnd;
+ }
+
+ if (affectsTokenRange(*Line->First, *Last,
+ /*IncludeLeadingNewlines=*/false)) {
+ SomeLineAffected = true;
+ markAllAsAffected(I, PPEnd);
+ }
+ I = PPEnd;
+ continue;
+ }
+
+ if (nonPPLineAffected(Line, PreviousLine))
+ SomeLineAffected = true;
+
+ PreviousLine = Line;
+ ++I;
+ }
+ return SomeLineAffected;
+}
+
+bool AffectedRangeManager::affectsCharSourceRange(
+ const CharSourceRange &Range) {
+ for (SmallVectorImpl<CharSourceRange>::const_iterator I = Ranges.begin(),
+ E = Ranges.end();
+ I != E; ++I) {
+ if (!SourceMgr.isBeforeInTranslationUnit(Range.getEnd(), I->getBegin()) &&
+ !SourceMgr.isBeforeInTranslationUnit(I->getEnd(), Range.getBegin()))
+ return true;
+ }
+ return false;
+}
+
+bool AffectedRangeManager::affectsTokenRange(const FormatToken &First,
+ const FormatToken &Last,
+ bool IncludeLeadingNewlines) {
+ SourceLocation Start = First.WhitespaceRange.getBegin();
+ if (!IncludeLeadingNewlines)
+ Start = Start.getLocWithOffset(First.LastNewlineOffset);
+ SourceLocation End = Last.getStartOfNonWhitespace();
+ End = End.getLocWithOffset(Last.TokenText.size());
+ CharSourceRange Range = CharSourceRange::getCharRange(Start, End);
+ return affectsCharSourceRange(Range);
+}
+
+bool AffectedRangeManager::affectsLeadingEmptyLines(const FormatToken &Tok) {
+ CharSourceRange EmptyLineRange = CharSourceRange::getCharRange(
+ Tok.WhitespaceRange.getBegin(),
+ Tok.WhitespaceRange.getBegin().getLocWithOffset(Tok.LastNewlineOffset));
+ return affectsCharSourceRange(EmptyLineRange);
+}
+
+void AffectedRangeManager::markAllAsAffected(
+ SmallVectorImpl<AnnotatedLine *>::iterator I,
+ SmallVectorImpl<AnnotatedLine *>::iterator E) {
+ while (I != E) {
+ (*I)->Affected = true;
+ markAllAsAffected((*I)->Children.begin(), (*I)->Children.end());
+ ++I;
+ }
+}
+
+bool AffectedRangeManager::nonPPLineAffected(
+ AnnotatedLine *Line, const AnnotatedLine *PreviousLine) {
+ bool SomeLineAffected = false;
+ Line->ChildrenAffected =
+ computeAffectedLines(Line->Children.begin(), Line->Children.end());
+ if (Line->ChildrenAffected)
+ SomeLineAffected = true;
+
+ // Stores whether one of the line's tokens is directly affected.
+ bool SomeTokenAffected = false;
+ // Stores whether we need to look at the leading newlines of the next token
+ // in order to determine whether it was affected.
+ bool IncludeLeadingNewlines = false;
+
+ // Stores whether the first child line of any of this line's tokens is
+ // affected.
+ bool SomeFirstChildAffected = false;
+
+ for (FormatToken *Tok = Line->First; Tok; Tok = Tok->Next) {
+ // Determine whether 'Tok' was affected.
+ if (affectsTokenRange(*Tok, *Tok, IncludeLeadingNewlines))
+ SomeTokenAffected = true;
+
+ // Determine whether the first child of 'Tok' was affected.
+ if (!Tok->Children.empty() && Tok->Children.front()->Affected)
+ SomeFirstChildAffected = true;
+
+ IncludeLeadingNewlines = Tok->Children.empty();
+ }
+
+ // Was this line moved, i.e. has it previously been on the same line as an
+ // affected line?
+ bool LineMoved = PreviousLine && PreviousLine->Affected &&
+ Line->First->NewlinesBefore == 0;
+
+ bool IsContinuedComment =
+ Line->First->is(tok::comment) && Line->First->Next == nullptr &&
+ Line->First->NewlinesBefore < 2 && PreviousLine &&
+ PreviousLine->Affected && PreviousLine->Last->is(tok::comment);
+
+ if (SomeTokenAffected || SomeFirstChildAffected || LineMoved ||
+ IsContinuedComment) {
+ Line->Affected = true;
+ SomeLineAffected = true;
+ }
+ return SomeLineAffected;
+}
+
+} // namespace format
+} // namespace clang
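
Reader's note (not part of the commit): the intersection test in AffectedRangeManager::affectsCharSourceRange above treats two ranges as overlapping unless one of them ends before the other begins. The sketch below restates that condition outside of clang. It assumes plain integer offsets in place of SourceLocation and SourceManager::isBeforeInTranslationUnit; the names OffsetRange, rangesTouch and affectsAny are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for CharSourceRange: offsets into a single buffer.
struct OffsetRange {
  unsigned Begin;
  unsigned End;
};

// Same shape as the condition in affectsCharSourceRange: the ranges are
// considered intersecting unless one ends strictly before the other begins.
bool rangesTouch(const OffsetRange &A, const OffsetRange &B) {
  assert(A.Begin <= A.End && B.Begin <= B.End && "ranges must be well-formed");
  return !(A.End < B.Begin) && !(B.End < A.Begin);
}

// Returns true if Range intersects at least one of the input ranges.
bool affectsAny(const std::vector<OffsetRange> &Ranges,
                const OffsetRange &Range) {
  for (const OffsetRange &R : Ranges)
    if (rangesTouch(Range, R))
      return true;
  return false;
}
```

Because both comparisons are non-strict, a query that merely shares an endpoint with an input range counts as affected in this sketch: with the input range {10, 20}, the query {20, 30} is reported as affected, while {21, 30} is not.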
diff --git a/contrib/llvm/tools/clang/lib/Format/AffectedRangeManager.h b/contrib/llvm/tools/clang/lib/Format/AffectedRangeManager.h
new file mode 100644
index 0000000..d8d5ee5
--- /dev/null
+++ b/contrib/llvm/tools/clang/lib/Format/AffectedRangeManager.h
@@ -0,0 +1,67 @@
+//===--- AffectedRangeManager.h - Format C++ code ---------------*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// \brief AffectedRangeManager class manages affected ranges in the code.
+///
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_LIB_FORMAT_AFFECTEDRANGEMANAGER_H
+#define LLVM_CLANG_LIB_FORMAT_AFFECTEDRANGEMANAGER_H
+
+#include "clang/Basic/SourceManager.h"
+
+namespace clang {
+namespace format {
+
+struct FormatToken;
+class AnnotatedLine;
+
+class AffectedRangeManager {
+public:
+ AffectedRangeManager(const SourceManager &SourceMgr,
+ const ArrayRef<CharSourceRange> Ranges)
+ : SourceMgr(SourceMgr), Ranges(Ranges.begin(), Ranges.end()) {}
+
+ // Determines which lines are affected by the SourceRanges given as input.
+ // Returns \c true if at least one line between I and E or one of their
+ // children is affected.
+ bool computeAffectedLines(SmallVectorImpl<AnnotatedLine *>::iterator I,
+ SmallVectorImpl<AnnotatedLine *>::iterator E);
+
+ // Returns true if 'Range' intersects with one of the input ranges.
+ bool affectsCharSourceRange(const CharSourceRange &Range);
+
+private:
+ // Returns true if the range from 'First' to 'Last' intersects with one of the
+ // input ranges.
+ bool affectsTokenRange(const FormatToken &First, const FormatToken &Last,
+ bool IncludeLeadingNewlines);
+
+ // Returns true if one of the input ranges intersect the leading empty lines
+ // before 'Tok'.
+ bool affectsLeadingEmptyLines(const FormatToken &Tok);
+
+ // Marks all lines between I and E as well as all their children as affected.
+ void markAllAsAffected(SmallVectorImpl<AnnotatedLine *>::iterator I,
+ SmallVectorImpl<AnnotatedLine *>::iterator E);
+
+ // Determines whether 'Line' is affected by the SourceRanges given as input.
+ // Returns \c true if line or one if its children is affected.
+ bool nonPPLineAffected(AnnotatedLine *Line,
+ const AnnotatedLine *PreviousLine);
+
+ const SourceManager &SourceMgr;
+ const SmallVector<CharSourceRange, 8> Ranges;
+};
+
+} // namespace format
+} // namespace clang
+
+#endif // LLVM_CLANG_LIB_FORMAT_AFFECTEDRANGEMANAGER_H
diff --git a/contrib/llvm/tools/clang/lib/Format/ContinuationIndenter.cpp b/contrib/llvm/tools/clang/lib/Format/ContinuationIndenter.cpp
index b820f53..322969e 100644
--- a/contrib/llvm/tools/clang/lib/Format/ContinuationIndenter.cpp
+++ b/contrib/llvm/tools/clang/lib/Format/ContinuationIndenter.cpp
@@ -64,7 +64,7 @@ static bool startsNextParameter(const FormatToken &Current,
ContinuationIndenter::ContinuationIndenter(const FormatStyle &Style,
const AdditionalKeywords &Keywords,
- SourceManager &SourceMgr,
+ const SourceManager &SourceMgr,
WhitespaceManager &Whitespaces,
encoding::Encoding Encoding,
bool BinPackInconclusiveFunctions)
@@ -151,6 +151,7 @@ bool ContinuationIndenter::mustBreak(const LineState &State) {
return true;
if ((startsNextParameter(Current, Style) || Previous.is(tok::semi) ||
(Previous.is(TT_TemplateCloser) && Current.is(TT_StartOfName) &&
+ Style.Language == FormatStyle::LK_Cpp &&
// FIXME: This is a temporary workaround for the case where clang-format
// sets BreakBeforeParameter to avoid bin packing and this creates a
// completely unnecessary line break after a template type that isn't
@@ -249,7 +250,7 @@ bool ContinuationIndenter::mustBreak(const LineState &State) {
// If the return type spans multiple lines, wrap before the function name.
if ((Current.is(TT_FunctionDeclarationName) ||
(Current.is(tok::kw_operator) && !Previous.is(tok::coloncolon))) &&
- State.Stack.back().BreakBeforeParameter)
+ !Previous.is(tok::kw_template) && State.Stack.back().BreakBeforeParameter)
return true;
if (startsSegmentOfBuilderTypeCall(Current) &&
@@ -352,9 +353,20 @@ void ContinuationIndenter::addTokenOnCurrentLine(LineState &State, bool DryRun,
// disallowing any further line breaks if there is no line break after the
// opening parenthesis. Don't break if it doesn't conserve columns.
if (Style.AlignAfterOpenBracket == FormatStyle::BAS_AlwaysBreak &&
- Previous.is(tok::l_paren) && State.Column > getNewLineColumn(State) &&
+ Previous.isOneOf(tok::l_paren, TT_TemplateOpener, tok::l_square) &&
+ State.Column > getNewLineColumn(State) &&
(!Previous.Previous ||
- !Previous.Previous->isOneOf(tok::kw_for, tok::kw_while, tok::kw_switch)))
+ !Previous.Previous->isOneOf(tok::kw_for, tok::kw_while,
+ tok::kw_switch)) &&
+ // Don't do this for simple (no expressions) one-argument function calls
+ // as that feels like needlessly wasting whitespace, e.g.:
+ //
+ // caaaaaaaaaaaall(
+ // caaaaaaaaaaaall(
+ // caaaaaaaaaaaall(
+ // caaaaaaaaaaaaaaaaaaaaaaall(aaaaaaaaaaaaaa, aaaaaaaaa))));
+ Current.FakeLParens.size() > 0 &&
+ Current.FakeLParens.back() > prec::Unknown)
State.Stack.back().NoLineBreak = true;
if (Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign &&
@@ -400,9 +412,9 @@ void ContinuationIndenter::addTokenOnCurrentLine(LineState &State, bool DryRun,
(Previous.isNot(tok::lessless) || Previous.OperatorIndex != 0 ||
Previous.NextOperator)) ||
Current.StartsBinaryExpression)) {
- // Always indent relative to the RHS of the expression unless this is a
- // simple assignment without binary expression on the RHS. Also indent
- // relative to unary operators and the colons of constructor initializers.
+ // Indent relative to the RHS of the expression unless this is a simple
+ // assignment without binary expression on the RHS. Also indent relative to
+ // unary operators and the colons of constructor initializers.
State.Stack.back().LastSpace = State.Column;
} else if (Previous.is(TT_InheritanceColon)) {
State.Stack.back().Indent = State.Column;
@@ -464,10 +476,13 @@ unsigned ContinuationIndenter::addTokenOnNewLine(LineState &State,
// // code
// }
//
- // is common and should be formatted like a free-standing function.
- if (Style.Language != FormatStyle::LK_JavaScript ||
- Current.NestingLevel != 0 || !PreviousNonComment->is(tok::equal) ||
- !Current.is(Keywords.kw_function))
+ // is common and should be formatted like a free-standing function. The same
+ // goes for wrapping before the lambda return type arrow.
+ if (!Current.is(TT_LambdaArrow) &&
+ (Style.Language != FormatStyle::LK_JavaScript ||
+ Current.NestingLevel != 0 || !PreviousNonComment ||
+ !PreviousNonComment->is(tok::equal) ||
+ !Current.isOneOf(Keywords.kw_async, Keywords.kw_function)))
State.Stack.back().NestedBlockIndent = State.Column;
if (NextNonComment->isMemberAccess()) {
@@ -529,6 +544,12 @@ unsigned ContinuationIndenter::addTokenOnNewLine(LineState &State,
if (!Current.isTrailingComment())
State.Stack.back().LastSpace = State.Column;
+ if (Current.is(tok::lessless))
+ // If we are breaking before a "<<", we always want to indent relative to
+ // RHS. This is necessary only for "<<", as we special-case it and don't
+ // always indent relative to the RHS.
+ State.Stack.back().LastSpace += 3; // 3 -> width of "<< ".
+
State.StartOfLineLevel = Current.NestingLevel;
State.LowestLevelOnLine = Current.NestingLevel;
@@ -703,11 +724,15 @@ unsigned ContinuationIndenter::moveStateToNextToken(LineState &State,
if (Current.is(TT_ArraySubscriptLSquare) &&
State.Stack.back().StartOfArraySubscripts == 0)
State.Stack.back().StartOfArraySubscripts = State.Column;
- if ((Current.is(tok::question) && Style.BreakBeforeTernaryOperators) ||
- (Current.getPreviousNonComment() && Current.isNot(tok::colon) &&
- Current.getPreviousNonComment()->is(tok::question) &&
- !Style.BreakBeforeTernaryOperators))
+ if (Style.BreakBeforeTernaryOperators && Current.is(tok::question))
State.Stack.back().QuestionColumn = State.Column;
+ if (!Style.BreakBeforeTernaryOperators && Current.isNot(tok::colon)) {
+ const FormatToken *Previous = Current.Previous;
+ while (Previous && Previous->isTrailingComment())
+ Previous = Previous->Previous;
+ if (Previous && Previous->is(tok::question))
+ State.Stack.back().QuestionColumn = State.Column;
+ }
if (!Current.opensScope() && !Current.closesScope())
State.LowestLevelOnLine =
std::min(State.LowestLevelOnLine, Current.NestingLevel);
@@ -835,7 +860,7 @@ void ContinuationIndenter::moveStatePastFakeLParens(LineState &State,
// there is a line-break right after the operator.
// Exclude relational operators, as there, it is always more desirable to
// have the LHS 'left' of the RHS.
- if (Previous && Previous->getPrecedence() > prec::Assignment &&
+ if (Previous && Previous->getPrecedence() != prec::Assignment &&
Previous->isOneOf(TT_BinaryOperator, TT_ConditionalExpr) &&
Previous->getPrecedence() != prec::Relational) {
bool BreakBeforeOperator =
@@ -857,7 +882,8 @@ void ContinuationIndenter::moveStatePastFakeLParens(LineState &State,
// ParameterToInnerFunction));
if (*I > prec::Unknown)
NewParenState.LastSpace = std::max(NewParenState.LastSpace, State.Column);
- if (*I != prec::Conditional && !Current.is(TT_UnaryOperator))
+ if (*I != prec::Conditional && !Current.is(TT_UnaryOperator) &&
+ Style.AlignAfterOpenBracket != FormatStyle::BAS_DontAlign)
NewParenState.StartOfFunctionCall = State.Column;
// Always indent conditional expressions. Never indent expression where
@@ -1022,6 +1048,9 @@ void ContinuationIndenter::moveStateToNewBlock(LineState &State) {
unsigned ContinuationIndenter::addMultilineToken(const FormatToken &Current,
LineState &State) {
+ if (!Current.IsMultiline)
+ return 0;
+
// Break before further function parameters on all levels.
for (unsigned i = 0, e = State.Stack.size(); i != e; ++i)
State.Stack[i].BreakBeforeParameter = true;
@@ -1060,7 +1089,8 @@ unsigned ContinuationIndenter::breakProtrudingToken(const FormatToken &Current,
// FIXME: String literal breaking is currently disabled for Java and JS, as
// it requires strings to be merged using "+" which we don't support.
if (Style.Language == FormatStyle::LK_Java ||
- Style.Language == FormatStyle::LK_JavaScript)
+ Style.Language == FormatStyle::LK_JavaScript ||
+ !Style.BreakStringLiterals)
return 0;
// Don't break string literals inside preprocessor directives (except for
@@ -1100,10 +1130,10 @@ unsigned ContinuationIndenter::breakProtrudingToken(const FormatToken &Current,
} else {
return 0;
}
- } else if (Current.is(TT_BlockComment) && Current.isTrailingComment()) {
- if (!Style.ReflowComments ||
+ } else if (Current.is(TT_BlockComment)) {
+ if (!Current.isTrailingComment() || !Style.ReflowComments ||
CommentPragmasRegex.match(Current.TokenText.substr(2)))
- return 0;
+ return addMultilineToken(Current, State);
Token.reset(new BreakableBlockComment(
Current, State.Line->Level, StartColumn, Current.OriginalColumn,
!Current.Previous, State.Line->InPPDirective, Encoding, Style));
diff --git a/contrib/llvm/tools/clang/lib/Format/ContinuationIndenter.h b/contrib/llvm/tools/clang/lib/Format/ContinuationIndenter.h
index 9b9154e..21ad653 100644
--- a/contrib/llvm/tools/clang/lib/Format/ContinuationIndenter.h
+++ b/contrib/llvm/tools/clang/lib/Format/ContinuationIndenter.h
@@ -38,7 +38,8 @@ public:
/// column \p FirstIndent.
ContinuationIndenter(const FormatStyle &Style,
const AdditionalKeywords &Keywords,
- SourceManager &SourceMgr, WhitespaceManager &Whitespaces,
+ const SourceManager &SourceMgr,
+ WhitespaceManager &Whitespaces,
encoding::Encoding Encoding,
bool BinPackInconclusiveFunctions);
@@ -137,7 +138,7 @@ private:
FormatStyle Style;
const AdditionalKeywords &Keywords;
- SourceManager &SourceMgr;
+ const SourceManager &SourceMgr;
WhitespaceManager &Whitespaces;
encoding::Encoding Encoding;
bool BinPackInconclusiveFunctions;
diff --git a/contrib/llvm/tools/clang/lib/Format/Encoding.h b/contrib/llvm/tools/clang/lib/Format/Encoding.h
index 592d720..148f7fd 100644
--- a/contrib/llvm/tools/clang/lib/Format/Encoding.h
+++ b/contrib/llvm/tools/clang/lib/Format/Encoding.h
@@ -17,6 +17,7 @@
#define LLVM_CLANG_LIB_FORMAT_ENCODING_H
#include "clang/Basic/LLVM.h"
+#include "llvm/ADT/StringRef.h"
#include "llvm/Support/ConvertUTF.h"
#include "llvm/Support/Unicode.h"
diff --git a/contrib/llvm/tools/clang/lib/Format/Format.cpp b/contrib/llvm/tools/clang/lib/Format/Format.cpp
index 2689368..32d6bb8 100644
--- a/contrib/llvm/tools/clang/lib/Format/Format.cpp
+++ b/contrib/llvm/tools/clang/lib/Format/Format.cpp
@@ -14,7 +14,11 @@
//===----------------------------------------------------------------------===//
#include "clang/Format/Format.h"
+#include "AffectedRangeManager.h"
#include "ContinuationIndenter.h"
+#include "FormatTokenLexer.h"
+#include "SortJavaScriptImports.h"
+#include "TokenAnalyzer.h"
#include "TokenAnnotator.h"
#include "UnwrappedLineFormatter.h"
#include "UnwrappedLineParser.h"
@@ -22,6 +26,7 @@
#include "clang/Basic/Diagnostic.h"
#include "clang/Basic/DiagnosticOptions.h"
#include "clang/Basic/SourceManager.h"
+#include "clang/Basic/VirtualFileSystem.h"
#include "clang/Lex/Lexer.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/Support/Allocator.h"
@@ -29,6 +34,8 @@
#include "llvm/Support/Path.h"
#include "llvm/Support/Regex.h"
#include "llvm/Support/YAMLTraits.h"
+#include <algorithm>
+#include <memory>
#include <queue>
#include <string>
@@ -68,6 +75,16 @@ template <> struct ScalarEnumerationTraits<FormatStyle::UseTabStyle> {
IO.enumCase(Value, "Always", FormatStyle::UT_Always);
IO.enumCase(Value, "true", FormatStyle::UT_Always);
IO.enumCase(Value, "ForIndentation", FormatStyle::UT_ForIndentation);
+ IO.enumCase(Value, "ForContinuationAndIndentation",
+ FormatStyle::UT_ForContinuationAndIndentation);
+ }
+};
+
+template <> struct ScalarEnumerationTraits<FormatStyle::JavaScriptQuoteStyle> {
+ static void enumeration(IO &IO, FormatStyle::JavaScriptQuoteStyle &Value) {
+ IO.enumCase(Value, "Leave", FormatStyle::JSQS_Leave);
+ IO.enumCase(Value, "Single", FormatStyle::JSQS_Single);
+ IO.enumCase(Value, "Double", FormatStyle::JSQS_Double);
}
};
@@ -275,6 +292,9 @@ template <> struct MappingTraits<FormatStyle> {
Style.BreakBeforeTernaryOperators);
IO.mapOptional("BreakConstructorInitializersBeforeComma",
Style.BreakConstructorInitializersBeforeComma);
+ IO.mapOptional("BreakAfterJavaFieldAnnotations",
+ Style.BreakAfterJavaFieldAnnotations);
+ IO.mapOptional("BreakStringLiterals", Style.BreakStringLiterals);
IO.mapOptional("ColumnLimit", Style.ColumnLimit);
IO.mapOptional("CommentPragmas", Style.CommentPragmas);
IO.mapOptional("ConstructorInitializerAllOnOneLineOrOnePerLine",
@@ -289,10 +309,13 @@ template <> struct MappingTraits<FormatStyle> {
Style.ExperimentalAutoDetectBinPacking);
IO.mapOptional("ForEachMacros", Style.ForEachMacros);
IO.mapOptional("IncludeCategories", Style.IncludeCategories);
+ IO.mapOptional("IncludeIsMainRegex", Style.IncludeIsMainRegex);
IO.mapOptional("IndentCaseLabels", Style.IndentCaseLabels);
IO.mapOptional("IndentWidth", Style.IndentWidth);
IO.mapOptional("IndentWrappedFunctionNames",
Style.IndentWrappedFunctionNames);
+ IO.mapOptional("JavaScriptQuotes", Style.JavaScriptQuotes);
+ IO.mapOptional("JavaScriptWrapImports", Style.JavaScriptWrapImports);
IO.mapOptional("KeepEmptyLinesAtTheStartOfBlocks",
Style.KeepEmptyLinesAtTheStartOfBlocks);
IO.mapOptional("MacroBlockBegin", Style.MacroBlockBegin);
@@ -488,8 +511,9 @@ FormatStyle getLLVMStyle() {
LLVMStyle.BreakBeforeBraces = FormatStyle::BS_Attach;
LLVMStyle.BraceWrapping = {false, false, false, false, false, false,
false, false, false, false, false};
- LLVMStyle.BreakConstructorInitializersBeforeComma = false;
LLVMStyle.BreakAfterJavaFieldAnnotations = false;
+ LLVMStyle.BreakConstructorInitializersBeforeComma = false;
+ LLVMStyle.BreakStringLiterals = true;
LLVMStyle.ColumnLimit = 80;
LLVMStyle.CommentPragmas = "^ IWYU pragma:";
LLVMStyle.ConstructorInitializerAllOnOneLineOrOnePerLine = false;
@@ -504,9 +528,12 @@ FormatStyle getLLVMStyle() {
LLVMStyle.IncludeCategories = {{"^\"(llvm|llvm-c|clang|clang-c)/", 2},
{"^(<|\"(gtest|isl|json)/)", 3},
{".*", 1}};
+ LLVMStyle.IncludeIsMainRegex = "$";
LLVMStyle.IndentCaseLabels = false;
LLVMStyle.IndentWrappedFunctionNames = false;
LLVMStyle.IndentWidth = 2;
+ LLVMStyle.JavaScriptQuotes = FormatStyle::JSQS_Leave;
+ LLVMStyle.JavaScriptWrapImports = true;
LLVMStyle.TabWidth = 8;
LLVMStyle.MaxEmptyLinesToKeep = 1;
LLVMStyle.KeepEmptyLinesAtTheStartOfBlocks = true;
@@ -518,6 +545,7 @@ FormatStyle getLLVMStyle() {
LLVMStyle.SpacesBeforeTrailingComments = 1;
LLVMStyle.Standard = FormatStyle::LS_Cpp11;
LLVMStyle.UseTab = FormatStyle::UT_Never;
+ LLVMStyle.JavaScriptQuotes = FormatStyle::JSQS_Leave;
LLVMStyle.ReflowComments = true;
LLVMStyle.SpacesInParentheses = false;
LLVMStyle.SpacesInSquareBrackets = false;
@@ -555,6 +583,7 @@ FormatStyle getGoogleStyle(FormatStyle::LanguageKind Language) {
GoogleStyle.ConstructorInitializerAllOnOneLineOrOnePerLine = true;
GoogleStyle.DerivePointerAlignment = true;
GoogleStyle.IncludeCategories = {{"^<.*\\.h>", 1}, {"^<.*", 2}, {".*", 3}};
+ GoogleStyle.IncludeIsMainRegex = "([-_](test|unittest))?$";
GoogleStyle.IndentCaseLabels = true;
GoogleStyle.KeepEmptyLinesAtTheStartOfBlocks = false;
GoogleStyle.ObjCSpaceAfterProperty = false;
@@ -583,9 +612,12 @@ FormatStyle getGoogleStyle(FormatStyle::LanguageKind Language) {
GoogleStyle.AllowShortFunctionsOnASingleLine = FormatStyle::SFS_Inline;
GoogleStyle.AlwaysBreakBeforeMultilineStrings = false;
GoogleStyle.BreakBeforeTernaryOperators = false;
- GoogleStyle.CommentPragmas = "@(export|visibility) {";
+ GoogleStyle.CommentPragmas = "@(export|requirecss|return|see|visibility) ";
GoogleStyle.MaxEmptyLinesToKeep = 3;
+ GoogleStyle.NamespaceIndentation = FormatStyle::NI_All;
GoogleStyle.SpacesInContainerLiterals = false;
+ GoogleStyle.JavaScriptQuotes = FormatStyle::JSQS_Single;
+ GoogleStyle.JavaScriptWrapImports = false;
} else if (Language == FormatStyle::LK_Proto) {
GoogleStyle.AllowShortFunctionsOnASingleLine = FormatStyle::SFS_None;
GoogleStyle.SpacesInContainerLiterals = false;
@@ -759,734 +791,35 @@ std::string configurationAsText(const FormatStyle &Style) {
namespace {
-class FormatTokenLexer {
-public:
- FormatTokenLexer(SourceManager &SourceMgr, FileID ID, FormatStyle &Style,
- encoding::Encoding Encoding)
- : FormatTok(nullptr), IsFirstToken(true), GreaterStashed(false),
- LessStashed(false), Column(0), TrailingWhitespace(0),
- SourceMgr(SourceMgr), ID(ID), Style(Style),
- IdentTable(getFormattingLangOpts(Style)), Keywords(IdentTable),
- Encoding(Encoding), FirstInLineIndex(0), FormattingDisabled(false),
- MacroBlockBeginRegex(Style.MacroBlockBegin),
- MacroBlockEndRegex(Style.MacroBlockEnd) {
- Lex.reset(new Lexer(ID, SourceMgr.getBuffer(ID), SourceMgr,
- getFormattingLangOpts(Style)));
- Lex->SetKeepWhitespaceMode(true);
-
- for (const std::string &ForEachMacro : Style.ForEachMacros)
- ForEachMacros.push_back(&IdentTable.get(ForEachMacro));
- std::sort(ForEachMacros.begin(), ForEachMacros.end());
- }
-
- ArrayRef<FormatToken *> lex() {
- assert(Tokens.empty());
- assert(FirstInLineIndex == 0);
- do {
- Tokens.push_back(getNextToken());
- if (Style.Language == FormatStyle::LK_JavaScript)
- tryParseJSRegexLiteral();
- tryMergePreviousTokens();
- if (Tokens.back()->NewlinesBefore > 0 || Tokens.back()->IsMultiline)
- FirstInLineIndex = Tokens.size() - 1;
- } while (Tokens.back()->Tok.isNot(tok::eof));
- return Tokens;
- }
-
- const AdditionalKeywords &getKeywords() { return Keywords; }
-
-private:
- void tryMergePreviousTokens() {
- if (tryMerge_TMacro())
- return;
- if (tryMergeConflictMarkers())
- return;
- if (tryMergeLessLess())
- return;
-
- if (Style.Language == FormatStyle::LK_JavaScript) {
- if (tryMergeTemplateString())
- return;
-
- static const tok::TokenKind JSIdentity[] = {tok::equalequal, tok::equal};
- static const tok::TokenKind JSNotIdentity[] = {tok::exclaimequal,
- tok::equal};
- static const tok::TokenKind JSShiftEqual[] = {tok::greater, tok::greater,
- tok::greaterequal};
- static const tok::TokenKind JSRightArrow[] = {tok::equal, tok::greater};
- // FIXME: Investigate what token type gives the correct operator priority.
- if (tryMergeTokens(JSIdentity, TT_BinaryOperator))
- return;
- if (tryMergeTokens(JSNotIdentity, TT_BinaryOperator))
- return;
- if (tryMergeTokens(JSShiftEqual, TT_BinaryOperator))
- return;
- if (tryMergeTokens(JSRightArrow, TT_JsFatArrow))
- return;
- }
- }
-
- bool tryMergeLessLess() {
- // Merge X,less,less,Y into X,lessless,Y unless X or Y is less.
- if (Tokens.size() < 3)
- return false;
-
- bool FourthTokenIsLess = false;
- if (Tokens.size() > 3)
- FourthTokenIsLess = (Tokens.end() - 4)[0]->is(tok::less);
-
- auto First = Tokens.end() - 3;
- if (First[2]->is(tok::less) || First[1]->isNot(tok::less) ||
- First[0]->isNot(tok::less) || FourthTokenIsLess)
- return false;
-
- // Only merge if there currently is no whitespace between the two "<".
- if (First[1]->WhitespaceRange.getBegin() !=
- First[1]->WhitespaceRange.getEnd())
- return false;
-
- First[0]->Tok.setKind(tok::lessless);
- First[0]->TokenText = "<<";
- First[0]->ColumnWidth += 1;
- Tokens.erase(Tokens.end() - 2);
- return true;
- }
-
- bool tryMergeTokens(ArrayRef<tok::TokenKind> Kinds, TokenType NewType) {
- if (Tokens.size() < Kinds.size())
- return false;
-
- SmallVectorImpl<FormatToken *>::const_iterator First =
- Tokens.end() - Kinds.size();
- if (!First[0]->is(Kinds[0]))
- return false;
- unsigned AddLength = 0;
- for (unsigned i = 1; i < Kinds.size(); ++i) {
- if (!First[i]->is(Kinds[i]) ||
- First[i]->WhitespaceRange.getBegin() !=
- First[i]->WhitespaceRange.getEnd())
- return false;
- AddLength += First[i]->TokenText.size();
- }
- Tokens.resize(Tokens.size() - Kinds.size() + 1);
- First[0]->TokenText = StringRef(First[0]->TokenText.data(),
- First[0]->TokenText.size() + AddLength);
- First[0]->ColumnWidth += AddLength;
- First[0]->Type = NewType;
- return true;
- }
-
- // Returns \c true if \p Tok can only be followed by an operand in JavaScript.
- bool precedesOperand(FormatToken *Tok) {
- // NB: This is not entirely correct, as an r_paren can introduce an operand
- // location in e.g. `if (foo) /bar/.exec(...);`. That is a rare enough
- // corner case to not matter in practice, though.
- return Tok->isOneOf(tok::period, tok::l_paren, tok::comma, tok::l_brace,
- tok::r_brace, tok::l_square, tok::semi, tok::exclaim,
- tok::colon, tok::question, tok::tilde) ||
- Tok->isOneOf(tok::kw_return, tok::kw_do, tok::kw_case, tok::kw_throw,
- tok::kw_else, tok::kw_new, tok::kw_delete, tok::kw_void,
- tok::kw_typeof, Keywords.kw_instanceof,
- Keywords.kw_in) ||
- Tok->isBinaryOperator();
- }
-
- bool canPrecedeRegexLiteral(FormatToken *Prev) {
- if (!Prev)
- return true;
-
- // Regex literals can only follow after prefix unary operators, not after
- // postfix unary operators. If the '++' is followed by a non-operand
- // introducing token, the slash here is the operand and not the start of a
- // regex.
- if (Prev->isOneOf(tok::plusplus, tok::minusminus))
- return (Tokens.size() < 3 || precedesOperand(Tokens[Tokens.size() - 3]));
-
- // The previous token must introduce an operand location where regex
- // literals can occur.
- if (!precedesOperand(Prev))
- return false;
-
- return true;
- }
-
- // Tries to parse a JavaScript Regex literal starting at the current token,
- // if that begins with a slash and is in a location where JavaScript allows
- // regex literals. Changes the current token to a regex literal and updates
- // its text if successful.
- void tryParseJSRegexLiteral() {
- FormatToken *RegexToken = Tokens.back();
- if (!RegexToken->isOneOf(tok::slash, tok::slashequal))
- return;
-
- FormatToken *Prev = nullptr;
- for (auto I = Tokens.rbegin() + 1, E = Tokens.rend(); I != E; ++I) {
- // NB: Because previous pointers are not initialized yet, this cannot use
- // Token.getPreviousNonComment.
- if ((*I)->isNot(tok::comment)) {
- Prev = *I;
- break;
- }
- }
-
- if (!canPrecedeRegexLiteral(Prev))
- return;
-
- // 'Manually' lex ahead in the current file buffer.
- const char *Offset = Lex->getBufferLocation();
- const char *RegexBegin = Offset - RegexToken->TokenText.size();
- StringRef Buffer = Lex->getBuffer();
- bool InCharacterClass = false;
- bool HaveClosingSlash = false;
- for (; !HaveClosingSlash && Offset != Buffer.end(); ++Offset) {
- // Regular expressions are terminated with a '/', which can only be
- // escaped using '\' or a character class between '[' and ']'.
- // See http://www.ecma-international.org/ecma-262/5.1/#sec-7.8.5.
- switch (*Offset) {
- case '\\':
- // Skip the escaped character.
- ++Offset;
- break;
- case '[':
- InCharacterClass = true;
- break;
- case ']':
- InCharacterClass = false;
- break;
- case '/':
- if (!InCharacterClass)
- HaveClosingSlash = true;
- break;
- }
- }
-
- RegexToken->Type = TT_RegexLiteral;
- // Treat regex literals like other string_literals.
- RegexToken->Tok.setKind(tok::string_literal);
- RegexToken->TokenText = StringRef(RegexBegin, Offset - RegexBegin);
- RegexToken->ColumnWidth = RegexToken->TokenText.size();
-
- resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Offset)));
- }
-
- bool tryMergeTemplateString() {
- if (Tokens.size() < 2)
- return false;
-
- FormatToken *EndBacktick = Tokens.back();
- // Backticks get lexed as tok::unknown tokens. If a template string contains
- // a comment start, it gets lexed as a tok::comment, or tok::unknown if
- // unterminated.
- if (!EndBacktick->isOneOf(tok::comment, tok::string_literal,
- tok::char_constant, tok::unknown))
- return false;
- size_t CommentBacktickPos = EndBacktick->TokenText.find('`');
- // Unknown token that's not actually a backtick, or a comment that doesn't
- // contain a backtick.
- if (CommentBacktickPos == StringRef::npos)
- return false;
-
- unsigned TokenCount = 0;
- bool IsMultiline = false;
- unsigned EndColumnInFirstLine =
- EndBacktick->OriginalColumn + EndBacktick->ColumnWidth;
- for (auto I = Tokens.rbegin() + 1, E = Tokens.rend(); I != E; I++) {
- ++TokenCount;
- if (I[0]->IsMultiline)
- IsMultiline = true;
-
- // If there was a preceding template string, this must be the start of a
- // template string, not the end.
- if (I[0]->is(TT_TemplateString))
- return false;
-
- if (I[0]->isNot(tok::unknown) || I[0]->TokenText != "`") {
- // Keep track of the rhs offset of the last token to wrap across lines -
- // it's the rhs offset of the first line of the template string, used to
- // determine its width.
- if (I[0]->IsMultiline)
- EndColumnInFirstLine = I[0]->OriginalColumn + I[0]->ColumnWidth;
- // If the token has newlines, the token before it (if it exists) is the
- // rhs end of the previous line.
- if (I[0]->NewlinesBefore > 0 && (I + 1 != E)) {
- EndColumnInFirstLine = I[1]->OriginalColumn + I[1]->ColumnWidth;
- IsMultiline = true;
- }
- continue;
- }
-
- Tokens.resize(Tokens.size() - TokenCount);
- Tokens.back()->Type = TT_TemplateString;
- const char *EndOffset =
- EndBacktick->TokenText.data() + 1 + CommentBacktickPos;
- if (CommentBacktickPos != 0) {
- // If the backtick was not the first character (e.g. in a comment),
- // re-lex after the backtick position.
- SourceLocation Loc = EndBacktick->Tok.getLocation();
- resetLexer(SourceMgr.getFileOffset(Loc) + CommentBacktickPos + 1);
- }
- Tokens.back()->TokenText =
- StringRef(Tokens.back()->TokenText.data(),
- EndOffset - Tokens.back()->TokenText.data());
-
- unsigned EndOriginalColumn = EndBacktick->OriginalColumn;
- if (EndOriginalColumn == 0) {
- SourceLocation Loc = EndBacktick->Tok.getLocation();
- EndOriginalColumn = SourceMgr.getSpellingColumnNumber(Loc);
- }
- // If the ` is further down within the token (e.g. in a comment).
- EndOriginalColumn += CommentBacktickPos;
-
- if (IsMultiline) {
- // ColumnWidth is from backtick to last token in line.
- // LastLineColumnWidth is 0 to backtick.
- // x = `some content
- // until here`;
- Tokens.back()->ColumnWidth =
- EndColumnInFirstLine - Tokens.back()->OriginalColumn;
- // +1 for the ` itself.
- Tokens.back()->LastLineColumnWidth = EndOriginalColumn + 1;
- Tokens.back()->IsMultiline = true;
- } else {
- // Token simply spans from start to end, +1 for the ` itself.
- Tokens.back()->ColumnWidth =
- EndOriginalColumn - Tokens.back()->OriginalColumn + 1;
- }
- return true;
- }
- return false;
- }
-
- bool tryMerge_TMacro() {
- if (Tokens.size() < 4)
- return false;
- FormatToken *Last = Tokens.back();
- if (!Last->is(tok::r_paren))
- return false;
-
- FormatToken *String = Tokens[Tokens.size() - 2];
- if (!String->is(tok::string_literal) || String->IsMultiline)
- return false;
-
- if (!Tokens[Tokens.size() - 3]->is(tok::l_paren))
- return false;
-
- FormatToken *Macro = Tokens[Tokens.size() - 4];
- if (Macro->TokenText != "_T")
- return false;
-
- const char *Start = Macro->TokenText.data();
- const char *End = Last->TokenText.data() + Last->TokenText.size();
- String->TokenText = StringRef(Start, End - Start);
- String->IsFirst = Macro->IsFirst;
- String->LastNewlineOffset = Macro->LastNewlineOffset;
- String->WhitespaceRange = Macro->WhitespaceRange;
- String->OriginalColumn = Macro->OriginalColumn;
- String->ColumnWidth = encoding::columnWidthWithTabs(
- String->TokenText, String->OriginalColumn, Style.TabWidth, Encoding);
- String->NewlinesBefore = Macro->NewlinesBefore;
- String->HasUnescapedNewline = Macro->HasUnescapedNewline;
-
- Tokens.pop_back();
- Tokens.pop_back();
- Tokens.pop_back();
- Tokens.back() = String;
- return true;
- }
-
- bool tryMergeConflictMarkers() {
- if (Tokens.back()->NewlinesBefore == 0 && Tokens.back()->isNot(tok::eof))
- return false;
-
- // Conflict lines look like:
- // <marker> <text from the vcs>
- // For example:
- // >>>>>>> /file/in/file/system at revision 1234
- //
- // We merge all tokens in a line that starts with a conflict marker
- // into a single token with a special token type that the unwrapped line
- // parser will use to correctly rebuild the underlying code.
-
- FileID ID;
- // Get the position of the first token in the line.
- unsigned FirstInLineOffset;
- std::tie(ID, FirstInLineOffset) = SourceMgr.getDecomposedLoc(
- Tokens[FirstInLineIndex]->getStartOfNonWhitespace());
- StringRef Buffer = SourceMgr.getBuffer(ID)->getBuffer();
- // Calculate the offset of the start of the current line.
- auto LineOffset = Buffer.rfind('\n', FirstInLineOffset);
- if (LineOffset == StringRef::npos) {
- LineOffset = 0;
- } else {
- ++LineOffset;
- }
-
- auto FirstSpace = Buffer.find_first_of(" \n", LineOffset);
- StringRef LineStart;
- if (FirstSpace == StringRef::npos) {
- LineStart = Buffer.substr(LineOffset);
- } else {
- LineStart = Buffer.substr(LineOffset, FirstSpace - LineOffset);
- }
-
- TokenType Type = TT_Unknown;
- if (LineStart == "<<<<<<<" || LineStart == ">>>>") {
- Type = TT_ConflictStart;
- } else if (LineStart == "|||||||" || LineStart == "=======" ||
- LineStart == "====") {
- Type = TT_ConflictAlternative;
- } else if (LineStart == ">>>>>>>" || LineStart == "<<<<") {
- Type = TT_ConflictEnd;
- }
-
- if (Type != TT_Unknown) {
- FormatToken *Next = Tokens.back();
-
- Tokens.resize(FirstInLineIndex + 1);
- // We do not need to build a complete token here, as we will skip it
- // during parsing anyway (as we must not touch whitespace around conflict
- // markers).
- Tokens.back()->Type = Type;
- Tokens.back()->Tok.setKind(tok::kw___unknown_anytype);
-
- Tokens.push_back(Next);
- return true;
- }
-
- return false;
- }
-
- FormatToken *getStashedToken() {
- // Create a synthesized second '>' or '<' token.
- Token Tok = FormatTok->Tok;
- StringRef TokenText = FormatTok->TokenText;
-
- unsigned OriginalColumn = FormatTok->OriginalColumn;
- FormatTok = new (Allocator.Allocate()) FormatToken;
- FormatTok->Tok = Tok;
- SourceLocation TokLocation =
- FormatTok->Tok.getLocation().getLocWithOffset(Tok.getLength() - 1);
- FormatTok->Tok.setLocation(TokLocation);
- FormatTok->WhitespaceRange = SourceRange(TokLocation, TokLocation);
- FormatTok->TokenText = TokenText;
- FormatTok->ColumnWidth = 1;
- FormatTok->OriginalColumn = OriginalColumn + 1;
-
- return FormatTok;
- }
-
- FormatToken *getNextToken() {
- if (GreaterStashed) {
- GreaterStashed = false;
- return getStashedToken();
- }
- if (LessStashed) {
- LessStashed = false;
- return getStashedToken();
- }
-
- FormatTok = new (Allocator.Allocate()) FormatToken;
- readRawToken(*FormatTok);
- SourceLocation WhitespaceStart =
- FormatTok->Tok.getLocation().getLocWithOffset(-TrailingWhitespace);
- FormatTok->IsFirst = IsFirstToken;
- IsFirstToken = false;
-
- // Consume and record whitespace until we find a significant token.
- unsigned WhitespaceLength = TrailingWhitespace;
- while (FormatTok->Tok.is(tok::unknown)) {
- StringRef Text = FormatTok->TokenText;
- auto EscapesNewline = [&](int pos) {
- // A '\r' here is just part of '\r\n'. Skip it.
- if (pos >= 0 && Text[pos] == '\r')
- --pos;
- // See whether there is an odd number of '\' before this.
- unsigned count = 0;
- for (; pos >= 0; --pos, ++count)
- if (Text[pos] != '\\')
- break;
- return count & 1;
- };
- // FIXME: This miscounts tok::unknown tokens that are not just
- // whitespace, e.g. a '`' character.
- for (int i = 0, e = Text.size(); i != e; ++i) {
- switch (Text[i]) {
- case '\n':
- ++FormatTok->NewlinesBefore;
- FormatTok->HasUnescapedNewline = !EscapesNewline(i - 1);
- FormatTok->LastNewlineOffset = WhitespaceLength + i + 1;
- Column = 0;
- break;
- case '\r':
- FormatTok->LastNewlineOffset = WhitespaceLength + i + 1;
- Column = 0;
- break;
- case '\f':
- case '\v':
- Column = 0;
- break;
- case ' ':
- ++Column;
- break;
- case '\t':
- Column += Style.TabWidth - Column % Style.TabWidth;
- break;
- case '\\':
- if (i + 1 == e || (Text[i + 1] != '\r' && Text[i + 1] != '\n'))
- FormatTok->Type = TT_ImplicitStringLiteral;
- break;
- default:
- FormatTok->Type = TT_ImplicitStringLiteral;
- break;
- }
- if (FormatTok->Type == TT_ImplicitStringLiteral)
- break;
- }
-
- if (FormatTok->is(TT_ImplicitStringLiteral))
- break;
- WhitespaceLength += FormatTok->Tok.getLength();
-
- readRawToken(*FormatTok);
- }
-
- // In case the token starts with escaped newlines, we want to
- // take them into account as whitespace - this pattern is quite frequent
- // in macro definitions.
- // FIXME: Add a more explicit test.
- while (FormatTok->TokenText.size() > 1 && FormatTok->TokenText[0] == '\\' &&
- FormatTok->TokenText[1] == '\n') {
- ++FormatTok->NewlinesBefore;
- WhitespaceLength += 2;
- FormatTok->LastNewlineOffset = 2;
- Column = 0;
- FormatTok->TokenText = FormatTok->TokenText.substr(2);
- }
-
- FormatTok->WhitespaceRange = SourceRange(
- WhitespaceStart, WhitespaceStart.getLocWithOffset(WhitespaceLength));
-
- FormatTok->OriginalColumn = Column;
-
- TrailingWhitespace = 0;
- if (FormatTok->Tok.is(tok::comment)) {
- // FIXME: Add the trimmed whitespace to Column.
- StringRef UntrimmedText = FormatTok->TokenText;
- FormatTok->TokenText = FormatTok->TokenText.rtrim(" \t\v\f");
- TrailingWhitespace = UntrimmedText.size() - FormatTok->TokenText.size();
- } else if (FormatTok->Tok.is(tok::raw_identifier)) {
- IdentifierInfo &Info = IdentTable.get(FormatTok->TokenText);
- FormatTok->Tok.setIdentifierInfo(&Info);
- FormatTok->Tok.setKind(Info.getTokenID());
- if (Style.Language == FormatStyle::LK_Java &&
- FormatTok->isOneOf(tok::kw_struct, tok::kw_union, tok::kw_delete,
- tok::kw_operator)) {
- FormatTok->Tok.setKind(tok::identifier);
- FormatTok->Tok.setIdentifierInfo(nullptr);
- } else if (Style.Language == FormatStyle::LK_JavaScript &&
- FormatTok->isOneOf(tok::kw_struct, tok::kw_union,
- tok::kw_operator)) {
- FormatTok->Tok.setKind(tok::identifier);
- FormatTok->Tok.setIdentifierInfo(nullptr);
- }
- } else if (FormatTok->Tok.is(tok::greatergreater)) {
- FormatTok->Tok.setKind(tok::greater);
- FormatTok->TokenText = FormatTok->TokenText.substr(0, 1);
- GreaterStashed = true;
- } else if (FormatTok->Tok.is(tok::lessless)) {
- FormatTok->Tok.setKind(tok::less);
- FormatTok->TokenText = FormatTok->TokenText.substr(0, 1);
- LessStashed = true;
- }
-
- // Now FormatTok is the next non-whitespace token.
-
- StringRef Text = FormatTok->TokenText;
- size_t FirstNewlinePos = Text.find('\n');
- if (FirstNewlinePos == StringRef::npos) {
- // FIXME: ColumnWidth actually depends on the start column, we need to
- // take this into account when the token is moved.
- FormatTok->ColumnWidth =
- encoding::columnWidthWithTabs(Text, Column, Style.TabWidth, Encoding);
- Column += FormatTok->ColumnWidth;
- } else {
- FormatTok->IsMultiline = true;
- // FIXME: ColumnWidth actually depends on the start column, we need to
- // take this into account when the token is moved.
- FormatTok->ColumnWidth = encoding::columnWidthWithTabs(
- Text.substr(0, FirstNewlinePos), Column, Style.TabWidth, Encoding);
-
- // The last line of the token always starts in column 0.
- // Thus, the length can be precomputed even in the presence of tabs.
- FormatTok->LastLineColumnWidth = encoding::columnWidthWithTabs(
- Text.substr(Text.find_last_of('\n') + 1), 0, Style.TabWidth,
- Encoding);
- Column = FormatTok->LastLineColumnWidth;
- }
-
- if (Style.Language == FormatStyle::LK_Cpp) {
- if (!(Tokens.size() > 0 && Tokens.back()->Tok.getIdentifierInfo() &&
- Tokens.back()->Tok.getIdentifierInfo()->getPPKeywordID() ==
- tok::pp_define) &&
- std::find(ForEachMacros.begin(), ForEachMacros.end(),
- FormatTok->Tok.getIdentifierInfo()) != ForEachMacros.end()) {
- FormatTok->Type = TT_ForEachMacro;
- } else if (FormatTok->is(tok::identifier)) {
- if (MacroBlockBeginRegex.match(Text)) {
- FormatTok->Type = TT_MacroBlockBegin;
- } else if (MacroBlockEndRegex.match(Text)) {
- FormatTok->Type = TT_MacroBlockEnd;
- }
- }
- }
-
- return FormatTok;
- }
-
- FormatToken *FormatTok;
- bool IsFirstToken;
- bool GreaterStashed, LessStashed;
- unsigned Column;
- unsigned TrailingWhitespace;
- std::unique_ptr<Lexer> Lex;
- SourceManager &SourceMgr;
- FileID ID;
- FormatStyle &Style;
- IdentifierTable IdentTable;
- AdditionalKeywords Keywords;
- encoding::Encoding Encoding;
- llvm::SpecificBumpPtrAllocator<FormatToken> Allocator;
- // Index (in 'Tokens') of the last token that starts a new line.
- unsigned FirstInLineIndex;
- SmallVector<FormatToken *, 16> Tokens;
- SmallVector<IdentifierInfo *, 8> ForEachMacros;
-
- bool FormattingDisabled;
-
- llvm::Regex MacroBlockBeginRegex;
- llvm::Regex MacroBlockEndRegex;
-
- void readRawToken(FormatToken &Tok) {
- Lex->LexFromRawLexer(Tok.Tok);
- Tok.TokenText = StringRef(SourceMgr.getCharacterData(Tok.Tok.getLocation()),
- Tok.Tok.getLength());
- // For formatting, treat unterminated string literals like normal string
- // literals.
- if (Tok.is(tok::unknown)) {
- if (!Tok.TokenText.empty() && Tok.TokenText[0] == '"') {
- Tok.Tok.setKind(tok::string_literal);
- Tok.IsUnterminatedLiteral = true;
- } else if (Style.Language == FormatStyle::LK_JavaScript &&
- Tok.TokenText == "''") {
- Tok.Tok.setKind(tok::char_constant);
- }
- }
-
- if (Tok.is(tok::comment) && (Tok.TokenText == "// clang-format on" ||
- Tok.TokenText == "/* clang-format on */")) {
- FormattingDisabled = false;
- }
-
- Tok.Finalized = FormattingDisabled;
-
- if (Tok.is(tok::comment) && (Tok.TokenText == "// clang-format off" ||
- Tok.TokenText == "/* clang-format off */")) {
- FormattingDisabled = true;
- }
- }
-
- void resetLexer(unsigned Offset) {
- StringRef Buffer = SourceMgr.getBufferData(ID);
- Lex.reset(new Lexer(SourceMgr.getLocForStartOfFile(ID),
- getFormattingLangOpts(Style), Buffer.begin(),
- Buffer.begin() + Offset, Buffer.end()));
- Lex->SetKeepWhitespaceMode(true);
- TrailingWhitespace = 0;
- }
-};
-
-static StringRef getLanguageName(FormatStyle::LanguageKind Language) {
- switch (Language) {
- case FormatStyle::LK_Cpp:
- return "C++";
- case FormatStyle::LK_Java:
- return "Java";
- case FormatStyle::LK_JavaScript:
- return "JavaScript";
- case FormatStyle::LK_Proto:
- return "Proto";
- default:
- return "Unknown";
- }
-}
-
-class Formatter : public UnwrappedLineConsumer {
+class Formatter : public TokenAnalyzer {
public:
- Formatter(const FormatStyle &Style, SourceManager &SourceMgr, FileID ID,
- ArrayRef<CharSourceRange> Ranges)
- : Style(Style), ID(ID), SourceMgr(SourceMgr),
- Whitespaces(SourceMgr, Style,
- inputUsesCRLF(SourceMgr.getBufferData(ID))),
- Ranges(Ranges.begin(), Ranges.end()), UnwrappedLines(1),
- Encoding(encoding::detectEncoding(SourceMgr.getBufferData(ID))) {
- DEBUG(llvm::dbgs() << "File encoding: "
- << (Encoding == encoding::Encoding_UTF8 ? "UTF8"
- : "unknown")
- << "\n");
- DEBUG(llvm::dbgs() << "Language: " << getLanguageName(Style.Language)
- << "\n");
- }
+ Formatter(const Environment &Env, const FormatStyle &Style,
+ bool *IncompleteFormat)
+ : TokenAnalyzer(Env, Style), IncompleteFormat(IncompleteFormat) {}
+
+ tooling::Replacements
+ analyze(TokenAnnotator &Annotator,
+ SmallVectorImpl<AnnotatedLine *> &AnnotatedLines,
+ FormatTokenLexer &Tokens, tooling::Replacements &Result) override {
+ deriveLocalStyle(AnnotatedLines);
+ AffectedRangeMgr.computeAffectedLines(AnnotatedLines.begin(),
+ AnnotatedLines.end());
- tooling::Replacements format(bool *IncompleteFormat) {
- tooling::Replacements Result;
- FormatTokenLexer Tokens(SourceMgr, ID, Style, Encoding);
-
- UnwrappedLineParser Parser(Style, Tokens.getKeywords(), Tokens.lex(),
- *this);
- Parser.parse();
- assert(UnwrappedLines.rbegin()->empty());
- for (unsigned Run = 0, RunE = UnwrappedLines.size(); Run + 1 != RunE;
- ++Run) {
- DEBUG(llvm::dbgs() << "Run " << Run << "...\n");
- SmallVector<AnnotatedLine *, 16> AnnotatedLines;
- for (unsigned i = 0, e = UnwrappedLines[Run].size(); i != e; ++i) {
- AnnotatedLines.push_back(new AnnotatedLine(UnwrappedLines[Run][i]));
- }
- tooling::Replacements RunResult =
- format(AnnotatedLines, Tokens, IncompleteFormat);
- DEBUG({
- llvm::dbgs() << "Replacements for run " << Run << ":\n";
- for (tooling::Replacements::iterator I = RunResult.begin(),
- E = RunResult.end();
- I != E; ++I) {
- llvm::dbgs() << I->toString() << "\n";
- }
- });
- for (unsigned i = 0, e = AnnotatedLines.size(); i != e; ++i) {
- delete AnnotatedLines[i];
- }
- Result.insert(RunResult.begin(), RunResult.end());
- Whitespaces.reset();
- }
- return Result;
- }
+ if (Style.Language == FormatStyle::LK_JavaScript &&
+ Style.JavaScriptQuotes != FormatStyle::JSQS_Leave)
+ requoteJSStringLiteral(AnnotatedLines, Result);
- tooling::Replacements format(SmallVectorImpl<AnnotatedLine *> &AnnotatedLines,
- FormatTokenLexer &Tokens,
- bool *IncompleteFormat) {
- TokenAnnotator Annotator(Style, Tokens.getKeywords());
- for (unsigned i = 0, e = AnnotatedLines.size(); i != e; ++i) {
- Annotator.annotate(*AnnotatedLines[i]);
- }
- deriveLocalStyle(AnnotatedLines);
for (unsigned i = 0, e = AnnotatedLines.size(); i != e; ++i) {
Annotator.calculateFormattingInformation(*AnnotatedLines[i]);
}
- computeAffectedLines(AnnotatedLines.begin(), AnnotatedLines.end());
Annotator.setCommentLineLevels(AnnotatedLines);
- ContinuationIndenter Indenter(Style, Tokens.getKeywords(), SourceMgr,
- Whitespaces, Encoding,
+
+ WhitespaceManager Whitespaces(
+ Env.getSourceManager(), Style,
+ inputUsesCRLF(Env.getSourceManager().getBufferData(Env.getFileID())));
+ ContinuationIndenter Indenter(Style, Tokens.getKeywords(),
+ Env.getSourceManager(), Whitespaces, Encoding,
BinPackInconclusiveFunctions);
UnwrappedLineFormatter(&Indenter, &Whitespaces, Style, Tokens.getKeywords(),
IncompleteFormat)
@@ -1495,137 +828,80 @@ public:
}
private:
- // Determines which lines are affected by the SourceRanges given as input.
- // Returns \c true if at least one line between I and E or one of their
- // children is affected.
- bool computeAffectedLines(SmallVectorImpl<AnnotatedLine *>::iterator I,
- SmallVectorImpl<AnnotatedLine *>::iterator E) {
- bool SomeLineAffected = false;
- const AnnotatedLine *PreviousLine = nullptr;
- while (I != E) {
- AnnotatedLine *Line = *I;
- Line->LeadingEmptyLinesAffected = affectsLeadingEmptyLines(*Line->First);
-
- // If a line is part of a preprocessor directive, it needs to be formatted
- // if any token within the directive is affected.
- if (Line->InPPDirective) {
- FormatToken *Last = Line->Last;
- SmallVectorImpl<AnnotatedLine *>::iterator PPEnd = I + 1;
- while (PPEnd != E && !(*PPEnd)->First->HasUnescapedNewline) {
- Last = (*PPEnd)->Last;
- ++PPEnd;
- }
-
- if (affectsTokenRange(*Line->First, *Last,
- /*IncludeLeadingNewlines=*/false)) {
- SomeLineAffected = true;
- markAllAsAffected(I, PPEnd);
- }
- I = PPEnd;
+ // If the last token is a double/single-quoted string literal, generates a
+ // replacement with a single/double quoted string literal, re-escaping the
+ // contents in the process.
+ void requoteJSStringLiteral(SmallVectorImpl<AnnotatedLine *> &Lines,
+ tooling::Replacements &Result) {
+ for (AnnotatedLine *Line : Lines) {
+ requoteJSStringLiteral(Line->Children, Result);
+ if (!Line->Affected)
continue;
- }
-
- if (nonPPLineAffected(Line, PreviousLine))
- SomeLineAffected = true;
-
- PreviousLine = Line;
- ++I;
- }
- return SomeLineAffected;
- }
-
- // Determines whether 'Line' is affected by the SourceRanges given as input.
- // Returns \c true if line or one of its children is affected.
- bool nonPPLineAffected(AnnotatedLine *Line,
- const AnnotatedLine *PreviousLine) {
- bool SomeLineAffected = false;
- Line->ChildrenAffected =
- computeAffectedLines(Line->Children.begin(), Line->Children.end());
- if (Line->ChildrenAffected)
- SomeLineAffected = true;
-
- // Stores whether one of the line's tokens is directly affected.
- bool SomeTokenAffected = false;
- // Stores whether we need to look at the leading newlines of the next token
- // in order to determine whether it was affected.
- bool IncludeLeadingNewlines = false;
-
- // Stores whether the first child line of any of this line's tokens is
- // affected.
- bool SomeFirstChildAffected = false;
-
- for (FormatToken *Tok = Line->First; Tok; Tok = Tok->Next) {
- // Determine whether 'Tok' was affected.
- if (affectsTokenRange(*Tok, *Tok, IncludeLeadingNewlines))
- SomeTokenAffected = true;
-
- // Determine whether the first child of 'Tok' was affected.
- if (!Tok->Children.empty() && Tok->Children.front()->Affected)
- SomeFirstChildAffected = true;
-
- IncludeLeadingNewlines = Tok->Children.empty();
- }
-
- // Was this line moved, i.e. has it previously been on the same line as an
- // affected line?
- bool LineMoved = PreviousLine && PreviousLine->Affected &&
- Line->First->NewlinesBefore == 0;
-
- bool IsContinuedComment =
- Line->First->is(tok::comment) && Line->First->Next == nullptr &&
- Line->First->NewlinesBefore < 2 && PreviousLine &&
- PreviousLine->Affected && PreviousLine->Last->is(tok::comment);
-
- if (SomeTokenAffected || SomeFirstChildAffected || LineMoved ||
- IsContinuedComment) {
- Line->Affected = true;
- SomeLineAffected = true;
- }
- return SomeLineAffected;
- }
-
- // Marks all lines between I and E as well as all their children as affected.
- void markAllAsAffected(SmallVectorImpl<AnnotatedLine *>::iterator I,
- SmallVectorImpl<AnnotatedLine *>::iterator E) {
- while (I != E) {
- (*I)->Affected = true;
- markAllAsAffected((*I)->Children.begin(), (*I)->Children.end());
- ++I;
- }
- }
-
- // Returns true if the range from 'First' to 'Last' intersects with one of the
- // input ranges.
- bool affectsTokenRange(const FormatToken &First, const FormatToken &Last,
- bool IncludeLeadingNewlines) {
- SourceLocation Start = First.WhitespaceRange.getBegin();
- if (!IncludeLeadingNewlines)
- Start = Start.getLocWithOffset(First.LastNewlineOffset);
- SourceLocation End = Last.getStartOfNonWhitespace();
- End = End.getLocWithOffset(Last.TokenText.size());
- CharSourceRange Range = CharSourceRange::getCharRange(Start, End);
- return affectsCharSourceRange(Range);
- }
+ for (FormatToken *FormatTok = Line->First; FormatTok;
+ FormatTok = FormatTok->Next) {
+ StringRef Input = FormatTok->TokenText;
+ if (FormatTok->Finalized || !FormatTok->isStringLiteral() ||
+ // NB: testing for not starting with a double quote to avoid
+ // breaking
+ // `template strings`.
+ (Style.JavaScriptQuotes == FormatStyle::JSQS_Single &&
+ !Input.startswith("\"")) ||
+ (Style.JavaScriptQuotes == FormatStyle::JSQS_Double &&
+ !Input.startswith("\'")))
+ continue;
- // Returns true if one of the input ranges intersect the leading empty lines
- // before 'Tok'.
- bool affectsLeadingEmptyLines(const FormatToken &Tok) {
- CharSourceRange EmptyLineRange = CharSourceRange::getCharRange(
- Tok.WhitespaceRange.getBegin(),
- Tok.WhitespaceRange.getBegin().getLocWithOffset(Tok.LastNewlineOffset));
- return affectsCharSourceRange(EmptyLineRange);
- }
+ // Change start and end quote.
+ bool IsSingle = Style.JavaScriptQuotes == FormatStyle::JSQS_Single;
+ SourceLocation Start = FormatTok->Tok.getLocation();
+ auto Replace = [&](SourceLocation Start, unsigned Length,
+ StringRef ReplacementText) {
+ Result.insert(tooling::Replacement(Env.getSourceManager(), Start,
+ Length, ReplacementText));
+ };
+ Replace(Start, 1, IsSingle ? "'" : "\"");
+ Replace(FormatTok->Tok.getEndLoc().getLocWithOffset(-1), 1,
+ IsSingle ? "'" : "\"");
+
+ // Escape internal quotes.
+ size_t ColumnWidth = FormatTok->TokenText.size();
+ bool Escaped = false;
+ for (size_t i = 1; i < Input.size() - 1; i++) {
+ switch (Input[i]) {
+ case '\\':
+ if (!Escaped && i + 1 < Input.size() &&
+ ((IsSingle && Input[i + 1] == '"') ||
+ (!IsSingle && Input[i + 1] == '\''))) {
+ // Remove this \, it's escaping a " or ' that no longer needs
+ // escaping
+ ColumnWidth--;
+ Replace(Start.getLocWithOffset(i), 1, "");
+ continue;
+ }
+ Escaped = !Escaped;
+ break;
+ case '\"':
+ case '\'':
+ if (!Escaped && IsSingle == (Input[i] == '\'')) {
+ // Escape the quote.
+ Replace(Start.getLocWithOffset(i), 0, "\\");
+ ColumnWidth++;
+ }
+ Escaped = false;
+ break;
+ default:
+ Escaped = false;
+ break;
+ }
+ }
- // Returns true if 'Range' intersects with one of the input ranges.
- bool affectsCharSourceRange(const CharSourceRange &Range) {
- for (SmallVectorImpl<CharSourceRange>::const_iterator I = Ranges.begin(),
- E = Ranges.end();
- I != E; ++I) {
- if (!SourceMgr.isBeforeInTranslationUnit(Range.getEnd(), I->getBegin()) &&
- !SourceMgr.isBeforeInTranslationUnit(I->getEnd(), Range.getBegin()))
- return true;
+ // For formatting, count the number of non-escaped single quotes in them
+ // and adjust ColumnWidth to take the added escapes into account.
+ // FIXME(martinprobst): this might conflict with code breaking a long
+ // string literal (which clang-format doesn't do, yet). For that to
+ // work, this code would have to modify TokenText directly.
+ FormatTok->ColumnWidth = ColumnWidth;
+ }
}
- return false;
}
static bool inputUsesCRLF(StringRef Text) {
@@ -1634,7 +910,7 @@ private:
bool
hasCpp03IncompatibleFormat(const SmallVectorImpl<AnnotatedLine *> &Lines) {
- for (const AnnotatedLine* Line : Lines) {
+ for (const AnnotatedLine *Line : Lines) {
if (hasCpp03IncompatibleFormat(Line->Children))
return true;
for (FormatToken *Tok = Line->First->Next; Tok; Tok = Tok->Next) {
@@ -1652,7 +928,7 @@ private:
int countVariableAlignments(const SmallVectorImpl<AnnotatedLine *> &Lines) {
int AlignmentDiff = 0;
- for (const AnnotatedLine* Line : Lines) {
+ for (const AnnotatedLine *Line : Lines) {
AlignmentDiff += countVariableAlignments(Line->Children);
for (FormatToken *Tok = Line->First; Tok && Tok->Next; Tok = Tok->Next) {
if (!Tok->is(TT_PointerOrReference))
@@ -1699,24 +975,219 @@ private:
HasBinPackedFunction || !HasOnePerLineFunction;
}
- void consumeUnwrappedLine(const UnwrappedLine &TheLine) override {
- assert(!UnwrappedLines.empty());
- UnwrappedLines.back().push_back(TheLine);
+ bool BinPackInconclusiveFunctions;
+ bool *IncompleteFormat;
+};
+
+// This class cleans up the erroneous/redundant code around the given ranges in
+// file.
+class Cleaner : public TokenAnalyzer {
+public:
+ Cleaner(const Environment &Env, const FormatStyle &Style)
+ : TokenAnalyzer(Env, Style),
+ DeletedTokens(FormatTokenLess(Env.getSourceManager())) {}
+
+ // FIXME: eliminate unused parameters.
+ tooling::Replacements
+ analyze(TokenAnnotator &Annotator,
+ SmallVectorImpl<AnnotatedLine *> &AnnotatedLines,
+ FormatTokenLexer &Tokens, tooling::Replacements &Result) override {
+ // FIXME: in the current implementation the granularity of affected range
+ // is an annotated line. However, this is not sufficient. Furthermore,
+ // redundant code introduced by replacements does not necessarily
+ // intersect with ranges of replacements that result in the redundancy.
+ // To determine if some redundant code is actually introduced by
+ // replacements (e.g. deletions), we need to come up with a more
+ // sophisticated way of computing affected ranges.
+ AffectedRangeMgr.computeAffectedLines(AnnotatedLines.begin(),
+ AnnotatedLines.end());
+
+ checkEmptyNamespace(AnnotatedLines);
+
+ for (auto &Line : AnnotatedLines) {
+ if (Line->Affected) {
+ cleanupRight(Line->First, tok::comma, tok::comma);
+ cleanupRight(Line->First, TT_CtorInitializerColon, tok::comma);
+ cleanupLeft(Line->First, TT_CtorInitializerComma, tok::l_brace);
+ cleanupLeft(Line->First, TT_CtorInitializerColon, tok::l_brace);
+ }
+ }
+
+ return generateFixes();
}
- void finishRun() override {
- UnwrappedLines.push_back(SmallVector<UnwrappedLine, 16>());
+private:
+ bool containsOnlyComments(const AnnotatedLine &Line) {
+ for (FormatToken *Tok = Line.First; Tok != nullptr; Tok = Tok->Next) {
+ if (Tok->isNot(tok::comment))
+ return false;
+ }
+ return true;
}
- FormatStyle Style;
- FileID ID;
- SourceManager &SourceMgr;
- WhitespaceManager Whitespaces;
- SmallVector<CharSourceRange, 8> Ranges;
- SmallVector<SmallVector<UnwrappedLine, 16>, 2> UnwrappedLines;
+ // Iterate through all lines and remove any empty (nested) namespaces.
+ void checkEmptyNamespace(SmallVectorImpl<AnnotatedLine *> &AnnotatedLines) {
+ for (unsigned i = 0, e = AnnotatedLines.size(); i != e; ++i) {
+ auto &Line = *AnnotatedLines[i];
+ if (Line.startsWith(tok::kw_namespace) ||
+ Line.startsWith(tok::kw_inline, tok::kw_namespace)) {
+ checkEmptyNamespace(AnnotatedLines, i, i);
+ }
+ }
- encoding::Encoding Encoding;
- bool BinPackInconclusiveFunctions;
+ for (auto Line : DeletedLines) {
+ FormatToken *Tok = AnnotatedLines[Line]->First;
+ while (Tok) {
+ deleteToken(Tok);
+ Tok = Tok->Next;
+ }
+ }
+ }
+
+ // The function checks if the namespace, which starts from \p CurrentLine, and
+ // its nested namespaces are empty and deletes them if they are. It also
+ // sets \p NewLine to the last line checked.
+ // Returns true if the current namespace is empty.
+ bool checkEmptyNamespace(SmallVectorImpl<AnnotatedLine *> &AnnotatedLines,
+ unsigned CurrentLine, unsigned &NewLine) {
+ unsigned InitLine = CurrentLine, End = AnnotatedLines.size();
+ if (Style.BraceWrapping.AfterNamespace) {
+ // If the left brace is in a new line, we should consume it first so that
+ // it does not make the namespace non-empty.
+ // FIXME: error handling if there is no left brace.
+ if (!AnnotatedLines[++CurrentLine]->startsWith(tok::l_brace)) {
+ NewLine = CurrentLine;
+ return false;
+ }
+ } else if (!AnnotatedLines[CurrentLine]->endsWith(tok::l_brace)) {
+ return false;
+ }
+ while (++CurrentLine < End) {
+ if (AnnotatedLines[CurrentLine]->startsWith(tok::r_brace))
+ break;
+
+ if (AnnotatedLines[CurrentLine]->startsWith(tok::kw_namespace) ||
+ AnnotatedLines[CurrentLine]->startsWith(tok::kw_inline,
+ tok::kw_namespace)) {
+ if (!checkEmptyNamespace(AnnotatedLines, CurrentLine, NewLine))
+ return false;
+ CurrentLine = NewLine;
+ continue;
+ }
+
+ if (containsOnlyComments(*AnnotatedLines[CurrentLine]))
+ continue;
+
+ // If there is anything other than comments or nested namespaces in the
+ // current namespace, the namespace cannot be empty.
+ NewLine = CurrentLine;
+ return false;
+ }
+
+ NewLine = CurrentLine;
+ if (CurrentLine >= End)
+ return false;
+
+ // Check if the empty namespace is actually affected by changed ranges.
+ if (!AffectedRangeMgr.affectsCharSourceRange(CharSourceRange::getCharRange(
+ AnnotatedLines[InitLine]->First->Tok.getLocation(),
+ AnnotatedLines[CurrentLine]->Last->Tok.getEndLoc())))
+ return false;
+
+ for (unsigned i = InitLine; i <= CurrentLine; ++i) {
+ DeletedLines.insert(i);
+ }
+
+ return true;
+ }
+
+ // Checks pairs {start, start->next},..., {end->previous, end} and deletes one
+ // of the tokens in the pair if the left token has \p LK token kind and the
+ // right token has \p RK token kind. If \p DeleteLeft is true, the left token
+ // is deleted on match; otherwise, the right token is deleted.
+ template <typename LeftKind, typename RightKind>
+ void cleanupPair(FormatToken *Start, LeftKind LK, RightKind RK,
+ bool DeleteLeft) {
+ auto NextNotDeleted = [this](const FormatToken &Tok) -> FormatToken * {
+ for (auto *Res = Tok.Next; Res; Res = Res->Next)
+ if (!Res->is(tok::comment) &&
+ DeletedTokens.find(Res) == DeletedTokens.end())
+ return Res;
+ return nullptr;
+ };
+ for (auto *Left = Start; Left;) {
+ auto *Right = NextNotDeleted(*Left);
+ if (!Right)
+ break;
+ if (Left->is(LK) && Right->is(RK)) {
+ deleteToken(DeleteLeft ? Left : Right);
+ // If the right token is deleted, we should keep the left token
+ // unchanged and pair it with the new right token.
+ if (!DeleteLeft)
+ continue;
+ }
+ Left = Right;
+ }
+ }
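Review note: the `continue` in `cleanupPair` is what lets a run such as `a, , , b` collapse in one pass. A minimal standalone sketch of that loop follows, assuming tokens are plain strings and deletions are recorded as indices. `Tokens` and this free-function `cleanupPair` are hypothetical stand-ins, not the clang-format types, and the comment skipping done by `NextNotDeleted` in the patch is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a FormatToken chain: only the token text is kept.
using Tokens = std::vector<std::string>;

// Returns the indices of the tokens to delete. For every pair of neighbouring
// surviving tokens (Left, Right) with Left == LK and Right == RK, Left is
// deleted if DeleteLeft is true and Right otherwise.
std::set<std::size_t> cleanupPair(const Tokens &Toks, const std::string &LK,
                                  const std::string &RK, bool DeleteLeft) {
  std::set<std::size_t> Deleted;
  // Next index after I that has not been deleted, or Toks.size() if none.
  auto NextNotDeleted = [&](std::size_t I) -> std::size_t {
    for (++I; I < Toks.size(); ++I)
      if (!Deleted.count(I))
        return I;
    return Toks.size();
  };
  for (std::size_t Left = 0; Left < Toks.size();) {
    std::size_t Right = NextNotDeleted(Left);
    if (Right == Toks.size())
      break;
    if (Toks[Left] == LK && Toks[Right] == RK) {
      Deleted.insert(DeleteLeft ? Left : Right);
      // The right token is gone: keep Left and pair it with the next survivor.
      if (!DeleteLeft)
        continue;
    }
    Left = Right;
  }
  return Deleted;
}
```

With `DeleteLeft` false this plays the role of `cleanupRight` (redundant commas), and with `DeleteLeft` true the role of `cleanupLeft` (a dangling comma or colon before `{`).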
+
+ template <typename LeftKind, typename RightKind>
+ void cleanupLeft(FormatToken *Start, LeftKind LK, RightKind RK) {
+ cleanupPair(Start, LK, RK, /*DeleteLeft=*/true);
+ }
+
+ template <typename LeftKind, typename RightKind>
+ void cleanupRight(FormatToken *Start, LeftKind LK, RightKind RK) {
+ cleanupPair(Start, LK, RK, /*DeleteLeft=*/false);
+ }
+
+ // Delete the given token.
+ inline void deleteToken(FormatToken *Tok) {
+ if (Tok)
+ DeletedTokens.insert(Tok);
+ }
+
+ tooling::Replacements generateFixes() {
+ tooling::Replacements Fixes;
+ std::vector<FormatToken *> Tokens;
+ std::copy(DeletedTokens.begin(), DeletedTokens.end(),
+ std::back_inserter(Tokens));
+
+ // Merge multiple contiguous token deletions into one big deletion so that
+ // the number of replacements can be reduced. This makes computing affected
+ // ranges more efficient when we run reformat on the changed code.
+ unsigned Idx = 0;
+ while (Idx < Tokens.size()) {
+ unsigned St = Idx, End = Idx;
+ while ((End + 1) < Tokens.size() &&
+ Tokens[End]->Next == Tokens[End + 1]) {
+ End++;
+ }
+ auto SR = CharSourceRange::getCharRange(Tokens[St]->Tok.getLocation(),
+ Tokens[End]->Tok.getEndLoc());
+ Fixes.insert(tooling::Replacement(Env.getSourceManager(), SR, ""));
+ Idx = End + 1;
+ }
+
+ return Fixes;
+ }
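Review note: the merging step in `generateFixes` can be looked at in isolation. The sketch below assumes each deleted token is identified by its index in the token stream, so "adjacent" means consecutive indices, whereas the patch compares `Next` pointers. `mergeDeletions` is a hypothetical name used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Collapses runs of neighbouring deleted tokens into half-open index ranges
// [first, last). The input must be sorted in ascending order, which mirrors
// the ordered std::set used for DeletedTokens in the patch.
std::vector<std::pair<std::size_t, std::size_t>>
mergeDeletions(const std::vector<std::size_t> &Deleted) {
  assert(std::is_sorted(Deleted.begin(), Deleted.end()));
  std::vector<std::pair<std::size_t, std::size_t>> Ranges;
  std::size_t Idx = 0;
  while (Idx < Deleted.size()) {
    std::size_t End = Idx;
    // Extend the run while the next deleted token directly follows this one.
    while (End + 1 < Deleted.size() && Deleted[End] + 1 == Deleted[End + 1])
      ++End;
    Ranges.emplace_back(Deleted[Idx], Deleted[End] + 1);
    Idx = End + 1;
  }
  return Ranges;
}
```

Each resulting range corresponds to one `tooling::Replacement` with empty replacement text, so a fully deleted namespace produces a single replacement rather than one per token.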
+
+ // Class for less-than comparison for the set `DeletedTokens`.
+ // We store tokens in the order they appear in the translation unit so that
+ // we do not need to sort them in `generateFixes()`.
+ struct FormatTokenLess {
+ FormatTokenLess(const SourceManager &SM) : SM(SM) {}
+
+ bool operator()(const FormatToken *LHS, const FormatToken *RHS) const {
+ return SM.isBeforeInTranslationUnit(LHS->Tok.getLocation(),
+ RHS->Tok.getLocation());
+ }
+ const SourceManager &SM;
+ };
+
+ // Tokens to be deleted.
+ std::set<FormatToken *, FormatTokenLess> DeletedTokens;
+ // The line numbers of lines to be deleted.
+ std::set<unsigned> DeletedLines;
};
struct IncludeDirective {
@@ -1742,7 +1213,7 @@ static bool affectsRange(ArrayRef<tooling::Range> Ranges, unsigned Start,
// Sorts a block of includes given by 'Includes' alphabetically adding the
// necessary replacement to 'Replaces'. 'Includes' must be in strict source
// order.
-static void sortIncludes(const FormatStyle &Style,
+static void sortCppIncludes(const FormatStyle &Style,
const SmallVectorImpl<IncludeDirective> &Includes,
ArrayRef<tooling::Range> Ranges, StringRef FileName,
tooling::Replacements &Replaces, unsigned *Cursor) {
@@ -1752,21 +1223,15 @@ static void sortIncludes(const FormatStyle &Style,
SmallVector<unsigned, 16> Indices;
for (unsigned i = 0, e = Includes.size(); i != e; ++i)
Indices.push_back(i);
- std::sort(Indices.begin(), Indices.end(), [&](unsigned LHSI, unsigned RHSI) {
- return std::tie(Includes[LHSI].Category, Includes[LHSI].Filename) <
- std::tie(Includes[RHSI].Category, Includes[RHSI].Filename);
- });
+ std::stable_sort(
+ Indices.begin(), Indices.end(), [&](unsigned LHSI, unsigned RHSI) {
+ return std::tie(Includes[LHSI].Category, Includes[LHSI].Filename) <
+ std::tie(Includes[RHSI].Category, Includes[RHSI].Filename);
+ });
// If the #includes are out of order, we generate a single replacement fixing
// the entire block. Otherwise, no replacement is generated.
- bool OutOfOrder = false;
- for (unsigned i = 1, e = Indices.size(); i != e; ++i) {
- if (Indices[i] != i) {
- OutOfOrder = true;
- break;
- }
- }
- if (!OutOfOrder)
+ if (std::is_sorted(Indices.begin(), Indices.end()))
return;
std::string result;
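Review note: replacing the hand-written `OutOfOrder` loop with `std::is_sorted` relies on the switch to `std::stable_sort`. A stable sort leaves an already ordered block, including duplicates, as the identity permutation, so the index vector is ascending exactly when nothing has to move. A minimal sketch under that reading follows. `Include`, `sortedOrder` and `needsReplacement` are hypothetical names, and `Include` is a reduced version of `IncludeDirective`.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, reduced version of IncludeDirective.
struct Include {
  std::string Filename;
  int Category;
};

// Returns the order in which the includes should be emitted.
std::vector<unsigned> sortedOrder(const std::vector<Include> &Includes) {
  std::vector<unsigned> Indices(Includes.size());
  std::iota(Indices.begin(), Indices.end(), 0u);
  std::stable_sort(Indices.begin(), Indices.end(), [&](unsigned L, unsigned R) {
    return std::tie(Includes[L].Category, Includes[L].Filename) <
           std::tie(Includes[R].Category, Includes[R].Filename);
  });
  return Indices;
}

// True if the block is out of order, i.e. a replacement would be generated.
bool needsReplacement(const std::vector<Include> &Includes) {
  std::vector<unsigned> Order = sortedOrder(Includes);
  return !std::is_sorted(Order.begin(), Order.end());
}
```

With an unstable sort, two identical `#include` lines could be swapped in `Indices`, which would make the block look out of order even though its text is unchanged.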
@@ -1796,17 +1261,73 @@ static void sortIncludes(const FormatStyle &Style,
result.size(), result));
}
-tooling::Replacements sortIncludes(const FormatStyle &Style, StringRef Code,
- ArrayRef<tooling::Range> Ranges,
- StringRef FileName, unsigned *Cursor) {
- tooling::Replacements Replaces;
- if (!Style.SortIncludes)
- return Replaces;
+namespace {
+
+// This class manages priorities of #include categories and calculates
+// priorities for headers.
+class IncludeCategoryManager {
+public:
+ IncludeCategoryManager(const FormatStyle &Style, StringRef FileName)
+ : Style(Style), FileName(FileName) {
+ FileStem = llvm::sys::path::stem(FileName);
+ for (const auto &Category : Style.IncludeCategories)
+ CategoryRegexs.emplace_back(Category.Regex);
+ IsMainFile = FileName.endswith(".c") || FileName.endswith(".cc") ||
+ FileName.endswith(".cpp") || FileName.endswith(".c++") ||
+ FileName.endswith(".cxx") || FileName.endswith(".m") ||
+ FileName.endswith(".mm");
+ }
+
+ // Returns the priority of the category which \p IncludeName belongs to.
+ // If \p CheckMainHeader is true and \p IncludeName is a main header, returns
+ // 0. Otherwise, returns the priority of the matching category or INT_MAX.
+ int getIncludePriority(StringRef IncludeName, bool CheckMainHeader) {
+ int Ret = INT_MAX;
+ for (unsigned i = 0, e = CategoryRegexs.size(); i != e; ++i)
+ if (CategoryRegexs[i].match(IncludeName)) {
+ Ret = Style.IncludeCategories[i].Priority;
+ break;
+ }
+ if (CheckMainHeader && IsMainFile && Ret > 0 && isMainHeader(IncludeName))
+ Ret = 0;
+ return Ret;
+ }
+
+private:
+ bool isMainHeader(StringRef IncludeName) const {
+ if (!IncludeName.startswith("\""))
+ return false;
+ StringRef HeaderStem =
+ llvm::sys::path::stem(IncludeName.drop_front(1).drop_back(1));
+ if (FileStem.startswith(HeaderStem)) {
+ llvm::Regex MainIncludeRegex(
+ (HeaderStem + Style.IncludeIsMainRegex).str());
+ if (MainIncludeRegex.match(FileStem))
+ return true;
+ }
+ return false;
+ }
+
+ const FormatStyle &Style;
+ bool IsMainFile;
+ StringRef FileName;
+ StringRef FileStem;
+ SmallVector<llvm::Regex, 4> CategoryRegexs;
+};
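Review note: the lookup in `getIncludePriority` is "first matching category wins, main header overrides to 0". The sketch below reproduces that shape with `std::regex` instead of `llvm::Regex`. `Category`, `stem`, `sampleCategories` and `includePriority` are hypothetical names, and the main-header test is deliberately simplified: it skips both the source-file-extension check (`IsMainFile`) and `IncludeIsMainRegex`.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical, reduced model of one FormatStyle::IncludeCategories entry.
struct Category {
  std::regex Pattern;
  int Priority;
};

// Strips directories and the last extension: "lib/Foo.cpp" -> "Foo".
std::string stem(const std::string &Path) {
  std::size_t Slash = Path.find_last_of('/');
  std::string Base = Slash == std::string::npos ? Path : Path.substr(Slash + 1);
  std::size_t Dot = Base.find_last_of('.');
  return Dot == std::string::npos ? Base : Base.substr(0, Dot);
}

// Sample categories for experiments (not taken from any real style).
std::vector<Category> sampleCategories() {
  std::vector<Category> Cats;
  Cats.push_back({std::regex(R"(^"llvm/)"), 2});
  Cats.push_back({std::regex("^<"), 3});
  return Cats;
}

// IncludeName keeps its delimiters, e.g. "\"Foo.h\"" or "<vector>".
int includePriority(const std::vector<Category> &Categories,
                    const std::string &FileName,
                    const std::string &IncludeName, bool CheckMainHeader) {
  int Ret = INT_MAX; // not in any category
  for (const Category &C : Categories)
    if (std::regex_search(IncludeName, C.Pattern)) {
      Ret = C.Priority;
      break; // the first matching category wins
    }
  // Simplified main-header test: a quoted include whose stem is a prefix of
  // the file's stem.
  if (CheckMainHeader && Ret > 0 && IncludeName.size() >= 2 &&
      IncludeName.front() == '"') {
    std::string HeaderStem =
        stem(IncludeName.substr(1, IncludeName.size() - 2));
    if (stem(FileName).compare(0, HeaderStem.size(), HeaderStem) == 0)
      Ret = 0;
  }
  return Ret;
}
```

Because `CheckMainHeader` is a parameter, the caller decides when the override applies: the sorting code passes it only until a main header has been found in the first include block.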
+
+const char IncludeRegexPattern[] =
+ R"(^[\t\ ]*#[\t\ ]*(import|include)[^"<]*(["<][^">]*[">]))";
+
+} // anonymous namespace
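Review note: `IncludeRegexPattern` captures the include name together with its delimiters in the second group. A small sketch for trying the pattern outside of LLVM follows. It uses `std::regex` and writes the character classes as `[\t ]` rather than `[\t\ ]`, which is an adaptation for the standard library's default grammar and not the pattern from the patch. `includeName` is a hypothetical helper.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the include name with its delimiters ("\"a.h\"" or "<a.h>"), or an
// empty string if Line is not an #include/#import directive.
std::string includeName(const std::string &Line) {
  static const std::regex IncludeRegex(
      R"(^[\t ]*#[\t ]*(import|include)[^"<]*(["<][^">]*[">]))");
  std::smatch Matches;
  if (!std::regex_search(Line, Matches, IncludeRegex))
    return std::string();
  return Matches[2].str(); // group 1 is the keyword, group 2 the name
}
```

Keeping the delimiters in the captured name is what allows the category regexes and `isMainHeader` to distinguish `"foo.h"` from `<foo.h>`.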
+tooling::Replacements sortCppIncludes(const FormatStyle &Style, StringRef Code,
+ ArrayRef<tooling::Range> Ranges,
+ StringRef FileName,
+ tooling::Replacements &Replaces,
+ unsigned *Cursor) {
unsigned Prev = 0;
unsigned SearchFrom = 0;
- llvm::Regex IncludeRegex(
- R"(^[\t\ ]*#[\t\ ]*(import|include)[^"<]*(["<][^">]*[">]))");
+ llvm::Regex IncludeRegex(IncludeRegexPattern);
SmallVector<StringRef, 4> Matches;
SmallVector<IncludeDirective, 16> IncludesInBlock;
@@ -1817,19 +1338,9 @@ tooling::Replacements sortIncludes(const FormatStyle &Style, StringRef Code,
//
// FIXME: Do some sanity checking, e.g. edit distance of the base name, to fix
// cases where the first #include is unlikely to be the main header.
- bool IsSource = FileName.endswith(".c") || FileName.endswith(".cc") ||
- FileName.endswith(".cpp") || FileName.endswith(".c++") ||
- FileName.endswith(".cxx") || FileName.endswith(".m") ||
- FileName.endswith(".mm");
- StringRef FileStem = llvm::sys::path::stem(FileName);
+ IncludeCategoryManager Categories(Style, FileName);
bool FirstIncludeBlock = true;
bool MainIncludeFound = false;
-
- // Create pre-compiled regular expressions for the #include categories.
- SmallVector<llvm::Regex, 4> CategoryRegexs;
- for (const auto &Category : Style.IncludeCategories)
- CategoryRegexs.emplace_back(Category.Regex);
-
bool FormattingOff = false;
for (;;) {
@@ -1846,26 +1357,15 @@ tooling::Replacements sortIncludes(const FormatStyle &Style, StringRef Code,
if (!FormattingOff && !Line.endswith("\\")) {
if (IncludeRegex.match(Line, &Matches)) {
StringRef IncludeName = Matches[2];
- int Category = INT_MAX;
- for (unsigned i = 0, e = CategoryRegexs.size(); i != e; ++i) {
- if (CategoryRegexs[i].match(IncludeName)) {
- Category = Style.IncludeCategories[i].Priority;
- break;
- }
- }
- if (IsSource && !MainIncludeFound && Category > 0 &&
- FirstIncludeBlock && IncludeName.startswith("\"")) {
- StringRef HeaderStem =
- llvm::sys::path::stem(IncludeName.drop_front(1).drop_back(1));
- if (FileStem.startswith(HeaderStem)) {
- Category = 0;
- MainIncludeFound = true;
- }
- }
+ int Category = Categories.getIncludePriority(
+ IncludeName,
+ /*CheckMainHeader=*/!MainIncludeFound && FirstIncludeBlock);
+ if (Category == 0)
+ MainIncludeFound = true;
IncludesInBlock.push_back({IncludeName, Line, Prev, Category});
} else if (!IncludesInBlock.empty()) {
- sortIncludes(Style, IncludesInBlock, Ranges, FileName, Replaces,
- Cursor);
+ sortCppIncludes(Style, IncludesInBlock, Ranges, FileName, Replaces,
+ Cursor);
IncludesInBlock.clear();
FirstIncludeBlock = false;
}
@@ -1876,47 +1376,280 @@ tooling::Replacements sortIncludes(const FormatStyle &Style, StringRef Code,
SearchFrom = Pos + 1;
}
if (!IncludesInBlock.empty())
- sortIncludes(Style, IncludesInBlock, Ranges, FileName, Replaces, Cursor);
+ sortCppIncludes(Style, IncludesInBlock, Ranges, FileName, Replaces, Cursor);
return Replaces;
}
-tooling::Replacements reformat(const FormatStyle &Style,
- SourceManager &SourceMgr, FileID ID,
- ArrayRef<CharSourceRange> Ranges,
+tooling::Replacements sortIncludes(const FormatStyle &Style, StringRef Code,
+ ArrayRef<tooling::Range> Ranges,
+ StringRef FileName, unsigned *Cursor) {
+ tooling::Replacements Replaces;
+ if (!Style.SortIncludes)
+ return Replaces;
+ if (Style.Language == FormatStyle::LanguageKind::LK_JavaScript)
+ return sortJavaScriptImports(Style, Code, Ranges, FileName);
+ sortCppIncludes(Style, Code, Ranges, FileName, Replaces, Cursor);
+ return Replaces;
+}
+
+template <typename T>
+static llvm::Expected<tooling::Replacements>
+processReplacements(T ProcessFunc, StringRef Code,
+ const tooling::Replacements &Replaces,
+ const FormatStyle &Style) {
+ if (Replaces.empty())
+ return tooling::Replacements();
+
+ auto NewCode = applyAllReplacements(Code, Replaces);
+ if (!NewCode)
+ return NewCode.takeError();
+ std::vector<tooling::Range> ChangedRanges =
+ tooling::calculateChangedRanges(Replaces);
+ StringRef FileName = Replaces.begin()->getFilePath();
+
+ tooling::Replacements FormatReplaces =
+ ProcessFunc(Style, *NewCode, ChangedRanges, FileName);
+
+ return mergeReplacements(Replaces, FormatReplaces);
+}
+
+llvm::Expected<tooling::Replacements>
+formatReplacements(StringRef Code, const tooling::Replacements &Replaces,
+ const FormatStyle &Style) {
+ // We need to use a lambda function here since there are two versions of
+ // `sortIncludes`.
+ auto SortIncludes = [](const FormatStyle &Style, StringRef Code,
+ std::vector<tooling::Range> Ranges,
+ StringRef FileName) -> tooling::Replacements {
+ return sortIncludes(Style, Code, Ranges, FileName);
+ };
+ auto SortedReplaces =
+ processReplacements(SortIncludes, Code, Replaces, Style);
+ if (!SortedReplaces)
+ return SortedReplaces.takeError();
+
+ // We need to use a lambda function here since there are two versions of
+ // `reformat`.
+ auto Reformat = [](const FormatStyle &Style, StringRef Code,
+ std::vector<tooling::Range> Ranges,
+ StringRef FileName) -> tooling::Replacements {
+ return reformat(Style, Code, Ranges, FileName);
+ };
+ return processReplacements(Reformat, Code, *SortedReplaces, Style);
+}
+
+namespace {
+
+inline bool isHeaderInsertion(const tooling::Replacement &Replace) {
+ return Replace.getOffset() == UINT_MAX &&
+ llvm::Regex(IncludeRegexPattern).match(Replace.getReplacementText());
+}
+
+void skipComments(Lexer &Lex, Token &Tok) {
+ while (Tok.is(tok::comment))
+ if (Lex.LexFromRawLexer(Tok))
+ return;
+}
+
+// Check if a sequence of tokens is like "#<Name> <raw_identifier>". If it is,
+// \p Tok will be the token after this directive; otherwise, it can be any token
+// after the given \p Tok (including \p Tok).
+bool checkAndConsumeDirectiveWithName(Lexer &Lex, StringRef Name, Token &Tok) {
+ bool Matched = Tok.is(tok::hash) && !Lex.LexFromRawLexer(Tok) &&
+ Tok.is(tok::raw_identifier) &&
+ Tok.getRawIdentifier() == Name && !Lex.LexFromRawLexer(Tok) &&
+ Tok.is(tok::raw_identifier);
+ if (Matched)
+ Lex.LexFromRawLexer(Tok);
+ return Matched;
+}
+
+unsigned getOffsetAfterHeaderGuardsAndComments(StringRef FileName,
+ StringRef Code,
+ const FormatStyle &Style) {
+ std::unique_ptr<Environment> Env =
+ Environment::CreateVirtualEnvironment(Code, FileName, /*Ranges=*/{});
+ const SourceManager &SourceMgr = Env->getSourceManager();
+ Lexer Lex(Env->getFileID(), SourceMgr.getBuffer(Env->getFileID()), SourceMgr,
+ getFormattingLangOpts(Style));
+ Token Tok;
+ // Get the first token.
+ Lex.LexFromRawLexer(Tok);
+ skipComments(Lex, Tok);
+ unsigned AfterComments = SourceMgr.getFileOffset(Tok.getLocation());
+ if (checkAndConsumeDirectiveWithName(Lex, "ifndef", Tok)) {
+ skipComments(Lex, Tok);
+ if (checkAndConsumeDirectiveWithName(Lex, "define", Tok))
+ return SourceMgr.getFileOffset(Tok.getLocation());
+ }
+ return AfterComments;
+}
+
+// FIXME: we also need to insert a '\n' at the end of the code if we have an
+// insertion with offset Code.size(), and there is no '\n' at the end of the
+// code.
+// FIXME: do not insert headers into conditional #include blocks, e.g. #includes
+// surrounded by compile condition "#if...".
+// FIXME: insert empty lines between newly created blocks.
+tooling::Replacements
+fixCppIncludeInsertions(StringRef Code, const tooling::Replacements &Replaces,
+ const FormatStyle &Style) {
+ if (Style.Language != FormatStyle::LanguageKind::LK_Cpp)
+ return Replaces;
+
+ tooling::Replacements HeaderInsertions;
+ for (const auto &R : Replaces) {
+ if (isHeaderInsertion(R))
+ HeaderInsertions.insert(R);
+ else if (R.getOffset() == UINT_MAX)
+ llvm::errs() << "Insertions other than header #include insertion are "
+ "not supported! "
+ << R.getReplacementText() << "\n";
+ }
+ if (HeaderInsertions.empty())
+ return Replaces;
+ tooling::Replacements Result;
+ std::set_difference(Replaces.begin(), Replaces.end(),
+ HeaderInsertions.begin(), HeaderInsertions.end(),
+ std::inserter(Result, Result.begin()));
+
+ llvm::Regex IncludeRegex(IncludeRegexPattern);
+ llvm::Regex DefineRegex(R"(^[\t\ ]*#[\t\ ]*define[\t\ ]*[^\\]*$)");
+ SmallVector<StringRef, 4> Matches;
+
+ StringRef FileName = Replaces.begin()->getFilePath();
+ IncludeCategoryManager Categories(Style, FileName);
+
+ // Record the offset of the end of the last include in each category.
+ std::map<int, int> CategoryEndOffsets;
+ // All possible priorities.
+ // Add 0 for main header and INT_MAX for headers that are not in any category.
+ std::set<int> Priorities = {0, INT_MAX};
+ for (const auto &Category : Style.IncludeCategories)
+ Priorities.insert(Category.Priority);
+ int FirstIncludeOffset = -1;
+ // All new headers should be inserted after this offset.
+ unsigned MinInsertOffset =
+ getOffsetAfterHeaderGuardsAndComments(FileName, Code, Style);
+ StringRef TrimmedCode = Code.drop_front(MinInsertOffset);
+ SmallVector<StringRef, 32> Lines;
+ TrimmedCode.split(Lines, '\n');
+ unsigned Offset = MinInsertOffset;
+ unsigned NextLineOffset;
+ std::set<StringRef> ExistingIncludes;
+ for (auto Line : Lines) {
+ NextLineOffset = std::min(Code.size(), Offset + Line.size() + 1);
+ if (IncludeRegex.match(Line, &Matches)) {
+ StringRef IncludeName = Matches[2];
+ ExistingIncludes.insert(IncludeName);
+ int Category = Categories.getIncludePriority(
+ IncludeName, /*CheckMainHeader=*/FirstIncludeOffset < 0);
+ CategoryEndOffsets[Category] = NextLineOffset;
+ if (FirstIncludeOffset < 0)
+ FirstIncludeOffset = Offset;
+ }
+ Offset = NextLineOffset;
+ }
+
+ // Populate CategoryEndOffsets:
+ // - Ensure that CategoryEndOffsets[Highest] is always populated.
+ // - If CategoryEndOffsets[Priority] isn't set, use the next higher value that
+ // is set, up to CategoryEndOffsets[Highest].
+ auto Highest = Priorities.begin();
+ if (CategoryEndOffsets.find(*Highest) == CategoryEndOffsets.end()) {
+ if (FirstIncludeOffset >= 0)
+ CategoryEndOffsets[*Highest] = FirstIncludeOffset;
+ else
+ CategoryEndOffsets[*Highest] = MinInsertOffset;
+ }
+ // By this point, CategoryEndOffsets[Highest] is always set appropriately:
+ // - to an appropriate location before/after existing #includes, or
+ // - to right after the header guard, or
+ // - to the beginning of the file.
+ for (auto I = ++Priorities.begin(), E = Priorities.end(); I != E; ++I)
+ if (CategoryEndOffsets.find(*I) == CategoryEndOffsets.end())
+ CategoryEndOffsets[*I] = CategoryEndOffsets[*std::prev(I)];
+
+ for (const auto &R : HeaderInsertions) {
+ auto IncludeDirective = R.getReplacementText();
+ bool Matched = IncludeRegex.match(IncludeDirective, &Matches);
+ assert(Matched && "Header insertion replacement must have replacement text "
+ "'#include ...'");
+ (void)Matched;
+ auto IncludeName = Matches[2];
+ if (ExistingIncludes.find(IncludeName) != ExistingIncludes.end()) {
+ DEBUG(llvm::dbgs() << "Skip adding existing include : " << IncludeName
+ << "\n");
+ continue;
+ }
+ int Category =
+ Categories.getIncludePriority(IncludeName, /*CheckMainHeader=*/true);
+ Offset = CategoryEndOffsets[Category];
+ std::string NewInclude = !IncludeDirective.endswith("\n")
+ ? (IncludeDirective + "\n").str()
+ : IncludeDirective.str();
+ Result.insert(tooling::Replacement(FileName, Offset, 0, NewInclude));
+ }
+ return Result;
+}
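Review note: the way `fixCppIncludeInsertions` chooses an offset for a category that has no existing `#include` is easier to follow on plain integers. The sketch below isolates that fill-forward step. `fillCategoryEndOffsets` is a hypothetical free function, and the offsets are assumed to be byte offsets into the file as in the patch.

```cpp
#include <cassert>
#include <climits>
#include <iterator>
#include <map>
#include <set>

// Offsets: end offset of the last existing #include per priority (sparse).
// Priorities: every priority that can occur; must not be empty.
// FirstIncludeOffset: offset of the first #include, or -1 if there is none.
// MinInsertOffset: offset after the header guard and leading comments.
std::map<int, int> fillCategoryEndOffsets(std::map<int, int> Offsets,
                                          const std::set<int> &Priorities,
                                          int FirstIncludeOffset,
                                          int MinInsertOffset) {
  assert(!Priorities.empty());
  // The smallest number is the highest priority (0 is the main header).
  auto Highest = Priorities.begin();
  if (!Offsets.count(*Highest))
    Offsets[*Highest] =
        FirstIncludeOffset >= 0 ? FirstIncludeOffset : MinInsertOffset;
  // Every unset priority inherits the offset of the priority just before it,
  // which is guaranteed to be set by the time it is read.
  for (auto I = std::next(Priorities.begin()), E = Priorities.end(); I != E;
       ++I)
    if (!Offsets.count(*I))
      Offsets[*I] = Offsets[*std::prev(I)];
  return Offsets;
}
```

For a file whose only include has priority 1 and ends at offset 40, a new priority-2 header is therefore inserted at offset 40, while a new main header goes in front of the first existing include.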
+
+} // anonymous namespace
+
+llvm::Expected<tooling::Replacements>
+cleanupAroundReplacements(StringRef Code, const tooling::Replacements &Replaces,
+ const FormatStyle &Style) {
+ // We need to use a lambda function here since there are two versions of
+ // `cleanup`.
+ auto Cleanup = [](const FormatStyle &Style, StringRef Code,
+ std::vector<tooling::Range> Ranges,
+ StringRef FileName) -> tooling::Replacements {
+ return cleanup(Style, Code, Ranges, FileName);
+ };
+ // Make header insertion replacements insert new headers into correct blocks.
+ tooling::Replacements NewReplaces =
+ fixCppIncludeInsertions(Code, Replaces, Style);
+ return processReplacements(Cleanup, Code, NewReplaces, Style);
+}
+
+tooling::Replacements reformat(const FormatStyle &Style, SourceManager &SM,
+ FileID ID, ArrayRef<CharSourceRange> Ranges,
bool *IncompleteFormat) {
FormatStyle Expanded = expandPresets(Style);
if (Expanded.DisableFormat)
return tooling::Replacements();
- Formatter formatter(Expanded, SourceMgr, ID, Ranges);
- return formatter.format(IncompleteFormat);
+
+ Environment Env(SM, ID, Ranges);
+ Formatter Format(Env, Expanded, IncompleteFormat);
+ return Format.process();
}
tooling::Replacements reformat(const FormatStyle &Style, StringRef Code,
ArrayRef<tooling::Range> Ranges,
StringRef FileName, bool *IncompleteFormat) {
- if (Style.DisableFormat)
+ FormatStyle Expanded = expandPresets(Style);
+ if (Expanded.DisableFormat)
return tooling::Replacements();
- IntrusiveRefCntPtr<vfs::InMemoryFileSystem> InMemoryFileSystem(
- new vfs::InMemoryFileSystem);
- FileManager Files(FileSystemOptions(), InMemoryFileSystem);
- DiagnosticsEngine Diagnostics(
- IntrusiveRefCntPtr<DiagnosticIDs>(new DiagnosticIDs),
- new DiagnosticOptions);
- SourceManager SourceMgr(Diagnostics, Files);
- InMemoryFileSystem->addFile(
- FileName, 0, llvm::MemoryBuffer::getMemBuffer(
- Code, FileName, /*RequiresNullTerminator=*/false));
- FileID ID = SourceMgr.createFileID(Files.getFile(FileName), SourceLocation(),
- clang::SrcMgr::C_User);
- SourceLocation StartOfFile = SourceMgr.getLocForStartOfFile(ID);
- std::vector<CharSourceRange> CharRanges;
- for (const tooling::Range &Range : Ranges) {
- SourceLocation Start = StartOfFile.getLocWithOffset(Range.getOffset());
- SourceLocation End = Start.getLocWithOffset(Range.getLength());
- CharRanges.push_back(CharSourceRange::getCharRange(Start, End));
- }
- return reformat(Style, SourceMgr, ID, CharRanges, IncompleteFormat);
+ std::unique_ptr<Environment> Env =
+ Environment::CreateVirtualEnvironment(Code, FileName, Ranges);
+ Formatter Format(*Env, Expanded, IncompleteFormat);
+ return Format.process();
+}
+
+tooling::Replacements cleanup(const FormatStyle &Style, SourceManager &SM,
+ FileID ID, ArrayRef<CharSourceRange> Ranges) {
+ Environment Env(SM, ID, Ranges);
+ Cleaner Clean(Env, Style);
+ return Clean.process();
+}
+
+tooling::Replacements cleanup(const FormatStyle &Style, StringRef Code,
+ ArrayRef<tooling::Range> Ranges,
+ StringRef FileName) {
+ std::unique_ptr<Environment> Env =
+ Environment::CreateVirtualEnvironment(Code, FileName, Ranges);
+ Cleaner Clean(*Env, Style);
+ return Clean.process();
}
LangOptions getFormattingLangOpts(const FormatStyle &Style) {
@@ -1930,7 +1663,7 @@ LangOptions getFormattingLangOpts(const FormatStyle &Style) {
LangOpts.Bool = 1;
LangOpts.ObjC1 = 1;
LangOpts.ObjC2 = 1;
- LangOpts.MicrosoftExt = 1; // To get kw___try, kw___finally.
+ LangOpts.MicrosoftExt = 1; // To get kw___try, kw___finally.
LangOpts.DeclSpecKeyword = 1; // To get __declspec.
return LangOpts;
}
@@ -1960,7 +1693,10 @@ static FormatStyle::LanguageKind getLanguageByFileName(StringRef FileName) {
}
FormatStyle getStyle(StringRef StyleName, StringRef FileName,
- StringRef FallbackStyle) {
+ StringRef FallbackStyle, vfs::FileSystem *FS) {
+ if (!FS) {
+ FS = vfs::getRealFileSystem().get();
+ }
FormatStyle Style = getLLVMStyle();
Style.Language = getLanguageByFileName(FileName);
if (!getPredefinedStyle(FallbackStyle, Style.Language, &Style)) {
@@ -1991,28 +1727,34 @@ FormatStyle getStyle(StringRef StyleName, StringRef FileName,
llvm::sys::fs::make_absolute(Path);
for (StringRef Directory = Path; !Directory.empty();
Directory = llvm::sys::path::parent_path(Directory)) {
- if (!llvm::sys::fs::is_directory(Directory))
+
+ auto Status = FS->status(Directory);
+ if (!Status ||
+ Status->getType() != llvm::sys::fs::file_type::directory_file) {
continue;
+ }
+
SmallString<128> ConfigFile(Directory);
llvm::sys::path::append(ConfigFile, ".clang-format");
DEBUG(llvm::dbgs() << "Trying " << ConfigFile << "...\n");
- bool IsFile = false;
- // Ignore errors from is_regular_file: we only need to know if we can read
- // the file or not.
- llvm::sys::fs::is_regular_file(Twine(ConfigFile), IsFile);
+ Status = FS->status(ConfigFile.str());
+ bool IsFile =
+ Status && (Status->getType() == llvm::sys::fs::file_type::regular_file);
if (!IsFile) {
// Try _clang-format too, since dotfiles are not commonly used on Windows.
ConfigFile = Directory;
llvm::sys::path::append(ConfigFile, "_clang-format");
DEBUG(llvm::dbgs() << "Trying " << ConfigFile << "...\n");
- llvm::sys::fs::is_regular_file(Twine(ConfigFile), IsFile);
+ Status = FS->status(ConfigFile.str());
+ IsFile = Status &&
+ (Status->getType() == llvm::sys::fs::file_type::regular_file);
}
if (IsFile) {
llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> Text =
- llvm::MemoryBuffer::getFile(ConfigFile.c_str());
+ FS->getBufferForFile(ConfigFile.str());
if (std::error_code EC = Text.getError()) {
llvm::errs() << EC.message() << "\n";
break;
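Review note: routing the lookups through `vfs::FileSystem` makes the config search testable without touching the disk. The search order itself (nearest directory first, `.clang-format` before `_clang-format`) can be sketched with an injected predicate. `findConfig` and `IsRegularFile` are hypothetical names, paths are assumed to be absolute and `/`-separated, and the directory check and the error path that prints a message and stops are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <initializer_list>
#include <string>

// Walks from Dir up to the root and returns the first config file for which
// IsRegularFile returns true, or an empty string if none is found.
std::string
findConfig(std::string Dir,
           const std::function<bool(const std::string &)> &IsRegularFile) {
  while (!Dir.empty()) {
    std::string Prefix = Dir == "/" ? Dir : Dir + "/";
    // Try the dotfile first, then the underscore variant.
    for (const char *Name : {".clang-format", "_clang-format"})
      if (IsRegularFile(Prefix + Name))
        return Prefix + Name;
    if (Dir == "/")
      break; // the root has no parent
    std::size_t Slash = Dir.find_last_of('/');
    if (Slash == std::string::npos)
      break;
    Dir = Slash == 0 ? std::string("/") : Dir.substr(0, Slash);
  }
  return std::string();
}
```

In a test, the predicate can be a lambda over a fixed set of paths, which is roughly the role an in-memory file system would play when passed as `FS` to `getStyle`.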
diff --git a/contrib/llvm/tools/clang/lib/Format/FormatToken.cpp b/contrib/llvm/tools/clang/lib/Format/FormatToken.cpp
index d6cd450..2ae4ddc 100644
--- a/contrib/llvm/tools/clang/lib/Format/FormatToken.cpp
+++ b/contrib/llvm/tools/clang/lib/Format/FormatToken.cpp
@@ -53,6 +53,7 @@ bool FormatToken::isSimpleTypeSpecifier() const {
case tok::kw_half:
case tok::kw_float:
case tok::kw_double:
+ case tok::kw___float128:
case tok::kw_wchar_t:
case tok::kw_bool:
case tok::kw___underlying_type:
diff --git a/contrib/llvm/tools/clang/lib/Format/FormatToken.h b/contrib/llvm/tools/clang/lib/Format/FormatToken.h
index b683660..43b1625 100644
--- a/contrib/llvm/tools/clang/lib/Format/FormatToken.h
+++ b/contrib/llvm/tools/clang/lib/Format/FormatToken.h
@@ -54,6 +54,7 @@ namespace format {
TYPE(JsComputedPropertyName) \
TYPE(JsFatArrow) \
TYPE(JsTypeColon) \
+ TYPE(JsTypeOperator) \
TYPE(JsTypeOptionalQuestion) \
TYPE(LambdaArrow) \
TYPE(LambdaLSquare) \
@@ -144,7 +145,7 @@ struct FormatToken {
/// \brief Whether the token text contains newlines (escaped or not).
bool IsMultiline = false;
- /// \brief Indicates that this is the first token.
+ /// \brief Indicates that this is the first token of the file.
bool IsFirst = false;
/// \brief Whether there must be a line break before this token.
@@ -296,6 +297,20 @@ struct FormatToken {
}
template <typename T> bool isNot(T Kind) const { return !is(Kind); }
+ /// \c true if this token starts a sequence with the given tokens in order,
+ /// following the ``Next`` pointers, ignoring comments.
+ template <typename A, typename... Ts>
+ bool startsSequence(A K1, Ts... Tokens) const {
+ return startsSequenceInternal(K1, Tokens...);
+ }
+
+ /// \c true if this token ends a sequence with the given tokens in order,
+ /// following the ``Previous`` pointers, ignoring comments.
+ template <typename A, typename... Ts>
+ bool endsSequence(A K1, Ts... Tokens) const {
+ return endsSequenceInternal(K1, Tokens...);
+ }
+
bool isStringLiteral() const { return tok::isStringLiteral(Tok.getKind()); }
bool isObjCAtKeyword(tok::ObjCKeywordKind Kind) const {
@@ -428,6 +443,34 @@ private:
// Disallow copying.
FormatToken(const FormatToken &) = delete;
void operator=(const FormatToken &) = delete;
+
+ template <typename A, typename... Ts>
+ bool startsSequenceInternal(A K1, Ts... Tokens) const {
+ if (is(tok::comment) && Next)
+ return Next->startsSequenceInternal(K1, Tokens...);
+ return is(K1) && Next && Next->startsSequenceInternal(Tokens...);
+ }
+
+ template <typename A>
+ bool startsSequenceInternal(A K1) const {
+ if (is(tok::comment) && Next)
+ return Next->startsSequenceInternal(K1);
+ return is(K1);
+ }
+
+ template <typename A, typename... Ts>
+ bool endsSequenceInternal(A K1) const {
+ if (is(tok::comment) && Previous)
+ return Previous->endsSequenceInternal(K1);
+ return is(K1);
+ }
+
+ template <typename A, typename... Ts>
+ bool endsSequenceInternal(A K1, Ts... Tokens) const {
+ if (is(tok::comment) && Previous)
+ return Previous->endsSequenceInternal(K1, Tokens...);
+ return is(K1) && Previous && Previous->endsSequenceInternal(Tokens...);
+ }
};
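Review note: `startsSequence` peels one token kind per recursion step and steps over comments without consuming a kind. The sketch below shows the same recursion on a toy token with string kinds. `Token` and `startsInlineNamespace` are hypothetical, and only the forward direction is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal token: a kind string and a forward link.
struct Token {
  std::string Kind;
  const Token *Next = nullptr;

  bool is(const std::string &K) const { return Kind == K; }

  // Base case: one kind left to match.
  bool startsSequence(const std::string &K1) const {
    if (is("comment") && Next)
      return Next->startsSequence(K1);
    return is(K1);
  }

  // Recursive case: match K1 here and the rest on the following tokens.
  template <typename... Ts>
  bool startsSequence(const std::string &K1, const Ts &... Rest) const {
    if (is("comment") && Next)
      return Next->startsSequence(K1, Rest...);
    return is(K1) && Next && Next->startsSequence(Rest...);
  }
};

// Convenience for experiments: links Kinds into a chain and tests whether it
// starts with `inline namespace`, ignoring comments.
bool startsInlineNamespace(const std::vector<std::string> &Kinds) {
  std::vector<Token> Toks(Kinds.size());
  for (std::size_t I = 0; I < Kinds.size(); ++I) {
    Toks[I].Kind = Kinds[I];
    if (I + 1 < Kinds.size())
      Toks[I].Next = &Toks[I + 1];
  }
  return !Toks.empty() && Toks.front().startsSequence("inline", "namespace");
}
```

A comment that is the last token in the chain is compared against the expected kind like any other token, which matches the `is(tok::comment) && Next` guard in the patch.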
class ContinuationIndenter;
@@ -528,17 +571,24 @@ struct AdditionalKeywords {
kw_final = &IdentTable.get("final");
kw_override = &IdentTable.get("override");
kw_in = &IdentTable.get("in");
+ kw_of = &IdentTable.get("of");
kw_CF_ENUM = &IdentTable.get("CF_ENUM");
kw_CF_OPTIONS = &IdentTable.get("CF_OPTIONS");
kw_NS_ENUM = &IdentTable.get("NS_ENUM");
kw_NS_OPTIONS = &IdentTable.get("NS_OPTIONS");
+ kw_as = &IdentTable.get("as");
+ kw_async = &IdentTable.get("async");
+ kw_await = &IdentTable.get("await");
kw_finally = &IdentTable.get("finally");
+ kw_from = &IdentTable.get("from");
kw_function = &IdentTable.get("function");
kw_import = &IdentTable.get("import");
kw_is = &IdentTable.get("is");
kw_let = &IdentTable.get("let");
+ kw_type = &IdentTable.get("type");
kw_var = &IdentTable.get("var");
+ kw_yield = &IdentTable.get("yield");
kw_abstract = &IdentTable.get("abstract");
kw_assert = &IdentTable.get("assert");
@@ -571,6 +621,7 @@ struct AdditionalKeywords {
IdentifierInfo *kw_final;
IdentifierInfo *kw_override;
IdentifierInfo *kw_in;
+ IdentifierInfo *kw_of;
IdentifierInfo *kw_CF_ENUM;
IdentifierInfo *kw_CF_OPTIONS;
IdentifierInfo *kw_NS_ENUM;
@@ -578,12 +629,18 @@ struct AdditionalKeywords {
IdentifierInfo *kw___except;
// JavaScript keywords.
+ IdentifierInfo *kw_as;
+ IdentifierInfo *kw_async;
+ IdentifierInfo *kw_await;
IdentifierInfo *kw_finally;
+ IdentifierInfo *kw_from;
IdentifierInfo *kw_function;
IdentifierInfo *kw_import;
IdentifierInfo *kw_is;
IdentifierInfo *kw_let;
+ IdentifierInfo *kw_type;
IdentifierInfo *kw_var;
+ IdentifierInfo *kw_yield;
// Java keywords.
IdentifierInfo *kw_abstract;
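Note on the startsSequenceInternal/endsSequenceInternal helpers added to FormatToken.h in the hunk above: they match a run of token kinds by recursing over a variadic parameter pack, skipping comment tokens along the way. The following is a minimal standalone sketch of that pattern, not the clang code itself. `Kind`, `MiniToken` and `makeChain` are hypothetical stand-ins for `tok::TokenKind`, `FormatToken` and the lexer's token list, and the sketch uses a plain overload for the base case rather than a second template.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical, simplified stand-in for tok::TokenKind.
enum class Kind { Comment, Identifier, LParen, RParen, Semi };

// Hypothetical, simplified stand-in for FormatToken: a kind plus a Next link.
struct MiniToken {
  Kind K;
  const MiniToken *Next;

  explicit MiniToken(Kind K) : K(K), Next(nullptr) {}

  bool is(Kind Other) const { return K == Other; }

  // Base case: one kind left to match. Comments are skipped first.
  bool startsSequence(Kind K1) const {
    if (is(Kind::Comment) && Next)
      return Next->startsSequence(K1);
    return is(K1);
  }

  // Recursive case: match K1 at this token (after skipping comments), then
  // match the remaining kinds starting at the next token.
  template <typename... Ts>
  bool startsSequence(Kind K1, Kind K2, Ts... Rest) const {
    if (is(Kind::Comment) && Next)
      return Next->startsSequence(K1, K2, Rest...);
    return is(K1) && Next && Next->startsSequence(K2, Rest...);
  }
};

// Builds a vector of tokens and links each one to its successor.
// The returned vector is moved (or elided), so the Next pointers stay valid.
inline std::vector<MiniToken> makeChain(std::initializer_list<Kind> Kinds) {
  std::vector<MiniToken> Toks;
  Toks.reserve(Kinds.size());
  for (Kind K : Kinds)
    Toks.push_back(MiniToken(K));
  for (std::size_t I = 0; I + 1 < Toks.size(); ++I)
    Toks[I].Next = &Toks[I + 1];
  return Toks;
}
```

With a chain such as `comment, identifier, comment, (, )`, calling `startsSequence(Kind::Identifier, Kind::LParen, Kind::RParen)` on the first token succeeds because both comments are stepped over before each comparison.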
diff --git a/contrib/llvm/tools/clang/lib/Format/FormatTokenLexer.cpp b/contrib/llvm/tools/clang/lib/Format/FormatTokenLexer.cpp
new file mode 100644
index 0000000..9778f84
--- /dev/null
+++ b/contrib/llvm/tools/clang/lib/Format/FormatTokenLexer.cpp
@@ -0,0 +1,597 @@
+//===--- FormatTokenLexer.cpp - Lex FormatTokens -------------*- C++ ----*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// \brief This file implements FormatTokenLexer, which tokenizes a source file
+/// into a FormatToken stream suitable for ClangFormat.
+///
+//===----------------------------------------------------------------------===//
+
+#include "FormatTokenLexer.h"
+#include "FormatToken.h"
+#include "clang/Basic/SourceLocation.h"
+#include "clang/Basic/SourceManager.h"
+#include "clang/Format/Format.h"
+#include "llvm/Support/Regex.h"
+
+namespace clang {
+namespace format {
+
+FormatTokenLexer::FormatTokenLexer(const SourceManager &SourceMgr, FileID ID,
+ const FormatStyle &Style,
+ encoding::Encoding Encoding)
+ : FormatTok(nullptr), IsFirstToken(true), GreaterStashed(false),
+ LessStashed(false), Column(0), TrailingWhitespace(0),
+ SourceMgr(SourceMgr), ID(ID), Style(Style),
+ IdentTable(getFormattingLangOpts(Style)), Keywords(IdentTable),
+ Encoding(Encoding), FirstInLineIndex(0), FormattingDisabled(false),
+ MacroBlockBeginRegex(Style.MacroBlockBegin),
+ MacroBlockEndRegex(Style.MacroBlockEnd) {
+ Lex.reset(new Lexer(ID, SourceMgr.getBuffer(ID), SourceMgr,
+ getFormattingLangOpts(Style)));
+ Lex->SetKeepWhitespaceMode(true);
+
+ for (const std::string &ForEachMacro : Style.ForEachMacros)
+ ForEachMacros.push_back(&IdentTable.get(ForEachMacro));
+ std::sort(ForEachMacros.begin(), ForEachMacros.end());
+}
+
+ArrayRef<FormatToken *> FormatTokenLexer::lex() {
+ assert(Tokens.empty());
+ assert(FirstInLineIndex == 0);
+ do {
+ Tokens.push_back(getNextToken());
+ if (Style.Language == FormatStyle::LK_JavaScript) {
+ tryParseJSRegexLiteral();
+ tryParseTemplateString();
+ }
+ tryMergePreviousTokens();
+ if (Tokens.back()->NewlinesBefore > 0 || Tokens.back()->IsMultiline)
+ FirstInLineIndex = Tokens.size() - 1;
+ } while (Tokens.back()->Tok.isNot(tok::eof));
+ return Tokens;
+}
+
+void FormatTokenLexer::tryMergePreviousTokens() {
+ if (tryMerge_TMacro())
+ return;
+ if (tryMergeConflictMarkers())
+ return;
+ if (tryMergeLessLess())
+ return;
+
+ if (Style.Language == FormatStyle::LK_JavaScript) {
+ static const tok::TokenKind JSIdentity[] = {tok::equalequal, tok::equal};
+ static const tok::TokenKind JSNotIdentity[] = {tok::exclaimequal,
+ tok::equal};
+ static const tok::TokenKind JSShiftEqual[] = {tok::greater, tok::greater,
+ tok::greaterequal};
+ static const tok::TokenKind JSRightArrow[] = {tok::equal, tok::greater};
+ // FIXME: Investigate what token type gives the correct operator priority.
+ if (tryMergeTokens(JSIdentity, TT_BinaryOperator))
+ return;
+ if (tryMergeTokens(JSNotIdentity, TT_BinaryOperator))
+ return;
+ if (tryMergeTokens(JSShiftEqual, TT_BinaryOperator))
+ return;
+ if (tryMergeTokens(JSRightArrow, TT_JsFatArrow))
+ return;
+ }
+}
+
+bool FormatTokenLexer::tryMergeLessLess() {
+ // Merge X,less,less,Y into X,lessless,Y unless X or Y is less.
+ if (Tokens.size() < 3)
+ return false;
+
+ bool FourthTokenIsLess = false;
+ if (Tokens.size() > 3)
+ FourthTokenIsLess = (Tokens.end() - 4)[0]->is(tok::less);
+
+ auto First = Tokens.end() - 3;
+ if (First[2]->is(tok::less) || First[1]->isNot(tok::less) ||
+ First[0]->isNot(tok::less) || FourthTokenIsLess)
+ return false;
+
+ // Only merge if there currently is no whitespace between the two "<".
+ if (First[1]->WhitespaceRange.getBegin() !=
+ First[1]->WhitespaceRange.getEnd())
+ return false;
+
+ First[0]->Tok.setKind(tok::lessless);
+ First[0]->TokenText = "<<";
+ First[0]->ColumnWidth += 1;
+ Tokens.erase(Tokens.end() - 2);
+ return true;
+}
+
+bool FormatTokenLexer::tryMergeTokens(ArrayRef<tok::TokenKind> Kinds,
+ TokenType NewType) {
+ if (Tokens.size() < Kinds.size())
+ return false;
+
+ SmallVectorImpl<FormatToken *>::const_iterator First =
+ Tokens.end() - Kinds.size();
+ if (!First[0]->is(Kinds[0]))
+ return false;
+ unsigned AddLength = 0;
+ for (unsigned i = 1; i < Kinds.size(); ++i) {
+ if (!First[i]->is(Kinds[i]) ||
+ First[i]->WhitespaceRange.getBegin() !=
+ First[i]->WhitespaceRange.getEnd())
+ return false;
+ AddLength += First[i]->TokenText.size();
+ }
+ Tokens.resize(Tokens.size() - Kinds.size() + 1);
+ First[0]->TokenText = StringRef(First[0]->TokenText.data(),
+ First[0]->TokenText.size() + AddLength);
+ First[0]->ColumnWidth += AddLength;
+ First[0]->Type = NewType;
+ return true;
+}
+
+// Returns \c true if \p Tok can only be followed by an operand in JavaScript.
+bool FormatTokenLexer::precedesOperand(FormatToken *Tok) {
+ // NB: This is not entirely correct, as an r_paren can introduce an operand
+ // location in e.g. `if (foo) /bar/.exec(...);`. That is a rare enough
+ // corner case to not matter in practice, though.
+ return Tok->isOneOf(tok::period, tok::l_paren, tok::comma, tok::l_brace,
+ tok::r_brace, tok::l_square, tok::semi, tok::exclaim,
+ tok::colon, tok::question, tok::tilde) ||
+ Tok->isOneOf(tok::kw_return, tok::kw_do, tok::kw_case, tok::kw_throw,
+ tok::kw_else, tok::kw_new, tok::kw_delete, tok::kw_void,
+ tok::kw_typeof, Keywords.kw_instanceof, Keywords.kw_in) ||
+ Tok->isBinaryOperator();
+}
+
+bool FormatTokenLexer::canPrecedeRegexLiteral(FormatToken *Prev) {
+ if (!Prev)
+ return true;
+
+ // Regex literals can only follow after prefix unary operators, not after
+ // postfix unary operators. If the '++' is followed by a non-operand
+ // introducing token, the slash here is the operand and not the start of a
+ // regex.
+ if (Prev->isOneOf(tok::plusplus, tok::minusminus))
+ return (Tokens.size() < 3 || precedesOperand(Tokens[Tokens.size() - 3]));
+
+ // The previous token must introduce an operand location where regex
+ // literals can occur.
+ if (!precedesOperand(Prev))
+ return false;
+
+ return true;
+}
+
+// Tries to parse a JavaScript Regex literal starting at the current token,
+// if that begins with a slash and is in a location where JavaScript allows
+// regex literals. Changes the current token to a regex literal and updates
+// its text if successful.
+void FormatTokenLexer::tryParseJSRegexLiteral() {
+ FormatToken *RegexToken = Tokens.back();
+ if (!RegexToken->isOneOf(tok::slash, tok::slashequal))
+ return;
+
+ FormatToken *Prev = nullptr;
+ for (auto I = Tokens.rbegin() + 1, E = Tokens.rend(); I != E; ++I) {
+ // NB: Because previous pointers are not initialized yet, this cannot use
+ // Token.getPreviousNonComment.
+ if ((*I)->isNot(tok::comment)) {
+ Prev = *I;
+ break;
+ }
+ }
+
+ if (!canPrecedeRegexLiteral(Prev))
+ return;
+
+ // 'Manually' lex ahead in the current file buffer.
+ const char *Offset = Lex->getBufferLocation();
+ const char *RegexBegin = Offset - RegexToken->TokenText.size();
+ StringRef Buffer = Lex->getBuffer();
+ bool InCharacterClass = false;
+ bool HaveClosingSlash = false;
+ for (; !HaveClosingSlash && Offset != Buffer.end(); ++Offset) {
+ // Regular expressions are terminated with a '/', which can only be
+ // escaped using '\' or a character class between '[' and ']'.
+ // See http://www.ecma-international.org/ecma-262/5.1/#sec-7.8.5.
+ switch (*Offset) {
+ case '\\':
+ // Skip the escaped character.
+ ++Offset;
+ break;
+ case '[':
+ InCharacterClass = true;
+ break;
+ case ']':
+ InCharacterClass = false;
+ break;
+ case '/':
+ if (!InCharacterClass)
+ HaveClosingSlash = true;
+ break;
+ }
+ }
+
+ RegexToken->Type = TT_RegexLiteral;
+ // Treat regex literals like other string_literals.
+ RegexToken->Tok.setKind(tok::string_literal);
+ RegexToken->TokenText = StringRef(RegexBegin, Offset - RegexBegin);
+ RegexToken->ColumnWidth = RegexToken->TokenText.size();
+
+ resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Offset)));
+}
+
+void FormatTokenLexer::tryParseTemplateString() {
+ FormatToken *BacktickToken = Tokens.back();
+ if (!BacktickToken->is(tok::unknown) || BacktickToken->TokenText != "`")
+ return;
+
+ // 'Manually' lex ahead in the current file buffer.
+ const char *Offset = Lex->getBufferLocation();
+ const char *TmplBegin = Offset - BacktickToken->TokenText.size(); // at "`"
+ for (; Offset != Lex->getBuffer().end() && *Offset != '`'; ++Offset) {
+ if (*Offset == '\\')
+ ++Offset; // Skip the escaped character.
+ }
+
+ StringRef LiteralText(TmplBegin, Offset - TmplBegin + 1);
+ BacktickToken->Type = TT_TemplateString;
+ BacktickToken->Tok.setKind(tok::string_literal);
+ BacktickToken->TokenText = LiteralText;
+
+ // Adjust width for potentially multiline string literals.
+ size_t FirstBreak = LiteralText.find('\n');
+ StringRef FirstLineText = FirstBreak == StringRef::npos
+ ? LiteralText
+ : LiteralText.substr(0, FirstBreak);
+ BacktickToken->ColumnWidth = encoding::columnWidthWithTabs(
+ FirstLineText, BacktickToken->OriginalColumn, Style.TabWidth, Encoding);
+ size_t LastBreak = LiteralText.rfind('\n');
+ if (LastBreak != StringRef::npos) {
+ BacktickToken->IsMultiline = true;
+ unsigned StartColumn = 0; // The template tail spans the entire line.
+ BacktickToken->LastLineColumnWidth = encoding::columnWidthWithTabs(
+ LiteralText.substr(LastBreak + 1, LiteralText.size()), StartColumn,
+ Style.TabWidth, Encoding);
+ }
+
+ resetLexer(SourceMgr.getFileOffset(Lex->getSourceLocation(Offset + 1)));
+}
+
+bool FormatTokenLexer::tryMerge_TMacro() {
+ if (Tokens.size() < 4)
+ return false;
+ FormatToken *Last = Tokens.back();
+ if (!Last->is(tok::r_paren))
+ return false;
+
+ FormatToken *String = Tokens[Tokens.size() - 2];
+ if (!String->is(tok::string_literal) || String->IsMultiline)
+ return false;
+
+ if (!Tokens[Tokens.size() - 3]->is(tok::l_paren))
+ return false;
+
+ FormatToken *Macro = Tokens[Tokens.size() - 4];
+ if (Macro->TokenText != "_T")
+ return false;
+
+ const char *Start = Macro->TokenText.data();
+ const char *End = Last->TokenText.data() + Last->TokenText.size();
+ String->TokenText = StringRef(Start, End - Start);
+ String->IsFirst = Macro->IsFirst;
+ String->LastNewlineOffset = Macro->LastNewlineOffset;
+ String->WhitespaceRange = Macro->WhitespaceRange;
+ String->OriginalColumn = Macro->OriginalColumn;
+ String->ColumnWidth = encoding::columnWidthWithTabs(
+ String->TokenText, String->OriginalColumn, Style.TabWidth, Encoding);
+ String->NewlinesBefore = Macro->NewlinesBefore;
+ String->HasUnescapedNewline = Macro->HasUnescapedNewline;
+
+ Tokens.pop_back();
+ Tokens.pop_back();
+ Tokens.pop_back();
+ Tokens.back() = String;
+ return true;
+}
+
+bool FormatTokenLexer::tryMergeConflictMarkers() {
+ if (Tokens.back()->NewlinesBefore == 0 && Tokens.back()->isNot(tok::eof))
+ return false;
+
+ // Conflict lines look like:
+ // <marker> <text from the vcs>
+ // For example:
+ // >>>>>>> /file/in/file/system at revision 1234
+ //
+ // We merge all tokens in a line that starts with a conflict marker
+ // into a single token with a special token type that the unwrapped line
+ // parser will use to correctly rebuild the underlying code.
+
+ FileID ID;
+ // Get the position of the first token in the line.
+ unsigned FirstInLineOffset;
+ std::tie(ID, FirstInLineOffset) = SourceMgr.getDecomposedLoc(
+ Tokens[FirstInLineIndex]->getStartOfNonWhitespace());
+ StringRef Buffer = SourceMgr.getBuffer(ID)->getBuffer();
+ // Calculate the offset of the start of the current line.
+ auto LineOffset = Buffer.rfind('\n', FirstInLineOffset);
+ if (LineOffset == StringRef::npos) {
+ LineOffset = 0;
+ } else {
+ ++LineOffset;
+ }
+
+ auto FirstSpace = Buffer.find_first_of(" \n", LineOffset);
+ StringRef LineStart;
+ if (FirstSpace == StringRef::npos) {
+ LineStart = Buffer.substr(LineOffset);
+ } else {
+ LineStart = Buffer.substr(LineOffset, FirstSpace - LineOffset);
+ }
+
+ TokenType Type = TT_Unknown;
+ if (LineStart == "<<<<<<<" || LineStart == ">>>>") {
+ Type = TT_ConflictStart;
+ } else if (LineStart == "|||||||" || LineStart == "=======" ||
+ LineStart == "====") {
+ Type = TT_ConflictAlternative;
+ } else if (LineStart == ">>>>>>>" || LineStart == "<<<<") {
+ Type = TT_ConflictEnd;
+ }
+
+ if (Type != TT_Unknown) {
+ FormatToken *Next = Tokens.back();
+
+ Tokens.resize(FirstInLineIndex + 1);
+ // We do not need to build a complete token here, as we will skip it
+ // during parsing anyway (as we must not touch whitespace around conflict
+ // markers).
+ Tokens.back()->Type = Type;
+ Tokens.back()->Tok.setKind(tok::kw___unknown_anytype);
+
+ Tokens.push_back(Next);
+ return true;
+ }
+
+ return false;
+}
+
+FormatToken *FormatTokenLexer::getStashedToken() {
+ // Create a synthesized second '>' or '<' token.
+ Token Tok = FormatTok->Tok;
+ StringRef TokenText = FormatTok->TokenText;
+
+ unsigned OriginalColumn = FormatTok->OriginalColumn;
+ FormatTok = new (Allocator.Allocate()) FormatToken;
+ FormatTok->Tok = Tok;
+ SourceLocation TokLocation =
+ FormatTok->Tok.getLocation().getLocWithOffset(Tok.getLength() - 1);
+ FormatTok->Tok.setLocation(TokLocation);
+ FormatTok->WhitespaceRange = SourceRange(TokLocation, TokLocation);
+ FormatTok->TokenText = TokenText;
+ FormatTok->ColumnWidth = 1;
+ FormatTok->OriginalColumn = OriginalColumn + 1;
+
+ return FormatTok;
+}
+
+FormatToken *FormatTokenLexer::getNextToken() {
+ if (GreaterStashed) {
+ GreaterStashed = false;
+ return getStashedToken();
+ }
+ if (LessStashed) {
+ LessStashed = false;
+ return getStashedToken();
+ }
+
+ FormatTok = new (Allocator.Allocate()) FormatToken;
+ readRawToken(*FormatTok);
+ SourceLocation WhitespaceStart =
+ FormatTok->Tok.getLocation().getLocWithOffset(-TrailingWhitespace);
+ FormatTok->IsFirst = IsFirstToken;
+ IsFirstToken = false;
+
+ // Consume and record whitespace until we find a significant token.
+ unsigned WhitespaceLength = TrailingWhitespace;
+ while (FormatTok->Tok.is(tok::unknown)) {
+ StringRef Text = FormatTok->TokenText;
+ auto EscapesNewline = [&](int pos) {
+ // A '\r' here is just part of '\r\n'. Skip it.
+ if (pos >= 0 && Text[pos] == '\r')
+ --pos;
+ // See whether there is an odd number of '\' before this.
+ unsigned count = 0;
+ for (; pos >= 0; --pos, ++count)
+ if (Text[pos] != '\\')
+ break;
+ return count & 1;
+ };
+ // FIXME: This miscounts tok:unknown tokens that are not just
+ // whitespace, e.g. a '`' character.
+ for (int i = 0, e = Text.size(); i != e; ++i) {
+ switch (Text[i]) {
+ case '\n':
+ ++FormatTok->NewlinesBefore;
+ FormatTok->HasUnescapedNewline = !EscapesNewline(i - 1);
+ FormatTok->LastNewlineOffset = WhitespaceLength + i + 1;
+ Column = 0;
+ break;
+ case '\r':
+ FormatTok->LastNewlineOffset = WhitespaceLength + i + 1;
+ Column = 0;
+ break;
+ case '\f':
+ case '\v':
+ Column = 0;
+ break;
+ case ' ':
+ ++Column;
+ break;
+ case '\t':
+ Column += Style.TabWidth - Column % Style.TabWidth;
+ break;
+ case '\\':
+ if (i + 1 == e || (Text[i + 1] != '\r' && Text[i + 1] != '\n'))
+ FormatTok->Type = TT_ImplicitStringLiteral;
+ break;
+ default:
+ FormatTok->Type = TT_ImplicitStringLiteral;
+ break;
+ }
+ if (FormatTok->Type == TT_ImplicitStringLiteral)
+ break;
+ }
+
+ if (FormatTok->is(TT_ImplicitStringLiteral))
+ break;
+ WhitespaceLength += FormatTok->Tok.getLength();
+
+ readRawToken(*FormatTok);
+ }
+
+ // In case the token starts with escaped newlines, we want to
+ // take them into account as whitespace - this pattern is quite frequent
+ // in macro definitions.
+ // FIXME: Add a more explicit test.
+ while (FormatTok->TokenText.size() > 1 && FormatTok->TokenText[0] == '\\' &&
+ FormatTok->TokenText[1] == '\n') {
+ ++FormatTok->NewlinesBefore;
+ WhitespaceLength += 2;
+ FormatTok->LastNewlineOffset = 2;
+ Column = 0;
+ FormatTok->TokenText = FormatTok->TokenText.substr(2);
+ }
+
+ FormatTok->WhitespaceRange = SourceRange(
+ WhitespaceStart, WhitespaceStart.getLocWithOffset(WhitespaceLength));
+
+ FormatTok->OriginalColumn = Column;
+
+ TrailingWhitespace = 0;
+ if (FormatTok->Tok.is(tok::comment)) {
+ // FIXME: Add the trimmed whitespace to Column.
+ StringRef UntrimmedText = FormatTok->TokenText;
+ FormatTok->TokenText = FormatTok->TokenText.rtrim(" \t\v\f");
+ TrailingWhitespace = UntrimmedText.size() - FormatTok->TokenText.size();
+ } else if (FormatTok->Tok.is(tok::raw_identifier)) {
+ IdentifierInfo &Info = IdentTable.get(FormatTok->TokenText);
+ FormatTok->Tok.setIdentifierInfo(&Info);
+ FormatTok->Tok.setKind(Info.getTokenID());
+ if (Style.Language == FormatStyle::LK_Java &&
+ FormatTok->isOneOf(tok::kw_struct, tok::kw_union, tok::kw_delete,
+ tok::kw_operator)) {
+ FormatTok->Tok.setKind(tok::identifier);
+ FormatTok->Tok.setIdentifierInfo(nullptr);
+ } else if (Style.Language == FormatStyle::LK_JavaScript &&
+ FormatTok->isOneOf(tok::kw_struct, tok::kw_union,
+ tok::kw_operator)) {
+ FormatTok->Tok.setKind(tok::identifier);
+ FormatTok->Tok.setIdentifierInfo(nullptr);
+ }
+ } else if (FormatTok->Tok.is(tok::greatergreater)) {
+ FormatTok->Tok.setKind(tok::greater);
+ FormatTok->TokenText = FormatTok->TokenText.substr(0, 1);
+ GreaterStashed = true;
+ } else if (FormatTok->Tok.is(tok::lessless)) {
+ FormatTok->Tok.setKind(tok::less);
+ FormatTok->TokenText = FormatTok->TokenText.substr(0, 1);
+ LessStashed = true;
+ }
+
+ // Now FormatTok is the next non-whitespace token.
+
+ StringRef Text = FormatTok->TokenText;
+ size_t FirstNewlinePos = Text.find('\n');
+ if (FirstNewlinePos == StringRef::npos) {
+ // FIXME: ColumnWidth actually depends on the start column, we need to
+ // take this into account when the token is moved.
+ FormatTok->ColumnWidth =
+ encoding::columnWidthWithTabs(Text, Column, Style.TabWidth, Encoding);
+ Column += FormatTok->ColumnWidth;
+ } else {
+ FormatTok->IsMultiline = true;
+ // FIXME: ColumnWidth actually depends on the start column, we need to
+ // take this into account when the token is moved.
+ FormatTok->ColumnWidth = encoding::columnWidthWithTabs(
+ Text.substr(0, FirstNewlinePos), Column, Style.TabWidth, Encoding);
+
+ // The last line of the token always starts in column 0.
+ // Thus, the length can be precomputed even in the presence of tabs.
+ FormatTok->LastLineColumnWidth = encoding::columnWidthWithTabs(
+ Text.substr(Text.find_last_of('\n') + 1), 0, Style.TabWidth, Encoding);
+ Column = FormatTok->LastLineColumnWidth;
+ }
+
+ if (Style.Language == FormatStyle::LK_Cpp) {
+ if (!(Tokens.size() > 0 && Tokens.back()->Tok.getIdentifierInfo() &&
+ Tokens.back()->Tok.getIdentifierInfo()->getPPKeywordID() ==
+ tok::pp_define) &&
+ std::find(ForEachMacros.begin(), ForEachMacros.end(),
+ FormatTok->Tok.getIdentifierInfo()) != ForEachMacros.end()) {
+ FormatTok->Type = TT_ForEachMacro;
+ } else if (FormatTok->is(tok::identifier)) {
+ if (MacroBlockBeginRegex.match(Text)) {
+ FormatTok->Type = TT_MacroBlockBegin;
+ } else if (MacroBlockEndRegex.match(Text)) {
+ FormatTok->Type = TT_MacroBlockEnd;
+ }
+ }
+ }
+
+ return FormatTok;
+}
+
+void FormatTokenLexer::readRawToken(FormatToken &Tok) {
+ Lex->LexFromRawLexer(Tok.Tok);
+ Tok.TokenText = StringRef(SourceMgr.getCharacterData(Tok.Tok.getLocation()),
+ Tok.Tok.getLength());
+ // For formatting, treat unterminated string literals like normal string
+ // literals.
+ if (Tok.is(tok::unknown)) {
+ if (!Tok.TokenText.empty() && Tok.TokenText[0] == '"') {
+ Tok.Tok.setKind(tok::string_literal);
+ Tok.IsUnterminatedLiteral = true;
+ } else if (Style.Language == FormatStyle::LK_JavaScript &&
+ Tok.TokenText == "''") {
+ Tok.Tok.setKind(tok::string_literal);
+ }
+ }
+
+ if (Style.Language == FormatStyle::LK_JavaScript &&
+ Tok.is(tok::char_constant)) {
+ Tok.Tok.setKind(tok::string_literal);
+ }
+
+ if (Tok.is(tok::comment) && (Tok.TokenText == "// clang-format on" ||
+ Tok.TokenText == "/* clang-format on */")) {
+ FormattingDisabled = false;
+ }
+
+ Tok.Finalized = FormattingDisabled;
+
+ if (Tok.is(tok::comment) && (Tok.TokenText == "// clang-format off" ||
+ Tok.TokenText == "/* clang-format off */")) {
+ FormattingDisabled = true;
+ }
+}
+
+void FormatTokenLexer::resetLexer(unsigned Offset) {
+ StringRef Buffer = SourceMgr.getBufferData(ID);
+ Lex.reset(new Lexer(SourceMgr.getLocForStartOfFile(ID),
+ getFormattingLangOpts(Style), Buffer.begin(),
+ Buffer.begin() + Offset, Buffer.end()));
+ Lex->SetKeepWhitespaceMode(true);
+ TrailingWhitespace = 0;
+}
+
+} // namespace format
+} // namespace clang
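Note on tryParseJSRegexLiteral in the file above: the manual scan for the end of a regex literal treats a '/' as the terminator unless it is escaped with a backslash or sits inside a '[' ... ']' character class. The following is a standalone sketch of that scan over a `std::string`. The function name `findRegexLiteralEnd` is hypothetical, and unlike the loop in the diff it bounds the index with `<` and clamps the result, so a trailing backslash cannot run past the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: given the index just past the opening '/', returns the
// index one past the closing '/', or Buffer.size() if the literal never closes.
inline std::size_t findRegexLiteralEnd(const std::string &Buffer,
                                       std::size_t Offset) {
  bool InCharacterClass = false;
  bool HaveClosingSlash = false;
  for (; !HaveClosingSlash && Offset < Buffer.size(); ++Offset) {
    switch (Buffer[Offset]) {
    case '\\':
      ++Offset; // Skip the escaped character.
      break;
    case '[':
      InCharacterClass = true;
      break;
    case ']':
      InCharacterClass = false;
      break;
    case '/':
      // A slash inside a character class does not terminate the literal.
      if (!InCharacterClass)
        HaveClosingSlash = true;
      break;
    }
  }
  // Clamp in case an escape at the very end stepped past the buffer.
  return Offset < Buffer.size() ? Offset : Buffer.size();
}
```

For the input `/a[/]b\/c/.test(x)` and a start index of 1, the scan steps over the slash inside the character class and the escaped slash, and stops just after the final '/', so the literal text is the first ten characters.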
diff --git a/contrib/llvm/tools/clang/lib/Format/FormatTokenLexer.h b/contrib/llvm/tools/clang/lib/Format/FormatTokenLexer.h
new file mode 100644
index 0000000..fa8c888
--- /dev/null
+++ b/contrib/llvm/tools/clang/lib/Format/FormatTokenLexer.h
@@ -0,0 +1,97 @@
+//===--- FormatTokenLexer.h - Format C++ code ----------------*- C++ ----*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// \brief This file contains FormatTokenLexer, which tokenizes a source file
+/// into a token stream suitable for ClangFormat.
+///
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_LIB_FORMAT_FORMATTOKENLEXER_H
+#define LLVM_CLANG_LIB_FORMAT_FORMATTOKENLEXER_H
+
+#include "Encoding.h"
+#include "FormatToken.h"
+#include "clang/Basic/SourceLocation.h"
+#include "clang/Basic/SourceManager.h"
+#include "clang/Format/Format.h"
+#include "llvm/Support/Regex.h"
+
+namespace clang {
+namespace format {
+
+class FormatTokenLexer {
+public:
+ FormatTokenLexer(const SourceManager &SourceMgr, FileID ID,
+ const FormatStyle &Style, encoding::Encoding Encoding);
+
+ ArrayRef<FormatToken *> lex();
+
+ const AdditionalKeywords &getKeywords() { return Keywords; }
+
+private:
+ void tryMergePreviousTokens();
+
+ bool tryMergeLessLess();
+
+ bool tryMergeTokens(ArrayRef<tok::TokenKind> Kinds, TokenType NewType);
+
+ // Returns \c true if \p Tok can only be followed by an operand in JavaScript.
+ bool precedesOperand(FormatToken *Tok);
+
+ bool canPrecedeRegexLiteral(FormatToken *Prev);
+
+ // Tries to parse a JavaScript Regex literal starting at the current token,
+ // if that begins with a slash and is in a location where JavaScript allows
+ // regex literals. Changes the current token to a regex literal and updates
+ // its text if successful.
+ void tryParseJSRegexLiteral();
+
+ void tryParseTemplateString();
+
+ bool tryMerge_TMacro();
+
+ bool tryMergeConflictMarkers();
+
+ FormatToken *getStashedToken();
+
+ FormatToken *getNextToken();
+
+ FormatToken *FormatTok;
+ bool IsFirstToken;
+ bool GreaterStashed, LessStashed;
+ unsigned Column;
+ unsigned TrailingWhitespace;
+ std::unique_ptr<Lexer> Lex;
+ const SourceManager &SourceMgr;
+ FileID ID;
+ const FormatStyle &Style;
+ IdentifierTable IdentTable;
+ AdditionalKeywords Keywords;
+ encoding::Encoding Encoding;
+ llvm::SpecificBumpPtrAllocator<FormatToken> Allocator;
+ // Index (in 'Tokens') of the last token that starts a new line.
+ unsigned FirstInLineIndex;
+ SmallVector<FormatToken *, 16> Tokens;
+ SmallVector<IdentifierInfo *, 8> ForEachMacros;
+
+ bool FormattingDisabled;
+
+ llvm::Regex MacroBlockBeginRegex;
+ llvm::Regex MacroBlockEndRegex;
+
+ void readRawToken(FormatToken &Tok);
+
+ void resetLexer(unsigned Offset);
+};
+
+} // namespace format
+} // namespace clang
+
+#endif
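Note on tryMergeConflictMarkers, declared in the header above and implemented in FormatTokenLexer.cpp: it finds the start of the current line, takes the text up to the first space or newline, and compares it against a fixed set of marker strings. The following is a standalone sketch of just that classification step. `ConflictKind` and `classifyConflictLine` are hypothetical names, and the sketch uses `std::string` rather than `StringRef`, so it assumes the offset points at a non-newline character on the line.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the TT_Conflict* token types.
enum class ConflictKind { None, Start, Alternative, End };

// Classifies the line of Buffer that contains OffsetInLine by its first word.
// Assumes OffsetInLine refers to a non-newline character on that line.
inline ConflictKind classifyConflictLine(const std::string &Buffer,
                                         std::size_t OffsetInLine) {
  // Offset of the first character of the line.
  std::size_t LineOffset = Buffer.rfind('\n', OffsetInLine);
  LineOffset = (LineOffset == std::string::npos) ? 0 : LineOffset + 1;

  // The marker is everything up to the first space or newline.
  std::size_t FirstSpace = Buffer.find_first_of(" \n", LineOffset);
  std::string LineStart =
      FirstSpace == std::string::npos
          ? Buffer.substr(LineOffset)
          : Buffer.substr(LineOffset, FirstSpace - LineOffset);

  // Same marker strings as in the diff, including the four-character forms.
  if (LineStart == "<<<<<<<" || LineStart == ">>>>")
    return ConflictKind::Start;
  if (LineStart == "|||||||" || LineStart == "=======" || LineStart == "====")
    return ConflictKind::Alternative;
  if (LineStart == ">>>>>>>" || LineStart == "<<<<")
    return ConflictKind::End;
  return ConflictKind::None;
}
```

Because only the first word of the line is inspected, any trailing text after the marker (a branch name or revision) does not affect the result.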
diff --git a/contrib/llvm/tools/clang/lib/Format/SortJavaScriptImports.cpp b/contrib/llvm/tools/clang/lib/Format/SortJavaScriptImports.cpp
new file mode 100644
index 0000000..32d5d75
--- /dev/null
+++ b/contrib/llvm/tools/clang/lib/Format/SortJavaScriptImports.cpp
@@ -0,0 +1,442 @@
+//===--- SortJavaScriptImports.h - Sort ES6 Imports -------------*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// \brief This file implements a sort operation for JavaScript ES6 imports.
+///
+//===----------------------------------------------------------------------===//
+
+#include "SortJavaScriptImports.h"
+#include "SortJavaScriptImports.h"
+#include "TokenAnalyzer.h"
+#include "TokenAnnotator.h"
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/DiagnosticOptions.h"
+#include "clang/Basic/LLVM.h"
+#include "clang/Basic/SourceLocation.h"
+#include "clang/Basic/SourceManager.h"
+#include "clang/Format/Format.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/Support/Debug.h"
+#include <algorithm>
+#include <string>
+
+#define DEBUG_TYPE "format-formatter"
+
+namespace clang {
+namespace format {
+
+class FormatTokenLexer;
+
+using clang::format::FormatStyle;
+
+// An imported symbol in a JavaScript ES6 import/export, possibly aliased.
+struct JsImportedSymbol {
+ StringRef Symbol;
+ StringRef Alias;
+ SourceRange Range;
+
+ bool operator==(const JsImportedSymbol &RHS) const {
+ // Ignore Range for comparison, it is only used to stitch code together,
+ // but imports at different code locations are still conceptually the same.
+ return Symbol == RHS.Symbol && Alias == RHS.Alias;
+ }
+};
+
+// An ES6 module reference.
+//
+// ES6 implements a module system, where individual modules (~= source files)
+// can reference other modules, either importing symbols from them, or exporting
+// symbols from them:
+// import {foo} from 'foo';
+// export {foo};
+// export {bar} from 'bar';
+//
+// `export`s with URLs are syntactic sugar for an import of the symbol from the
+// URL, followed by an export of the symbol, allowing this code to treat both
+// statements more or less identically, with the exception being that `export`s
+// are sorted last.
+//
+// imports and exports support individual symbols, but also a wildcard syntax:
+// import * as prefix from 'foo';
+// export * from 'bar';
+//
+// This struct represents both exports and imports to build up the information
+// required for sorting module references.
+struct JsModuleReference {
+ bool IsExport = false;
+ // Module references are sorted into these categories, in order.
+ enum ReferenceCategory {
+ SIDE_EFFECT, // "import 'something';"
+ ABSOLUTE, // from 'something'
+ RELATIVE_PARENT, // from '../*'
+ RELATIVE, // from './*'
+ };
+ ReferenceCategory Category = ReferenceCategory::SIDE_EFFECT;
+ // The URL imported, e.g. `import .. from 'url';`. Empty for `export {a, b};`.
+ StringRef URL;
+ // Prefix from "import * as prefix". Empty for symbol imports and `export *`.
+ // Implies an empty names list.
+ StringRef Prefix;
+ // Symbols from `import {SymbolA, SymbolB, ...} from ...;`.
+ SmallVector<JsImportedSymbol, 1> Symbols;
+ // Textual position of the import/export, including preceding and trailing
+ // comments.
+ SourceRange Range;
+};
+
+bool operator<(const JsModuleReference &LHS, const JsModuleReference &RHS) {
+ if (LHS.IsExport != RHS.IsExport)
+ return LHS.IsExport < RHS.IsExport;
+ if (LHS.Category != RHS.Category)
+ return LHS.Category < RHS.Category;
+ if (LHS.Category == JsModuleReference::ReferenceCategory::SIDE_EFFECT)
+ // Side effect imports might be ordering sensitive. Consider them equal so
+ // that they maintain their relative order in the stable sort below.
+ // This retains transitivity because LHS.Category == RHS.Category here.
+ return false;
+ // Empty URLs sort *last* (for export {...};).
+ if (LHS.URL.empty() != RHS.URL.empty())
+ return LHS.URL.empty() < RHS.URL.empty();
+ if (int Res = LHS.URL.compare_lower(RHS.URL))
+ return Res < 0;
+ // '*' imports (with prefix) sort before {a, b, ...} imports.
+ if (LHS.Prefix.empty() != RHS.Prefix.empty())
+ return LHS.Prefix.empty() < RHS.Prefix.empty();
+ if (LHS.Prefix != RHS.Prefix)
+ return LHS.Prefix > RHS.Prefix;
+ return false;
+}
+
+// JavaScriptImportSorter sorts JavaScript ES6 imports and exports. It is
+// implemented as a TokenAnalyzer because ES6 imports have substantial syntactic
+// structure, making it messy to sort them using regular expressions.
+class JavaScriptImportSorter : public TokenAnalyzer {
+public:
+ JavaScriptImportSorter(const Environment &Env, const FormatStyle &Style)
+ : TokenAnalyzer(Env, Style),
+ FileContents(Env.getSourceManager().getBufferData(Env.getFileID())) {}
+
+ tooling::Replacements
+ analyze(TokenAnnotator &Annotator,
+ SmallVectorImpl<AnnotatedLine *> &AnnotatedLines,
+ FormatTokenLexer &Tokens, tooling::Replacements &Result) override {
+ AffectedRangeMgr.computeAffectedLines(AnnotatedLines.begin(),
+ AnnotatedLines.end());
+
+ const AdditionalKeywords &Keywords = Tokens.getKeywords();
+ SmallVector<JsModuleReference, 16> References;
+ AnnotatedLine *FirstNonImportLine;
+ std::tie(References, FirstNonImportLine) =
+ parseModuleReferences(Keywords, AnnotatedLines);
+
+ if (References.empty())
+ return Result;
+
+ SmallVector<unsigned, 16> Indices;
+ for (unsigned i = 0, e = References.size(); i != e; ++i)
+ Indices.push_back(i);
+ std::stable_sort(Indices.begin(), Indices.end(),
+ [&](unsigned LHSI, unsigned RHSI) {
+ return References[LHSI] < References[RHSI];
+ });
+ bool ReferencesInOrder = std::is_sorted(Indices.begin(), Indices.end());
+
+ std::string ReferencesText;
+ bool SymbolsInOrder = true;
+ for (unsigned i = 0, e = Indices.size(); i != e; ++i) {
+ JsModuleReference Reference = References[Indices[i]];
+ if (appendReference(ReferencesText, Reference))
+ SymbolsInOrder = false;
+ if (i + 1 < e) {
+ // Insert breaks between imports and exports.
+ ReferencesText += "\n";
+ // Separate imports groups with two line breaks, but keep all exports
+ // in a single group.
+ if (!Reference.IsExport &&
+ (Reference.IsExport != References[Indices[i + 1]].IsExport ||
+ Reference.Category != References[Indices[i + 1]].Category))
+ ReferencesText += "\n";
+ }
+ }
+
+ if (ReferencesInOrder && SymbolsInOrder)
+ return Result;
+
+ SourceRange InsertionPoint = References[0].Range;
+ InsertionPoint.setEnd(References[References.size() - 1].Range.getEnd());
+
+ // The loop above might collapse previously existing line breaks between
+ // import blocks, and thus shrink the file. SortIncludes must not shrink
+ // overall source length as there is currently no re-calculation of ranges
+ // after applying source sorting.
+ // This loop just backfills trailing spaces after the imports, which are
+ // harmless and will be stripped by the subsequent formatting pass.
+ // FIXME: A better long term fix is to re-calculate Ranges after sorting.
+ unsigned PreviousSize = getSourceText(InsertionPoint).size();
+ while (ReferencesText.size() < PreviousSize) {
+ ReferencesText += " ";
+ }
+
+ // Separate references from the main code body of the file.
+ if (FirstNonImportLine && FirstNonImportLine->First->NewlinesBefore < 2)
+ ReferencesText += "\n";
+
+ DEBUG(llvm::dbgs() << "Replacing imports:\n"
+ << getSourceText(InsertionPoint) << "\nwith:\n"
+ << ReferencesText << "\n");
+ Result.insert(tooling::Replacement(
+ Env.getSourceManager(), CharSourceRange::getCharRange(InsertionPoint),
+ ReferencesText));
+
+ return Result;
+ }
+
+private:
+ FormatToken *Current;
+ FormatToken *LineEnd;
+
+ FormatToken invalidToken;
+
+ StringRef FileContents;
+
+ void skipComments() { Current = skipComments(Current); }
+
+ FormatToken *skipComments(FormatToken *Tok) {
+ while (Tok && Tok->is(tok::comment))
+ Tok = Tok->Next;
+ return Tok;
+ }
+
+ void nextToken() {
+ Current = Current->Next;
+ skipComments();
+ if (!Current || Current == LineEnd->Next) {
+ // Set the current token to an invalid token, so that further parsing on
+ // this line fails.
+ invalidToken.Tok.setKind(tok::unknown);
+ Current = &invalidToken;
+ }
+ }
+
+ StringRef getSourceText(SourceRange Range) {
+ return getSourceText(Range.getBegin(), Range.getEnd());
+ }
+
+ StringRef getSourceText(SourceLocation Begin, SourceLocation End) {
+ const SourceManager &SM = Env.getSourceManager();
+ return FileContents.substr(SM.getFileOffset(Begin),
+ SM.getFileOffset(End) - SM.getFileOffset(Begin));
+ }
+
+ // Appends ``Reference`` to ``Buffer``, returning true if text within the
+ // ``Reference`` changed (e.g. symbol order).
+ bool appendReference(std::string &Buffer, JsModuleReference &Reference) {
+ // Sort the individual symbols within the import.
+ // E.g. `import {b, a} from 'x';` -> `import {a, b} from 'x';`
+ SmallVector<JsImportedSymbol, 1> Symbols = Reference.Symbols;
+ std::stable_sort(
+ Symbols.begin(), Symbols.end(),
+ [&](const JsImportedSymbol &LHS, const JsImportedSymbol &RHS) {
+ return LHS.Symbol.compare_lower(RHS.Symbol) < 0;
+ });
+ if (Symbols == Reference.Symbols) {
+ // No change in symbol order.
+ StringRef ReferenceStmt = getSourceText(Reference.Range);
+ Buffer += ReferenceStmt;
+ return false;
+ }
+ // Stitch together the module reference start...
+ SourceLocation SymbolsStart = Reference.Symbols.front().Range.getBegin();
+ SourceLocation SymbolsEnd = Reference.Symbols.back().Range.getEnd();
+ Buffer += getSourceText(Reference.Range.getBegin(), SymbolsStart);
+ // ... then the references in order ...
+ for (auto I = Symbols.begin(), E = Symbols.end(); I != E; ++I) {
+ if (I != Symbols.begin())
+ Buffer += ",";
+ Buffer += getSourceText(I->Range);
+ }
+ // ... followed by the module reference end.
+ Buffer += getSourceText(SymbolsEnd, Reference.Range.getEnd());
+ return true;
+ }
+
+ // Parses module references in the given lines. Returns the module references,
+ // and a pointer to the first "main code" line if that is adjacent to the
+ // affected lines of module references, nullptr otherwise.
+ std::pair<SmallVector<JsModuleReference, 16>, AnnotatedLine*>
+ parseModuleReferences(const AdditionalKeywords &Keywords,
+ SmallVectorImpl<AnnotatedLine *> &AnnotatedLines) {
+ SmallVector<JsModuleReference, 16> References;
+ SourceLocation Start;
+ bool FoundLines = false;
+ AnnotatedLine *FirstNonImportLine = nullptr;
+ for (auto Line : AnnotatedLines) {
+ if (!Line->Affected) {
+ // Only sort the first contiguous block of affected lines.
+ if (FoundLines)
+ break;
+ else
+ continue;
+ }
+ Current = Line->First;
+ LineEnd = Line->Last;
+ skipComments();
+ if (Start.isInvalid() || References.empty())
+ // After the first file level comment, consider line comments to be part
+ // of the import that immediately follows them by using the previously
+ // set Start.
+ Start = Line->First->Tok.getLocation();
+ if (!Current)
+ continue; // Only comments on this line.
+ FoundLines = true;
+ JsModuleReference Reference;
+ Reference.Range.setBegin(Start);
+ if (!parseModuleReference(Keywords, Reference)) {
+ FirstNonImportLine = Line;
+ break;
+ }
+ Reference.Range.setEnd(LineEnd->Tok.getEndLoc());
+ DEBUG({
+ llvm::dbgs() << "JsModuleReference: {"
+ << "is_export: " << Reference.IsExport
+ << ", cat: " << Reference.Category
+ << ", url: " << Reference.URL
+ << ", prefix: " << Reference.Prefix;
+ for (size_t i = 0; i < Reference.Symbols.size(); ++i)
+ llvm::dbgs() << ", " << Reference.Symbols[i].Symbol << " as "
+ << Reference.Symbols[i].Alias;
+ llvm::dbgs() << ", text: " << getSourceText(Reference.Range);
+ llvm::dbgs() << "}\n";
+ });
+ References.push_back(Reference);
+ Start = SourceLocation();
+ }
+ return std::make_pair(References, FirstNonImportLine);
+ }
+
+ // Parses a JavaScript/ECMAScript 6 module reference.
+ // See http://www.ecma-international.org/ecma-262/6.0/#sec-scripts-and-modules
+ // for grammar EBNF (production ModuleItem).
+ bool parseModuleReference(const AdditionalKeywords &Keywords,
+ JsModuleReference &Reference) {
+ if (!Current || !Current->isOneOf(Keywords.kw_import, tok::kw_export))
+ return false;
+ Reference.IsExport = Current->is(tok::kw_export);
+
+ nextToken();
+ if (Current->isStringLiteral() && !Reference.IsExport) {
+ // "import 'side-effect';"
+ Reference.Category = JsModuleReference::ReferenceCategory::SIDE_EFFECT;
+ Reference.URL =
+ Current->TokenText.substr(1, Current->TokenText.size() - 2);
+ return true;
+ }
+
+ if (!parseModuleBindings(Keywords, Reference))
+ return false;
+ nextToken();
+
+ if (Current->is(Keywords.kw_from)) {
+ // imports have a 'from' clause, exports might not.
+ nextToken();
+ if (!Current->isStringLiteral())
+ return false;
+ // URL = TokenText without the quotes.
+ Reference.URL =
+ Current->TokenText.substr(1, Current->TokenText.size() - 2);
+ if (Reference.URL.startswith(".."))
+ Reference.Category =
+ JsModuleReference::ReferenceCategory::RELATIVE_PARENT;
+ else if (Reference.URL.startswith("."))
+ Reference.Category = JsModuleReference::ReferenceCategory::RELATIVE;
+ else
+ Reference.Category = JsModuleReference::ReferenceCategory::ABSOLUTE;
+ } else {
+ // References without a URL are grouped with the relative ones.
+ Reference.Category = JsModuleReference::ReferenceCategory::RELATIVE;
+ }
+ return true;
+ }
+
+ bool parseModuleBindings(const AdditionalKeywords &Keywords,
+ JsModuleReference &Reference) {
+ if (parseStarBinding(Keywords, Reference))
+ return true;
+ return parseNamedBindings(Keywords, Reference);
+ }
+
+ bool parseStarBinding(const AdditionalKeywords &Keywords,
+ JsModuleReference &Reference) {
+ // * as prefix from '...';
+ if (Current->isNot(tok::star))
+ return false;
+ nextToken();
+ if (Current->isNot(Keywords.kw_as))
+ return false;
+ nextToken();
+ if (Current->isNot(tok::identifier))
+ return false;
+ Reference.Prefix = Current->TokenText;
+ return true;
+ }
+
+ bool parseNamedBindings(const AdditionalKeywords &Keywords,
+ JsModuleReference &Reference) {
+ if (Current->isNot(tok::l_brace))
+ return false;
+
+ // {sym as alias, sym2 as ...} from '...';
+ nextToken();
+ while (true) {
+ if (Current->is(tok::r_brace))
+ return true;
+ if (Current->isNot(tok::identifier))
+ return false;
+
+ JsImportedSymbol Symbol;
+ Symbol.Symbol = Current->TokenText;
+ // Make sure to include any preceding comments.
+ Symbol.Range.setBegin(
+ Current->getPreviousNonComment()->Next->WhitespaceRange.getBegin());
+ nextToken();
+
+ if (Current->is(Keywords.kw_as)) {
+ nextToken();
+ if (Current->isNot(tok::identifier))
+ return false;
+ Symbol.Alias = Current->TokenText;
+ nextToken();
+ }
+ Symbol.Range.setEnd(Current->Tok.getLocation());
+ Reference.Symbols.push_back(Symbol);
+
+ if (Current->is(tok::r_brace))
+ return true;
+ if (Current->isNot(tok::comma))
+ return false;
+ nextToken();
+ }
+ }
+};
+
+tooling::Replacements sortJavaScriptImports(const FormatStyle &Style,
+ StringRef Code,
+ ArrayRef<tooling::Range> Ranges,
+ StringRef FileName) {
+ // FIXME: Cursor support.
+ std::unique_ptr<Environment> Env =
+ Environment::CreateVirtualEnvironment(Code, FileName, Ranges);
+ JavaScriptImportSorter Sorter(*Env, Style);
+ return Sorter.process();
+}
+
+} // end namespace format
+} // end namespace clang
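
Note on the sorter above: the two ideas in appendReference() and the
backfill loop can be tried outside of clang with a minimal sketch. Everything
below is an assumption made for illustration and is not part of this commit:
std::string stands in for StringRef, and lessIgnoringCase, sortSymbols and
padToPreviousSize are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for `LHS.compare_lower(RHS) < 0`: compares two names
// character by character after lowercasing them.
static bool lessIgnoringCase(const std::string &LHS, const std::string &RHS) {
  return std::lexicographical_compare(
      LHS.begin(), LHS.end(), RHS.begin(), RHS.end(),
      [](unsigned char A, unsigned char B) {
        return std::tolower(A) < std::tolower(B);
      });
}

// Returns a case-insensitively sorted copy of Symbols. Changed is set to true
// only if sorting altered the order, mirroring the return value of
// appendReference(). stable_sort keeps names that differ only in case in
// their original relative order.
static std::vector<std::string> sortSymbols(std::vector<std::string> Symbols,
                                            bool &Changed) {
  const std::vector<std::string> Original = Symbols;
  std::stable_sort(Symbols.begin(), Symbols.end(), lessIgnoringCase);
  Changed = Symbols != Original;
  return Symbols;
}

// Backfills trailing spaces so the replacement text is never shorter than the
// text it replaces; longer text is returned unchanged.
static std::string padToPreviousSize(std::string Text,
                                     std::size_t PreviousSize) {
  if (Text.size() < PreviousSize)
    Text.append(PreviousSize - Text.size(), ' ');
  assert(Text.size() >= PreviousSize);
  return Text;
}
```

Under these assumptions, sortSymbols({"b", "A", "a"}, Changed) yields
A, a, b with Changed set, and padToPreviousSize("ab", 5) yields "ab" followed
by three spaces. Padding rather than shrinking matches the FIXME in the
patch: ranges are not re-calculated after sorting, so the replacement must
not be shorter than the original text.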
diff --git a/contrib/llvm/tools/clang/lib/Format/SortJavaScriptImports.h b/contrib/llvm/tools/clang/lib/Format/SortJavaScriptImports.h
new file mode 100644
index 0000000..f22a051
--- /dev/null
+++ b/contrib/llvm/tools/clang/lib/Format/SortJavaScriptImports.h
@@ -0,0 +1,36 @@
+//===--- SortJavaScriptImports.h - Sort ES6 Imports -------------*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// \brief This file implements a sorter for JavaScript ES6 imports.
+///
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_LIB_FORMAT_SORTJAVASCRIPTIMPORTS_H
+#define LLVM_CLANG_LIB_FORMAT_SORTJAVASCRIPTIMPORTS_H
+
+#include "clang/Basic/LLVM.h"
+#include "clang/Format/Format.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/StringRef.h"
+
+namespace clang {
+namespace format {
+
+// Sort JavaScript ES6 imports/exports in ``Code``. The generated replacements
+// only monotonically increase the length of the given code.
+tooling::Replacements sortJavaScriptImports(const FormatStyle &Style,
+ StringRef Code,
+ ArrayRef<tooling::Range> Ranges,
+ StringRef FileName);
+
+} // end namespace format
+} // end namespace clang
+
+#endif
diff --git a/contrib/llvm/tools/clang/lib/Format/TokenAnalyzer.cpp b/contrib/llvm/tools/clang/lib/Format/TokenAnalyzer.cpp
new file mode 100644
index 0000000..89ac35f
--- /dev/null
+++ b/contrib/llvm/tools/clang/lib/Format/TokenAnalyzer.cpp
@@ -0,0 +1,138 @@
+//===--- TokenAnalyzer.cpp - Analyze Token Streams --------------*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// \brief This file implements an abstract TokenAnalyzer and associated helper
+/// classes. TokenAnalyzer can be extended to generate replacements based on
+/// an annotated and pre-processed token stream.
+///
+//===----------------------------------------------------------------------===//
+
+#include "TokenAnalyzer.h"
+#include "AffectedRangeManager.h"
+#include "Encoding.h"
+#include "FormatToken.h"
+#include "FormatTokenLexer.h"
+#include "TokenAnnotator.h"
+#include "UnwrappedLineParser.h"
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/DiagnosticOptions.h"
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/SourceManager.h"
+#include "clang/Format/Format.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/Support/Debug.h"
+
+#define DEBUG_TYPE "format-formatter"
+
+namespace clang {
+namespace format {
+
+// This sets up a virtual file system with file \p FileName containing \p
+// Code.
+std::unique_ptr<Environment>
+Environment::CreateVirtualEnvironment(StringRef Code, StringRef FileName,
+ ArrayRef<tooling::Range> Ranges) {
+ // This is referenced by `FileMgr` and will be released by `FileMgr` when it
+ // is deleted.
+ IntrusiveRefCntPtr<vfs::InMemoryFileSystem> InMemoryFileSystem(
+ new vfs::InMemoryFileSystem);
+ // This is passed to `SM` as reference, so the pointer has to be referenced
+ // in `Environment` so that `FileMgr` can out-live this function scope.
+ std::unique_ptr<FileManager> FileMgr(
+ new FileManager(FileSystemOptions(), InMemoryFileSystem));
+ // This is passed to `SM` as reference, so the pointer has to be referenced
+ // by `Environment` for the same reason as above.
+ std::unique_ptr<DiagnosticsEngine> Diagnostics(new DiagnosticsEngine(
+ IntrusiveRefCntPtr<DiagnosticIDs>(new DiagnosticIDs),
+ new DiagnosticOptions));
+ // This will be stored as reference, so the pointer has to be stored in
+ // `Environment` for the same reason as above.
+ std::unique_ptr<SourceManager> VirtualSM(
+ new SourceManager(*Diagnostics, *FileMgr));
+ InMemoryFileSystem->addFile(
+ FileName, 0, llvm::MemoryBuffer::getMemBuffer(
+ Code, FileName, /*RequiresNullTerminator=*/false));
+ FileID ID = VirtualSM->createFileID(FileMgr->getFile(FileName),
+ SourceLocation(), clang::SrcMgr::C_User);
+ assert(ID.isValid());
+ SourceLocation StartOfFile = VirtualSM->getLocForStartOfFile(ID);
+ std::vector<CharSourceRange> CharRanges;
+ for (const tooling::Range &Range : Ranges) {
+ SourceLocation Start = StartOfFile.getLocWithOffset(Range.getOffset());
+ SourceLocation End = Start.getLocWithOffset(Range.getLength());
+ CharRanges.push_back(CharSourceRange::getCharRange(Start, End));
+ }
+ return llvm::make_unique<Environment>(ID, std::move(FileMgr),
+ std::move(VirtualSM),
+ std::move(Diagnostics), CharRanges);
+}
+
+TokenAnalyzer::TokenAnalyzer(const Environment &Env, const FormatStyle &Style)
+ : Style(Style), Env(Env),
+ AffectedRangeMgr(Env.getSourceManager(), Env.getCharRanges()),
+ UnwrappedLines(1),
+ Encoding(encoding::detectEncoding(
+ Env.getSourceManager().getBufferData(Env.getFileID()))) {
+ DEBUG(
+ llvm::dbgs() << "File encoding: "
+ << (Encoding == encoding::Encoding_UTF8 ? "UTF8" : "unknown")
+ << "\n");
+ DEBUG(llvm::dbgs() << "Language: " << getLanguageName(Style.Language)
+ << "\n");
+}
+
+tooling::Replacements TokenAnalyzer::process() {
+ tooling::Replacements Result;
+ FormatTokenLexer Tokens(Env.getSourceManager(), Env.getFileID(), Style,
+ Encoding);
+
+ UnwrappedLineParser Parser(Style, Tokens.getKeywords(), Tokens.lex(), *this);
+ Parser.parse();
+ assert(UnwrappedLines.rbegin()->empty());
+ for (unsigned Run = 0, RunE = UnwrappedLines.size(); Run + 1 != RunE; ++Run) {
+ DEBUG(llvm::dbgs() << "Run " << Run << "...\n");
+ SmallVector<AnnotatedLine *, 16> AnnotatedLines;
+
+ TokenAnnotator Annotator(Style, Tokens.getKeywords());
+ for (unsigned i = 0, e = UnwrappedLines[Run].size(); i != e; ++i) {
+ AnnotatedLines.push_back(new AnnotatedLine(UnwrappedLines[Run][i]));
+ Annotator.annotate(*AnnotatedLines.back());
+ }
+
+ tooling::Replacements RunResult =
+ analyze(Annotator, AnnotatedLines, Tokens, Result);
+
+ DEBUG({
+ llvm::dbgs() << "Replacements for run " << Run << ":\n";
+ for (tooling::Replacements::iterator I = RunResult.begin(),
+ E = RunResult.end();
+ I != E; ++I) {
+ llvm::dbgs() << I->toString() << "\n";
+ }
+ });
+ for (unsigned i = 0, e = AnnotatedLines.size(); i != e; ++i) {
+ delete AnnotatedLines[i];
+ }
+ Result.insert(RunResult.begin(), RunResult.end());
+ }
+ return Result;
+}
+
+void TokenAnalyzer::consumeUnwrappedLine(const UnwrappedLine &TheLine) {
+ assert(!UnwrappedLines.empty());
+ UnwrappedLines.back().push_back(TheLine);
+}
+
+void TokenAnalyzer::finishRun() {
+ UnwrappedLines.push_back(SmallVector<UnwrappedLine, 16>());
+}
+
+} // end namespace format
+} // end namespace clang
diff --git a/contrib/llvm/tools/clang/lib/Format/TokenAnalyzer.h b/contrib/llvm/tools/clang/lib/Format/TokenAnalyzer.h
new file mode 100644
index 0000000..c1aa9c5
--- /dev/null
+++ b/contrib/llvm/tools/clang/lib/Format/TokenAnalyzer.h
@@ -0,0 +1,108 @@
+//===--- TokenAnalyzer.h - Analyze Token Streams ----------------*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+/// \brief This file declares an abstract TokenAnalyzer, and associated helper
+/// classes. TokenAnalyzer can be extended to generate replacements based on
+/// an annotated and pre-processed token stream.
+///
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_LIB_FORMAT_TOKENANALYZER_H
+#define LLVM_CLANG_LIB_FORMAT_TOKENANALYZER_H
+
+#include "AffectedRangeManager.h"
+#include "Encoding.h"
+#include "FormatToken.h"
+#include "FormatTokenLexer.h"
+#include "TokenAnnotator.h"
+#include "UnwrappedLineParser.h"
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/DiagnosticOptions.h"
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/SourceManager.h"
+#include "clang/Format/Format.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/Support/Debug.h"
+
+#define DEBUG_TYPE "format-formatter"
+
+namespace clang {
+namespace format {
+
+class Environment {
+public:
+ Environment(SourceManager &SM, FileID ID, ArrayRef<CharSourceRange> Ranges)
+ : ID(ID), CharRanges(Ranges.begin(), Ranges.end()), SM(SM) {}
+
+ Environment(FileID ID, std::unique_ptr<FileManager> FileMgr,
+ std::unique_ptr<SourceManager> VirtualSM,
+ std::unique_ptr<DiagnosticsEngine> Diagnostics,
+ const std::vector<CharSourceRange> &CharRanges)
+ : ID(ID), CharRanges(CharRanges.begin(), CharRanges.end()),
+ SM(*VirtualSM), FileMgr(std::move(FileMgr)),
+ VirtualSM(std::move(VirtualSM)), Diagnostics(std::move(Diagnostics)) {}
+
+ // This sets up a virtual file system with file \p FileName containing \p
+ // Code.
+ static std::unique_ptr<Environment>
+ CreateVirtualEnvironment(StringRef Code, StringRef FileName,
+ ArrayRef<tooling::Range> Ranges);
+
+ FileID getFileID() const { return ID; }
+
+ StringRef getFileName() const { return FileName; }
+
+ ArrayRef<CharSourceRange> getCharRanges() const { return CharRanges; }
+
+ const SourceManager &getSourceManager() const { return SM; }
+
+private:
+ FileID ID;
+ StringRef FileName;
+ SmallVector<CharSourceRange, 8> CharRanges;
+ SourceManager &SM;
+
+ // The order of these fields is important - they should be in the same order
+ // as they are created in `CreateVirtualEnvironment` so that they can be
+ // deleted in the reverse order of their creation.
+ std::unique_ptr<FileManager> FileMgr;
+ std::unique_ptr<SourceManager> VirtualSM;
+ std::unique_ptr<DiagnosticsEngine> Diagnostics;
+};
+
+class TokenAnalyzer : public UnwrappedLineConsumer {
+public:
+ TokenAnalyzer(const Environment &Env, const FormatStyle &Style);
+
+ tooling::Replacements process();
+
+protected:
+ virtual tooling::Replacements
+ analyze(TokenAnnotator &Annotator,
+ SmallVectorImpl<AnnotatedLine *> &AnnotatedLines,
+ FormatTokenLexer &Tokens, tooling::Replacements &Result) = 0;
+
+ void consumeUnwrappedLine(const UnwrappedLine &TheLine) override;
+
+ void finishRun() override;
+
+ FormatStyle Style;
+ // Stores Style, FileID and SourceManager etc.
+ const Environment &Env;
+ // AffectedRangeMgr stores ranges to be fixed.
+ AffectedRangeManager AffectedRangeMgr;
+ SmallVector<SmallVector<UnwrappedLine, 16>, 2> UnwrappedLines;
+ encoding::Encoding Encoding;
+};
+
+} // end namespace format
+} // end namespace clang
+
+#endif
diff --git a/contrib/llvm/tools/clang/lib/Format/TokenAnnotator.cpp b/contrib/llvm/tools/clang/lib/Format/TokenAnnotator.cpp
index 8fbb43b..4a90522 100644
--- a/contrib/llvm/tools/clang/lib/Format/TokenAnnotator.cpp
+++ b/contrib/llvm/tools/clang/lib/Format/TokenAnnotator.cpp
@@ -42,11 +42,24 @@ public:
private:
bool parseAngle() {
- if (!CurrentToken)
+ if (!CurrentToken || !CurrentToken->Previous)
+ return false;
+ if (NonTemplateLess.count(CurrentToken->Previous))
return false;
+
+ const FormatToken& Previous = *CurrentToken->Previous;
+ if (Previous.Previous) {
+ if (Previous.Previous->Tok.isLiteral())
+ return false;
+ if (Previous.Previous->is(tok::r_paren) && Contexts.size() > 1 &&
+ (!Previous.Previous->MatchingParen ||
+ !Previous.Previous->MatchingParen->is(TT_OverloadedOperatorLParen)))
+ return false;
+ }
+
FormatToken *Left = CurrentToken->Previous;
Left->ParentBracket = Contexts.back().ContextKind;
- ScopedContextCreator ContextCreator(*this, tok::less, 10);
+ ScopedContextCreator ContextCreator(*this, tok::less, 12);
// If this angle is in the context of an expression, we need to be more
// hesitant to detect it as opening template parameters.
@@ -121,6 +134,10 @@ private:
if (Left->is(TT_OverloadedOperatorLParen)) {
Contexts.back().IsExpression = false;
+ } else if (Style.Language == FormatStyle::LK_JavaScript &&
+ Line.startsWith(Keywords.kw_type, tok::identifier)) {
+ // type X = (...);
+ Contexts.back().IsExpression = false;
} else if (Left->Previous &&
(Left->Previous->isOneOf(tok::kw_static_assert, tok::kw_decltype,
tok::kw_if, tok::kw_while, tok::l_paren,
@@ -128,6 +145,16 @@ private:
Left->Previous->is(TT_BinaryOperator))) {
// static_assert, if and while usually contain expressions.
Contexts.back().IsExpression = true;
+ } else if (Style.Language == FormatStyle::LK_JavaScript && Left->Previous &&
+ (Left->Previous->is(Keywords.kw_function) ||
+ (Left->Previous->endsSequence(tok::identifier,
+ Keywords.kw_function)))) {
+ // function(...) or function f(...)
+ Contexts.back().IsExpression = false;
+ } else if (Style.Language == FormatStyle::LK_JavaScript && Left->Previous &&
+ Left->Previous->is(TT_JsTypeColon)) {
+ // let x: (SomeType);
+ Contexts.back().IsExpression = false;
} else if (Left->Previous && Left->Previous->is(tok::r_square) &&
Left->Previous->MatchingParen &&
Left->Previous->MatchingParen->is(TT_LambdaLSquare)) {
@@ -159,8 +186,8 @@ private:
Left->Type = TT_ObjCMethodExpr;
}
- bool MightBeFunctionType = CurrentToken->isOneOf(tok::star, tok::amp) &&
- !Contexts[Contexts.size() - 2].IsExpression;
+ bool MightBeFunctionType = !Contexts[Contexts.size() - 2].IsExpression;
+ bool ProbablyFunctionType = CurrentToken->isOneOf(tok::star, tok::amp);
bool HasMultipleLines = false;
bool HasMultipleParametersOnALine = false;
bool MightBeObjCForRangeLoop =
@@ -187,14 +214,15 @@ private:
if (CurrentToken->Previous->is(TT_PointerOrReference) &&
CurrentToken->Previous->Previous->isOneOf(tok::l_paren,
tok::coloncolon))
- MightBeFunctionType = true;
+ ProbablyFunctionType = true;
+ if (CurrentToken->is(tok::comma))
+ MightBeFunctionType = false;
if (CurrentToken->Previous->is(TT_BinaryOperator))
Contexts.back().IsExpression = true;
if (CurrentToken->is(tok::r_paren)) {
- if (MightBeFunctionType && CurrentToken->Next &&
+ if (MightBeFunctionType && ProbablyFunctionType && CurrentToken->Next &&
(CurrentToken->Next->is(tok::l_paren) ||
- (CurrentToken->Next->is(tok::l_square) &&
- Line.MustBeDeclaration)))
+ (CurrentToken->Next->is(tok::l_square) && Line.MustBeDeclaration)))
Left->Type = TT_FunctionTypeLParen;
Left->MatchingParen = CurrentToken;
CurrentToken->MatchingParen = Left;
@@ -299,9 +327,9 @@ private:
Left->Type = TT_JsComputedPropertyName;
} else if (Style.Language == FormatStyle::LK_Proto ||
(Parent &&
- Parent->isOneOf(TT_BinaryOperator, tok::at, tok::comma,
- tok::l_paren, tok::l_square, tok::question,
- tok::colon, tok::kw_return,
+ Parent->isOneOf(TT_BinaryOperator, TT_TemplateCloser, tok::at,
+ tok::comma, tok::l_paren, tok::l_square,
+ tok::question, tok::colon, tok::kw_return,
// Should only be relevant to JavaScript:
tok::kw_default))) {
Left->Type = TT_ArrayInitializerLSquare;
@@ -396,7 +424,8 @@ private:
(!Contexts.back().ColonIsDictLiteral ||
Style.Language != FormatStyle::LK_Cpp)) ||
Style.Language == FormatStyle::LK_Proto) &&
- Previous->Tok.getIdentifierInfo())
+ (Previous->Tok.getIdentifierInfo() ||
+ Previous->is(tok::string_literal)))
Previous->Type = TT_SelectorName;
if (CurrentToken->is(tok::colon) ||
Style.Language == FormatStyle::LK_JavaScript)
@@ -410,7 +439,7 @@ private:
}
void updateParameterCount(FormatToken *Left, FormatToken *Current) {
- if (Current->is(tok::l_brace) && !Current->is(TT_DictLiteral))
+ if (Current->is(tok::l_brace) && Current->BlockKind == BK_Block)
++Left->BlockParameterCount;
if (Current->is(tok::comma)) {
++Left->ParameterCount;
@@ -491,7 +520,7 @@ private:
Tok->Type = TT_BitFieldColon;
} else if (Contexts.size() == 1 &&
!Line.First->isOneOf(tok::kw_enum, tok::kw_case)) {
- if (Tok->Previous->is(tok::r_paren))
+ if (Tok->Previous->isOneOf(tok::r_paren, tok::kw_noexcept))
Tok->Type = TT_CtorInitializerColon;
else
Tok->Type = TT_InheritanceColon;
@@ -504,6 +533,14 @@ private:
Tok->Type = TT_InlineASMColon;
}
break;
+ case tok::pipe:
+ case tok::amp:
+ // | and & in declarations/type expressions represent union and
+ // intersection types, respectively.
+ if (Style.Language == FormatStyle::LK_JavaScript &&
+ !Contexts.back().IsExpression)
+ Tok->Type = TT_JsTypeOperator;
+ break;
case tok::kw_if:
case tok::kw_while:
if (CurrentToken && CurrentToken->is(tok::l_paren)) {
@@ -513,6 +550,9 @@ private:
}
break;
case tok::kw_for:
+ if (Style.Language == FormatStyle::LK_JavaScript && Tok->Previous &&
+ Tok->Previous->is(tok::period))
+ break;
Contexts.back().ColonIsForRangeExpr = true;
next();
if (!parseParens())
@@ -550,11 +590,7 @@ private:
return false;
break;
case tok::less:
- if (!NonTemplateLess.count(Tok) &&
- (!Tok->Previous ||
- (!Tok->Previous->Tok.isLiteral() &&
- !(Tok->Previous->is(tok::r_paren) && Contexts.size() > 1))) &&
- parseAngle()) {
+ if (parseAngle()) {
Tok->Type = TT_TemplateOpener;
} else {
Tok->Type = TT_BinaryOperator;
@@ -603,7 +639,7 @@ private:
}
// Declarations cannot be conditional expressions, this can only be part
// of a type declaration.
- if (Line.MustBeDeclaration &&
+ if (Line.MustBeDeclaration && !Contexts.back().IsExpression &&
Style.Language == FormatStyle::LK_JavaScript)
break;
parseConditional();
@@ -666,10 +702,24 @@ private:
}
LineType parsePreprocessorDirective() {
+ bool IsFirstToken = CurrentToken->IsFirst;
LineType Type = LT_PreprocessorDirective;
next();
if (!CurrentToken)
return Type;
+
+ if (Style.Language == FormatStyle::LK_JavaScript && IsFirstToken) {
+ // JavaScript files can contain shebang lines of the form:
+ // #!/usr/bin/env node
+ // Treat these like C++ #include directives.
+ while (CurrentToken) {
+ // Tokens cannot be comments here.
+ CurrentToken->Type = TT_ImplicitStringLiteral;
+ next();
+ }
+ return LT_ImportStatement;
+ }
+
if (CurrentToken->Tok.is(tok::numeric_constant)) {
CurrentToken->SpacesRequiredBefore = 1;
return Type;
@@ -745,11 +795,29 @@ public:
bool KeywordVirtualFound = false;
bool ImportStatement = false;
+
+ // import {...} from '...';
+ if (Style.Language == FormatStyle::LK_JavaScript &&
+ CurrentToken->is(Keywords.kw_import))
+ ImportStatement = true;
+
while (CurrentToken) {
if (CurrentToken->is(tok::kw_virtual))
KeywordVirtualFound = true;
- if (isImportStatement(*CurrentToken))
- ImportStatement = true;
+ if (Style.Language == FormatStyle::LK_JavaScript) {
+ // export {...} from '...';
+ // An export followed by "from 'some string';" is a re-export from
+ // another module identified by a URI and is treated as a
+ // LT_ImportStatement (i.e. prevent wraps on it for long URIs).
+ // Just "export {...};" or "export class ..." should not be treated as
+ // an import in this sense.
+ if (Line.First->is(tok::kw_export) &&
+ CurrentToken->is(Keywords.kw_from) && CurrentToken->Next &&
+ CurrentToken->Next->isStringLiteral())
+ ImportStatement = true;
+ if (isClosureImportStatement(*CurrentToken))
+ ImportStatement = true;
+ }
if (!consumeToken())
return LT_Invalid;
}
@@ -769,15 +837,15 @@ public:
}
private:
- bool isImportStatement(const FormatToken &Tok) {
+ bool isClosureImportStatement(const FormatToken &Tok) {
// FIXME: Closure-library specific stuff should not be hard-coded but be
// configurable.
- return Style.Language == FormatStyle::LK_JavaScript &&
- Tok.TokenText == "goog" && Tok.Next && Tok.Next->is(tok::period) &&
+ return Tok.TokenText == "goog" && Tok.Next && Tok.Next->is(tok::period) &&
Tok.Next->Next && (Tok.Next->Next->TokenText == "module" ||
Tok.Next->Next->TokenText == "provide" ||
Tok.Next->Next->TokenText == "require" ||
- Tok.Next->Next->TokenText == "setTestOnly") &&
+ Tok.Next->Next->TokenText == "setTestOnly" ||
+ Tok.Next->Next->TokenText == "forwardDeclare") &&
Tok.Next->Next->Next && Tok.Next->Next->Next->is(tok::l_paren);
}
@@ -853,6 +921,9 @@ private:
void modifyContext(const FormatToken &Current) {
if (Current.getPrecedence() == prec::Assignment &&
!Line.First->isOneOf(tok::kw_template, tok::kw_using, tok::kw_return) &&
+ // Type aliases use `type X = ...;` in TypeScript.
+ !(Style.Language == FormatStyle::LK_JavaScript &&
+ Line.startsWith(Keywords.kw_type, tok::identifier)) &&
(!Current.Previous || Current.Previous->isNot(tok::kw_operator))) {
Contexts.back().IsExpression = true;
if (!Line.startsWith(TT_UnaryOperator)) {
@@ -882,17 +953,17 @@ private:
Contexts.back().IsExpression = false;
} else if (Current.is(TT_LambdaArrow) || Current.is(Keywords.kw_assert)) {
Contexts.back().IsExpression = Style.Language == FormatStyle::LK_Java;
+ } else if (Current.Previous &&
+ Current.Previous->is(TT_CtorInitializerColon)) {
+ Contexts.back().IsExpression = true;
+ Contexts.back().InCtorInitializer = true;
} else if (Current.isOneOf(tok::r_paren, tok::greater, tok::comma)) {
for (FormatToken *Previous = Current.Previous;
Previous && Previous->isOneOf(tok::star, tok::amp);
Previous = Previous->Previous)
Previous->Type = TT_PointerOrReference;
- if (Line.MustBeDeclaration)
- Contexts.back().IsExpression = Contexts.front().InCtorInitializer;
- } else if (Current.Previous &&
- Current.Previous->is(TT_CtorInitializerColon)) {
- Contexts.back().IsExpression = true;
- Contexts.back().InCtorInitializer = true;
+ if (Line.MustBeDeclaration && !Contexts.front().InCtorInitializer)
+ Contexts.back().IsExpression = false;
} else if (Current.is(tok::kw_new)) {
Contexts.back().CanBeExpression = false;
} else if (Current.isOneOf(tok::semi, tok::exclaim)) {
@@ -938,7 +1009,7 @@ private:
Current.Type = TT_UnaryOperator;
} else if (Current.is(tok::question)) {
if (Style.Language == FormatStyle::LK_JavaScript &&
- Line.MustBeDeclaration) {
+ Line.MustBeDeclaration && !Contexts.back().IsExpression) {
// In JavaScript, `interface X { foo?(): bar; }` is an optional method
// on the interface, not a ternary expression.
Current.Type = TT_JsTypeOptionalQuestion;
@@ -964,7 +1035,8 @@ private:
Current.Type = TT_CastRParen;
if (Current.MatchingParen && Current.Next &&
!Current.Next->isBinaryOperator() &&
- !Current.Next->isOneOf(tok::semi, tok::colon, tok::l_brace))
+ !Current.Next->isOneOf(tok::semi, tok::colon, tok::l_brace,
+ tok::period, tok::arrow, tok::coloncolon))
if (FormatToken *BeforeParen = Current.MatchingParen->Previous)
if (BeforeParen->is(tok::identifier) &&
BeforeParen->TokenText == BeforeParen->TokenText.upper() &&
@@ -1035,6 +1107,9 @@ private:
if (Tok.Previous->isOneOf(TT_LeadingJavaAnnotation, Keywords.kw_instanceof))
return false;
+ if (Style.Language == FormatStyle::LK_JavaScript &&
+ Tok.Previous->is(Keywords.kw_in))
+ return false;
// Skip "const" as it does not have an influence on whether this is a name.
FormatToken *PreviousNotConst = Tok.Previous;
@@ -1078,7 +1153,7 @@ private:
FormatToken *LeftOfParens = Tok.MatchingParen->getPreviousNonComment();
if (LeftOfParens) {
- // If there is an opening parenthesis left of the current parentheses,
+ // If there is a closing parenthesis left of the current parentheses,
// look past it as these might be chained casts.
if (LeftOfParens->is(tok::r_paren)) {
if (!LeftOfParens->MatchingParen ||
@@ -1097,7 +1172,7 @@ private:
// Certain other tokens right before the parentheses are also signals that
// this cannot be a cast.
if (LeftOfParens->isOneOf(tok::at, tok::r_square, TT_OverloadedOperator,
- TT_TemplateCloser))
+ TT_TemplateCloser, tok::ellipsis))
return false;
}
@@ -1131,9 +1206,9 @@ private:
if (!LeftOfParens)
return false;
- // If the following token is an identifier, this is a cast. All cases where
- // this can be something else are handled above.
- if (Tok.Next->is(tok::identifier))
+ // If the following token is an identifier or 'this', this is a cast. All
+ // cases where this can be something else are handled above.
+ if (Tok.Next->isOneOf(tok::identifier, tok::kw_this))
return true;
if (!Tok.Next->Next)
@@ -1390,11 +1465,15 @@ private:
Style.Language == FormatStyle::LK_JavaScript) &&
Current->is(Keywords.kw_instanceof))
return prec::Relational;
+ if (Style.Language == FormatStyle::LK_JavaScript &&
+ Current->is(Keywords.kw_in))
+ return prec::Relational;
if (Current->is(TT_BinaryOperator) || Current->is(tok::comma))
return Current->getPrecedence();
if (Current->isOneOf(tok::period, tok::arrow))
return PrecedenceArrowAndPeriod;
- if (Style.Language == FormatStyle::LK_Java &&
+ if ((Style.Language == FormatStyle::LK_Java ||
+ Style.Language == FormatStyle::LK_JavaScript) &&
Current->isOneOf(Keywords.kw_extends, Keywords.kw_implements,
Keywords.kw_throws))
return 0;
@@ -1508,7 +1587,8 @@ void TokenAnnotator::annotate(AnnotatedLine &Line) {
// This function heuristically determines whether 'Current' starts the name of a
// function declaration.
-static bool isFunctionDeclarationName(const FormatToken &Current) {
+static bool isFunctionDeclarationName(const FormatToken &Current,
+ const AnnotatedLine &Line) {
auto skipOperatorName = [](const FormatToken* Next) -> const FormatToken* {
for (; Next; Next = Next->Next) {
if (Next->is(TT_OverloadedOperatorLParen))
@@ -1528,6 +1608,7 @@ static bool isFunctionDeclarationName(const FormatToken &Current) {
return nullptr;
};
+ // Find parentheses of parameter list.
const FormatToken *Next = Current.Next;
if (Current.is(tok::kw_operator)) {
if (Current.Previous && Current.Previous->is(tok::coloncolon))
@@ -1557,14 +1638,22 @@ static bool isFunctionDeclarationName(const FormatToken &Current) {
}
}
- if (!Next || !Next->is(tok::l_paren))
+ // Check whether parameter list can belong to a function declaration.
+ if (!Next || !Next->is(tok::l_paren) || !Next->MatchingParen)
return false;
+ // If the line ends with "{", this is likely a function definition.
+ if (Line.Last->is(tok::l_brace))
+ return true;
if (Next->Next == Next->MatchingParen)
+ return true; // Empty parentheses.
+ // If there is an &/&& after the r_paren, this is likely a function.
+ if (Next->MatchingParen->Next &&
+ Next->MatchingParen->Next->is(TT_PointerOrReference))
return true;
for (const FormatToken *Tok = Next->Next; Tok && Tok != Next->MatchingParen;
Tok = Tok->Next) {
if (Tok->is(tok::kw_const) || Tok->isSimpleTypeSpecifier() ||
- Tok->isOneOf(TT_PointerOrReference, TT_StartOfName))
+ Tok->isOneOf(TT_PointerOrReference, TT_StartOfName, tok::ellipsis))
return true;
if (Tok->isOneOf(tok::l_brace, tok::string_literal, TT_ObjCMethodExpr) ||
Tok->Tok.isLiteral())
@@ -1610,7 +1699,7 @@ void TokenAnnotator::calculateFormattingInformation(AnnotatedLine &Line) {
FormatToken *Current = Line.First->Next;
bool InFunctionDecl = Line.MightBeFunctionDecl;
while (Current) {
- if (isFunctionDeclarationName(*Current))
+ if (isFunctionDeclarationName(*Current, Line))
Current->Type = TT_FunctionDeclarationName;
if (Current->is(TT_LineComment)) {
if (Current->Previous->BlockKind == BK_BracedInit &&
@@ -1736,7 +1825,7 @@ unsigned TokenAnnotator::splitPenalty(const AnnotatedLine &Line,
if (Style.Language == FormatStyle::LK_Proto)
return 1;
if (Left.is(tok::r_square))
- return 25;
+ return 200;
// Slightly prefer formatting local lambda definitions like functions.
if (Right.is(TT_LambdaLSquare) && Left.is(tok::equal))
return 35;
@@ -1768,6 +1857,8 @@ unsigned TokenAnnotator::splitPenalty(const AnnotatedLine &Line,
return 500;
if (Left.isOneOf(tok::kw_class, tok::kw_struct))
return 5000;
+ if (Left.is(tok::comment))
+ return 1000;
if (Left.isOneOf(TT_RangeBasedForLoopColon, TT_InheritanceColon))
return 2;
@@ -1910,15 +2001,14 @@ bool TokenAnnotator::spaceRequiredBetween(const AnnotatedLine &Line,
if (Left.is(tok::less) || Right.isOneOf(tok::greater, tok::less))
return false;
if (Right.is(tok::ellipsis))
- return Left.Tok.isLiteral();
+ return Left.Tok.isLiteral() || (Left.is(tok::identifier) && Left.Previous &&
+ Left.Previous->is(tok::kw_case));
if (Left.is(tok::l_square) && Right.is(tok::amp))
return false;
if (Right.is(TT_PointerOrReference))
- return (Left.is(tok::r_paren) && Left.MatchingParen &&
- (Left.MatchingParen->is(TT_OverloadedOperatorLParen) ||
- (Left.MatchingParen->Previous &&
- Left.MatchingParen->Previous->is(TT_FunctionDeclarationName)))) ||
- (Left.Tok.isLiteral() ||
+ return (Left.is(tok::r_paren) && Line.MightBeFunctionDecl) ||
+ (Left.Tok.isLiteral() || (Left.is(tok::kw_const) && Left.Previous &&
+ Left.Previous->is(tok::r_paren)) ||
(!Left.isOneOf(TT_PointerOrReference, tok::l_paren) &&
(Style.PointerAlignment != FormatStyle::PAS_Left ||
Line.IsMultiVariableDeclStmt)));
@@ -2021,8 +2111,14 @@ bool TokenAnnotator::spaceRequiredBefore(const AnnotatedLine &Line,
Left.isOneOf(Keywords.kw_returns, Keywords.kw_option))
return true;
} else if (Style.Language == FormatStyle::LK_JavaScript) {
- if (Left.isOneOf(Keywords.kw_let, Keywords.kw_var, TT_JsFatArrow,
- Keywords.kw_in))
+ if (Left.is(TT_JsFatArrow))
+ return true;
+ if (Right.is(tok::star) &&
+ Left.isOneOf(Keywords.kw_function, Keywords.kw_yield))
+ return false;
+ if (Left.isOneOf(Keywords.kw_let, Keywords.kw_var, Keywords.kw_in,
+ Keywords.kw_of, tok::kw_const) &&
+ (!Left.Previous || !Left.Previous->is(tok::period)))
return true;
if (Left.is(tok::kw_default) && Left.Previous &&
Left.Previous->is(tok::kw_export))
@@ -2031,6 +2127,8 @@ bool TokenAnnotator::spaceRequiredBefore(const AnnotatedLine &Line,
return true;
if (Right.isOneOf(TT_JsTypeColon, TT_JsTypeOptionalQuestion))
return false;
+ if (Left.is(TT_JsTypeOperator) || Right.is(TT_JsTypeOperator))
+ return false;
if ((Left.is(tok::l_brace) || Right.is(tok::r_brace)) &&
Line.First->isOneOf(Keywords.kw_import, tok::kw_export))
return false;
@@ -2043,6 +2141,11 @@ bool TokenAnnotator::spaceRequiredBefore(const AnnotatedLine &Line,
// locations that should have whitespace following are identified by the
// above set of follower tokens.
return false;
+ // Postfix non-null assertion operator, as in `foo!.bar()`.
+ if (Right.is(tok::exclaim) && (Left.isOneOf(tok::identifier, tok::r_paren,
+ tok::r_square, tok::r_brace) ||
+ Left.Tok.isLiteral()))
+ return false;
} else if (Style.Language == FormatStyle::LK_Java) {
if (Left.is(tok::r_square) && Right.is(tok::l_brace))
return true;
@@ -2111,10 +2214,11 @@ bool TokenAnnotator::spaceRequiredBefore(const AnnotatedLine &Line,
if (!Style.SpaceBeforeAssignmentOperators &&
Right.getPrecedence() == prec::Assignment)
return false;
- if (Right.is(tok::coloncolon) && Left.isNot(tok::l_brace))
+ if (Right.is(tok::coloncolon) && !Left.isOneOf(tok::l_brace, tok::comment))
return (Left.is(TT_TemplateOpener) &&
Style.Standard == FormatStyle::LS_Cpp03) ||
- !(Left.isOneOf(tok::identifier, tok::l_paren, tok::r_paren) ||
+ !(Left.isOneOf(tok::identifier, tok::l_paren, tok::r_paren,
+ tok::l_square) ||
Left.isOneOf(TT_TemplateCloser, TT_TemplateOpener));
if ((Left.is(TT_TemplateOpener)) != (Right.is(TT_TemplateCloser)))
return Style.SpacesInAngles;
@@ -2152,8 +2256,8 @@ bool TokenAnnotator::mustBreakBefore(const AnnotatedLine &Line,
if (Style.Language == FormatStyle::LK_JavaScript) {
// FIXME: This might apply to other languages and token kinds.
- if (Right.is(tok::char_constant) && Left.is(tok::plus) && Left.Previous &&
- Left.Previous->is(tok::char_constant))
+ if (Right.is(tok::string_literal) && Left.is(tok::plus) && Left.Previous &&
+ Left.Previous->is(tok::string_literal))
return true;
if (Left.is(TT_DictLiteral) && Left.is(tok::l_brace) && Line.Level == 0 &&
Left.Previous && Left.Previous->is(tok::equal) &&
@@ -2239,9 +2343,6 @@ bool TokenAnnotator::mustBreakBefore(const AnnotatedLine &Line,
return (Line.startsWith(tok::kw_enum) && Style.BraceWrapping.AfterEnum) ||
(Line.startsWith(tok::kw_class) && Style.BraceWrapping.AfterClass) ||
(Line.startsWith(tok::kw_struct) && Style.BraceWrapping.AfterStruct);
- if (Style.Language == FormatStyle::LK_Proto && Left.isNot(tok::l_brace) &&
- Right.is(TT_SelectorName))
- return true;
if (Left.is(TT_ObjCBlockLBrace) && !Style.AllowShortBlocksOnASingleLine)
return true;
@@ -2268,12 +2369,20 @@ bool TokenAnnotator::canBreakBefore(const AnnotatedLine &Line,
Keywords.kw_implements))
return true;
} else if (Style.Language == FormatStyle::LK_JavaScript) {
+ if (Left.is(tok::kw_return))
+ return false; // Otherwise a semicolon is inserted.
if (Left.is(TT_JsFatArrow) && Right.is(tok::l_brace))
return false;
if (Left.is(TT_JsTypeColon))
return true;
if (Right.NestingLevel == 0 && Right.is(Keywords.kw_is))
return false;
+ if (Left.is(Keywords.kw_in))
+ return Style.BreakBeforeBinaryOperators == FormatStyle::BOS_None;
+ if (Right.is(Keywords.kw_in))
+ return Style.BreakBeforeBinaryOperators != FormatStyle::BOS_None;
+ if (Right.is(Keywords.kw_as))
+ return false; // must not break before as in 'x as type' casts
}
if (Left.is(tok::at))
@@ -2390,7 +2499,7 @@ bool TokenAnnotator::canBreakBefore(const AnnotatedLine &Line,
Left.getPrecedence() == prec::Assignment))
return true;
return Left.isOneOf(tok::comma, tok::coloncolon, tok::semi, tok::l_brace,
- tok::kw_class, tok::kw_struct) ||
+ tok::kw_class, tok::kw_struct, tok::comment) ||
Right.isMemberAccess() ||
Right.isOneOf(TT_TrailingReturnArrow, TT_LambdaArrow, tok::lessless,
tok::colon, tok::l_square, tok::at) ||
diff --git a/contrib/llvm/tools/clang/lib/Format/TokenAnnotator.h b/contrib/llvm/tools/clang/lib/Format/TokenAnnotator.h
index 5329f1f..baa68de 100644
--- a/contrib/llvm/tools/clang/lib/Format/TokenAnnotator.h
+++ b/contrib/llvm/tools/clang/lib/Format/TokenAnnotator.h
@@ -83,7 +83,15 @@ public:
/// \c true if this line starts with the given tokens in order, ignoring
/// comments.
template <typename... Ts> bool startsWith(Ts... Tokens) const {
- return startsWith(First, Tokens...);
+ return First && First->startsSequence(Tokens...);
+ }
+
+ /// \c true if this line ends with the given tokens in reversed order,
+ /// ignoring comments.
+ /// For example, given tokens [T1, T2, T3, ...], the function returns true if
+ /// this line is like "... T3 T2 T1".
+ template <typename... Ts> bool endsWith(Ts... Tokens) const {
+ return Last && Last->endsSequence(Tokens...);
}
/// \c true if this line looks like a function definition instead of a
@@ -122,18 +130,6 @@ private:
// Disallow copying.
AnnotatedLine(const AnnotatedLine &) = delete;
void operator=(const AnnotatedLine &) = delete;
-
- template <typename A, typename... Ts>
- bool startsWith(FormatToken *Tok, A K1) const {
- while (Tok && Tok->is(tok::comment))
- Tok = Tok->Next;
- return Tok && Tok->is(K1);
- }
-
- template <typename A, typename... Ts>
- bool startsWith(FormatToken *Tok, A K1, Ts... Tokens) const {
- return startsWith(Tok, K1) && startsWith(Tok->Next, Tokens...);
- }
};
/// \brief Determines extra information about the tokens comprising an
diff --git a/contrib/llvm/tools/clang/lib/Format/UnwrappedLineFormatter.cpp b/contrib/llvm/tools/clang/lib/Format/UnwrappedLineFormatter.cpp
index f650569..35035ea 100644
--- a/contrib/llvm/tools/clang/lib/Format/UnwrappedLineFormatter.cpp
+++ b/contrib/llvm/tools/clang/lib/Format/UnwrappedLineFormatter.cpp
@@ -847,7 +847,9 @@ UnwrappedLineFormatter::format(const SmallVectorImpl<AnnotatedLine *> &Lines,
unsigned ColumnLimit = getColumnLimit(TheLine.InPPDirective, NextLine);
bool FitsIntoOneLine =
TheLine.Last->TotalLength + Indent <= ColumnLimit ||
- TheLine.Type == LT_ImportStatement;
+ (TheLine.Type == LT_ImportStatement &&
+ (Style.Language != FormatStyle::LK_JavaScript ||
+ !Style.JavaScriptWrapImports));
if (Style.ColumnLimit == 0)
NoColumnLimitLineFormatter(Indenter, Whitespaces, Style, this)
@@ -863,7 +865,9 @@ UnwrappedLineFormatter::format(const SmallVectorImpl<AnnotatedLine *> &Lines,
// If no token in the current line is affected, we still need to format
// affected children.
if (TheLine.ChildrenAffected)
- format(TheLine.Children, DryRun);
+ for (const FormatToken *Tok = TheLine.First; Tok; Tok = Tok->Next)
+ if (!Tok->Children.empty())
+ format(Tok->Children, DryRun);
// Adapt following lines on the current indent level to the same level
// unless the current \c AnnotatedLine is not at the beginning of a line.
diff --git a/contrib/llvm/tools/clang/lib/Format/UnwrappedLineParser.cpp b/contrib/llvm/tools/clang/lib/Format/UnwrappedLineParser.cpp
index 7b8f6e6..2fe7298 100644
--- a/contrib/llvm/tools/clang/lib/Format/UnwrappedLineParser.cpp
+++ b/contrib/llvm/tools/clang/lib/Format/UnwrappedLineParser.cpp
@@ -363,6 +363,8 @@ void UnwrappedLineParser::calculateBraceTypes(bool ExpectClassBody) {
//
// We exclude + and - as they can be ObjC visibility modifiers.
ProbablyBracedList =
+ (Style.Language == FormatStyle::LK_JavaScript &&
+ NextTok->isOneOf(Keywords.kw_of, Keywords.kw_in)) ||
NextTok->isOneOf(tok::comma, tok::period, tok::colon,
tok::r_paren, tok::r_square, tok::l_brace,
tok::l_square, tok::l_paren, tok::ellipsis) ||
@@ -428,6 +430,9 @@ void UnwrappedLineParser::parseBlock(bool MustBeDeclaration, bool AddLevel,
++Line->Level;
parseLevel(/*HasOpeningBrace=*/true);
+ if (eof())
+ return;
+
if (MacroBlock ? !FormatTok->is(TT_MacroBlockEnd)
: !FormatTok->is(tok::r_brace)) {
Line->Level = InitialLevel;
@@ -658,6 +663,85 @@ static bool tokenCanStartNewLine(const clang::Token &Tok) {
Tok.isNot(tok::kw_noexcept);
}
+static bool mustBeJSIdent(const AdditionalKeywords &Keywords,
+ const FormatToken *FormatTok) {
+ // FIXME: This returns true for C/C++ keywords like 'struct'.
+ return FormatTok->is(tok::identifier) &&
+ (FormatTok->Tok.getIdentifierInfo() == nullptr ||
+ !FormatTok->isOneOf(Keywords.kw_in, Keywords.kw_of, Keywords.kw_as,
+ Keywords.kw_async, Keywords.kw_await,
+ Keywords.kw_yield, Keywords.kw_finally,
+ Keywords.kw_function, Keywords.kw_import,
+ Keywords.kw_is, Keywords.kw_let, Keywords.kw_var,
+ Keywords.kw_abstract, Keywords.kw_extends,
+ Keywords.kw_implements, Keywords.kw_instanceof,
+ Keywords.kw_interface, Keywords.kw_throws));
+}
+
+static bool mustBeJSIdentOrValue(const AdditionalKeywords &Keywords,
+ const FormatToken *FormatTok) {
+ return FormatTok->Tok.isLiteral() || mustBeJSIdent(Keywords, FormatTok);
+}
+
+// isJSDeclOrStmt returns true if |FormatTok| starts a declaration or statement
+// when encountered after a value (see mustBeJSIdentOrValue).
+static bool isJSDeclOrStmt(const AdditionalKeywords &Keywords,
+ const FormatToken *FormatTok) {
+ return FormatTok->isOneOf(
+ tok::kw_return, Keywords.kw_yield,
+ // conditionals
+ tok::kw_if, tok::kw_else,
+ // loops
+ tok::kw_for, tok::kw_while, tok::kw_do, tok::kw_continue, tok::kw_break,
+ // switch/case
+ tok::kw_switch, tok::kw_case,
+ // exceptions
+ tok::kw_throw, tok::kw_try, tok::kw_catch, Keywords.kw_finally,
+ // declaration
+ tok::kw_const, tok::kw_class, Keywords.kw_var, Keywords.kw_let,
+ Keywords.kw_async, Keywords.kw_function,
+ // import/export
+ Keywords.kw_import, tok::kw_export);
+}
+
+// readTokenWithJavaScriptASI reads the next token and terminates the current
+// line if JavaScript Automatic Semicolon Insertion must
+// happen between the current token and the next token.
+//
+// This method is conservative - it cannot cover all edge cases of JavaScript,
+// but only aims to correctly handle certain well known cases. It *must not*
+// return true in speculative cases.
+void UnwrappedLineParser::readTokenWithJavaScriptASI() {
+ FormatToken *Previous = FormatTok;
+ readToken();
+ FormatToken *Next = FormatTok;
+
+ bool IsOnSameLine =
+ CommentsBeforeNextToken.empty()
+ ? Next->NewlinesBefore == 0
+ : CommentsBeforeNextToken.front()->NewlinesBefore == 0;
+ if (IsOnSameLine)
+ return;
+
+ bool PreviousMustBeValue = mustBeJSIdentOrValue(Keywords, Previous);
+ if (PreviousMustBeValue && Line && Line->Tokens.size() > 1) {
+ // If the token before the previous one is an '@', the previous token is an
+ // annotation and can precede another identifier/value.
+ const FormatToken *PrePrevious = std::prev(Line->Tokens.end(), 2)->Tok;
+ if (PrePrevious->is(tok::at))
+ return;
+ }
+ if (Next->is(tok::exclaim) && PreviousMustBeValue)
+ addUnwrappedLine();
+ bool NextMustBeValue = mustBeJSIdentOrValue(Keywords, Next);
+ if (NextMustBeValue && (PreviousMustBeValue ||
+ Previous->isOneOf(tok::r_square, tok::r_paren,
+ tok::plusplus, tok::minusminus)))
+ addUnwrappedLine();
+ if (PreviousMustBeValue && isJSDeclOrStmt(Keywords, Next))
+ addUnwrappedLine();
+}
+
void UnwrappedLineParser::parseStructuralElement() {
assert(!FormatTok->is(tok::l_brace));
if (Style.Language == FormatStyle::LK_TableGen &&
@@ -798,10 +882,23 @@ void UnwrappedLineParser::parseStructuralElement() {
/*MunchSemi=*/false);
return;
}
- if (Style.Language == FormatStyle::LK_JavaScript &&
- FormatTok->is(Keywords.kw_import)) {
- parseJavaScriptEs6ImportExport();
- return;
+ if (FormatTok->is(Keywords.kw_import)) {
+ if (Style.Language == FormatStyle::LK_JavaScript) {
+ parseJavaScriptEs6ImportExport();
+ return;
+ }
+ if (Style.Language == FormatStyle::LK_Proto) {
+ nextToken();
+ if (FormatTok->is(tok::kw_public))
+ nextToken();
+ if (!FormatTok->is(tok::string_literal))
+ return;
+ nextToken();
+ if (FormatTok->is(tok::semi))
+ nextToken();
+ addUnwrappedLine();
+ return;
+ }
}
if (FormatTok->isOneOf(Keywords.kw_signals, Keywords.kw_qsignals,
Keywords.kw_slots, Keywords.kw_qslots)) {
@@ -818,6 +915,7 @@ void UnwrappedLineParser::parseStructuralElement() {
break;
}
do {
+ const FormatToken *Previous = getPreviousToken();
switch (FormatTok->Tok.getKind()) {
case tok::at:
nextToken();
@@ -825,6 +923,12 @@ void UnwrappedLineParser::parseStructuralElement() {
parseBracedList();
break;
case tok::kw_enum:
+ // Ignore if this is part of "template <enum ...".
+ if (Previous && Previous->is(tok::less)) {
+ nextToken();
+ break;
+ }
+
// parseEnum falls through and does not yet add an unwrapped line as an
// enum definition can start a structural element.
if (!parseEnum())
@@ -922,18 +1026,35 @@ void UnwrappedLineParser::parseStructuralElement() {
// Parse function literal unless 'function' is the first token in a line
// in which case this should be treated as a free-standing function.
if (Style.Language == FormatStyle::LK_JavaScript &&
- FormatTok->is(Keywords.kw_function) && Line->Tokens.size() > 0) {
+ (FormatTok->is(Keywords.kw_function) ||
+ FormatTok->startsSequence(Keywords.kw_async,
+ Keywords.kw_function)) &&
+ Line->Tokens.size() > 0) {
tryToParseJSFunction();
break;
}
if ((Style.Language == FormatStyle::LK_JavaScript ||
Style.Language == FormatStyle::LK_Java) &&
FormatTok->is(Keywords.kw_interface)) {
+ if (Style.Language == FormatStyle::LK_JavaScript) {
+ // In JavaScript/TypeScript, "interface" can be used as a standalone
+ // identifier, e.g. in `var interface = 1;`. If "interface" is
+ // followed by another identifier, it is very likely to be an actual
+ // interface declaration.
+ unsigned StoredPosition = Tokens->getPosition();
+ FormatToken *Next = Tokens->getNextToken();
+ FormatTok = Tokens->setPosition(StoredPosition);
+ if (Next && !mustBeJSIdent(Keywords, Next)) {
+ nextToken();
+ break;
+ }
+ }
parseRecord();
addUnwrappedLine();
return;
}
+ // See if the following token should start a new unwrapped line.
StringRef Text = FormatTok->TokenText;
nextToken();
if (Line->Tokens.size() == 1 &&
@@ -941,6 +1062,7 @@ void UnwrappedLineParser::parseStructuralElement() {
// not labels.
Style.Language != FormatStyle::LK_JavaScript) {
if (FormatTok->Tok.is(tok::colon) && !Line->MustBeDeclaration) {
+ Line->Tokens.begin()->Tok->MustBreakBefore = true;
parseLabel();
return;
}
@@ -1093,8 +1215,17 @@ bool UnwrappedLineParser::tryToParseLambdaIntroducer() {
}
void UnwrappedLineParser::tryToParseJSFunction() {
+ assert(FormatTok->is(Keywords.kw_function) ||
+ FormatTok->startsSequence(Keywords.kw_async, Keywords.kw_function));
+ if (FormatTok->is(Keywords.kw_async))
+ nextToken();
+ // Consume "function".
nextToken();
+ // Consume * (generator function).
+ if (FormatTok->is(tok::star))
+ nextToken();
+
// Consume function name.
if (FormatTok->is(tok::identifier))
nextToken();
@@ -1139,7 +1270,8 @@ bool UnwrappedLineParser::parseBracedList(bool ContinueOnSemicolons) {
// replace this by using parseAssigmentExpression() inside.
do {
if (Style.Language == FormatStyle::LK_JavaScript) {
- if (FormatTok->is(Keywords.kw_function)) {
+ if (FormatTok->is(Keywords.kw_function) ||
+ FormatTok->startsSequence(Keywords.kw_async, Keywords.kw_function)) {
tryToParseJSFunction();
continue;
}
@@ -1237,7 +1369,8 @@ void UnwrappedLineParser::parseParens() {
break;
case tok::identifier:
if (Style.Language == FormatStyle::LK_JavaScript &&
- FormatTok->is(Keywords.kw_function))
+ (FormatTok->is(Keywords.kw_function) ||
+ FormatTok->startsSequence(Keywords.kw_async, Keywords.kw_function)))
tryToParseJSFunction();
else
nextToken();
@@ -1315,6 +1448,8 @@ void UnwrappedLineParser::parseIfThenElse() {
addUnwrappedLine();
++Line->Level;
parseStructuralElement();
+ if (FormatTok->is(tok::eof))
+ addUnwrappedLine();
--Line->Level;
}
} else if (NeedsUnwrappedLine) {
@@ -1503,6 +1638,10 @@ void UnwrappedLineParser::parseLabel() {
addUnwrappedLine();
}
Line->Level = OldLineLevel;
+ if (FormatTok->isNot(tok::l_brace)) {
+ parseStructuralElement();
+ addUnwrappedLine();
+ }
}
void UnwrappedLineParser::parseCaseLabel() {
@@ -1550,7 +1689,8 @@ bool UnwrappedLineParser::parseEnum() {
// In TypeScript, "enum" can also be used as property name, e.g. in interface
// declarations. An "enum" keyword followed by a colon would be a syntax
// error and thus assume it is just an identifier.
- if (Style.Language == FormatStyle::LK_JavaScript && FormatTok->is(tok::colon))
+ if (Style.Language == FormatStyle::LK_JavaScript &&
+ FormatTok->isOneOf(tok::colon, tok::question))
return false;
// Eat up enum class ...
@@ -1795,28 +1935,31 @@ void UnwrappedLineParser::parseObjCProtocol() {
}
void UnwrappedLineParser::parseJavaScriptEs6ImportExport() {
- assert(FormatTok->isOneOf(Keywords.kw_import, tok::kw_export));
+ bool IsImport = FormatTok->is(Keywords.kw_import);
+ assert(IsImport || FormatTok->is(tok::kw_export));
nextToken();
// Consume the "default" in "export default class/function".
if (FormatTok->is(tok::kw_default))
nextToken();
- // Consume "function" and "default function", so that these get parsed as
- // free-standing JS functions, i.e. do not require a trailing semicolon.
+ // Consume "async function", "function" and "default function", so that these
+ // get parsed as free-standing JS functions, i.e. do not require a trailing
+ // semicolon.
+ if (FormatTok->is(Keywords.kw_async))
+ nextToken();
if (FormatTok->is(Keywords.kw_function)) {
nextToken();
return;
}
- // Consume the "abstract" in "export abstract class".
- if (FormatTok->is(Keywords.kw_abstract))
- nextToken();
-
- if (FormatTok->isOneOf(tok::kw_const, tok::kw_class, tok::kw_enum,
- Keywords.kw_interface, Keywords.kw_let,
- Keywords.kw_var))
- return; // Fall through to parsing the corresponding structure.
+ // For imports, `export *`, `export {...}`, consume the rest of the line up
+ // to the terminating `;`. For everything else, just return and continue
+ // parsing the structural element, i.e. the declaration or expression for
+ // `export default`.
+ if (!IsImport && !FormatTok->isOneOf(tok::l_brace, tok::star) &&
+ !FormatTok->isStringLiteral())
+ return;
while (!eof() && FormatTok->isNot(tok::semi)) {
if (FormatTok->is(tok::l_brace)) {
@@ -1895,7 +2038,10 @@ void UnwrappedLineParser::nextToken() {
return;
flushComments(isOnNewLine(*FormatTok));
pushToken(FormatTok);
- readToken();
+ if (Style.Language != FormatStyle::LK_JavaScript)
+ readToken();
+ else
+ readTokenWithJavaScriptASI();
}
const FormatToken *UnwrappedLineParser::getPreviousToken() {
diff --git a/contrib/llvm/tools/clang/lib/Format/UnwrappedLineParser.h b/contrib/llvm/tools/clang/lib/Format/UnwrappedLineParser.h
index 6d40ab4..9c78d33 100644
--- a/contrib/llvm/tools/clang/lib/Format/UnwrappedLineParser.h
+++ b/contrib/llvm/tools/clang/lib/Format/UnwrappedLineParser.h
@@ -81,6 +81,7 @@ private:
void parsePPElse();
void parsePPEndIf();
void parsePPUnknown();
+ void readTokenWithJavaScriptASI();
void parseStructuralElement();
bool tryToParseBracedList();
bool parseBracedList(bool ContinueOnSemicolons = false);
diff --git a/contrib/llvm/tools/clang/lib/Format/WhitespaceManager.cpp b/contrib/llvm/tools/clang/lib/Format/WhitespaceManager.cpp
index d6e6ed2..9cdba9d 100644
--- a/contrib/llvm/tools/clang/lib/Format/WhitespaceManager.cpp
+++ b/contrib/llvm/tools/clang/lib/Format/WhitespaceManager.cpp
@@ -372,16 +372,20 @@ void WhitespaceManager::alignTrailingComments() {
unsigned CommentColumn = SourceMgr.getSpellingColumnNumber(
Changes[i].OriginalWhitespaceRange.getEnd());
for (unsigned j = i + 1; j != e; ++j) {
- if (Changes[j].Kind != tok::comment) { // Skip over comments.
- unsigned NextColumn = SourceMgr.getSpellingColumnNumber(
- Changes[j].OriginalWhitespaceRange.getEnd());
- // The start of the next token was previously aligned with the
- // start of this comment.
- WasAlignedWithStartOfNextLine =
- CommentColumn == NextColumn ||
- CommentColumn == NextColumn + Style.IndentWidth;
- break;
- }
+ if (Changes[j].Kind == tok::comment ||
+ Changes[j].Kind == tok::unknown)
+ // Skip over comments and unknown tokens. "unknown" tokens are used for
+ // the continuation of multiline comments.
+ continue;
+
+ unsigned NextColumn = SourceMgr.getSpellingColumnNumber(
+ Changes[j].OriginalWhitespaceRange.getEnd());
+ // The start of the next token was previously aligned with the
+ // start of this comment.
+ WasAlignedWithStartOfNextLine =
+ CommentColumn == NextColumn ||
+ CommentColumn == NextColumn + Style.IndentWidth;
+ break;
}
}
if (!Style.AlignTrailingComments || FollowsRBraceInColumn0) {
@@ -554,6 +558,14 @@ void WhitespaceManager::appendIndentText(std::string &Text,
}
Text.append(Spaces, ' ');
break;
+ case FormatStyle::UT_ForContinuationAndIndentation:
+ if (WhitespaceStartColumn == 0) {
+ unsigned Tabs = Spaces / Style.TabWidth;
+ Text.append(Tabs, '\t');
+ Spaces -= Tabs * Style.TabWidth;
+ }
+ Text.append(Spaces, ' ');
+ break;
}
}
diff --git a/contrib/llvm/tools/clang/lib/Format/WhitespaceManager.h b/contrib/llvm/tools/clang/lib/Format/WhitespaceManager.h
index 9ca9db6..3562347 100644
--- a/contrib/llvm/tools/clang/lib/Format/WhitespaceManager.h
+++ b/contrib/llvm/tools/clang/lib/Format/WhitespaceManager.h
@@ -37,7 +37,7 @@ namespace format {
/// There may be multiple calls to \c breakToken for a given token.
class WhitespaceManager {
public:
- WhitespaceManager(SourceManager &SourceMgr, const FormatStyle &Style,
+ WhitespaceManager(const SourceManager &SourceMgr, const FormatStyle &Style,
bool UseCRLF)
: SourceMgr(SourceMgr), Style(Style), UseCRLF(UseCRLF) {}
@@ -203,7 +203,7 @@ private:
unsigned Spaces, unsigned WhitespaceStartColumn);
SmallVector<Change, 16> Changes;
- SourceManager &SourceMgr;
+ const SourceManager &SourceMgr;
tooling::Replacements Replaces;
const FormatStyle &Style;
bool UseCRLF;
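The UT_ForContinuationAndIndentation case added to WhitespaceManager::appendIndentText above emits tabs only when the whitespace starts at column 0 and pads the remainder with spaces. A minimal standalone sketch of that arithmetic follows. The free function indentText and its parameter list are hypothetical and not part of clang-format, and the TabWidth > 0 guard is an addition for the sketch only.

```cpp
#include <string>

// Hypothetical helper (not a clang-format API): mirrors the arithmetic of the
// UT_ForContinuationAndIndentation case in the hunk above.
std::string indentText(unsigned Spaces, unsigned WhitespaceStartColumn,
                       unsigned TabWidth) {
  std::string Text;
  // Tabs are emitted only when the whitespace begins at column 0.
  // The TabWidth > 0 check is an assumption of this sketch, added to avoid
  // division by zero; the original hunk divides by Style.TabWidth directly.
  if (WhitespaceStartColumn == 0 && TabWidth > 0) {
    unsigned Tabs = Spaces / TabWidth;
    Text.append(Tabs, '\t');
    Spaces -= Tabs * TabWidth;
  }
  // Whatever is left over is filled with spaces.
  Text.append(Spaces, ' ');
  return Text;
}
```

With a tab width of 4, ten columns of leading whitespace become two tabs followed by two spaces, while the same ten columns starting at any other column remain ten spaces.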