diff options
author | dim <dim@FreeBSD.org> | 2017-04-02 17:24:58 +0000 |
---|---|---|
committer | dim <dim@FreeBSD.org> | 2017-04-02 17:24:58 +0000 |
commit | 60b571e49a90d38697b3aca23020d9da42fc7d7f (patch) | |
tree | 99351324c24d6cb146b6285b6caffa4d26fce188 /contrib/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp | |
parent | bea1b22c7a9bce1dfdd73e6e5b65bc4752215180 (diff) | |
download | FreeBSD-src-60b571e49a90d38697b3aca23020d9da42fc7d7f.zip FreeBSD-src-60b571e49a90d38697b3aca23020d9da42fc7d7f.tar.gz |
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release:
MFC r309142 (by emaste):
Add WITH_LLD_AS_LD build knob
If set it installs LLD as /usr/bin/ld. LLD (as of version 3.9) is not
capable of linking the world and kernel, but can self-host and link many
substantial applications. GNU ld continues to be used for the world and
kernel build, regardless of how this knob is set.
It is on by default for arm64, and off for all other CPU architectures.
Sponsored by: The FreeBSD Foundation
MFC r310840:
Reapply 310775, now it also builds correctly if lldb is disabled:
Move llvm-objdump from CLANG_EXTRAS to installed by default
We currently install three tools from binutils 2.17.50: as, ld, and
objdump. Work is underway to migrate to a permissively-licensed
tool-chain, with one goal being the retirement of binutils 2.17.50.
LLVM's llvm-objdump is intended to be compatible with GNU objdump
although it is currently missing some options and may have formatting
differences. Enable it by default for testing and further investigation.
It may later be changed to install as /usr/bin/objdump, it becomes a
fully viable replacement.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D8879
MFC r312855 (by emaste):
Rename LLD_AS_LD to LLD_IS_LD, for consistency with CLANG_IS_CC
Reported by: Dan McGregor <dan.mcgregor usask.ca>
MFC r313559 | glebius | 2017-02-10 18:34:48 +0100 (Fri, 10 Feb 2017) | 5 lines
Don't check struct rtentry on FreeBSD, it is an internal kernel structure.
On other systems it may be API structure for SIOCADDRT/SIOCDELRT.
Reviewed by: emaste, dim
MFC r314152 (by jkim):
Remove an assembler flag, which is redundant since r309124. The upstream
took care of it by introducing a macro NO_EXEC_STACK_DIRECTIVE.
http://llvm.org/viewvc/llvm-project?rev=273500&view=rev
Reviewed by: dim
MFC r314564:
Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to
4.0.0 (branches/release_40 296509). The release will follow soon.
Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
support to build; see UPDATING for more information.
Also note that as of 4.0.0, lld should be able to link the base system
on amd64 and aarch64. See the WITH_LLD_IS_LLD setting in src.conf(5).
Though please be aware that this is work in progress.
Release notes for llvm, clang and lld will be available here:
<http://releases.llvm.org/4.0.0/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/clang/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/lld/docs/ReleaseNotes.html>
Thanks to Ed Maste, Jan Beich, Antoine Brodin and Eric Fiselier for
their help.
Relnotes: yes
Exp-run: antoine
PR: 215969, 216008
MFC r314708:
For now, revert r287232 from upstream llvm trunk (by Daniil Fukalov):
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled
loop) and runs almost infinite time.
Added cache of "equal" SCEV pairs to earlier cutoff of further
estimation. Recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
This commit is the cause of excessive compile times on skein_block.c
(and possibly other files) during kernel builds on amd64.
We never saw the problematic behavior described in this upstream commit,
so for now it is better to revert it. An upstream bug has been filed
here: https://bugs.llvm.org/show_bug.cgi?id=32142
Reported by: mjg
MFC r314795:
Reapply r287232 from upstream llvm trunk (by Daniil Fukalov):
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled
loop) and runs almost infinite time.
Added cache of "equal" SCEV pairs to earlier cutoff of further
estimation. Recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
Pull in r296992 from upstream llvm trunk (by Sanjoy Das):
[SCEV] Decrease the recursion threshold for CompareValueComplexity
Fixes PR32142.
r287232 accidentally increased the recursion threshold for
CompareValueComplexity from 2 to 32. This change reverses that
change by introducing a separate flag for CompareValueComplexity's
threshold.
The latter revision fixes the excessive compile times for skein_block.c.
MFC r314907 | mmel | 2017-03-08 12:40:27 +0100 (Wed, 08 Mar 2017) | 7 lines
Unbreak ARMv6 world.
The new compiler_rt library imported with clang 4.0.0 have several fatal
issues (non-functional __udivsi3 for example) with ARM specific instrict
functions. As temporary workaround, until upstream solve these problems,
disable all thumb[1][2] related feature.
MFC r315016:
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release.
We were already very close to the last release candidate, so this is a
pretty minor update.
Relnotes: yes
MFC r316005:
Revert r314907, and pull in r298713 from upstream compiler-rt trunk (by
Weiming Zhao):
builtins: Select correct code fragments when compiling for Thumb1/Thum2/ARM ISA.
Summary:
Value of __ARM_ARCH_ISA_THUMB isn't based on the actual compilation
mode (-mthumb, -marm), it reflect's capability of given CPU.
Due to this:
- use __tbumb__ and __thumb2__ insteand of __ARM_ARCH_ISA_THUMB
- use '.thumb' directive consistently in all affected files
- decorate all thumb functions using
DEFINE_COMPILERRT_THUMB_FUNCTION()
---------
Note: This patch doesn't fix broken Thumb1 variant of __udivsi3 !
Reviewers: weimingz, rengolin, compnerd
Subscribers: aemerson, dim
Differential Revision: https://reviews.llvm.org/D30938
Discussed with: mmel
Diffstat (limited to 'contrib/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp')
-rw-r--r-- | contrib/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp | 537 |
1 files changed, 362 insertions, 175 deletions
diff --git a/contrib/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/contrib/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp index 8f1ff8a..3664484 100644 --- a/contrib/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp +++ b/contrib/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp @@ -15,6 +15,7 @@ #include "llvm/Analysis/ConstantFolding.h" #include "llvm/Analysis/InstructionSimplify.h" #include "llvm/Analysis/ValueTracking.h" +#include "llvm/IR/MDBuilder.h" #include "llvm/IR/PatternMatch.h" using namespace llvm; using namespace PatternMatch; @@ -78,7 +79,7 @@ static Value *generateMinMaxSelectPattern(InstCombiner::BuilderTy *Builder, /// a bitmask indicating which operands of this instruction are foldable if they /// equal the other incoming value of the select. /// -static unsigned GetSelectFoldableOperands(Instruction *I) { +static unsigned getSelectFoldableOperands(Instruction *I) { switch (I->getOpcode()) { case Instruction::Add: case Instruction::Mul: @@ -98,7 +99,7 @@ static unsigned GetSelectFoldableOperands(Instruction *I) { /// For the same transformation as the previous function, return the identity /// constant that goes into the select. -static Constant *GetSelectFoldableConstant(Instruction *I) { +static Constant *getSelectFoldableConstant(Instruction *I) { switch (I->getOpcode()) { default: llvm_unreachable("This cannot happen!"); case Instruction::Add: @@ -117,7 +118,7 @@ static Constant *GetSelectFoldableConstant(Instruction *I) { } /// We have (select c, TI, FI), and we know that TI and FI have the same opcode. -Instruction *InstCombiner::FoldSelectOpOp(SelectInst &SI, Instruction *TI, +Instruction *InstCombiner::foldSelectOpOp(SelectInst &SI, Instruction *TI, Instruction *FI) { // If this is a cast from the same type, merge. if (TI->getNumOperands() == 1 && TI->isCast()) { @@ -154,19 +155,19 @@ Instruction *InstCombiner::FoldSelectOpOp(SelectInst &SI, Instruction *TI, } // Fold this by inserting a select from the input values. - Value *NewSI = Builder->CreateSelect(SI.getCondition(), TI->getOperand(0), - FI->getOperand(0), SI.getName()+".v"); + Value *NewSI = + Builder->CreateSelect(SI.getCondition(), TI->getOperand(0), + FI->getOperand(0), SI.getName() + ".v", &SI); return CastInst::Create(Instruction::CastOps(TI->getOpcode()), NewSI, TI->getType()); } - // TODO: This function ends awkwardly in unreachable - fix to be more normal. - // Only handle binary operators with one-use here. As with the cast case // above, it may be possible to relax the one-use constraint, but that needs // be examined carefully since it may not reduce the total number of // instructions. - if (!isa<BinaryOperator>(TI) || !TI->hasOneUse() || !FI->hasOneUse()) + BinaryOperator *BO = dyn_cast<BinaryOperator>(TI); + if (!BO || !TI->hasOneUse() || !FI->hasOneUse()) return nullptr; // Figure out if the operations have any operands in common. @@ -199,16 +200,11 @@ Instruction *InstCombiner::FoldSelectOpOp(SelectInst &SI, Instruction *TI, } // If we reach here, they do have operations in common. - Value *NewSI = Builder->CreateSelect(SI.getCondition(), OtherOpT, - OtherOpF, SI.getName()+".v"); - - if (BinaryOperator *BO = dyn_cast<BinaryOperator>(TI)) { - if (MatchIsOpZero) - return BinaryOperator::Create(BO->getOpcode(), MatchOp, NewSI); - else - return BinaryOperator::Create(BO->getOpcode(), NewSI, MatchOp); - } - llvm_unreachable("Shouldn't get here"); + Value *NewSI = Builder->CreateSelect(SI.getCondition(), OtherOpT, OtherOpF, + SI.getName() + ".v", &SI); + Value *Op0 = MatchIsOpZero ? MatchOp : NewSI; + Value *Op1 = MatchIsOpZero ? NewSI : MatchOp; + return BinaryOperator::Create(BO->getOpcode(), Op0, Op1); } static bool isSelect01(Constant *C1, Constant *C2) { @@ -226,14 +222,14 @@ static bool isSelect01(Constant *C1, Constant *C2) { /// Try to fold the select into one of the operands to allow further /// optimization. -Instruction *InstCombiner::FoldSelectIntoOp(SelectInst &SI, Value *TrueVal, +Instruction *InstCombiner::foldSelectIntoOp(SelectInst &SI, Value *TrueVal, Value *FalseVal) { // See the comment above GetSelectFoldableOperands for a description of the // transformation we are doing here. if (Instruction *TVI = dyn_cast<Instruction>(TrueVal)) { if (TVI->hasOneUse() && TVI->getNumOperands() == 2 && !isa<Constant>(FalseVal)) { - if (unsigned SFO = GetSelectFoldableOperands(TVI)) { + if (unsigned SFO = getSelectFoldableOperands(TVI)) { unsigned OpToFold = 0; if ((SFO & 1) && FalseVal == TVI->getOperand(0)) { OpToFold = 1; @@ -242,7 +238,7 @@ Instruction *InstCombiner::FoldSelectIntoOp(SelectInst &SI, Value *TrueVal, } if (OpToFold) { - Constant *C = GetSelectFoldableConstant(TVI); + Constant *C = getSelectFoldableConstant(TVI); Value *OOp = TVI->getOperand(2-OpToFold); // Avoid creating select between 2 constants unless it's selecting // between 0, 1 and -1. @@ -263,7 +259,7 @@ Instruction *InstCombiner::FoldSelectIntoOp(SelectInst &SI, Value *TrueVal, if (Instruction *FVI = dyn_cast<Instruction>(FalseVal)) { if (FVI->hasOneUse() && FVI->getNumOperands() == 2 && !isa<Constant>(TrueVal)) { - if (unsigned SFO = GetSelectFoldableOperands(FVI)) { + if (unsigned SFO = getSelectFoldableOperands(FVI)) { unsigned OpToFold = 0; if ((SFO & 1) && TrueVal == FVI->getOperand(0)) { OpToFold = 1; @@ -272,7 +268,7 @@ Instruction *InstCombiner::FoldSelectIntoOp(SelectInst &SI, Value *TrueVal, } if (OpToFold) { - Constant *C = GetSelectFoldableConstant(FVI); + Constant *C = getSelectFoldableConstant(FVI); Value *OOp = FVI->getOperand(2-OpToFold); // Avoid creating select between 2 constants unless it's selecting // between 0, 1 and -1. @@ -411,103 +407,151 @@ static Value *foldSelectCttzCtlz(ICmpInst *ICI, Value *TrueVal, Value *FalseVal, return nullptr; } +/// Return true if we find and adjust an icmp+select pattern where the compare +/// is with a constant that can be incremented or decremented to match the +/// minimum or maximum idiom. +static bool adjustMinMax(SelectInst &Sel, ICmpInst &Cmp) { + ICmpInst::Predicate Pred = Cmp.getPredicate(); + Value *CmpLHS = Cmp.getOperand(0); + Value *CmpRHS = Cmp.getOperand(1); + Value *TrueVal = Sel.getTrueValue(); + Value *FalseVal = Sel.getFalseValue(); + + // We may move or edit the compare, so make sure the select is the only user. + const APInt *CmpC; + if (!Cmp.hasOneUse() || !match(CmpRHS, m_APInt(CmpC))) + return false; + + // These transforms only work for selects of integers or vector selects of + // integer vectors. + Type *SelTy = Sel.getType(); + auto *SelEltTy = dyn_cast<IntegerType>(SelTy->getScalarType()); + if (!SelEltTy || SelTy->isVectorTy() != Cmp.getType()->isVectorTy()) + return false; + + Constant *AdjustedRHS; + if (Pred == ICmpInst::ICMP_UGT || Pred == ICmpInst::ICMP_SGT) + AdjustedRHS = ConstantInt::get(CmpRHS->getType(), *CmpC + 1); + else if (Pred == ICmpInst::ICMP_ULT || Pred == ICmpInst::ICMP_SLT) + AdjustedRHS = ConstantInt::get(CmpRHS->getType(), *CmpC - 1); + else + return false; + + // X > C ? X : C+1 --> X < C+1 ? C+1 : X + // X < C ? X : C-1 --> X > C-1 ? C-1 : X + if ((CmpLHS == TrueVal && AdjustedRHS == FalseVal) || + (CmpLHS == FalseVal && AdjustedRHS == TrueVal)) { + ; // Nothing to do here. Values match without any sign/zero extension. + } + // Types do not match. Instead of calculating this with mixed types, promote + // all to the larger type. This enables scalar evolution to analyze this + // expression. + else if (CmpRHS->getType()->getScalarSizeInBits() < SelEltTy->getBitWidth()) { + Constant *SextRHS = ConstantExpr::getSExt(AdjustedRHS, SelTy); + + // X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X + // X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X + // X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X + // X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X + if (match(TrueVal, m_SExt(m_Specific(CmpLHS))) && SextRHS == FalseVal) { + CmpLHS = TrueVal; + AdjustedRHS = SextRHS; + } else if (match(FalseVal, m_SExt(m_Specific(CmpLHS))) && + SextRHS == TrueVal) { + CmpLHS = FalseVal; + AdjustedRHS = SextRHS; + } else if (Cmp.isUnsigned()) { + Constant *ZextRHS = ConstantExpr::getZExt(AdjustedRHS, SelTy); + // X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X + // X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X + // zext + signed compare cannot be changed: + // 0xff <s 0x00, but 0x00ff >s 0x0000 + if (match(TrueVal, m_ZExt(m_Specific(CmpLHS))) && ZextRHS == FalseVal) { + CmpLHS = TrueVal; + AdjustedRHS = ZextRHS; + } else if (match(FalseVal, m_ZExt(m_Specific(CmpLHS))) && + ZextRHS == TrueVal) { + CmpLHS = FalseVal; + AdjustedRHS = ZextRHS; + } else { + return false; + } + } else { + return false; + } + } else { + return false; + } + + Pred = ICmpInst::getSwappedPredicate(Pred); + CmpRHS = AdjustedRHS; + std::swap(FalseVal, TrueVal); + Cmp.setPredicate(Pred); + Cmp.setOperand(0, CmpLHS); + Cmp.setOperand(1, CmpRHS); + Sel.setOperand(1, TrueVal); + Sel.setOperand(2, FalseVal); + Sel.swapProfMetadata(); + + // Move the compare instruction right before the select instruction. Otherwise + // the sext/zext value may be defined after the compare instruction uses it. + Cmp.moveBefore(&Sel); + + return true; +} + +/// If this is an integer min/max where the select's 'true' operand is a +/// constant, canonicalize that constant to the 'false' operand: +/// select (icmp Pred X, C), C, X --> select (icmp Pred' X, C), X, C +static Instruction * +canonicalizeMinMaxWithConstant(SelectInst &Sel, ICmpInst &Cmp, + InstCombiner::BuilderTy &Builder) { + // TODO: We should also canonicalize min/max when the select has a different + // constant value than the cmp constant, but we need to fix the backend first. + if (!Cmp.hasOneUse() || !isa<Constant>(Cmp.getOperand(1)) || + !isa<Constant>(Sel.getTrueValue()) || + isa<Constant>(Sel.getFalseValue()) || + Cmp.getOperand(1) != Sel.getTrueValue()) + return nullptr; + + // Canonicalize the compare predicate based on whether we have min or max. + Value *LHS, *RHS; + ICmpInst::Predicate NewPred; + SelectPatternResult SPR = matchSelectPattern(&Sel, LHS, RHS); + switch (SPR.Flavor) { + case SPF_SMIN: NewPred = ICmpInst::ICMP_SLT; break; + case SPF_UMIN: NewPred = ICmpInst::ICMP_ULT; break; + case SPF_SMAX: NewPred = ICmpInst::ICMP_SGT; break; + case SPF_UMAX: NewPred = ICmpInst::ICMP_UGT; break; + default: return nullptr; + } + + // Canonicalize the constant to the right side. + if (isa<Constant>(LHS)) + std::swap(LHS, RHS); + + Value *NewCmp = Builder.CreateICmp(NewPred, LHS, RHS); + SelectInst *NewSel = SelectInst::Create(NewCmp, LHS, RHS, "", nullptr, &Sel); + + // We swapped the select operands, so swap the metadata too. + NewSel->swapProfMetadata(); + return NewSel; +} + /// Visit a SelectInst that has an ICmpInst as its first operand. -Instruction *InstCombiner::visitSelectInstWithICmp(SelectInst &SI, - ICmpInst *ICI) { - bool Changed = false; +Instruction *InstCombiner::foldSelectInstWithICmp(SelectInst &SI, + ICmpInst *ICI) { + if (Instruction *NewSel = canonicalizeMinMaxWithConstant(SI, *ICI, *Builder)) + return NewSel; + + bool Changed = adjustMinMax(SI, *ICI); + ICmpInst::Predicate Pred = ICI->getPredicate(); Value *CmpLHS = ICI->getOperand(0); Value *CmpRHS = ICI->getOperand(1); Value *TrueVal = SI.getTrueValue(); Value *FalseVal = SI.getFalseValue(); - // Check cases where the comparison is with a constant that - // can be adjusted to fit the min/max idiom. We may move or edit ICI - // here, so make sure the select is the only user. - if (ICI->hasOneUse()) - if (ConstantInt *CI = dyn_cast<ConstantInt>(CmpRHS)) { - switch (Pred) { - default: break; - case ICmpInst::ICMP_ULT: - case ICmpInst::ICMP_SLT: - case ICmpInst::ICMP_UGT: - case ICmpInst::ICMP_SGT: { - // These transformations only work for selects over integers. - IntegerType *SelectTy = dyn_cast<IntegerType>(SI.getType()); - if (!SelectTy) - break; - - Constant *AdjustedRHS; - if (Pred == ICmpInst::ICMP_UGT || Pred == ICmpInst::ICMP_SGT) - AdjustedRHS = ConstantInt::get(CI->getContext(), CI->getValue() + 1); - else // (Pred == ICmpInst::ICMP_ULT || Pred == ICmpInst::ICMP_SLT) - AdjustedRHS = ConstantInt::get(CI->getContext(), CI->getValue() - 1); - - // X > C ? X : C+1 --> X < C+1 ? C+1 : X - // X < C ? X : C-1 --> X > C-1 ? C-1 : X - if ((CmpLHS == TrueVal && AdjustedRHS == FalseVal) || - (CmpLHS == FalseVal && AdjustedRHS == TrueVal)) - ; // Nothing to do here. Values match without any sign/zero extension. - - // Types do not match. Instead of calculating this with mixed types - // promote all to the larger type. This enables scalar evolution to - // analyze this expression. - else if (CmpRHS->getType()->getScalarSizeInBits() - < SelectTy->getBitWidth()) { - Constant *sextRHS = ConstantExpr::getSExt(AdjustedRHS, SelectTy); - - // X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X - // X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X - // X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X - // X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X - if (match(TrueVal, m_SExt(m_Specific(CmpLHS))) && - sextRHS == FalseVal) { - CmpLHS = TrueVal; - AdjustedRHS = sextRHS; - } else if (match(FalseVal, m_SExt(m_Specific(CmpLHS))) && - sextRHS == TrueVal) { - CmpLHS = FalseVal; - AdjustedRHS = sextRHS; - } else if (ICI->isUnsigned()) { - Constant *zextRHS = ConstantExpr::getZExt(AdjustedRHS, SelectTy); - // X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X - // X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X - // zext + signed compare cannot be changed: - // 0xff <s 0x00, but 0x00ff >s 0x0000 - if (match(TrueVal, m_ZExt(m_Specific(CmpLHS))) && - zextRHS == FalseVal) { - CmpLHS = TrueVal; - AdjustedRHS = zextRHS; - } else if (match(FalseVal, m_ZExt(m_Specific(CmpLHS))) && - zextRHS == TrueVal) { - CmpLHS = FalseVal; - AdjustedRHS = zextRHS; - } else - break; - } else - break; - } else - break; - - Pred = ICmpInst::getSwappedPredicate(Pred); - CmpRHS = AdjustedRHS; - std::swap(FalseVal, TrueVal); - ICI->setPredicate(Pred); - ICI->setOperand(0, CmpLHS); - ICI->setOperand(1, CmpRHS); - SI.setOperand(1, TrueVal); - SI.setOperand(2, FalseVal); - - // Move ICI instruction right before the select instruction. Otherwise - // the sext/zext value may be defined after the ICI instruction uses it. - ICI->moveBefore(&SI); - - Changed = true; - break; - } - } - } - // Transform (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 // and (X <s 0) ? C2 : C1 --> ((X >>s 31) & (C2 - C1)) + C1 // FIXME: Type and constness constraints could be lifted, but we have to @@ -623,7 +667,7 @@ Instruction *InstCombiner::visitSelectInstWithICmp(SelectInst &SI, /// /// because Y is not live in BB1/BB2. /// -static bool CanSelectOperandBeMappingIntoPredBlock(const Value *V, +static bool canSelectOperandBeMappingIntoPredBlock(const Value *V, const SelectInst &SI) { // If the value is a non-instruction value like a constant or argument, it // can always be mapped. @@ -651,7 +695,7 @@ static bool CanSelectOperandBeMappingIntoPredBlock(const Value *V, /// We have an SPF (e.g. a min or max) of an SPF of the form: /// SPF2(SPF1(A, B), C) -Instruction *InstCombiner::FoldSPFofSPF(Instruction *Inner, +Instruction *InstCombiner::foldSPFofSPF(Instruction *Inner, SelectPatternFlavor SPF1, Value *A, Value *B, Instruction &Outer, @@ -675,28 +719,24 @@ Instruction *InstCombiner::FoldSPFofSPF(Instruction *Inner, } if (SPF1 == SPF2) { - if (ConstantInt *CB = dyn_cast<ConstantInt>(B)) { - if (ConstantInt *CC = dyn_cast<ConstantInt>(C)) { - const APInt &ACB = CB->getValue(); - const APInt &ACC = CC->getValue(); - - // MIN(MIN(A, 23), 97) -> MIN(A, 23) - // MAX(MAX(A, 97), 23) -> MAX(A, 97) - if ((SPF1 == SPF_UMIN && ACB.ule(ACC)) || - (SPF1 == SPF_SMIN && ACB.sle(ACC)) || - (SPF1 == SPF_UMAX && ACB.uge(ACC)) || - (SPF1 == SPF_SMAX && ACB.sge(ACC))) - return replaceInstUsesWith(Outer, Inner); - - // MIN(MIN(A, 97), 23) -> MIN(A, 23) - // MAX(MAX(A, 23), 97) -> MAX(A, 97) - if ((SPF1 == SPF_UMIN && ACB.ugt(ACC)) || - (SPF1 == SPF_SMIN && ACB.sgt(ACC)) || - (SPF1 == SPF_UMAX && ACB.ult(ACC)) || - (SPF1 == SPF_SMAX && ACB.slt(ACC))) { - Outer.replaceUsesOfWith(Inner, A); - return &Outer; - } + const APInt *CB, *CC; + if (match(B, m_APInt(CB)) && match(C, m_APInt(CC))) { + // MIN(MIN(A, 23), 97) -> MIN(A, 23) + // MAX(MAX(A, 97), 23) -> MAX(A, 97) + if ((SPF1 == SPF_UMIN && CB->ule(*CC)) || + (SPF1 == SPF_SMIN && CB->sle(*CC)) || + (SPF1 == SPF_UMAX && CB->uge(*CC)) || + (SPF1 == SPF_SMAX && CB->sge(*CC))) + return replaceInstUsesWith(Outer, Inner); + + // MIN(MIN(A, 97), 23) -> MIN(A, 23) + // MAX(MAX(A, 23), 97) -> MAX(A, 97) + if ((SPF1 == SPF_UMIN && CB->ugt(*CC)) || + (SPF1 == SPF_SMIN && CB->sgt(*CC)) || + (SPF1 == SPF_UMAX && CB->ult(*CC)) || + (SPF1 == SPF_SMAX && CB->slt(*CC))) { + Outer.replaceUsesOfWith(Inner, A); + return &Outer; } } } @@ -712,8 +752,9 @@ Instruction *InstCombiner::FoldSPFofSPF(Instruction *Inner, if ((SPF1 == SPF_ABS && SPF2 == SPF_NABS) || (SPF1 == SPF_NABS && SPF2 == SPF_ABS)) { SelectInst *SI = cast<SelectInst>(Inner); - Value *NewSI = Builder->CreateSelect( - SI->getCondition(), SI->getFalseValue(), SI->getTrueValue()); + Value *NewSI = + Builder->CreateSelect(SI->getCondition(), SI->getFalseValue(), + SI->getTrueValue(), SI->getName(), SI); return replaceInstUsesWith(Outer, NewSI); } @@ -895,7 +936,7 @@ static Instruction *foldAddSubSelect(SelectInst &SI, if (AddOp != TI) std::swap(NewTrueOp, NewFalseOp); Value *NewSel = Builder.CreateSelect(CondVal, NewTrueOp, NewFalseOp, - SI.getName() + ".p"); + SI.getName() + ".p", &SI); if (SI.getType()->isFPOrFPVectorTy()) { Instruction *RI = @@ -912,6 +953,147 @@ static Instruction *foldAddSubSelect(SelectInst &SI, return nullptr; } +Instruction *InstCombiner::foldSelectExtConst(SelectInst &Sel) { + Instruction *ExtInst; + if (!match(Sel.getTrueValue(), m_Instruction(ExtInst)) && + !match(Sel.getFalseValue(), m_Instruction(ExtInst))) + return nullptr; + + auto ExtOpcode = ExtInst->getOpcode(); + if (ExtOpcode != Instruction::ZExt && ExtOpcode != Instruction::SExt) + return nullptr; + + // TODO: Handle larger types? That requires adjusting FoldOpIntoSelect too. + Value *X = ExtInst->getOperand(0); + Type *SmallType = X->getType(); + if (!SmallType->getScalarType()->isIntegerTy(1)) + return nullptr; + + Constant *C; + if (!match(Sel.getTrueValue(), m_Constant(C)) && + !match(Sel.getFalseValue(), m_Constant(C))) + return nullptr; + + // If the constant is the same after truncation to the smaller type and + // extension to the original type, we can narrow the select. + Value *Cond = Sel.getCondition(); + Type *SelType = Sel.getType(); + Constant *TruncC = ConstantExpr::getTrunc(C, SmallType); + Constant *ExtC = ConstantExpr::getCast(ExtOpcode, TruncC, SelType); + if (ExtC == C) { + Value *TruncCVal = cast<Value>(TruncC); + if (ExtInst == Sel.getFalseValue()) + std::swap(X, TruncCVal); + + // select Cond, (ext X), C --> ext(select Cond, X, C') + // select Cond, C, (ext X) --> ext(select Cond, C', X) + Value *NewSel = Builder->CreateSelect(Cond, X, TruncCVal, "narrow", &Sel); + return CastInst::Create(Instruction::CastOps(ExtOpcode), NewSel, SelType); + } + + // If one arm of the select is the extend of the condition, replace that arm + // with the extension of the appropriate known bool value. + if (Cond == X) { + if (ExtInst == Sel.getTrueValue()) { + // select X, (sext X), C --> select X, -1, C + // select X, (zext X), C --> select X, 1, C + Constant *One = ConstantInt::getTrue(SmallType); + Constant *AllOnesOrOne = ConstantExpr::getCast(ExtOpcode, One, SelType); + return SelectInst::Create(Cond, AllOnesOrOne, C, "", nullptr, &Sel); + } else { + // select X, C, (sext X) --> select X, C, 0 + // select X, C, (zext X) --> select X, C, 0 + Constant *Zero = ConstantInt::getNullValue(SelType); + return SelectInst::Create(Cond, C, Zero, "", nullptr, &Sel); + } + } + + return nullptr; +} + +/// Try to transform a vector select with a constant condition vector into a +/// shuffle for easier combining with other shuffles and insert/extract. +static Instruction *canonicalizeSelectToShuffle(SelectInst &SI) { + Value *CondVal = SI.getCondition(); + Constant *CondC; + if (!CondVal->getType()->isVectorTy() || !match(CondVal, m_Constant(CondC))) + return nullptr; + + unsigned NumElts = CondVal->getType()->getVectorNumElements(); + SmallVector<Constant *, 16> Mask; + Mask.reserve(NumElts); + Type *Int32Ty = Type::getInt32Ty(CondVal->getContext()); + for (unsigned i = 0; i != NumElts; ++i) { + Constant *Elt = CondC->getAggregateElement(i); + if (!Elt) + return nullptr; + + if (Elt->isOneValue()) { + // If the select condition element is true, choose from the 1st vector. + Mask.push_back(ConstantInt::get(Int32Ty, i)); + } else if (Elt->isNullValue()) { + // If the select condition element is false, choose from the 2nd vector. + Mask.push_back(ConstantInt::get(Int32Ty, i + NumElts)); + } else if (isa<UndefValue>(Elt)) { + // If the select condition element is undef, the shuffle mask is undef. + Mask.push_back(UndefValue::get(Int32Ty)); + } else { + // Bail out on a constant expression. + return nullptr; + } + } + + return new ShuffleVectorInst(SI.getTrueValue(), SI.getFalseValue(), + ConstantVector::get(Mask)); +} + +/// Reuse bitcasted operands between a compare and select: +/// select (cmp (bitcast C), (bitcast D)), (bitcast' C), (bitcast' D) --> +/// bitcast (select (cmp (bitcast C), (bitcast D)), (bitcast C), (bitcast D)) +static Instruction *foldSelectCmpBitcasts(SelectInst &Sel, + InstCombiner::BuilderTy &Builder) { + Value *Cond = Sel.getCondition(); + Value *TVal = Sel.getTrueValue(); + Value *FVal = Sel.getFalseValue(); + + CmpInst::Predicate Pred; + Value *A, *B; + if (!match(Cond, m_Cmp(Pred, m_Value(A), m_Value(B)))) + return nullptr; + + // The select condition is a compare instruction. If the select's true/false + // values are already the same as the compare operands, there's nothing to do. + if (TVal == A || TVal == B || FVal == A || FVal == B) + return nullptr; + + Value *C, *D; + if (!match(A, m_BitCast(m_Value(C))) || !match(B, m_BitCast(m_Value(D)))) + return nullptr; + + // select (cmp (bitcast C), (bitcast D)), (bitcast TSrc), (bitcast FSrc) + Value *TSrc, *FSrc; + if (!match(TVal, m_BitCast(m_Value(TSrc))) || + !match(FVal, m_BitCast(m_Value(FSrc)))) + return nullptr; + + // If the select true/false values are *different bitcasts* of the same source + // operands, make the select operands the same as the compare operands and + // cast the result. This is the canonical select form for min/max. + Value *NewSel; + if (TSrc == C && FSrc == D) { + // select (cmp (bitcast C), (bitcast D)), (bitcast' C), (bitcast' D) --> + // bitcast (select (cmp A, B), A, B) + NewSel = Builder.CreateSelect(Cond, A, B, "", &Sel); + } else if (TSrc == D && FSrc == C) { + // select (cmp (bitcast C), (bitcast D)), (bitcast' D), (bitcast' C) --> + // bitcast (select (cmp A, B), B, A) + NewSel = Builder.CreateSelect(Cond, B, A, "", &Sel); + } else { + return nullptr; + } + return CastInst::CreateBitOrPointerCast(NewSel, Sel.getType()); +} + Instruction *InstCombiner::visitSelectInst(SelectInst &SI) { Value *CondVal = SI.getCondition(); Value *TrueVal = SI.getTrueValue(); @@ -919,9 +1101,12 @@ Instruction *InstCombiner::visitSelectInst(SelectInst &SI) { Type *SelType = SI.getType(); if (Value *V = - SimplifySelectInst(CondVal, TrueVal, FalseVal, DL, TLI, DT, AC)) + SimplifySelectInst(CondVal, TrueVal, FalseVal, DL, &TLI, &DT, &AC)) return replaceInstUsesWith(SI, V); + if (Instruction *I = canonicalizeSelectToShuffle(SI)) + return I; + if (SelType->getScalarType()->isIntegerTy(1) && TrueVal->getType() == CondVal->getType()) { if (match(TrueVal, m_One())) { @@ -1085,7 +1270,7 @@ Instruction *InstCombiner::visitSelectInst(SelectInst &SI) { // See if we are selecting two values based on a comparison of the two values. if (ICmpInst *ICI = dyn_cast<ICmpInst>(CondVal)) - if (Instruction *Result = visitSelectInstWithICmp(SI, ICI)) + if (Instruction *Result = foldSelectInstWithICmp(SI, ICI)) return Result; if (Instruction *Add = foldAddSubSelect(SI, *Builder)) @@ -1095,12 +1280,15 @@ Instruction *InstCombiner::visitSelectInst(SelectInst &SI) { auto *TI = dyn_cast<Instruction>(TrueVal); auto *FI = dyn_cast<Instruction>(FalseVal); if (TI && FI && TI->getOpcode() == FI->getOpcode()) - if (Instruction *IV = FoldSelectOpOp(SI, TI, FI)) + if (Instruction *IV = foldSelectOpOp(SI, TI, FI)) return IV; + if (Instruction *I = foldSelectExtConst(SI)) + return I; + // See if we can fold the select into one of our operands. if (SelType->isIntOrIntVectorTy() || SelType->isFPOrFPVectorTy()) { - if (Instruction *FoldI = FoldSelectIntoOp(SI, TrueVal, FalseVal)) + if (Instruction *FoldI = foldSelectIntoOp(SI, TrueVal, FalseVal)) return FoldI; Value *LHS, *RHS, *LHS2, *RHS2; @@ -1124,9 +1312,9 @@ Instruction *InstCombiner::visitSelectInst(SelectInst &SI) { Cmp = Builder->CreateFCmp(Pred, LHS, RHS); } - Value *NewSI = Builder->CreateCast(CastOp, - Builder->CreateSelect(Cmp, LHS, RHS), - SelType); + Value *NewSI = Builder->CreateCast( + CastOp, Builder->CreateSelect(Cmp, LHS, RHS, SI.getName(), &SI), + SelType); return replaceInstUsesWith(SI, NewSI); } } @@ -1139,39 +1327,35 @@ Instruction *InstCombiner::visitSelectInst(SelectInst &SI) { // ABS(ABS(a)) -> ABS(a) // NABS(NABS(a)) -> NABS(a) if (SelectPatternFlavor SPF2 = matchSelectPattern(LHS, LHS2, RHS2).Flavor) - if (Instruction *R = FoldSPFofSPF(cast<Instruction>(LHS),SPF2,LHS2,RHS2, + if (Instruction *R = foldSPFofSPF(cast<Instruction>(LHS),SPF2,LHS2,RHS2, SI, SPF, RHS)) return R; if (SelectPatternFlavor SPF2 = matchSelectPattern(RHS, LHS2, RHS2).Flavor) - if (Instruction *R = FoldSPFofSPF(cast<Instruction>(RHS),SPF2,LHS2,RHS2, + if (Instruction *R = foldSPFofSPF(cast<Instruction>(RHS),SPF2,LHS2,RHS2, SI, SPF, LHS)) return R; } // MAX(~a, ~b) -> ~MIN(a, b) - if (SPF == SPF_SMAX || SPF == SPF_UMAX) { - if (IsFreeToInvert(LHS, LHS->hasNUses(2)) && - IsFreeToInvert(RHS, RHS->hasNUses(2))) { - - // This transform adds a xor operation and that extra cost needs to be - // justified. We look for simplifications that will result from - // applying this rule: - - bool Profitable = - (LHS->hasNUses(2) && match(LHS, m_Not(m_Value()))) || - (RHS->hasNUses(2) && match(RHS, m_Not(m_Value()))) || - (SI.hasOneUse() && match(*SI.user_begin(), m_Not(m_Value()))); - - if (Profitable) { - Value *NewLHS = Builder->CreateNot(LHS); - Value *NewRHS = Builder->CreateNot(RHS); - Value *NewCmp = SPF == SPF_SMAX - ? Builder->CreateICmpSLT(NewLHS, NewRHS) - : Builder->CreateICmpULT(NewLHS, NewRHS); - Value *NewSI = - Builder->CreateNot(Builder->CreateSelect(NewCmp, NewLHS, NewRHS)); - return replaceInstUsesWith(SI, NewSI); - } + if ((SPF == SPF_SMAX || SPF == SPF_UMAX) && + IsFreeToInvert(LHS, LHS->hasNUses(2)) && + IsFreeToInvert(RHS, RHS->hasNUses(2))) { + // For this transform to be profitable, we need to eliminate at least two + // 'not' instructions if we're going to add one 'not' instruction. + int NumberOfNots = + (LHS->hasNUses(2) && match(LHS, m_Not(m_Value()))) + + (RHS->hasNUses(2) && match(RHS, m_Not(m_Value()))) + + (SI.hasOneUse() && match(*SI.user_begin(), m_Not(m_Value()))); + + if (NumberOfNots >= 2) { + Value *NewLHS = Builder->CreateNot(LHS); + Value *NewRHS = Builder->CreateNot(RHS); + Value *NewCmp = SPF == SPF_SMAX + ? Builder->CreateICmpSLT(NewLHS, NewRHS) + : Builder->CreateICmpULT(NewLHS, NewRHS); + Value *NewSI = + Builder->CreateNot(Builder->CreateSelect(NewCmp, NewLHS, NewRHS)); + return replaceInstUsesWith(SI, NewSI); } } @@ -1182,8 +1366,8 @@ Instruction *InstCombiner::visitSelectInst(SelectInst &SI) { // See if we can fold the select into a phi node if the condition is a select. if (isa<PHINode>(SI.getCondition())) // The true/false values have to be live in the PHI predecessor's blocks. - if (CanSelectOperandBeMappingIntoPredBlock(TrueVal, SI) && - CanSelectOperandBeMappingIntoPredBlock(FalseVal, SI)) + if (canSelectOperandBeMappingIntoPredBlock(TrueVal, SI) && + canSelectOperandBeMappingIntoPredBlock(FalseVal, SI)) if (Instruction *NV = FoldOpIntoPhi(SI)) return NV; @@ -1233,7 +1417,7 @@ Instruction *InstCombiner::visitSelectInst(SelectInst &SI) { return &SI; } - if (VectorType* VecTy = dyn_cast<VectorType>(SelType)) { + if (VectorType *VecTy = dyn_cast<VectorType>(SelType)) { unsigned VWidth = VecTy->getNumElements(); APInt UndefElts(VWidth, 0); APInt AllOnesEltMask(APInt::getAllOnesValue(VWidth)); @@ -1266,5 +1450,8 @@ Instruction *InstCombiner::visitSelectInst(SelectInst &SI) { } } + if (Instruction *BitCastSel = foldSelectCmpBitcasts(SI, *Builder)) + return BitCastSel; + return nullptr; } |