author     dim <dim@FreeBSD.org>  2017-04-02 17:24:58 +0000
committer  dim <dim@FreeBSD.org>  2017-04-02 17:24:58 +0000
commit     60b571e49a90d38697b3aca23020d9da42fc7d7f (patch)
tree       99351324c24d6cb146b6285b6caffa4d26fce188 /contrib/llvm/lib/Analysis/ValueTracking.cpp
parent     bea1b22c7a9bce1dfdd73e6e5b65bc4752215180 (diff)
download   FreeBSD-src-60b571e49a90d38697b3aca23020d9da42fc7d7f.zip, FreeBSD-src-60b571e49a90d38697b3aca23020d9da42fc7d7f.tar.gz
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release:
MFC r309142 (by emaste):
Add WITH_LLD_AS_LD build knob
If set, it installs LLD as /usr/bin/ld. LLD (as of version 3.9) is not
capable of linking the world and kernel, but can self-host and link many
substantial applications. GNU ld continues to be used for the world and
kernel build, regardless of how this knob is set.
It is on by default for arm64, and off for all other CPU architectures.
Sponsored by: The FreeBSD Foundation
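For reference, such knobs are set in src.conf(5); a minimal sketch, using
the knob name from this commit (r312855 below renames it to LLD_IS_LD):

    # /etc/src.conf -- install LLD as /usr/bin/ld (the arm64 default)
    WITH_LLD_AS_LD=yes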
MFC r310840:
Reapply r310775; it now also builds correctly if lldb is disabled:
Move llvm-objdump from CLANG_EXTRAS to installed by default
We currently install three tools from binutils 2.17.50: as, ld, and
objdump. Work is underway to migrate to a permissively-licensed
tool-chain, with one goal being the retirement of binutils 2.17.50.
LLVM's llvm-objdump is intended to be compatible with GNU objdump
although it is currently missing some options and may have formatting
differences. Enable it by default for testing and further investigation.
It may later be changed to install as /usr/bin/objdump, if it becomes a
fully viable replacement.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D8879
MFC r312855 (by emaste):
Rename LLD_AS_LD to LLD_IS_LD, for consistency with CLANG_IS_CC
Reported by: Dan McGregor <dan.mcgregor usask.ca>
MFC r313559 | glebius | 2017-02-10 18:34:48 +0100 (Fri, 10 Feb 2017) | 5 lines
Don't check struct rtentry on FreeBSD; it is an internal kernel structure.
On other systems it may be an API structure for SIOCADDRT/SIOCDELRT.
Reviewed by: emaste, dim
MFC r314152 (by jkim):
Remove an assembler flag, which is redundant since r309124. Upstream
took care of it by introducing the NO_EXEC_STACK_DIRECTIVE macro.
http://llvm.org/viewvc/llvm-project?rev=273500&view=rev
Reviewed by: dim
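The macro mentioned above works along these lines; this is an illustrative
sketch of the idea, not the exact upstream definition (the real guard in
compiler-rt's assembly header lists specific OSes, and the section flags
character differs by architecture):

    /* On ELF targets, emit an empty .note.GNU-stack section so the object
     * is not marked as requiring an executable stack; this makes passing
     * -Wa,--noexecstack on the command line redundant. Expand to nothing
     * elsewhere. Each assembly source ends with NO_EXEC_STACK_DIRECTIVE. */
    #if defined(__ELF__)
    #define NO_EXEC_STACK_DIRECTIVE .section .note.GNU-stack,"",%progbits
    #else
    #define NO_EXEC_STACK_DIRECTIVE
    #endif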
MFC r314564:
Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to
4.0.0 (branches/release_40 296509). The release will follow soon.
Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
support to build; see UPDATING for more information.
Also note that as of 4.0.0, lld should be able to link the base system
on amd64 and aarch64. See the WITH_LLD_IS_LD setting in src.conf(5).
Please be aware, though, that this is still a work in progress.
Release notes for llvm, clang and lld will be available here:
<http://releases.llvm.org/4.0.0/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/clang/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/lld/docs/ReleaseNotes.html>
Thanks to Ed Maste, Jan Beich, Antoine Brodin and Eric Fiselier for
their help.
Relnotes: yes
Exp-run: antoine
PR: 215969, 216008
MFC r314708:
For now, revert r287232 from upstream llvm trunk (by Daniil Fukalov):
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ levels on quite a big unrolled
loop) and runs for an almost infinite time.
Added a cache of "equal" SCEV pairs to cut off further estimation
earlier. A recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
This commit is the cause of excessive compile times on skein_block.c
(and possibly other files) during kernel builds on amd64.
We never saw the problematic behavior described in this upstream commit,
so for now it is better to revert it. An upstream bug has been filed
here: https://bugs.llvm.org/show_bug.cgi?id=32142
Reported by: mjg
MFC r314795:
Reapply r287232 from upstream llvm trunk (by Daniil Fukalov):
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ levels on quite a big unrolled
loop) and runs for an almost infinite time.
Added a cache of "equal" SCEV pairs to cut off further estimation
earlier. A recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
Pull in r296992 from upstream llvm trunk (by Sanjoy Das):
[SCEV] Decrease the recursion threshold for CompareValueComplexity
Fixes PR32142.
r287232 accidentally increased the recursion threshold for
CompareValueComplexity from 2 to 32. This change reverses that
change by introducing a separate flag for CompareValueComplexity's
threshold.
The latter revision fixes the excessive compile times for skein_block.c.
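As a rough sketch of the approach described in these two upstream commits
(the Expr type, names, and constant below are invented for illustration;
the real code compares SCEVs and ends up with separate thresholds of 32
for CompareSCEVComplexity and 2 for CompareValueComplexity):

    #include <set>
    #include <utility>

    // Hypothetical stand-in for a SCEV expression node.
    struct Expr {
      int Kind = 0;
      const Expr *LHS = nullptr;
      const Expr *RHS = nullptr;
    };

    // Bound the recursion with a depth limit, and cache pairs already
    // proven "equal" so re-comparing them is O(1) instead of re-walking
    // two large expression trees.
    static int
    compareComplexity(std::set<std::pair<const Expr *, const Expr *>> &EqCache,
                      const Expr *A, const Expr *B, unsigned Depth = 0) {
      constexpr unsigned MaxCompareDepth = 32; // illustrative limit
      if (A == B || EqCache.count({A, B}))
        return 0;                 // already known equal
      if (Depth > MaxCompareDepth)
        return 0;                 // cut off: stop estimating, treat as equal
      if (A->Kind != B->Kind)
        return A->Kind < B->Kind ? -1 : 1;
      if (A->LHS && B->LHS)
        if (int C = compareComplexity(EqCache, A->LHS, B->LHS, Depth + 1))
          return C;
      if (A->RHS && B->RHS)
        if (int C = compareComplexity(EqCache, A->RHS, B->RHS, Depth + 1))
          return C;
      EqCache.insert({A, B});     // remember the proven-equal pair
      return 0;
    }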
MFC r314907 | mmel | 2017-03-08 12:40:27 +0100 (Wed, 08 Mar 2017) | 7 lines
Unbreak ARMv6 world.
The new compiler-rt library imported with clang 4.0.0 has several fatal
issues (a non-functional __udivsi3, for example) with ARM-specific
intrinsic functions. As a temporary workaround, until upstream solves
these problems, disable all thumb[1][2] related features.
MFC r315016:
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release.
We were already very close to the last release candidate, so this is a
pretty minor update.
Relnotes: yes
MFC r316005:
Revert r314907, and pull in r298713 from upstream compiler-rt trunk (by
Weiming Zhao):
builtins: Select correct code fragments when compiling for Thumb1/Thumb2/ARM ISA.
Summary:
The value of __ARM_ARCH_ISA_THUMB isn't based on the actual compilation
mode (-mthumb, -marm); it reflects the capability of the given CPU.
Due to this:
- use __thumb__ and __thumb2__ instead of __ARM_ARCH_ISA_THUMB
- use the '.thumb' directive consistently in all affected files
- decorate all thumb functions using
DEFINE_COMPILERRT_THUMB_FUNCTION()
---------
Note: This patch doesn't fix the broken Thumb1 variant of __udivsi3!
Reviewers: weimingz, rengolin, compnerd
Subscribers: aemerson, dim
Differential Revision: https://reviews.llvm.org/D30938
Discussed with: mmel
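A minimal sketch of the distinction (illustrative only; CURRENT_ISA is a
made-up name): __ARM_ARCH_ISA_THUMB describes what the target CPU supports,
even under -marm, while __thumb__ and __thumb2__ are only defined when the
translation unit is actually being compiled as Thumb code:

    /* Pick a code fragment based on the actual compilation mode, not the
     * CPU capability that __ARM_ARCH_ISA_THUMB reports. */
    #if defined(__thumb2__)
    #  define CURRENT_ISA "thumb2"  /* -mthumb on a Thumb-2-capable core */
    #elif defined(__thumb__)
    #  define CURRENT_ISA "thumb1"  /* -mthumb on a Thumb-1-only core */
    #else
    #  define CURRENT_ISA "arm"     /* -marm, whatever the core supports */
    #endif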
Diffstat (limited to 'contrib/llvm/lib/Analysis/ValueTracking.cpp')
-rw-r--r--   contrib/llvm/lib/Analysis/ValueTracking.cpp   747
1 file changed, 506 insertions(+), 241 deletions(-)
diff --git a/contrib/llvm/lib/Analysis/ValueTracking.cpp b/contrib/llvm/lib/Analysis/ValueTracking.cpp index f2b4078..be62858 100644 --- a/contrib/llvm/lib/Analysis/ValueTracking.cpp +++ b/contrib/llvm/lib/Analysis/ValueTracking.cpp @@ -51,6 +51,12 @@ const unsigned MaxDepth = 6; static cl::opt<unsigned> DomConditionsMaxUses("dom-conditions-max-uses", cl::Hidden, cl::init(20)); +// This optimization is known to cause performance regressions is some cases, +// keep it under a temporary flag for now. +static cl::opt<bool> +DontImproveNonNegativePhiBits("dont-improve-non-negative-phi-bits", + cl::Hidden, cl::init(true)); + /// Returns the bitwidth of the given scalar or pointer type (if unknown returns /// 0). For vector types, returns the element type's bitwidth. static unsigned getBitWidth(Type *Ty, const DataLayout &DL) { @@ -80,7 +86,7 @@ struct Query { /// isKnownNonZero, which calls computeKnownBits and ComputeSignBit and /// isKnownToBeAPowerOfTwo (all of which can call computeKnownBits), and so /// on. - std::array<const Value*, MaxDepth> Excluded; + std::array<const Value *, MaxDepth> Excluded; unsigned NumExcluded; Query(const DataLayout &DL, AssumptionCache *AC, const Instruction *CxtI, @@ -119,10 +125,10 @@ static const Instruction *safeCxtI(const Value *V, const Instruction *CxtI) { return nullptr; } -static void computeKnownBits(Value *V, APInt &KnownZero, APInt &KnownOne, +static void computeKnownBits(const Value *V, APInt &KnownZero, APInt &KnownOne, unsigned Depth, const Query &Q); -void llvm::computeKnownBits(Value *V, APInt &KnownZero, APInt &KnownOne, +void llvm::computeKnownBits(const Value *V, APInt &KnownZero, APInt &KnownOne, const DataLayout &DL, unsigned Depth, AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { @@ -130,7 +136,8 @@ void llvm::computeKnownBits(Value *V, APInt &KnownZero, APInt &KnownOne, Query(DL, AC, safeCxtI(V, CxtI), DT)); } -bool llvm::haveNoCommonBitsSet(Value *LHS, Value *RHS, const DataLayout &DL, +bool llvm::haveNoCommonBitsSet(const Value *LHS, const Value *RHS, + const DataLayout &DL, AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { assert(LHS->getType() == RHS->getType() && @@ -145,10 +152,10 @@ bool llvm::haveNoCommonBitsSet(Value *LHS, Value *RHS, const DataLayout &DL, return (LHSKnownZero | RHSKnownZero).isAllOnesValue(); } -static void ComputeSignBit(Value *V, bool &KnownZero, bool &KnownOne, +static void ComputeSignBit(const Value *V, bool &KnownZero, bool &KnownOne, unsigned Depth, const Query &Q); -void llvm::ComputeSignBit(Value *V, bool &KnownZero, bool &KnownOne, +void llvm::ComputeSignBit(const Value *V, bool &KnownZero, bool &KnownOne, const DataLayout &DL, unsigned Depth, AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { @@ -156,10 +163,11 @@ void llvm::ComputeSignBit(Value *V, bool &KnownZero, bool &KnownOne, Query(DL, AC, safeCxtI(V, CxtI), DT)); } -static bool isKnownToBeAPowerOfTwo(Value *V, bool OrZero, unsigned Depth, +static bool isKnownToBeAPowerOfTwo(const Value *V, bool OrZero, unsigned Depth, const Query &Q); -bool llvm::isKnownToBeAPowerOfTwo(Value *V, const DataLayout &DL, bool OrZero, +bool llvm::isKnownToBeAPowerOfTwo(const Value *V, const DataLayout &DL, + bool OrZero, unsigned Depth, AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { @@ -167,15 +175,16 @@ bool llvm::isKnownToBeAPowerOfTwo(Value *V, const DataLayout &DL, bool OrZero, Query(DL, AC, safeCxtI(V, CxtI), DT)); } -static bool isKnownNonZero(Value *V, unsigned 
Depth, const Query &Q); +static bool isKnownNonZero(const Value *V, unsigned Depth, const Query &Q); -bool llvm::isKnownNonZero(Value *V, const DataLayout &DL, unsigned Depth, +bool llvm::isKnownNonZero(const Value *V, const DataLayout &DL, unsigned Depth, AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { return ::isKnownNonZero(V, Depth, Query(DL, AC, safeCxtI(V, CxtI), DT)); } -bool llvm::isKnownNonNegative(Value *V, const DataLayout &DL, unsigned Depth, +bool llvm::isKnownNonNegative(const Value *V, const DataLayout &DL, + unsigned Depth, AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { bool NonNegative, Negative; @@ -183,7 +192,7 @@ bool llvm::isKnownNonNegative(Value *V, const DataLayout &DL, unsigned Depth, return NonNegative; } -bool llvm::isKnownPositive(Value *V, const DataLayout &DL, unsigned Depth, +bool llvm::isKnownPositive(const Value *V, const DataLayout &DL, unsigned Depth, AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { if (auto *CI = dyn_cast<ConstantInt>(V)) @@ -195,7 +204,7 @@ bool llvm::isKnownPositive(Value *V, const DataLayout &DL, unsigned Depth, isKnownNonZero(V, DL, Depth, AC, CxtI, DT); } -bool llvm::isKnownNegative(Value *V, const DataLayout &DL, unsigned Depth, +bool llvm::isKnownNegative(const Value *V, const DataLayout &DL, unsigned Depth, AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { bool NonNegative, Negative; @@ -203,41 +212,45 @@ bool llvm::isKnownNegative(Value *V, const DataLayout &DL, unsigned Depth, return Negative; } -static bool isKnownNonEqual(Value *V1, Value *V2, const Query &Q); +static bool isKnownNonEqual(const Value *V1, const Value *V2, const Query &Q); -bool llvm::isKnownNonEqual(Value *V1, Value *V2, const DataLayout &DL, - AssumptionCache *AC, const Instruction *CxtI, - const DominatorTree *DT) { +bool llvm::isKnownNonEqual(const Value *V1, const Value *V2, + const DataLayout &DL, + AssumptionCache *AC, const Instruction *CxtI, + const DominatorTree *DT) { return ::isKnownNonEqual(V1, V2, Query(DL, AC, safeCxtI(V1, safeCxtI(V2, CxtI)), DT)); } -static bool MaskedValueIsZero(Value *V, const APInt &Mask, unsigned Depth, +static bool MaskedValueIsZero(const Value *V, const APInt &Mask, unsigned Depth, const Query &Q); -bool llvm::MaskedValueIsZero(Value *V, const APInt &Mask, const DataLayout &DL, +bool llvm::MaskedValueIsZero(const Value *V, const APInt &Mask, + const DataLayout &DL, unsigned Depth, AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { return ::MaskedValueIsZero(V, Mask, Depth, Query(DL, AC, safeCxtI(V, CxtI), DT)); } -static unsigned ComputeNumSignBits(Value *V, unsigned Depth, const Query &Q); +static unsigned ComputeNumSignBits(const Value *V, unsigned Depth, + const Query &Q); -unsigned llvm::ComputeNumSignBits(Value *V, const DataLayout &DL, +unsigned llvm::ComputeNumSignBits(const Value *V, const DataLayout &DL, unsigned Depth, AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { return ::ComputeNumSignBits(V, Depth, Query(DL, AC, safeCxtI(V, CxtI), DT)); } -static void computeKnownBitsAddSub(bool Add, Value *Op0, Value *Op1, bool NSW, +static void computeKnownBitsAddSub(bool Add, const Value *Op0, const Value *Op1, + bool NSW, APInt &KnownZero, APInt &KnownOne, APInt &KnownZero2, APInt &KnownOne2, unsigned Depth, const Query &Q) { if (!Add) { - if (ConstantInt *CLHS = dyn_cast<ConstantInt>(Op0)) { + if (const ConstantInt *CLHS = dyn_cast<ConstantInt>(Op0)) { // We know that the top 
bits of C-X are clear if X contains less bits // than C (i.e. no wrap-around can happen). For example, 20-X is // positive if we can prove that X is >= 0 and < 16. @@ -311,7 +324,7 @@ static void computeKnownBitsAddSub(bool Add, Value *Op0, Value *Op1, bool NSW, } } -static void computeKnownBitsMul(Value *Op0, Value *Op1, bool NSW, +static void computeKnownBitsMul(const Value *Op0, const Value *Op1, bool NSW, APInt &KnownZero, APInt &KnownOne, APInt &KnownZero2, APInt &KnownOne2, unsigned Depth, const Query &Q) { @@ -398,7 +411,7 @@ void llvm::computeKnownBitsFromRangeMetadata(const MDNode &Ranges, } } -static bool isEphemeralValueOf(Instruction *I, const Value *E) { +static bool isEphemeralValueOf(const Instruction *I, const Value *E) { SmallVector<const Value *, 16> WorkSet(1, I); SmallPtrSet<const Value *, 32> Visited; SmallPtrSet<const Value *, 16> EphValues; @@ -406,7 +419,7 @@ static bool isEphemeralValueOf(Instruction *I, const Value *E) { // The instruction defining an assumption's condition itself is always // considered ephemeral to that assumption (even if it has other // non-ephemeral users). See r246696's test case for an example. - if (std::find(I->op_begin(), I->op_end(), E) != I->op_end()) + if (is_contained(I->operands(), E)) return true; while (!WorkSet.empty()) { @@ -415,8 +428,7 @@ static bool isEphemeralValueOf(Instruction *I, const Value *E) { continue; // If all uses of this value are ephemeral, then so is this value. - if (std::all_of(V->user_begin(), V->user_end(), - [&](const User *U) { return EphValues.count(U); })) { + if (all_of(V->users(), [&](const User *U) { return EphValues.count(U); })) { if (V == E) return true; @@ -456,9 +468,9 @@ static bool isAssumeLikeIntrinsic(const Instruction *I) { return false; } -static bool isValidAssumeForContext(Value *V, const Instruction *CxtI, - const DominatorTree *DT) { - Instruction *Inv = cast<Instruction>(V); +bool llvm::isValidAssumeForContext(const Instruction *Inv, + const Instruction *CxtI, + const DominatorTree *DT) { // There are two restrictions on the use of an assume: // 1. The assume must dominate the context (or the control flow must @@ -469,54 +481,42 @@ static bool isValidAssumeForContext(Value *V, const Instruction *CxtI, // the assume). if (DT) { - if (DT->dominates(Inv, CxtI)) { + if (DT->dominates(Inv, CxtI)) return true; - } else if (Inv->getParent() == CxtI->getParent()) { - // The context comes first, but they're both in the same block. Make sure - // there is nothing in between that might interrupt the control flow. - for (BasicBlock::const_iterator I = - std::next(BasicBlock::const_iterator(CxtI)), - IE(Inv); I != IE; ++I) - if (!isSafeToSpeculativelyExecute(&*I) && !isAssumeLikeIntrinsic(&*I)) - return false; - - return !isEphemeralValueOf(Inv, CxtI); - } + } else if (Inv->getParent() == CxtI->getParent()->getSinglePredecessor()) { + // We don't have a DT, but this trivially dominates. + return true; + } + // With or without a DT, the only remaining case we will check is if the + // instructions are in the same BB. Give up if that is not the case. + if (Inv->getParent() != CxtI->getParent()) return false; - } - // When we don't have a DT, we do a limited search... - if (Inv->getParent() == CxtI->getParent()->getSinglePredecessor()) { - return true; - } else if (Inv->getParent() == CxtI->getParent()) { + // If we have a dom tree, then we now know that the assume doens't dominate + // the other instruction. If we don't have a dom tree then we can check if + // the assume is first in the BB. 
+ if (!DT) { // Search forward from the assume until we reach the context (or the end // of the block); the common case is that the assume will come first. - for (BasicBlock::iterator I = std::next(BasicBlock::iterator(Inv)), + for (auto I = std::next(BasicBlock::const_iterator(Inv)), IE = Inv->getParent()->end(); I != IE; ++I) if (&*I == CxtI) return true; - - // The context must come first... - for (BasicBlock::const_iterator I = - std::next(BasicBlock::const_iterator(CxtI)), - IE(Inv); I != IE; ++I) - if (!isSafeToSpeculativelyExecute(&*I) && !isAssumeLikeIntrinsic(&*I)) - return false; - - return !isEphemeralValueOf(Inv, CxtI); } - return false; -} + // The context comes first, but they're both in the same block. Make sure + // there is nothing in between that might interrupt the control flow. + for (BasicBlock::const_iterator I = + std::next(BasicBlock::const_iterator(CxtI)), IE(Inv); + I != IE; ++I) + if (!isSafeToSpeculativelyExecute(&*I) && !isAssumeLikeIntrinsic(&*I)) + return false; -bool llvm::isValidAssumeForContext(const Instruction *I, - const Instruction *CxtI, - const DominatorTree *DT) { - return ::isValidAssumeForContext(const_cast<Instruction *>(I), CxtI, DT); + return !isEphemeralValueOf(Inv, CxtI); } -static void computeKnownBitsFromAssume(Value *V, APInt &KnownZero, +static void computeKnownBitsFromAssume(const Value *V, APInt &KnownZero, APInt &KnownOne, unsigned Depth, const Query &Q) { // Use of assumptions is context-sensitive. If we don't have a context, we @@ -526,7 +526,10 @@ static void computeKnownBitsFromAssume(Value *V, APInt &KnownZero, unsigned BitWidth = KnownZero.getBitWidth(); - for (auto &AssumeVH : Q.AC->assumptions()) { + // Note that the patterns below need to be kept in sync with the code + // in AssumptionCache::updateAffectedValues. + + for (auto &AssumeVH : Q.AC->assumptionsFor(V)) { if (!AssumeVH) continue; CallInst *I = cast<CallInst>(AssumeVH); @@ -778,6 +781,23 @@ static void computeKnownBitsFromAssume(Value *V, APInt &KnownZero, APInt::getHighBitsSet(BitWidth, RHSKnownZero.countLeadingOnes()); } } + + // If assumptions conflict with each other or previous known bits, then we + // have a logical fallacy. This should only happen when a program has + // undefined behavior. We can't assert/crash, so clear out the known bits and + // hope for the best. + + // FIXME: Publish a warning/remark that we have encountered UB or the compiler + // is broken. + + // FIXME: Implement a stronger version of "I give up" by invalidating/clearing + // the assumption cache. This should indicate that the cache is corrupted so + // future callers will not waste time repopulating it with faulty assumptions. + + if ((KnownZero & KnownOne) != 0) { + KnownZero.clearAllBits(); + KnownOne.clearAllBits(); + } } // Compute known bits from a shift operator, including those with a @@ -788,11 +808,11 @@ static void computeKnownBitsFromAssume(Value *V, APInt &KnownZero, // shift amount, compute the implied known-zero or known-one bits of the shift // operator's result respectively for that shift amount. The results from calling // KZF and KOF are conservatively combined for all permitted shift amounts. 
-template <typename KZFunctor, typename KOFunctor> -static void computeKnownBitsFromShiftOperator(Operator *I, - APInt &KnownZero, APInt &KnownOne, - APInt &KnownZero2, APInt &KnownOne2, - unsigned Depth, const Query &Q, KZFunctor KZF, KOFunctor KOF) { +static void computeKnownBitsFromShiftOperator( + const Operator *I, APInt &KnownZero, APInt &KnownOne, APInt &KnownZero2, + APInt &KnownOne2, unsigned Depth, const Query &Q, + function_ref<APInt(const APInt &, unsigned)> KZF, + function_ref<APInt(const APInt &, unsigned)> KOF) { unsigned BitWidth = KnownZero.getBitWidth(); if (auto *SA = dyn_cast<ConstantInt>(I->getOperand(1))) { @@ -801,6 +821,14 @@ static void computeKnownBitsFromShiftOperator(Operator *I, computeKnownBits(I->getOperand(0), KnownZero, KnownOne, Depth + 1, Q); KnownZero = KZF(KnownZero, ShiftAmt); KnownOne = KOF(KnownOne, ShiftAmt); + // If there is conflict between KnownZero and KnownOne, this must be an + // overflowing left shift, so the shift result is undefined. Clear KnownZero + // and KnownOne bits so that other code could propagate this undef. + if ((KnownZero & KnownOne) != 0) { + KnownZero.clearAllBits(); + KnownOne.clearAllBits(); + } + return; } @@ -866,7 +894,7 @@ static void computeKnownBitsFromShiftOperator(Operator *I, } } -static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, +static void computeKnownBitsFromOperator(const Operator *I, APInt &KnownZero, APInt &KnownOne, unsigned Depth, const Query &Q) { unsigned BitWidth = KnownZero.getBitWidth(); @@ -950,14 +978,64 @@ static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, KnownZero = APInt::getHighBitsSet(BitWidth, LeadZ); break; } - case Instruction::Select: + case Instruction::Select: { computeKnownBits(I->getOperand(2), KnownZero, KnownOne, Depth + 1, Q); computeKnownBits(I->getOperand(1), KnownZero2, KnownOne2, Depth + 1, Q); + const Value *LHS; + const Value *RHS; + SelectPatternFlavor SPF = matchSelectPattern(I, LHS, RHS).Flavor; + if (SelectPatternResult::isMinOrMax(SPF)) { + computeKnownBits(RHS, KnownZero, KnownOne, Depth + 1, Q); + computeKnownBits(LHS, KnownZero2, KnownOne2, Depth + 1, Q); + } else { + computeKnownBits(I->getOperand(2), KnownZero, KnownOne, Depth + 1, Q); + computeKnownBits(I->getOperand(1), KnownZero2, KnownOne2, Depth + 1, Q); + } + + unsigned MaxHighOnes = 0; + unsigned MaxHighZeros = 0; + if (SPF == SPF_SMAX) { + // If both sides are negative, the result is negative. + if (KnownOne[BitWidth - 1] && KnownOne2[BitWidth - 1]) + // We can derive a lower bound on the result by taking the max of the + // leading one bits. + MaxHighOnes = + std::max(KnownOne.countLeadingOnes(), KnownOne2.countLeadingOnes()); + // If either side is non-negative, the result is non-negative. + else if (KnownZero[BitWidth - 1] || KnownZero2[BitWidth - 1]) + MaxHighZeros = 1; + } else if (SPF == SPF_SMIN) { + // If both sides are non-negative, the result is non-negative. + if (KnownZero[BitWidth - 1] && KnownZero2[BitWidth - 1]) + // We can derive an upper bound on the result by taking the max of the + // leading zero bits. + MaxHighZeros = std::max(KnownZero.countLeadingOnes(), + KnownZero2.countLeadingOnes()); + // If either side is negative, the result is negative. + else if (KnownOne[BitWidth - 1] || KnownOne2[BitWidth - 1]) + MaxHighOnes = 1; + } else if (SPF == SPF_UMAX) { + // We can derive a lower bound on the result by taking the max of the + // leading one bits. 
+ MaxHighOnes = + std::max(KnownOne.countLeadingOnes(), KnownOne2.countLeadingOnes()); + } else if (SPF == SPF_UMIN) { + // We can derive an upper bound on the result by taking the max of the + // leading zero bits. + MaxHighZeros = + std::max(KnownZero.countLeadingOnes(), KnownZero2.countLeadingOnes()); + } + // Only known if known in both the LHS and RHS. KnownOne &= KnownOne2; KnownZero &= KnownZero2; + if (MaxHighOnes > 0) + KnownOne |= APInt::getHighBitsSet(BitWidth, MaxHighOnes); + if (MaxHighZeros > 0) + KnownZero |= APInt::getHighBitsSet(BitWidth, MaxHighZeros); break; + } case Instruction::FPTrunc: case Instruction::FPExt: case Instruction::FPToUI: @@ -967,8 +1045,8 @@ static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, break; // Can't work with floating point. case Instruction::PtrToInt: case Instruction::IntToPtr: - case Instruction::AddrSpaceCast: // Pointers could be different sizes. - // FALL THROUGH and handle them the same as zext/trunc. + // Fall through and handle them the same as zext/trunc. + LLVM_FALLTHROUGH; case Instruction::ZExt: case Instruction::Trunc: { Type *SrcTy = I->getOperand(0)->getType(); @@ -1020,13 +1098,23 @@ static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, } case Instruction::Shl: { // (shl X, C1) & C2 == 0 iff (X & C2 >>u C1) == 0 - auto KZF = [BitWidth](const APInt &KnownZero, unsigned ShiftAmt) { - return (KnownZero << ShiftAmt) | - APInt::getLowBitsSet(BitWidth, ShiftAmt); // Low bits known 0. + bool NSW = cast<OverflowingBinaryOperator>(I)->hasNoSignedWrap(); + auto KZF = [BitWidth, NSW](const APInt &KnownZero, unsigned ShiftAmt) { + APInt KZResult = + (KnownZero << ShiftAmt) | + APInt::getLowBitsSet(BitWidth, ShiftAmt); // Low bits known 0. + // If this shift has "nsw" keyword, then the result is either a poison + // value or has the same sign bit as the first operand. + if (NSW && KnownZero.isNegative()) + KZResult.setBit(BitWidth - 1); + return KZResult; }; - auto KOF = [BitWidth](const APInt &KnownOne, unsigned ShiftAmt) { - return KnownOne << ShiftAmt; + auto KOF = [BitWidth, NSW](const APInt &KnownOne, unsigned ShiftAmt) { + APInt KOResult = KnownOne << ShiftAmt; + if (NSW && KnownOne.isNegative()) + KOResult.setBit(BitWidth - 1); + return KOResult; }; computeKnownBitsFromShiftOperator(I, KnownZero, KnownOne, @@ -1143,7 +1231,7 @@ static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, } case Instruction::Alloca: { - AllocaInst *AI = cast<AllocaInst>(I); + const AllocaInst *AI = cast<AllocaInst>(I); unsigned Align = AI->getAlignment(); if (Align == 0) Align = Q.DL.getABITypeAlignment(AI->getAllocatedType()); @@ -1163,7 +1251,7 @@ static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, gep_type_iterator GTI = gep_type_begin(I); for (unsigned i = 1, e = I->getNumOperands(); i != e; ++i, ++GTI) { Value *Index = I->getOperand(i); - if (StructType *STy = dyn_cast<StructType>(*GTI)) { + if (StructType *STy = GTI.getStructTypeOrNull()) { // Handle struct member offset arithmetic. // Handle case when index is vector zeroinitializer @@ -1200,7 +1288,7 @@ static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, break; } case Instruction::PHI: { - PHINode *P = cast<PHINode>(I); + const PHINode *P = cast<PHINode>(I); // Handle the case of a simple two-predecessor recurrence PHI. // There's a lot more that could theoretically be done here, but // this is sufficient to catch some interesting cases. 
@@ -1237,9 +1325,46 @@ static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, APInt KnownZero3(KnownZero), KnownOne3(KnownOne); computeKnownBits(L, KnownZero3, KnownOne3, Depth + 1, Q); - KnownZero = APInt::getLowBitsSet(BitWidth, - std::min(KnownZero2.countTrailingOnes(), - KnownZero3.countTrailingOnes())); + KnownZero = APInt::getLowBitsSet( + BitWidth, std::min(KnownZero2.countTrailingOnes(), + KnownZero3.countTrailingOnes())); + + if (DontImproveNonNegativePhiBits) + break; + + auto *OverflowOp = dyn_cast<OverflowingBinaryOperator>(LU); + if (OverflowOp && OverflowOp->hasNoSignedWrap()) { + // If initial value of recurrence is nonnegative, and we are adding + // a nonnegative number with nsw, the result can only be nonnegative + // or poison value regardless of the number of times we execute the + // add in phi recurrence. If initial value is negative and we are + // adding a negative number with nsw, the result can only be + // negative or poison value. Similar arguments apply to sub and mul. + // + // (add non-negative, non-negative) --> non-negative + // (add negative, negative) --> negative + if (Opcode == Instruction::Add) { + if (KnownZero2.isNegative() && KnownZero3.isNegative()) + KnownZero.setBit(BitWidth - 1); + else if (KnownOne2.isNegative() && KnownOne3.isNegative()) + KnownOne.setBit(BitWidth - 1); + } + + // (sub nsw non-negative, negative) --> non-negative + // (sub nsw negative, non-negative) --> negative + else if (Opcode == Instruction::Sub && LL == I) { + if (KnownZero2.isNegative() && KnownOne3.isNegative()) + KnownZero.setBit(BitWidth - 1); + else if (KnownOne2.isNegative() && KnownZero3.isNegative()) + KnownOne.setBit(BitWidth - 1); + } + + // (mul nsw non-negative, non-negative) --> non-negative + else if (Opcode == Instruction::Mul && KnownZero2.isNegative() && + KnownZero3.isNegative()) + KnownZero.setBit(BitWidth - 1); + } + break; } } @@ -1284,12 +1409,12 @@ static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, // function. if (MDNode *MD = cast<Instruction>(I)->getMetadata(LLVMContext::MD_range)) computeKnownBitsFromRangeMetadata(*MD, KnownZero, KnownOne); - if (Value *RV = CallSite(I).getReturnedArgOperand()) { + if (const Value *RV = ImmutableCallSite(I).getReturnedArgOperand()) { computeKnownBits(RV, KnownZero2, KnownOne2, Depth + 1, Q); KnownZero |= KnownZero2; KnownOne |= KnownOne2; } - if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) { + if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) { switch (II->getIntrinsicID()) { default: break; case Intrinsic::bswap: @@ -1326,9 +1451,16 @@ static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, } } break; + case Instruction::ExtractElement: + // Look through extract element. At the moment we keep this simple and skip + // tracking the specific element. But at least we might find information + // valid for all elements of the vector (for example if vector is sign + // extended, shifted, etc). 
+ computeKnownBits(I->getOperand(0), KnownZero, KnownOne, Depth + 1, Q); + break; case Instruction::ExtractValue: if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I->getOperand(0))) { - ExtractValueInst *EVI = cast<ExtractValueInst>(I); + const ExtractValueInst *EVI = cast<ExtractValueInst>(I); if (EVI->getNumIndices() != 1) break; if (EVI->getIndices()[0] == 0) { switch (II->getIntrinsicID()) { @@ -1372,7 +1504,7 @@ static void computeKnownBitsFromOperator(Operator *I, APInt &KnownZero, /// where V is a vector, known zero, and known one values are the /// same width as the vector element, and the bit is set only if it is true /// for all of the elements in the vector. -void computeKnownBits(Value *V, APInt &KnownZero, APInt &KnownOne, +void computeKnownBits(const Value *V, APInt &KnownZero, APInt &KnownOne, unsigned Depth, const Query &Q) { assert(V && "No Value?"); assert(Depth <= MaxDepth && "Limit Search Depth"); @@ -1388,9 +1520,10 @@ void computeKnownBits(Value *V, APInt &KnownZero, APInt &KnownOne, KnownOne.getBitWidth() == BitWidth && "V, KnownOne and KnownZero should have same BitWidth"); - if (ConstantInt *CI = dyn_cast<ConstantInt>(V)) { - // We know all of the bits for a constant! - KnownOne = CI->getValue(); + const APInt *C; + if (match(V, m_APInt(C))) { + // We know all of the bits for a scalar constant or a splat vector constant! + KnownOne = *C; KnownZero = ~KnownOne; return; } @@ -1402,7 +1535,7 @@ void computeKnownBits(Value *V, APInt &KnownZero, APInt &KnownOne, } // Handle a constant vector by taking the intersection of the known bits of // each element. - if (ConstantDataSequential *CDS = dyn_cast<ConstantDataSequential>(V)) { + if (const ConstantDataSequential *CDS = dyn_cast<ConstantDataSequential>(V)) { // We know that CDS must be a vector of integers. Take the intersection of // each element. KnownZero.setAllBits(); KnownOne.setAllBits(); @@ -1415,7 +1548,7 @@ void computeKnownBits(Value *V, APInt &KnownZero, APInt &KnownOne, return; } - if (auto *CV = dyn_cast<ConstantVector>(V)) { + if (const auto *CV = dyn_cast<ConstantVector>(V)) { // We know that CV must be a vector of integers. Take the intersection of // each element. KnownZero.setAllBits(); KnownOne.setAllBits(); @@ -1438,6 +1571,14 @@ void computeKnownBits(Value *V, APInt &KnownZero, APInt &KnownOne, // Start out not knowing anything. KnownZero.clearAllBits(); KnownOne.clearAllBits(); + // We can't imply anything about undefs. + if (isa<UndefValue>(V)) + return; + + // There's no point in looking through other users of ConstantData for + // assumptions. Confirm that we've handled them all. + assert(!isa<ConstantData>(V) && "Unhandled constant data!"); + // Limit search depth. // All recursive calls that increase depth must come after this. if (Depth == MaxDepth) @@ -1445,13 +1586,13 @@ void computeKnownBits(Value *V, APInt &KnownZero, APInt &KnownOne, // A weak GlobalAlias is totally unknown. A non-weak GlobalAlias has // the bits of its aliasee. 
- if (GlobalAlias *GA = dyn_cast<GlobalAlias>(V)) { + if (const GlobalAlias *GA = dyn_cast<GlobalAlias>(V)) { if (!GA->isInterposable()) computeKnownBits(GA->getAliasee(), KnownZero, KnownOne, Depth + 1, Q); return; } - if (Operator *I = dyn_cast<Operator>(V)) + if (const Operator *I = dyn_cast<Operator>(V)) computeKnownBitsFromOperator(I, KnownZero, KnownOne, Depth, Q); // Aligned pointers have trailing zeros - refine KnownZero set @@ -1472,7 +1613,7 @@ void computeKnownBits(Value *V, APInt &KnownZero, APInt &KnownOne, /// Determine whether the sign bit is known to be zero or one. /// Convenience wrapper around computeKnownBits. -void ComputeSignBit(Value *V, bool &KnownZero, bool &KnownOne, +void ComputeSignBit(const Value *V, bool &KnownZero, bool &KnownOne, unsigned Depth, const Query &Q) { unsigned BitWidth = getBitWidth(V->getType(), Q.DL); if (!BitWidth) { @@ -1491,9 +1632,9 @@ void ComputeSignBit(Value *V, bool &KnownZero, bool &KnownOne, /// bit set when defined. For vectors return true if every element is known to /// be a power of two when defined. Supports values with integer or pointer /// types and vectors of integers. -bool isKnownToBeAPowerOfTwo(Value *V, bool OrZero, unsigned Depth, +bool isKnownToBeAPowerOfTwo(const Value *V, bool OrZero, unsigned Depth, const Query &Q) { - if (Constant *C = dyn_cast<Constant>(V)) { + if (const Constant *C = dyn_cast<Constant>(V)) { if (C->isNullValue()) return OrZero; @@ -1523,10 +1664,10 @@ bool isKnownToBeAPowerOfTwo(Value *V, bool OrZero, unsigned Depth, match(V, m_LShr(m_Value(X), m_Value())))) return isKnownToBeAPowerOfTwo(X, /*OrZero*/ true, Depth, Q); - if (ZExtInst *ZI = dyn_cast<ZExtInst>(V)) + if (const ZExtInst *ZI = dyn_cast<ZExtInst>(V)) return isKnownToBeAPowerOfTwo(ZI->getOperand(0), OrZero, Depth, Q); - if (SelectInst *SI = dyn_cast<SelectInst>(V)) + if (const SelectInst *SI = dyn_cast<SelectInst>(V)) return isKnownToBeAPowerOfTwo(SI->getTrueValue(), OrZero, Depth, Q) && isKnownToBeAPowerOfTwo(SI->getFalseValue(), OrZero, Depth, Q); @@ -1544,7 +1685,7 @@ bool isKnownToBeAPowerOfTwo(Value *V, bool OrZero, unsigned Depth, // Adding a power-of-two or zero to the same power-of-two or zero yields // either the original power-of-two, a larger power-of-two or zero. if (match(V, m_Add(m_Value(X), m_Value(Y)))) { - OverflowingBinaryOperator *VOBO = cast<OverflowingBinaryOperator>(V); + const OverflowingBinaryOperator *VOBO = cast<OverflowingBinaryOperator>(V); if (OrZero || VOBO->hasNoUnsignedWrap() || VOBO->hasNoSignedWrap()) { if (match(X, m_And(m_Specific(Y), m_Value())) || match(X, m_And(m_Value(), m_Specific(Y)))) @@ -1590,7 +1731,7 @@ bool isKnownToBeAPowerOfTwo(Value *V, bool OrZero, unsigned Depth, /// to be non-null. /// /// Currently this routine does not support vector GEPs. -static bool isGEPKnownNonNull(GEPOperator *GEP, unsigned Depth, +static bool isGEPKnownNonNull(const GEPOperator *GEP, unsigned Depth, const Query &Q) { if (!GEP->isInBounds() || GEP->getPointerAddressSpace() != 0) return false; @@ -1609,7 +1750,7 @@ static bool isGEPKnownNonNull(GEPOperator *GEP, unsigned Depth, for (gep_type_iterator GTI = gep_type_begin(GEP), GTE = gep_type_end(GEP); GTI != GTE; ++GTI) { // Struct types are easy -- they must always be indexed by a constant. 
- if (StructType *STy = dyn_cast<StructType>(*GTI)) { + if (StructType *STy = GTI.getStructTypeOrNull()) { ConstantInt *OpC = cast<ConstantInt>(GTI.getOperand()); unsigned ElementIdx = OpC->getZExtValue(); const StructLayout *SL = Q.DL.getStructLayout(STy); @@ -1649,7 +1790,7 @@ static bool isGEPKnownNonNull(GEPOperator *GEP, unsigned Depth, /// Does the 'Range' metadata (which must be a valid MD_range operand list) /// ensure that the value it's attached to is never Value? 'RangeType' is /// is the type of the value described by the range. -static bool rangeMetadataExcludesValue(MDNode* Ranges, const APInt& Value) { +static bool rangeMetadataExcludesValue(const MDNode* Ranges, const APInt& Value) { const unsigned NumRanges = Ranges->getNumOperands() / 2; assert(NumRanges >= 1); for (unsigned i = 0; i < NumRanges; ++i) { @@ -1668,7 +1809,7 @@ static bool rangeMetadataExcludesValue(MDNode* Ranges, const APInt& Value) { /// For vectors return true if every element is known to be non-zero when /// defined. Supports values with integer or pointer type and vectors of /// integers. -bool isKnownNonZero(Value *V, unsigned Depth, const Query &Q) { +bool isKnownNonZero(const Value *V, unsigned Depth, const Query &Q) { if (auto *C = dyn_cast<Constant>(V)) { if (C->isNullValue()) return false; @@ -1712,7 +1853,7 @@ bool isKnownNonZero(Value *V, unsigned Depth, const Query &Q) { if (V->getType()->isPointerTy()) { if (isKnownNonNull(V)) return true; - if (GEPOperator *GEP = dyn_cast<GEPOperator>(V)) + if (const GEPOperator *GEP = dyn_cast<GEPOperator>(V)) if (isGEPKnownNonNull(GEP, Depth, Q)) return true; } @@ -1732,7 +1873,7 @@ bool isKnownNonZero(Value *V, unsigned Depth, const Query &Q) { // if the lowest bit is shifted off the end. if (BitWidth && match(V, m_Shl(m_Value(X), m_Value(Y)))) { // shl nuw can't remove any non-zero bits. - OverflowingBinaryOperator *BO = cast<OverflowingBinaryOperator>(V); + const OverflowingBinaryOperator *BO = cast<OverflowingBinaryOperator>(V); if (BO->hasNoUnsignedWrap()) return isKnownNonZero(X, Depth, Q); @@ -1746,7 +1887,7 @@ bool isKnownNonZero(Value *V, unsigned Depth, const Query &Q) { // defined if the sign bit is shifted off the end. else if (match(V, m_Shr(m_Value(X), m_Value(Y)))) { // shr exact can only shift out zero bits. - PossiblyExactOperator *BO = cast<PossiblyExactOperator>(V); + const PossiblyExactOperator *BO = cast<PossiblyExactOperator>(V); if (BO->isExact()) return isKnownNonZero(X, Depth, Q); @@ -1817,7 +1958,7 @@ bool isKnownNonZero(Value *V, unsigned Depth, const Query &Q) { } // X * Y. else if (match(V, m_Mul(m_Value(X), m_Value(Y)))) { - OverflowingBinaryOperator *BO = cast<OverflowingBinaryOperator>(V); + const OverflowingBinaryOperator *BO = cast<OverflowingBinaryOperator>(V); // If X and Y are non-zero then so is X * Y as long as the multiplication // does not overflow. if ((BO->hasNoSignedWrap() || BO->hasNoUnsignedWrap()) && @@ -1825,13 +1966,13 @@ bool isKnownNonZero(Value *V, unsigned Depth, const Query &Q) { return true; } // (C ? X : Y) != 0 if X != 0 and Y != 0. - else if (SelectInst *SI = dyn_cast<SelectInst>(V)) { + else if (const SelectInst *SI = dyn_cast<SelectInst>(V)) { if (isKnownNonZero(SI->getTrueValue(), Depth, Q) && isKnownNonZero(SI->getFalseValue(), Depth, Q)) return true; } // PHI - else if (PHINode *PN = dyn_cast<PHINode>(V)) { + else if (const PHINode *PN = dyn_cast<PHINode>(V)) { // Try and detect a recurrence that monotonically increases from a // starting value, as these are common as induction variables. 
if (PN->getNumIncomingValues() == 2) { @@ -1865,8 +2006,8 @@ bool isKnownNonZero(Value *V, unsigned Depth, const Query &Q) { } /// Return true if V2 == V1 + X, where X is known non-zero. -static bool isAddOfNonZero(Value *V1, Value *V2, const Query &Q) { - BinaryOperator *BO = dyn_cast<BinaryOperator>(V1); +static bool isAddOfNonZero(const Value *V1, const Value *V2, const Query &Q) { + const BinaryOperator *BO = dyn_cast<BinaryOperator>(V1); if (!BO || BO->getOpcode() != Instruction::Add) return false; Value *Op = nullptr; @@ -1880,7 +2021,7 @@ static bool isAddOfNonZero(Value *V1, Value *V2, const Query &Q) { } /// Return true if it is known that V1 != V2. -static bool isKnownNonEqual(Value *V1, Value *V2, const Query &Q) { +static bool isKnownNonEqual(const Value *V1, const Value *V2, const Query &Q) { if (V1->getType()->isVectorTy() || V1 == V2) return false; if (V1->getType() != V2->getType()) @@ -1916,7 +2057,7 @@ static bool isKnownNonEqual(Value *V1, Value *V2, const Query &Q) { /// where V is a vector, the mask, known zero, and known one values are the /// same width as the vector element, and the bit is set only if it is true /// for all of the elements in the vector. -bool MaskedValueIsZero(Value *V, const APInt &Mask, unsigned Depth, +bool MaskedValueIsZero(const Value *V, const APInt &Mask, unsigned Depth, const Query &Q) { APInt KnownZero(Mask.getBitWidth(), 0), KnownOne(Mask.getBitWidth(), 0); computeKnownBits(V, KnownZero, KnownOne, Depth, Q); @@ -1927,8 +2068,9 @@ bool MaskedValueIsZero(Value *V, const APInt &Mask, unsigned Depth, /// minimum number of sign bits. Return 0 if the value is not a vector constant /// or if any element was not analyzed; otherwise, return the count for the /// element with the minimum number of sign bits. -static unsigned computeNumSignBitsVectorConstant(Value *V, unsigned TyBits) { - auto *CV = dyn_cast<Constant>(V); +static unsigned computeNumSignBitsVectorConstant(const Value *V, + unsigned TyBits) { + const auto *CV = dyn_cast<Constant>(V); if (!CV || !CV->getType()->isVectorTy()) return 0; @@ -1956,7 +2098,7 @@ static unsigned computeNumSignBitsVectorConstant(Value *V, unsigned TyBits) { /// after an "ashr X, 2", we know that the top 3 bits are all equal to each /// other, so we return 3. For vectors, return the number of sign bits for the /// vector element with the mininum number of known sign bits. -unsigned ComputeNumSignBits(Value *V, unsigned Depth, const Query &Q) { +unsigned ComputeNumSignBits(const Value *V, unsigned Depth, const Query &Q) { unsigned TyBits = Q.DL.getTypeSizeInBits(V->getType()->getScalarType()); unsigned Tmp, Tmp2; unsigned FirstAnswer = 1; @@ -1964,10 +2106,10 @@ unsigned ComputeNumSignBits(Value *V, unsigned Depth, const Query &Q) { // Note that ConstantInt is handled by the general computeKnownBits case // below. - if (Depth == 6) + if (Depth == MaxDepth) return 1; // Limit search depth. - Operator *U = dyn_cast<Operator>(V); + const Operator *U = dyn_cast<Operator>(V); switch (Operator::getOpcode(V)) { default: break; case Instruction::SExt: @@ -2125,7 +2267,7 @@ unsigned ComputeNumSignBits(Value *V, unsigned Depth, const Query &Q) { return std::min(Tmp, Tmp2)-1; case Instruction::PHI: { - PHINode *PN = cast<PHINode>(U); + const PHINode *PN = cast<PHINode>(U); unsigned NumIncomingValues = PN->getNumIncomingValues(); // Don't analyze large in-degree PHIs. 
if (NumIncomingValues > 4) break; @@ -2147,6 +2289,13 @@ unsigned ComputeNumSignBits(Value *V, unsigned Depth, const Query &Q) { // FIXME: it's tricky to do anything useful for this, but it is an important // case for targets like X86. break; + + case Instruction::ExtractElement: + // Look through extract element. At the moment we keep this simple and skip + // tracking the specific element. But at least we might find information + // valid for all elements of the vector (for example if vector is sign + // extended, shifted, etc). + return ComputeNumSignBits(U->getOperand(0), Depth + 1, Q); } // Finally, if we can prove that the top bits of the result are 0's or 1's, @@ -2413,10 +2562,7 @@ bool llvm::CannotBeNegativeZero(const Value *V, const TargetLibraryInfo *TLI, if (const ConstantFP *CFP = dyn_cast<ConstantFP>(V)) return !CFP->getValueAPF().isNegZero(); - // FIXME: Magic number! At the least, this should be given a name because it's - // used similarly in CannotBeOrderedLessThanZero(). A better fix may be to - // expose it as a parameter, so it can be used for testing / experimenting. - if (Depth == 6) + if (Depth == MaxDepth) return false; // Limit search depth. const Operator *I = dyn_cast<Operator>(V); @@ -2454,54 +2600,70 @@ bool llvm::CannotBeNegativeZero(const Value *V, const TargetLibraryInfo *TLI, return false; } -bool llvm::CannotBeOrderedLessThanZero(const Value *V, - const TargetLibraryInfo *TLI, - unsigned Depth) { - if (const ConstantFP *CFP = dyn_cast<ConstantFP>(V)) - return !CFP->getValueAPF().isNegative() || CFP->getValueAPF().isZero(); +/// If \p SignBitOnly is true, test for a known 0 sign bit rather than a +/// standard ordered compare. e.g. make -0.0 olt 0.0 be true because of the sign +/// bit despite comparing equal. +static bool cannotBeOrderedLessThanZeroImpl(const Value *V, + const TargetLibraryInfo *TLI, + bool SignBitOnly, + unsigned Depth) { + if (const ConstantFP *CFP = dyn_cast<ConstantFP>(V)) { + return !CFP->getValueAPF().isNegative() || + (!SignBitOnly && CFP->getValueAPF().isZero()); + } - // FIXME: Magic number! At the least, this should be given a name because it's - // used similarly in CannotBeNegativeZero(). A better fix may be to - // expose it as a parameter, so it can be used for testing / experimenting. - if (Depth == 6) - return false; // Limit search depth. + if (Depth == MaxDepth) + return false; // Limit search depth. const Operator *I = dyn_cast<Operator>(V); - if (!I) return false; + if (!I) + return false; switch (I->getOpcode()) { - default: break; + default: + break; // Unsigned integers are always nonnegative. case Instruction::UIToFP: return true; case Instruction::FMul: // x*x is always non-negative or a NaN. 
- if (I->getOperand(0) == I->getOperand(1)) + if (I->getOperand(0) == I->getOperand(1) && + (!SignBitOnly || cast<FPMathOperator>(I)->hasNoNaNs())) return true; - // Fall through + + LLVM_FALLTHROUGH; case Instruction::FAdd: case Instruction::FDiv: case Instruction::FRem: - return CannotBeOrderedLessThanZero(I->getOperand(0), TLI, Depth + 1) && - CannotBeOrderedLessThanZero(I->getOperand(1), TLI, Depth + 1); + return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly, + Depth + 1) && + cannotBeOrderedLessThanZeroImpl(I->getOperand(1), TLI, SignBitOnly, + Depth + 1); case Instruction::Select: - return CannotBeOrderedLessThanZero(I->getOperand(1), TLI, Depth + 1) && - CannotBeOrderedLessThanZero(I->getOperand(2), TLI, Depth + 1); + return cannotBeOrderedLessThanZeroImpl(I->getOperand(1), TLI, SignBitOnly, + Depth + 1) && + cannotBeOrderedLessThanZeroImpl(I->getOperand(2), TLI, SignBitOnly, + Depth + 1); case Instruction::FPExt: case Instruction::FPTrunc: // Widening/narrowing never change sign. - return CannotBeOrderedLessThanZero(I->getOperand(0), TLI, Depth + 1); + return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly, + Depth + 1); case Instruction::Call: Intrinsic::ID IID = getIntrinsicForCallSite(cast<CallInst>(I), TLI); switch (IID) { default: break; case Intrinsic::maxnum: - return CannotBeOrderedLessThanZero(I->getOperand(0), TLI, Depth + 1) || - CannotBeOrderedLessThanZero(I->getOperand(1), TLI, Depth + 1); + return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly, + Depth + 1) || + cannotBeOrderedLessThanZeroImpl(I->getOperand(1), TLI, SignBitOnly, + Depth + 1); case Intrinsic::minnum: - return CannotBeOrderedLessThanZero(I->getOperand(0), TLI, Depth + 1) && - CannotBeOrderedLessThanZero(I->getOperand(1), TLI, Depth + 1); + return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly, + Depth + 1) && + cannotBeOrderedLessThanZeroImpl(I->getOperand(1), TLI, SignBitOnly, + Depth + 1); case Intrinsic::exp: case Intrinsic::exp2: case Intrinsic::fabs: @@ -2513,18 +2675,30 @@ bool llvm::CannotBeOrderedLessThanZero(const Value *V, if (CI->getBitWidth() <= 64 && CI->getSExtValue() % 2u == 0) return true; } - return CannotBeOrderedLessThanZero(I->getOperand(0), TLI, Depth + 1); + return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly, + Depth + 1); case Intrinsic::fma: case Intrinsic::fmuladd: // x*x+y is non-negative if y is non-negative. return I->getOperand(0) == I->getOperand(1) && - CannotBeOrderedLessThanZero(I->getOperand(2), TLI, Depth + 1); + (!SignBitOnly || cast<FPMathOperator>(I)->hasNoNaNs()) && + cannotBeOrderedLessThanZeroImpl(I->getOperand(2), TLI, SignBitOnly, + Depth + 1); } break; } return false; } +bool llvm::CannotBeOrderedLessThanZero(const Value *V, + const TargetLibraryInfo *TLI) { + return cannotBeOrderedLessThanZeroImpl(V, TLI, false, 0); +} + +bool llvm::SignBitMustBeZero(const Value *V, const TargetLibraryInfo *TLI) { + return cannotBeOrderedLessThanZeroImpl(V, TLI, true, 0); +} + /// If the specified value can be set by repeating the same byte in memory, /// return the i8 value that it is represented with. 
This is /// true for all i8 values obviously, but is also true for i32 0, i32 -1, @@ -2768,11 +2942,17 @@ Value *llvm::GetPointerBaseWithConstantOffset(Value *Ptr, int64_t &Offset, break; if (GEPOperator *GEP = dyn_cast<GEPOperator>(Ptr)) { - APInt GEPOffset(BitWidth, 0); + // If one of the values we have visited is an addrspacecast, then + // the pointer type of this GEP may be different from the type + // of the Ptr parameter which was passed to this function. This + // means when we construct GEPOffset, we need to use the size + // of GEP's pointer type rather than the size of the original + // pointer type. + APInt GEPOffset(DL.getPointerTypeSizeInBits(Ptr->getType()), 0); if (!GEP->accumulateConstantOffset(DL, GEPOffset)) break; - ByteOffset += GEPOffset; + ByteOffset += GEPOffset.getSExtValue(); Ptr = GEP->getPointerOperand(); } else if (Operator::getOpcode(Ptr) == Instruction::BitCast || @@ -2886,13 +3066,14 @@ bool llvm::getConstantStringInfo(const Value *V, StringRef &Str, /// If we can compute the length of the string pointed to by /// the specified pointer, return 'len+1'. If we can't, return 0. -static uint64_t GetStringLengthH(Value *V, SmallPtrSetImpl<PHINode*> &PHIs) { +static uint64_t GetStringLengthH(const Value *V, + SmallPtrSetImpl<const PHINode*> &PHIs) { // Look through noop bitcast instructions. V = V->stripPointerCasts(); // If this is a PHI node, there are two cases: either we have already seen it // or we haven't. - if (PHINode *PN = dyn_cast<PHINode>(V)) { + if (const PHINode *PN = dyn_cast<PHINode>(V)) { if (!PHIs.insert(PN).second) return ~0ULL; // already in the set. @@ -2914,7 +3095,7 @@ static uint64_t GetStringLengthH(Value *V, SmallPtrSetImpl<PHINode*> &PHIs) { } // strlen(select(c,x,y)) -> strlen(x) ^ strlen(y) - if (SelectInst *SI = dyn_cast<SelectInst>(V)) { + if (const SelectInst *SI = dyn_cast<SelectInst>(V)) { uint64_t Len1 = GetStringLengthH(SI->getTrueValue(), PHIs); if (Len1 == 0) return 0; uint64_t Len2 = GetStringLengthH(SI->getFalseValue(), PHIs); @@ -2935,10 +3116,10 @@ static uint64_t GetStringLengthH(Value *V, SmallPtrSetImpl<PHINode*> &PHIs) { /// If we can compute the length of the string pointed to by /// the specified pointer, return 'len+1'. If we can't, return 0. -uint64_t llvm::GetStringLength(Value *V) { +uint64_t llvm::GetStringLength(const Value *V) { if (!V->getType()->isPointerTy()) return 0; - SmallPtrSet<PHINode*, 32> PHIs; + SmallPtrSet<const PHINode*, 32> PHIs; uint64_t Len = GetStringLengthH(V, PHIs); // If Len is ~0ULL, we had an infinite phi cycle: this is dead code, so return // an empty string as a length. @@ -2947,7 +3128,8 @@ uint64_t llvm::GetStringLength(Value *V) { /// \brief \p PN defines a loop-variant pointer to an object. Check if the /// previous iteration of the loop was referring to the same object as \p PN. -static bool isSameUnderlyingObjectInLoop(PHINode *PN, LoopInfo *LI) { +static bool isSameUnderlyingObjectInLoop(const PHINode *PN, + const LoopInfo *LI) { // Find the loop-defined value. 
Loop *L = LI->getLoopFor(PN->getParent()); if (PN->getNumIncomingValues() != 2) @@ -3126,6 +3308,7 @@ bool llvm::isSafeToSpeculativelyExecute(const Value *V, case Intrinsic::dbg_value: return true; + case Intrinsic::bitreverse: case Intrinsic::bswap: case Intrinsic::ctlz: case Intrinsic::ctpop: @@ -3208,11 +3391,11 @@ bool llvm::isKnownNonNull(const Value *V) { if (const Argument *A = dyn_cast<Argument>(V)) return A->hasByValOrInAllocaAttr() || A->hasNonNullAttr(); - // A global variable in address space 0 is non null unless extern weak. - // Other address spaces may have null as a valid address for a global, - // so we can't assume anything. + // A global variable in address space 0 is non null unless extern weak + // or an absolute symbol reference. Other address spaces may have null as a + // valid address for a global, so we can't assume anything. if (const GlobalValue *GV = dyn_cast<GlobalValue>(V)) - return !GV->hasExternalWeakLinkage() && + return !GV->isAbsoluteSymbolRef() && !GV->hasExternalWeakLinkage() && GV->getType()->getAddressSpace() == 0; // A Load tagged with nonnull metadata is never null. @@ -3230,6 +3413,9 @@ static bool isKnownNonNullFromDominatingCondition(const Value *V, const Instruction *CtxI, const DominatorTree *DT) { assert(V->getType()->isPointerTy() && "V must be pointer type"); + assert(!isa<ConstantData>(V) && "Did not expect ConstantPointerNull"); + assert(CtxI && "Context instruction required for analysis"); + assert(DT && "Dominator tree required for analysis"); unsigned NumUsesExplored = 0; for (auto *U : V->users()) { @@ -3266,13 +3452,20 @@ static bool isKnownNonNullFromDominatingCondition(const Value *V, bool llvm::isKnownNonNullAt(const Value *V, const Instruction *CtxI, const DominatorTree *DT) { + if (isa<ConstantPointerNull>(V) || isa<UndefValue>(V)) + return false; + if (isKnownNonNull(V)) return true; - return CtxI ? 
::isKnownNonNullFromDominatingCondition(V, CtxI, DT) : false; + if (!CtxI || !DT) + return false; + + return ::isKnownNonNullFromDominatingCondition(V, CtxI, DT); } -OverflowResult llvm::computeOverflowForUnsignedMul(Value *LHS, Value *RHS, +OverflowResult llvm::computeOverflowForUnsignedMul(const Value *LHS, + const Value *RHS, const DataLayout &DL, AssumptionCache *AC, const Instruction *CxtI, @@ -3322,7 +3515,8 @@ OverflowResult llvm::computeOverflowForUnsignedMul(Value *LHS, Value *RHS, return OverflowResult::MayOverflow; } -OverflowResult llvm::computeOverflowForUnsignedAdd(Value *LHS, Value *RHS, +OverflowResult llvm::computeOverflowForUnsignedAdd(const Value *LHS, + const Value *RHS, const DataLayout &DL, AssumptionCache *AC, const Instruction *CxtI, @@ -3351,9 +3545,13 @@ OverflowResult llvm::computeOverflowForUnsignedAdd(Value *LHS, Value *RHS, return OverflowResult::MayOverflow; } -static OverflowResult computeOverflowForSignedAdd( - Value *LHS, Value *RHS, AddOperator *Add, const DataLayout &DL, - AssumptionCache *AC, const Instruction *CxtI, const DominatorTree *DT) { +static OverflowResult computeOverflowForSignedAdd(const Value *LHS, + const Value *RHS, + const AddOperator *Add, + const DataLayout &DL, + AssumptionCache *AC, + const Instruction *CxtI, + const DominatorTree *DT) { if (Add && Add->hasNoSignedWrap()) { return OverflowResult::NeverOverflows; } @@ -3395,7 +3593,8 @@ static OverflowResult computeOverflowForSignedAdd( return OverflowResult::MayOverflow; } -bool llvm::isOverflowIntrinsicNoWrap(IntrinsicInst *II, DominatorTree &DT) { +bool llvm::isOverflowIntrinsicNoWrap(const IntrinsicInst *II, + const DominatorTree &DT) { #ifndef NDEBUG auto IID = II->getIntrinsicID(); assert((IID == Intrinsic::sadd_with_overflow || @@ -3407,11 +3606,11 @@ bool llvm::isOverflowIntrinsicNoWrap(IntrinsicInst *II, DominatorTree &DT) { "Not an overflow intrinsic!"); #endif - SmallVector<BranchInst *, 2> GuardingBranches; - SmallVector<ExtractValueInst *, 2> Results; + SmallVector<const BranchInst *, 2> GuardingBranches; + SmallVector<const ExtractValueInst *, 2> Results; - for (User *U : II->users()) { - if (auto *EVI = dyn_cast<ExtractValueInst>(U)) { + for (const User *U : II->users()) { + if (const auto *EVI = dyn_cast<ExtractValueInst>(U)) { assert(EVI->getNumIndices() == 1 && "Obvious from CI's type"); if (EVI->getIndices()[0] == 0) @@ -3419,8 +3618,8 @@ bool llvm::isOverflowIntrinsicNoWrap(IntrinsicInst *II, DominatorTree &DT) { else { assert(EVI->getIndices()[0] == 1 && "Obvious from CI's type"); - for (auto *U : EVI->users()) - if (auto *B = dyn_cast<BranchInst>(U)) { + for (const auto *U : EVI->users()) + if (const auto *B = dyn_cast<BranchInst>(U)) { assert(B->isConditional() && "How else is it using an i1?"); GuardingBranches.push_back(B); } @@ -3432,13 +3631,13 @@ bool llvm::isOverflowIntrinsicNoWrap(IntrinsicInst *II, DominatorTree &DT) { } } - auto AllUsesGuardedByBranch = [&](BranchInst *BI) { + auto AllUsesGuardedByBranch = [&](const BranchInst *BI) { BasicBlockEdge NoWrapEdge(BI->getParent(), BI->getSuccessor(1)); if (!NoWrapEdge.isSingleEdge()) return false; // Check if all users of the add are provably no-wrap. - for (auto *Result : Results) { + for (const auto *Result : Results) { // If the extractvalue itself is not executed on overflow, the we don't // need to check each use separately, since domination is transitive. 
@@ -3456,7 +3655,7 @@ bool llvm::isOverflowIntrinsicNoWrap(IntrinsicInst *II, DominatorTree &DT) {
 }
 
-OverflowResult llvm::computeOverflowForSignedAdd(AddOperator *Add,
+OverflowResult llvm::computeOverflowForSignedAdd(const AddOperator *Add,
                                                  const DataLayout &DL,
                                                  AssumptionCache *AC,
                                                  const Instruction *CxtI,
@@ -3465,7 +3664,8 @@ OverflowResult llvm::computeOverflowForSignedAdd(AddOperator *Add,
                                  Add, DL, AC, CxtI, DT);
 }
 
-OverflowResult llvm::computeOverflowForSignedAdd(Value *LHS, Value *RHS,
+OverflowResult llvm::computeOverflowForSignedAdd(const Value *LHS,
+                                                 const Value *RHS,
                                                  const DataLayout &DL,
                                                  AssumptionCache *AC,
                                                  const Instruction *CxtI,
@@ -3502,12 +3702,27 @@ bool llvm::isGuaranteedToTransferExecutionToSuccessor(const Instruction *I) {
     return false;
 
   // Calls can throw, or contain an infinite loop, or kill the process.
-  if (CallSite CS = CallSite(const_cast<Instruction*>(I))) {
-    // Calls which don't write to arbitrary memory are safe.
-    // FIXME: Ignoring infinite loops without any side-effects is too aggressive,
-    // but it's consistent with other passes. See http://llvm.org/PR965 .
-    // FIXME: This isn't aggressive enough; a call which only writes to a
-    // global is guaranteed to return.
+  if (auto CS = ImmutableCallSite(I)) {
+    // Call sites that throw have implicit non-local control flow.
+    if (!CS.doesNotThrow())
+      return false;
+
+    // Non-throwing call sites can loop infinitely, call exit/pthread_exit
+    // etc. and thus not return. However, LLVM already assumes that
+    //
+    //  - Thread exiting actions are modeled as writes to memory invisible to
+    //    the program.
+    //
+    //  - Loops that don't have side effects (side effects are volatile/atomic
+    //    stores and IO) always terminate (see http://llvm.org/PR965).
+    //    Furthermore IO itself is also modeled as writes to memory invisible to
+    //    the program.
+    //
+    // We rely on those assumptions here, and use the memory effects of the call
+    // target as a proxy for checking that it always returns.
+
+    // FIXME: This isn't aggressive enough; a call which only writes to a global
+    // is guaranteed to return.
     return CS.onlyReadsMemory() || CS.onlyAccessesArgMemory() ||
            match(I, m_Intrinsic<Intrinsic::assume>());
   }
@@ -3688,7 +3903,7 @@ bool llvm::isKnownNotFullPoison(const Instruction *PoisonI) {
   return false;
 }
 
-static bool isKnownNonNaN(Value *V, FastMathFlags FMF) {
+static bool isKnownNonNaN(const Value *V, FastMathFlags FMF) {
   if (FMF.noNaNs())
     return true;
 
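A sketch of the stricter call-site rule, illustrative only and assuming the LLVM 4.0 API. An unannotated call no longer counts as transferring execution to its successor, because it may throw; marking the call site nothrow and readonly restores the positive answer:

#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"

using namespace llvm;

int main() {
  LLVMContext Ctx;
  Module M("demo", Ctx);
  auto *FTy = FunctionType::get(Type::getVoidTy(Ctx), false);
  auto *Caller = cast<Function>(M.getOrInsertFunction("caller", FTy));
  auto *Callee = cast<Function>(M.getOrInsertFunction("callee", FTy));

  IRBuilder<> B(BasicBlock::Create(Ctx, "entry", Caller));
  CallInst *CI = B.CreateCall(Callee);

  // An unannotated call may throw or never return.
  bool Before = isGuaranteedToTransferExecutionToSuccessor(CI); // false

  // nothrow + readonly lets the analysis conclude the call must return.
  CI->setDoesNotThrow();
  CI->setOnlyReadsMemory();
  bool After = isGuaranteedToTransferExecutionToSuccessor(CI); // true

  return (!Before && After) ? 0 : 1;
}

The rewritten check refuses potentially-throwing call sites before consulting memory effects, whereas the old code never considered exceptions at all.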
@@ -3697,12 +3912,90 @@ static bool isKnownNonNaN(Value *V, FastMathFlags FMF) {
   return false;
 }
 
-static bool isKnownNonZero(Value *V) {
+static bool isKnownNonZero(const Value *V) {
   if (auto *C = dyn_cast<ConstantFP>(V))
     return !C->isZero();
 
   return false;
 }
 
+/// Match non-obvious integer minimum and maximum sequences.
+static SelectPatternResult matchMinMax(CmpInst::Predicate Pred,
+                                       Value *CmpLHS, Value *CmpRHS,
+                                       Value *TrueVal, Value *FalseVal,
+                                       Value *&LHS, Value *&RHS) {
+  if (Pred != CmpInst::ICMP_SGT && Pred != CmpInst::ICMP_SLT)
+    return {SPF_UNKNOWN, SPNB_NA, false};
+
+  // Z = X -nsw Y
+  // (X >s Y) ? 0 : Z ==> (Z >s 0) ? 0 : Z ==> SMIN(Z, 0)
+  // (X <s Y) ? 0 : Z ==> (Z <s 0) ? 0 : Z ==> SMAX(Z, 0)
+  if (match(TrueVal, m_Zero()) &&
+      match(FalseVal, m_NSWSub(m_Specific(CmpLHS), m_Specific(CmpRHS)))) {
+    LHS = TrueVal;
+    RHS = FalseVal;
+    return {Pred == CmpInst::ICMP_SGT ? SPF_SMIN : SPF_SMAX, SPNB_NA, false};
+  }
+
+  // Z = X -nsw Y
+  // (X >s Y) ? Z : 0 ==> (Z >s 0) ? Z : 0 ==> SMAX(Z, 0)
+  // (X <s Y) ? Z : 0 ==> (Z <s 0) ? Z : 0 ==> SMIN(Z, 0)
+  if (match(FalseVal, m_Zero()) &&
+      match(TrueVal, m_NSWSub(m_Specific(CmpLHS), m_Specific(CmpRHS)))) {
+    LHS = TrueVal;
+    RHS = FalseVal;
+    return {Pred == CmpInst::ICMP_SGT ? SPF_SMAX : SPF_SMIN, SPNB_NA, false};
+  }
+
+  const APInt *C1;
+  if (!match(CmpRHS, m_APInt(C1)))
+    return {SPF_UNKNOWN, SPNB_NA, false};
+
+  // An unsigned min/max can be written with a signed compare.
+  const APInt *C2;
+  if ((CmpLHS == TrueVal && match(FalseVal, m_APInt(C2))) ||
+      (CmpLHS == FalseVal && match(TrueVal, m_APInt(C2)))) {
+    // Is the sign bit set?
+    // (X <s 0) ? X : MAXVAL ==> (X >u MAXVAL) ? X : MAXVAL ==> UMAX
+    // (X <s 0) ? MAXVAL : X ==> (X >u MAXVAL) ? MAXVAL : X ==> UMIN
+    if (Pred == CmpInst::ICMP_SLT && *C1 == 0 && C2->isMaxSignedValue()) {
+      LHS = TrueVal;
+      RHS = FalseVal;
+      return {CmpLHS == TrueVal ? SPF_UMAX : SPF_UMIN, SPNB_NA, false};
+    }
+
+    // Is the sign bit clear?
+    // (X >s -1) ? MINVAL : X ==> (X <u MINVAL) ? MINVAL : X ==> UMAX
+    // (X >s -1) ? X : MINVAL ==> (X <u MINVAL) ? X : MINVAL ==> UMIN
+    if (Pred == CmpInst::ICMP_SGT && C1->isAllOnesValue() &&
+        C2->isMinSignedValue()) {
+      LHS = TrueVal;
+      RHS = FalseVal;
+      return {CmpLHS == FalseVal ? SPF_UMAX : SPF_UMIN, SPNB_NA, false};
+    }
+  }
+
+  // Look through 'not' ops to find disguised signed min/max.
+  // (X >s C) ? ~X : ~C ==> (~X <s ~C) ? ~X : ~C ==> SMIN(~X, ~C)
+  // (X <s C) ? ~X : ~C ==> (~X >s ~C) ? ~X : ~C ==> SMAX(~X, ~C)
+  if (match(TrueVal, m_Not(m_Specific(CmpLHS))) &&
+      match(FalseVal, m_APInt(C2)) && ~(*C1) == *C2) {
+    LHS = TrueVal;
+    RHS = FalseVal;
+    return {Pred == CmpInst::ICMP_SGT ? SPF_SMIN : SPF_SMAX, SPNB_NA, false};
+  }
+
+  // (X >s C) ? ~C : ~X ==> (~X <s ~C) ? ~C : ~X ==> SMAX(~C, ~X)
+  // (X <s C) ? ~C : ~X ==> (~X >s ~C) ? ~C : ~X ==> SMIN(~C, ~X)
+  if (match(FalseVal, m_Not(m_Specific(CmpLHS))) &&
+      match(TrueVal, m_APInt(C2)) && ~(*C1) == *C2) {
+    LHS = TrueVal;
+    RHS = FalseVal;
+    return {Pred == CmpInst::ICMP_SGT ? SPF_SMAX : SPF_SMIN, SPNB_NA, false};
+  }
+
+  return {SPF_UNKNOWN, SPNB_NA, false};
+}
+
 static SelectPatternResult matchSelectPattern(CmpInst::Predicate Pred,
                                               FastMathFlags FMF,
                                               Value *CmpLHS, Value *CmpRHS,
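The new matchMinMax is only reachable through the public matchSelectPattern entry point. A hedged sketch (LLVM 4.0 API assumed; the function "f" and its arguments are illustrative) that builds the first pattern in the table above, SMAX(X - Y, 0):

#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"

using namespace llvm;

int main() {
  LLVMContext Ctx;
  Module M("demo", Ctx);
  auto *I32 = Type::getInt32Ty(Ctx);
  auto *FTy = FunctionType::get(I32, {I32, I32}, false);
  auto *F = cast<Function>(M.getOrInsertFunction("f", FTy));
  auto AI = F->arg_begin();
  Value *X = &*AI++;
  Value *Y = &*AI;

  IRBuilder<> B(BasicBlock::Create(Ctx, "entry", F));
  // (X >s Y) ? (X -nsw Y) : 0, i.e. SMAX(X - Y, 0) in disguise.
  Value *Sub = B.CreateNSWSub(X, Y);
  Value *Cmp = B.CreateICmpSGT(X, Y);
  Value *Sel = B.CreateSelect(Cmp, Sub, B.getInt32(0));

  Value *L, *R;
  SelectPatternResult SPR = matchSelectPattern(Sel, L, R);
  // SPR.Flavor == SPF_SMAX, with L == Sub and R == the zero constant.
  return SPR.Flavor == SPF_SMAX ? 0 : 1;
}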
@@ -3801,39 +4094,26 @@ static SelectPatternResult matchSelectPattern(CmpInst::Predicate Pred,
     }
   }
 
-  if (ConstantInt *C1 = dyn_cast<ConstantInt>(CmpRHS)) {
+  const APInt *C1;
+  if (match(CmpRHS, m_APInt(C1))) {
     if ((CmpLHS == TrueVal && match(FalseVal, m_Neg(m_Specific(CmpLHS)))) ||
         (CmpLHS == FalseVal && match(TrueVal, m_Neg(m_Specific(CmpLHS))))) {
 
       // ABS(X) ==> (X >s 0) ? X : -X and (X >s -1) ? X : -X
       // NABS(X) ==> (X >s 0) ? -X : X and (X >s -1) ? -X : X
-      if (Pred == ICmpInst::ICMP_SGT && (C1->isZero() || C1->isMinusOne())) {
+      if (Pred == ICmpInst::ICMP_SGT && (*C1 == 0 || C1->isAllOnesValue())) {
         return {(CmpLHS == TrueVal) ? SPF_ABS : SPF_NABS, SPNB_NA, false};
       }
 
       // ABS(X) ==> (X <s 0) ? -X : X and (X <s 1) ? -X : X
       // NABS(X) ==> (X <s 0) ? X : -X and (X <s 1) ? X : -X
-      if (Pred == ICmpInst::ICMP_SLT && (C1->isZero() || C1->isOne())) {
+      if (Pred == ICmpInst::ICMP_SLT && (*C1 == 0 || *C1 == 1)) {
         return {(CmpLHS == FalseVal) ? SPF_ABS : SPF_NABS, SPNB_NA, false};
       }
     }
-
-    // Y >s C ? ~Y : ~C == ~Y <s ~C ? ~Y : ~C = SMIN(~Y, ~C)
-    if (const auto *C2 = dyn_cast<ConstantInt>(FalseVal)) {
-      if (Pred == ICmpInst::ICMP_SGT && C1->getType() == C2->getType() &&
-          ~C1->getValue() == C2->getValue() &&
-          (match(TrueVal, m_Not(m_Specific(CmpLHS))) ||
-           match(CmpLHS, m_Not(m_Specific(TrueVal))))) {
-        LHS = TrueVal;
-        RHS = FalseVal;
-        return {SPF_SMIN, SPNB_NA, false};
-      }
-    }
   }
 
-  // TODO: (X > 4) ? X : 5 --> (X >= 5) ? X : 5 --> MAX(X, 5)
-
-  return {SPF_UNKNOWN, SPNB_NA, false};
+  return matchMinMax(Pred, CmpLHS, CmpRHS, TrueVal, FalseVal, LHS, RHS);
 }
 
 static Value *lookThroughCast(CmpInst *CmpI, Value *V1, Value *V2,
@@ -3932,30 +4212,9 @@ SelectPatternResult llvm::matchSelectPattern(Value *V, Value *&LHS, Value *&RHS,
                              LHS, RHS);
 }
 
-ConstantRange llvm::getConstantRangeFromMetadata(MDNode &Ranges) {
-  const unsigned NumRanges = Ranges.getNumOperands() / 2;
-  assert(NumRanges >= 1 && "Must have at least one range!");
-  assert(Ranges.getNumOperands() % 2 == 0 && "Must be a sequence of pairs");
-
-  auto *FirstLow = mdconst::extract<ConstantInt>(Ranges.getOperand(0));
-  auto *FirstHigh = mdconst::extract<ConstantInt>(Ranges.getOperand(1));
-
-  ConstantRange CR(FirstLow->getValue(), FirstHigh->getValue());
-
-  for (unsigned i = 1; i < NumRanges; ++i) {
-    auto *Low = mdconst::extract<ConstantInt>(Ranges.getOperand(2 * i + 0));
-    auto *High = mdconst::extract<ConstantInt>(Ranges.getOperand(2 * i + 1));
-
-    // Note: unionWith will potentially create a range that contains values not
-    // contained in any of the original N ranges.
-    CR = CR.unionWith(ConstantRange(Low->getValue(), High->getValue()));
-  }
-
-  return CR;
-}
-
 /// Return true if "icmp Pred LHS RHS" is always true.
-static bool isTruePredicate(CmpInst::Predicate Pred, Value *LHS, Value *RHS,
+static bool isTruePredicate(CmpInst::Predicate Pred,
+                            const Value *LHS, const Value *RHS,
                             const DataLayout &DL, unsigned Depth,
                             AssumptionCache *AC, const Instruction *CxtI,
                             const DominatorTree *DT) {
@@ -3984,7 +4243,8 @@ static bool isTruePredicate(CmpInst::Predicate Pred, Value *LHS, Value *RHS,
       return true;
 
     // Match A to (X +_{nuw} CA) and B to (X +_{nuw} CB)
-    auto MatchNUWAddsToSameValue = [&](Value *A, Value *B, Value *&X,
+    auto MatchNUWAddsToSameValue = [&](const Value *A, const Value *B,
+                                       const Value *&X,
                                        const APInt *&CA, const APInt *&CB) {
      if (match(A, m_NUWAdd(m_Value(X), m_APInt(CA))) &&
          match(B, m_NUWAdd(m_Specific(X), m_APInt(CB))))
@@ -4004,7 +4264,7 @@ static bool isTruePredicate(CmpInst::Predicate Pred, Value *LHS, Value *RHS,
       return false;
     };
 
-    Value *X;
+    const Value *X;
     const APInt *CLHS, *CRHS;
     if (MatchNUWAddsToSameValue(LHS, RHS, X, CLHS, CRHS))
       return CLHS->ule(*CRHS);
@@ -4017,8 +4277,9 @@ static bool isTruePredicate(CmpInst::Predicate Pred, Value *LHS, Value *RHS,
 /// Return true if "icmp Pred BLHS BRHS" is true whenever "icmp Pred
 /// ALHS ARHS" is true. Otherwise, return None.
 static Optional<bool>
-isImpliedCondOperands(CmpInst::Predicate Pred, Value *ALHS, Value *ARHS,
-                      Value *BLHS, Value *BRHS, const DataLayout &DL,
+isImpliedCondOperands(CmpInst::Predicate Pred, const Value *ALHS,
+                      const Value *ARHS, const Value *BLHS,
+                      const Value *BRHS, const DataLayout &DL,
                       unsigned Depth, AssumptionCache *AC,
                       const Instruction *CxtI, const DominatorTree *DT) {
   switch (Pred) {
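These helpers compose: for unsigned predicates, isImpliedCondOperands defers to isTruePredicate, whose nuw-add rule proves facts like x u<= x +nuw C. A hedged sketch of the whole chain through the public isImpliedCondition entry point, assuming the LLVM 4.0 API; the IR built here is illustrative:

#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"

using namespace llvm;

int main() {
  LLVMContext Ctx;
  Module M("demo", Ctx);
  auto *I32 = Type::getInt32Ty(Ctx);
  auto *FTy = FunctionType::get(Type::getVoidTy(Ctx), {I32, I32}, false);
  auto *F = cast<Function>(M.getOrInsertFunction("f", FTy));
  auto AI = F->arg_begin();
  Value *X = &*AI++;
  Value *Y = &*AI;

  IRBuilder<> B(BasicBlock::Create(Ctx, "entry", F));
  // A: (x +nuw 3) u< y    B: x u< y
  Value *A = B.CreateICmpULT(B.CreateNUWAdd(X, B.getInt32(3)), Y);
  Value *Bcond = B.CreateICmpULT(X, Y);

  // x u<= x +nuw 3 always holds, so A implies B.
  Optional<bool> Implied = isImpliedCondition(A, Bcond, M.getDataLayout());
  // Implied should be Optional<bool>(true).
  return (Implied && *Implied) ? 0 : 1;
}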
@@ -4045,7 +4306,8 @@ isImpliedCondOperands(CmpInst::Predicate Pred, Value *ALHS, Value *ARHS,
 
 /// Return true if the operands of the two compares match. IsSwappedOps is true
 /// when the operands match, but are swapped.
-static bool isMatchingOps(Value *ALHS, Value *ARHS, Value *BLHS, Value *BRHS,
+static bool isMatchingOps(const Value *ALHS, const Value *ARHS,
+                          const Value *BLHS, const Value *BRHS,
                           bool &IsSwappedOps) {
 
   bool IsMatchingOps = (ALHS == BLHS && ARHS == BRHS);
@@ -4057,9 +4319,11 @@ static bool isMatchingOps(Value *ALHS, Value *ARHS, Value *BLHS, Value *BRHS,
 /// true. Return false if "icmp1 APred ALHS ARHS" implies "icmp2 BPred BLHS
 /// BRHS" is false. Otherwise, return None if we can't infer anything.
 static Optional<bool> isImpliedCondMatchingOperands(CmpInst::Predicate APred,
-                                                    Value *ALHS, Value *ARHS,
+                                                    const Value *ALHS,
+                                                    const Value *ARHS,
                                                     CmpInst::Predicate BPred,
-                                                    Value *BLHS, Value *BRHS,
+                                                    const Value *BLHS,
+                                                    const Value *BRHS,
                                                     bool IsSwappedOps) {
   // Canonicalize the operands so they're matching.
   if (IsSwappedOps) {
@@ -4078,9 +4342,10 @@ static Optional<bool> isImpliedCondMatchingOperands(CmpInst::Predicate APred,
 /// true. Return false if "icmp1 APred ALHS C1" implies "icmp2 BPred BLHS
 /// C2" is false. Otherwise, return None if we can't infer anything.
 static Optional<bool>
-isImpliedCondMatchingImmOperands(CmpInst::Predicate APred, Value *ALHS,
-                                 ConstantInt *C1, CmpInst::Predicate BPred,
-                                 Value *BLHS, ConstantInt *C2) {
+isImpliedCondMatchingImmOperands(CmpInst::Predicate APred, const Value *ALHS,
+                                 const ConstantInt *C1,
+                                 CmpInst::Predicate BPred,
+                                 const Value *BLHS, const ConstantInt *C2) {
   assert(ALHS == BLHS && "LHS operands must match.");
   ConstantRange DomCR =
       ConstantRange::makeExactICmpRegion(APred, C1->getValue());
@@ -4095,7 +4360,7 @@ isImpliedCondMatchingImmOperands(CmpInst::Predicate APred, Value *ALHS,
   return None;
 }
 
-Optional<bool> llvm::isImpliedCondition(Value *LHS, Value *RHS,
+Optional<bool> llvm::isImpliedCondition(const Value *LHS, const Value *RHS,
                                         const DataLayout &DL, bool InvertAPred,
                                         unsigned Depth, AssumptionCache *AC,
                                         const Instruction *CxtI,
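The range reasoning inside isImpliedCondMatchingImmOperands can be reproduced directly with ConstantRange. A hedged sketch (LLVM 4.0 API assumed) of the check behind "x u< 10 implies x u< 20", written with the simpler contains query rather than the intersection/difference pair the function itself uses:

#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/InstrTypes.h"

using namespace llvm;

int main() {
  // Values of x satisfying "x u< 10" form [0, 10); "x u< 20" forms [0, 20).
  ConstantRange DomCR =
      ConstantRange::makeExactICmpRegion(CmpInst::ICMP_ULT, APInt(8, 10));
  ConstantRange CR =
      ConstantRange::makeExactICmpRegion(CmpInst::ICMP_ULT, APInt(8, 20));

  // [0, 10) is contained in [0, 20), so "x u< 10" implies "x u< 20".
  bool Implies = CR.contains(DomCR); // true

  // The ranges are not disjoint, so falsity of the second cannot be inferred.
  bool Disjoint = DomCR.intersectWith(CR).isEmptySet(); // false

  return (Implies && !Disjoint) ? 0 : 1;
}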