path: root/contrib/llvm/tools/clang/lib/CodeGen/CGCall.cpp
author     dim <dim@FreeBSD.org>    2017-04-02 17:24:58 +0000
committer  dim <dim@FreeBSD.org>    2017-04-02 17:24:58 +0000
commit     60b571e49a90d38697b3aca23020d9da42fc7d7f (patch)
tree       99351324c24d6cb146b6285b6caffa4d26fce188 /contrib/llvm/tools/clang/lib/CodeGen/CGCall.cpp
parent     bea1b22c7a9bce1dfdd73e6e5b65bc4752215180 (diff)
download   FreeBSD-src-60b571e49a90d38697b3aca23020d9da42fc7d7f.zip
           FreeBSD-src-60b571e49a90d38697b3aca23020d9da42fc7d7f.tar.gz
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release:
MFC r309142 (by emaste):

  Add WITH_LLD_AS_LD build knob

  If set, it installs LLD as /usr/bin/ld. LLD (as of version 3.9) is not
  capable of linking the world and kernel, but can self-host and link many
  substantial applications. GNU ld continues to be used for the world and
  kernel build, regardless of how this knob is set.

  It is on by default for arm64, and off for all other CPU architectures.

  Sponsored by:  The FreeBSD Foundation

MFC r310840:

  Reapply r310775, now it also builds correctly if lldb is disabled:

  Move llvm-objdump from CLANG_EXTRAS to installed by default

  We currently install three tools from binutils 2.17.50: as, ld, and
  objdump. Work is underway to migrate to a permissively-licensed
  tool-chain, with one goal being the retirement of binutils 2.17.50.

  LLVM's llvm-objdump is intended to be compatible with GNU objdump,
  although it is currently missing some options and may have formatting
  differences. Enable it by default for testing and further investigation.
  It may later be changed to install as /usr/bin/objdump, if it becomes a
  fully viable replacement.

  Reviewed by:  emaste
  Differential Revision:  https://reviews.freebsd.org/D8879

MFC r312855 (by emaste):

  Rename LLD_AS_LD to LLD_IS_LD, for consistency with CLANG_IS_CC

  Reported by:  Dan McGregor <dan.mcgregor usask.ca>

MFC r313559 | glebius | 2017-02-10 18:34:48 +0100 (Fri, 10 Feb 2017):

  Don't check struct rtentry on FreeBSD; it is an internal kernel
  structure. On other systems it may be an API structure for
  SIOCADDRT/SIOCDELRT.

  Reviewed by:  emaste, dim

MFC r314152 (by jkim):

  Remove an assembler flag, which is redundant since r309124. The upstream
  took care of it by introducing the macro NO_EXEC_STACK_DIRECTIVE.

  http://llvm.org/viewvc/llvm-project?rev=273500&view=rev

  Reviewed by:  dim

MFC r314564:

  Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to
  4.0.0 (branches/release_40 296509). The release will follow soon.

  Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
  support to build; see UPDATING for more information.

  Also note that as of 4.0.0, lld should be able to link the base system
  on amd64 and aarch64. See the WITH_LLD_IS_LD setting in src.conf(5).
  Though please be aware that this is work in progress.

  Release notes for llvm, clang and lld will be available here:
  <http://releases.llvm.org/4.0.0/docs/ReleaseNotes.html>
  <http://releases.llvm.org/4.0.0/tools/clang/docs/ReleaseNotes.html>
  <http://releases.llvm.org/4.0.0/tools/lld/docs/ReleaseNotes.html>

  Thanks to Ed Maste, Jan Beich, Antoine Brodin and Eric Fiselier for
  their help.

  Relnotes:  yes
  Exp-run:   antoine
  PR:        215969, 216008

MFC r314708:

  For now, revert r287232 from upstream llvm trunk (by Daniil Fukalov):

    [SCEV] limit recursion depth of CompareSCEVComplexity

    Summary:
    CompareSCEVComplexity goes too deep (50+ on quite a big unrolled loop)
    and runs for an almost infinite time.

    Added a cache of "equal" SCEV pairs to cut off further estimation
    earlier. A recursion depth limit was also introduced as a parameter.

    Reviewers: sanjoy

    Subscribers: mzolotukhin, tstellarAMD, llvm-commits

    Differential Revision: https://reviews.llvm.org/D26389

  This commit is the cause of excessive compile times on skein_block.c
  (and possibly other files) during kernel builds on amd64. We never saw
  the problematic behavior described in this upstream commit, so for now
  it is better to revert it. An upstream bug has been filed here:
  https://bugs.llvm.org/show_bug.cgi?id=32142

  Reported by:  mjg
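The r287232 change quoted above boils down to two safeguards in a recursive
comparison: a depth cutoff and a cache of pairs already proven equal. A minimal
C++ sketch of that pattern, with simplified types that are not the actual SCEV
implementation:

    #include <set>
    #include <utility>

    // Hypothetical stand-in for the real SCEV node type.
    struct Expr { int id; Expr *lhs = nullptr; Expr *rhs = nullptr; };

    static const unsigned MaxCompareDepth = 32;   // upstream adds this as a parameter

    // Returns -1/0/+1.  Bails out as "equal" once the depth limit is reached and
    // caches proven-equal pairs so they are not re-walked on later queries.
    int compareComplexity(const Expr *A, const Expr *B,
                          std::set<std::pair<const Expr *, const Expr *>> &EqCache,
                          unsigned Depth = 0) {
      if (A == B || EqCache.count({A, B}))
        return 0;
      if (Depth > MaxCompareDepth)                // recursion depth cutoff
        return 0;
      if (A->id != B->id)
        return A->id < B->id ? -1 : 1;
      const Expr *Kids[2][2] = {{A->lhs, B->lhs}, {A->rhs, B->rhs}};
      for (auto &K : Kids) {
        if (!K[0] || !K[1])
          continue;
        if (int C = compareComplexity(K[0], K[1], EqCache, Depth + 1))
          return C;
      }
      EqCache.insert({A, B});                     // remember the "equal" result
      return 0;
    }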
MFC r314795:

  Reapply r287232 from upstream llvm trunk (by Daniil Fukalov):

    [SCEV] limit recursion depth of CompareSCEVComplexity

    Summary:
    CompareSCEVComplexity goes too deep (50+ on quite a big unrolled loop)
    and runs for an almost infinite time.

    Added a cache of "equal" SCEV pairs to cut off further estimation
    earlier. A recursion depth limit was also introduced as a parameter.

    Reviewers: sanjoy

    Subscribers: mzolotukhin, tstellarAMD, llvm-commits

    Differential Revision: https://reviews.llvm.org/D26389

  Pull in r296992 from upstream llvm trunk (by Sanjoy Das):

    [SCEV] Decrease the recursion threshold for CompareValueComplexity

    Fixes PR32142.

    r287232 accidentally increased the recursion threshold for
    CompareValueComplexity from 2 to 32. This change reverses that by
    introducing a separate flag for CompareValueComplexity's threshold.

  The latter revision fixes the excessive compile times for skein_block.c.

MFC r314907 | mmel | 2017-03-08 12:40:27 +0100 (Wed, 08 Mar 2017):

  Unbreak the ARMv6 world.

  The new compiler_rt library imported with clang 4.0.0 has several fatal
  issues (a non-functional __udivsi3, for example) with ARM-specific
  intrinsic functions. As a temporary workaround, until upstream solves
  these problems, disable all thumb[1][2]-related features.

MFC r315016:

  Update clang, llvm, lld, lldb, compiler-rt and libc++ to the 4.0.0
  release. We were already very close to the last release candidate, so
  this is a pretty minor update.

  Relnotes:  yes

MFC r316005:

  Revert r314907, and pull in r298713 from upstream compiler-rt trunk (by
  Weiming Zhao):

    builtins: Select correct code fragments when compiling for the
    Thumb1/Thumb2/ARM ISA.

    Summary:
    The value of __ARM_ARCH_ISA_THUMB isn't based on the actual
    compilation mode (-mthumb, -marm); it reflects the capability of the
    given CPU.

    Due to this:
    - use __thumb__ and __thumb2__ instead of __ARM_ARCH_ISA_THUMB
    - use the '.thumb' directive consistently in all affected files
    - decorate all thumb functions using DEFINE_COMPILERRT_THUMB_FUNCTION()

    Note: this patch doesn't fix the broken Thumb1 variant of __udivsi3!

    Reviewers: weimingz, rengolin, compnerd

    Subscribers: aemerson, dim

    Differential Revision: https://reviews.llvm.org/D30938

  Discussed with:  mmel
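The macro selection described in r298713 can be pictured with the standard ARM
predefined macros; a minimal sketch, not the actual compiler-rt source:

    // __ARM_ARCH_ISA_THUMB only describes what the target CPU is capable of,
    // while __thumb__ / __thumb2__ reflect the mode actually being compiled
    // (-mthumb vs. -marm), which is why compiler-rt switched to the latter.
    const char *isa_in_use(void) {
    #if defined(__thumb2__)
      return "thumb2";
    #elif defined(__thumb__)
      return "thumb1";
    #else
      return "arm";
    #endif
    }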
Diffstat (limited to 'contrib/llvm/tools/clang/lib/CodeGen/CGCall.cpp')
-rw-r--r--  contrib/llvm/tools/clang/lib/CodeGen/CGCall.cpp  358
1 file changed, 223 insertions, 135 deletions
diff --git a/contrib/llvm/tools/clang/lib/CodeGen/CGCall.cpp b/contrib/llvm/tools/clang/lib/CodeGen/CGCall.cpp
index 242b596..c7c61e0 100644
--- a/contrib/llvm/tools/clang/lib/CodeGen/CGCall.cpp
+++ b/contrib/llvm/tools/clang/lib/CodeGen/CGCall.cpp
@@ -29,6 +29,7 @@
#include "clang/CodeGen/SwiftCallingConv.h"
#include "clang/Frontend/CodeGenOptions.h"
#include "llvm/ADT/StringExtras.h"
+#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Attributes.h"
#include "llvm/IR/CallingConv.h"
#include "llvm/IR/CallSite.h"
@@ -47,6 +48,7 @@ unsigned CodeGenTypes::ClangCallConvToLLVMCallConv(CallingConv CC) {
default: return llvm::CallingConv::C;
case CC_X86StdCall: return llvm::CallingConv::X86_StdCall;
case CC_X86FastCall: return llvm::CallingConv::X86_FastCall;
+ case CC_X86RegCall: return llvm::CallingConv::X86_RegCall;
case CC_X86ThisCall: return llvm::CallingConv::X86_ThisCall;
case CC_X86_64Win64: return llvm::CallingConv::X86_64_Win64;
case CC_X86_64SysV: return llvm::CallingConv::X86_64_SysV;
@@ -172,6 +174,9 @@ static CallingConv getCallingConventionForDecl(const Decl *D, bool IsWindows) {
if (D->hasAttr<FastCallAttr>())
return CC_X86FastCall;
+ if (D->hasAttr<RegCallAttr>())
+ return CC_X86RegCall;
+
if (D->hasAttr<ThisCallAttr>())
return CC_X86ThisCall;
@@ -388,15 +393,13 @@ CodeGenTypes::arrangeFunctionDeclaration(const FunctionDecl *FD) {
// When declaring a function without a prototype, always use a
// non-variadic type.
- if (isa<FunctionNoProtoType>(FTy)) {
- CanQual<FunctionNoProtoType> noProto = FTy.getAs<FunctionNoProtoType>();
+ if (CanQual<FunctionNoProtoType> noProto = FTy.getAs<FunctionNoProtoType>()) {
return arrangeLLVMFunctionInfo(
noProto->getReturnType(), /*instanceMethod=*/false,
/*chainCall=*/false, None, noProto->getExtInfo(), {},RequiredArgs::All);
}
- assert(isa<FunctionProtoType>(FTy));
- return arrangeFreeFunctionType(FTy.getAs<FunctionProtoType>(), FD);
+ return arrangeFreeFunctionType(FTy.castAs<FunctionProtoType>(), FD);
}
/// Arrange the argument and result information for the declaration or
@@ -1647,6 +1650,8 @@ void CodeGenModule::ConstructAttributeList(
FuncAttrs.addAttribute(llvm::Attribute::NoReturn);
if (TargetDecl->hasAttr<NoDuplicateAttr>())
FuncAttrs.addAttribute(llvm::Attribute::NoDuplicate);
+ if (TargetDecl->hasAttr<ConvergentAttr>())
+ FuncAttrs.addAttribute(llvm::Attribute::Convergent);
if (const FunctionDecl *Fn = dyn_cast<FunctionDecl>(TargetDecl)) {
AddAttributesFromFunctionProtoType(
@@ -1676,6 +1681,14 @@ void CodeGenModule::ConstructAttributeList(
HasAnyX86InterruptAttr = TargetDecl->hasAttr<AnyX86InterruptAttr>();
HasOptnone = TargetDecl->hasAttr<OptimizeNoneAttr>();
+ if (auto *AllocSize = TargetDecl->getAttr<AllocSizeAttr>()) {
+ Optional<unsigned> NumElemsParam;
+ // alloc_size args are base-1, 0 means not present.
+ if (unsigned N = AllocSize->getNumElemsParam())
+ NumElemsParam = N - 1;
+ FuncAttrs.addAllocSizeAttr(AllocSize->getElemSizeParam() - 1,
+ NumElemsParam);
+ }
}
// OptimizeNoneAttr takes precedence over -Os or -Oz. No warning needed.
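The base-1 adjustment in the hunk above mirrors how the attribute is spelled in
source code: alloc_size parameter indices are 1-based, while LLVM's allocsize
attribute wants 0-based indices (with "not present" encoded separately). A short
sketch with hypothetical allocator declarations:

    #include <stddef.h>

    // alloc_size(1): parameter 1 is the allocation size in bytes.
    void *my_malloc(size_t bytes) __attribute__((alloc_size(1)));

    // alloc_size(1, 2): the allocation is count * elem_size; both indices are
    // 1-based in the source attribute, hence the "- 1" before calling
    // addAllocSizeAttr() above.
    void *my_calloc(size_t count, size_t elem_size) __attribute__((alloc_size(1, 2)));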
@@ -1722,6 +1735,16 @@ void CodeGenModule::ConstructAttributeList(
FuncAttrs.addAttribute("less-precise-fpmad",
llvm::toStringRef(CodeGenOpts.LessPreciseFPMAD));
+
+ if (!CodeGenOpts.FPDenormalMode.empty())
+ FuncAttrs.addAttribute("denormal-fp-math",
+ CodeGenOpts.FPDenormalMode);
+
+ FuncAttrs.addAttribute("no-trapping-math",
+ llvm::toStringRef(CodeGenOpts.NoTrappingMath));
+
+ // TODO: Are these all needed?
+ // unsafe/inf/nan/nsz are handled by instruction-level FastMathFlags.
FuncAttrs.addAttribute("no-infs-fp-math",
llvm::toStringRef(CodeGenOpts.NoInfsFPMath));
FuncAttrs.addAttribute("no-nans-fp-math",
@@ -1734,6 +1757,15 @@ void CodeGenModule::ConstructAttributeList(
llvm::utostr(CodeGenOpts.SSPBufferSize));
FuncAttrs.addAttribute("no-signed-zeros-fp-math",
llvm::toStringRef(CodeGenOpts.NoSignedZeros));
+ FuncAttrs.addAttribute(
+ "correctly-rounded-divide-sqrt-fp-math",
+ llvm::toStringRef(CodeGenOpts.CorrectlyRoundedDivSqrt));
+
+ // TODO: Reciprocal estimate codegen options should apply to instructions?
+ std::vector<std::string> &Recips = getTarget().getTargetOpts().Reciprocals;
+ if (!Recips.empty())
+ FuncAttrs.addAttribute("reciprocal-estimates",
+ llvm::join(Recips.begin(), Recips.end(), ","));
if (CodeGenOpts.StackRealignment)
FuncAttrs.addAttribute("stackrealign");
@@ -1794,6 +1826,9 @@ void CodeGenModule::ConstructAttributeList(
// them). LLVM will remove this attribute where it safely can.
FuncAttrs.addAttribute(llvm::Attribute::Convergent);
+ // Exceptions aren't supported in CUDA device code.
+ FuncAttrs.addAttribute(llvm::Attribute::NoUnwind);
+
// Respect -fcuda-flush-denormals-to-zero.
if (getLangOpts().CUDADeviceFlushDenormalsToZero)
FuncAttrs.addAttribute("nvptx-f32ftz", "true");
@@ -2299,13 +2334,6 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo &FI,
if (isPromoted)
V = emitArgumentDemotion(*this, Arg, V);
- if (const CXXMethodDecl *MD =
- dyn_cast_or_null<CXXMethodDecl>(CurCodeDecl)) {
- if (MD->isVirtual() && Arg == CXXABIThisDecl)
- V = CGM.getCXXABI().
- adjustThisParameterInVirtualFunctionPrologue(*this, CurGD, V);
- }
-
// Because of merging of function types from multiple decls it is
// possible for the type of an argument to not match the corresponding
// type in the function type. Since we are codegening the callee
@@ -2465,7 +2493,7 @@ static llvm::Value *tryEmitFusedAutoreleaseOfResult(CodeGenFunction &CGF,
// result is in a BasicBlock and is therefore an Instruction.
llvm::Instruction *generator = cast<llvm::Instruction>(result);
- SmallVector<llvm::Instruction*,4> insnsToKill;
+ SmallVector<llvm::Instruction *, 4> InstsToKill;
// Look for:
// %generator = bitcast %type1* %generator2 to %type2*
@@ -2478,7 +2506,7 @@ static llvm::Value *tryEmitFusedAutoreleaseOfResult(CodeGenFunction &CGF,
if (generator->getNextNode() != bitcast)
return nullptr;
- insnsToKill.push_back(bitcast);
+ InstsToKill.push_back(bitcast);
}
// Look for:
@@ -2511,27 +2539,26 @@ static llvm::Value *tryEmitFusedAutoreleaseOfResult(CodeGenFunction &CGF,
assert(isa<llvm::CallInst>(prev));
assert(cast<llvm::CallInst>(prev)->getCalledValue() ==
CGF.CGM.getObjCEntrypoints().retainAutoreleasedReturnValueMarker);
- insnsToKill.push_back(prev);
+ InstsToKill.push_back(prev);
}
} else {
return nullptr;
}
result = call->getArgOperand(0);
- insnsToKill.push_back(call);
+ InstsToKill.push_back(call);
// Keep killing bitcasts, for sanity. Note that we no longer care
// about precise ordering as long as there's exactly one use.
while (llvm::BitCastInst *bitcast = dyn_cast<llvm::BitCastInst>(result)) {
if (!bitcast->hasOneUse()) break;
- insnsToKill.push_back(bitcast);
+ InstsToKill.push_back(bitcast);
result = bitcast->getOperand(0);
}
// Delete all the unnecessary instructions, from latest to earliest.
- for (SmallVectorImpl<llvm::Instruction*>::iterator
- i = insnsToKill.begin(), e = insnsToKill.end(); i != e; ++i)
- (*i)->eraseFromParent();
+ for (auto *I : InstsToKill)
+ I->eraseFromParent();
// Do the fused retain/autorelease if we were asked to.
if (doRetainAutorelease)
@@ -2841,7 +2868,7 @@ void CodeGenFunction::EmitFunctionEpilog(const CGFunctionInfo &FI,
EmitCheckSourceLocation(RetNNAttr->getLocation()),
};
EmitCheck(std::make_pair(Cond, SanitizerKind::ReturnsNonnullAttribute),
- "nonnull_return", StaticData, None);
+ SanitizerHandler::NonnullReturn, StaticData, None);
}
}
Ret = Builder.CreateRet(RV);
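For reference, the NonnullReturn handler guards returns from functions declared
with the returns_nonnull attribute. A hedged sketch (hypothetical function; the
UBSan check is assumed to be enabled with something like
-fsanitize=returns-nonnull-attribute):

    // With the sanitizer enabled, the epilog emitted above checks at run time
    // that the returned pointer is not null and reports a failure otherwise.
    __attribute__((returns_nonnull)) const char *label(int i) {
      return i == 0 ? "zero" : "nonzero";
    }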
@@ -2863,13 +2890,13 @@ static AggValueSlot createPlaceholderSlot(CodeGenFunction &CGF,
// FIXME: Generate IR in one pass, rather than going back and fixing up these
// placeholders.
llvm::Type *IRTy = CGF.ConvertTypeForMem(Ty);
- llvm::Value *Placeholder =
- llvm::UndefValue::get(IRTy->getPointerTo()->getPointerTo());
- Placeholder = CGF.Builder.CreateDefaultAlignedLoad(Placeholder);
+ llvm::Type *IRPtrTy = IRTy->getPointerTo();
+ llvm::Value *Placeholder = llvm::UndefValue::get(IRPtrTy->getPointerTo());
// FIXME: When we generate this IR in one pass, we shouldn't need
// this win32-specific alignment hack.
CharUnits Align = CharUnits::fromQuantity(4);
+ Placeholder = CGF.Builder.CreateAlignedLoad(IRPtrTy, Placeholder, Align);
return AggValueSlot::forAddr(Address(Placeholder, Align),
Ty.getQualifiers(),
@@ -2891,22 +2918,36 @@ void CodeGenFunction::EmitDelegateCallArg(CallArgList &args,
assert(!isInAllocaArgument(CGM.getCXXABI(), type) &&
"cannot emit delegate call arguments for inalloca arguments!");
+ // GetAddrOfLocalVar returns a pointer-to-pointer for references,
+ // but the argument needs to be the original pointer.
+ if (type->isReferenceType()) {
+ args.add(RValue::get(Builder.CreateLoad(local)), type);
+
+ // In ARC, move out of consumed arguments so that the release cleanup
+ // entered by StartFunction doesn't cause an over-release. This isn't
+ // optimal -O0 code generation, but it should get cleaned up when
+ // optimization is enabled. This also assumes that delegate calls are
+ // performed exactly once for a set of arguments, but that should be safe.
+ } else if (getLangOpts().ObjCAutoRefCount &&
+ param->hasAttr<NSConsumedAttr>() &&
+ type->isObjCRetainableType()) {
+ llvm::Value *ptr = Builder.CreateLoad(local);
+ auto null =
+ llvm::ConstantPointerNull::get(cast<llvm::PointerType>(ptr->getType()));
+ Builder.CreateStore(null, local);
+ args.add(RValue::get(ptr), type);
+
// For the most part, we just need to load the alloca, except that
// aggregate r-values are actually pointers to temporaries.
- if (type->isReferenceType())
- args.add(RValue::get(Builder.CreateLoad(local)), type);
- else
+ } else {
args.add(convertTempToRValue(local, type, loc), type);
+ }
}
static bool isProvablyNull(llvm::Value *addr) {
return isa<llvm::ConstantPointerNull>(addr);
}
-static bool isProvablyNonNull(llvm::Value *addr) {
- return isa<llvm::AllocaInst>(addr);
-}
-
/// Emit the actual writing-back of a writeback.
static void emitWriteback(CodeGenFunction &CGF,
const CallArgList::Writeback &writeback) {
@@ -2919,7 +2960,7 @@ static void emitWriteback(CodeGenFunction &CGF,
// If the argument wasn't provably non-null, we need to null check
// before doing the store.
- bool provablyNonNull = isProvablyNonNull(srcAddr.getPointer());
+ bool provablyNonNull = llvm::isKnownNonNull(srcAddr.getPointer());
if (!provablyNonNull) {
llvm::BasicBlock *writebackBB = CGF.createBasicBlock("icr.writeback");
contBB = CGF.createBasicBlock("icr.done");
@@ -3059,7 +3100,7 @@ static void emitWritebackArg(CodeGenFunction &CGF, CallArgList &args,
// If the address is *not* known to be non-null, we need to switch.
llvm::Value *finalArgument;
- bool provablyNonNull = isProvablyNonNull(srcAddr.getPointer());
+ bool provablyNonNull = llvm::isKnownNonNull(srcAddr.getPointer());
if (provablyNonNull) {
finalArgument = temp.getPointer();
} else {
@@ -3130,7 +3171,7 @@ static void emitWritebackArg(CodeGenFunction &CGF, CallArgList &args,
}
void CallArgList::allocateArgumentMemory(CodeGenFunction &CGF) {
- assert(!StackBase && !StackCleanup.isValid());
+ assert(!StackBase);
// Save the stack.
llvm::Function *F = CGF.CGM.getIntrinsic(llvm::Intrinsic::stacksave);
@@ -3167,13 +3208,14 @@ void CodeGenFunction::EmitNonNullArgCheck(RValue RV, QualType ArgType,
llvm::ConstantInt::get(Int32Ty, ArgNo + 1),
};
EmitCheck(std::make_pair(Cond, SanitizerKind::NonnullAttribute),
- "nonnull_arg", StaticData, None);
+ SanitizerHandler::NonnullArg, StaticData, None);
}
void CodeGenFunction::EmitCallArgs(
CallArgList &Args, ArrayRef<QualType> ArgTypes,
llvm::iterator_range<CallExpr::const_arg_iterator> ArgRange,
- const FunctionDecl *CalleeDecl, unsigned ParamsToSkip) {
+ const FunctionDecl *CalleeDecl, unsigned ParamsToSkip,
+ EvaluationOrder Order) {
assert((int)ArgTypes.size() == (ArgRange.end() - ArgRange.begin()));
auto MaybeEmitImplicitObjectSize = [&](unsigned I, const Expr *Arg) {
@@ -3191,10 +3233,18 @@ void CodeGenFunction::EmitCallArgs(
};
// We *have* to evaluate arguments from right to left in the MS C++ ABI,
- // because arguments are destroyed left to right in the callee.
- if (CGM.getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee()) {
- // Insert a stack save if we're going to need any inalloca args.
- bool HasInAllocaArgs = false;
+ // because arguments are destroyed left to right in the callee. As a special
+ // case, there are certain language constructs that require left-to-right
+ // evaluation, and in those cases we consider the evaluation order requirement
+ // to trump the "destruction order is reverse construction order" guarantee.
+ bool LeftToRight =
+ CGM.getTarget().getCXXABI().areArgsDestroyedLeftToRightInCallee()
+ ? Order == EvaluationOrder::ForceLeftToRight
+ : Order != EvaluationOrder::ForceRightToLeft;
+
+ // Insert a stack save if we're going to need any inalloca args.
+ bool HasInAllocaArgs = false;
+ if (CGM.getTarget().getCXXABI().isMicrosoft()) {
for (ArrayRef<QualType>::iterator I = ArgTypes.begin(), E = ArgTypes.end();
I != E && !HasInAllocaArgs; ++I)
HasInAllocaArgs = isInAllocaArgument(CGM.getCXXABI(), *I);
@@ -3202,30 +3252,24 @@ void CodeGenFunction::EmitCallArgs(
assert(getTarget().getTriple().getArch() == llvm::Triple::x86);
Args.allocateArgumentMemory(*this);
}
+ }
- // Evaluate each argument.
- size_t CallArgsStart = Args.size();
- for (int I = ArgTypes.size() - 1; I >= 0; --I) {
- CallExpr::const_arg_iterator Arg = ArgRange.begin() + I;
- MaybeEmitImplicitObjectSize(I, *Arg);
- EmitCallArg(Args, *Arg, ArgTypes[I]);
- EmitNonNullArgCheck(Args.back().RV, ArgTypes[I], (*Arg)->getExprLoc(),
- CalleeDecl, ParamsToSkip + I);
- }
+ // Evaluate each argument in the appropriate order.
+ size_t CallArgsStart = Args.size();
+ for (unsigned I = 0, E = ArgTypes.size(); I != E; ++I) {
+ unsigned Idx = LeftToRight ? I : E - I - 1;
+ CallExpr::const_arg_iterator Arg = ArgRange.begin() + Idx;
+ if (!LeftToRight) MaybeEmitImplicitObjectSize(Idx, *Arg);
+ EmitCallArg(Args, *Arg, ArgTypes[Idx]);
+ EmitNonNullArgCheck(Args.back().RV, ArgTypes[Idx], (*Arg)->getExprLoc(),
+ CalleeDecl, ParamsToSkip + Idx);
+ if (LeftToRight) MaybeEmitImplicitObjectSize(Idx, *Arg);
+ }
+ if (!LeftToRight) {
// Un-reverse the arguments we just evaluated so they match up with the LLVM
// IR function.
std::reverse(Args.begin() + CallArgsStart, Args.end());
- return;
- }
-
- for (unsigned I = 0, E = ArgTypes.size(); I != E; ++I) {
- CallExpr::const_arg_iterator Arg = ArgRange.begin() + I;
- assert(Arg != ArgRange.end());
- EmitCallArg(Args, *Arg, ArgTypes[I]);
- EmitNonNullArgCheck(Args.back().RV, ArgTypes[I], (*Arg)->getExprLoc(),
- CalleeDecl, ParamsToSkip + I);
- MaybeEmitImplicitObjectSize(I, *Arg);
}
}
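The left-to-right/right-to-left switch introduced above is observable whenever
arguments have side effects. A minimal sketch (hypothetical functions) of the
kind of call the new EvaluationOrder parameter controls:

    #include <cstdio>

    static int trace(int id) { std::printf("evaluating arg %d\n", id); return id; }
    static void callee(int a, int b) { (void)a; (void)b; }

    int main() {
      // Under the Microsoft C++ ABI, arguments are normally evaluated right to
      // left so that the callee can destroy them left to right; constructs that
      // require left-to-right evaluation override that via EvaluationOrder.
      callee(trace(1), trace(2));
      return 0;
    }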
@@ -3267,7 +3311,7 @@ void CodeGenFunction::EmitCallArg(CallArgList &args, const Expr *E,
if (const ObjCIndirectCopyRestoreExpr *CRE
= dyn_cast<ObjCIndirectCopyRestoreExpr>(E)) {
assert(getLangOpts().ObjCAutoRefCount);
- assert(getContext().hasSameType(E->getType(), type));
+ assert(getContext().hasSameUnqualifiedType(E->getType(), type));
return emitWritebackArg(*this, args, CRE);
}
@@ -3505,21 +3549,22 @@ void CodeGenFunction::deferPlaceholderReplacement(llvm::Instruction *Old,
}
RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
- llvm::Value *Callee,
+ const CGCallee &Callee,
ReturnValueSlot ReturnValue,
const CallArgList &CallArgs,
- CGCalleeInfo CalleeInfo,
llvm::Instruction **callOrInvoke) {
// FIXME: We no longer need the types from CallArgs; lift up and simplify.
+ assert(Callee.isOrdinary());
+
// Handle struct-return functions by passing a pointer to the
// location that we would like to return into.
QualType RetTy = CallInfo.getReturnType();
const ABIArgInfo &RetAI = CallInfo.getReturnInfo();
- llvm::FunctionType *IRFuncTy =
- cast<llvm::FunctionType>(
- cast<llvm::PointerType>(Callee->getType())->getElementType());
+ llvm::FunctionType *IRFuncTy = Callee.getFunctionType();
+
+ // 1. Set up the arguments.
// If we're using inalloca, insert the allocation after the stack save.
// FIXME: Do this earlier rather than hacking it in here!
@@ -3579,6 +3624,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
Address swiftErrorTemp = Address::invalid();
Address swiftErrorArg = Address::invalid();
+ // Translate all of the arguments as necessary to match the IR lowering.
assert(CallInfo.arg_size() == CallArgs.size() &&
"Mismatch between function signature & arguments.");
unsigned ArgNo = 0;
@@ -3826,6 +3872,9 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
}
}
+ llvm::Value *CalleePtr = Callee.getFunctionPointer();
+
+ // If we're using inalloca, set up that argument.
if (ArgMemory.isValid()) {
llvm::Value *Arg = ArgMemory.getPointer();
if (CallInfo.isVariadic()) {
@@ -3833,10 +3882,9 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
// end up with a variadic prototype and an inalloca call site. In such
// cases, we can't do any parameter mismatch checks. Give up and bitcast
// the callee.
- unsigned CalleeAS =
- cast<llvm::PointerType>(Callee->getType())->getAddressSpace();
- Callee = Builder.CreateBitCast(
- Callee, getTypes().GetFunctionType(CallInfo)->getPointerTo(CalleeAS));
+ unsigned CalleeAS = CalleePtr->getType()->getPointerAddressSpace();
+ auto FnTy = getTypes().GetFunctionType(CallInfo)->getPointerTo(CalleeAS);
+ CalleePtr = Builder.CreateBitCast(CalleePtr, FnTy);
} else {
llvm::Type *LastParamTy =
IRFuncTy->getParamType(IRFuncTy->getNumParams() - 1);
@@ -3860,39 +3908,57 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
IRCallArgs[IRFunctionArgs.getInallocaArgNo()] = Arg;
}
- if (!CallArgs.getCleanupsToDeactivate().empty())
- deactivateArgCleanupsBeforeCall(*this, CallArgs);
+ // 2. Prepare the function pointer.
+
+ // If the callee is a bitcast of a non-variadic function to have a
+ // variadic function pointer type, check to see if we can remove the
+ // bitcast. This comes up with unprototyped functions.
+ //
+ // This makes the IR nicer, but more importantly it ensures that we
+ // can inline the function at -O0 if it is marked always_inline.
+ auto simplifyVariadicCallee = [](llvm::Value *Ptr) -> llvm::Value* {
+ llvm::FunctionType *CalleeFT =
+ cast<llvm::FunctionType>(Ptr->getType()->getPointerElementType());
+ if (!CalleeFT->isVarArg())
+ return Ptr;
+
+ llvm::ConstantExpr *CE = dyn_cast<llvm::ConstantExpr>(Ptr);
+ if (!CE || CE->getOpcode() != llvm::Instruction::BitCast)
+ return Ptr;
+
+ llvm::Function *OrigFn = dyn_cast<llvm::Function>(CE->getOperand(0));
+ if (!OrigFn)
+ return Ptr;
+
+ llvm::FunctionType *OrigFT = OrigFn->getFunctionType();
+
+ // If the original type is variadic, or if any of the component types
+ // disagree, we cannot remove the cast.
+ if (OrigFT->isVarArg() ||
+ OrigFT->getNumParams() != CalleeFT->getNumParams() ||
+ OrigFT->getReturnType() != CalleeFT->getReturnType())
+ return Ptr;
+
+ for (unsigned i = 0, e = OrigFT->getNumParams(); i != e; ++i)
+ if (OrigFT->getParamType(i) != CalleeFT->getParamType(i))
+ return Ptr;
+
+ return OrigFn;
+ };
+ CalleePtr = simplifyVariadicCallee(CalleePtr);
- // If the callee is a bitcast of a function to a varargs pointer to function
- // type, check to see if we can remove the bitcast. This handles some cases
- // with unprototyped functions.
- if (llvm::ConstantExpr *CE = dyn_cast<llvm::ConstantExpr>(Callee))
- if (llvm::Function *CalleeF = dyn_cast<llvm::Function>(CE->getOperand(0))) {
- llvm::PointerType *CurPT=cast<llvm::PointerType>(Callee->getType());
- llvm::FunctionType *CurFT =
- cast<llvm::FunctionType>(CurPT->getElementType());
- llvm::FunctionType *ActualFT = CalleeF->getFunctionType();
-
- if (CE->getOpcode() == llvm::Instruction::BitCast &&
- ActualFT->getReturnType() == CurFT->getReturnType() &&
- ActualFT->getNumParams() == CurFT->getNumParams() &&
- ActualFT->getNumParams() == IRCallArgs.size() &&
- (CurFT->isVarArg() || !ActualFT->isVarArg())) {
- bool ArgsMatch = true;
- for (unsigned i = 0, e = ActualFT->getNumParams(); i != e; ++i)
- if (ActualFT->getParamType(i) != CurFT->getParamType(i)) {
- ArgsMatch = false;
- break;
- }
+ // 3. Perform the actual call.
- // Strip the cast if we can get away with it. This is a nice cleanup,
- // but also allows us to inline the function at -O0 if it is marked
- // always_inline.
- if (ArgsMatch)
- Callee = CalleeF;
- }
- }
+ // Deactivate any cleanups that we're supposed to do immediately before
+ // the call.
+ if (!CallArgs.getCleanupsToDeactivate().empty())
+ deactivateArgCleanupsBeforeCall(*this, CallArgs);
+ // Assert that the arguments we computed match up. The IR verifier
+ // will catch this, but this is a common enough source of problems
+ // during IRGen changes that it's way better for debugging to catch
+ // it ourselves here.
+#ifndef NDEBUG
assert(IRCallArgs.size() == IRFuncTy->getNumParams() || IRFuncTy->isVarArg());
for (unsigned i = 0; i < IRCallArgs.size(); ++i) {
// Inalloca argument can have different type.
@@ -3902,75 +3968,106 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
if (i < IRFuncTy->getNumParams())
assert(IRCallArgs[i]->getType() == IRFuncTy->getParamType(i));
}
+#endif
+ // Compute the calling convention and attributes.
unsigned CallingConv;
CodeGen::AttributeListType AttributeList;
- CGM.ConstructAttributeList(Callee->getName(), CallInfo, CalleeInfo,
+ CGM.ConstructAttributeList(CalleePtr->getName(), CallInfo,
+ Callee.getAbstractInfo(),
AttributeList, CallingConv,
/*AttrOnCallSite=*/true);
llvm::AttributeSet Attrs = llvm::AttributeSet::get(getLLVMContext(),
AttributeList);
+ // Apply some call-site-specific attributes.
+ // TODO: work this into building the attribute set.
+
+ // Apply always_inline to all calls within flatten functions.
+ // FIXME: should this really take priority over __try, below?
+ if (CurCodeDecl && CurCodeDecl->hasAttr<FlattenAttr>() &&
+ !(Callee.getAbstractInfo().getCalleeDecl() &&
+ Callee.getAbstractInfo().getCalleeDecl()->hasAttr<NoInlineAttr>())) {
+ Attrs =
+ Attrs.addAttribute(getLLVMContext(),
+ llvm::AttributeSet::FunctionIndex,
+ llvm::Attribute::AlwaysInline);
+ }
+
+ // Disable inlining inside SEH __try blocks.
+ if (isSEHTryScope()) {
+ Attrs =
+ Attrs.addAttribute(getLLVMContext(), llvm::AttributeSet::FunctionIndex,
+ llvm::Attribute::NoInline);
+ }
+
+ // Decide whether to use a call or an invoke.
bool CannotThrow;
if (currentFunctionUsesSEHTry()) {
- // SEH cares about asynchronous exceptions, everything can "throw."
+ // SEH cares about asynchronous exceptions, so everything can "throw."
CannotThrow = false;
} else if (isCleanupPadScope() &&
EHPersonality::get(*this).isMSVCXXPersonality()) {
// The MSVC++ personality will implicitly terminate the program if an
- // exception is thrown. An unwind edge cannot be reached.
+ // exception is thrown during a cleanup outside of a try/catch.
+ // We don't need to model anything in IR to get this behavior.
CannotThrow = true;
} else {
- // Otherwise, nowunind callsites will never throw.
+ // Otherwise, nounwind call sites will never throw.
CannotThrow = Attrs.hasAttribute(llvm::AttributeSet::FunctionIndex,
llvm::Attribute::NoUnwind);
}
llvm::BasicBlock *InvokeDest = CannotThrow ? nullptr : getInvokeDest();
SmallVector<llvm::OperandBundleDef, 1> BundleList;
- getBundlesForFunclet(Callee, CurrentFuncletPad, BundleList);
+ getBundlesForFunclet(CalleePtr, CurrentFuncletPad, BundleList);
+ // Emit the actual call/invoke instruction.
llvm::CallSite CS;
if (!InvokeDest) {
- CS = Builder.CreateCall(Callee, IRCallArgs, BundleList);
+ CS = Builder.CreateCall(CalleePtr, IRCallArgs, BundleList);
} else {
llvm::BasicBlock *Cont = createBasicBlock("invoke.cont");
- CS = Builder.CreateInvoke(Callee, Cont, InvokeDest, IRCallArgs,
+ CS = Builder.CreateInvoke(CalleePtr, Cont, InvokeDest, IRCallArgs,
BundleList);
EmitBlock(Cont);
}
+ llvm::Instruction *CI = CS.getInstruction();
if (callOrInvoke)
- *callOrInvoke = CS.getInstruction();
-
- if (CurCodeDecl && CurCodeDecl->hasAttr<FlattenAttr>() &&
- !CS.hasFnAttr(llvm::Attribute::NoInline))
- Attrs =
- Attrs.addAttribute(getLLVMContext(), llvm::AttributeSet::FunctionIndex,
- llvm::Attribute::AlwaysInline);
-
- // Disable inlining inside SEH __try blocks.
- if (isSEHTryScope())
- Attrs =
- Attrs.addAttribute(getLLVMContext(), llvm::AttributeSet::FunctionIndex,
- llvm::Attribute::NoInline);
+ *callOrInvoke = CI;
+ // Apply the attributes and calling convention.
CS.setAttributes(Attrs);
CS.setCallingConv(static_cast<llvm::CallingConv::ID>(CallingConv));
+ // Apply various metadata.
+
+ if (!CI->getType()->isVoidTy())
+ CI->setName("call");
+
// Insert instrumentation or attach profile metadata at indirect call sites.
// For more details, see the comment before the definition of
// IPVK_IndirectCallTarget in InstrProfData.inc.
if (!CS.getCalledFunction())
PGO.valueProfile(Builder, llvm::IPVK_IndirectCallTarget,
- CS.getInstruction(), Callee);
+ CI, CalleePtr);
// In ObjC ARC mode with no ObjC ARC exception safety, tell the ARC
// optimizer it can aggressively ignore unwind edges.
if (CGM.getLangOpts().ObjCAutoRefCount)
- AddObjCARCExceptionMetadata(CS.getInstruction());
+ AddObjCARCExceptionMetadata(CI);
+
+ // Suppress tail calls if requested.
+ if (llvm::CallInst *Call = dyn_cast<llvm::CallInst>(CI)) {
+ const Decl *TargetDecl = Callee.getAbstractInfo().getCalleeDecl();
+ if (TargetDecl && TargetDecl->hasAttr<NotTailCalledAttr>())
+ Call->setTailCallKind(llvm::CallInst::TCK_NoTail);
+ }
+
+ // 4. Finish the call.
// If the call doesn't return, finish the basic block and clear the
- // insertion point; this allows the rest of IRgen to discard
+ // insertion point; this allows the rest of IRGen to discard
// unreachable code.
if (CS.doesNotReturn()) {
if (UnusedReturnSize)
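Two of the call-site adjustments made in the hunk above are driven by
source-level attributes. A hedged sketch with made-up names; not code from this
commit:

    int helper(int x);

    // Inside a function marked flatten, each call site gets always_inline
    // (unless the callee itself is noinline), matching the Attrs logic above.
    __attribute__((flatten)) int flattened(int x) { return helper(x) + 1; }

    // not_tail_called makes codegen set TCK_NoTail on the call instruction,
    // so the optimizer may never turn it into a tail call.
    __attribute__((not_tail_called)) int no_tail(int x);
    int wrapper(int x) { return no_tail(x); }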
@@ -3989,18 +4086,14 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
return GetUndefRValue(RetTy);
}
- llvm::Instruction *CI = CS.getInstruction();
- if (!CI->getType()->isVoidTy())
- CI->setName("call");
-
// Perform the swifterror writeback.
if (swiftErrorTemp.isValid()) {
llvm::Value *errorResult = Builder.CreateLoad(swiftErrorTemp);
Builder.CreateStore(errorResult, swiftErrorArg);
}
- // Emit any writebacks immediately. Arguably this should happen
- // after any return-value munging.
+ // Emit any call-associated writebacks immediately. Arguably this
+ // should happen after any return-value munging.
if (CallArgs.hasWritebacks())
emitWritebacks(*this, CallArgs);
@@ -4008,12 +4101,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
// lexical order, so deactivate it and run it manually here.
CallArgs.freeArgumentMemory(*this);
- if (llvm::CallInst *Call = dyn_cast<llvm::CallInst>(CI)) {
- const Decl *TargetDecl = CalleeInfo.getCalleeDecl();
- if (TargetDecl && TargetDecl->hasAttr<NotTailCalledAttr>())
- Call->setTailCallKind(llvm::CallInst::TCK_NoTail);
- }
-
+ // Extract the return value.
RValue Ret = [&] {
switch (RetAI.getKind()) {
case ABIArgInfo::CoerceAndExpand: {
@@ -4110,8 +4198,8 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
llvm_unreachable("Unhandled ABIArgInfo::Kind");
} ();
- const Decl *TargetDecl = CalleeInfo.getCalleeDecl();
-
+ // Emit the assume_aligned check on the return value.
+ const Decl *TargetDecl = Callee.getAbstractInfo().getCalleeDecl();
if (Ret.isScalar() && TargetDecl) {
if (const auto *AA = TargetDecl->getAttr<AssumeAlignedAttr>()) {
llvm::Value *OffsetValue = nullptr;
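The AssumeAlignedAttr handling that the diff ends on corresponds to the
assume_aligned attribute on a function's return value. A hedged sketch with a
hypothetical allocator:

    #include <stddef.h>

    // assume_aligned(64): callers and the optimizer may assume the returned
    // pointer is 64-byte aligned; the code above emits that assumption (plus
    // an optional offset) right after the call.
    __attribute__((assume_aligned(64))) void *alloc_block(size_t n);

    size_t remainder_of_alignment(void) {
      return reinterpret_cast<size_t>(alloc_block(128)) % 64;  // foldable to 0
    }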