author | dim <dim@FreeBSD.org> | 2017-04-02 17:24:58 +0000 |
---|---|---|
committer | dim <dim@FreeBSD.org> | 2017-04-02 17:24:58 +0000 |
commit | 60b571e49a90d38697b3aca23020d9da42fc7d7f (patch) | |
tree | 99351324c24d6cb146b6285b6caffa4d26fce188 /contrib/llvm/lib/Target/SystemZ | |
parent | bea1b22c7a9bce1dfdd73e6e5b65bc4752215180 (diff) | |
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release:
MFC r309142 (by emaste):
Add WITH_LLD_AS_LD build knob
If set it installs LLD as /usr/bin/ld. LLD (as of version 3.9) is not
capable of linking the world and kernel, but can self-host and link many
substantial applications. GNU ld continues to be used for the world and
kernel build, regardless of how this knob is set.
It is on by default for arm64, and off for all other CPU architectures.
Sponsored by: The FreeBSD Foundation
MFC r310840:
Reapply r310775; it now also builds correctly if lldb is disabled:
Move llvm-objdump from CLANG_EXTRAS to installed by default
We currently install three tools from binutils 2.17.50: as, ld, and
objdump. Work is underway to migrate to a permissively-licensed
tool-chain, with one goal being the retirement of binutils 2.17.50.
LLVM's llvm-objdump is intended to be compatible with GNU objdump
although it is currently missing some options and may have formatting
differences. Enable it by default for testing and further investigation.
It may later be changed to install as /usr/bin/objdump, once it becomes a
fully viable replacement.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D8879
MFC r312855 (by emaste):
Rename LLD_AS_LD to LLD_IS_LD, for consistency with CLANG_IS_CC
Reported by: Dan McGregor <dan.mcgregor usask.ca>
MFC r313559 | glebius | 2017-02-10 18:34:48 +0100 (Fri, 10 Feb 2017) | 5 lines
Don't check struct rtentry on FreeBSD; it is an internal kernel structure.
On other systems it may be an API structure for SIOCADDRT/SIOCDELRT.
Reviewed by: emaste, dim
MFC r314152 (by jkim):
Remove an assembler flag, which has been redundant since r309124. Upstream
took care of it by introducing the NO_EXEC_STACK_DIRECTIVE macro.
http://llvm.org/viewvc/llvm-project?rev=273500&view=rev
Reviewed by: dim
MFC r314564:
Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to
4.0.0 (branches/release_40 296509). The release will follow soon.
Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
support to build; see UPDATING for more information.
Also note that as of 4.0.0, lld should be able to link the base system
on amd64 and aarch64. See the WITH_LLD_IS_LD setting in src.conf(5).
Though please be aware that this is work in progress.
Release notes for llvm, clang and lld will be available here:
<http://releases.llvm.org/4.0.0/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/clang/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/lld/docs/ReleaseNotes.html>
Thanks to Ed Maste, Jan Beich, Antoine Brodin and Eric Fiselier for
their help.
Relnotes: yes
Exp-run: antoine
PR: 215969, 216008
MFC r314708:
For now, revert r287232 from upstream llvm trunk (by Daniil Fukalov):
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity recurses too deeply (50+ levels on quite a big
unrolled loop) and runs for an almost unbounded time.
Added a cache of "equal" SCEV pairs to cut off further estimation earlier.
A recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
This commit is the cause of excessive compile times on skein_block.c
(and possibly other files) during kernel builds on amd64.
We never saw the problematic behavior described in this upstream commit,
so for now it is better to revert it. An upstream bug has been filed
here: https://bugs.llvm.org/show_bug.cgi?id=32142
Reported by: mjg
MFC r314795:
Reapply r287232 from upstream llvm trunk (by Daniil Fukalov):
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity recurses too deeply (50+ levels on quite a big
unrolled loop) and runs for an almost unbounded time.
Added a cache of "equal" SCEV pairs to cut off further estimation earlier.
A recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
Pull in r296992 from upstream llvm trunk (by Sanjoy Das):
[SCEV] Decrease the recursion threshold for CompareValueComplexity
Fixes PR32142.
r287232 accidentally increased the recursion threshold for
CompareValueComplexity from 2 to 32. This change reverses that
change by introducing a separate flag for CompareValueComplexity's
threshold.
The latter revision fixes the excessive compile times for skein_block.c.
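The depth limit and the cache of "equal" pairs described above can be
illustrated with a minimal sketch. This is not LLVM's SCEV code: Expr,
EqCache, compareComplexity and MaxDepth are hypothetical names chosen for
the example, and the ordering rules are simplified assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical expression node; a stand-in for illustration only.
struct Expr {
  int Kind;                      // node type, used as the first sort key
  std::vector<const Expr *> Ops; // operand sub-expressions
};

// Pairs that have already been proven equal by a full comparison.
using EqCache = std::set<std::pair<const Expr *, const Expr *>>;

// Returns a negative value, zero or a positive value. Once Depth exceeds
// MaxDepth the comparison gives up and treats the pair as equal, which
// bounds the amount of work done on very deep expression trees.
int compareComplexity(EqCache &Cache, const Expr *L, const Expr *R,
                      unsigned Depth, unsigned MaxDepth) {
  if (L == R)
    return 0;
  if (Depth > MaxDepth || Cache.count(std::make_pair(L, R)))
    return 0;
  if (L->Kind != R->Kind)
    return L->Kind < R->Kind ? -1 : 1;
  if (L->Ops.size() != R->Ops.size())
    return L->Ops.size() < R->Ops.size() ? -1 : 1;
  for (std::size_t I = 0; I < L->Ops.size(); ++I) {
    int C = compareComplexity(Cache, L->Ops[I], R->Ops[I], Depth + 1, MaxDepth);
    if (C != 0)
      return C;
  }
  // Remember the result so a repeated query is answered without recursion.
  Cache.insert(std::make_pair(L, R));
  return 0;
}
```

A small MaxDepth makes the comparison cheap but coarse; a large one makes it
precise but potentially slow, which is the trade-off the two upstream
revisions adjust.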
MFC r314907 | mmel | 2017-03-08 12:40:27 +0100 (Wed, 08 Mar 2017) | 7 lines
Unbreak ARMv6 world.
The new compiler-rt library imported with clang 4.0.0 has several fatal
issues (a non-functional __udivsi3, for example) with ARM-specific intrinsic
functions. As a temporary workaround, until upstream solves these problems,
disable all thumb[1][2]-related features.
MFC r315016:
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release.
We were already very close to the last release candidate, so this is a
pretty minor update.
Relnotes: yes
MFC r316005:
Revert r314907, and pull in r298713 from upstream compiler-rt trunk (by
Weiming Zhao):
builtins: Select correct code fragments when compiling for Thumb1/Thumb2/ARM ISA.
Summary:
The value of __ARM_ARCH_ISA_THUMB isn't based on the actual compilation
mode (-mthumb, -marm); it reflects the capability of the given CPU.
Due to this:
- use __thumb__ and __thumb2__ instead of __ARM_ARCH_ISA_THUMB
- use '.thumb' directive consistently in all affected files
- decorate all thumb functions using
DEFINE_COMPILERRT_THUMB_FUNCTION()
---------
Note: This patch doesn't fix the broken Thumb1 variant of __udivsi3!
Reviewers: weimingz, rengolin, compnerd
Subscribers: aemerson, dim
Differential Revision: https://reviews.llvm.org/D30938
Discussed with: mmel
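The distinction drawn in r316005 between the compilation mode and the CPU
capability can be sketched with the compiler's predefined macros. The helper
name compilationIsa is hypothetical and the returned strings are only labels
for the example; the macros themselves (__thumb__, __thumb2__, __arm__) are
the ones named above or commonly predefined by GCC and Clang.

```cpp
#include <cassert>
#include <cstring>

// Reports the instruction set this translation unit is being compiled for.
// __thumb__ and __thumb2__ follow the selected mode (-mthumb or -marm),
// whereas __ARM_ARCH_ISA_THUMB only says what the target CPU supports.
inline const char *compilationIsa() {
#if defined(__thumb2__)
  return "thumb2";  // Thumb mode on a Thumb-2 capable target
#elif defined(__thumb__)
  return "thumb1";  // Thumb mode without Thumb-2
#elif defined(__arm__)
  return "arm";     // ARM mode, even if the CPU could also run Thumb code
#else
  return "non-arm"; // not an ARM target at all
#endif
}
```

Building the same file with -marm and with -mthumb for one CPU should change
the result of this helper, while __ARM_ARCH_ISA_THUMB stays the same.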
Diffstat (limited to 'contrib/llvm/lib/Target/SystemZ')
50 files changed, 7976 insertions, 1498 deletions
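The .insn support in the SystemZAsmParser.cpp diff below finds a format entry
with std::equal_range over a table sorted by format name. A reduced sketch of
that lookup follows, assuming std::string in place of LLVM's StringRef;
FormatEntry, CompareFormat, FormatTable and lookupFormat are hypothetical
names, and only a few of the formats are reproduced.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>

// Reduced table entry: format name and number of operands that follow it.
struct FormatEntry {
  std::string Format;
  int NumOperands;
};

// Heterogeneous comparator, so equal_range can compare entries with names.
struct CompareFormat {
  bool operator()(const FormatEntry &L, const std::string &R) const {
    return L.Format < R;
  }
  bool operator()(const std::string &L, const FormatEntry &R) const {
    return L < R.Format;
  }
  bool operator()(const FormatEntry &L, const FormatEntry &R) const {
    return L.Format < R.Format;
  }
};

// Must stay sorted by Format, otherwise the binary search is invalid.
static const FormatEntry FormatTable[] = {
    {"e", 1}, {"ri", 3}, {"rie", 4}, {"ril", 3},
    {"rr", 3}, {"rre", 3}, {"ss", 4},
};

// Returns the matching entry, or nullptr when the format is unknown.
const FormatEntry *lookupFormat(const std::string &Format) {
  auto Range = std::equal_range(std::begin(FormatTable), std::end(FormatTable),
                                Format, CompareFormat());
  return Range.first == Range.second ? nullptr : Range.first;
}
```

An empty range (first == second) is how the parser in the diff detects an
unrecognized format, so no separate "not found" sentinel is needed in the
table.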
diff --git a/contrib/llvm/lib/Target/SystemZ/AsmParser/SystemZAsmParser.cpp b/contrib/llvm/lib/Target/SystemZ/AsmParser/SystemZAsmParser.cpp
index 3923614..a94717c 100644
--- a/contrib/llvm/lib/Target/SystemZ/AsmParser/SystemZAsmParser.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/AsmParser/SystemZAsmParser.cpp
@@ -12,6 +12,7 @@
 #include "llvm/MC/MCContext.h"
 #include "llvm/MC/MCExpr.h"
 #include "llvm/MC/MCInst.h"
+#include "llvm/MC/MCInstBuilder.h"
 #include "llvm/MC/MCParser/MCParsedAsmOperand.h"
 #include "llvm/MC/MCParser/MCTargetAsmParser.h"
 #include "llvm/MC/MCStreamer.h"
@@ -42,13 +43,15 @@ enum RegisterKind {
   FP128Reg,
   VR32Reg,
   VR64Reg,
-  VR128Reg
+  VR128Reg,
+  AR32Reg,
 };
 
 enum MemoryKind {
   BDMem,
   BDXMem,
   BDLMem,
+  BDRMem,
   BDVMem
 };
 
@@ -59,7 +62,6 @@ private:
     KindInvalid,
     KindToken,
     KindReg,
-    KindAccessReg,
     KindImm,
     KindImmTLS,
     KindMem
@@ -98,7 +100,10 @@ private:
     unsigned MemKind : 4;
     unsigned RegKind : 4;
     const MCExpr *Disp;
-    const MCExpr *Length;
+    union {
+      const MCExpr *Imm;
+      unsigned Reg;
+    } Length;
   };
 
   // Imm is an immediate operand, and Sym is an optional TLS symbol
@@ -111,7 +116,6 @@ private:
   union {
     TokenOp Token;
     RegOp Reg;
-    unsigned AccessReg;
     const MCExpr *Imm;
     ImmTLSOp ImmTLS;
     MemOp Mem;
@@ -150,12 +154,6 @@ public:
     return Op;
   }
   static std::unique_ptr<SystemZOperand>
-  createAccessReg(unsigned Num, SMLoc StartLoc, SMLoc EndLoc) {
-    auto Op = make_unique<SystemZOperand>(KindAccessReg, StartLoc, EndLoc);
-    Op->AccessReg = Num;
-    return Op;
-  }
-  static std::unique_ptr<SystemZOperand>
   createImm(const MCExpr *Expr, SMLoc StartLoc, SMLoc EndLoc) {
     auto Op = make_unique<SystemZOperand>(KindImm, StartLoc, EndLoc);
     Op->Imm = Expr;
@@ -163,15 +161,18 @@ public:
   }
   static std::unique_ptr<SystemZOperand>
   createMem(MemoryKind MemKind, RegisterKind RegKind, unsigned Base,
-            const MCExpr *Disp, unsigned Index, const MCExpr *Length,
-            SMLoc StartLoc, SMLoc EndLoc) {
+            const MCExpr *Disp, unsigned Index, const MCExpr *LengthImm,
+            unsigned LengthReg, SMLoc StartLoc, SMLoc EndLoc) {
     auto Op = make_unique<SystemZOperand>(KindMem, StartLoc, EndLoc);
     Op->Mem.MemKind = MemKind;
     Op->Mem.RegKind = RegKind;
     Op->Mem.Base = Base;
     Op->Mem.Index = Index;
     Op->Mem.Disp = Disp;
-    Op->Mem.Length = Length;
+    if (MemKind == BDLMem)
+      Op->Mem.Length.Imm = LengthImm;
+    if (MemKind == BDRMem)
+      Op->Mem.Length.Reg = LengthReg;
     return Op;
   }
   static std::unique_ptr<SystemZOperand>
@@ -204,12 +205,6 @@ public:
     return Reg.Num;
   }
 
-  // Access register operands. Access registers aren't exposed to LLVM
-  // as registers.
-  bool isAccessReg() const {
-    return Kind == KindAccessReg;
-  }
-
   // Immediate operands.
   bool isImm() const override {
     return Kind == KindImm;
@@ -248,14 +243,7 @@ public:
     return isMem(MemKind, RegKind) && inRange(Mem.Disp, -524288, 524287);
   }
   bool isMemDisp12Len8(RegisterKind RegKind) const {
-    return isMemDisp12(BDLMem, RegKind) && inRange(Mem.Length, 1, 0x100);
-  }
-  void addBDVAddrOperands(MCInst &Inst, unsigned N) const {
-    assert(N == 3 && "Invalid number of operands");
-    assert(isMem(BDVMem) && "Invalid operand type");
-    Inst.addOperand(MCOperand::createReg(Mem.Base));
-    addExpr(Inst, Mem.Disp);
-    Inst.addOperand(MCOperand::createReg(Mem.Index));
+    return isMemDisp12(BDLMem, RegKind) && inRange(Mem.Length.Imm, 1, 0x100);
   }
 
   // Override MCParsedAsmOperand.
@@ -269,11 +257,6 @@ public: assert(N == 1 && "Invalid number of operands"); Inst.addOperand(MCOperand::createReg(getReg())); } - void addAccessRegOperands(MCInst &Inst, unsigned N) const { - assert(N == 1 && "Invalid number of operands"); - assert(Kind == KindAccessReg && "Invalid operand type"); - Inst.addOperand(MCOperand::createImm(AccessReg)); - } void addImmOperands(MCInst &Inst, unsigned N) const { assert(N == 1 && "Invalid number of operands"); addExpr(Inst, getImm()); @@ -296,7 +279,21 @@ public: assert(isMem(BDLMem) && "Invalid operand type"); Inst.addOperand(MCOperand::createReg(Mem.Base)); addExpr(Inst, Mem.Disp); - addExpr(Inst, Mem.Length); + addExpr(Inst, Mem.Length.Imm); + } + void addBDRAddrOperands(MCInst &Inst, unsigned N) const { + assert(N == 3 && "Invalid number of operands"); + assert(isMem(BDRMem) && "Invalid operand type"); + Inst.addOperand(MCOperand::createReg(Mem.Base)); + addExpr(Inst, Mem.Disp); + Inst.addOperand(MCOperand::createReg(Mem.Length.Reg)); + } + void addBDVAddrOperands(MCInst &Inst, unsigned N) const { + assert(N == 3 && "Invalid number of operands"); + assert(isMem(BDVMem) && "Invalid operand type"); + Inst.addOperand(MCOperand::createReg(Mem.Base)); + addExpr(Inst, Mem.Disp); + Inst.addOperand(MCOperand::createReg(Mem.Index)); } void addImmTLSOperands(MCInst &Inst, unsigned N) const { assert(N == 2 && "Invalid number of operands"); @@ -322,6 +319,8 @@ public: bool isVR64() const { return isReg(VR64Reg); } bool isVF128() const { return false; } bool isVR128() const { return isReg(VR128Reg); } + bool isAR32() const { return isReg(AR32Reg); } + bool isAnyReg() const { return (isReg() || isImm(0, 15)); } bool isBDAddr32Disp12() const { return isMemDisp12(BDMem, ADDR32Reg); } bool isBDAddr32Disp20() const { return isMemDisp20(BDMem, ADDR32Reg); } bool isBDAddr64Disp12() const { return isMemDisp12(BDMem, ADDR64Reg); } @@ -329,6 +328,7 @@ public: bool isBDXAddr64Disp12() const { return isMemDisp12(BDXMem, ADDR64Reg); } bool 
isBDXAddr64Disp20() const { return isMemDisp20(BDXMem, ADDR64Reg); } bool isBDLAddr64Disp12Len8() const { return isMemDisp12Len8(ADDR64Reg); } + bool isBDRAddr64Disp12() const { return isMemDisp12(BDRMem, ADDR64Reg); } bool isBDVAddr64Disp12() const { return isMemDisp12(BDVMem, ADDR64Reg); } bool isU1Imm() const { return isImm(0, 1); } bool isU2Imm() const { return isImm(0, 3); } @@ -342,6 +342,7 @@ public: bool isS16Imm() const { return isImm(-32768, 32767); } bool isU32Imm() const { return isImm(0, (1LL << 32) - 1); } bool isS32Imm() const { return isImm(-(1LL << 31), (1LL << 31) - 1); } + bool isU48Imm() const { return isImm(0, (1LL << 48) - 1); } }; class SystemZAsmParser : public MCTargetAsmParser { @@ -354,7 +355,7 @@ private: RegGR, RegFP, RegV, - RegAccess + RegAR }; struct Register { RegisterGroup Group; @@ -371,9 +372,14 @@ private: RegisterGroup Group, const unsigned *Regs, RegisterKind Kind); - bool parseAddress(unsigned &Base, const MCExpr *&Disp, - unsigned &Index, bool &IsVector, const MCExpr *&Length, - const unsigned *Regs, RegisterKind RegKind); + OperandMatchResultTy parseAnyRegister(OperandVector &Operands); + + bool parseAddress(bool &HaveReg1, Register &Reg1, + bool &HaveReg2, Register &Reg2, + const MCExpr *&Disp, const MCExpr *&Length); + bool parseAddressRegister(Register &Reg); + + bool ParseDirectiveInsn(SMLoc L); OperandMatchResultTy parseAddress(OperandVector &Operands, MemoryKind MemKind, const unsigned *Regs, @@ -454,6 +460,12 @@ public: OperandMatchResultTy parseVR128(OperandVector &Operands) { return parseRegister(Operands, RegV, SystemZMC::VR128Regs, VR128Reg); } + OperandMatchResultTy parseAR32(OperandVector &Operands) { + return parseRegister(Operands, RegAR, SystemZMC::AR32Regs, AR32Reg); + } + OperandMatchResultTy parseAnyReg(OperandVector &Operands) { + return parseAnyRegister(Operands); + } OperandMatchResultTy parseBDAddr32(OperandVector &Operands) { return parseAddress(Operands, BDMem, SystemZMC::GR32Regs, ADDR32Reg); } @@ 
-466,13 +478,21 @@ public: OperandMatchResultTy parseBDLAddr64(OperandVector &Operands) { return parseAddress(Operands, BDLMem, SystemZMC::GR64Regs, ADDR64Reg); } + OperandMatchResultTy parseBDRAddr64(OperandVector &Operands) { + return parseAddress(Operands, BDRMem, SystemZMC::GR64Regs, ADDR64Reg); + } OperandMatchResultTy parseBDVAddr64(OperandVector &Operands) { return parseAddress(Operands, BDVMem, SystemZMC::GR64Regs, ADDR64Reg); } - OperandMatchResultTy parseAccessReg(OperandVector &Operands); + OperandMatchResultTy parsePCRel12(OperandVector &Operands) { + return parsePCRel(Operands, -(1LL << 12), (1LL << 12) - 1, false); + } OperandMatchResultTy parsePCRel16(OperandVector &Operands) { return parsePCRel(Operands, -(1LL << 16), (1LL << 16) - 1, false); } + OperandMatchResultTy parsePCRel24(OperandVector &Operands) { + return parsePCRel(Operands, -(1LL << 24), (1LL << 24) - 1, false); + } OperandMatchResultTy parsePCRel32(OperandVector &Operands) { return parsePCRel(Operands, -(1LL << 32), (1LL << 32) - 1, false); } @@ -490,6 +510,83 @@ public: #define GET_MATCHER_IMPLEMENTATION #include "SystemZGenAsmMatcher.inc" +// Used for the .insn directives; contains information needed to parse the +// operands in the directive. +struct InsnMatchEntry { + StringRef Format; + uint64_t Opcode; + int32_t NumOperands; + MatchClassKind OperandKinds[5]; +}; + +// For equal_range comparison. +struct CompareInsn { + bool operator() (const InsnMatchEntry &LHS, StringRef RHS) { + return LHS.Format < RHS; + } + bool operator() (StringRef LHS, const InsnMatchEntry &RHS) { + return LHS < RHS.Format; + } + bool operator() (const InsnMatchEntry &LHS, const InsnMatchEntry &RHS) { + return LHS.Format < RHS.Format; + } +}; + +// Table initializing information for parsing the .insn directive. 
+static struct InsnMatchEntry InsnMatchTable[] = { + /* Format, Opcode, NumOperands, OperandKinds */ + { "e", SystemZ::InsnE, 1, + { MCK_U16Imm } }, + { "ri", SystemZ::InsnRI, 3, + { MCK_U32Imm, MCK_AnyReg, MCK_S16Imm } }, + { "rie", SystemZ::InsnRIE, 4, + { MCK_U48Imm, MCK_AnyReg, MCK_AnyReg, MCK_PCRel16 } }, + { "ril", SystemZ::InsnRIL, 3, + { MCK_U48Imm, MCK_AnyReg, MCK_PCRel32 } }, + { "rilu", SystemZ::InsnRILU, 3, + { MCK_U48Imm, MCK_AnyReg, MCK_U32Imm } }, + { "ris", SystemZ::InsnRIS, 5, + { MCK_U48Imm, MCK_AnyReg, MCK_S8Imm, MCK_U4Imm, MCK_BDAddr64Disp12 } }, + { "rr", SystemZ::InsnRR, 3, + { MCK_U16Imm, MCK_AnyReg, MCK_AnyReg } }, + { "rre", SystemZ::InsnRRE, 3, + { MCK_U32Imm, MCK_AnyReg, MCK_AnyReg } }, + { "rrf", SystemZ::InsnRRF, 5, + { MCK_U32Imm, MCK_AnyReg, MCK_AnyReg, MCK_AnyReg, MCK_U4Imm } }, + { "rrs", SystemZ::InsnRRS, 5, + { MCK_U48Imm, MCK_AnyReg, MCK_AnyReg, MCK_U4Imm, MCK_BDAddr64Disp12 } }, + { "rs", SystemZ::InsnRS, 4, + { MCK_U32Imm, MCK_AnyReg, MCK_AnyReg, MCK_BDAddr64Disp12 } }, + { "rse", SystemZ::InsnRSE, 4, + { MCK_U48Imm, MCK_AnyReg, MCK_AnyReg, MCK_BDAddr64Disp12 } }, + { "rsi", SystemZ::InsnRSI, 4, + { MCK_U48Imm, MCK_AnyReg, MCK_AnyReg, MCK_PCRel16 } }, + { "rsy", SystemZ::InsnRSY, 4, + { MCK_U48Imm, MCK_AnyReg, MCK_AnyReg, MCK_BDAddr64Disp20 } }, + { "rx", SystemZ::InsnRX, 3, + { MCK_U32Imm, MCK_AnyReg, MCK_BDXAddr64Disp12 } }, + { "rxe", SystemZ::InsnRXE, 3, + { MCK_U48Imm, MCK_AnyReg, MCK_BDXAddr64Disp12 } }, + { "rxf", SystemZ::InsnRXF, 4, + { MCK_U48Imm, MCK_AnyReg, MCK_AnyReg, MCK_BDXAddr64Disp12 } }, + { "rxy", SystemZ::InsnRXY, 3, + { MCK_U48Imm, MCK_AnyReg, MCK_BDXAddr64Disp20 } }, + { "s", SystemZ::InsnS, 2, + { MCK_U32Imm, MCK_BDAddr64Disp12 } }, + { "si", SystemZ::InsnSI, 3, + { MCK_U32Imm, MCK_BDAddr64Disp12, MCK_S8Imm } }, + { "sil", SystemZ::InsnSIL, 3, + { MCK_U48Imm, MCK_BDAddr64Disp12, MCK_U16Imm } }, + { "siy", SystemZ::InsnSIY, 3, + { MCK_U48Imm, MCK_BDAddr64Disp20, MCK_U8Imm } }, + { "ss", SystemZ::InsnSS, 4, 
+ { MCK_U48Imm, MCK_BDXAddr64Disp12, MCK_BDAddr64Disp12, MCK_AnyReg } }, + { "sse", SystemZ::InsnSSE, 3, + { MCK_U48Imm, MCK_BDAddr64Disp12, MCK_BDAddr64Disp12 } }, + { "ssf", SystemZ::InsnSSF, 4, + { MCK_U48Imm, MCK_BDAddr64Disp12, MCK_BDAddr64Disp12, MCK_AnyReg } } +}; + void SystemZOperand::print(raw_ostream &OS) const { llvm_unreachable("Not implemented"); } @@ -525,7 +622,7 @@ bool SystemZAsmParser::parseRegister(Register &Reg) { else if (Prefix == 'v' && Reg.Num < 32) Reg.Group = RegV; else if (Prefix == 'a' && Reg.Num < 16) - Reg.Group = RegAccess; + Reg.Group = RegAR; else return Error(Reg.StartLoc, "invalid register"); @@ -556,7 +653,7 @@ bool SystemZAsmParser::parseRegister(Register &Reg, RegisterGroup Group, } // Parse a register and add it to Operands. The other arguments are as above. -SystemZAsmParser::OperandMatchResultTy +OperandMatchResultTy SystemZAsmParser::parseRegister(OperandVector &Operands, RegisterGroup Group, const unsigned *Regs, RegisterKind Kind) { if (Parser.getTok().isNot(AsmToken::Percent)) @@ -572,58 +669,96 @@ SystemZAsmParser::parseRegister(OperandVector &Operands, RegisterGroup Group, return MatchOperand_Success; } -// Parse a memory operand into Base, Disp, Index and Length. -// Regs maps asm register numbers to LLVM register numbers and RegKind -// says what kind of address register we're using (ADDR32Reg or ADDR64Reg). -bool SystemZAsmParser::parseAddress(unsigned &Base, const MCExpr *&Disp, - unsigned &Index, bool &IsVector, - const MCExpr *&Length, const unsigned *Regs, - RegisterKind RegKind) { +// Parse any type of register (including integers) and add it to Operands. +OperandMatchResultTy +SystemZAsmParser::parseAnyRegister(OperandVector &Operands) { + // Handle integer values. 
+ if (Parser.getTok().is(AsmToken::Integer)) { + const MCExpr *Register; + SMLoc StartLoc = Parser.getTok().getLoc(); + if (Parser.parseExpression(Register)) + return MatchOperand_ParseFail; + + if (auto *CE = dyn_cast<MCConstantExpr>(Register)) { + int64_t Value = CE->getValue(); + if (Value < 0 || Value > 15) { + Error(StartLoc, "invalid register"); + return MatchOperand_ParseFail; + } + } + + SMLoc EndLoc = + SMLoc::getFromPointer(Parser.getTok().getLoc().getPointer() - 1); + + Operands.push_back(SystemZOperand::createImm(Register, StartLoc, EndLoc)); + } + else { + Register Reg; + if (parseRegister(Reg)) + return MatchOperand_ParseFail; + + // Map to the correct register kind. + RegisterKind Kind; + unsigned RegNo; + if (Reg.Group == RegGR) { + Kind = GR64Reg; + RegNo = SystemZMC::GR64Regs[Reg.Num]; + } + else if (Reg.Group == RegFP) { + Kind = FP64Reg; + RegNo = SystemZMC::FP64Regs[Reg.Num]; + } + else if (Reg.Group == RegV) { + Kind = VR128Reg; + RegNo = SystemZMC::VR128Regs[Reg.Num]; + } + else if (Reg.Group == RegAR) { + Kind = AR32Reg; + RegNo = SystemZMC::AR32Regs[Reg.Num]; + } + else { + return MatchOperand_ParseFail; + } + + Operands.push_back(SystemZOperand::createReg(Kind, RegNo, + Reg.StartLoc, Reg.EndLoc)); + } + return MatchOperand_Success; +} + +// Parse a memory operand into Reg1, Reg2, Disp, and Length. +bool SystemZAsmParser::parseAddress(bool &HaveReg1, Register &Reg1, + bool &HaveReg2, Register &Reg2, + const MCExpr *&Disp, + const MCExpr *&Length) { // Parse the displacement, which must always be present. if (getParser().parseExpression(Disp)) return true; // Parse the optional base and index. - Index = 0; - Base = 0; - IsVector = false; + HaveReg1 = false; + HaveReg2 = false; Length = nullptr; if (getLexer().is(AsmToken::LParen)) { Parser.Lex(); if (getLexer().is(AsmToken::Percent)) { - // Parse the first register and decide whether it's a base or an index. - Register Reg; - if (parseRegister(Reg)) + // Parse the first register. 
+ HaveReg1 = true; + if (parseRegister(Reg1)) return true; - if (Reg.Group == RegV) { - // A vector index register. The base register is optional. - IsVector = true; - Index = SystemZMC::VR128Regs[Reg.Num]; - } else if (Reg.Group == RegGR) { - if (Reg.Num == 0) - return Error(Reg.StartLoc, "%r0 used in an address"); - // If the are two registers, the first one is the index and the - // second is the base. - if (getLexer().is(AsmToken::Comma)) - Index = Regs[Reg.Num]; - else - Base = Regs[Reg.Num]; - } else - return Error(Reg.StartLoc, "invalid address register"); } else { // Parse the length. if (getParser().parseExpression(Length)) return true; } - // Check whether there's a second register. It's the base if so. + // Check whether there's a second register. if (getLexer().is(AsmToken::Comma)) { Parser.Lex(); - Register Reg; - if (parseRegister(Reg, RegGR, Regs, RegKind)) + HaveReg2 = true; + if (parseRegister(Reg2)) return true; - Base = Reg.Num; } // Consume the closing bracket. @@ -634,56 +769,255 @@ bool SystemZAsmParser::parseAddress(unsigned &Base, const MCExpr *&Disp, return false; } +// Verify that Reg is a valid address register (base or index). +bool +SystemZAsmParser::parseAddressRegister(Register &Reg) { + if (Reg.Group == RegV) { + Error(Reg.StartLoc, "invalid use of vector addressing"); + return true; + } else if (Reg.Group != RegGR) { + Error(Reg.StartLoc, "invalid address register"); + return true; + } else if (Reg.Num == 0) { + Error(Reg.StartLoc, "%r0 used in an address"); + return true; + } + return false; +} + // Parse a memory operand and add it to Operands. The other arguments // are as above. 
-SystemZAsmParser::OperandMatchResultTy +OperandMatchResultTy SystemZAsmParser::parseAddress(OperandVector &Operands, MemoryKind MemKind, const unsigned *Regs, RegisterKind RegKind) { SMLoc StartLoc = Parser.getTok().getLoc(); - unsigned Base, Index; - bool IsVector; + unsigned Base = 0, Index = 0, LengthReg = 0; + Register Reg1, Reg2; + bool HaveReg1, HaveReg2; const MCExpr *Disp; const MCExpr *Length; - if (parseAddress(Base, Disp, Index, IsVector, Length, Regs, RegKind)) - return MatchOperand_ParseFail; - - if (IsVector && MemKind != BDVMem) { - Error(StartLoc, "invalid use of vector addressing"); - return MatchOperand_ParseFail; - } - - if (!IsVector && MemKind == BDVMem) { - Error(StartLoc, "vector index required in address"); - return MatchOperand_ParseFail; - } - - if (Index && MemKind != BDXMem && MemKind != BDVMem) { - Error(StartLoc, "invalid use of indexed addressing"); + if (parseAddress(HaveReg1, Reg1, HaveReg2, Reg2, Disp, Length)) return MatchOperand_ParseFail; - } - if (Length && MemKind != BDLMem) { - Error(StartLoc, "invalid use of length addressing"); - return MatchOperand_ParseFail; - } - - if (!Length && MemKind == BDLMem) { - Error(StartLoc, "missing length in address"); - return MatchOperand_ParseFail; + switch (MemKind) { + case BDMem: + // If we have Reg1, it must be an address register. + if (HaveReg1) { + if (parseAddressRegister(Reg1)) + return MatchOperand_ParseFail; + Base = Regs[Reg1.Num]; + } + // There must be no Reg2 or length. + if (Length) { + Error(StartLoc, "invalid use of length addressing"); + return MatchOperand_ParseFail; + } + if (HaveReg2) { + Error(StartLoc, "invalid use of indexed addressing"); + return MatchOperand_ParseFail; + } + break; + case BDXMem: + // If we have Reg1, it must be an address register. + if (HaveReg1) { + if (parseAddressRegister(Reg1)) + return MatchOperand_ParseFail; + // If the are two registers, the first one is the index and the + // second is the base. 
+ if (HaveReg2) + Index = Regs[Reg1.Num]; + else + Base = Regs[Reg1.Num]; + } + // If we have Reg2, it must be an address register. + if (HaveReg2) { + if (parseAddressRegister(Reg2)) + return MatchOperand_ParseFail; + Base = Regs[Reg2.Num]; + } + // There must be no length. + if (Length) { + Error(StartLoc, "invalid use of length addressing"); + return MatchOperand_ParseFail; + } + break; + case BDLMem: + // If we have Reg2, it must be an address register. + if (HaveReg2) { + if (parseAddressRegister(Reg2)) + return MatchOperand_ParseFail; + Base = Regs[Reg2.Num]; + } + // We cannot support base+index addressing. + if (HaveReg1 && HaveReg2) { + Error(StartLoc, "invalid use of indexed addressing"); + return MatchOperand_ParseFail; + } + // We must have a length. + if (!Length) { + Error(StartLoc, "missing length in address"); + return MatchOperand_ParseFail; + } + break; + case BDRMem: + // We must have Reg1, and it must be a GPR. + if (!HaveReg1 || Reg1.Group != RegGR) { + Error(StartLoc, "invalid operand for instruction"); + return MatchOperand_ParseFail; + } + LengthReg = SystemZMC::GR64Regs[Reg1.Num]; + // If we have Reg2, it must be an address register. + if (HaveReg2) { + if (parseAddressRegister(Reg2)) + return MatchOperand_ParseFail; + Base = Regs[Reg2.Num]; + } + // There must be no length. + if (Length) { + Error(StartLoc, "invalid use of length addressing"); + return MatchOperand_ParseFail; + } + break; + case BDVMem: + // We must have Reg1, and it must be a vector register. + if (!HaveReg1 || Reg1.Group != RegV) { + Error(StartLoc, "vector index required in address"); + return MatchOperand_ParseFail; + } + Index = SystemZMC::VR128Regs[Reg1.Num]; + // If we have Reg2, it must be an address register. + if (HaveReg2) { + if (parseAddressRegister(Reg2)) + return MatchOperand_ParseFail; + Base = Regs[Reg2.Num]; + } + // There must be no length. 
+ if (Length) { + Error(StartLoc, "invalid use of length addressing"); + return MatchOperand_ParseFail; + } + break; } SMLoc EndLoc = SMLoc::getFromPointer(Parser.getTok().getLoc().getPointer() - 1); Operands.push_back(SystemZOperand::createMem(MemKind, RegKind, Base, Disp, - Index, Length, StartLoc, - EndLoc)); + Index, Length, LengthReg, + StartLoc, EndLoc)); return MatchOperand_Success; } bool SystemZAsmParser::ParseDirective(AsmToken DirectiveID) { + StringRef IDVal = DirectiveID.getIdentifier(); + + if (IDVal == ".insn") + return ParseDirectiveInsn(DirectiveID.getLoc()); + return true; } +/// ParseDirectiveInsn +/// ::= .insn [ format, encoding, (operands (, operands)*) ] +bool SystemZAsmParser::ParseDirectiveInsn(SMLoc L) { + MCAsmParser &Parser = getParser(); + + // Expect instruction format as identifier. + StringRef Format; + SMLoc ErrorLoc = Parser.getTok().getLoc(); + if (Parser.parseIdentifier(Format)) + return Error(ErrorLoc, "expected instruction format"); + + SmallVector<std::unique_ptr<MCParsedAsmOperand>, 8> Operands; + + // Find entry for this format in InsnMatchTable. + auto EntryRange = + std::equal_range(std::begin(InsnMatchTable), std::end(InsnMatchTable), + Format, CompareInsn()); + + // If first == second, couldn't find a match in the table. + if (EntryRange.first == EntryRange.second) + return Error(ErrorLoc, "unrecognized format"); + + struct InsnMatchEntry *Entry = EntryRange.first; + + // Format should match from equal_range. + assert(Entry->Format == Format); + + // Parse the following operands using the table's information. + for (int i = 0; i < Entry->NumOperands; i++) { + MatchClassKind Kind = Entry->OperandKinds[i]; + + SMLoc StartLoc = Parser.getTok().getLoc(); + + // Always expect commas as separators for operands. + if (getLexer().isNot(AsmToken::Comma)) + return Error(StartLoc, "unexpected token in directive"); + Lex(); + + // Parse operands. 
+ OperandMatchResultTy ResTy; + if (Kind == MCK_AnyReg) + ResTy = parseAnyReg(Operands); + else if (Kind == MCK_BDXAddr64Disp12 || Kind == MCK_BDXAddr64Disp20) + ResTy = parseBDXAddr64(Operands); + else if (Kind == MCK_BDAddr64Disp12 || Kind == MCK_BDAddr64Disp20) + ResTy = parseBDAddr64(Operands); + else if (Kind == MCK_PCRel32) + ResTy = parsePCRel32(Operands); + else if (Kind == MCK_PCRel16) + ResTy = parsePCRel16(Operands); + else { + // Only remaining operand kind is an immediate. + const MCExpr *Expr; + SMLoc StartLoc = Parser.getTok().getLoc(); + + // Expect immediate expression. + if (Parser.parseExpression(Expr)) + return Error(StartLoc, "unexpected token in directive"); + + SMLoc EndLoc = + SMLoc::getFromPointer(Parser.getTok().getLoc().getPointer() - 1); + + Operands.push_back(SystemZOperand::createImm(Expr, StartLoc, EndLoc)); + ResTy = MatchOperand_Success; + } + + if (ResTy != MatchOperand_Success) + return true; + } + + // Build the instruction with the parsed operands. + MCInst Inst = MCInstBuilder(Entry->Opcode); + + for (size_t i = 0; i < Operands.size(); i++) { + MCParsedAsmOperand &Operand = *Operands[i]; + MatchClassKind Kind = Entry->OperandKinds[i]; + + // Verify operand. + unsigned Res = validateOperandClass(Operand, Kind); + if (Res != Match_Success) + return Error(Operand.getStartLoc(), "unexpected operand type"); + + // Add operands to instruction. + SystemZOperand &ZOperand = static_cast<SystemZOperand &>(Operand); + if (ZOperand.isReg()) + ZOperand.addRegOperands(Inst, 1); + else if (ZOperand.isMem(BDMem)) + ZOperand.addBDAddrOperands(Inst, 2); + else if (ZOperand.isMem(BDXMem)) + ZOperand.addBDXAddrOperands(Inst, 3); + else if (ZOperand.isImm()) + ZOperand.addImmOperands(Inst, 1); + else + llvm_unreachable("unexpected operand type"); + } + + // Emit as a regular instruction. 
+ Parser.getStreamer().EmitInstruction(Inst, getSTI()); + + return false; +} + bool SystemZAsmParser::ParseRegister(unsigned &RegNo, SMLoc &StartLoc, SMLoc &EndLoc) { Register Reg; @@ -695,9 +1029,8 @@ bool SystemZAsmParser::ParseRegister(unsigned &RegNo, SMLoc &StartLoc, RegNo = SystemZMC::FP64Regs[Reg.Num]; else if (Reg.Group == RegV) RegNo = SystemZMC::VR128Regs[Reg.Num]; - else - // FIXME: Access registers aren't modelled as LLVM registers yet. - return Error(Reg.StartLoc, "invalid operand for instruction"); + else if (Reg.Group == RegAR) + RegNo = SystemZMC::AR32Regs[Reg.Num]; StartLoc = Reg.StartLoc; EndLoc = Reg.EndLoc; return false; @@ -712,7 +1045,6 @@ bool SystemZAsmParser::ParseInstruction(ParseInstructionInfo &Info, if (getLexer().isNot(AsmToken::EndOfStatement)) { // Read the first operand. if (parseOperand(Operands, Name)) { - Parser.eatToEndOfStatement(); return true; } @@ -720,13 +1052,11 @@ bool SystemZAsmParser::ParseInstruction(ParseInstructionInfo &Info, while (getLexer().is(AsmToken::Comma)) { Parser.Lex(); if (parseOperand(Operands, Name)) { - Parser.eatToEndOfStatement(); return true; } } if (getLexer().isNot(AsmToken::EndOfStatement)) { SMLoc Loc = getLexer().getLoc(); - Parser.eatToEndOfStatement(); return Error(Loc, "unexpected token in argument list"); } } @@ -739,8 +1069,14 @@ bool SystemZAsmParser::ParseInstruction(ParseInstructionInfo &Info, bool SystemZAsmParser::parseOperand(OperandVector &Operands, StringRef Mnemonic) { // Check if the current operand has a custom associated parser, if so, try to - // custom parse the operand, or fallback to the general approach. + // custom parse the operand, or fallback to the general approach. Force all + // features to be available during the operand check, or else we will fail to + // find the custom parser, and then we will later get an InvalidOperand error + // instead of a MissingFeature errror. 
+  uint64_t AvailableFeatures = getAvailableFeatures();
+  setAvailableFeatures(~(uint64_t)0);
   OperandMatchResultTy ResTy = MatchOperandParserImpl(Operands, Mnemonic);
+  setAvailableFeatures(AvailableFeatures);
   if (ResTy == MatchOperand_Success)
     return false;
 
@@ -766,16 +1102,23 @@ bool SystemZAsmParser::parseOperand(OperandVector &Operands,
   // real address operands should have used a context-dependent parse routine,
   // so we treat any plain expression as an immediate.
   SMLoc StartLoc = Parser.getTok().getLoc();
-  unsigned Base, Index;
-  bool IsVector;
-  const MCExpr *Expr, *Length;
-  if (parseAddress(Base, Expr, Index, IsVector, Length, SystemZMC::GR64Regs,
-                   ADDR64Reg))
+  Register Reg1, Reg2;
+  bool HaveReg1, HaveReg2;
+  const MCExpr *Expr;
+  const MCExpr *Length;
+  if (parseAddress(HaveReg1, Reg1, HaveReg2, Reg2, Expr, Length))
+    return true;
+  // If the register combination is not valid for any instruction, reject it.
+  // Otherwise, fall back to reporting an unrecognized instruction.
+  if (HaveReg1 && Reg1.Group != RegGR && Reg1.Group != RegV
+      && parseAddressRegister(Reg1))
+    return true;
+  if (HaveReg2 && parseAddressRegister(Reg2))
     return true;
 
   SMLoc EndLoc =
     SMLoc::getFromPointer(Parser.getTok().getLoc().getPointer() - 1);
-  if (Base || Index || Length)
+  if (HaveReg1 || HaveReg2 || Length)
     Operands.push_back(SystemZOperand::createInvalid(StartLoc, EndLoc));
   else
     Operands.push_back(SystemZOperand::createImm(Expr, StartLoc, EndLoc));
@@ -834,22 +1177,7 @@ bool SystemZAsmParser::MatchAndEmitInstruction(SMLoc IDLoc, unsigned &Opcode,
   llvm_unreachable("Unexpected match type");
 }
 
-SystemZAsmParser::OperandMatchResultTy
-SystemZAsmParser::parseAccessReg(OperandVector &Operands) {
-  if (Parser.getTok().isNot(AsmToken::Percent))
-    return MatchOperand_NoMatch;
-
-  Register Reg;
-  if (parseRegister(Reg, RegAccess, nullptr))
-    return MatchOperand_ParseFail;
-
-  Operands.push_back(SystemZOperand::createAccessReg(Reg.Num,
-                                                     Reg.StartLoc,
-                                                     Reg.EndLoc));
-  return MatchOperand_Success;
-}
-
-SystemZAsmParser::OperandMatchResultTy
+OperandMatchResultTy
 SystemZAsmParser::parsePCRel(OperandVector &Operands, int64_t MinVal,
                              int64_t MaxVal, bool AllowTLS) {
   MCContext &Ctx = getContext();
@@ -927,5 +1255,5 @@ SystemZAsmParser::parsePCRel(OperandVector &Operands, int64_t MinVal,
 
 // Force static initialization.
 extern "C" void LLVMInitializeSystemZAsmParser() {
-  RegisterMCAsmParser<SystemZAsmParser> X(TheSystemZTarget);
+  RegisterMCAsmParser<SystemZAsmParser> X(getTheSystemZTarget());
 }
diff --git a/contrib/llvm/lib/Target/SystemZ/Disassembler/SystemZDisassembler.cpp b/contrib/llvm/lib/Target/SystemZ/Disassembler/SystemZDisassembler.cpp
index 20e015b..1806e01 100644
--- a/contrib/llvm/lib/Target/SystemZ/Disassembler/SystemZDisassembler.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/Disassembler/SystemZDisassembler.cpp
@@ -42,7 +42,7 @@ static MCDisassembler *createSystemZDisassembler(const Target &T,
 
 extern "C" void LLVMInitializeSystemZDisassembler() {
   // Register the disassembler.
-  TargetRegistry::RegisterMCDisassembler(TheSystemZTarget,
+  TargetRegistry::RegisterMCDisassembler(getTheSystemZTarget(),
                                          createSystemZDisassembler);
 }
 
@@ -150,6 +150,12 @@ static DecodeStatus DecodeVR128BitRegisterClass(MCInst &Inst, uint64_t RegNo,
   return decodeRegisterClass(Inst, RegNo, SystemZMC::VR128Regs, 32);
 }
 
+static DecodeStatus DecodeAR32BitRegisterClass(MCInst &Inst, uint64_t RegNo,
+                                               uint64_t Address,
+                                               const void *Decoder) {
+  return decodeRegisterClass(Inst, RegNo, SystemZMC::AR32Regs, 16);
+}
+
 template<unsigned N>
 static DecodeStatus decodeUImmOperand(MCInst &Inst, uint64_t Imm) {
   if (!isUInt<N>(Imm))
@@ -166,12 +172,6 @@ static DecodeStatus decodeSImmOperand(MCInst &Inst, uint64_t Imm) {
   return MCDisassembler::Success;
 }
 
-static DecodeStatus decodeAccessRegOperand(MCInst &Inst, uint64_t Imm,
-                                           uint64_t Address,
-                                           const void *Decoder) {
-  return decodeUImmOperand<4>(Inst, Imm);
-}
-
 static DecodeStatus decodeU1ImmOperand(MCInst &Inst, uint64_t Imm,
                                        uint64_t Address, const void *Decoder) {
   return decodeUImmOperand<1>(Inst, Imm);
@@ -247,12 +247,24 @@ static DecodeStatus decodePCDBLOperand(MCInst &Inst, uint64_t Imm,
   return MCDisassembler::Success;
 }
 
+static DecodeStatus decodePC12DBLBranchOperand(MCInst &Inst, uint64_t Imm,
+                                               uint64_t Address,
+                                               const void *Decoder) {
+  return decodePCDBLOperand<12>(Inst, Imm, Address, true, Decoder);
+}
+
 static DecodeStatus decodePC16DBLBranchOperand(MCInst &Inst, uint64_t Imm,
                                                uint64_t Address,
                                                const void *Decoder) {
   return decodePCDBLOperand<16>(Inst, Imm, Address, true, Decoder);
 }
 
+static DecodeStatus decodePC24DBLBranchOperand(MCInst &Inst, uint64_t Imm,
+                                               uint64_t Address,
+                                               const void *Decoder) {
+  return decodePCDBLOperand<24>(Inst, Imm, Address, true, Decoder);
+}
+
 static DecodeStatus decodePC32DBLBranchOperand(MCInst &Inst, uint64_t Imm,
                                                uint64_t Address,
                                                const void *Decoder) {
@@ -321,6 +333,18 @@ static DecodeStatus decodeBDLAddr12Len8Operand(MCInst &Inst, uint64_t Field,
   return MCDisassembler::Success;
 }
 
+static DecodeStatus decodeBDRAddr12Operand(MCInst &Inst, uint64_t Field,
+                                           const unsigned *Regs) {
+  uint64_t Length = Field >> 16;
+  uint64_t Base = (Field >> 12) & 0xf;
+  uint64_t Disp = Field & 0xfff;
+  assert(Length < 16 && "Invalid BDRAddr12");
+  Inst.addOperand(MCOperand::createReg(Base == 0 ? 0 : Regs[Base]));
+  Inst.addOperand(MCOperand::createImm(Disp));
+  Inst.addOperand(MCOperand::createReg(Regs[Length]));
+  return MCDisassembler::Success;
+}
+
 static DecodeStatus decodeBDVAddr12Operand(MCInst &Inst, uint64_t Field,
                                            const unsigned *Regs) {
   uint64_t Index = Field >> 16;
@@ -376,6 +400,13 @@ static DecodeStatus decodeBDLAddr64Disp12Len8Operand(MCInst &Inst,
   return decodeBDLAddr12Len8Operand(Inst, Field, SystemZMC::GR64Regs);
 }
 
+static DecodeStatus decodeBDRAddr64Disp12Operand(MCInst &Inst,
+                                                 uint64_t Field,
+                                                 uint64_t Address,
+                                                 const void *Decoder) {
+  return decodeBDRAddr12Operand(Inst, Field, SystemZMC::GR64Regs);
+}
+
 static DecodeStatus decodeBDVAddr64Disp12Operand(MCInst &Inst, uint64_t Field,
                                                  uint64_t Address,
                                                  const void *Decoder) {
diff --git a/contrib/llvm/lib/Target/SystemZ/InstPrinter/SystemZInstPrinter.cpp b/contrib/llvm/lib/Target/SystemZ/InstPrinter/SystemZInstPrinter.cpp
index 6444cf8..1207c7b 100644
--- a/contrib/llvm/lib/Target/SystemZ/InstPrinter/SystemZInstPrinter.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/InstPrinter/SystemZInstPrinter.cpp
@@ -134,11 +134,9 @@ void SystemZInstPrinter::printU32ImmOperand(const MCInst *MI, int OpNum,
   printUImmOperand<32>(MI, OpNum, O);
 }
 
-void SystemZInstPrinter::printAccessRegOperand(const MCInst *MI, int OpNum,
-                                               raw_ostream &O) {
-  uint64_t Value = MI->getOperand(OpNum).getImm();
-  assert(Value < 16 && "Invalid access register number");
-  O << "%a" << (unsigned int)Value;
+void SystemZInstPrinter::printU48ImmOperand(const MCInst *MI, int OpNum,
+                                            raw_ostream &O) {
+  printUImmOperand<48>(MI, OpNum, O);
 }
 
 void SystemZInstPrinter::printPCRelOperand(const MCInst *MI, int OpNum,
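Editor's note: the new BDR address operand uses one 20-bit field layout in two places in this commit. decodeBDRAddr12Operand (SystemZDisassembler.cpp) splits the field with shifts and masks, and getBDRAddr12Encoding (SystemZMCCodeEmitter.cpp) builds it as (Len << 16) | (Base << 12) | Disp. The following is a minimal stand-alone sketch of that round trip, not code from the commit. The struct BDRAddr12 and the helpers packBDRAddr12 and unpackBDRAddr12 are hypothetical names, and the sketch works on raw register numbers rather than going through getMachineOpValue or the Regs[] tables as the real code does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the 20-bit BDR address field:
//   bits 19-16: length register number
//   bits 15-12: base register number
//   bits 11-0 : unsigned 12-bit displacement
struct BDRAddr12 {
  uint64_t Base;
  uint64_t Disp;
  uint64_t Len;
};

// Same packing expression as getBDRAddr12Encoding in the diff.
inline uint64_t packBDRAddr12(const BDRAddr12 &A) {
  assert(A.Base < 16 && A.Disp < 4096 && A.Len < 16 && "field out of range");
  return (A.Len << 16) | (A.Base << 12) | A.Disp;
}

// Same shifts and masks as decodeBDRAddr12Operand in the diff.
inline BDRAddr12 unpackBDRAddr12(uint64_t Field) {
  BDRAddr12 A;
  A.Len = Field >> 16;
  A.Base = (Field >> 12) & 0xf;
  A.Disp = Field & 0xfff;
  return A;
}
```

In the real decoder a base number of 0 is emitted as register 0 (no base register) instead of Regs[0]; the sketch leaves that mapping out.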
@@ -203,6 +201,17 @@ void SystemZInstPrinter::printBDLAddrOperand(const MCInst *MI, int OpNum,
   O << ')';
 }
 
+void SystemZInstPrinter::printBDRAddrOperand(const MCInst *MI, int OpNum,
+                                             raw_ostream &O) {
+  unsigned Base = MI->getOperand(OpNum).getReg();
+  uint64_t Disp = MI->getOperand(OpNum + 1).getImm();
+  unsigned Length = MI->getOperand(OpNum + 2).getReg();
+  O << Disp << "(%" << getRegisterName(Length);
+  if (Base)
+    O << ",%" << getRegisterName(Base);
+  O << ')';
+}
+
 void SystemZInstPrinter::printBDVAddrOperand(const MCInst *MI, int OpNum,
                                              raw_ostream &O) {
   printAddress(MI->getOperand(OpNum).getReg(),
diff --git a/contrib/llvm/lib/Target/SystemZ/InstPrinter/SystemZInstPrinter.h b/contrib/llvm/lib/Target/SystemZ/InstPrinter/SystemZInstPrinter.h
index 7ca386f..6336f5e 100644
--- a/contrib/llvm/lib/Target/SystemZ/InstPrinter/SystemZInstPrinter.h
+++ b/contrib/llvm/lib/Target/SystemZ/InstPrinter/SystemZInstPrinter.h
@@ -48,6 +48,7 @@ private:
   void printBDAddrOperand(const MCInst *MI, int OpNum, raw_ostream &O);
   void printBDXAddrOperand(const MCInst *MI, int OpNum, raw_ostream &O);
   void printBDLAddrOperand(const MCInst *MI, int OpNum, raw_ostream &O);
+  void printBDRAddrOperand(const MCInst *MI, int OpNum, raw_ostream &O);
   void printBDVAddrOperand(const MCInst *MI, int OpNum, raw_ostream &O);
   void printU1ImmOperand(const MCInst *MI, int OpNum, raw_ostream &O);
   void printU2ImmOperand(const MCInst *MI, int OpNum, raw_ostream &O);
@@ -61,9 +62,9 @@ private:
   void printU16ImmOperand(const MCInst *MI, int OpNum, raw_ostream &O);
   void printS32ImmOperand(const MCInst *MI, int OpNum, raw_ostream &O);
   void printU32ImmOperand(const MCInst *MI, int OpNum, raw_ostream &O);
+  void printU48ImmOperand(const MCInst *MI, int OpNum, raw_ostream &O);
   void printPCRelOperand(const MCInst *MI, int OpNum, raw_ostream &O);
   void printPCRelTLSOperand(const MCInst *MI, int OpNum, raw_ostream &O);
-  void printAccessRegOperand(const MCInst *MI, int OpNum, raw_ostream &O);
 
   // Print the mnemonic for a condition-code mask ("ne", "lh", etc.)
   // This forms part of the instruction name rather than the operand list.
diff --git a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCAsmBackend.cpp b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCAsmBackend.cpp
index c4d546c..9192448 100644
--- a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCAsmBackend.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCAsmBackend.cpp
@@ -25,7 +25,9 @@ static uint64_t extractBitsForFixup(MCFixupKind Kind, uint64_t Value) {
     return Value;
 
   switch (unsigned(Kind)) {
+  case SystemZ::FK_390_PC12DBL:
   case SystemZ::FK_390_PC16DBL:
+  case SystemZ::FK_390_PC24DBL:
   case SystemZ::FK_390_PC32DBL:
     return (int64_t)Value / 2;
 
@@ -72,7 +74,9 @@ public:
 const MCFixupKindInfo &
 SystemZMCAsmBackend::getFixupKindInfo(MCFixupKind Kind) const {
   const static MCFixupKindInfo Infos[SystemZ::NumTargetFixupKinds] = {
+    { "FK_390_PC12DBL", 4, 12, MCFixupKindInfo::FKF_IsPCRel },
     { "FK_390_PC16DBL", 0, 16, MCFixupKindInfo::FKF_IsPCRel },
+    { "FK_390_PC24DBL", 0, 24, MCFixupKindInfo::FKF_IsPCRel },
     { "FK_390_PC32DBL", 0, 32, MCFixupKindInfo::FKF_IsPCRel },
     { "FK_390_TLS_CALL", 0, 0, 0 }
   };
@@ -90,12 +94,15 @@ void SystemZMCAsmBackend::applyFixup(const MCFixup &Fixup, char *Data,
                                      bool IsPCRel) const {
   MCFixupKind Kind = Fixup.getKind();
   unsigned Offset = Fixup.getOffset();
-  unsigned Size = (getFixupKindInfo(Kind).TargetSize + 7) / 8;
+  unsigned BitSize = getFixupKindInfo(Kind).TargetSize;
+  unsigned Size = (BitSize + 7) / 8;
 
   assert(Offset + Size <= DataSize && "Invalid fixup offset!");
 
   // Big-endian insertion of Size bytes.
   Value = extractBitsForFixup(Kind, Value);
+  if (BitSize < 64)
+    Value &= ((uint64_t)1 << BitSize) - 1;
   unsigned ShiftValue = (Size * 8) - 8;
   for (unsigned I = 0; I != Size; ++I) {
     Data[Offset + I] |= uint8_t(Value >> ShiftValue);
@@ -112,7 +119,8 @@ bool SystemZMCAsmBackend::writeNopData(uint64_t Count,
 
 MCAsmBackend *llvm::createSystemZMCAsmBackend(const Target &T,
                                               const MCRegisterInfo &MRI,
-                                              const Triple &TT, StringRef CPU) {
+                                              const Triple &TT, StringRef CPU,
+                                              const MCTargetOptions &Options) {
   uint8_t OSABI = MCELFObjectTargetWriter::getOSABI(TT.getOS());
   return new SystemZMCAsmBackend(OSABI);
 }
diff --git a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCCodeEmitter.cpp b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCCodeEmitter.cpp
index fd52a2e..7082aba 100644
--- a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCCodeEmitter.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCCodeEmitter.cpp
@@ -72,6 +72,9 @@ private:
   uint64_t getBDLAddr12Len8Encoding(const MCInst &MI, unsigned OpNum,
                                     SmallVectorImpl<MCFixup> &Fixups,
                                     const MCSubtargetInfo &STI) const;
+  uint64_t getBDRAddr12Encoding(const MCInst &MI, unsigned OpNum,
+                                SmallVectorImpl<MCFixup> &Fixups,
+                                const MCSubtargetInfo &STI) const;
   uint64_t getBDVAddr12Encoding(const MCInst &MI, unsigned OpNum,
                                 SmallVectorImpl<MCFixup> &Fixups,
                                 const MCSubtargetInfo &STI) const;
@@ -110,6 +113,29 @@ private:
     return getPCRelEncoding(MI, OpNum, Fixups,
                             SystemZ::FK_390_PC32DBL, 2, true);
   }
+  uint64_t getPC12DBLBPPEncoding(const MCInst &MI, unsigned OpNum,
+                                 SmallVectorImpl<MCFixup> &Fixups,
+                                 const MCSubtargetInfo &STI) const {
+    return getPCRelEncoding(MI, OpNum, Fixups,
+                            SystemZ::FK_390_PC12DBL, 1, false);
+  }
+  uint64_t getPC16DBLBPPEncoding(const MCInst &MI, unsigned OpNum,
+                                 SmallVectorImpl<MCFixup> &Fixups,
+                                 const MCSubtargetInfo &STI) const {
+    return getPCRelEncoding(MI, OpNum, Fixups,
+                            SystemZ::FK_390_PC16DBL, 4, false);
+  }
+  uint64_t getPC24DBLBPPEncoding(const MCInst &MI, unsigned OpNum,
+                                 SmallVectorImpl<MCFixup> &Fixups,
+                                 const MCSubtargetInfo &STI) const {
+    return getPCRelEncoding(MI, OpNum, Fixups,
+                            SystemZ::FK_390_PC24DBL, 3, false);
+  }
+
+private:
+  uint64_t computeAvailableFeatures(const FeatureBitset &FB) const;
+  void verifyInstructionPredicates(const MCInst &MI,
+                                   uint64_t AvailableFeatures) const;
 };
 
 } // end anonymous namespace
@@ -123,6 +149,9 @@ void SystemZMCCodeEmitter::
 encodeInstruction(const MCInst &MI, raw_ostream &OS,
                   SmallVectorImpl<MCFixup> &Fixups,
                   const MCSubtargetInfo &STI) const {
+  verifyInstructionPredicates(MI,
+                              computeAvailableFeatures(STI.getFeatureBits()));
+
   uint64_t Bits = getBinaryCodeForInstr(MI, Fixups, STI);
   unsigned Size = MCII.get(MI.getOpcode()).getSize();
   // Big-endian insertion of Size bytes.
@@ -199,6 +228,17 @@ getBDLAddr12Len8Encoding(const MCInst &MI, unsigned OpNum,
 }
 
 uint64_t SystemZMCCodeEmitter::
+getBDRAddr12Encoding(const MCInst &MI, unsigned OpNum,
+                     SmallVectorImpl<MCFixup> &Fixups,
+                     const MCSubtargetInfo &STI) const {
+  uint64_t Base = getMachineOpValue(MI, MI.getOperand(OpNum), Fixups, STI);
+  uint64_t Disp = getMachineOpValue(MI, MI.getOperand(OpNum + 1), Fixups, STI);
+  uint64_t Len = getMachineOpValue(MI, MI.getOperand(OpNum + 2), Fixups, STI);
+  assert(isUInt<4>(Base) && isUInt<12>(Disp) && isUInt<4>(Len));
+  return (Len << 16) | (Base << 12) | Disp;
+}
+
+uint64_t SystemZMCCodeEmitter::
 getBDVAddr12Encoding(const MCInst &MI, unsigned OpNum,
                      SmallVectorImpl<MCFixup> &Fixups,
                      const MCSubtargetInfo &STI) const {
@@ -240,4 +280,5 @@ SystemZMCCodeEmitter::getPCRelEncoding(const MCInst &MI, unsigned OpNum,
   return 0;
 }
 
+#define ENABLE_INSTR_PREDICATE_VERIFIER
 #include "SystemZGenMCCodeEmitter.inc"
diff --git a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCFixups.h b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCFixups.h
index 229ab5d..c012acc 100644
--- a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCFixups.h
+++ b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCFixups.h
@@ -16,7 +16,9 @@ namespace llvm {
 namespace SystemZ {
 enum FixupKind {
   // These correspond directly to R_390_* relocations.
-  FK_390_PC16DBL = FirstTargetFixupKind,
+  FK_390_PC12DBL = FirstTargetFixupKind,
+  FK_390_PC16DBL,
+  FK_390_PC24DBL,
   FK_390_PC32DBL,
   FK_390_TLS_CALL,
 
diff --git a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCObjectWriter.cpp b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCObjectWriter.cpp
index 368c95f..43a96e8 100644
--- a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCObjectWriter.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCObjectWriter.cpp
@@ -53,7 +53,9 @@ static unsigned getPCRelReloc(unsigned Kind) {
   case FK_Data_2: return ELF::R_390_PC16;
   case FK_Data_4: return ELF::R_390_PC32;
   case FK_Data_8: return ELF::R_390_PC64;
+  case SystemZ::FK_390_PC12DBL: return ELF::R_390_PC12DBL;
   case SystemZ::FK_390_PC16DBL: return ELF::R_390_PC16DBL;
+  case SystemZ::FK_390_PC24DBL: return ELF::R_390_PC24DBL;
   case SystemZ::FK_390_PC32DBL: return ELF::R_390_PC32DBL;
   }
   llvm_unreachable("Unsupported PC-relative address");
@@ -100,7 +102,9 @@ static unsigned getTLSGDReloc(unsigned Kind) {
 // Return the PLT relocation counterpart of MCFixupKind Kind.
 static unsigned getPLTReloc(unsigned Kind) {
   switch (Kind) {
+  case SystemZ::FK_390_PC12DBL: return ELF::R_390_PLT12DBL;
   case SystemZ::FK_390_PC16DBL: return ELF::R_390_PLT16DBL;
+  case SystemZ::FK_390_PC24DBL: return ELF::R_390_PLT24DBL;
   case SystemZ::FK_390_PC32DBL: return ELF::R_390_PLT32DBL;
   }
   llvm_unreachable("Unsupported absolute address");
diff --git a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCTargetDesc.cpp b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCTargetDesc.cpp
index e16ba9e..dfea7e3 100644
--- a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCTargetDesc.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCTargetDesc.cpp
@@ -109,6 +109,13 @@ const unsigned SystemZMC::VR128Regs[32] = {
   SystemZ::V28, SystemZ::V29, SystemZ::V30, SystemZ::V31
 };
 
+const unsigned SystemZMC::AR32Regs[16] = {
+  SystemZ::A0, SystemZ::A1, SystemZ::A2, SystemZ::A3,
+  SystemZ::A4, SystemZ::A5, SystemZ::A6, SystemZ::A7,
+  SystemZ::A8, SystemZ::A9, SystemZ::A10, SystemZ::A11,
+  SystemZ::A12, SystemZ::A13, SystemZ::A14, SystemZ::A15
+};
+
 unsigned SystemZMC::getFirstReg(unsigned Reg) {
   static unsigned Map[SystemZ::NUM_TARGET_REGS];
   static bool Initialized = false;
@@ -119,6 +126,7 @@ unsigned SystemZMC::getFirstReg(unsigned Reg) {
       Map[GR64Regs[I]] = I;
       Map[GR128Regs[I]] = I;
       Map[FP128Regs[I]] = I;
+      Map[AR32Regs[I]] = I;
     }
     for (unsigned I = 0; I < 32; ++I) {
       Map[VR32Regs[I]] = I;
@@ -205,34 +213,34 @@ static MCInstPrinter *createSystemZMCInstPrinter(const Triple &T,
 
 extern "C" void LLVMInitializeSystemZTargetMC() {
   // Register the MCAsmInfo.
-  TargetRegistry::RegisterMCAsmInfo(TheSystemZTarget,
+  TargetRegistry::RegisterMCAsmInfo(getTheSystemZTarget(),
                                     createSystemZMCAsmInfo);
 
   // Register the adjustCodeGenOpts.
-  TargetRegistry::registerMCAdjustCodeGenOpts(TheSystemZTarget,
+  TargetRegistry::registerMCAdjustCodeGenOpts(getTheSystemZTarget(),
                                               adjustCodeGenOpts);
 
   // Register the MCCodeEmitter.
-  TargetRegistry::RegisterMCCodeEmitter(TheSystemZTarget,
+  TargetRegistry::RegisterMCCodeEmitter(getTheSystemZTarget(),
                                         createSystemZMCCodeEmitter);
 
   // Register the MCInstrInfo.
-  TargetRegistry::RegisterMCInstrInfo(TheSystemZTarget,
+  TargetRegistry::RegisterMCInstrInfo(getTheSystemZTarget(),
                                       createSystemZMCInstrInfo);
 
   // Register the MCRegisterInfo.
-  TargetRegistry::RegisterMCRegInfo(TheSystemZTarget,
+  TargetRegistry::RegisterMCRegInfo(getTheSystemZTarget(),
                                     createSystemZMCRegisterInfo);
 
   // Register the MCSubtargetInfo.
-  TargetRegistry::RegisterMCSubtargetInfo(TheSystemZTarget,
+  TargetRegistry::RegisterMCSubtargetInfo(getTheSystemZTarget(),
                                           createSystemZMCSubtargetInfo);
 
   // Register the MCAsmBackend.
-  TargetRegistry::RegisterMCAsmBackend(TheSystemZTarget,
+  TargetRegistry::RegisterMCAsmBackend(getTheSystemZTarget(),
                                        createSystemZMCAsmBackend);
 
   // Register the MCInstPrinter.
-  TargetRegistry::RegisterMCInstPrinter(TheSystemZTarget,
+  TargetRegistry::RegisterMCInstPrinter(getTheSystemZTarget(),
                                         createSystemZMCInstPrinter);
 }
diff --git a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCTargetDesc.h b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCTargetDesc.h
index 0db48fe..d9926c7 100644
--- a/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCTargetDesc.h
+++ b/contrib/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCTargetDesc.h
@@ -21,13 +21,14 @@ class MCInstrInfo;
 class MCObjectWriter;
 class MCRegisterInfo;
 class MCSubtargetInfo;
+class MCTargetOptions;
 class StringRef;
 class Target;
 class Triple;
 class raw_pwrite_stream;
 class raw_ostream;
 
-extern Target TheSystemZTarget;
+Target &getTheSystemZTarget();
 
 namespace SystemZMC {
 // How many bytes are in the ABI-defined, caller-allocated part of
@@ -53,6 +54,7 @@ extern const unsigned FP128Regs[16];
 extern const unsigned VR32Regs[32];
 extern const unsigned VR64Regs[32];
 extern const unsigned VR128Regs[32];
+extern const unsigned AR32Regs[16];
 
 // Return the 0-based number of the first architectural register that
 // contains the given LLVM register. E.g. R1D -> 1.
@@ -85,7 +87,8 @@ MCCodeEmitter *createSystemZMCCodeEmitter(const MCInstrInfo &MCII,
 
 MCAsmBackend *createSystemZMCAsmBackend(const Target &T,
                                         const MCRegisterInfo &MRI,
-                                        const Triple &TT, StringRef CPU);
+                                        const Triple &TT, StringRef CPU,
+                                        const MCTargetOptions &Options);
 
 MCObjectWriter *createSystemZObjectWriter(raw_pwrite_stream &OS, uint8_t OSABI);
 } // end namespace llvm
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZ.h b/contrib/llvm/lib/Target/SystemZ/SystemZ.h
index c8ea964..9a8e508 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZ.h
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZ.h
@@ -175,6 +175,7 @@ static inline bool isImmHF(uint64_t Val) {
 FunctionPass *createSystemZISelDag(SystemZTargetMachine &TM,
                                    CodeGenOpt::Level OptLevel);
 FunctionPass *createSystemZElimComparePass(SystemZTargetMachine &TM);
+FunctionPass *createSystemZExpandPseudoPass(SystemZTargetMachine &TM);
 FunctionPass *createSystemZShortenInstPass(SystemZTargetMachine &TM);
 FunctionPass *createSystemZLongBranchPass(SystemZTargetMachine &TM);
 FunctionPass *createSystemZLDCleanupPass(SystemZTargetMachine &TM);
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZ.td b/contrib/llvm/lib/Target/SystemZ/SystemZ.td
index d4d636d..6bdfd4d 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZ.td
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZ.td
@@ -14,7 +14,19 @@ include "llvm/Target/Target.td"
 
 //===----------------------------------------------------------------------===//
-// SystemZ supported processors and features
+// SystemZ subtarget features
+//===----------------------------------------------------------------------===//
+
+include "SystemZFeatures.td"
+
+//===----------------------------------------------------------------------===//
+// SystemZ subtarget scheduling models
+//===----------------------------------------------------------------------===//
+
+include "SystemZSchedule.td"
+
+//===----------------------------------------------------------------------===//
+// SystemZ supported processors
 //===----------------------------------------------------------------------===//
 
 include "SystemZProcessors.td"
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp
index 9c0f327..b39245b 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp
@@ -418,10 +418,10 @@ void SystemZAsmPrinter::EmitInstruction(const MachineInstr *MI) {
 
   case SystemZ::Serialize:
     if (MF->getSubtarget<SystemZSubtarget>().hasFastSerialization())
-      LoweredMI = MCInstBuilder(SystemZ::AsmBCR)
+      LoweredMI = MCInstBuilder(SystemZ::BCRAsm)
         .addImm(14).addReg(SystemZ::R0D);
     else
-      LoweredMI = MCInstBuilder(SystemZ::AsmBCR)
+      LoweredMI = MCInstBuilder(SystemZ::BCRAsm)
         .addImm(15).addReg(SystemZ::R0D);
     break;
 
@@ -523,5 +523,5 @@ bool SystemZAsmPrinter::PrintAsmMemoryOperand(const MachineInstr *MI,
 
 // Force static initialization.
 extern "C" void LLVMInitializeSystemZAsmPrinter() {
-  RegisterAsmPrinter<SystemZAsmPrinter> X(TheSystemZTarget);
+  RegisterAsmPrinter<SystemZAsmPrinter> X(getTheSystemZTarget());
 }
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZAsmPrinter.h b/contrib/llvm/lib/Target/SystemZ/SystemZAsmPrinter.h
index 7f6e823..fe8c88f 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZAsmPrinter.h
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZAsmPrinter.h
@@ -27,9 +27,7 @@ public:
       : AsmPrinter(TM, std::move(Streamer)) {}
 
   // Override AsmPrinter.
-  const char *getPassName() const override {
-    return "SystemZ Assembly Printer";
-  }
+  StringRef getPassName() const override { return "SystemZ Assembly Printer"; }
   void EmitInstruction(const MachineInstr *MI) override;
   void EmitMachineConstantPoolValue(MachineConstantPoolValue *MCPV) override;
   bool PrintAsmOperand(const MachineInstr *MI, unsigned OpNo,
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZElimCompare.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZElimCompare.cpp
index 27350b8..b4c843f 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZElimCompare.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZElimCompare.cpp
@@ -28,6 +28,7 @@ using namespace llvm;
 #define DEBUG_TYPE "systemz-elim-compare"
 
 STATISTIC(BranchOnCounts, "Number of branch-on-count instructions");
+STATISTIC(LoadAndTraps, "Number of load-and-trap instructions");
 STATISTIC(EliminatedComparisons, "Number of eliminated comparisons");
 STATISTIC(FusedComparisons, "Number of fused compare-and-branch instructions");
 
@@ -58,7 +59,7 @@ public:
   SystemZElimCompare(const SystemZTargetMachine &tm)
     : MachineFunctionPass(ID), TII(nullptr), TRI(nullptr) {}
 
-  const char *getPassName() const override {
+  StringRef getPassName() const override {
     return "SystemZ Comparison Elimination";
   }
 
@@ -66,13 +67,15 @@ public:
   bool runOnMachineFunction(MachineFunction &F) override;
   MachineFunctionProperties getRequiredProperties() const override {
     return MachineFunctionProperties().set(
-        MachineFunctionProperties::Property::AllVRegsAllocated);
+        MachineFunctionProperties::Property::NoVRegs);
   }
 
 private:
   Reference getRegReferences(MachineInstr &MI, unsigned Reg);
   bool convertToBRCT(MachineInstr &MI, MachineInstr &Compare,
                      SmallVectorImpl<MachineInstr *> &CCUsers);
+  bool convertToLoadAndTrap(MachineInstr &MI, MachineInstr &Compare,
+                            SmallVectorImpl<MachineInstr *> &CCUsers);
   bool convertToLoadAndTest(MachineInstr &MI);
   bool adjustCCMasksForInstr(MachineInstr &MI, MachineInstr &Compare,
                              SmallVectorImpl<MachineInstr *> &CCUsers);
@@ -171,7 +174,7 @@ static unsigned getCompareSourceReg(MachineInstr &Compare) {
 
 // Compare compares the result of MI against zero. If MI is an addition
 // of -1 and if CCUsers is a single branch on nonzero, eliminate the addition
-// and convert the branch to a BRCT(G). Return true on success.
+// and convert the branch to a BRCT(G) or BRCTH. Return true on success.
 bool SystemZElimCompare::convertToBRCT(
     MachineInstr &MI, MachineInstr &Compare,
     SmallVectorImpl<MachineInstr *> &CCUsers) {
@@ -182,6 +185,8 @@ bool SystemZElimCompare::convertToBRCT(
     BRCT = SystemZ::BRCT;
   else if (Opcode == SystemZ::AGHI)
     BRCT = SystemZ::BRCTG;
+  else if (Opcode == SystemZ::AIH)
+    BRCT = SystemZ::BRCTH;
   else
     return false;
   if (MI.getOperand(2).getImm() != -1)
@@ -205,16 +210,61 @@ bool SystemZElimCompare::convertToBRCT(
     if (getRegReferences(*MBBI, SrcReg))
       return false;
 
-  // The transformation is OK. Rebuild Branch as a BRCT(G).
+  // The transformation is OK. Rebuild Branch as a BRCT(G) or BRCTH.
   MachineOperand Target(Branch->getOperand(2));
   while (Branch->getNumOperands())
     Branch->RemoveOperand(0);
   Branch->setDesc(TII->get(BRCT));
+  MachineInstrBuilder MIB(*Branch->getParent()->getParent(), Branch);
+  MIB.addOperand(MI.getOperand(0))
+    .addOperand(MI.getOperand(1))
+    .addOperand(Target);
+  // Add a CC def to BRCT(G), since we may have to split them again if the
+  // branch displacement overflows. BRCTH has a 32-bit displacement, so
+  // this is not necessary there.
+  if (BRCT != SystemZ::BRCTH)
+    MIB.addReg(SystemZ::CC, RegState::ImplicitDefine | RegState::Dead);
+  MI.eraseFromParent();
+  return true;
+}
+
+// Compare compares the result of MI against zero. If MI is a suitable load
+// instruction and if CCUsers is a single conditional trap on zero, eliminate
+// the load and convert the branch to a load-and-trap. Return true on success.
+bool SystemZElimCompare::convertToLoadAndTrap(
+    MachineInstr &MI, MachineInstr &Compare,
+    SmallVectorImpl<MachineInstr *> &CCUsers) {
+  unsigned LATOpcode = TII->getLoadAndTrap(MI.getOpcode());
+  if (!LATOpcode)
+    return false;
+
+  // Check whether we have a single CondTrap that traps on zero.
+  if (CCUsers.size() != 1)
+    return false;
+  MachineInstr *Branch = CCUsers[0];
+  if (Branch->getOpcode() != SystemZ::CondTrap ||
+      Branch->getOperand(0).getImm() != SystemZ::CCMASK_ICMP ||
+      Branch->getOperand(1).getImm() != SystemZ::CCMASK_CMP_EQ)
+    return false;
+
+  // We already know that there are no references to the register between
+  // MI and Compare. Make sure that there are also no references between
+  // Compare and Branch.
+  unsigned SrcReg = getCompareSourceReg(Compare);
+  MachineBasicBlock::iterator MBBI = Compare, MBBE = Branch;
+  for (++MBBI; MBBI != MBBE; ++MBBI)
+    if (getRegReferences(*MBBI, SrcReg))
+      return false;
+
+  // The transformation is OK. Rebuild Branch as a load-and-trap.
+  while (Branch->getNumOperands())
+    Branch->RemoveOperand(0);
+  Branch->setDesc(TII->get(LATOpcode));
   MachineInstrBuilder(*Branch->getParent()->getParent(), Branch)
       .addOperand(MI.getOperand(0))
       .addOperand(MI.getOperand(1))
-      .addOperand(Target)
-      .addReg(SystemZ::CC, RegState::ImplicitDefine | RegState::Dead);
+      .addOperand(MI.getOperand(2))
+      .addOperand(MI.getOperand(3));
   MI.eraseFromParent();
   return true;
 }
@@ -347,11 +397,17 @@ bool SystemZElimCompare::optimizeCompareZero(
     MachineInstr &MI = *MBBI;
     if (resultTests(MI, SrcReg)) {
       // Try to remove both MI and Compare by converting a branch to BRCT(G).
-      // We don't care in this case whether CC is modified between MI and
-      // Compare.
-      if (!CCRefs.Use && !SrcRefs && convertToBRCT(MI, Compare, CCUsers)) {
-        BranchOnCounts += 1;
-        return true;
+      // or a load-and-trap instruction. We don't care in this case whether
+      // CC is modified between MI and Compare.
+      if (!CCRefs.Use && !SrcRefs) {
+        if (convertToBRCT(MI, Compare, CCUsers)) {
+          BranchOnCounts += 1;
+          return true;
+        }
+        if (convertToLoadAndTrap(MI, Compare, CCUsers)) {
+          LoadAndTraps += 1;
+          return true;
+        }
       }
       // Try to eliminate Compare by reusing a CC result from MI.
       if ((!CCRefs && convertToLoadAndTest(MI)) ||
@@ -403,6 +459,9 @@ bool SystemZElimCompare::fuseCompareOperations(
     return false;
 
   // Make sure that the operands are available at the branch.
+  // SrcReg2 is the register if the source operand is a register,
+  // 0 if the source operand is immediate, and the base register
+  // if the source operand is memory (index is not supported).
   unsigned SrcReg = Compare.getOperand(0).getReg();
   unsigned SrcReg2 =
       Compare.getOperand(1).isReg() ? Compare.getOperand(1).getReg() : 0;
@@ -435,11 +494,16 @@ bool SystemZElimCompare::fuseCompareOperations(
     Branch->RemoveOperand(0);
 
   // Rebuild Branch as a fused compare and branch.
+  // SrcNOps is the number of MI operands of the compare instruction
+  // that we need to copy over.
+  unsigned SrcNOps = 2;
+  if (FusedOpcode == SystemZ::CLT || FusedOpcode == SystemZ::CLGT)
+    SrcNOps = 3;
   Branch->setDesc(TII->get(FusedOpcode));
   MachineInstrBuilder MIB(*Branch->getParent()->getParent(), Branch);
-  MIB.addOperand(Compare.getOperand(0))
-      .addOperand(Compare.getOperand(1))
-      .addOperand(CCMask);
+  for (unsigned I = 0; I < SrcNOps; I++)
+    MIB.addOperand(Compare.getOperand(I));
+  MIB.addOperand(CCMask);
 
   if (Type == SystemZII::CompareAndBranch) {
     // Only conditional branches define CC, as they may be converted back
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZExpandPseudo.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZExpandPseudo.cpp
new file mode 100644
index 0000000..92ce808
--- /dev/null
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZExpandPseudo.cpp
@@ -0,0 +1,153 @@
+//==-- SystemZExpandPseudo.cpp - Expand pseudo instructions -------*- C++ -*-=//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains a pass that expands pseudo instructions into target
+// instructions to allow proper scheduling and other late optimizations. This
+// pass should be run after register allocation but before the post-regalloc
+// scheduling pass.
+//
+//===----------------------------------------------------------------------===//
+
+#include "SystemZ.h"
+#include "SystemZInstrInfo.h"
+#include "SystemZSubtarget.h"
+#include "llvm/CodeGen/LivePhysRegs.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+using namespace llvm;
+
+#define SYSTEMZ_EXPAND_PSEUDO_NAME "SystemZ pseudo instruction expansion pass"
+
+namespace llvm {
+  void initializeSystemZExpandPseudoPass(PassRegistry&);
+}
+
+namespace {
+class SystemZExpandPseudo : public MachineFunctionPass {
+public:
+  static char ID;
+  SystemZExpandPseudo() : MachineFunctionPass(ID) {
+    initializeSystemZExpandPseudoPass(*PassRegistry::getPassRegistry());
+  }
+
+  const SystemZInstrInfo *TII;
+
+  bool runOnMachineFunction(MachineFunction &Fn) override;
+
+  StringRef getPassName() const override { return SYSTEMZ_EXPAND_PSEUDO_NAME; }
+
+private:
+  bool expandMBB(MachineBasicBlock &MBB);
+  bool expandMI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+                MachineBasicBlock::iterator &NextMBBI);
+  bool expandLOCRMux(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+                     MachineBasicBlock::iterator &NextMBBI);
+};
+char SystemZExpandPseudo::ID = 0;
+}
+
+INITIALIZE_PASS(SystemZExpandPseudo, "systemz-expand-pseudo",
+                SYSTEMZ_EXPAND_PSEUDO_NAME, false, false)
+
+/// \brief Returns an instance of the pseudo instruction expansion pass.
+FunctionPass *llvm::createSystemZExpandPseudoPass(SystemZTargetMachine &TM) {
+  return new SystemZExpandPseudo();
+}
+
+// MI is a load-register-on-condition pseudo instruction that could not be
+// handled as a single hardware instruction. Replace it by a branch sequence.
+bool SystemZExpandPseudo::expandLOCRMux(MachineBasicBlock &MBB, + MachineBasicBlock::iterator MBBI, + MachineBasicBlock::iterator &NextMBBI) { + MachineFunction &MF = *MBB.getParent(); + const BasicBlock *BB = MBB.getBasicBlock(); + MachineInstr &MI = *MBBI; + DebugLoc DL = MI.getDebugLoc(); + unsigned DestReg = MI.getOperand(0).getReg(); + unsigned SrcReg = MI.getOperand(2).getReg(); + unsigned CCValid = MI.getOperand(3).getImm(); + unsigned CCMask = MI.getOperand(4).getImm(); + + LivePhysRegs LiveRegs(&TII->getRegisterInfo()); + LiveRegs.addLiveOuts(MBB); + for (auto I = std::prev(MBB.end()); I != MBBI; --I) + LiveRegs.stepBackward(*I); + + // Splice MBB at MI, moving the rest of the block into RestMBB. + MachineBasicBlock *RestMBB = MF.CreateMachineBasicBlock(BB); + MF.insert(std::next(MachineFunction::iterator(MBB)), RestMBB); + RestMBB->splice(RestMBB->begin(), &MBB, MI, MBB.end()); + RestMBB->transferSuccessors(&MBB); + for (auto I = LiveRegs.begin(); I != LiveRegs.end(); ++I) + RestMBB->addLiveIn(*I); + + // Create a new block MoveMBB to hold the move instruction. + MachineBasicBlock *MoveMBB = MF.CreateMachineBasicBlock(BB); + MF.insert(std::next(MachineFunction::iterator(MBB)), MoveMBB); + MoveMBB->addLiveIn(SrcReg); + for (auto I = LiveRegs.begin(); I != LiveRegs.end(); ++I) + MoveMBB->addLiveIn(*I); + + // At the end of MBB, create a conditional branch to RestMBB if the + // condition is false, otherwise fall through to MoveMBB. + BuildMI(&MBB, DL, TII->get(SystemZ::BRC)) + .addImm(CCValid).addImm(CCMask ^ CCValid).addMBB(RestMBB); + MBB.addSuccessor(RestMBB); + MBB.addSuccessor(MoveMBB); + + // In MoveMBB, emit an instruction to move SrcReg into DestReg, + // then fall through to RestMBB. 
+ TII->copyPhysReg(*MoveMBB, MoveMBB->end(), DL, DestReg, SrcReg, + MI.getOperand(2).isKill()); + MoveMBB->addSuccessor(RestMBB); + + NextMBBI = MBB.end(); + MI.eraseFromParent(); + return true; +} + +/// \brief If MBBI references a pseudo instruction that should be expanded here, +/// do the expansion and return true. Otherwise return false. +bool SystemZExpandPseudo::expandMI(MachineBasicBlock &MBB, + MachineBasicBlock::iterator MBBI, + MachineBasicBlock::iterator &NextMBBI) { + MachineInstr &MI = *MBBI; + switch (MI.getOpcode()) { + case SystemZ::LOCRMux: + return expandLOCRMux(MBB, MBBI, NextMBBI); + default: + break; + } + return false; +} + +/// \brief Iterate over the instructions in basic block MBB and expand any +/// pseudo instructions. Return true if anything was modified. +bool SystemZExpandPseudo::expandMBB(MachineBasicBlock &MBB) { + bool Modified = false; + + MachineBasicBlock::iterator MBBI = MBB.begin(), E = MBB.end(); + while (MBBI != E) { + MachineBasicBlock::iterator NMBBI = std::next(MBBI); + Modified |= expandMI(MBB, MBBI, NMBBI); + MBBI = NMBBI; + } + + return Modified; +} + +bool SystemZExpandPseudo::runOnMachineFunction(MachineFunction &MF) { + TII = static_cast<const SystemZInstrInfo *>(MF.getSubtarget().getInstrInfo()); + + bool Modified = false; + for (auto &MBB : MF) + Modified |= expandMBB(MBB); + return Modified; +} + diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZFeatures.td b/contrib/llvm/lib/Target/SystemZ/SystemZFeatures.td new file mode 100644 index 0000000..716e5ad --- /dev/null +++ b/contrib/llvm/lib/Target/SystemZ/SystemZFeatures.td @@ -0,0 +1,171 @@ +//===-- SystemZ.td - SystemZ processors and features ---------*- tblgen -*-===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// Feature definitions. 
+// +//===----------------------------------------------------------------------===// + +class SystemZFeature<string extname, string intname, string desc> + : Predicate<"Subtarget->has"##intname##"()">, + AssemblerPredicate<"Feature"##intname, extname>, + SubtargetFeature<extname, "Has"##intname, "true", desc>; + +class SystemZMissingFeature<string intname> + : Predicate<"!Subtarget->has"##intname##"()">; + +class SystemZFeatureList<list<SystemZFeature> x> { + list<SystemZFeature> List = x; +} + +class SystemZFeatureAdd<list<SystemZFeature> x, list<SystemZFeature> y> + : SystemZFeatureList<!listconcat(x, y)>; + +//===----------------------------------------------------------------------===// +// +// New features added in the Ninth Edition of the z/Architecture +// +//===----------------------------------------------------------------------===// + +def FeatureDistinctOps : SystemZFeature< + "distinct-ops", "DistinctOps", + "Assume that the distinct-operands facility is installed" +>; + +def FeatureFastSerialization : SystemZFeature< + "fast-serialization", "FastSerialization", + "Assume that the fast-serialization facility is installed" +>; + +def FeatureFPExtension : SystemZFeature< + "fp-extension", "FPExtension", + "Assume that the floating-point extension facility is installed" +>; + +def FeatureHighWord : SystemZFeature< + "high-word", "HighWord", + "Assume that the high-word facility is installed" +>; + +def FeatureInterlockedAccess1 : SystemZFeature< + "interlocked-access1", "InterlockedAccess1", + "Assume that interlocked-access facility 1 is installed" +>; +def FeatureNoInterlockedAccess1 : SystemZMissingFeature<"InterlockedAccess1">; + +def FeatureLoadStoreOnCond : SystemZFeature< + "load-store-on-cond", "LoadStoreOnCond", + "Assume that the load/store-on-condition facility is installed" +>; + +def FeaturePopulationCount : SystemZFeature< + "population-count", "PopulationCount", + "Assume that the population-count facility is installed" +>; + +def 
Arch9NewFeatures : SystemZFeatureList<[ + FeatureDistinctOps, + FeatureFastSerialization, + FeatureFPExtension, + FeatureHighWord, + FeatureInterlockedAccess1, + FeatureLoadStoreOnCond, + FeaturePopulationCount +]>; + +//===----------------------------------------------------------------------===// +// +// New features added in the Tenth Edition of the z/Architecture +// +//===----------------------------------------------------------------------===// + +def FeatureExecutionHint : SystemZFeature< + "execution-hint", "ExecutionHint", + "Assume that the execution-hint facility is installed" +>; + +def FeatureLoadAndTrap : SystemZFeature< + "load-and-trap", "LoadAndTrap", + "Assume that the load-and-trap facility is installed" +>; + +def FeatureMiscellaneousExtensions : SystemZFeature< + "miscellaneous-extensions", "MiscellaneousExtensions", + "Assume that the miscellaneous-extensions facility is installed" +>; + +def FeatureProcessorAssist : SystemZFeature< + "processor-assist", "ProcessorAssist", + "Assume that the processor-assist facility is installed" +>; + +def FeatureTransactionalExecution : SystemZFeature< + "transactional-execution", "TransactionalExecution", + "Assume that the transactional-execution facility is installed" +>; + +def Arch10NewFeatures : SystemZFeatureList<[ + FeatureExecutionHint, + FeatureLoadAndTrap, + FeatureMiscellaneousExtensions, + FeatureProcessorAssist, + FeatureTransactionalExecution +]>; + +//===----------------------------------------------------------------------===// +// +// New features added in the Eleventh Edition of the z/Architecture +// +//===----------------------------------------------------------------------===// + +def FeatureLoadAndZeroRightmostByte : SystemZFeature< + "load-and-zero-rightmost-byte", "LoadAndZeroRightmostByte", + "Assume that the load-and-zero-rightmost-byte facility is installed" +>; + +def FeatureLoadStoreOnCond2 : SystemZFeature< + "load-store-on-cond-2", "LoadStoreOnCond2", + "Assume that the 
load/store-on-condition facility 2 is installed" +>; + +def FeatureVector : SystemZFeature< + "vector", "Vector", + "Assume that the vectory facility is installed" +>; +def FeatureNoVector : SystemZMissingFeature<"Vector">; + +def Arch11NewFeatures : SystemZFeatureList<[ + FeatureLoadAndZeroRightmostByte, + FeatureLoadStoreOnCond2, + FeatureVector +]>; + +//===----------------------------------------------------------------------===// +// +// Cumulative supported and unsupported feature sets +// +//===----------------------------------------------------------------------===// + +def Arch8SupportedFeatures + : SystemZFeatureList<[]>; +def Arch9SupportedFeatures + : SystemZFeatureAdd<Arch8SupportedFeatures.List, Arch9NewFeatures.List>; +def Arch10SupportedFeatures + : SystemZFeatureAdd<Arch9SupportedFeatures.List, Arch10NewFeatures.List>; +def Arch11SupportedFeatures + : SystemZFeatureAdd<Arch10SupportedFeatures.List, Arch11NewFeatures.List>; + +def Arch11UnsupportedFeatures + : SystemZFeatureList<[]>; +def Arch10UnsupportedFeatures + : SystemZFeatureAdd<Arch11UnsupportedFeatures.List, Arch11NewFeatures.List>; +def Arch9UnsupportedFeatures + : SystemZFeatureAdd<Arch10UnsupportedFeatures.List, Arch10NewFeatures.List>; +def Arch8UnsupportedFeatures + : SystemZFeatureAdd<Arch9UnsupportedFeatures.List, Arch9NewFeatures.List>; + diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp index ccaed49..a28a91e 100644 --- a/contrib/llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp +++ b/contrib/llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp @@ -67,7 +67,7 @@ void SystemZFrameLowering::determineCalleeSaves(MachineFunction &MF, RegScavenger *RS) const { TargetFrameLowering::determineCalleeSaves(MF, SavedRegs, RS); - MachineFrameInfo *MFFrame = MF.getFrameInfo(); + MachineFrameInfo &MFFrame = MF.getFrameInfo(); const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo(); bool HasFP = hasFP(MF); 
SystemZMachineFunctionInfo *MFI = MF.getInfo<SystemZMachineFunctionInfo>(); @@ -82,7 +82,7 @@ void SystemZFrameLowering::determineCalleeSaves(MachineFunction &MF, SavedRegs.set(SystemZ::ArgGPRs[I]); // If there are any landing pads, entering them will modify r6/r7. - if (!MF.getMMI().getLandingPads().empty()) { + if (!MF.getLandingPads().empty()) { SavedRegs.set(SystemZ::R6D); SavedRegs.set(SystemZ::R7D); } @@ -94,7 +94,7 @@ void SystemZFrameLowering::determineCalleeSaves(MachineFunction &MF, // If the function calls other functions, record that the return // address register will be clobbered. - if (MFFrame->hasCalls()) + if (MFFrame.hasCalls()) SavedRegs.set(SystemZ::R14D); // If we are saving GPRs other than the stack pointer, we might as well @@ -276,16 +276,16 @@ restoreCalleeSavedRegisters(MachineBasicBlock &MBB, void SystemZFrameLowering:: processFunctionBeforeFrameFinalized(MachineFunction &MF, RegScavenger *RS) const { - MachineFrameInfo *MFFrame = MF.getFrameInfo(); - uint64_t MaxReach = (MFFrame->estimateStackSize(MF) + + MachineFrameInfo &MFFrame = MF.getFrameInfo(); + uint64_t MaxReach = (MFFrame.estimateStackSize(MF) + SystemZMC::CallFrameSize * 2); if (!isUInt<12>(MaxReach)) { // We may need register scavenging slots if some parts of the frame // are outside the reach of an unsigned 12-bit displacement. // Create 2 for the case where both addresses in an MVC are // out of range. 
- RS->addScavengingFrameIndex(MFFrame->CreateStackObject(8, 8, false)); - RS->addScavengingFrameIndex(MFFrame->CreateStackObject(8, 8, false)); + RS->addScavengingFrameIndex(MFFrame.CreateStackObject(8, 8, false)); + RS->addScavengingFrameIndex(MFFrame.CreateStackObject(8, 8, false)); } } @@ -321,14 +321,14 @@ static void emitIncrement(MachineBasicBlock &MBB, void SystemZFrameLowering::emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const { assert(&MF.front() == &MBB && "Shrink-wrapping not yet supported"); - MachineFrameInfo *MFFrame = MF.getFrameInfo(); + MachineFrameInfo &MFFrame = MF.getFrameInfo(); auto *ZII = static_cast<const SystemZInstrInfo *>(MF.getSubtarget().getInstrInfo()); SystemZMachineFunctionInfo *ZFI = MF.getInfo<SystemZMachineFunctionInfo>(); MachineBasicBlock::iterator MBBI = MBB.begin(); MachineModuleInfo &MMI = MF.getMMI(); const MCRegisterInfo *MRI = MMI.getContext().getRegisterInfo(); - const std::vector<CalleeSavedInfo> &CSI = MFFrame->getCalleeSavedInfo(); + const std::vector<CalleeSavedInfo> &CSI = MFFrame.getCalleeSavedInfo(); bool HasFP = hasFP(MF); // Debug location must be unknown since the first debug location is used @@ -350,7 +350,7 @@ void SystemZFrameLowering::emitPrologue(MachineFunction &MF, unsigned Reg = Save.getReg(); if (SystemZ::GR64BitRegClass.contains(Reg)) { int64_t Offset = SPOffsetFromCFA + RegSpillOffsets[Reg]; - unsigned CFIIndex = MMI.addFrameInst(MCCFIInstruction::createOffset( + unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset( nullptr, MRI->getDwarfRegNum(Reg, true), Offset)); BuildMI(MBB, MBBI, DL, ZII->get(TargetOpcode::CFI_INSTRUCTION)) .addCFIIndex(CFIIndex); @@ -374,7 +374,7 @@ void SystemZFrameLowering::emitPrologue(MachineFunction &MF, emitIncrement(MBB, MBBI, DL, SystemZ::R15D, Delta, ZII); // Add CFI for the allocation. 
- unsigned CFIIndex = MMI.addFrameInst( + unsigned CFIIndex = MF.addFrameInst( MCCFIInstruction::createDefCfaOffset(nullptr, SPOffsetFromCFA + Delta)); BuildMI(MBB, MBBI, DL, ZII->get(TargetOpcode::CFI_INSTRUCTION)) .addCFIIndex(CFIIndex); @@ -392,7 +392,7 @@ void SystemZFrameLowering::emitPrologue(MachineFunction &MF, // Add CFI for the new frame location. unsigned HardFP = MRI->getDwarfRegNum(SystemZ::R11D, true); - unsigned CFIIndex = MMI.addFrameInst( + unsigned CFIIndex = MF.addFrameInst( MCCFIInstruction::createDefCfaRegister(nullptr, HardFP)); BuildMI(MBB, MBBI, DL, ZII->get(TargetOpcode::CFI_INSTRUCTION)) .addCFIIndex(CFIIndex); @@ -422,7 +422,7 @@ void SystemZFrameLowering::emitPrologue(MachineFunction &MF, int64_t Offset = getFrameIndexReference(MF, Save.getFrameIdx(), IgnoredFrameReg); - unsigned CFIIndex = MMI.addFrameInst(MCCFIInstruction::createOffset( + unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset( nullptr, DwarfReg, SPOffsetFromCFA + Offset)); CFIIndexes.push_back(CFIIndex); } @@ -478,14 +478,14 @@ void SystemZFrameLowering::emitEpilogue(MachineFunction &MF, bool SystemZFrameLowering::hasFP(const MachineFunction &MF) const { return (MF.getTarget().Options.DisableFramePointerElim(MF) || - MF.getFrameInfo()->hasVarSizedObjects() || + MF.getFrameInfo().hasVarSizedObjects() || MF.getInfo<SystemZMachineFunctionInfo>()->getManipulatesSP()); } int SystemZFrameLowering::getFrameIndexReference(const MachineFunction &MF, int FI, unsigned &FrameReg) const { - const MachineFrameInfo *MFFrame = MF.getFrameInfo(); + const MachineFrameInfo &MFFrame = MF.getFrameInfo(); const TargetRegisterInfo *RI = MF.getSubtarget().getRegisterInfo(); // Fill in FrameReg output argument. @@ -494,8 +494,8 @@ int SystemZFrameLowering::getFrameIndexReference(const MachineFunction &MF, // Start with the offset of FI from the top of the caller-allocated frame // (i.e. the top of the 160 bytes allocated by the caller). This initial // offset is therefore negative. 
- int64_t Offset = (MFFrame->getObjectOffset(FI) + - MFFrame->getOffsetAdjustment()); + int64_t Offset = (MFFrame.getObjectOffset(FI) + + MFFrame.getOffsetAdjustment()); // Make the offset relative to the incoming stack pointer. Offset -= getOffsetOfLocalArea(); @@ -508,15 +508,15 @@ int SystemZFrameLowering::getFrameIndexReference(const MachineFunction &MF, uint64_t SystemZFrameLowering:: getAllocatedStackSize(const MachineFunction &MF) const { - const MachineFrameInfo *MFFrame = MF.getFrameInfo(); + const MachineFrameInfo &MFFrame = MF.getFrameInfo(); // Start with the size of the local variables and spill slots. - uint64_t StackSize = MFFrame->getStackSize(); + uint64_t StackSize = MFFrame.getStackSize(); // We need to allocate the ABI-defined 160-byte base area whenever // we allocate stack space for our own use and whenever we call another // function. - if (StackSize || MFFrame->hasVarSizedObjects() || MFFrame->hasCalls()) + if (StackSize || MFFrame.hasVarSizedObjects() || MFFrame.hasCalls()) StackSize += SystemZMC::CallFrameSize; return StackSize; diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZHazardRecognizer.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZHazardRecognizer.cpp new file mode 100644 index 0000000..fe4b52b --- /dev/null +++ b/contrib/llvm/lib/Target/SystemZ/SystemZHazardRecognizer.cpp @@ -0,0 +1,337 @@ +//=-- SystemZHazardRecognizer.h - SystemZ Hazard Recognizer -----*- C++ -*-===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This file defines a hazard recognizer for the SystemZ scheduler. +// +// This class is used by the SystemZ scheduling strategy to maintain +// the state during scheduling, and provide cost functions for +// scheduling candidates. This includes: +// +// * Decoder grouping. 
A decoder group can maximally hold 3 uops, and +// instructions that always begin a new group should be scheduled when +// the current decoder group is empty. +// * Processor resources usage. It is beneficial to balance the use of +// resources. +// +// ===---------------------------------------------------------------------===// + +#include "SystemZHazardRecognizer.h" +#include "llvm/ADT/Statistic.h" + +using namespace llvm; + +#define DEBUG_TYPE "misched" + +// This is the limit of processor resource usage at which the +// scheduler should try to look for other instructions (not using the +// critical resource). +static cl::opt<int> ProcResCostLim("procres-cost-lim", cl::Hidden, + cl::desc("The OOO window for processor " + "resources during scheduling."), + cl::init(8)); + +SystemZHazardRecognizer:: +SystemZHazardRecognizer(const MachineSchedContext *C) : DAG(nullptr), + SchedModel(nullptr) {} + +unsigned SystemZHazardRecognizer:: +getNumDecoderSlots(SUnit *SU) const { + const MCSchedClassDesc *SC = DAG->getSchedClass(SU); + if (!SC->isValid()) + return 0; // IMPLICIT_DEF / KILL -- will not make impact in output. + + if (SC->BeginGroup) { + if (!SC->EndGroup) + return 2; // Cracked instruction + else + return 3; // Expanded/group-alone instruction + } + + return 1; // Normal instruction +} + +unsigned SystemZHazardRecognizer::getCurrCycleIdx() { + unsigned Idx = CurrGroupSize; + if (GrpCount % 2) + Idx += 3; + return Idx; +} + +ScheduleHazardRecognizer::HazardType SystemZHazardRecognizer:: +getHazardType(SUnit *m, int Stalls) { + return (fitsIntoCurrentGroup(m) ? 
NoHazard : Hazard); +} + +void SystemZHazardRecognizer::Reset() { + CurrGroupSize = 0; + clearProcResCounters(); + GrpCount = 0; + LastFPdOpCycleIdx = UINT_MAX; + DEBUG(CurGroupDbg = "";); +} + +bool +SystemZHazardRecognizer::fitsIntoCurrentGroup(SUnit *SU) const { + const MCSchedClassDesc *SC = DAG->getSchedClass(SU); + if (!SC->isValid()) + return true; + + // A cracked instruction only fits into schedule if the current + // group is empty. + if (SC->BeginGroup) + return (CurrGroupSize == 0); + + // Since a full group is handled immediately in EmitInstruction(), + // SU should fit into current group. NumSlots should be 1 or 0, + // since it is not a cracked or expanded instruction. + assert ((getNumDecoderSlots(SU) <= 1) && (CurrGroupSize < 3) && + "Expected normal instruction to fit in non-full group!"); + + return true; +} + +void SystemZHazardRecognizer::nextGroup(bool DbgOutput) { + if (CurrGroupSize > 0) { + DEBUG(dumpCurrGroup("Completed decode group")); + DEBUG(CurGroupDbg = "";); + + GrpCount++; + + // Reset counter for next group. + CurrGroupSize = 0; + + // Decrease counters for execution units by one. + for (unsigned i = 0; i < SchedModel->getNumProcResourceKinds(); ++i) + if (ProcResourceCounters[i] > 0) + ProcResourceCounters[i]--; + + // Clear CriticalResourceIdx if it is now below the threshold. 
+ if (CriticalResourceIdx != UINT_MAX && + (ProcResourceCounters[CriticalResourceIdx] <= + ProcResCostLim)) + CriticalResourceIdx = UINT_MAX; + } + + DEBUG(if (DbgOutput) + dumpProcResourceCounters();); +} + +#ifndef NDEBUG // Debug output +void SystemZHazardRecognizer::dumpSU(SUnit *SU, raw_ostream &OS) const { + OS << "SU(" << SU->NodeNum << "):"; + OS << SchedModel->getInstrInfo()->getName(SU->getInstr()->getOpcode()); + + const MCSchedClassDesc *SC = DAG->getSchedClass(SU); + if (!SC->isValid()) + return; + + for (TargetSchedModel::ProcResIter + PI = SchedModel->getWriteProcResBegin(SC), + PE = SchedModel->getWriteProcResEnd(SC); PI != PE; ++PI) { + const MCProcResourceDesc &PRD = + *SchedModel->getProcResource(PI->ProcResourceIdx); + std::string FU(PRD.Name); + // trim e.g. Z13_FXaUnit -> FXa + FU = FU.substr(FU.find("_") + 1); + FU.resize(FU.find("Unit")); + OS << "/" << FU; + + if (PI->Cycles > 1) + OS << "(" << PI->Cycles << "cyc)"; + } + + if (SC->NumMicroOps > 1) + OS << "/" << SC->NumMicroOps << "uops"; + if (SC->BeginGroup && SC->EndGroup) + OS << "/GroupsAlone"; + else if (SC->BeginGroup) + OS << "/BeginsGroup"; + else if (SC->EndGroup) + OS << "/EndsGroup"; + if (SU->isUnbuffered) + OS << "/Unbuffered"; +} + +void SystemZHazardRecognizer::dumpCurrGroup(std::string Msg) const { + dbgs() << "+++ " << Msg; + dbgs() << ": "; + + if (CurGroupDbg.empty()) + dbgs() << " <empty>\n"; + else { + dbgs() << "{ " << CurGroupDbg << " }"; + dbgs() << " (" << CurrGroupSize << " decoder slot" + << (CurrGroupSize > 1 ? 
"s":"") + << ")\n"; + } +} + +void SystemZHazardRecognizer::dumpProcResourceCounters() const { + bool any = false; + + for (unsigned i = 0; i < SchedModel->getNumProcResourceKinds(); ++i) + if (ProcResourceCounters[i] > 0) { + any = true; + break; + } + + if (!any) + return; + + dbgs() << "+++ Resource counters:\n"; + for (unsigned i = 0; i < SchedModel->getNumProcResourceKinds(); ++i) + if (ProcResourceCounters[i] > 0) { + dbgs() << "+++ Extra schedule for execution unit " + << SchedModel->getProcResource(i)->Name + << ": " << ProcResourceCounters[i] << "\n"; + any = true; + } +} +#endif //NDEBUG + +void SystemZHazardRecognizer::clearProcResCounters() { + ProcResourceCounters.assign(SchedModel->getNumProcResourceKinds(), 0); + CriticalResourceIdx = UINT_MAX; +} + +// Update state with SU as the next scheduled unit. +void SystemZHazardRecognizer:: +EmitInstruction(SUnit *SU) { + const MCSchedClassDesc *SC = DAG->getSchedClass(SU); + DEBUG( dumpCurrGroup("Decode group before emission");); + + // If scheduling an SU that must begin a new decoder group, move on + // to next group. + if (!fitsIntoCurrentGroup(SU)) + nextGroup(); + + DEBUG( dbgs() << "+++ HazardRecognizer emitting "; dumpSU(SU, dbgs()); + dbgs() << "\n"; + raw_string_ostream cgd(CurGroupDbg); + if (CurGroupDbg.length()) + cgd << ", "; + dumpSU(SU, cgd);); + + // After returning from a call, we don't know much about the state. + if (SU->getInstr()->isCall()) { + DEBUG (dbgs() << "+++ Clearing state after call.\n";); + clearProcResCounters(); + LastFPdOpCycleIdx = UINT_MAX; + CurrGroupSize += getNumDecoderSlots(SU); + assert (CurrGroupSize <= 3); + nextGroup(); + return; + } + + // Increase counter for execution unit(s). + for (TargetSchedModel::ProcResIter + PI = SchedModel->getWriteProcResBegin(SC), + PE = SchedModel->getWriteProcResEnd(SC); PI != PE; ++PI) { + // Don't handle FPd together with the other resources. 
+ if (SchedModel->getProcResource(PI->ProcResourceIdx)->BufferSize == 1) + continue; + int &CurrCounter = + ProcResourceCounters[PI->ProcResourceIdx]; + CurrCounter += PI->Cycles; + // Check if this is now the new critical resource. + if ((CurrCounter > ProcResCostLim) && + (CriticalResourceIdx == UINT_MAX || + (PI->ProcResourceIdx != CriticalResourceIdx && + CurrCounter > + ProcResourceCounters[CriticalResourceIdx]))) { + DEBUG( dbgs() << "+++ New critical resource: " + << SchedModel->getProcResource(PI->ProcResourceIdx)->Name + << "\n";); + CriticalResourceIdx = PI->ProcResourceIdx; + } + } + + // Make note of an instruction that uses a blocking resource (FPd). + if (SU->isUnbuffered) { + LastFPdOpCycleIdx = getCurrCycleIdx(); + DEBUG (dbgs() << "+++ Last FPd cycle index: " + << LastFPdOpCycleIdx << "\n";); + } + + // Insert SU into current group by increasing number of slots used + // in current group. + CurrGroupSize += getNumDecoderSlots(SU); + assert (CurrGroupSize <= 3); + + // Check if current group is now full/ended. If so, move on to next + // group to be ready to evaluate more candidates. + if (CurrGroupSize == 3 || SC->EndGroup) + nextGroup(); +} + +int SystemZHazardRecognizer::groupingCost(SUnit *SU) const { + const MCSchedClassDesc *SC = DAG->getSchedClass(SU); + if (!SC->isValid()) + return 0; + + // If SU begins new group, it can either break a current group early + // or fit naturally if current group is empty (negative cost). + if (SC->BeginGroup) { + if (CurrGroupSize) + return 3 - CurrGroupSize; + return -1; + } + + // Similarly, a group-ending SU may either fit well (last in group), or + // end the group prematurely. + if (SC->EndGroup) { + unsigned resultingGroupSize = + (CurrGroupSize + getNumDecoderSlots(SU)); + if (resultingGroupSize < 3) + return (3 - resultingGroupSize); + return -1; + } + + // Most instructions can be placed in any decoder slot. 
+ return 0; +} + +bool SystemZHazardRecognizer::isFPdOpPreferred_distance(const SUnit *SU) { + assert (SU->isUnbuffered); + // If this is the first FPd op, it should be scheduled high. + if (LastFPdOpCycleIdx == UINT_MAX) + return true; + // If this is not the first PFd op, it should go into the other side + // of the processor to use the other FPd unit there. This should + // generally happen if two FPd ops are placed with 2 other + // instructions between them (modulo 6). + if (LastFPdOpCycleIdx > getCurrCycleIdx()) + return ((LastFPdOpCycleIdx - getCurrCycleIdx()) == 3); + return ((getCurrCycleIdx() - LastFPdOpCycleIdx) == 3); +} + +int SystemZHazardRecognizer:: +resourcesCost(SUnit *SU) { + int Cost = 0; + + const MCSchedClassDesc *SC = DAG->getSchedClass(SU); + if (!SC->isValid()) + return 0; + + // For a FPd op, either return min or max value as indicated by the + // distance to any prior FPd op. + if (SU->isUnbuffered) + Cost = (isFPdOpPreferred_distance(SU) ? INT_MIN : INT_MAX); + // For other instructions, give a cost to the use of the critical resource. + else if (CriticalResourceIdx != UINT_MAX) { + for (TargetSchedModel::ProcResIter + PI = SchedModel->getWriteProcResBegin(SC), + PE = SchedModel->getWriteProcResEnd(SC); PI != PE; ++PI) + if (PI->ProcResourceIdx == CriticalResourceIdx) + Cost = PI->Cycles; + } + + return Cost; +} + diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZHazardRecognizer.h b/contrib/llvm/lib/Target/SystemZ/SystemZHazardRecognizer.h new file mode 100644 index 0000000..8fa54ee --- /dev/null +++ b/contrib/llvm/lib/Target/SystemZ/SystemZHazardRecognizer.h @@ -0,0 +1,128 @@ +//=-- SystemZHazardRecognizer.h - SystemZ Hazard Recognizer -----*- C++ -*-===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. 
+// +//===----------------------------------------------------------------------===// +// +// This file declares a hazard recognizer for the SystemZ scheduler. +// +// This class is used by the SystemZ scheduling strategy to maintain +// the state during scheduling, and provide cost functions for +// scheduling candidates. This includes: +// +// * Decoder grouping. A decoder group can maximally hold 3 uops, and +// instructions that always begin a new group should be scheduled when +// the current decoder group is empty. +// * Processor resources usage. It is beneficial to balance the use of +// resources. +// +// ===---------------------------------------------------------------------===// + +#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H +#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H + +#include "SystemZSubtarget.h" +#include "llvm/CodeGen/MachineFunction.h" +#include "llvm/CodeGen/MachineScheduler.h" +#include "llvm/CodeGen/ScheduleHazardRecognizer.h" +#include "llvm/ADT/SmallVector.h" +#include "llvm/MC/MCInstrDesc.h" +#include "llvm/Support/raw_ostream.h" +#include <string> + +namespace llvm { + +/// SystemZHazardRecognizer maintains the state during scheduling. +class SystemZHazardRecognizer : public ScheduleHazardRecognizer { + + ScheduleDAGMI *DAG; + const TargetSchedModel *SchedModel; + + /// Keep track of the number of decoder slots used in the current + /// decoder group. + unsigned CurrGroupSize; + + /// The tracking of resources here are quite similar to the common + /// code use of a critical resource. However, z13 differs in the way + /// that it has two processor sides which may be interesting to + /// model in the future (a work in progress). + + /// Counters for the number of uops scheduled per processor + /// resource. + SmallVector<int, 0> ProcResourceCounters; + + /// This is the resource with the greatest queue, which the + /// scheduler tries to avoid. 
+ unsigned CriticalResourceIdx; + + /// Return the number of decoder slots MI requires. + inline unsigned getNumDecoderSlots(SUnit *SU) const; + + /// Return true if MI fits into current decoder group. + bool fitsIntoCurrentGroup(SUnit *SU) const; + + /// Two decoder groups per cycle are formed (for z13), meaning 2x3 + /// instructions. This function returns a number between 0 and 5, + /// representing the current decoder slot of the current cycle. + unsigned getCurrCycleIdx(); + + /// LastFPdOpCycleIdx stores the numbeer returned by getCurrCycleIdx() + /// when a stalling operation is scheduled (which uses the FPd resource). + unsigned LastFPdOpCycleIdx; + + /// A counter of decoder groups scheduled. + unsigned GrpCount; + + unsigned getCurrGroupSize() {return CurrGroupSize;}; + + /// Start next decoder group. + void nextGroup(bool DbgOutput = true); + + /// Clear all counters for processor resources. + void clearProcResCounters(); + + /// With the goal of alternating processor sides for stalling (FPd) + /// ops, return true if it seems good to schedule an FPd op next. + bool isFPdOpPreferred_distance(const SUnit *SU); + +public: + SystemZHazardRecognizer(const MachineSchedContext *C); + + void setDAG(ScheduleDAGMI *dag) { + DAG = dag; + SchedModel = dag->getSchedModel(); + } + + HazardType getHazardType(SUnit *m, int Stalls = 0) override; + void Reset() override; + void EmitInstruction(SUnit *SU) override; + + // Cost functions used by SystemZPostRASchedStrategy while + // evaluating candidates. + + /// Return the cost of decoder grouping for SU. If SU must start a + /// new decoder group, this is negative if this fits the schedule or + /// positive if it would mean ending a group prematurely. For normal + /// instructions this returns 0. + int groupingCost(SUnit *SU) const; + + /// Return the cost of SU in regards to processor resources usage. 
+ /// A positive value means it would be better to wait with SU, while + /// a negative value means it would be good to schedule SU next. + int resourcesCost(SUnit *SU); + +#ifndef NDEBUG + // Debug dumping. + std::string CurGroupDbg; // current group as text + void dumpSU(SUnit *SU, raw_ostream &OS) const; + void dumpCurrGroup(std::string Msg = "") const; + void dumpProcResourceCounters() const; +#endif +}; + +} // namespace llvm + +#endif /* LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H */ diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZISelDAGToDAG.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZISelDAGToDAG.cpp index cd7fcc3..920b6e4 100644 --- a/contrib/llvm/lib/Target/SystemZ/SystemZISelDAGToDAG.cpp +++ b/contrib/llvm/lib/Target/SystemZ/SystemZISelDAGToDAG.cpp @@ -117,7 +117,7 @@ static uint64_t allOnes(unsigned int Count) { // case the result will be truncated as part of the operation). struct RxSBGOperands { RxSBGOperands(unsigned Op, SDValue N) - : Opcode(Op), BitSize(N.getValueType().getSizeInBits()), + : Opcode(Op), BitSize(N.getValueSizeInBits()), Mask(allOnes(BitSize)), Input(N), Start(64 - BitSize), End(63), Rotate(0) {} @@ -339,7 +339,7 @@ public: } // Override MachineFunctionPass. - const char *getPassName() const override { + StringRef getPassName() const override { return "SystemZ DAG->DAG Pattern Instruction Selection"; } @@ -709,7 +709,7 @@ bool SystemZDAGToDAGISel::detectOrAndInsertion(SDValue &Op, // It's only an insertion if all bits are covered or are known to be zero. // The inner check covers all cases but is more expensive. 
- uint64_t Used = allOnes(Op.getValueType().getSizeInBits()); + uint64_t Used = allOnes(Op.getValueSizeInBits()); if (Used != (AndMask | InsertMask)) { APInt KnownZero, KnownOne; CurDAG->computeKnownBits(Op.getOperand(0), KnownZero, KnownOne); @@ -749,7 +749,7 @@ bool SystemZDAGToDAGISel::expandRxSBG(RxSBGOperands &RxSBG) const { case ISD::TRUNCATE: { if (RxSBG.Opcode == SystemZ::RNSBG) return false; - uint64_t BitSize = N.getValueType().getSizeInBits(); + uint64_t BitSize = N.getValueSizeInBits(); uint64_t Mask = allOnes(BitSize); if (!refineRxSBGMask(RxSBG, Mask)) return false; @@ -825,19 +825,19 @@ bool SystemZDAGToDAGISel::expandRxSBG(RxSBGOperands &RxSBG) const { case ISD::ZERO_EXTEND: if (RxSBG.Opcode != SystemZ::RNSBG) { // Restrict the mask to the extended operand. - unsigned InnerBitSize = N.getOperand(0).getValueType().getSizeInBits(); + unsigned InnerBitSize = N.getOperand(0).getValueSizeInBits(); if (!refineRxSBGMask(RxSBG, allOnes(InnerBitSize))) return false; RxSBG.Input = N.getOperand(0); return true; } - // Fall through. + LLVM_FALLTHROUGH; case ISD::SIGN_EXTEND: { // Check that the extension bits are don't-care (i.e. are masked out // by the final mask). 
- unsigned InnerBitSize = N.getOperand(0).getValueType().getSizeInBits(); + unsigned InnerBitSize = N.getOperand(0).getValueSizeInBits(); if (maskMatters(RxSBG, allOnes(RxSBG.BitSize) - allOnes(InnerBitSize))) return false; @@ -851,7 +851,7 @@ bool SystemZDAGToDAGISel::expandRxSBG(RxSBGOperands &RxSBG) const { return false; uint64_t Count = CountNode->getZExtValue(); - unsigned BitSize = N.getValueType().getSizeInBits(); + unsigned BitSize = N.getValueSizeInBits(); if (Count < 1 || Count >= BitSize) return false; @@ -878,7 +878,7 @@ bool SystemZDAGToDAGISel::expandRxSBG(RxSBGOperands &RxSBG) const { return false; uint64_t Count = CountNode->getZExtValue(); - unsigned BitSize = N.getValueType().getSizeInBits(); + unsigned BitSize = N.getValueSizeInBits(); if (Count < 1 || Count >= BitSize) return false; @@ -935,49 +935,55 @@ bool SystemZDAGToDAGISel::tryRISBGZero(SDNode *N) { Count += 1; if (Count == 0) return false; - if (Count == 1) { - // Prefer to use normal shift instructions over RISBG, since they can handle - // all cases and are sometimes shorter. - if (N->getOpcode() != ISD::AND) - return false; - // Prefer register extensions like LLC over RISBG. Also prefer to start - // out with normal ANDs if one instruction would be enough. We can convert - // these ANDs into an RISBG later if a three-address instruction is useful. - if (VT == MVT::i32 || - RISBG.Mask == 0xff || - RISBG.Mask == 0xffff || - SystemZ::isImmLF(~RISBG.Mask) || - SystemZ::isImmHF(~RISBG.Mask)) { - // Force the new mask into the DAG, since it may include known-one bits. - auto *MaskN = cast<ConstantSDNode>(N->getOperand(1).getNode()); - if (MaskN->getZExtValue() != RISBG.Mask) { - SDValue NewMask = CurDAG->getConstant(RISBG.Mask, DL, VT); - N = CurDAG->UpdateNodeOperands(N, N->getOperand(0), NewMask); - SelectCode(N); - return true; - } - return false; - } - } + // Prefer to use normal shift instructions over RISBG, since they can handle + // all cases and are sometimes shorter. 
+ if (Count == 1 && N->getOpcode() != ISD::AND) + return false; - // If the RISBG operands require no rotation and just masks the bottom - // 8/16 bits, attempt to convert this to a LLC zero extension. - if (RISBG.Rotate == 0 && (RISBG.Mask == 0xff || RISBG.Mask == 0xffff)) { - unsigned OpCode = (RISBG.Mask == 0xff ? SystemZ::LLGCR : SystemZ::LLGHR); - if (VT == MVT::i32) { - if (Subtarget->hasHighWord()) - OpCode = (RISBG.Mask == 0xff ? SystemZ::LLCRMux : SystemZ::LLHRMux); - else - OpCode = (RISBG.Mask == 0xff ? SystemZ::LLCR : SystemZ::LLHR); + // Prefer register extensions like LLC over RISBG. Also prefer to start + // out with normal ANDs if one instruction would be enough. We can convert + // these ANDs into an RISBG later if a three-address instruction is useful. + if (RISBG.Rotate == 0) { + bool PreferAnd = false; + // Prefer AND for any 32-bit and-immediate operation. + if (VT == MVT::i32) + PreferAnd = true; + // As well as for any 64-bit operation that can be implemented via LLC(R), + // LLH(R), LLGT(R), or one of the and-immediate instructions. + else if (RISBG.Mask == 0xff || + RISBG.Mask == 0xffff || + RISBG.Mask == 0x7fffffff || + SystemZ::isImmLF(~RISBG.Mask) || + SystemZ::isImmHF(~RISBG.Mask)) + PreferAnd = true; + // And likewise for the LLZRGF instruction, which doesn't have a register + // to register version. + else if (auto *Load = dyn_cast<LoadSDNode>(RISBG.Input)) { + if (Load->getMemoryVT() == MVT::i32 && + (Load->getExtensionType() == ISD::EXTLOAD || + Load->getExtensionType() == ISD::ZEXTLOAD) && + RISBG.Mask == 0xffffff00 && + Subtarget->hasLoadAndZeroRightmostByte()) + PreferAnd = true; + } + if (PreferAnd) { + // Replace the current node with an AND. Note that the current node + // might already be that same AND, in which case it is already CSE'd + // with it, and we must not call ReplaceNode. 
+ SDValue In = convertTo(DL, VT, RISBG.Input); + SDValue Mask = CurDAG->getConstant(RISBG.Mask, DL, VT); + SDValue New = CurDAG->getNode(ISD::AND, DL, VT, In, Mask); + if (N != New.getNode()) { + insertDAGNode(CurDAG, N, Mask); + insertDAGNode(CurDAG, N, New); + ReplaceNode(N, New.getNode()); + N = New.getNode(); + } + // Now, select the machine opcode to implement this operation. + SelectCode(N); + return true; } - - SDValue In = convertTo(DL, VT, RISBG.Input); - SDValue New = convertTo( - DL, VT, SDValue(CurDAG->getMachineNode(OpCode, DL, VT, In), 0)); - ReplaceUses(N, New.getNode()); - CurDAG->RemoveDeadNode(N); - return true; } unsigned Opcode = SystemZ::RISBG; @@ -1136,8 +1142,7 @@ bool SystemZDAGToDAGISel::tryScatter(StoreSDNode *Store, unsigned Opcode) { SDValue Value = Store->getValue(); if (Value.getOpcode() != ISD::EXTRACT_VECTOR_ELT) return false; - if (Store->getMemoryVT().getSizeInBits() != - Value.getValueType().getSizeInBits()) + if (Store->getMemoryVT().getSizeInBits() != Value.getValueSizeInBits()) return false; SDValue ElemV = Value.getOperand(1); @@ -1176,7 +1181,7 @@ bool SystemZDAGToDAGISel::canUseBlockOperation(StoreSDNode *Store, return false; // There's no chance of overlap if the load is invariant. - if (Load->isInvariant()) + if (Load->isInvariant() && Load->isDereferenceable()) return true; // Otherwise we need to check whether there's an alias. @@ -1265,7 +1270,7 @@ void SystemZDAGToDAGISel::Select(SDNode *Node) { if (Node->getOperand(1).getOpcode() != ISD::Constant) if (tryRxSBG(Node, SystemZ::RNSBG)) return; - // Fall through. + LLVM_FALLTHROUGH; case ISD::ROTL: case ISD::SHL: case ISD::SRL: @@ -1291,8 +1296,14 @@ void SystemZDAGToDAGISel::Select(SDNode *Node) { SDValue Op0 = Node->getOperand(0); SDValue Op1 = Node->getOperand(1); // Prefer to put any load first, so that it can be matched as a - // conditional load. - if (Op1.getOpcode() == ISD::LOAD && Op0.getOpcode() != ISD::LOAD) { + // conditional load. 
Likewise for constants in range for LOCHI. + if ((Op1.getOpcode() == ISD::LOAD && Op0.getOpcode() != ISD::LOAD) || + (Subtarget->hasLoadStoreOnCond2() && + Node->getValueType(0).isInteger() && + Op1.getOpcode() == ISD::Constant && + isInt<16>(cast<ConstantSDNode>(Op1)->getSExtValue()) && + !(Op0.getOpcode() == ISD::Constant && + isInt<16>(cast<ConstantSDNode>(Op0)->getSExtValue())))) { SDValue CCValid = Node->getOperand(2); SDValue CCMask = Node->getOperand(3); uint64_t ConstCCValid = @@ -1310,7 +1321,7 @@ void SystemZDAGToDAGISel::Select(SDNode *Node) { case ISD::INSERT_VECTOR_ELT: { EVT VT = Node->getValueType(0); - unsigned ElemBitSize = VT.getVectorElementType().getSizeInBits(); + unsigned ElemBitSize = VT.getScalarSizeInBits(); if (ElemBitSize == 32) { if (tryGather(Node, SystemZ::VGEF)) return; @@ -1323,7 +1334,7 @@ void SystemZDAGToDAGISel::Select(SDNode *Node) { case ISD::STORE: { auto *Store = cast<StoreSDNode>(Node); - unsigned ElemBitSize = Store->getValue().getValueType().getSizeInBits(); + unsigned ElemBitSize = Store->getValue().getValueSizeInBits(); if (ElemBitSize == 32) { if (tryScatter(Store, SystemZ::VSCEF)) return; @@ -1375,6 +1386,29 @@ SelectInlineAsmMemoryOperand(const SDValue &Op, } if (selectBDXAddr(Form, DispRange, Op, Base, Disp, Index)) { + const TargetRegisterClass *TRC = + Subtarget->getRegisterInfo()->getPointerRegClass(*MF); + SDLoc DL(Base); + SDValue RC = CurDAG->getTargetConstant(TRC->getID(), DL, MVT::i32); + + // Make sure that the base address doesn't go into %r0. + // If it's a TargetFrameIndex or a fixed register, we shouldn't do anything. + if (Base.getOpcode() != ISD::TargetFrameIndex && + Base.getOpcode() != ISD::Register) { + Base = + SDValue(CurDAG->getMachineNode(TargetOpcode::COPY_TO_REGCLASS, + DL, Base.getValueType(), + Base, RC), 0); + } + + // Make sure that the index register isn't assigned to %r0 either. 
+ if (Index.getOpcode() != ISD::Register) { + Index = + SDValue(CurDAG->getMachineNode(TargetOpcode::COPY_TO_REGCLASS, + DL, Index.getValueType(), + Index, RC), 0); + } + OutOps.push_back(Base); OutOps.push_back(Disp); OutOps.push_back(Index); diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp index 14991bb..2d0a06a 100644 --- a/contrib/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp +++ b/contrib/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp @@ -20,6 +20,7 @@ #include "llvm/CodeGen/MachineInstrBuilder.h" #include "llvm/CodeGen/MachineRegisterInfo.h" #include "llvm/CodeGen/TargetLoweringObjectFileImpl.h" +#include "llvm/Support/CommandLine.h" #include "llvm/IR/Intrinsics.h" #include <cctype> @@ -531,6 +532,46 @@ bool SystemZTargetLowering::isLegalAddressingMode(const DataLayout &DL, return AM.Scale == 0 || AM.Scale == 1; } +bool SystemZTargetLowering::isFoldableMemAccessOffset(Instruction *I, + int64_t Offset) const { + // This only applies to z13. + if (!Subtarget.hasVector()) + return true; + + // * Use LDE instead of LE/LEY to avoid partial register + // dependencies (LDE only supports small offsets). + // * Utilize the vector registers to hold floating point + // values (vector load / store instructions only support small + // offsets). + + assert (isa<LoadInst>(I) || isa<StoreInst>(I)); + Type *MemAccessTy = (isa<LoadInst>(I) ? I->getType() : + I->getOperand(0)->getType()); + bool IsFPAccess = MemAccessTy->isFloatingPointTy(); + bool IsVectorAccess = MemAccessTy->isVectorTy(); + + // A store of an extracted vector element will be combined into a VSTE type + // instruction. + if (!IsVectorAccess && isa<StoreInst>(I)) { + Value *DataOp = I->getOperand(0); + if (isa<ExtractElementInst>(DataOp)) + IsVectorAccess = true; + } + + // A load which gets inserted into a vector element will be combined into a + // VLE type instruction. 
+ if (!IsVectorAccess && isa<LoadInst>(I) && I->hasOneUse()) { + User *LoadUser = *I->user_begin(); + if (isa<InsertElementInst>(LoadUser)) + IsVectorAccess = true; + } + + if (!isUInt<12>(Offset) && (IsFPAccess || IsVectorAccess)) + return false; + + return true; +} + bool SystemZTargetLowering::isTruncateFree(Type *FromType, Type *ToType) const { if (!FromType->isIntegerTy() || !ToType->isIntegerTy()) return false; @@ -864,7 +905,7 @@ SDValue SystemZTargetLowering::LowerFormalArguments( const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &DL, SelectionDAG &DAG, SmallVectorImpl<SDValue> &InVals) const { MachineFunction &MF = DAG.getMachineFunction(); - MachineFrameInfo *MFI = MF.getFrameInfo(); + MachineFrameInfo &MFI = MF.getFrameInfo(); MachineRegisterInfo &MRI = MF.getRegInfo(); SystemZMachineFunctionInfo *FuncInfo = MF.getInfo<SystemZMachineFunctionInfo>(); @@ -927,8 +968,8 @@ SDValue SystemZTargetLowering::LowerFormalArguments( assert(VA.isMemLoc() && "Argument not register or memory"); // Create the frame index object for this incoming parameter. - int FI = MFI->CreateFixedObject(LocVT.getSizeInBits() / 8, - VA.getLocMemOffset(), true); + int FI = MFI.CreateFixedObject(LocVT.getSizeInBits() / 8, + VA.getLocMemOffset(), true); // Create the SelectionDAG nodes corresponding to a load // from this parameter. Unpromoted ints and floats are @@ -971,12 +1012,12 @@ SDValue SystemZTargetLowering::LowerFormalArguments( // Likewise the address (in the form of a frame index) of where the // first stack vararg would be. The 1-byte size here is arbitrary. int64_t StackSize = CCInfo.getNextStackOffset(); - FuncInfo->setVarArgsFrameIndex(MFI->CreateFixedObject(1, StackSize, true)); + FuncInfo->setVarArgsFrameIndex(MFI.CreateFixedObject(1, StackSize, true)); // ...and a similar frame index for the caller-allocated save area // that will be used to store the incoming registers. 
int64_t RegSaveOffset = TFL->getOffsetOfLocalArea(); - unsigned RegSaveIndex = MFI->CreateFixedObject(1, RegSaveOffset, true); + unsigned RegSaveIndex = MFI.CreateFixedObject(1, RegSaveOffset, true); FuncInfo->setRegSaveFrameIndex(RegSaveIndex); // Store the FPR varargs in the reserved frame slots. (We store the @@ -985,7 +1026,7 @@ SDValue SystemZTargetLowering::LowerFormalArguments( SDValue MemOps[SystemZ::NumArgFPRs]; for (unsigned I = NumFixedFPRs; I < SystemZ::NumArgFPRs; ++I) { unsigned Offset = TFL->getRegSpillOffset(SystemZ::ArgFPRs[I]); - int FI = MFI->CreateFixedObject(8, RegSaveOffset + Offset, true); + int FI = MFI.CreateFixedObject(8, RegSaveOffset + Offset, true); SDValue FIN = DAG.getFrameIndex(FI, getPointerTy(DAG.getDataLayout())); unsigned VReg = MF.addLiveIn(SystemZ::ArgFPRs[I], &SystemZ::FP64BitRegClass); @@ -1837,8 +1878,7 @@ static void adjustICmpTruncate(SelectionDAG &DAG, const SDLoc &DL, C.Op1.getOpcode() == ISD::Constant && cast<ConstantSDNode>(C.Op1)->getZExtValue() == 0) { auto *L = cast<LoadSDNode>(C.Op0.getOperand(0)); - if (L->getMemoryVT().getStoreSizeInBits() - <= C.Op0.getValueType().getSizeInBits()) { + if (L->getMemoryVT().getStoreSizeInBits() <= C.Op0.getValueSizeInBits()) { unsigned Type = L->getExtensionType(); if ((Type == ISD::ZEXTLOAD && C.ICmpType != SystemZICMP::SignedOnly) || (Type == ISD::SEXTLOAD && C.ICmpType != SystemZICMP::UnsignedOnly)) { @@ -1857,7 +1897,7 @@ static bool isSimpleShift(SDValue N, unsigned &ShiftVal) { return false; uint64_t Amount = Shift->getZExtValue(); - if (Amount >= N.getValueType().getSizeInBits()) + if (Amount >= N.getValueSizeInBits()) return false; ShiftVal = Amount; @@ -2008,7 +2048,7 @@ static void adjustForTestUnderMask(SelectionDAG &DAG, const SDLoc &DL, // Check whether the combination of mask, comparison value and comparison // type are suitable. 
- unsigned BitSize = NewC.Op0.getValueType().getSizeInBits(); + unsigned BitSize = NewC.Op0.getValueSizeInBits(); unsigned NewCCMask, ShiftVal; if (NewC.ICmpType != SystemZICMP::SignedOnly && NewC.Op0.getOpcode() == ISD::SHL && @@ -2542,16 +2582,15 @@ SDValue SystemZTargetLowering::lowerTLSGetOffset(GlobalAddressSDNode *Node, SDValue SystemZTargetLowering::lowerThreadPointer(const SDLoc &DL, SelectionDAG &DAG) const { + SDValue Chain = DAG.getEntryNode(); EVT PtrVT = getPointerTy(DAG.getDataLayout()); // The high part of the thread pointer is in access register 0. - SDValue TPHi = DAG.getNode(SystemZISD::EXTRACT_ACCESS, DL, MVT::i32, - DAG.getConstant(0, DL, MVT::i32)); + SDValue TPHi = DAG.getCopyFromReg(Chain, DL, SystemZ::A0, MVT::i32); TPHi = DAG.getNode(ISD::ANY_EXTEND, DL, PtrVT, TPHi); // The low part of the thread pointer is in access register 1. - SDValue TPLo = DAG.getNode(SystemZISD::EXTRACT_ACCESS, DL, MVT::i32, - DAG.getConstant(1, DL, MVT::i32)); + SDValue TPLo = DAG.getCopyFromReg(Chain, DL, SystemZ::A1, MVT::i32); TPLo = DAG.getNode(ISD::ZERO_EXTEND, DL, PtrVT, TPLo); // Merge them into a single 64-bit address. @@ -2691,8 +2730,8 @@ SDValue SystemZTargetLowering::lowerConstantPool(ConstantPoolSDNode *CP, SDValue SystemZTargetLowering::lowerFRAMEADDR(SDValue Op, SelectionDAG &DAG) const { MachineFunction &MF = DAG.getMachineFunction(); - MachineFrameInfo *MFI = MF.getFrameInfo(); - MFI->setFrameAddressIsTaken(true); + MachineFrameInfo &MFI = MF.getFrameInfo(); + MFI.setFrameAddressIsTaken(true); SDLoc DL(Op); unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue(); @@ -2703,7 +2742,7 @@ SDValue SystemZTargetLowering::lowerFRAMEADDR(SDValue Op, int BackChainIdx = FI->getFramePointerSaveIndex(); if (!BackChainIdx) { // By definition, the frame address is the address of the back chain. 
- BackChainIdx = MFI->CreateFixedObject(8, -SystemZMC::CallFrameSize, false); + BackChainIdx = MFI.CreateFixedObject(8, -SystemZMC::CallFrameSize, false); FI->setFramePointerSaveIndex(BackChainIdx); } SDValue BackChain = DAG.getFrameIndex(BackChainIdx, PtrVT); @@ -2719,8 +2758,8 @@ SDValue SystemZTargetLowering::lowerFRAMEADDR(SDValue Op, SDValue SystemZTargetLowering::lowerRETURNADDR(SDValue Op, SelectionDAG &DAG) const { MachineFunction &MF = DAG.getMachineFunction(); - MachineFrameInfo *MFI = MF.getFrameInfo(); - MFI->setReturnAddressIsTaken(true); + MachineFrameInfo &MFI = MF.getFrameInfo(); + MFI.setReturnAddressIsTaken(true); if (verifyReturnAddressArgumentIsConstant(Op, DAG)) return SDValue(); @@ -3080,7 +3119,7 @@ SDValue SystemZTargetLowering::lowerCTPOP(SDValue Op, if (VT.isVector()) { Op = DAG.getNode(ISD::BITCAST, DL, MVT::v16i8, Op); Op = DAG.getNode(SystemZISD::POPCNT, DL, MVT::v16i8, Op); - switch (VT.getVectorElementType().getSizeInBits()) { + switch (VT.getScalarSizeInBits()) { case 8: break; case 16: { @@ -3288,8 +3327,7 @@ SDValue SystemZTargetLowering::lowerATOMIC_LOAD_SUB(SDValue Op, if (NegSrc2.getNode()) return DAG.getAtomic(ISD::ATOMIC_LOAD_ADD, DL, MemVT, Node->getChain(), Node->getBasePtr(), NegSrc2, - Node->getMemOperand(), Node->getOrdering(), - Node->getSynchScope()); + Node->getMemOperand()); // Use the node as-is. return Op; @@ -4355,7 +4393,7 @@ SDValue SystemZTargetLowering::lowerINSERT_VECTOR_ELT(SDValue Op, } // Otherwise bitcast to the equivalent integer form and insert via a GPR. 
- MVT IntVT = MVT::getIntegerVT(VT.getVectorElementType().getSizeInBits()); + MVT IntVT = MVT::getIntegerVT(VT.getScalarSizeInBits()); MVT IntVecVT = MVT::getVectorVT(IntVT, VT.getVectorNumElements()); SDValue Res = DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, IntVecVT, DAG.getNode(ISD::BITCAST, DL, IntVecVT, Op0), @@ -4395,8 +4433,8 @@ SystemZTargetLowering::lowerExtendVectorInreg(SDValue Op, SelectionDAG &DAG, SDValue PackedOp = Op.getOperand(0); EVT OutVT = Op.getValueType(); EVT InVT = PackedOp.getValueType(); - unsigned ToBits = OutVT.getVectorElementType().getSizeInBits(); - unsigned FromBits = InVT.getVectorElementType().getSizeInBits(); + unsigned ToBits = OutVT.getScalarSizeInBits(); + unsigned FromBits = InVT.getScalarSizeInBits(); do { FromBits *= 2; EVT OutVT = MVT::getVectorVT(MVT::getIntegerVT(FromBits), @@ -4413,7 +4451,7 @@ SDValue SystemZTargetLowering::lowerShift(SDValue Op, SelectionDAG &DAG, SDValue Op1 = Op.getOperand(1); SDLoc DL(Op); EVT VT = Op.getValueType(); - unsigned ElemBitSize = VT.getVectorElementType().getSizeInBits(); + unsigned ElemBitSize = VT.getScalarSizeInBits(); // See whether the shift vector is a splat represented as BUILD_VECTOR. if (auto *BVN = dyn_cast<BuildVectorSDNode>(Op1)) { @@ -4591,7 +4629,6 @@ const char *SystemZTargetLowering::getTargetNodeName(unsigned Opcode) const { OPCODE(BR_CCMASK); OPCODE(SELECT_CCMASK); OPCODE(ADJDYNALLOC); - OPCODE(EXTRACT_ACCESS); OPCODE(POPCNT); OPCODE(UMUL_LOHI64); OPCODE(SDIVREM32); @@ -4687,7 +4724,7 @@ const char *SystemZTargetLowering::getTargetNodeName(unsigned Opcode) const { // Return true if VT is a vector whose elements are a whole number of bytes // in width. 
static bool canTreatAsByteVector(EVT VT) { - return VT.isVector() && VT.getVectorElementType().getSizeInBits() % 8 == 0; + return VT.isVector() && VT.getScalarSizeInBits() % 8 == 0; } // Try to simplify an EXTRACT_VECTOR_ELT from a vector of type VecVT @@ -4748,7 +4785,7 @@ SDValue SystemZTargetLowering::combineExtract(const SDLoc &DL, EVT ResVT, // We're extracting the low part of one operand of the BUILD_VECTOR. Op = Op.getOperand(End / OpBytesPerElement - 1); if (!Op.getValueType().isInteger()) { - EVT VT = MVT::getIntegerVT(Op.getValueType().getSizeInBits()); + EVT VT = MVT::getIntegerVT(Op.getValueSizeInBits()); Op = DAG.getNode(ISD::BITCAST, DL, VT, Op); DCI.AddToWorklist(Op.getNode()); } @@ -4848,8 +4885,7 @@ SDValue SystemZTargetLowering::combineSIGN_EXTEND( SDValue Inner = N0.getOperand(0); if (SraAmt && Inner.hasOneUse() && Inner.getOpcode() == ISD::SHL) { if (auto *ShlAmt = dyn_cast<ConstantSDNode>(Inner.getOperand(1))) { - unsigned Extra = (VT.getSizeInBits() - - N0.getValueType().getSizeInBits()); + unsigned Extra = (VT.getSizeInBits() - N0.getValueSizeInBits()); unsigned NewShlAmt = ShlAmt->getZExtValue() + Extra; unsigned NewSraAmt = SraAmt->getZExtValue() + Extra; EVT ShiftVT = N0.getOperand(1).getValueType(); @@ -4972,8 +5008,8 @@ SDValue SystemZTargetLowering::combineJOIN_DWORDS( SDValue SystemZTargetLowering::combineFP_ROUND( SDNode *N, DAGCombinerInfo &DCI) const { - // (fround (extract_vector_elt X 0)) - // (fround (extract_vector_elt X 1)) -> + // (fpround (extract_vector_elt X 0)) + // (fpround (extract_vector_elt X 1)) -> // (extract_vector_elt (VROUND X) 0) // (extract_vector_elt (VROUND X) 1) // @@ -5070,14 +5106,20 @@ SDValue SystemZTargetLowering::combineSHIFTROT( // Shift/rotate instructions only use the last 6 bits of the second operand // register. If the second operand is the result of an AND with an immediate // value that has its last 6 bits set, we can safely remove the AND operation. 
+ // + // If the AND operation doesn't have the last 6 bits set, we can't remove it + // entirely, but we can still truncate it to a 16-bit value. This prevents + // us from ending up with a NILL with a signed operand, which will cause the + // instruction printer to abort. SDValue N1 = N->getOperand(1); if (N1.getOpcode() == ISD::AND) { - auto *AndMask = dyn_cast<ConstantSDNode>(N1.getOperand(1)); + SDValue AndMaskOp = N1->getOperand(1); + auto *AndMask = dyn_cast<ConstantSDNode>(AndMaskOp); // The AND mask is constant if (AndMask) { auto AmtVal = AndMask->getZExtValue(); - + // Bottom 6 bits are set if ((AmtVal & 0x3f) == 0x3f) { SDValue AndOp = N1->getOperand(0); @@ -5099,6 +5141,26 @@ SDValue SystemZTargetLowering::combineSHIFTROT( return Replace; } + + // We can't remove the AND, but we can use NILL here (normally we would + // use NILF). Only keep the last 16 bits of the mask. The actual + // transformation will be handled by .td definitions. + } else if (AmtVal >> 16 != 0) { + SDValue AndOp = N1->getOperand(0); + + auto NewMask = DAG.getConstant(AndMask->getZExtValue() & 0x0000ffff, + SDLoc(AndMaskOp), + AndMaskOp.getValueType()); + + auto NewAnd = DAG.getNode(N1.getOpcode(), SDLoc(N1), N1.getValueType(), + AndOp, NewMask); + + SDValue Replace = DAG.getNode(N->getOpcode(), SDLoc(N), + N->getValueType(0), N->getOperand(0), + NewAnd); + DCI.AddToWorklist(Replace.getNode()); + + return Replace; } } } @@ -5180,7 +5242,8 @@ static unsigned forceReg(MachineInstr &MI, MachineOperand &Base, // Implement EmitInstrWithCustomInserter for pseudo Select* instruction MI. 
MachineBasicBlock * SystemZTargetLowering::emitSelect(MachineInstr &MI, - MachineBasicBlock *MBB) const { + MachineBasicBlock *MBB, + unsigned LOCROpcode) const { const SystemZInstrInfo *TII = static_cast<const SystemZInstrInfo *>(Subtarget.getInstrInfo()); @@ -5191,6 +5254,15 @@ SystemZTargetLowering::emitSelect(MachineInstr &MI, unsigned CCMask = MI.getOperand(4).getImm(); DebugLoc DL = MI.getDebugLoc(); + // Use LOCROpcode if possible. + if (LOCROpcode && Subtarget.hasLoadStoreOnCond()) { + BuildMI(*MBB, MI, DL, TII->get(LOCROpcode), DestReg) + .addReg(FalseReg).addReg(TrueReg) + .addImm(CCValid).addImm(CCMask); + MI.eraseFromParent(); + return MBB; + } + MachineBasicBlock *StartMBB = MBB; MachineBasicBlock *JoinMBB = splitBlockBefore(MI, MBB); MachineBasicBlock *FalseMBB = emitBlockAfter(StartMBB); @@ -5976,12 +6048,16 @@ MachineBasicBlock *SystemZTargetLowering::EmitInstrWithCustomInserter( MachineInstr &MI, MachineBasicBlock *MBB) const { switch (MI.getOpcode()) { case SystemZ::Select32Mux: + return emitSelect(MI, MBB, + Subtarget.hasLoadStoreOnCond2()? 
SystemZ::LOCRMux : 0); case SystemZ::Select32: - case SystemZ::SelectF32: + return emitSelect(MI, MBB, SystemZ::LOCR); case SystemZ::Select64: + return emitSelect(MI, MBB, SystemZ::LOCGR); + case SystemZ::SelectF32: case SystemZ::SelectF64: case SystemZ::SelectF128: - return emitSelect(MI, MBB); + return emitSelect(MI, MBB, 0); case SystemZ::CondStore8Mux: return emitCondStore(MI, MBB, SystemZ::STCMux, 0, false); @@ -5991,6 +6067,10 @@ MachineBasicBlock *SystemZTargetLowering::EmitInstrWithCustomInserter( return emitCondStore(MI, MBB, SystemZ::STHMux, 0, false); case SystemZ::CondStore16MuxInv: return emitCondStore(MI, MBB, SystemZ::STHMux, 0, true); + case SystemZ::CondStore32Mux: + return emitCondStore(MI, MBB, SystemZ::STMux, SystemZ::STOCMux, false); + case SystemZ::CondStore32MuxInv: + return emitCondStore(MI, MBB, SystemZ::STMux, SystemZ::STOCMux, true); case SystemZ::CondStore8: return emitCondStore(MI, MBB, SystemZ::STC, 0, false); case SystemZ::CondStore8Inv: diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZISelLowering.h b/contrib/llvm/lib/Target/SystemZ/SystemZISelLowering.h index b1de893..7a21a47 100644 --- a/contrib/llvm/lib/Target/SystemZ/SystemZISelLowering.h +++ b/contrib/llvm/lib/Target/SystemZ/SystemZISelLowering.h @@ -83,10 +83,6 @@ enum NodeType : unsigned { // base of the dynamically-allocatable area. ADJDYNALLOC, - // Extracts the value of a 32-bit access register. Operand 0 is - // the number of the register. - EXTRACT_ACCESS, - // Count number of bits set in operand 0 per byte. POPCNT, @@ -382,7 +378,7 @@ public: // // (c) there are no multiplication instructions for the widest integer // type (v2i64). 
- if (VT.getVectorElementType().getSizeInBits() % 8 == 0) + if (VT.getScalarSizeInBits() % 8 == 0) return TypeWidenVector; return TargetLoweringBase::getPreferredVectorAction(VT); } @@ -394,6 +390,7 @@ public: bool isLegalAddImmediate(int64_t Imm) const override; bool isLegalAddressingMode(const DataLayout &DL, const AddrMode &AM, Type *Ty, unsigned AS) const override; + bool isFoldableMemAccessOffset(Instruction *I, int64_t Offset) const override; bool allowsMisalignedMemoryAccesses(EVT VT, unsigned AS, unsigned Align, bool *Fast) const override; @@ -564,7 +561,8 @@ private: MachineBasicBlock *Target) const; // Implement EmitInstrWithCustomInserter for individual operation types. - MachineBasicBlock *emitSelect(MachineInstr &MI, MachineBasicBlock *BB) const; + MachineBasicBlock *emitSelect(MachineInstr &MI, MachineBasicBlock *BB, + unsigned LOCROpcode) const; MachineBasicBlock *emitCondStore(MachineInstr &MI, MachineBasicBlock *BB, unsigned StoreOpcode, unsigned STOCOpcode, bool Invert) const; diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZInstrBuilder.h b/contrib/llvm/lib/Target/SystemZ/SystemZInstrBuilder.h index 2cb8aba..896b665 100644 --- a/contrib/llvm/lib/Target/SystemZ/SystemZInstrBuilder.h +++ b/contrib/llvm/lib/Target/SystemZ/SystemZInstrBuilder.h @@ -27,7 +27,7 @@ static inline const MachineInstrBuilder & addFrameReference(const MachineInstrBuilder &MIB, int FI) { MachineInstr *MI = MIB; MachineFunction &MF = *MI->getParent()->getParent(); - MachineFrameInfo *MFFrame = MF.getFrameInfo(); + MachineFrameInfo &MFFrame = MF.getFrameInfo(); const MCInstrDesc &MCID = MI->getDesc(); auto Flags = MachineMemOperand::MONone; if (MCID.mayLoad()) @@ -37,7 +37,7 @@ addFrameReference(const MachineInstrBuilder &MIB, int FI) { int64_t Offset = 0; MachineMemOperand *MMO = MF.getMachineMemOperand( MachinePointerInfo::getFixedStack(MF, FI, Offset), Flags, - MFFrame->getObjectSize(FI), MFFrame->getObjectAlignment(FI)); + MFFrame.getObjectSize(FI), 
MFFrame.getObjectAlignment(FI)); return MIB.addFrameIndex(FI).addImm(Offset).addReg(0).addMemOperand(MMO); } diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZInstrFP.td b/contrib/llvm/lib/Target/SystemZ/SystemZInstrFP.td index 8b32047..bb6d27e 100644 --- a/contrib/llvm/lib/Target/SystemZ/SystemZInstrFP.td +++ b/contrib/llvm/lib/Target/SystemZ/SystemZInstrFP.td @@ -27,28 +27,28 @@ defm CondStoreF64 : CondStores<FP64, nonvolatile_store, // Load zero. let hasSideEffects = 0, isAsCheapAsAMove = 1, isMoveImm = 1 in { - def LZER : InherentRRE<"lzer", 0xB374, FP32, (fpimm0)>; - def LZDR : InherentRRE<"lzdr", 0xB375, FP64, (fpimm0)>; - def LZXR : InherentRRE<"lzxr", 0xB376, FP128, (fpimm0)>; + def LZER : InherentRRE<"lzer", 0xB374, FP32, fpimm0>; + def LZDR : InherentRRE<"lzdr", 0xB375, FP64, fpimm0>; + def LZXR : InherentRRE<"lzxr", 0xB376, FP128, fpimm0>; } // Moves between two floating-point registers. let hasSideEffects = 0 in { - def LER : UnaryRR <"le", 0x38, null_frag, FP32, FP32>; - def LDR : UnaryRR <"ld", 0x28, null_frag, FP64, FP64>; - def LXR : UnaryRRE<"lx", 0xB365, null_frag, FP128, FP128>; + def LER : UnaryRR <"ler", 0x38, null_frag, FP32, FP32>; + def LDR : UnaryRR <"ldr", 0x28, null_frag, FP64, FP64>; + def LXR : UnaryRRE<"lxr", 0xB365, null_frag, FP128, FP128>; // For z13 we prefer LDR over LER to avoid partial register dependencies. let isCodeGenOnly = 1 in - def LDR32 : UnaryRR<"ld", 0x28, null_frag, FP32, FP32>; + def LDR32 : UnaryRR<"ldr", 0x28, null_frag, FP32, FP32>; } // Moves between two floating-point registers that also set the condition // codes. 
let Defs = [CC], CCValues = 0xF, CompareZeroCCMask = 0xF in { - defm LTEBR : LoadAndTestRRE<"lteb", 0xB302, FP32>; - defm LTDBR : LoadAndTestRRE<"ltdb", 0xB312, FP64>; - defm LTXBR : LoadAndTestRRE<"ltxb", 0xB342, FP128>; + defm LTEBR : LoadAndTestRRE<"ltebr", 0xB302, FP32>; + defm LTDBR : LoadAndTestRRE<"ltdbr", 0xB312, FP64>; + defm LTXBR : LoadAndTestRRE<"ltxbr", 0xB342, FP128>; } // Note that LTxBRCompare is not available if we have vector support, // since load-and-test instructions will partially clobber the target @@ -73,13 +73,13 @@ let Predicates = [FeatureVector] in { } // Moves between 64-bit integer and floating-point registers. -def LGDR : UnaryRRE<"lgd", 0xB3CD, bitconvert, GR64, FP64>; -def LDGR : UnaryRRE<"ldg", 0xB3C1, bitconvert, FP64, GR64>; +def LGDR : UnaryRRE<"lgdr", 0xB3CD, bitconvert, GR64, FP64>; +def LDGR : UnaryRRE<"ldgr", 0xB3C1, bitconvert, FP64, GR64>; // fcopysign with an FP32 result. let isCodeGenOnly = 1 in { - def CPSDRss : BinaryRRF<"cpsd", 0xB372, fcopysign, FP32, FP32>; - def CPSDRsd : BinaryRRF<"cpsd", 0xB372, fcopysign, FP32, FP64>; + def CPSDRss : BinaryRRFb<"cpsdr", 0xB372, fcopysign, FP32, FP32, FP32>; + def CPSDRsd : BinaryRRFb<"cpsdr", 0xB372, fcopysign, FP32, FP32, FP64>; } // The sign of an FP128 is in the high register. @@ -88,8 +88,8 @@ def : Pat<(fcopysign FP32:$src1, FP128:$src2), // fcopysign with an FP64 result. let isCodeGenOnly = 1 in - def CPSDRds : BinaryRRF<"cpsd", 0xB372, fcopysign, FP64, FP32>; -def CPSDRdd : BinaryRRF<"cpsd", 0xB372, fcopysign, FP64, FP64>; + def CPSDRds : BinaryRRFb<"cpsdr", 0xB372, fcopysign, FP64, FP64, FP32>; +def CPSDRdd : BinaryRRFb<"cpsdr", 0xB372, fcopysign, FP64, FP64, FP64>; // The sign of an FP128 is in the high register. def : Pat<(fcopysign FP64:$src1, FP128:$src2), @@ -154,26 +154,26 @@ let SimpleBDXStore = 1 in { // Convert floating-point values to narrower representations, rounding // according to the current mode. 
The destination of LEXBR and LDXBR // is a 128-bit value, but only the first register of the pair is used. -def LEDBR : UnaryRRE<"ledb", 0xB344, fround, FP32, FP64>; -def LEXBR : UnaryRRE<"lexb", 0xB346, null_frag, FP128, FP128>; -def LDXBR : UnaryRRE<"ldxb", 0xB345, null_frag, FP128, FP128>; +def LEDBR : UnaryRRE<"ledbr", 0xB344, fpround, FP32, FP64>; +def LEXBR : UnaryRRE<"lexbr", 0xB346, null_frag, FP128, FP128>; +def LDXBR : UnaryRRE<"ldxbr", 0xB345, null_frag, FP128, FP128>; -def LEDBRA : UnaryRRF4<"ledbra", 0xB344, FP32, FP64>, +def LEDBRA : TernaryRRFe<"ledbra", 0xB344, FP32, FP64>, Requires<[FeatureFPExtension]>; -def LEXBRA : UnaryRRF4<"lexbra", 0xB346, FP128, FP128>, +def LEXBRA : TernaryRRFe<"lexbra", 0xB346, FP128, FP128>, Requires<[FeatureFPExtension]>; -def LDXBRA : UnaryRRF4<"ldxbra", 0xB345, FP128, FP128>, +def LDXBRA : TernaryRRFe<"ldxbra", 0xB345, FP128, FP128>, Requires<[FeatureFPExtension]>; -def : Pat<(f32 (fround FP128:$src)), +def : Pat<(f32 (fpround FP128:$src)), (EXTRACT_SUBREG (LEXBR FP128:$src), subreg_hr32)>; -def : Pat<(f64 (fround FP128:$src)), +def : Pat<(f64 (fpround FP128:$src)), (EXTRACT_SUBREG (LDXBR FP128:$src), subreg_h64)>; // Extend register floating-point values to wider representations. -def LDEBR : UnaryRRE<"ldeb", 0xB304, fextend, FP64, FP32>; -def LXEBR : UnaryRRE<"lxeb", 0xB306, fextend, FP128, FP32>; -def LXDBR : UnaryRRE<"lxdb", 0xB305, fextend, FP128, FP64>; +def LDEBR : UnaryRRE<"ldebr", 0xB304, fpextend, FP64, FP32>; +def LXEBR : UnaryRRE<"lxebr", 0xB306, fpextend, FP128, FP32>; +def LXDBR : UnaryRRE<"lxdbr", 0xB305, fpextend, FP128, FP64>; // Extend memory floating-point values to wider representations. def LDEB : UnaryRXE<"ldeb", 0xED04, extloadf32, FP64, 4>; @@ -181,23 +181,35 @@ def LXEB : UnaryRXE<"lxeb", 0xED06, extloadf32, FP128, 4>; def LXDB : UnaryRXE<"lxdb", 0xED05, extloadf64, FP128, 8>; // Convert a signed integer register value to a floating-point one. 
-def CEFBR : UnaryRRE<"cefb", 0xB394, sint_to_fp, FP32, GR32>;
-def CDFBR : UnaryRRE<"cdfb", 0xB395, sint_to_fp, FP64, GR32>;
-def CXFBR : UnaryRRE<"cxfb", 0xB396, sint_to_fp, FP128, GR32>;
+def CEFBR : UnaryRRE<"cefbr", 0xB394, sint_to_fp, FP32, GR32>;
+def CDFBR : UnaryRRE<"cdfbr", 0xB395, sint_to_fp, FP64, GR32>;
+def CXFBR : UnaryRRE<"cxfbr", 0xB396, sint_to_fp, FP128, GR32>;

-def CEGBR : UnaryRRE<"cegb", 0xB3A4, sint_to_fp, FP32, GR64>;
-def CDGBR : UnaryRRE<"cdgb", 0xB3A5, sint_to_fp, FP64, GR64>;
-def CXGBR : UnaryRRE<"cxgb", 0xB3A6, sint_to_fp, FP128, GR64>;
+def CEGBR : UnaryRRE<"cegbr", 0xB3A4, sint_to_fp, FP32, GR64>;
+def CDGBR : UnaryRRE<"cdgbr", 0xB3A5, sint_to_fp, FP64, GR64>;
+def CXGBR : UnaryRRE<"cxgbr", 0xB3A6, sint_to_fp, FP128, GR64>;
+
+// The FP extension feature provides versions of the above that allow
+// specifying rounding mode and inexact-exception suppression flags.
+let Predicates = [FeatureFPExtension] in {
+  def CEFBRA : TernaryRRFe<"cefbra", 0xB394, FP32, GR32>;
+  def CDFBRA : TernaryRRFe<"cdfbra", 0xB395, FP64, GR32>;
+  def CXFBRA : TernaryRRFe<"cxfbra", 0xB396, FP128, GR32>;
+
+  def CEGBRA : TernaryRRFe<"cegbra", 0xB3A4, FP32, GR64>;
+  def CDGBRA : TernaryRRFe<"cdgbra", 0xB3A5, FP64, GR64>;
+  def CXGBRA : TernaryRRFe<"cxgbra", 0xB3A6, FP128, GR64>;
+}

 // Convert am unsigned integer register value to a floating-point one.
 let Predicates = [FeatureFPExtension] in {
-  def CELFBR : UnaryRRF4<"celfbr", 0xB390, FP32, GR32>;
-  def CDLFBR : UnaryRRF4<"cdlfbr", 0xB391, FP64, GR32>;
-  def CXLFBR : UnaryRRF4<"cxlfbr", 0xB392, FP128, GR32>;
+  def CELFBR : TernaryRRFe<"celfbr", 0xB390, FP32, GR32>;
+  def CDLFBR : TernaryRRFe<"cdlfbr", 0xB391, FP64, GR32>;
+  def CXLFBR : TernaryRRFe<"cxlfbr", 0xB392, FP128, GR32>;

-  def CELGBR : UnaryRRF4<"celgbr", 0xB3A0, FP32, GR64>;
-  def CDLGBR : UnaryRRF4<"cdlgbr", 0xB3A1, FP64, GR64>;
-  def CXLGBR : UnaryRRF4<"cxlgbr", 0xB3A2, FP128, GR64>;
+  def CELGBR : TernaryRRFe<"celgbr", 0xB3A0, FP32, GR64>;
+  def CDLGBR : TernaryRRFe<"cdlgbr", 0xB3A1, FP64, GR64>;
+  def CXLGBR : TernaryRRFe<"cxlgbr", 0xB3A2, FP128, GR64>;

   def : Pat<(f32 (uint_to_fp GR32:$src)), (CELFBR 0, GR32:$src, 0)>;
   def : Pat<(f64 (uint_to_fp GR32:$src)), (CDLFBR 0, GR32:$src, 0)>;
@@ -211,13 +223,13 @@ let Predicates = [FeatureFPExtension] in {
 // Convert a floating-point register value to a signed integer value,
 // with the second operand (modifier M3) specifying the rounding mode.
 let Defs = [CC] in {
-  def CFEBR : UnaryRRF<"cfeb", 0xB398, GR32, FP32>;
-  def CFDBR : UnaryRRF<"cfdb", 0xB399, GR32, FP64>;
-  def CFXBR : UnaryRRF<"cfxb", 0xB39A, GR32, FP128>;
+  def CFEBR : BinaryRRFe<"cfebr", 0xB398, GR32, FP32>;
+  def CFDBR : BinaryRRFe<"cfdbr", 0xB399, GR32, FP64>;
+  def CFXBR : BinaryRRFe<"cfxbr", 0xB39A, GR32, FP128>;

-  def CGEBR : UnaryRRF<"cgeb", 0xB3A8, GR64, FP32>;
-  def CGDBR : UnaryRRF<"cgdb", 0xB3A9, GR64, FP64>;
-  def CGXBR : UnaryRRF<"cgxb", 0xB3AA, GR64, FP128>;
+  def CGEBR : BinaryRRFe<"cgebr", 0xB3A8, GR64, FP32>;
+  def CGDBR : BinaryRRFe<"cgdbr", 0xB3A9, GR64, FP64>;
+  def CGXBR : BinaryRRFe<"cgxbr", 0xB3AA, GR64, FP128>;
 }

 // fp_to_sint always rounds towards zero, which is modifier value 5.
@@ -229,16 +241,28 @@ def : Pat<(i64 (fp_to_sint FP32:$src)), (CGEBR 5, FP32:$src)>;
 def : Pat<(i64 (fp_to_sint FP64:$src)), (CGDBR 5, FP64:$src)>;
 def : Pat<(i64 (fp_to_sint FP128:$src)), (CGXBR 5, FP128:$src)>;

+// The FP extension feature provides versions of the above that allow
+// also specifying the inexact-exception suppression flag.
+let Predicates = [FeatureFPExtension], Defs = [CC] in {
+  def CFEBRA : TernaryRRFe<"cfebra", 0xB398, GR32, FP32>;
+  def CFDBRA : TernaryRRFe<"cfdbra", 0xB399, GR32, FP64>;
+  def CFXBRA : TernaryRRFe<"cfxbra", 0xB39A, GR32, FP128>;
+
+  def CGEBRA : TernaryRRFe<"cgebra", 0xB3A8, GR64, FP32>;
+  def CGDBRA : TernaryRRFe<"cgdbra", 0xB3A9, GR64, FP64>;
+  def CGXBRA : TernaryRRFe<"cgxbra", 0xB3AA, GR64, FP128>;
+}
+
 // Convert a floating-point register value to an unsigned integer value.
 let Predicates = [FeatureFPExtension] in {
   let Defs = [CC] in {
-    def CLFEBR : UnaryRRF4<"clfebr", 0xB39C, GR32, FP32>;
-    def CLFDBR : UnaryRRF4<"clfdbr", 0xB39D, GR32, FP64>;
-    def CLFXBR : UnaryRRF4<"clfxbr", 0xB39E, GR32, FP128>;
+    def CLFEBR : TernaryRRFe<"clfebr", 0xB39C, GR32, FP32>;
+    def CLFDBR : TernaryRRFe<"clfdbr", 0xB39D, GR32, FP64>;
+    def CLFXBR : TernaryRRFe<"clfxbr", 0xB39E, GR32, FP128>;

-    def CLGEBR : UnaryRRF4<"clgebr", 0xB3AC, GR64, FP32>;
-    def CLGDBR : UnaryRRF4<"clgdbr", 0xB3AD, GR64, FP64>;
-    def CLGXBR : UnaryRRF4<"clgxbr", 0xB3AE, GR64, FP128>;
+    def CLGEBR : TernaryRRFe<"clgebr", 0xB3AC, GR64, FP32>;
+    def CLGDBR : TernaryRRFe<"clgdbr", 0xB3AD, GR64, FP64>;
+    def CLGXBR : TernaryRRFe<"clgxbr", 0xB3AE, GR64, FP128>;
   }

   def : Pat<(i32 (fp_to_uint FP32:$src)), (CLFEBR 5, FP32:$src, 0)>;
@@ -265,50 +289,50 @@ let Predicates = [FeatureFPExtension] in {

 // Negation (Load Complement).
 let Defs = [CC], CCValues = 0xF, CompareZeroCCMask = 0xF in {
-  def LCEBR : UnaryRRE<"lceb", 0xB303, null_frag, FP32, FP32>;
-  def LCDBR : UnaryRRE<"lcdb", 0xB313, null_frag, FP64, FP64>;
-  def LCXBR : UnaryRRE<"lcxb", 0xB343, fneg, FP128, FP128>;
+  def LCEBR : UnaryRRE<"lcebr", 0xB303, null_frag, FP32, FP32>;
+  def LCDBR : UnaryRRE<"lcdbr", 0xB313, null_frag, FP64, FP64>;
+  def LCXBR : UnaryRRE<"lcxbr", 0xB343, fneg, FP128, FP128>;
 }
 // Generic form, which does not set CC.
-def LCDFR : UnaryRRE<"lcdf", 0xB373, fneg, FP64, FP64>;
+def LCDFR : UnaryRRE<"lcdfr", 0xB373, fneg, FP64, FP64>;
 let isCodeGenOnly = 1 in
-  def LCDFR_32 : UnaryRRE<"lcdf", 0xB373, fneg, FP32, FP32>;
+  def LCDFR_32 : UnaryRRE<"lcdfr", 0xB373, fneg, FP32, FP32>;

 // Absolute value (Load Positive).
 let Defs = [CC], CCValues = 0xF, CompareZeroCCMask = 0xF in {
-  def LPEBR : UnaryRRE<"lpeb", 0xB300, null_frag, FP32, FP32>;
-  def LPDBR : UnaryRRE<"lpdb", 0xB310, null_frag, FP64, FP64>;
-  def LPXBR : UnaryRRE<"lpxb", 0xB340, fabs, FP128, FP128>;
+  def LPEBR : UnaryRRE<"lpebr", 0xB300, null_frag, FP32, FP32>;
+  def LPDBR : UnaryRRE<"lpdbr", 0xB310, null_frag, FP64, FP64>;
+  def LPXBR : UnaryRRE<"lpxbr", 0xB340, fabs, FP128, FP128>;
 }
 // Generic form, which does not set CC.
-def LPDFR : UnaryRRE<"lpdf", 0xB370, fabs, FP64, FP64>;
+def LPDFR : UnaryRRE<"lpdfr", 0xB370, fabs, FP64, FP64>;
 let isCodeGenOnly = 1 in
-  def LPDFR_32 : UnaryRRE<"lpdf", 0xB370, fabs, FP32, FP32>;
+  def LPDFR_32 : UnaryRRE<"lpdfr", 0xB370, fabs, FP32, FP32>;

 // Negative absolute value (Load Negative).
 let Defs = [CC], CCValues = 0xF, CompareZeroCCMask = 0xF in {
-  def LNEBR : UnaryRRE<"lneb", 0xB301, null_frag, FP32, FP32>;
-  def LNDBR : UnaryRRE<"lndb", 0xB311, null_frag, FP64, FP64>;
-  def LNXBR : UnaryRRE<"lnxb", 0xB341, fnabs, FP128, FP128>;
+  def LNEBR : UnaryRRE<"lnebr", 0xB301, null_frag, FP32, FP32>;
+  def LNDBR : UnaryRRE<"lndbr", 0xB311, null_frag, FP64, FP64>;
+  def LNXBR : UnaryRRE<"lnxbr", 0xB341, fnabs, FP128, FP128>;
 }
 // Generic form, which does not set CC.
-def LNDFR : UnaryRRE<"lndf", 0xB371, fnabs, FP64, FP64>;
+def LNDFR : UnaryRRE<"lndfr", 0xB371, fnabs, FP64, FP64>;
 let isCodeGenOnly = 1 in
-  def LNDFR_32 : UnaryRRE<"lndf", 0xB371, fnabs, FP32, FP32>;
+  def LNDFR_32 : UnaryRRE<"lndfr", 0xB371, fnabs, FP32, FP32>;

 // Square root.
-def SQEBR : UnaryRRE<"sqeb", 0xB314, fsqrt, FP32, FP32>;
-def SQDBR : UnaryRRE<"sqdb", 0xB315, fsqrt, FP64, FP64>;
-def SQXBR : UnaryRRE<"sqxb", 0xB316, fsqrt, FP128, FP128>;
+def SQEBR : UnaryRRE<"sqebr", 0xB314, fsqrt, FP32, FP32>;
+def SQDBR : UnaryRRE<"sqdbr", 0xB315, fsqrt, FP64, FP64>;
+def SQXBR : UnaryRRE<"sqxbr", 0xB316, fsqrt, FP128, FP128>;

 def SQEB : UnaryRXE<"sqeb", 0xED14, loadu<fsqrt>, FP32, 4>;
 def SQDB : UnaryRXE<"sqdb", 0xED15, loadu<fsqrt>, FP64, 8>;

 // Round to an integer, with the second operand (modifier M3) specifying
 // the rounding mode. These forms always check for inexact conditions.
-def FIEBR : UnaryRRF<"fieb", 0xB357, FP32, FP32>;
-def FIDBR : UnaryRRF<"fidb", 0xB35F, FP64, FP64>;
-def FIXBR : UnaryRRF<"fixb", 0xB347, FP128, FP128>;
+def FIEBR : BinaryRRFe<"fiebr", 0xB357, FP32, FP32>;
+def FIDBR : BinaryRRFe<"fidbr", 0xB35F, FP64, FP64>;
+def FIXBR : BinaryRRFe<"fixbr", 0xB347, FP128, FP128>;

 // frint rounds according to the current mode (modifier 0) and detects
 // inexact conditions.
@@ -319,9 +343,9 @@ def : Pat<(frint FP128:$src), (FIXBR 0, FP128:$src)>;
 let Predicates = [FeatureFPExtension] in {
   // Extended forms of the FIxBR instructions. M4 can be set to 4
   // to suppress detection of inexact conditions.
-  def FIEBRA : UnaryRRF4<"fiebra", 0xB357, FP32, FP32>;
-  def FIDBRA : UnaryRRF4<"fidbra", 0xB35F, FP64, FP64>;
-  def FIXBRA : UnaryRRF4<"fixbra", 0xB347, FP128, FP128>;
+  def FIEBRA : TernaryRRFe<"fiebra", 0xB357, FP32, FP32>;
+  def FIDBRA : TernaryRRFe<"fidbra", 0xB35F, FP64, FP64>;
+  def FIXBRA : TernaryRRFe<"fixbra", 0xB347, FP128, FP128>;

   // fnearbyint is like frint but does not detect inexact conditions.
   def : Pat<(fnearbyint FP32:$src), (FIEBRA 0, FP32:$src, 4)>;
@@ -347,9 +371,9 @@ let Predicates = [FeatureFPExtension] in {

   // Same idea for round, where mode 1 is round towards nearest with
   // ties away from zero.
-  def : Pat<(frnd FP32:$src), (FIEBRA 1, FP32:$src, 4)>;
-  def : Pat<(frnd FP64:$src), (FIDBRA 1, FP64:$src, 4)>;
-  def : Pat<(frnd FP128:$src), (FIXBRA 1, FP128:$src, 4)>;
+  def : Pat<(fround FP32:$src), (FIEBRA 1, FP32:$src, 4)>;
+  def : Pat<(fround FP64:$src), (FIDBRA 1, FP64:$src, 4)>;
+  def : Pat<(fround FP128:$src), (FIXBRA 1, FP128:$src, 4)>;
 }

 //===----------------------------------------------------------------------===//
@@ -359,9 +383,9 @@ let Predicates = [FeatureFPExtension] in {
 // Addition.
 let Defs = [CC], CCValues = 0xF, CompareZeroCCMask = 0xF in {
   let isCommutable = 1 in {
-    def AEBR : BinaryRRE<"aeb", 0xB30A, fadd, FP32, FP32>;
-    def ADBR : BinaryRRE<"adb", 0xB31A, fadd, FP64, FP64>;
-    def AXBR : BinaryRRE<"axb", 0xB34A, fadd, FP128, FP128>;
+    def AEBR : BinaryRRE<"aebr", 0xB30A, fadd, FP32, FP32>;
+    def ADBR : BinaryRRE<"adbr", 0xB31A, fadd, FP64, FP64>;
+    def AXBR : BinaryRRE<"axbr", 0xB34A, fadd, FP128, FP128>;
   }
   def AEB : BinaryRXE<"aeb", 0xED0A, fadd, FP32, load, 4>;
   def ADB : BinaryRXE<"adb", 0xED1A, fadd, FP64, load, 8>;
@@ -369,9 +393,9 @@ let Defs = [CC], CCValues = 0xF, CompareZeroCCMask = 0xF in {

 // Subtraction.
 let Defs = [CC], CCValues = 0xF, CompareZeroCCMask = 0xF in {
-  def SEBR : BinaryRRE<"seb", 0xB30B, fsub, FP32, FP32>;
-  def SDBR : BinaryRRE<"sdb", 0xB31B, fsub, FP64, FP64>;
-  def SXBR : BinaryRRE<"sxb", 0xB34B, fsub, FP128, FP128>;
+  def SEBR : BinaryRRE<"sebr", 0xB30B, fsub, FP32, FP32>;
+  def SDBR : BinaryRRE<"sdbr", 0xB31B, fsub, FP64, FP64>;
+  def SXBR : BinaryRRE<"sxbr", 0xB34B, fsub, FP128, FP128>;

   def SEB : BinaryRXE<"seb", 0xED0B, fsub, FP32, load, 4>;
   def SDB : BinaryRXE<"sdb", 0xED1B, fsub, FP64, load, 8>;
@@ -379,57 +403,57 @@ let Defs = [CC], CCValues = 0xF, CompareZeroCCMask = 0xF in {

 // Multiplication.
 let isCommutable = 1 in {
-  def MEEBR : BinaryRRE<"meeb", 0xB317, fmul, FP32, FP32>;
-  def MDBR : BinaryRRE<"mdb", 0xB31C, fmul, FP64, FP64>;
-  def MXBR : BinaryRRE<"mxb", 0xB34C, fmul, FP128, FP128>;
+  def MEEBR : BinaryRRE<"meebr", 0xB317, fmul, FP32, FP32>;
+  def MDBR : BinaryRRE<"mdbr", 0xB31C, fmul, FP64, FP64>;
+  def MXBR : BinaryRRE<"mxbr", 0xB34C, fmul, FP128, FP128>;
 }
 def MEEB : BinaryRXE<"meeb", 0xED17, fmul, FP32, load, 4>;
 def MDB : BinaryRXE<"mdb", 0xED1C, fmul, FP64, load, 8>;

 // f64 multiplication of two FP32 registers.
-def MDEBR : BinaryRRE<"mdeb", 0xB30C, null_frag, FP64, FP32>;
-def : Pat<(fmul (f64 (fextend FP32:$src1)), (f64 (fextend FP32:$src2))),
+def MDEBR : BinaryRRE<"mdebr", 0xB30C, null_frag, FP64, FP32>;
+def : Pat<(fmul (f64 (fpextend FP32:$src1)), (f64 (fpextend FP32:$src2))),
           (MDEBR (INSERT_SUBREG (f64 (IMPLICIT_DEF)),
                                 FP32:$src1, subreg_r32), FP32:$src2)>;

 // f64 multiplication of an FP32 register and an f32 memory.
 def MDEB : BinaryRXE<"mdeb", 0xED0C, null_frag, FP64, load, 4>;
-def : Pat<(fmul (f64 (fextend FP32:$src1)),
+def : Pat<(fmul (f64 (fpextend FP32:$src1)),
                 (f64 (extloadf32 bdxaddr12only:$addr))),
           (MDEB (INSERT_SUBREG (f64 (IMPLICIT_DEF)), FP32:$src1, subreg_r32),
                 bdxaddr12only:$addr)>;

 // f128 multiplication of two FP64 registers.
-def MXDBR : BinaryRRE<"mxdb", 0xB307, null_frag, FP128, FP64>;
-def : Pat<(fmul (f128 (fextend FP64:$src1)), (f128 (fextend FP64:$src2))),
+def MXDBR : BinaryRRE<"mxdbr", 0xB307, null_frag, FP128, FP64>;
+def : Pat<(fmul (f128 (fpextend FP64:$src1)), (f128 (fpextend FP64:$src2))),
           (MXDBR (INSERT_SUBREG (f128 (IMPLICIT_DEF)),
                                 FP64:$src1, subreg_h64), FP64:$src2)>;

 // f128 multiplication of an FP64 register and an f64 memory.
 def MXDB : BinaryRXE<"mxdb", 0xED07, null_frag, FP128, load, 8>;
-def : Pat<(fmul (f128 (fextend FP64:$src1)),
+def : Pat<(fmul (f128 (fpextend FP64:$src1)),
                 (f128 (extloadf64 bdxaddr12only:$addr))),
           (MXDB (INSERT_SUBREG (f128 (IMPLICIT_DEF)), FP64:$src1, subreg_h64),
                 bdxaddr12only:$addr)>;

 // Fused multiply-add.
-def MAEBR : TernaryRRD<"maeb", 0xB30E, z_fma, FP32>;
-def MADBR : TernaryRRD<"madb", 0xB31E, z_fma, FP64>;
+def MAEBR : TernaryRRD<"maebr", 0xB30E, z_fma, FP32>;
+def MADBR : TernaryRRD<"madbr", 0xB31E, z_fma, FP64>;

 def MAEB : TernaryRXF<"maeb", 0xED0E, z_fma, FP32, load, 4>;
 def MADB : TernaryRXF<"madb", 0xED1E, z_fma, FP64, load, 8>;

 // Fused multiply-subtract.
-def MSEBR : TernaryRRD<"mseb", 0xB30F, z_fms, FP32>;
-def MSDBR : TernaryRRD<"msdb", 0xB31F, z_fms, FP64>;
+def MSEBR : TernaryRRD<"msebr", 0xB30F, z_fms, FP32>;
+def MSDBR : TernaryRRD<"msdbr", 0xB31F, z_fms, FP64>;

 def MSEB : TernaryRXF<"mseb", 0xED0F, z_fms, FP32, load, 4>;
 def MSDB : TernaryRXF<"msdb", 0xED1F, z_fms, FP64, load, 8>;

 // Division.
-def DEBR : BinaryRRE<"deb", 0xB30D, fdiv, FP32, FP32>;
-def DDBR : BinaryRRE<"ddb", 0xB31D, fdiv, FP64, FP64>;
-def DXBR : BinaryRRE<"dxb", 0xB34D, fdiv, FP128, FP128>;
+def DEBR : BinaryRRE<"debr", 0xB30D, fdiv, FP32, FP32>;
+def DDBR : BinaryRRE<"ddbr", 0xB31D, fdiv, FP64, FP64>;
+def DXBR : BinaryRRE<"dxbr", 0xB34D, fdiv, FP128, FP128>;

 def DEB : BinaryRXE<"deb", 0xED0D, fdiv, FP32, load, 4>;
 def DDB : BinaryRXE<"ddb", 0xED1D, fdiv, FP64, load, 8>;
@@ -439,9 +463,9 @@ def DDB : BinaryRXE<"ddb", 0xED1D, fdiv, FP64, load, 8>;
 //===----------------------------------------------------------------------===//

 let Defs = [CC], CCValues = 0xF in {
-  def CEBR : CompareRRE<"ceb", 0xB309, z_fcmp, FP32, FP32>;
-  def CDBR : CompareRRE<"cdb", 0xB319, z_fcmp, FP64, FP64>;
-  def CXBR : CompareRRE<"cxb", 0xB349, z_fcmp, FP128, FP128>;
+  def CEBR : CompareRRE<"cebr", 0xB309, z_fcmp, FP32, FP32>;
+  def CDBR : CompareRRE<"cdbr", 0xB319, z_fcmp, FP64, FP64>;
+  def CXBR : CompareRRE<"cxbr", 0xB349, z_fcmp, FP128, FP128>;

   def CEB : CompareRXE<"ceb", 0xED09, z_fcmp, FP32, load, 4>;
   def CDB : CompareRXE<"cdb", 0xED19, z_fcmp, FP64, load, 8>;
@@ -455,6 +479,26 @@ let Defs = [CC], CCValues = 0xC in {
 }

 //===----------------------------------------------------------------------===//
+// Floating-point control register instructions
+//===----------------------------------------------------------------------===//
+
+let hasSideEffects = 1 in {
+  def EFPC : InherentRRE<"efpc", 0xB38C, GR32, int_s390_efpc>;
+  def STFPC : StoreInherentS<"stfpc", 0xB29C, storei<int_s390_efpc>, 4>;
+
+  def SFPC : SideEffectUnaryRRE<"sfpc", 0xB384, GR32, int_s390_sfpc>;
+  def LFPC : SideEffectUnaryS<"lfpc", 0xB29D, loadu<int_s390_sfpc>, 4>;
+
+  def SFASR : SideEffectUnaryRRE<"sfasr", 0xB385, GR32, null_frag>;
+  def LFAS : SideEffectUnaryS<"lfas", 0xB2BD, null_frag, 4>;
+
+  def SRNMB : SideEffectAddressS<"srnmb", 0xB2B8, null_frag, shift12only>,
+              Requires<[FeatureFPExtension]>;
+  def SRNM : SideEffectAddressS<"srnm", 0xB299, null_frag, shift12only>;
+  def SRNMT : SideEffectAddressS<"srnmt", 0xB2B9, null_frag, shift12only>;
+}
+
+//===----------------------------------------------------------------------===//
 // Peepholes
 //===----------------------------------------------------------------------===//

diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZInstrFormats.td b/contrib/llvm/lib/Target/SystemZ/SystemZInstrFormats.td
index 973894d..c727f48 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZInstrFormats.td
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZInstrFormats.td
@@ -29,7 +29,7 @@ class InstSystemZ<int size, dag outs, dag ins, string asmstr,
   string DispSize = "none";

   // Many register-based <INSN>R instructions have a memory-based <INSN>
-  // counterpart. OpKey uniquely identifies <INSN>, while OpType is
+  // counterpart. OpKey uniquely identifies <INSN>R, while OpType is
   // "reg" for <INSN>R and "mem" for <INSN>.
   string OpKey = "";
   string OpType = "none";
@@ -158,6 +158,14 @@ def getThreeOperandOpcode : InstrMapping {
 //
 //===----------------------------------------------------------------------===//

+class InstE<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<2, outs, ins, asmstr, pattern> {
+  field bits<16> Inst;
+  field bits<16> SoftFail = 0;
+
+  let Inst = op;
+}
+
 class InstI<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<2, outs, ins, asmstr, pattern> {
   field bits<16> Inst;
@@ -169,7 +177,36 @@ class InstI<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Inst{7-0} = I1;
 }

-class InstRI<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+class InstIE<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<4, outs, ins, asmstr, pattern> {
+  field bits<32> Inst;
+  field bits<32> SoftFail = 0;
+
+  bits<4> I1;
+  bits<4> I2;
+
+  let Inst{31-16} = op;
+  let Inst{15-8} = 0;
+  let Inst{7-4} = I1;
+  let Inst{3-0} = I2;
+}
+
+class InstMII<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<4> M1;
+  bits<12> RI2;
+  bits<24> RI3;
+
+  let Inst{47-40} = op;
+  let Inst{39-36} = M1;
+  let Inst{35-24} = RI2;
+  let Inst{23-0} = RI3;
+}
+
+class InstRIa<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<4, outs, ins, asmstr, pattern> {
   field bits<32> Inst;
   field bits<32> SoftFail = 0;
@@ -183,6 +220,34 @@ class InstRI<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Inst{15-0} = I2;
 }

+class InstRIb<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<4, outs, ins, asmstr, pattern> {
+  field bits<32> Inst;
+  field bits<32> SoftFail = 0;
+
+  bits<4> R1;
+  bits<16> RI2;
+
+  let Inst{31-24} = op{11-4};
+  let Inst{23-20} = R1;
+  let Inst{19-16} = op{3-0};
+  let Inst{15-0} = RI2;
+}
+
+class InstRIc<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<4, outs, ins, asmstr, pattern> {
+  field bits<32> Inst;
+  field bits<32> SoftFail = 0;
+
+  bits<4> M1;
+  bits<16> RI2;
+
+  let Inst{31-24} = op{11-4};
+  let Inst{23-20} = M1;
+  let Inst{19-16} = op{3-0};
+  let Inst{15-0} = RI2;
+}
+
 class InstRIEa<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<6, outs, ins, asmstr, pattern> {
   field bits<48> Inst;
@@ -255,6 +320,23 @@ class InstRIEd<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Inst{7-0} = op{7-0};
 }

+class InstRIEe<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<4> R1;
+  bits<4> R3;
+  bits<16> RI2;
+
+  let Inst{47-40} = op{15-8};
+  let Inst{39-36} = R1;
+  let Inst{35-32} = R3;
+  let Inst{31-16} = RI2;
+  let Inst{15-8} = 0;
+  let Inst{7-0} = op{7-0};
+}
+
 class InstRIEf<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<6, outs, ins, asmstr, pattern> {
   field bits<48> Inst;
@@ -275,7 +357,24 @@ class InstRIEf<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Inst{7-0} = op{7-0};
 }

-class InstRIL<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+class InstRIEg<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<4> R1;
+  bits<4> M3;
+  bits<16> I2;
+
+  let Inst{47-40} = op{15-8};
+  let Inst{39-36} = R1;
+  let Inst{35-32} = M3;
+  let Inst{31-16} = I2;
+  let Inst{15-8} = 0;
+  let Inst{7-0} = op{7-0};
+}
+
+class InstRILa<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<6, outs, ins, asmstr, pattern> {
   field bits<48> Inst;
   field bits<48> SoftFail = 0;
@@ -289,6 +388,34 @@ class InstRIL<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Inst{31-0} = I2;
 }

+class InstRILb<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<4> R1;
+  bits<32> RI2;
+
+  let Inst{47-40} = op{11-4};
+  let Inst{39-36} = R1;
+  let Inst{35-32} = op{3-0};
+  let Inst{31-0} = RI2;
+}
+
+class InstRILc<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<4> M1;
+  bits<32> RI2;
+
+  let Inst{47-40} = op{11-4};
+  let Inst{39-36} = M1;
+  let Inst{35-32} = op{3-0};
+  let Inst{31-0} = RI2;
+}
+
 class InstRIS<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<6, outs, ins, asmstr, pattern> {
   field bits<48> Inst;
@@ -350,7 +477,7 @@ class InstRRE<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Inst{3-0} = R2;
 }

-class InstRRF<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+class InstRRFa<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<4, outs, ins, asmstr, pattern> {
   field bits<32> Inst;
   field bits<32> SoftFail = 0;
@@ -358,11 +485,28 @@ class InstRRF<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   bits<4> R1;
   bits<4> R2;
   bits<4> R3;
-  bits<4> R4;
+  bits<4> M4;

   let Inst{31-16} = op;
   let Inst{15-12} = R3;
-  let Inst{11-8} = R4;
+  let Inst{11-8} = M4;
+  let Inst{7-4} = R1;
+  let Inst{3-0} = R2;
+}
+
+class InstRRFb<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<4, outs, ins, asmstr, pattern> {
+  field bits<32> Inst;
+  field bits<32> SoftFail = 0;
+
+  bits<4> R1;
+  bits<4> R2;
+  bits<4> R3;
+  bits<4> M4;
+
+  let Inst{31-16} = op;
+  let Inst{15-12} = R3;
+  let Inst{11-8} = M4;
   let Inst{7-4} = R1;
   let Inst{3-0} = R2;
 }
@@ -383,6 +527,23 @@ class InstRRFc<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Inst{3-0} = R2;
 }

+class InstRRFe<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<4, outs, ins, asmstr, pattern> {
+  field bits<32> Inst;
+  field bits<32> SoftFail = 0;
+
+  bits<4> R1;
+  bits<4> R2;
+  bits<4> M3;
+  bits<4> M4;
+
+  let Inst{31-16} = op;
+  let Inst{15-12} = M3;
+  let Inst{11-8} = M4;
+  let Inst{7-4} = R1;
+  let Inst{3-0} = R2;
+}
+
 class InstRRS<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<6, outs, ins, asmstr, pattern> {
   field bits<48> Inst;
@@ -402,7 +563,7 @@ class InstRRS<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Inst{7-0} = op{7-0};
 }

-class InstRX<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+class InstRXa<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<4, outs, ins, asmstr, pattern> {
   field bits<32> Inst;
   field bits<32> SoftFail = 0;
@@ -417,6 +578,21 @@ class InstRX<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let HasIndex = 1;
 }

+class InstRXb<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<4, outs, ins, asmstr, pattern> {
+  field bits<32> Inst;
+  field bits<32> SoftFail = 0;
+
+  bits<4> M1;
+  bits<20> XBD2;
+
+  let Inst{31-24} = op;
+  let Inst{23-20} = M1;
+  let Inst{19-0} = XBD2;
+
+  let HasIndex = 1;
+}
+
 class InstRXE<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<6, outs, ins, asmstr, pattern> {
   field bits<48> Inst;
@@ -455,7 +631,7 @@ class InstRXF<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let HasIndex = 1;
 }

-class InstRXY<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+class InstRXYa<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<6, outs, ins, asmstr, pattern> {
   field bits<48> Inst;
   field bits<48> SoftFail = 0;
@@ -472,7 +648,24 @@ class InstRXY<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let HasIndex = 1;
 }

-class InstRS<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+class InstRXYb<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<4> M1;
+  bits<28> XBD2;
+
+  let Inst{47-40} = op{15-8};
+  let Inst{39-36} = M1;
+  let Inst{35-8} = XBD2;
+  let Inst{7-0} = op{7-0};
+
+  let Has20BitOffset = 1;
+  let HasIndex = 1;
+}
+
+class InstRSa<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<4, outs, ins, asmstr, pattern> {
   field bits<32> Inst;
   field bits<32> SoftFail = 0;
@@ -487,7 +680,37 @@ class InstRS<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Inst{15-0} = BD2;
 }

-class InstRSY<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+class InstRSb<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<4, outs, ins, asmstr, pattern> {
+  field bits<32> Inst;
+  field bits<32> SoftFail = 0;
+
+  bits<4> R1;
+  bits<4> M3;
+  bits<16> BD2;
+
+  let Inst{31-24} = op;
+  let Inst{23-20} = R1;
+  let Inst{19-16} = M3;
+  let Inst{15-0} = BD2;
+}
+
+class InstRSI<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<4, outs, ins, asmstr, pattern> {
+  field bits<32> Inst;
+  field bits<32> SoftFail = 0;
+
+  bits<4> R1;
+  bits<4> R3;
+  bits<16> RI2;
+
+  let Inst{31-24} = op;
+  let Inst{23-20} = R1;
+  let Inst{19-16} = R3;
+  let Inst{15-0} = RI2;
+}
+
+class InstRSYa<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<6, outs, ins, asmstr, pattern> {
   field bits<48> Inst;
   field bits<48> SoftFail = 0;
@@ -505,6 +728,24 @@ class InstRSY<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Has20BitOffset = 1;
 }

+class InstRSYb<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<4> R1;
+  bits<4> M3;
+  bits<24> BD2;
+
+  let Inst{47-40} = op{15-8};
+  let Inst{39-36} = R1;
+  let Inst{35-32} = M3;
+  let Inst{31-8} = BD2;
+  let Inst{7-0} = op{7-0};
+
+  let Has20BitOffset = 1;
+}
+
 class InstSI<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<4, outs, ins, asmstr, pattern> {
   field bits<32> Inst;
@@ -547,7 +788,23 @@ class InstSIY<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Has20BitOffset = 1;
 }

-class InstSS<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+class InstSMI<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<4> M1;
+  bits<16> RI2;
+  bits<16> BD3;
+
+  let Inst{47-40} = op;
+  let Inst{39-36} = M1;
+  let Inst{35-32} = 0;
+  let Inst{31-16} = BD3;
+  let Inst{15-0} = RI2;
+}
+
+class InstSSa<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<6, outs, ins, asmstr, pattern> {
   field bits<48> Inst;
   field bits<48> SoftFail = 0;
@@ -560,6 +817,68 @@ class InstSS<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   let Inst{15-0} = BD2;
 }

+class InstSSd<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<20> RBD1;
+  bits<16> BD2;
+  bits<4> R3;
+
+  let Inst{47-40} = op;
+  let Inst{39-36} = RBD1{19-16};
+  let Inst{35-32} = R3;
+  let Inst{31-16} = RBD1{15-0};
+  let Inst{15-0} = BD2;
+}
+
+class InstSSe<bits<8> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<4> R1;
+  bits<16> BD2;
+  bits<4> R3;
+  bits<16> BD4;
+
+  let Inst{47-40} = op;
+  let Inst{39-36} = R1;
+  let Inst{35-32} = R3;
+  let Inst{31-16} = BD2;
+  let Inst{15-0} = BD4;
+}
+
+class InstSSE<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<16> BD1;
+  bits<16> BD2;
+
+  let Inst{47-32} = op;
+  let Inst{31-16} = BD1;
+  let Inst{15-0} = BD2;
+}
+
+class InstSSF<bits<12> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<6, outs, ins, asmstr, pattern> {
+  field bits<48> Inst;
+  field bits<48> SoftFail = 0;
+
+  bits<16> BD1;
+  bits<16> BD2;
+  bits<4> R3;
+
+  let Inst{47-40} = op{11-4};
+  let Inst{39-36} = R3;
+  let Inst{35-32} = op{3-0};
+  let Inst{31-16} = BD1;
+  let Inst{15-0} = BD2;
+}
+
 class InstS<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
   : InstSystemZ<4, outs, ins, asmstr, pattern> {
   field bits<32> Inst;
@@ -948,6 +1267,294 @@ class InstVRX<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
 }

 //===----------------------------------------------------------------------===//
+// Instruction classes for .insn directives
+//===----------------------------------------------------------------------===//
+
+class DirectiveInsnE<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstE<0, outs, ins, asmstr, pattern> {
+  bits<16> enc;
+
+  let Inst = enc;
+}
+
+class DirectiveInsnRI<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRIa<0, outs, ins, asmstr, pattern> {
+  bits<32> enc;
+
+  let Inst{31-24} = enc{31-24};
+  let Inst{19-16} = enc{19-16};
+}
+
+class DirectiveInsnRIE<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRIEd<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{7-0} = enc{7-0};
+}
+
+class DirectiveInsnRIL<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRILa<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+  string type;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{35-32} = enc{35-32};
+}
+
+class DirectiveInsnRIS<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRIS<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{7-0} = enc{7-0};
+}
+
+class DirectiveInsnRR<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRR<0, outs, ins, asmstr, pattern> {
+  bits<16> enc;
+
+  let Inst{15-8} = enc{15-8};
+}
+
+class DirectiveInsnRRE<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRRE<0, outs, ins, asmstr, pattern> {
+  bits<32> enc;
+
+  let Inst{31-16} = enc{31-16};
+}
+
+class DirectiveInsnRRF<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRRFa<0, outs, ins, asmstr, pattern> {
+  bits<32> enc;
+
+  let Inst{31-16} = enc{31-16};
+}
+
+class DirectiveInsnRRS<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRRS<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{7-0} = enc{7-0};
+}
+
+class DirectiveInsnRS<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRSa<0, outs, ins, asmstr, pattern> {
+  bits<32> enc;
+
+  let Inst{31-24} = enc{31-24};
+}
+
+// RSE is like RSY except with a 12 bit displacement (instead of 20).
+class DirectiveInsnRSE<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRSYa<6, outs, ins, asmstr, pattern> {
+  bits <48> enc;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{31-16} = BD2{15-0};
+  let Inst{15-8} = 0;
+  let Inst{7-0} = enc{7-0};
+}
+
+class DirectiveInsnRSI<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRSI<0, outs, ins, asmstr, pattern> {
+  bits<32> enc;
+
+  let Inst{31-24} = enc{31-24};
+}
+
+class DirectiveInsnRSY<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRSYa<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{7-0} = enc{7-0};
+}
+
+class DirectiveInsnRX<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRXa<0, outs, ins, asmstr, pattern> {
+  bits<32> enc;
+
+  let Inst{31-24} = enc{31-24};
+}
+
+class DirectiveInsnRXE<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRXE<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let M3 = 0;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{7-0} = enc{7-0};
+}
+
+class DirectiveInsnRXF<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRXF<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{7-0} = enc{7-0};
+}
+
+class DirectiveInsnRXY<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstRXYa<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{7-0} = enc{7-0};
+}
+
+class DirectiveInsnS<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstS<0, outs, ins, asmstr, pattern> {
+  bits<32> enc;
+
+  let Inst{31-16} = enc{31-16};
+}
+
+class DirectiveInsnSI<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSI<0, outs, ins, asmstr, pattern> {
+  bits<32> enc;
+
+  let Inst{31-24} = enc{31-24};
+}
+
+class DirectiveInsnSIY<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSIY<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{7-0} = enc{7-0};
+}
+
+class DirectiveInsnSIL<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSIL<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-32} = enc{47-32};
+}
+
+class DirectiveInsnSS<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSSd<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-40} = enc{47-40};
+}
+
+class DirectiveInsnSSE<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSSE<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-32} = enc{47-32};
+}
+
+class DirectiveInsnSSF<dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSSF<0, outs, ins, asmstr, pattern> {
+  bits<48> enc;
+
+  let Inst{47-40} = enc{47-40};
+  let Inst{35-32} = enc{35-32};
+}
+
+//===----------------------------------------------------------------------===//
+// Variants of instructions with condition mask
+//===----------------------------------------------------------------------===//
+//
+// For instructions using a condition mask (e.g. conditional branches,
+// compare-and-branch instructions, or conditional move instructions),
+// we generally need to create multiple instruction patterns:
+//
+// - One used for code generation, which encodes the condition mask as an
+//   MI operand, but writes out an extended mnemonic for better readability.
+// - One pattern for the base form of the instruction with an explicit
+//   condition mask (encoded as a plain integer MI operand).
+// - Specific patterns for each extended mnemonic, where the condition mask
+//   is implied by the pattern name and not otherwise encoded at all.
+//
+// We need the latter primarily for the assembler and disassembler, since the
+// assembler parser is not able to decode part of an instruction mnemonic
+// into an operand. Thus we provide separate patterns for each mnemonic.
+//
+// Note that in some cases there are two different mnemonics for the same
+// condition mask.
In this case we cannot have both instructions available +// to the disassembler at the same time since the encodings are not distinct. +// Therefore the alternate forms are marked isAsmParserOnly. +// +// We don't make one of the two names an alias of the other because +// we need the custom parsing routines to select the correct register class. +// +// This section provides helpers for generating the specific forms. +// +//===----------------------------------------------------------------------===// + +// A class to describe a variant of an instruction with condition mask. +class CondVariant<bits<4> ccmaskin, string suffixin, bit alternatein> { + // The fixed condition mask to use. + bits<4> ccmask = ccmaskin; + + // The suffix to use for the extended assembler mnemonic. + string suffix = suffixin; + + // Whether this is an alternate that needs to be marked isAsmParserOnly. + bit alternate = alternatein; +} + +// Condition mask 15 means "always true", which is used to define +// unconditional branches as a variant of conditional branches. +def CondAlways : CondVariant<15, "", 0>; + +// Condition masks for general instructions that can set all 4 bits. 
+def CondVariantO : CondVariant<1, "o", 0>; +def CondVariantH : CondVariant<2, "h", 0>; +def CondVariantP : CondVariant<2, "p", 1>; +def CondVariantNLE : CondVariant<3, "nle", 0>; +def CondVariantL : CondVariant<4, "l", 0>; +def CondVariantM : CondVariant<4, "m", 1>; +def CondVariantNHE : CondVariant<5, "nhe", 0>; +def CondVariantLH : CondVariant<6, "lh", 0>; +def CondVariantNE : CondVariant<7, "ne", 0>; +def CondVariantNZ : CondVariant<7, "nz", 1>; +def CondVariantE : CondVariant<8, "e", 0>; +def CondVariantZ : CondVariant<8, "z", 1>; +def CondVariantNLH : CondVariant<9, "nlh", 0>; +def CondVariantHE : CondVariant<10, "he", 0>; +def CondVariantNL : CondVariant<11, "nl", 0>; +def CondVariantNM : CondVariant<11, "nm", 1>; +def CondVariantLE : CondVariant<12, "le", 0>; +def CondVariantNH : CondVariant<13, "nh", 0>; +def CondVariantNP : CondVariant<13, "np", 1>; +def CondVariantNO : CondVariant<14, "no", 0>; + +// A helper class to look up one of the above by name. +class CV<string name> + : CondVariant<!cast<CondVariant>("CondVariant"#name).ccmask, + !cast<CondVariant>("CondVariant"#name).suffix, + !cast<CondVariant>("CondVariant"#name).alternate>; + +// Condition masks for integer instructions (e.g. compare-and-branch). +// This is like the list above, except that condition 3 is not possible +// and that the low bit of the mask is therefore always 0. This means +// that each condition has two names. Conditions "o" and "no" are not used. 
+def IntCondVariantH : CondVariant<2, "h", 0>; +def IntCondVariantNLE : CondVariant<2, "nle", 1>; +def IntCondVariantL : CondVariant<4, "l", 0>; +def IntCondVariantNHE : CondVariant<4, "nhe", 1>; +def IntCondVariantLH : CondVariant<6, "lh", 0>; +def IntCondVariantNE : CondVariant<6, "ne", 1>; +def IntCondVariantE : CondVariant<8, "e", 0>; +def IntCondVariantNLH : CondVariant<8, "nlh", 1>; +def IntCondVariantHE : CondVariant<10, "he", 0>; +def IntCondVariantNL : CondVariant<10, "nl", 1>; +def IntCondVariantLE : CondVariant<12, "le", 0>; +def IntCondVariantNH : CondVariant<12, "nh", 1>; + +// A helper class to look up one of the above by name. +class ICV<string name> + : CondVariant<!cast<CondVariant>("IntCondVariant"#name).ccmask, + !cast<CondVariant>("IntCondVariant"#name).suffix, + !cast<CondVariant>("IntCondVariant"#name).alternate>; + +//===----------------------------------------------------------------------===// // Instruction definitions with semantics //===----------------------------------------------------------------------===// // @@ -960,11 +1567,32 @@ class InstVRX<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern> // Inherent: // One register output operand and no input operands. // +// StoreInherent: +// One address operand. The instruction stores to the address. +// +// SideEffectInherent: +// No input or output operands, but causes some side effect. +// +// Branch: +// One branch target. The instruction branches to the target. +// +// Call: +// One output operand and one branch target. The instruction stores +// the return address to the output operand and branches to the target. +// +// CmpBranch: +// Two input operands and one optional branch target. The instruction +// compares the two input operands and branches or traps on the result. +// // BranchUnary: -// One register output operand, one register input operand and -// one branch displacement. 
The instructions stores a modified -// form of the source register in the destination register and -// branches on the result. +// One register output operand, one register input operand and one branch +// target. The instructions stores a modified form of the source register +// in the destination register and branches on the result. +// +// BranchBinary: +// One register output operand, two register input operands and one branch +// target. The instructions stores a modified form of one of the source +// registers in the destination register and branches on the result. // // LoadMultiple: // One address input operand and two explicit output operands. @@ -984,6 +1612,12 @@ class InstVRX<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern> // doesn't write more than the number of bytes specified by the // length operand. // +// LoadAddress: +// One register output operand and one address operand. +// +// SideEffectAddress: +// One address operand. No output operands, but causes some side effect. +// // Unary: // One register output operand and one input operand. // @@ -991,6 +1625,9 @@ class InstVRX<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern> // One address operand and one other input operand. The instruction // stores to the address. // +// SideEffectUnary: +// One input operand. No output operands, but causes some side effect. +// // Binary: // One register output operand and two input operands. // @@ -998,6 +1635,9 @@ class InstVRX<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern> // One address operand and two other input operands. The instruction // stores to the address. // +// SideEffectBinary: +// Two input operands. No output operands, but causes some side effect. +// // Compare: // Two input operands and an implicit CC output operand. // @@ -1008,6 +1648,9 @@ class InstVRX<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern> // Ternary: // One register output operand and three input operands. 
// +// SideEffectTernary: +// Three input operands. No output operands, but causes some side effect. +// // Quaternary: // One register output operand and four input operands. // @@ -1027,6 +1670,9 @@ class InstVRX<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern> // One 4-bit immediate operand and one address operand. The immediate // operand is 1 for a load prefetch and 2 for a store prefetch. // +// BranchPreload: +// One 4-bit immediate operand and two address operands. +// // The format determines which input operands are tied to output operands, // and also determines the shape of any address operand. // @@ -1038,10 +1684,10 @@ class InstVRX<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern> //===----------------------------------------------------------------------===// class InherentRRE<string mnemonic, bits<16> opcode, RegisterOperand cls, - dag src> + SDPatternOperator operator> : InstRRE<opcode, (outs cls:$R1), (ins), mnemonic#"\t$R1", - [(set cls:$R1, src)]> { + [(set cls:$R1, (operator))]> { let R2 = 0; } @@ -1051,26 +1697,380 @@ class InherentVRIa<string mnemonic, bits<16> opcode, bits<16> value> let M3 = 0; } +class StoreInherentS<string mnemonic, bits<16> opcode, + SDPatternOperator operator, bits<5> bytes> + : InstS<opcode, (outs), (ins bdaddr12only:$BD2), + mnemonic#"\t$BD2", [(operator bdaddr12only:$BD2)]> { + let mayStore = 1; + let AccessBytes = bytes; +} + +class SideEffectInherentE<string mnemonic, bits<16>opcode> + : InstE<opcode, (outs), (ins), mnemonic, []>; + +class SideEffectInherentS<string mnemonic, bits<16> opcode, + SDPatternOperator operator> + : InstS<opcode, (outs), (ins), mnemonic, [(operator)]> { + let BD2 = 0; +} + +// Allow an optional TLS marker symbol to generate TLS call relocations. 
+class CallRI<string mnemonic, bits<12> opcode> + : InstRIb<opcode, (outs), (ins GR64:$R1, brtarget16tls:$RI2), + mnemonic#"\t$R1, $RI2", []>; + +// Allow an optional TLS marker symbol to generate TLS call relocations. +class CallRIL<string mnemonic, bits<12> opcode> + : InstRILb<opcode, (outs), (ins GR64:$R1, brtarget32tls:$RI2), + mnemonic#"\t$R1, $RI2", []>; + +class CallRR<string mnemonic, bits<8> opcode> + : InstRR<opcode, (outs), (ins GR64:$R1, ADDR64:$R2), + mnemonic#"\t$R1, $R2", []>; + +class CallRX<string mnemonic, bits<8> opcode> + : InstRXa<opcode, (outs), (ins GR64:$R1, bdxaddr12only:$XBD2), + mnemonic#"\t$R1, $XBD2", []>; + +class CondBranchRI<string mnemonic, bits<12> opcode, + SDPatternOperator operator = null_frag> + : InstRIc<opcode, (outs), (ins cond4:$valid, cond4:$M1, brtarget16:$RI2), + !subst("#", "${M1}", mnemonic)#"\t$RI2", + [(operator cond4:$valid, cond4:$M1, bb:$RI2)]> { + let CCMaskFirst = 1; +} + +class AsmCondBranchRI<string mnemonic, bits<12> opcode> + : InstRIc<opcode, (outs), (ins imm32zx4:$M1, brtarget16:$RI2), + mnemonic#"\t$M1, $RI2", []>; + +class FixedCondBranchRI<CondVariant V, string mnemonic, bits<12> opcode, + SDPatternOperator operator = null_frag> + : InstRIc<opcode, (outs), (ins brtarget16:$RI2), + !subst("#", V.suffix, mnemonic)#"\t$RI2", [(operator bb:$RI2)]> { + let isAsmParserOnly = V.alternate; + let M1 = V.ccmask; +} + +class CondBranchRIL<string mnemonic, bits<12> opcode> + : InstRILc<opcode, (outs), (ins cond4:$valid, cond4:$M1, brtarget32:$RI2), + !subst("#", "${M1}", mnemonic)#"\t$RI2", []> { + let CCMaskFirst = 1; +} + +class AsmCondBranchRIL<string mnemonic, bits<12> opcode> + : InstRILc<opcode, (outs), (ins imm32zx4:$M1, brtarget32:$RI2), + mnemonic#"\t$M1, $RI2", []>; + +class FixedCondBranchRIL<CondVariant V, string mnemonic, bits<12> opcode> + : InstRILc<opcode, (outs), (ins brtarget32:$RI2), + !subst("#", V.suffix, mnemonic)#"\t$RI2", []> { + let isAsmParserOnly = V.alternate; + let M1 = V.ccmask; +} + 
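Reviewer note: the `Fixed*` classes above take a `CondVariant` and splice its `suffix` into the mnemonic with `!subst("#", V.suffix, mnemonic)`, while `V.alternate` drives `isAsmParserOnly`. The following is a rough Python model of that expansion, not code from the patch. The table is copied from the `CondVariant*` defs in this diff; the helper names `expand_mnemonic` and `disassembler_names`, and the `"b#r"` template used in the usage note, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, NamedTuple


class CondVariant(NamedTuple):
    ccmask: int      # fixed 4-bit condition mask
    suffix: str      # suffix for the extended assembler mnemonic
    alternate: bool  # True: second name for a mask (isAsmParserOnly in the diff)


# General condition masks, copied from the CondVariant* defs in this diff.
COND_VARIANTS: List[CondVariant] = [
    CondVariant(1, "o", False), CondVariant(2, "h", False),
    CondVariant(2, "p", True), CondVariant(3, "nle", False),
    CondVariant(4, "l", False), CondVariant(4, "m", True),
    CondVariant(5, "nhe", False), CondVariant(6, "lh", False),
    CondVariant(7, "ne", False), CondVariant(7, "nz", True),
    CondVariant(8, "e", False), CondVariant(8, "z", True),
    CondVariant(9, "nlh", False), CondVariant(10, "he", False),
    CondVariant(11, "nl", False), CondVariant(11, "nm", True),
    CondVariant(12, "le", False), CondVariant(13, "nh", False),
    CondVariant(13, "np", True), CondVariant(14, "no", False),
]


def expand_mnemonic(template: str, variant: CondVariant) -> str:
    """Illustrative stand-in for !subst("#", V.suffix, mnemonic)."""
    return template.replace("#", variant.suffix)


def disassembler_names(template: str,
                       variants: List[CondVariant]) -> Dict[int, str]:
    """Map each mask to the single mnemonic left visible to the disassembler.

    Alternates are skipped, mirroring isAsmParserOnly: two names for one
    mask would give the disassembler two indistinguishable encodings.
    """
    names: Dict[int, str] = {}
    for v in variants:
        if v.alternate:
            continue
        # Each mask must have exactly one non-alternate name.
        assert v.ccmask not in names, f"mask {v.ccmask} has two primary names"
        names[v.ccmask] = expand_mnemonic(template, v)
    return names
```

Calling `disassembler_names("b#r", COND_VARIANTS)` yields one entry per mask from 1 to 14. The six alternates (`p`, `m`, `nz`, `z`, `nm`, `np`) remain accepted by the assembler parser only, which matches the rationale given in the "Variants of instructions with condition mask" comment.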
+class CondBranchRR<string mnemonic, bits<8> opcode> + : InstRR<opcode, (outs), (ins cond4:$valid, cond4:$R1, GR64:$R2), + !subst("#", "${R1}", mnemonic)#"\t$R2", []> { + let CCMaskFirst = 1; +} + +class AsmCondBranchRR<string mnemonic, bits<8> opcode> + : InstRR<opcode, (outs), (ins imm32zx4:$R1, GR64:$R2), + mnemonic#"\t$R1, $R2", []>; + +class FixedCondBranchRR<CondVariant V, string mnemonic, bits<8> opcode, + SDPatternOperator operator = null_frag> + : InstRR<opcode, (outs), (ins ADDR64:$R2), + !subst("#", V.suffix, mnemonic)#"\t$R2", [(operator ADDR64:$R2)]> { + let isAsmParserOnly = V.alternate; + let R1 = V.ccmask; +} + +class CondBranchRX<string mnemonic, bits<8> opcode> + : InstRXb<opcode, (outs), (ins cond4:$valid, cond4:$M1, bdxaddr12only:$XBD2), + !subst("#", "${M1}", mnemonic)#"\t$XBD2", []> { + let CCMaskFirst = 1; +} + +class AsmCondBranchRX<string mnemonic, bits<8> opcode> + : InstRXb<opcode, (outs), (ins imm32zx4:$M1, bdxaddr12only:$XBD2), + mnemonic#"\t$M1, $XBD2", []>; + +class FixedCondBranchRX<CondVariant V, string mnemonic, bits<8> opcode> + : InstRXb<opcode, (outs), (ins bdxaddr12only:$XBD2), + !subst("#", V.suffix, mnemonic)#"\t$XBD2", []> { + let isAsmParserOnly = V.alternate; + let M1 = V.ccmask; +} + +class CmpBranchRIEa<string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> + : InstRIEa<opcode, (outs), (ins cls:$R1, imm:$I2, cond4:$M3), + mnemonic#"$M3\t$R1, $I2", []>; + +class AsmCmpBranchRIEa<string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> + : InstRIEa<opcode, (outs), (ins cls:$R1, imm:$I2, imm32zx4:$M3), + mnemonic#"\t$R1, $I2, $M3", []>; + +class FixedCmpBranchRIEa<CondVariant V, string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> + : InstRIEa<opcode, (outs), (ins cls:$R1, imm:$I2), + mnemonic#V.suffix#"\t$R1, $I2", []> { + let isAsmParserOnly = V.alternate; + let M3 = V.ccmask; +} + +multiclass CmpBranchRIEaPair<string mnemonic, bits<16> opcode, + RegisterOperand cls, 
Immediate imm> { + let isCodeGenOnly = 1 in + def "" : CmpBranchRIEa<mnemonic, opcode, cls, imm>; + def Asm : AsmCmpBranchRIEa<mnemonic, opcode, cls, imm>; +} + +class CmpBranchRIEb<string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRIEb<opcode, (outs), + (ins cls:$R1, cls:$R2, cond4:$M3, brtarget16:$RI4), + mnemonic#"$M3\t$R1, $R2, $RI4", []>; + +class AsmCmpBranchRIEb<string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRIEb<opcode, (outs), + (ins cls:$R1, cls:$R2, imm32zx4:$M3, brtarget16:$RI4), + mnemonic#"\t$R1, $R2, $M3, $RI4", []>; + +class FixedCmpBranchRIEb<CondVariant V, string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRIEb<opcode, (outs), (ins cls:$R1, cls:$R2, brtarget16:$RI4), + mnemonic#V.suffix#"\t$R1, $R2, $RI4", []> { + let isAsmParserOnly = V.alternate; + let M3 = V.ccmask; +} + +multiclass CmpBranchRIEbPair<string mnemonic, bits<16> opcode, + RegisterOperand cls> { + let isCodeGenOnly = 1 in + def "" : CmpBranchRIEb<mnemonic, opcode, cls>; + def Asm : AsmCmpBranchRIEb<mnemonic, opcode, cls>; +} + +class CmpBranchRIEc<string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> + : InstRIEc<opcode, (outs), + (ins cls:$R1, imm:$I2, cond4:$M3, brtarget16:$RI4), + mnemonic#"$M3\t$R1, $I2, $RI4", []>; + +class AsmCmpBranchRIEc<string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> + : InstRIEc<opcode, (outs), + (ins cls:$R1, imm:$I2, imm32zx4:$M3, brtarget16:$RI4), + mnemonic#"\t$R1, $I2, $M3, $RI4", []>; + +class FixedCmpBranchRIEc<CondVariant V, string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> + : InstRIEc<opcode, (outs), (ins cls:$R1, imm:$I2, brtarget16:$RI4), + mnemonic#V.suffix#"\t$R1, $I2, $RI4", []> { + let isAsmParserOnly = V.alternate; + let M3 = V.ccmask; +} + +multiclass CmpBranchRIEcPair<string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> { + let isCodeGenOnly = 1 in + def "" : CmpBranchRIEc<mnemonic, opcode, cls, imm>; + def 
Asm : AsmCmpBranchRIEc<mnemonic, opcode, cls, imm>; +} + +class CmpBranchRRFc<string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRRFc<opcode, (outs), (ins cls:$R1, cls:$R2, cond4:$M3), + mnemonic#"$M3\t$R1, $R2", []>; + +class AsmCmpBranchRRFc<string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRRFc<opcode, (outs), (ins cls:$R1, cls:$R2, imm32zx4:$M3), + mnemonic#"\t$R1, $R2, $M3", []>; + +multiclass CmpBranchRRFcPair<string mnemonic, bits<16> opcode, + RegisterOperand cls> { + let isCodeGenOnly = 1 in + def "" : CmpBranchRRFc<mnemonic, opcode, cls>; + def Asm : AsmCmpBranchRRFc<mnemonic, opcode, cls>; +} + +class FixedCmpBranchRRFc<CondVariant V, string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRRFc<opcode, (outs), (ins cls:$R1, cls:$R2), + mnemonic#V.suffix#"\t$R1, $R2", []> { + let isAsmParserOnly = V.alternate; + let M3 = V.ccmask; +} + +class CmpBranchRRS<string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRRS<opcode, (outs), + (ins cls:$R1, cls:$R2, cond4:$M3, bdaddr12only:$BD4), + mnemonic#"$M3\t$R1, $R2, $BD4", []>; + +class AsmCmpBranchRRS<string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRRS<opcode, (outs), + (ins cls:$R1, cls:$R2, imm32zx4:$M3, bdaddr12only:$BD4), + mnemonic#"\t$R1, $R2, $M3, $BD4", []>; + +class FixedCmpBranchRRS<CondVariant V, string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRRS<opcode, (outs), (ins cls:$R1, cls:$R2, bdaddr12only:$BD4), + mnemonic#V.suffix#"\t$R1, $R2, $BD4", []> { + let isAsmParserOnly = V.alternate; + let M3 = V.ccmask; +} + +multiclass CmpBranchRRSPair<string mnemonic, bits<16> opcode, + RegisterOperand cls> { + let isCodeGenOnly = 1 in + def "" : CmpBranchRRS<mnemonic, opcode, cls>; + def Asm : AsmCmpBranchRRS<mnemonic, opcode, cls>; +} + +class CmpBranchRIS<string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> + : InstRIS<opcode, (outs), + (ins cls:$R1, imm:$I2, cond4:$M3, bdaddr12only:$BD4), + 
mnemonic#"$M3\t$R1, $I2, $BD4", []>; + +class AsmCmpBranchRIS<string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> + : InstRIS<opcode, (outs), + (ins cls:$R1, imm:$I2, imm32zx4:$M3, bdaddr12only:$BD4), + mnemonic#"\t$R1, $I2, $M3, $BD4", []>; + +class FixedCmpBranchRIS<CondVariant V, string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> + : InstRIS<opcode, (outs), (ins cls:$R1, imm:$I2, bdaddr12only:$BD4), + mnemonic#V.suffix#"\t$R1, $I2, $BD4", []> { + let isAsmParserOnly = V.alternate; + let M3 = V.ccmask; +} + +multiclass CmpBranchRISPair<string mnemonic, bits<16> opcode, + RegisterOperand cls, Immediate imm> { + let isCodeGenOnly = 1 in + def "" : CmpBranchRIS<mnemonic, opcode, cls, imm>; + def Asm : AsmCmpBranchRIS<mnemonic, opcode, cls, imm>; +} + +class CmpBranchRSYb<string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRSYb<opcode, (outs), (ins cls:$R1, bdaddr20only:$BD2, cond4:$M3), + mnemonic#"$M3\t$R1, $BD2", []>; + +class AsmCmpBranchRSYb<string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRSYb<opcode, (outs), (ins cls:$R1, bdaddr20only:$BD2, imm32zx4:$M3), + mnemonic#"\t$R1, $M3, $BD2", []>; + +multiclass CmpBranchRSYbPair<string mnemonic, bits<16> opcode, + RegisterOperand cls> { + let isCodeGenOnly = 1 in + def "" : CmpBranchRSYb<mnemonic, opcode, cls>; + def Asm : AsmCmpBranchRSYb<mnemonic, opcode, cls>; +} + +class FixedCmpBranchRSYb<CondVariant V, string mnemonic, bits<16> opcode, + RegisterOperand cls> + : InstRSYb<opcode, (outs), (ins cls:$R1, bdaddr20only:$BD2), + mnemonic#V.suffix#"\t$R1, $BD2", []> { + let isAsmParserOnly = V.alternate; + let M3 = V.ccmask; +} + class BranchUnaryRI<string mnemonic, bits<12> opcode, RegisterOperand cls> - : InstRI<opcode, (outs cls:$R1), (ins cls:$R1src, brtarget16:$I2), - mnemonic##"\t$R1, $I2", []> { - let isBranch = 1; - let isTerminator = 1; + : InstRIb<opcode, (outs cls:$R1), (ins cls:$R1src, brtarget16:$RI2), + mnemonic##"\t$R1, $RI2", []> { + 
let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; +} + +class BranchUnaryRIL<string mnemonic, bits<12> opcode, RegisterOperand cls> + : InstRILb<opcode, (outs cls:$R1), (ins cls:$R1src, brtarget32:$RI2), + mnemonic##"\t$R1, $RI2", []> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; +} + +class BranchUnaryRR<string mnemonic, bits<8> opcode, RegisterOperand cls> + : InstRR<opcode, (outs cls:$R1), (ins cls:$R1src, GR64:$R2), + mnemonic##"\t$R1, $R2", []> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; +} + +class BranchUnaryRRE<string mnemonic, bits<16> opcode, RegisterOperand cls> + : InstRRE<opcode, (outs cls:$R1), (ins cls:$R1src, GR64:$R2), + mnemonic##"\t$R1, $R2", []> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; +} + +class BranchUnaryRX<string mnemonic, bits<8> opcode, RegisterOperand cls> + : InstRXa<opcode, (outs cls:$R1), (ins cls:$R1src, bdxaddr12only:$XBD2), + mnemonic##"\t$R1, $XBD2", []> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; +} + +class BranchUnaryRXY<string mnemonic, bits<16> opcode, RegisterOperand cls> + : InstRXYa<opcode, (outs cls:$R1), (ins cls:$R1src, bdxaddr20only:$XBD2), + mnemonic##"\t$R1, $XBD2", []> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; +} + +class BranchBinaryRSI<string mnemonic, bits<8> opcode, RegisterOperand cls> + : InstRSI<opcode, (outs cls:$R1), (ins cls:$R1src, cls:$R3, brtarget16:$RI2), + mnemonic##"\t$R1, $R3, $RI2", []> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; +} + +class BranchBinaryRIEe<string mnemonic, bits<16> opcode, RegisterOperand cls> + : InstRIEe<opcode, (outs cls:$R1), + (ins cls:$R1src, cls:$R3, brtarget16:$RI2), + mnemonic##"\t$R1, $R3, $RI2", []> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; +} + +class BranchBinaryRS<string mnemonic, bits<8> opcode, RegisterOperand cls> + : InstRSa<opcode, (outs 
cls:$R1), + (ins cls:$R1src, cls:$R3, bdaddr12only:$BD2), + mnemonic##"\t$R1, $R3, $BD2", []> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; +} + +class BranchBinaryRSY<string mnemonic, bits<16> opcode, RegisterOperand cls> + : InstRSYa<opcode, + (outs cls:$R1), (ins cls:$R1src, cls:$R3, bdaddr20only:$BD2), + mnemonic##"\t$R1, $R3, $BD2", []> { let Constraints = "$R1 = $R1src"; let DisableEncoding = "$R1src"; } class LoadMultipleRS<string mnemonic, bits<8> opcode, RegisterOperand cls, AddressingMode mode = bdaddr12only> - : InstRS<opcode, (outs cls:$R1, cls:$R3), (ins mode:$BD2), - mnemonic#"\t$R1, $R3, $BD2", []> { + : InstRSa<opcode, (outs cls:$R1, cls:$R3), (ins mode:$BD2), + mnemonic#"\t$R1, $R3, $BD2", []> { let mayLoad = 1; } class LoadMultipleRSY<string mnemonic, bits<16> opcode, RegisterOperand cls, AddressingMode mode = bdaddr20only> - : InstRSY<opcode, (outs cls:$R1, cls:$R3), (ins mode:$BD2), - mnemonic#"\t$R1, $R3, $BD2", []> { + : InstRSYa<opcode, (outs cls:$R1, cls:$R3), (ins mode:$BD2), + mnemonic#"\t$R1, $R3, $BD2", []> { let mayLoad = 1; } @@ -1093,9 +2093,9 @@ class LoadMultipleVRSa<string mnemonic, bits<16> opcode> class StoreRILPC<string mnemonic, bits<12> opcode, SDPatternOperator operator, RegisterOperand cls> - : InstRIL<opcode, (outs), (ins cls:$R1, pcrel32:$I2), - mnemonic#"\t$R1, $I2", - [(operator cls:$R1, pcrel32:$I2)]> { + : InstRILb<opcode, (outs), (ins cls:$R1, pcrel32:$RI2), + mnemonic#"\t$R1, $RI2", + [(operator cls:$R1, pcrel32:$RI2)]> { let mayStore = 1; // We want PC-relative addresses to be tried ahead of BD and BDX addresses. 
// However, BDXs have two extra operands and are therefore 6 units more @@ -1106,10 +2106,10 @@ class StoreRILPC<string mnemonic, bits<12> opcode, SDPatternOperator operator, class StoreRX<string mnemonic, bits<8> opcode, SDPatternOperator operator, RegisterOperand cls, bits<5> bytes, AddressingMode mode = bdxaddr12only> - : InstRX<opcode, (outs), (ins cls:$R1, mode:$XBD2), - mnemonic#"\t$R1, $XBD2", - [(operator cls:$R1, mode:$XBD2)]> { - let OpKey = mnemonic ## cls; + : InstRXa<opcode, (outs), (ins cls:$R1, mode:$XBD2), + mnemonic#"\t$R1, $XBD2", + [(operator cls:$R1, mode:$XBD2)]> { + let OpKey = mnemonic#"r"#cls; let OpType = "mem"; let mayStore = 1; let AccessBytes = bytes; @@ -1118,10 +2118,10 @@ class StoreRX<string mnemonic, bits<8> opcode, SDPatternOperator operator, class StoreRXY<string mnemonic, bits<16> opcode, SDPatternOperator operator, RegisterOperand cls, bits<5> bytes, AddressingMode mode = bdxaddr20only> - : InstRXY<opcode, (outs), (ins cls:$R1, mode:$XBD2), - mnemonic#"\t$R1, $XBD2", - [(operator cls:$R1, mode:$XBD2)]> { - let OpKey = mnemonic ## cls; + : InstRXYa<opcode, (outs), (ins cls:$R1, mode:$XBD2), + mnemonic#"\t$R1, $XBD2", + [(operator cls:$R1, mode:$XBD2)]> { + let OpKey = mnemonic#"r"#cls; let OpType = "mem"; let mayStore = 1; let AccessBytes = bytes; @@ -1161,15 +2161,15 @@ class StoreLengthVRSb<string mnemonic, bits<16> opcode, class StoreMultipleRS<string mnemonic, bits<8> opcode, RegisterOperand cls, AddressingMode mode = bdaddr12only> - : InstRS<opcode, (outs), (ins cls:$R1, cls:$R3, mode:$BD2), - mnemonic#"\t$R1, $R3, $BD2", []> { + : InstRSa<opcode, (outs), (ins cls:$R1, cls:$R3, mode:$BD2), + mnemonic#"\t$R1, $R3, $BD2", []> { let mayStore = 1; } class StoreMultipleRSY<string mnemonic, bits<16> opcode, RegisterOperand cls, AddressingMode mode = bdaddr20only> - : InstRSY<opcode, (outs), (ins cls:$R1, cls:$R3, mode:$BD2), - mnemonic#"\t$R1, $R3, $BD2", []> { + : InstRSYa<opcode, (outs), (ins cls:$R1, cls:$R3, mode:$BD2), + 
mnemonic#"\t$R1, $R3, $BD2", []> { let mayStore = 1; } @@ -1230,12 +2230,17 @@ multiclass StoreSIPair<string mnemonic, bits<8> siOpcode, bits<16> siyOpcode, } } +class StoreSSE<string mnemonic, bits<16> opcode> + : InstSSE<opcode, (outs), (ins bdaddr12only:$BD1, bdaddr12only:$BD2), + mnemonic#"\t$BD1, $BD2", []> { + let mayStore = 1; +} + class CondStoreRSY<string mnemonic, bits<16> opcode, RegisterOperand cls, bits<5> bytes, AddressingMode mode = bdaddr20only> - : InstRSY<opcode, (outs), (ins cls:$R1, mode:$BD2, cond4:$valid, cond4:$R3), - mnemonic#"$R3\t$R1, $BD2", []>, - Requires<[FeatureLoadStoreOnCond]> { + : InstRSYb<opcode, (outs), (ins cls:$R1, mode:$BD2, cond4:$valid, cond4:$M3), + mnemonic#"$M3\t$R1, $BD2", []> { let mayStore = 1; let AccessBytes = bytes; let CCMaskLast = 1; @@ -1246,139 +2251,127 @@ class CondStoreRSY<string mnemonic, bits<16> opcode, class AsmCondStoreRSY<string mnemonic, bits<16> opcode, RegisterOperand cls, bits<5> bytes, AddressingMode mode = bdaddr20only> - : InstRSY<opcode, (outs), (ins cls:$R1, mode:$BD2, imm32zx4:$R3), - mnemonic#"\t$R1, $BD2, $R3", []>, - Requires<[FeatureLoadStoreOnCond]> { + : InstRSYb<opcode, (outs), (ins cls:$R1, mode:$BD2, imm32zx4:$M3), + mnemonic#"\t$R1, $BD2, $M3", []> { let mayStore = 1; let AccessBytes = bytes; } // Like CondStoreRSY, but with a fixed CC mask. 
-class FixedCondStoreRSY<string mnemonic, bits<16> opcode, - RegisterOperand cls, bits<4> ccmask, bits<5> bytes, +class FixedCondStoreRSY<CondVariant V, string mnemonic, bits<16> opcode, + RegisterOperand cls, bits<5> bytes, AddressingMode mode = bdaddr20only> - : InstRSY<opcode, (outs), (ins cls:$R1, mode:$BD2), - mnemonic#"\t$R1, $BD2", []>, - Requires<[FeatureLoadStoreOnCond]> { + : InstRSYb<opcode, (outs), (ins cls:$R1, mode:$BD2), + mnemonic#V.suffix#"\t$R1, $BD2", []> { let mayStore = 1; let AccessBytes = bytes; - let R3 = ccmask; + let isAsmParserOnly = V.alternate; + let M3 = V.ccmask; } -class UnaryRR<string mnemonic, bits<8> opcode, SDPatternOperator operator, - RegisterOperand cls1, RegisterOperand cls2> - : InstRR<opcode, (outs cls1:$R1), (ins cls2:$R2), - mnemonic#"r\t$R1, $R2", - [(set cls1:$R1, (operator cls2:$R2))]> { - let OpKey = mnemonic ## cls1; - let OpType = "reg"; +multiclass CondStoreRSYPair<string mnemonic, bits<16> opcode, + RegisterOperand cls, bits<5> bytes, + AddressingMode mode = bdaddr20only> { + let isCodeGenOnly = 1 in + def "" : CondStoreRSY<mnemonic, opcode, cls, bytes, mode>; + def Asm : AsmCondStoreRSY<mnemonic, opcode, cls, bytes, mode>; } -class UnaryRRE<string mnemonic, bits<16> opcode, SDPatternOperator operator, - RegisterOperand cls1, RegisterOperand cls2> - : InstRRE<opcode, (outs cls1:$R1), (ins cls2:$R2), - mnemonic#"r\t$R1, $R2", - [(set cls1:$R1, (operator cls2:$R2))]> { - let OpKey = mnemonic ## cls1; - let OpType = "reg"; -} +class SideEffectUnaryI<string mnemonic, bits<8> opcode, Immediate imm> + : InstI<opcode, (outs), (ins imm:$I1), + mnemonic#"\t$I1", []>; -class UnaryRRF<string mnemonic, bits<16> opcode, RegisterOperand cls1, - RegisterOperand cls2> - : InstRRF<opcode, (outs cls1:$R1), (ins imm32zx4:$R3, cls2:$R2), - mnemonic#"r\t$R1, $R3, $R2", []> { - let OpKey = mnemonic ## cls1; - let OpType = "reg"; - let R4 = 0; +class SideEffectUnaryRR<string mnemonic, bits<8>opcode, RegisterOperand cls> + : 
InstRR<opcode, (outs), (ins cls:$R1),
+           mnemonic#"\t$R1", []> {
+  let R2 = 0;
 }

-class UnaryRRF4<string mnemonic, bits<16> opcode, RegisterOperand cls1,
-                RegisterOperand cls2>
-  : InstRRF<opcode, (outs cls1:$R1), (ins imm32zx4:$R3, cls2:$R2, imm32zx4:$R4),
-            mnemonic#"\t$R1, $R3, $R2, $R4", []>;
-
-// These instructions are generated by if conversion. The old value of R1
-// is added as an implicit use.
-class CondUnaryRRF<string mnemonic, bits<16> opcode, RegisterOperand cls1,
-                   RegisterOperand cls2>
-  : InstRRF<opcode, (outs cls1:$R1), (ins cls2:$R2, cond4:$valid, cond4:$R3),
-            mnemonic#"r$R3\t$R1, $R2", []>,
-    Requires<[FeatureLoadStoreOnCond]> {
-  let CCMaskLast = 1;
-  let R4 = 0;
+class SideEffectUnaryRRE<string mnemonic, bits<16> opcode, RegisterOperand cls,
+                         SDPatternOperator operator>
+  : InstRRE<opcode, (outs), (ins cls:$R1),
+            mnemonic#"\t$R1", [(operator cls:$R1)]> {
+  let R2 = 0;
 }

-class CondUnaryRIE<string mnemonic, bits<16> opcode, RegisterOperand cls,
-                   Immediate imm>
-  : InstRIEd<opcode, (outs cls:$R1),
-             (ins imm:$I2, cond4:$valid, cond4:$R3),
-             mnemonic#"$R3\t$R1, $I2", []>,
-    Requires<[FeatureLoadStoreOnCond2]> {
-  let CCMaskLast = 1;
+class SideEffectUnaryS<string mnemonic, bits<16> opcode,
+                       SDPatternOperator operator, bits<5> bytes,
+                       AddressingMode mode = bdaddr12only>
+  : InstS<opcode, (outs), (ins mode:$BD2),
+          mnemonic#"\t$BD2", [(operator mode:$BD2)]> {
+  let mayLoad = 1;
+  let AccessBytes = bytes;
 }

-// Like CondUnaryRRF, but used for the raw assembly form. The condition-code
-// mask is the third operand rather than being part of the mnemonic.
-class AsmCondUnaryRRF<string mnemonic, bits<16> opcode, RegisterOperand cls1,
-                      RegisterOperand cls2>
-  : InstRRF<opcode, (outs cls1:$R1), (ins cls1:$R1src, cls2:$R2, imm32zx4:$R3),
-            mnemonic#"r\t$R1, $R2, $R3", []>,
-    Requires<[FeatureLoadStoreOnCond]> {
-  let Constraints = "$R1 = $R1src";
-  let DisableEncoding = "$R1src";
-  let R4 = 0;
-}
+class SideEffectAddressS<string mnemonic, bits<16> opcode,
+                         SDPatternOperator operator,
+                         AddressingMode mode = bdaddr12only>
+  : InstS<opcode, (outs), (ins mode:$BD2),
+          mnemonic#"\t$BD2", [(operator mode:$BD2)]>;

-class AsmCondUnaryRIE<string mnemonic, bits<16> opcode, RegisterOperand cls,
-                      Immediate imm>
-  : InstRIEd<opcode, (outs cls:$R1),
-             (ins cls:$R1src, imm:$I2, imm32zx4:$R3),
-             mnemonic#"\t$R1, $I2, $R3", []>,
-    Requires<[FeatureLoadStoreOnCond2]> {
-  let Constraints = "$R1 = $R1src";
-  let DisableEncoding = "$R1src";
+class LoadAddressRX<string mnemonic, bits<8> opcode,
+                    SDPatternOperator operator, AddressingMode mode>
+  : InstRXa<opcode, (outs GR64:$R1), (ins mode:$XBD2),
+            mnemonic#"\t$R1, $XBD2",
+            [(set GR64:$R1, (operator mode:$XBD2))]>;
+
+class LoadAddressRXY<string mnemonic, bits<16> opcode,
+                     SDPatternOperator operator, AddressingMode mode>
+  : InstRXYa<opcode, (outs GR64:$R1), (ins mode:$XBD2),
+             mnemonic#"\t$R1, $XBD2",
+             [(set GR64:$R1, (operator mode:$XBD2))]>;
+
+multiclass LoadAddressRXPair<string mnemonic, bits<8> rxOpcode,
+                             bits<16> rxyOpcode, SDPatternOperator operator> {
+  let DispKey = mnemonic in {
+    let DispSize = "12" in
+      def "" : LoadAddressRX<mnemonic, rxOpcode, operator, laaddr12pair>;
+    let DispSize = "20" in
+      def Y : LoadAddressRXY<mnemonic#"y", rxyOpcode, operator, laaddr20pair>;
+  }
 }

-// Like CondUnaryRRF, but with a fixed CC mask.
-class FixedCondUnaryRRF<string mnemonic, bits<16> opcode, RegisterOperand cls1,
-                        RegisterOperand cls2, bits<4> ccmask>
-  : InstRRF<opcode, (outs cls1:$R1), (ins cls1:$R1src, cls2:$R2),
-            mnemonic#"\t$R1, $R2", []>,
-    Requires<[FeatureLoadStoreOnCond]> {
-  let Constraints = "$R1 = $R1src";
-  let DisableEncoding = "$R1src";
-  let R3 = ccmask;
-  let R4 = 0;
+class LoadAddressRIL<string mnemonic, bits<12> opcode,
+                     SDPatternOperator operator>
+  : InstRILb<opcode, (outs GR64:$R1), (ins pcrel32:$RI2),
+             mnemonic#"\t$R1, $RI2",
+             [(set GR64:$R1, (operator pcrel32:$RI2))]>;
+
+class UnaryRR<string mnemonic, bits<8> opcode, SDPatternOperator operator,
+              RegisterOperand cls1, RegisterOperand cls2>
+  : InstRR<opcode, (outs cls1:$R1), (ins cls2:$R2),
+           mnemonic#"\t$R1, $R2",
+           [(set cls1:$R1, (operator cls2:$R2))]> {
+  let OpKey = mnemonic#cls1;
+  let OpType = "reg";
 }

-class FixedCondUnaryRIE<string mnemonic, bits<16> opcode, RegisterOperand cls,
-                        Immediate imm, bits<4> ccmask>
-  : InstRIEd<opcode, (outs cls:$R1),
-             (ins cls:$R1src, imm:$I2),
-             mnemonic#"\t$R1, $I2", []>,
-    Requires<[FeatureLoadStoreOnCond2]> {
-  let Constraints = "$R1 = $R1src";
-  let DisableEncoding = "$R1src";
-  let R3 = ccmask;
+class UnaryRRE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
+               RegisterOperand cls1, RegisterOperand cls2>
+  : InstRRE<opcode, (outs cls1:$R1), (ins cls2:$R2),
+            mnemonic#"\t$R1, $R2",
+            [(set cls1:$R1, (operator cls2:$R2))]> {
+  let OpKey = mnemonic#cls1;
+  let OpType = "reg";
 }

 class UnaryRI<string mnemonic, bits<12> opcode, SDPatternOperator operator,
               RegisterOperand cls, Immediate imm>
-  : InstRI<opcode, (outs cls:$R1), (ins imm:$I2),
-           mnemonic#"\t$R1, $I2",
-           [(set cls:$R1, (operator imm:$I2))]>;
+  : InstRIa<opcode, (outs cls:$R1), (ins imm:$I2),
+            mnemonic#"\t$R1, $I2",
+            [(set cls:$R1, (operator imm:$I2))]>;

 class UnaryRIL<string mnemonic, bits<12> opcode, SDPatternOperator operator,
                RegisterOperand cls, Immediate imm>
-  : InstRIL<opcode, (outs cls:$R1), (ins imm:$I2),
-            mnemonic#"\t$R1, $I2",
-            [(set cls:$R1, (operator imm:$I2))]>;
+  : InstRILa<opcode, (outs cls:$R1), (ins imm:$I2),
+             mnemonic#"\t$R1, $I2",
+             [(set cls:$R1, (operator imm:$I2))]>;

 class UnaryRILPC<string mnemonic, bits<12> opcode, SDPatternOperator operator,
                  RegisterOperand cls>
-  : InstRIL<opcode, (outs cls:$R1), (ins pcrel32:$I2),
-            mnemonic#"\t$R1, $I2",
-            [(set cls:$R1, (operator pcrel32:$I2))]> {
+  : InstRILb<opcode, (outs cls:$R1), (ins pcrel32:$RI2),
+             mnemonic#"\t$R1, $RI2",
+             [(set cls:$R1, (operator pcrel32:$RI2))]> {
   let mayLoad = 1;
   // We want PC-relative addresses to be tried ahead of BD and BDX addresses.
   // However, BDXs have two extra operands and are therefore 6 units more
@@ -1389,13 +2382,12 @@ class UnaryRILPC<string mnemonic, bits<12> opcode, SDPatternOperator operator,
 class CondUnaryRSY<string mnemonic, bits<16> opcode,
                    SDPatternOperator operator, RegisterOperand cls,
                    bits<5> bytes, AddressingMode mode = bdaddr20only>
-  : InstRSY<opcode, (outs cls:$R1),
-            (ins cls:$R1src, mode:$BD2, cond4:$valid, cond4:$R3),
-            mnemonic#"$R3\t$R1, $BD2",
-            [(set cls:$R1,
-                  (z_select_ccmask (load bdaddr20only:$BD2), cls:$R1src,
-                                   cond4:$valid, cond4:$R3))]>,
-    Requires<[FeatureLoadStoreOnCond]> {
+  : InstRSYb<opcode, (outs cls:$R1),
+             (ins cls:$R1src, mode:$BD2, cond4:$valid, cond4:$M3),
+             mnemonic#"$M3\t$R1, $BD2",
+             [(set cls:$R1,
+                   (z_select_ccmask (operator bdaddr20only:$BD2), cls:$R1src,
+                                    cond4:$valid, cond4:$M3))]> {
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
   let mayLoad = 1;
@@ -1408,9 +2400,8 @@ class CondUnaryRSY<string mnemonic, bits<16> opcode,
 class AsmCondUnaryRSY<string mnemonic, bits<16> opcode,
                       RegisterOperand cls, bits<5> bytes,
                       AddressingMode mode = bdaddr20only>
-  : InstRSY<opcode, (outs cls:$R1), (ins cls:$R1src, mode:$BD2, imm32zx4:$R3),
-            mnemonic#"\t$R1, $BD2, $R3", []>,
-    Requires<[FeatureLoadStoreOnCond]> {
+  : InstRSYb<opcode, (outs cls:$R1), (ins cls:$R1src, mode:$BD2, imm32zx4:$M3),
+             mnemonic#"\t$R1, $BD2, $M3", []> {
   let mayLoad = 1;
   let AccessBytes = bytes;
   let Constraints = "$R1 = $R1src";
@@ -1418,26 +2409,36 @@ class AsmCondUnaryRSY<string mnemonic, bits<16> opcode,
 }

 // Like CondUnaryRSY, but with a fixed CC mask.
-class FixedCondUnaryRSY<string mnemonic, bits<16> opcode,
-                        RegisterOperand cls, bits<4> ccmask, bits<5> bytes,
+class FixedCondUnaryRSY<CondVariant V, string mnemonic, bits<16> opcode,
+                        RegisterOperand cls, bits<5> bytes,
                         AddressingMode mode = bdaddr20only>
-  : InstRSY<opcode, (outs cls:$R1), (ins cls:$R1src, mode:$BD2),
-            mnemonic#"\t$R1, $BD2", []>,
-    Requires<[FeatureLoadStoreOnCond]> {
+  : InstRSYb<opcode, (outs cls:$R1), (ins cls:$R1src, mode:$BD2),
+             mnemonic#V.suffix#"\t$R1, $BD2", []> {
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
-  let R3 = ccmask;
   let mayLoad = 1;
   let AccessBytes = bytes;
+  let isAsmParserOnly = V.alternate;
+  let M3 = V.ccmask;
 }

+multiclass CondUnaryRSYPair<string mnemonic, bits<16> opcode,
+                            SDPatternOperator operator,
+                            RegisterOperand cls, bits<5> bytes,
+                            AddressingMode mode = bdaddr20only> {
+  let isCodeGenOnly = 1 in
+    def "" : CondUnaryRSY<mnemonic, opcode, operator, cls, bytes, mode>;
+  def Asm : AsmCondUnaryRSY<mnemonic, opcode, cls, bytes, mode>;
+}
+
+
 class UnaryRX<string mnemonic, bits<8> opcode, SDPatternOperator operator,
               RegisterOperand cls, bits<5> bytes,
               AddressingMode mode = bdxaddr12only>
-  : InstRX<opcode, (outs cls:$R1), (ins mode:$XBD2),
-           mnemonic#"\t$R1, $XBD2",
-           [(set cls:$R1, (operator mode:$XBD2))]> {
-  let OpKey = mnemonic ## cls;
+  : InstRXa<opcode, (outs cls:$R1), (ins mode:$XBD2),
+            mnemonic#"\t$R1, $XBD2",
+            [(set cls:$R1, (operator mode:$XBD2))]> {
+  let OpKey = mnemonic#"r"#cls;
   let OpType = "mem";
   let mayLoad = 1;
   let AccessBytes = bytes;
@@ -1448,7 +2449,7 @@ class UnaryRXE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   : InstRXE<opcode, (outs cls:$R1), (ins bdxaddr12only:$XBD2),
             mnemonic#"\t$R1, $XBD2",
             [(set cls:$R1, (operator bdxaddr12only:$XBD2))]> {
-  let OpKey = mnemonic ## cls;
+  let OpKey = mnemonic#"r"#cls;
   let OpType = "mem";
   let mayLoad = 1;
   let AccessBytes = bytes;
@@ -1458,10 +2459,10 @@ class UnaryRXE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
 class UnaryRXY<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                RegisterOperand cls, bits<5> bytes,
                AddressingMode mode = bdxaddr20only>
-  : InstRXY<opcode, (outs cls:$R1), (ins mode:$XBD2),
-            mnemonic#"\t$R1, $XBD2",
-            [(set cls:$R1, (operator mode:$XBD2))]> {
-  let OpKey = mnemonic ## cls;
+  : InstRXYa<opcode, (outs cls:$R1), (ins mode:$XBD2),
+             mnemonic#"\t$R1, $XBD2",
+             [(set cls:$R1, (operator mode:$XBD2))]> {
+  let OpKey = mnemonic#"r"#cls;
   let OpType = "mem";
   let mayLoad = 1;
   let AccessBytes = bytes;
@@ -1487,6 +2488,10 @@ class UnaryVRIa<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M3 = type;
 }

+class UnaryVRIaGeneric<string mnemonic, bits<16> opcode, Immediate imm>
+  : InstVRIa<opcode, (outs VR128:$V1), (ins imm:$I2, imm32zx4:$M3),
+             mnemonic#"\t$V1, $I2, $M3", []>;
+
 class UnaryVRRa<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                 TypedReg tr1, TypedReg tr2, bits<4> type = 0, bits<4> m4 = 0,
                 bits<4> m5 = 0>
@@ -1498,15 +2503,50 @@ class UnaryVRRa<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M5 = m5;
 }

-multiclass UnaryVRRaSPair<string mnemonic, bits<16> opcode,
-                          SDPatternOperator operator,
-                          SDPatternOperator operator_cc, TypedReg tr1,
-                          TypedReg tr2, bits<4> type, bits<4> modifier = 0,
-                          bits<4> modifier_cc = 1> {
-  def "" : UnaryVRRa<mnemonic, opcode, operator, tr1, tr2, type, 0, modifier>;
+class UnaryVRRaGeneric<string mnemonic, bits<16> opcode, bits<4> m4 = 0,
+                       bits<4> m5 = 0>
+  : InstVRRa<opcode, (outs VR128:$V1), (ins VR128:$V2, imm32zx4:$M3),
+             mnemonic#"\t$V1, $V2, $M3", []> {
+  let M4 = m4;
+  let M5 = m5;
+}
+
+class UnaryVRRaFloatGeneric<string mnemonic, bits<16> opcode, bits<4> m5 = 0>
+  : InstVRRa<opcode, (outs VR128:$V1),
+             (ins VR128:$V2, imm32zx4:$M3, imm32zx4:$M4),
+             mnemonic#"\t$V1, $V2, $M3, $M4", []> {
+  let M5 = m5;
+}
+
+// Declare a pair of instructions, one which sets CC and one which doesn't.
+// The CC-setting form ends with "S" and sets the low bit of M5.
+// The form that does not set CC has an extra operand to optionally allow
+// specifying arbitrary M5 values in assembler.
+multiclass UnaryExtraVRRaSPair<string mnemonic, bits<16> opcode,
+                               SDPatternOperator operator,
+                               SDPatternOperator operator_cc,
+                               TypedReg tr1, TypedReg tr2, bits<4> type> {
+  let M3 = type, M4 = 0 in
+    def "" : InstVRRa<opcode, (outs tr1.op:$V1),
+                      (ins tr2.op:$V2, imm32zx4:$M5),
+                      mnemonic#"\t$V1, $V2, $M5", []>;
+  def : Pat<(tr1.vt (operator (tr2.vt tr2.op:$V2))),
+            (!cast<Instruction>(NAME) tr2.op:$V2, 0)>;
+  def : InstAlias<mnemonic#"\t$V1, $V2",
+                  (!cast<Instruction>(NAME) tr1.op:$V1, tr2.op:$V2, 0)>;
   let Defs = [CC] in
-    def S : UnaryVRRa<mnemonic##"s", opcode, operator_cc, tr1, tr2, type, 0,
-                      modifier_cc>;
+    def S : UnaryVRRa<mnemonic##"s", opcode, operator_cc, tr1, tr2,
+                      type, 0, 1>;
+}
+
+multiclass UnaryExtraVRRaSPairGeneric<string mnemonic, bits<16> opcode> {
+  let M4 = 0 in
+    def "" : InstVRRa<opcode, (outs VR128:$V1),
+                      (ins VR128:$V2, imm32zx4:$M3, imm32zx4:$M5),
+                      mnemonic#"\t$V1, $V2, $M3, $M5", []>;
+  def : InstAlias<mnemonic#"\t$V1, $V2, $M3",
+                  (!cast<Instruction>(NAME) VR128:$V1, VR128:$V2,
+                                            imm32zx4:$M3, 0)>;
 }

 class UnaryVRX<string mnemonic, bits<16> opcode, SDPatternOperator operator,
@@ -1519,12 +2559,43 @@ class UnaryVRX<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let AccessBytes = bytes;
 }

+class UnaryVRXGeneric<string mnemonic, bits<16> opcode>
+  : InstVRX<opcode, (outs VR128:$V1), (ins bdxaddr12only:$XBD2, imm32zx4:$M3),
+            mnemonic#"\t$V1, $XBD2, $M3", []> {
+  let mayLoad = 1;
+}
+
+class SideEffectBinaryRX<string mnemonic, bits<8> opcode,
+                         RegisterOperand cls>
+  : InstRXa<opcode, (outs), (ins cls:$R1, bdxaddr12only:$XBD2),
+            mnemonic##"\t$R1, $XBD2", []>;
+
+class SideEffectBinaryRILPC<string mnemonic, bits<12> opcode,
+                            RegisterOperand cls>
+  : InstRILb<opcode, (outs), (ins cls:$R1, pcrel32:$RI2),
+             mnemonic##"\t$R1, $RI2", []> {
+  // We want PC-relative addresses to be tried ahead of BD and BDX addresses.
+  // However, BDXs have two extra operands and are therefore 6 units more
+  // complex.
+  let AddedComplexity = 7;
+}
+
+class SideEffectBinaryIE<string mnemonic, bits<16> opcode,
+                         Immediate imm1, Immediate imm2>
+  : InstIE<opcode, (outs), (ins imm1:$I1, imm2:$I2),
+           mnemonic#"\t$I1, $I2", []>;
+
+class SideEffectBinarySIL<string mnemonic, bits<16> opcode,
+                          SDPatternOperator operator, Immediate imm>
+  : InstSIL<opcode, (outs), (ins bdaddr12only:$BD1, imm:$I2),
+            mnemonic#"\t$BD1, $I2", [(operator bdaddr12only:$BD1, imm:$I2)]>;
+
 class BinaryRR<string mnemonic, bits<8> opcode, SDPatternOperator operator,
                RegisterOperand cls1, RegisterOperand cls2>
   : InstRR<opcode, (outs cls1:$R1), (ins cls1:$R1src, cls2:$R2),
-           mnemonic#"r\t$R1, $R2",
+           mnemonic#"\t$R1, $R2",
            [(set cls1:$R1, (operator cls1:$R1src, cls2:$R2))]> {
-  let OpKey = mnemonic ## cls1;
+  let OpKey = mnemonic#cls1;
   let OpType = "reg";
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
@@ -1533,30 +2604,21 @@ class BinaryRR<string mnemonic, bits<8> opcode, SDPatternOperator operator,
 class BinaryRRE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                 RegisterOperand cls1, RegisterOperand cls2>
   : InstRRE<opcode, (outs cls1:$R1), (ins cls1:$R1src, cls2:$R2),
-            mnemonic#"r\t$R1, $R2",
+            mnemonic#"\t$R1, $R2",
             [(set cls1:$R1, (operator cls1:$R1src, cls2:$R2))]> {
-  let OpKey = mnemonic ## cls1;
+  let OpKey = mnemonic#cls1;
   let OpType = "reg";
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
 }

-class BinaryRRF<string mnemonic, bits<16> opcode, SDPatternOperator operator,
-                RegisterOperand cls1, RegisterOperand cls2>
-  : InstRRF<opcode, (outs cls1:$R1), (ins cls1:$R2, cls2:$R3),
-            mnemonic#"r\t$R1, $R3, $R2",
-            [(set cls1:$R1, (operator cls1:$R2, cls2:$R3))]> {
-  let OpKey = mnemonic ## cls1;
-  let OpType = "reg";
-  let R4 = 0;
-}
-
-class BinaryRRFK<string mnemonic, bits<16> opcode, SDPatternOperator operator,
-                 RegisterOperand cls1, RegisterOperand cls2>
-  : InstRRF<opcode, (outs cls1:$R1), (ins cls1:$R2, cls2:$R3),
-            mnemonic#"rk\t$R1, $R2, $R3",
-            [(set cls1:$R1, (operator cls1:$R2, cls2:$R3))]> {
-  let R4 = 0;
+class BinaryRRFa<string mnemonic, bits<16> opcode, SDPatternOperator operator,
+                 RegisterOperand cls1, RegisterOperand cls2,
+                 RegisterOperand cls3>
+  : InstRRFa<opcode, (outs cls1:$R1), (ins cls2:$R2, cls3:$R3),
+             mnemonic#"\t$R1, $R2, $R3",
+             [(set cls1:$R1, (operator cls2:$R2, cls3:$R3))]> {
+  let M4 = 0;
 }

 multiclass BinaryRRAndK<string mnemonic, bits<8> opcode1, bits<16> opcode2,
@@ -1564,7 +2626,7 @@ multiclass BinaryRRAndK<string mnemonic, bits<8> opcode1, bits<16> opcode2,
                         RegisterOperand cls2> {
   let NumOpsKey = mnemonic in {
     let NumOpsValue = "3" in
-      def K : BinaryRRFK<mnemonic, opcode2, null_frag, cls1, cls2>,
+      def K : BinaryRRFa<mnemonic#"k", opcode2, null_frag, cls1, cls1, cls2>,
               Requires<[FeatureDistinctOps]>;
     let NumOpsValue = "2", isConvertibleToThreeAddress = 1 in
       def "" : BinaryRR<mnemonic, opcode1, operator, cls1, cls2>;
@@ -1576,18 +2638,73 @@ multiclass BinaryRREAndK<string mnemonic, bits<16> opcode1, bits<16> opcode2,
                          RegisterOperand cls2> {
   let NumOpsKey = mnemonic in {
     let NumOpsValue = "3" in
-      def K : BinaryRRFK<mnemonic, opcode2, null_frag, cls1, cls2>,
+      def K : BinaryRRFa<mnemonic#"k", opcode2, null_frag, cls1, cls1, cls2>,
               Requires<[FeatureDistinctOps]>;
     let NumOpsValue = "2", isConvertibleToThreeAddress = 1 in
       def "" : BinaryRRE<mnemonic, opcode1, operator, cls1, cls2>;
   }
 }

+class BinaryRRFb<string mnemonic, bits<16> opcode, SDPatternOperator operator,
+                 RegisterOperand cls1, RegisterOperand cls2,
+                 RegisterOperand cls3>
+  : InstRRFb<opcode, (outs cls1:$R1), (ins cls2:$R2, cls3:$R3),
+             mnemonic#"\t$R1, $R3, $R2",
+             [(set cls1:$R1, (operator cls2:$R2, cls3:$R3))]> {
+  let M4 = 0;
+}
+
+class BinaryRRFe<string mnemonic, bits<16> opcode, RegisterOperand cls1,
+                 RegisterOperand cls2>
+  : InstRRFe<opcode, (outs cls1:$R1), (ins imm32zx4:$M3, cls2:$R2),
+             mnemonic#"\t$R1, $M3, $R2", []> {
+  let M4 = 0;
+}
+
+class CondBinaryRRF<string mnemonic, bits<16> opcode, RegisterOperand cls1,
+                    RegisterOperand cls2>
+  : InstRRFc<opcode, (outs cls1:$R1),
+             (ins cls1:$R1src, cls2:$R2, cond4:$valid, cond4:$M3),
+             mnemonic#"$M3\t$R1, $R2", []> {
+  let Constraints = "$R1 = $R1src";
+  let DisableEncoding = "$R1src";
+  let CCMaskLast = 1;
+}
+
+// Like CondBinaryRRF, but used for the raw assembly form. The condition-code
+// mask is the third operand rather than being part of the mnemonic.
+class AsmCondBinaryRRF<string mnemonic, bits<16> opcode, RegisterOperand cls1,
+                       RegisterOperand cls2>
+  : InstRRFc<opcode, (outs cls1:$R1),
+             (ins cls1:$R1src, cls2:$R2, imm32zx4:$M3),
+             mnemonic#"\t$R1, $R2, $M3", []> {
+  let Constraints = "$R1 = $R1src";
+  let DisableEncoding = "$R1src";
+}
+
+// Like CondBinaryRRF, but with a fixed CC mask.
+class FixedCondBinaryRRF<CondVariant V, string mnemonic, bits<16> opcode,
+                         RegisterOperand cls1, RegisterOperand cls2>
+  : InstRRFc<opcode, (outs cls1:$R1), (ins cls1:$R1src, cls2:$R2),
+             mnemonic#V.suffix#"\t$R1, $R2", []> {
+  let Constraints = "$R1 = $R1src";
+  let DisableEncoding = "$R1src";
+  let isAsmParserOnly = V.alternate;
+  let M3 = V.ccmask;
+}
+
+multiclass CondBinaryRRFPair<string mnemonic, bits<16> opcode,
+                             RegisterOperand cls1, RegisterOperand cls2> {
+  let isCodeGenOnly = 1 in
+    def "" : CondBinaryRRF<mnemonic, opcode, cls1, cls2>;
+  def Asm : AsmCondBinaryRRF<mnemonic, opcode, cls1, cls2>;
+}
+
 class BinaryRI<string mnemonic, bits<12> opcode, SDPatternOperator operator,
                RegisterOperand cls, Immediate imm>
-  : InstRI<opcode, (outs cls:$R1), (ins cls:$R1src, imm:$I2),
-           mnemonic#"\t$R1, $I2",
-           [(set cls:$R1, (operator cls:$R1src, imm:$I2))]> {
+  : InstRIa<opcode, (outs cls:$R1), (ins cls:$R1src, imm:$I2),
+            mnemonic#"\t$R1, $I2",
+            [(set cls:$R1, (operator cls:$R1src, imm:$I2))]> {
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
 }
@@ -1610,20 +2727,61 @@ multiclass BinaryRIAndK<string mnemonic, bits<12> opcode1, bits<16> opcode2,
   }
 }

+class CondBinaryRIE<string mnemonic, bits<16> opcode, RegisterOperand cls,
+                    Immediate imm>
+  : InstRIEg<opcode, (outs cls:$R1),
+             (ins cls:$R1src, imm:$I2, cond4:$valid, cond4:$M3),
+             mnemonic#"$M3\t$R1, $I2",
+             [(set cls:$R1, (z_select_ccmask imm:$I2, cls:$R1src,
+                                             cond4:$valid, cond4:$M3))]> {
+  let Constraints = "$R1 = $R1src";
+  let DisableEncoding = "$R1src";
+  let CCMaskLast = 1;
+}
+
+// Like CondBinaryRIE, but used for the raw assembly form. The condition-code
+// mask is the third operand rather than being part of the mnemonic.
+class AsmCondBinaryRIE<string mnemonic, bits<16> opcode, RegisterOperand cls,
+                       Immediate imm>
+  : InstRIEg<opcode, (outs cls:$R1),
+             (ins cls:$R1src, imm:$I2, imm32zx4:$M3),
+             mnemonic#"\t$R1, $I2, $M3", []> {
+  let Constraints = "$R1 = $R1src";
+  let DisableEncoding = "$R1src";
+}
+
+// Like CondBinaryRIE, but with a fixed CC mask.
+class FixedCondBinaryRIE<CondVariant V, string mnemonic, bits<16> opcode,
+                         RegisterOperand cls, Immediate imm>
+  : InstRIEg<opcode, (outs cls:$R1), (ins cls:$R1src, imm:$I2),
+             mnemonic#V.suffix#"\t$R1, $I2", []> {
+  let Constraints = "$R1 = $R1src";
+  let DisableEncoding = "$R1src";
+  let isAsmParserOnly = V.alternate;
+  let M3 = V.ccmask;
+}
+
+multiclass CondBinaryRIEPair<string mnemonic, bits<16> opcode,
+                             RegisterOperand cls, Immediate imm> {
+  let isCodeGenOnly = 1 in
+    def "" : CondBinaryRIE<mnemonic, opcode, cls, imm>;
+  def Asm : AsmCondBinaryRIE<mnemonic, opcode, cls, imm>;
+}
+
 class BinaryRIL<string mnemonic, bits<12> opcode, SDPatternOperator operator,
                 RegisterOperand cls, Immediate imm>
-  : InstRIL<opcode, (outs cls:$R1), (ins cls:$R1src, imm:$I2),
-            mnemonic#"\t$R1, $I2",
-            [(set cls:$R1, (operator cls:$R1src, imm:$I2))]> {
+  : InstRILa<opcode, (outs cls:$R1), (ins cls:$R1src, imm:$I2),
+             mnemonic#"\t$R1, $I2",
+             [(set cls:$R1, (operator cls:$R1src, imm:$I2))]> {
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
 }

 class BinaryRS<string mnemonic, bits<8> opcode, SDPatternOperator operator,
                RegisterOperand cls>
-  : InstRS<opcode, (outs cls:$R1), (ins cls:$R1src, shift12only:$BD2),
-           mnemonic#"\t$R1, $BD2",
-           [(set cls:$R1, (operator cls:$R1src, shift12only:$BD2))]> {
+  : InstRSa<opcode, (outs cls:$R1), (ins cls:$R1src, shift12only:$BD2),
+            mnemonic#"\t$R1, $BD2",
+            [(set cls:$R1, (operator cls:$R1src, shift12only:$BD2))]> {
   let R3 = 0;
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
@@ -1631,9 +2789,9 @@ class BinaryRS<string mnemonic, bits<8> opcode, SDPatternOperator operator,

 class BinaryRSY<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                 RegisterOperand cls>
-  : InstRSY<opcode, (outs cls:$R1), (ins cls:$R3, shift20only:$BD2),
-            mnemonic#"\t$R1, $R3, $BD2",
-            [(set cls:$R1, (operator cls:$R3, shift20only:$BD2))]>;
+  : InstRSYa<opcode, (outs cls:$R1), (ins cls:$R3, shift20only:$BD2),
+             mnemonic#"\t$R1, $R3, $BD2",
+             [(set cls:$R1, (operator cls:$R3, shift20only:$BD2))]>;

 multiclass BinaryRSAndK<string mnemonic, bits<8> opcode1, bits<16> opcode2,
                         SDPatternOperator operator, RegisterOperand cls> {
@@ -1649,10 +2807,10 @@ multiclass BinaryRSAndK<string mnemonic, bits<8> opcode1, bits<16> opcode2,
 class BinaryRX<string mnemonic, bits<8> opcode, SDPatternOperator operator,
                RegisterOperand cls, SDPatternOperator load, bits<5> bytes,
                AddressingMode mode = bdxaddr12only>
-  : InstRX<opcode, (outs cls:$R1), (ins cls:$R1src, mode:$XBD2),
-           mnemonic#"\t$R1, $XBD2",
-           [(set cls:$R1, (operator cls:$R1src, (load mode:$XBD2)))]> {
-  let OpKey = mnemonic ## cls;
+  : InstRXa<opcode, (outs cls:$R1), (ins cls:$R1src, mode:$XBD2),
+            mnemonic#"\t$R1, $XBD2",
+            [(set cls:$R1, (operator cls:$R1src, (load mode:$XBD2)))]> {
+  let OpKey = mnemonic#"r"#cls;
   let OpType = "mem";
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
@@ -1666,7 +2824,7 @@ class BinaryRXE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
             mnemonic#"\t$R1, $XBD2",
             [(set cls:$R1, (operator cls:$R1src,
                                      (load bdxaddr12only:$XBD2)))]> {
-  let OpKey = mnemonic ## cls;
+  let OpKey = mnemonic#"r"#cls;
   let OpType = "mem";
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
@@ -1678,10 +2836,10 @@ class BinaryRXE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
 class BinaryRXY<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                 RegisterOperand cls, SDPatternOperator load, bits<5> bytes,
                 AddressingMode mode = bdxaddr20only>
-  : InstRXY<opcode, (outs cls:$R1), (ins cls:$R1src, mode:$XBD2),
-            mnemonic#"\t$R1, $XBD2",
-            [(set cls:$R1, (operator cls:$R1src, (load mode:$XBD2)))]> {
-  let OpKey = mnemonic ## cls;
+  : InstRXYa<opcode, (outs cls:$R1), (ins cls:$R1src, mode:$XBD2),
+             mnemonic#"\t$R1, $XBD2",
+             [(set cls:$R1, (operator cls:$R1src, (load mode:$XBD2)))]> {
+  let OpKey = mnemonic#"r"#cls;
   let OpType = "mem";
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
@@ -1731,6 +2889,12 @@ multiclass BinarySIPair<string mnemonic, bits<8> siOpcode,
   }
 }

+class BinarySSF<string mnemonic, bits<12> opcode, RegisterOperand cls>
+  : InstSSF<opcode, (outs cls:$R3), (ins bdaddr12pair:$BD1, bdaddr12pair:$BD2),
+            mnemonic#"\t$R3, $BD1, $BD2", []> {
+  let mayLoad = 1;
+}
+
 class BinaryVRIb<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                  TypedReg tr, bits<4> type>
   : InstVRIb<opcode, (outs tr.op:$V1), (ins imm32zx8:$I2, imm32zx8:$I3),
@@ -1739,6 +2903,11 @@ class BinaryVRIb<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M4 = type;
 }

+class BinaryVRIbGeneric<string mnemonic, bits<16> opcode>
+  : InstVRIb<opcode, (outs VR128:$V1),
+             (ins imm32zx8:$I2, imm32zx8:$I3, imm32zx4:$M4),
+             mnemonic#"\t$V1, $I2, $I3, $M4", []>;
+
 class BinaryVRIc<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                  TypedReg tr1, TypedReg tr2, bits<4> type>
   : InstVRIc<opcode, (outs tr1.op:$V1), (ins tr2.op:$V3, imm32zx16:$I2),
@@ -1748,6 +2917,11 @@ class BinaryVRIc<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M4 = type;
 }

+class BinaryVRIcGeneric<string mnemonic, bits<16> opcode>
+  : InstVRIc<opcode, (outs VR128:$V1),
+             (ins VR128:$V3, imm32zx16:$I2, imm32zx4:$M4),
+             mnemonic#"\t$V1, $V3, $I2, $M4", []>;
+
 class BinaryVRIe<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                  TypedReg tr1, TypedReg tr2, bits<4> type, bits<4> m5>
   : InstVRIe<opcode, (outs tr1.op:$V1), (ins tr2.op:$V2, imm32zx12:$I3),
@@ -1758,13 +2932,26 @@ class BinaryVRIe<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M5 = m5;
 }

-class BinaryVRRa<string mnemonic, bits<16> opcode>
-  : InstVRRa<opcode, (outs VR128:$V1), (ins VR128:$V2, imm32zx4:$M3),
-             mnemonic#"\t$V1, $V2, $M3", []> {
-  let M4 = 0;
-  let M5 = 0;
+class BinaryVRIeFloatGeneric<string mnemonic, bits<16> opcode>
+  : InstVRIe<opcode, (outs VR128:$V1),
+             (ins VR128:$V2, imm32zx12:$I3, imm32zx4:$M4, imm32zx4:$M5),
+             mnemonic#"\t$V1, $V2, $I3, $M4, $M5", []>;
+
+class BinaryVRRa<string mnemonic, bits<16> opcode, SDPatternOperator operator,
+                 TypedReg tr1, TypedReg tr2, bits<4> type = 0, bits<4> m4 = 0>
+  : InstVRRa<opcode, (outs tr1.op:$V1), (ins tr2.op:$V2, imm32zx4:$M5),
+             mnemonic#"\t$V1, $V2, $M5",
+             [(set tr1.op:$V1, (tr1.vt (operator (tr2.vt tr2.op:$V2),
+                                                 imm32zx12:$M5)))]> {
+  let M3 = type;
+  let M4 = m4;
 }

+class BinaryVRRaFloatGeneric<string mnemonic, bits<16> opcode>
+  : InstVRRa<opcode, (outs VR128:$V1),
+             (ins VR128:$V2, imm32zx4:$M3, imm32zx4:$M4, imm32zx4:$M5),
+             mnemonic#"\t$V1, $V2, $M3, $M4, $M5", []>;
+
 class BinaryVRRb<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                  TypedReg tr1, TypedReg tr2, bits<4> type = 0,
                  bits<4> modifier = 0>
@@ -1781,12 +2968,47 @@ class BinaryVRRb<string mnemonic, bits<16> opcode, SDPatternOperator operator,
 multiclass BinaryVRRbSPair<string mnemonic, bits<16> opcode,
                            SDPatternOperator operator,
                            SDPatternOperator operator_cc, TypedReg tr1,
-                           TypedReg tr2, bits<4> type,
-                           bits<4> modifier = 0, bits<4> modifier_cc = 1> {
-  def "" : BinaryVRRb<mnemonic, opcode, operator, tr1, tr2, type, modifier>;
+                           TypedReg tr2, bits<4> type, bits<4> modifier = 0> {
+  def "" : BinaryVRRb<mnemonic, opcode, operator, tr1, tr2, type,
+                      !and (modifier, 14)>;
   let Defs = [CC] in
     def S : BinaryVRRb<mnemonic##"s", opcode, operator_cc, tr1, tr2, type,
-                       modifier_cc>;
+                       !add (!and (modifier, 14), 1)>;
+}
+
+class BinaryVRRbSPairGeneric<string mnemonic, bits<16> opcode>
+  : InstVRRb<opcode, (outs VR128:$V1),
+             (ins VR128:$V2, VR128:$V3, imm32zx4:$M4, imm32zx4:$M5),
+             mnemonic#"\t$V1, $V2, $V3, $M4, $M5", []>;
+
+// Declare a pair of instructions, one which sets CC and one which doesn't.
+// The CC-setting form ends with "S" and sets the low bit of M5.
+// The form that does not set CC has an extra operand to optionally allow
+// specifying arbitrary M5 values in assembler.
+multiclass BinaryExtraVRRbSPair<string mnemonic, bits<16> opcode,
+                                SDPatternOperator operator,
+                                SDPatternOperator operator_cc,
+                                TypedReg tr1, TypedReg tr2, bits<4> type> {
+  let M4 = type in
+    def "" : InstVRRb<opcode, (outs tr1.op:$V1),
+                      (ins tr2.op:$V2, tr2.op:$V3, imm32zx4:$M5),
+                      mnemonic#"\t$V1, $V2, $V3, $M5", []>;
+  def : Pat<(tr1.vt (operator (tr2.vt tr2.op:$V2), (tr2.vt tr2.op:$V3))),
+            (!cast<Instruction>(NAME) tr2.op:$V2, tr2.op:$V3, 0)>;
+  def : InstAlias<mnemonic#"\t$V1, $V2, $V3",
+                  (!cast<Instruction>(NAME) tr1.op:$V1, tr2.op:$V2,
+                                            tr2.op:$V3, 0)>;
+  let Defs = [CC] in
+    def S : BinaryVRRb<mnemonic##"s", opcode, operator_cc, tr1, tr2, type, 1>;
+}
+
+multiclass BinaryExtraVRRbSPairGeneric<string mnemonic, bits<16> opcode> {
+  def "" : InstVRRb<opcode, (outs VR128:$V1),
+                    (ins VR128:$V2, VR128:$V3, imm32zx4:$M4, imm32zx4:$M5),
+                    mnemonic#"\t$V1, $V2, $V3, $M4, $M5", []>;
+  def : InstAlias<mnemonic#"\t$V1, $V2, $V3, $M4",
+                  (!cast<Instruction>(NAME) VR128:$V1, VR128:$V2, VR128:$V3,
+                                            imm32zx4:$M4, 0)>;
 }

 class BinaryVRRc<string mnemonic, bits<16> opcode, SDPatternOperator operator,
@@ -1801,17 +3023,42 @@ class BinaryVRRc<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M6 = m6;
 }

+class BinaryVRRcGeneric<string mnemonic, bits<16> opcode, bits<4> m5 = 0,
+                        bits<4> m6 = 0>
+  : InstVRRc<opcode, (outs VR128:$V1),
+             (ins VR128:$V2, VR128:$V3, imm32zx4:$M4),
+             mnemonic#"\t$V1, $V2, $V3, $M4", []> {
+  let M5 = m5;
+  let M6 = m6;
+}
+
+class BinaryVRRcFloatGeneric<string mnemonic, bits<16> opcode, bits<4> m6 = 0>
+  : InstVRRc<opcode, (outs VR128:$V1),
+             (ins VR128:$V2, VR128:$V3, imm32zx4:$M4, imm32zx4:$M5),
+             mnemonic#"\t$V1, $V2, $V3, $M4, $M5", []> {
+  let M6 = m6;
+}
+
+// Declare a pair of instructions, one which sets CC and one which doesn't.
+// The CC-setting form ends with "S" and sets the low bit of M5.
 multiclass BinaryVRRcSPair<string mnemonic, bits<16> opcode,
                            SDPatternOperator operator,
                            SDPatternOperator operator_cc, TypedReg tr1,
                            TypedReg tr2, bits<4> type, bits<4> m5,
-                           bits<4> modifier = 0, bits<4> modifier_cc = 1> {
-  def "" : BinaryVRRc<mnemonic, opcode, operator, tr1, tr2, type, m5, modifier>;
+                           bits<4> modifier = 0> {
+  def "" : BinaryVRRc<mnemonic, opcode, operator, tr1, tr2, type,
+                      m5, !and (modifier, 14)>;
   let Defs = [CC] in
     def S : BinaryVRRc<mnemonic##"s", opcode, operator_cc, tr1, tr2, type,
-                       m5, modifier_cc>;
+                       m5, !add (!and (modifier, 14), 1)>;
 }

+class BinaryVRRcSPairFloatGeneric<string mnemonic, bits<16> opcode>
+  : InstVRRc<opcode, (outs VR128:$V1),
+             (ins VR128:$V2, VR128:$V3, imm32zx4:$M4, imm32zx4:$M5,
+                  imm32zx4:$M6),
+             mnemonic#"\t$V1, $V2, $V3, $M4, $M5, $M6", []>;
+
 class BinaryVRRf<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                  TypedReg tr>
   : InstVRRf<opcode, (outs tr.op:$V1), (ins GR64:$R2, GR64:$R3),
@@ -1827,6 +3074,11 @@ class BinaryVRSa<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M4 = type;
 }

+class BinaryVRSaGeneric<string mnemonic, bits<16> opcode>
+  : InstVRSa<opcode, (outs VR128:$V1),
+             (ins VR128:$V3, shift12only:$BD2, imm32zx4:$M4),
+             mnemonic#"\t$V1, $V3, $BD2, $M4", []>;
+
 class BinaryVRSb<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                  bits<5> bytes>
   : InstVRSb<opcode, (outs VR128:$V1), (ins GR32:$R3, bdaddr12only:$BD2),
@@ -1845,6 +3097,11 @@ class BinaryVRSc<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M4 = type;
 }

+class BinaryVRScGeneric<string mnemonic, bits<16> opcode>
+  : InstVRSc<opcode, (outs GR64:$R1),
+             (ins VR128:$V3, shift12only:$BD2, imm32zx4: $M4),
+             mnemonic#"\t$R1, $V3, $BD2, $M4", []>;
+
 class BinaryVRX<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                 TypedReg tr, bits<5> bytes>
   : InstVRX<opcode, (outs VR128:$V1), (ins bdxaddr12only:$XBD2, imm32zx4:$M3),
@@ -1873,12 +3130,18 @@ class StoreBinaryVRX<string mnemonic, bits<16> opcode,
   let AccessBytes = bytes;
 }

+class MemoryBinarySSd<string mnemonic, bits<8> opcode,
+                      RegisterOperand cls>
+  : InstSSd<opcode, (outs),
+            (ins bdraddr12only:$RBD1, bdaddr12only:$BD2, cls:$R3),
+            mnemonic#"\t$RBD1, $BD2, $R3", []>;
+
 class CompareRR<string mnemonic, bits<8> opcode, SDPatternOperator operator,
                 RegisterOperand cls1, RegisterOperand cls2>
   : InstRR<opcode, (outs), (ins cls1:$R1, cls2:$R2),
-           mnemonic#"r\t$R1, $R2",
+           mnemonic#"\t$R1, $R2",
            [(operator cls1:$R1, cls2:$R2)]> {
-  let OpKey = mnemonic ## cls1;
+  let OpKey = mnemonic#cls1;
   let OpType = "reg";
   let isCompare = 1;
 }
@@ -1886,34 +3149,34 @@ class CompareRR<string mnemonic, bits<8> opcode, SDPatternOperator operator,
 class CompareRRE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                  RegisterOperand cls1, RegisterOperand cls2>
   : InstRRE<opcode, (outs), (ins cls1:$R1, cls2:$R2),
-            mnemonic#"r\t$R1, $R2",
+            mnemonic#"\t$R1, $R2",
             [(operator cls1:$R1, cls2:$R2)]> {
-  let OpKey = mnemonic ## cls1;
+  let OpKey = mnemonic#cls1;
   let OpType = "reg";
   let isCompare = 1;
 }

 class CompareRI<string mnemonic, bits<12> opcode, SDPatternOperator operator,
                 RegisterOperand cls, Immediate imm>
-  : InstRI<opcode, (outs), (ins cls:$R1, imm:$I2),
-           mnemonic#"\t$R1, $I2",
-           [(operator cls:$R1, imm:$I2)]> {
+  : InstRIa<opcode, (outs), (ins cls:$R1, imm:$I2),
+            mnemonic#"\t$R1, $I2",
+            [(operator cls:$R1, imm:$I2)]> {
   let isCompare = 1;
 }

 class CompareRIL<string mnemonic, bits<12> opcode, SDPatternOperator operator,
                  RegisterOperand cls, Immediate imm>
-  : InstRIL<opcode, (outs), (ins cls:$R1, imm:$I2),
-            mnemonic#"\t$R1, $I2",
-            [(operator cls:$R1, imm:$I2)]> {
+  : InstRILa<opcode, (outs), (ins cls:$R1, imm:$I2),
+             mnemonic#"\t$R1, $I2",
+             [(operator cls:$R1, imm:$I2)]> {
   let isCompare = 1;
 }

 class CompareRILPC<string mnemonic, bits<12> opcode, SDPatternOperator operator,
                    RegisterOperand cls, SDPatternOperator load>
-  : InstRIL<opcode, (outs), (ins cls:$R1, pcrel32:$I2),
-            mnemonic#"\t$R1, $I2",
-            [(operator cls:$R1, (load pcrel32:$I2))]> {
+  : InstRILb<opcode, (outs), (ins cls:$R1, pcrel32:$RI2),
+             mnemonic#"\t$R1, $RI2",
+             [(operator cls:$R1, (load pcrel32:$RI2))]> {
   let isCompare = 1;
   let mayLoad = 1;
   // We want PC-relative addresses to be tried ahead of BD and BDX addresses.
@@ -1925,10 +3188,10 @@ class CompareRILPC<string mnemonic, bits<12> opcode, SDPatternOperator operator,
 class CompareRX<string mnemonic, bits<8> opcode, SDPatternOperator operator,
                 RegisterOperand cls, SDPatternOperator load, bits<5> bytes,
                 AddressingMode mode = bdxaddr12only>
-  : InstRX<opcode, (outs), (ins cls:$R1, mode:$XBD2),
-           mnemonic#"\t$R1, $XBD2",
-           [(operator cls:$R1, (load mode:$XBD2))]> {
-  let OpKey = mnemonic ## cls;
+  : InstRXa<opcode, (outs), (ins cls:$R1, mode:$XBD2),
+            mnemonic#"\t$R1, $XBD2",
+            [(operator cls:$R1, (load mode:$XBD2))]> {
+  let OpKey = mnemonic#"r"#cls;
   let OpType = "mem";
   let isCompare = 1;
   let mayLoad = 1;
@@ -1940,7 +3203,7 @@ class CompareRXE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   : InstRXE<opcode, (outs), (ins cls:$R1, bdxaddr12only:$XBD2),
             mnemonic#"\t$R1, $XBD2",
             [(operator cls:$R1, (load bdxaddr12only:$XBD2))]> {
-  let OpKey = mnemonic ## cls;
+  let OpKey = mnemonic#"r"#cls;
   let OpType = "mem";
   let isCompare = 1;
   let mayLoad = 1;
@@ -1951,10 +3214,10 @@ class CompareRXE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
 class CompareRXY<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                  RegisterOperand cls, SDPatternOperator load, bits<5> bytes,
                  AddressingMode mode = bdxaddr20only>
-  : InstRXY<opcode, (outs), (ins cls:$R1, mode:$XBD2),
-            mnemonic#"\t$R1, $XBD2",
-            [(operator cls:$R1, (load mode:$XBD2))]> {
-  let OpKey = mnemonic ## cls;
+  : InstRXYa<opcode, (outs), (ins cls:$R1, mode:$XBD2),
+             mnemonic#"\t$R1, $XBD2",
+             [(operator cls:$R1, (load mode:$XBD2))]> {
+  let OpKey = mnemonic#"r"#cls;
   let OpType = "mem";
   let isCompare = 1;
   let mayLoad = 1;
@@ -2026,6 +3289,22 @@ class CompareVRRa<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M5 = 0;
 }

+class CompareVRRaGeneric<string mnemonic, bits<16> opcode>
+  : InstVRRa<opcode, (outs), (ins VR128:$V1, VR128:$V2, imm32zx4:$M3),
+             mnemonic#"\t$V1, $V2, $M3", []> {
+  let isCompare = 1;
+  let M4 = 0;
+  let M5 = 0;
+}
+
+class CompareVRRaFloatGeneric<string mnemonic, bits<16> opcode>
+  : InstVRRa<opcode, (outs),
+             (ins VR64:$V1, VR64:$V2, imm32zx4:$M3, imm32zx4:$M4),
+             mnemonic#"\t$V1, $V2, $M3, $M4", []> {
+  let isCompare = 1;
+  let M5 = 0;
+}
+
 class TestRXE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
               RegisterOperand cls>
   : InstRXE<opcode, (outs), (ins cls:$R1, bdxaddr12only:$XBD2),
@@ -2034,12 +3313,30 @@ class TestRXE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M3 = 0;
 }

+class SideEffectTernaryRRFc<string mnemonic, bits<16> opcode,
+                            RegisterOperand cls1, RegisterOperand cls2,
+                            Immediate imm>
+  : InstRRFc<opcode, (outs), (ins cls1:$R1, cls2:$R2, imm:$M3),
+             mnemonic#"\t$R1, $R2, $M3", []>;
+
+class SideEffectTernarySSF<string mnemonic, bits<12> opcode,
+                           RegisterOperand cls>
+  : InstSSF<opcode, (outs),
+            (ins bdaddr12only:$BD1, bdaddr12only:$BD2, cls:$R3),
+            mnemonic#"\t$BD1, $BD2, $R3", []>;
+
+class TernaryRRFe<string mnemonic, bits<16> opcode, RegisterOperand cls1,
+                  RegisterOperand cls2>
+  : InstRRFe<opcode, (outs cls1:$R1),
+             (ins imm32zx4:$M3, cls2:$R2, imm32zx4:$M4),
+             mnemonic#"\t$R1, $M3, $R2, $M4", []>;
+
 class TernaryRRD<string mnemonic, bits<16> opcode,
                  SDPatternOperator operator, RegisterOperand cls>
   : InstRRD<opcode, (outs cls:$R1), (ins cls:$R1src, cls:$R3, cls:$R2),
-            mnemonic#"r\t$R1, $R3, $R2",
+            mnemonic#"\t$R1, $R3, $R2",
             [(set cls:$R1, (operator cls:$R1src, cls:$R3, cls:$R2))]> {
-  let OpKey = mnemonic ## cls;
+  let OpKey = mnemonic#cls;
   let OpType = "reg";
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
@@ -2047,9 +3344,9 @@ class TernaryRRD<string mnemonic, bits<16> opcode,

 class TernaryRS<string mnemonic, bits<8> opcode, RegisterOperand cls,
                 bits<5> bytes, AddressingMode mode = bdaddr12only>
-  : InstRS<opcode, (outs cls:$R1),
-           (ins cls:$R1src, imm32zx4:$R3, mode:$BD2),
-           mnemonic#"\t$R1, $R3, $BD2", []> {
+  : InstRSb<opcode, (outs cls:$R1),
+            (ins cls:$R1src, imm32zx4:$M3, mode:$BD2),
+            mnemonic#"\t$R1, $M3, $BD2", []> {
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";

@@ -2059,9 +3356,9 @@ class TernaryRS<string mnemonic, bits<8> opcode, RegisterOperand cls,

 class TernaryRSY<string mnemonic, bits<16> opcode, RegisterOperand cls,
                  bits<5> bytes, AddressingMode mode = bdaddr20only>
-  : InstRSY<opcode, (outs cls:$R1),
-            (ins cls:$R1src, imm32zx4:$R3, mode:$BD2),
-            mnemonic#"\t$R1, $R3, $BD2", []> {
+  : InstRSYb<opcode, (outs cls:$R1),
+             (ins cls:$R1src, imm32zx4:$M3, mode:$BD2),
+             mnemonic#"\t$R1, $M3, $BD2", []> {
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";

@@ -2086,7 +3383,7 @@ class TernaryRXF<string mnemonic, bits<16> opcode, SDPatternOperator operator,
             mnemonic#"\t$R1, $R3, $XBD2",
             [(set cls:$R1, (operator cls:$R1src, cls:$R3,
                                      (load bdxaddr12only:$XBD2)))]> {
-  let OpKey = mnemonic ## cls;
+  let OpKey = mnemonic#"r"#cls;
   let OpType = "mem";
   let Constraints = "$R1 = $R1src";
   let DisableEncoding = "$R1src";
@@ -2127,6 +3424,11 @@ class TernaryVRRa<string mnemonic, bits<16> opcode, SDPatternOperator operator,
   let M3 = type;
 }

+class TernaryVRRaFloatGeneric<string mnemonic, bits<16> opcode>
+  : InstVRRa<opcode, (outs VR128:$V1),
+             (ins VR128:$V2, imm32zx4:$M3, imm32zx4:$M4, imm32zx4:$M5),
+             mnemonic#"\t$V1, $V2, $M3, $M4, $M5", []>;
+
 class TernaryVRRb<string mnemonic, bits<16> opcode, SDPatternOperator operator,
                   TypedReg tr1, TypedReg tr2, bits<4> type,
                   SDPatternOperator m5mask, bits<4> m5or>
@@ -2140,23 +3442,36 @@ class TernaryVRRb<string mnemonic, bits<16> opcode, SDPatternOperator operator,
let M4 = type; } -multiclass TernaryVRRbSPair<string mnemonic, bits<16> opcode, - SDPatternOperator operator, - SDPatternOperator operator_cc, TypedReg tr1, - TypedReg tr2, bits<4> type, bits<4> m5or> { +// Declare a pair of instructions, one which sets CC and one which doesn't. +// The CC-setting form ends with "S" and sets the low bit of M5. +// Also create aliases to make use of M5 operand optional in assembler. +multiclass TernaryOptVRRbSPair<string mnemonic, bits<16> opcode, + SDPatternOperator operator, + SDPatternOperator operator_cc, + TypedReg tr1, TypedReg tr2, bits<4> type, + bits<4> modifier = 0> { def "" : TernaryVRRb<mnemonic, opcode, operator, tr1, tr2, type, - imm32zx4even, !and (m5or, 14)>; + imm32zx4even, !and (modifier, 14)>; def : InstAlias<mnemonic#"\t$V1, $V2, $V3", (!cast<Instruction>(NAME) tr1.op:$V1, tr2.op:$V2, tr2.op:$V3, 0)>; let Defs = [CC] in def S : TernaryVRRb<mnemonic##"s", opcode, operator_cc, tr1, tr2, type, - imm32zx4even, !add(!and (m5or, 14), 1)>; + imm32zx4even, !add(!and (modifier, 14), 1)>; def : InstAlias<mnemonic#"s\t$V1, $V2, $V3", (!cast<Instruction>(NAME#"S") tr1.op:$V1, tr2.op:$V2, tr2.op:$V3, 0)>; } +multiclass TernaryOptVRRbSPairGeneric<string mnemonic, bits<16> opcode> { + def "" : InstVRRb<opcode, (outs VR128:$V1), + (ins VR128:$V2, VR128:$V3, imm32zx4:$M4, imm32zx4:$M5), + mnemonic#"\t$V1, $V2, $V3, $M4, $M5", []>; + def : InstAlias<mnemonic#"\t$V1, $V2, $V3, $M4", + (!cast<Instruction>(NAME) VR128:$V1, VR128:$V2, VR128:$V3, + imm32zx4:$M4, 0)>; +} + class TernaryVRRc<string mnemonic, bits<16> opcode, SDPatternOperator operator, TypedReg tr1, TypedReg tr2> : InstVRRc<opcode, (outs tr1.op:$V1), @@ -2181,6 +3496,13 @@ class TernaryVRRd<string mnemonic, bits<16> opcode, SDPatternOperator operator, let M6 = 0; } +class TernaryVRRdGeneric<string mnemonic, bits<16> opcode> + : InstVRRd<opcode, (outs VR128:$V1), + (ins VR128:$V2, VR128:$V3, VR128:$V4, imm32zx4:$M5), + mnemonic#"\t$V1, $V2, $V3, $V4, $M5", []> { + let M6 
= 0; +} + class TernaryVRRe<string mnemonic, bits<16> opcode, SDPatternOperator operator, TypedReg tr1, TypedReg tr2, bits<4> m5 = 0, bits<4> type = 0> : InstVRRe<opcode, (outs tr1.op:$V1), @@ -2193,6 +3515,11 @@ class TernaryVRRe<string mnemonic, bits<16> opcode, SDPatternOperator operator, let M6 = type; } +class TernaryVRReFloatGeneric<string mnemonic, bits<16> opcode> + : InstVRRe<opcode, (outs VR128:$V1), + (ins VR128:$V2, VR128:$V3, VR128:$V4, imm32zx4:$M5, imm32zx4:$M6), + mnemonic#"\t$V1, $V2, $V3, $V4, $M5, $M6", []>; + class TernaryVRSb<string mnemonic, bits<16> opcode, SDPatternOperator operator, TypedReg tr1, TypedReg tr2, RegisterOperand cls, bits<4> type> : InstVRSb<opcode, (outs tr1.op:$V1), @@ -2206,6 +3533,14 @@ class TernaryVRSb<string mnemonic, bits<16> opcode, SDPatternOperator operator, let M4 = type; } +class TernaryVRSbGeneric<string mnemonic, bits<16> opcode> + : InstVRSb<opcode, (outs VR128:$V1), + (ins VR128:$V1src, GR64:$R3, shift12only:$BD2, imm32zx4:$M4), + mnemonic#"\t$V1, $R3, $BD2, $M4", []> { + let Constraints = "$V1 = $V1src"; + let DisableEncoding = "$V1src"; +} + class TernaryVRV<string mnemonic, bits<16> opcode, bits<5> bytes, Immediate index> : InstVRV<opcode, (outs VR128:$V1), @@ -2245,6 +3580,15 @@ class QuaternaryVRId<string mnemonic, bits<16> opcode, SDPatternOperator operato let M5 = type; } +class QuaternaryVRIdGeneric<string mnemonic, bits<16> opcode> + : InstVRId<opcode, (outs VR128:$V1), + (ins VR128:$V1src, VR128:$V2, VR128:$V3, + imm32zx8:$I4, imm32zx4:$M5), + mnemonic#"\t$V1, $V2, $V3, $I4, $M5", []> { + let Constraints = "$V1 = $V1src"; + let DisableEncoding = "$V1src"; +} + class QuaternaryVRRd<string mnemonic, bits<16> opcode, SDPatternOperator operator, TypedReg tr1, TypedReg tr2, bits<4> type, SDPatternOperator m6mask, bits<4> m6or> @@ -2259,37 +3603,57 @@ class QuaternaryVRRd<string mnemonic, bits<16> opcode, let M5 = type; } -multiclass QuaternaryVRRdSPair<string mnemonic, bits<16> opcode, - SDPatternOperator 
operator, - SDPatternOperator operator_cc, TypedReg tr1, - TypedReg tr2, bits<4> type, bits<4> m6or> { +// Declare a pair of instructions, one which sets CC and one which doesn't. +// The CC-setting form ends with "S" and sets the low bit of M6. +// Also create aliases to make use of M6 operand optional in assembler. +multiclass QuaternaryOptVRRdSPair<string mnemonic, bits<16> opcode, + SDPatternOperator operator, + SDPatternOperator operator_cc, + TypedReg tr1, TypedReg tr2, bits<4> type, + bits<4> modifier = 0> { def "" : QuaternaryVRRd<mnemonic, opcode, operator, tr1, tr2, type, - imm32zx4even, !and (m6or, 14)>; + imm32zx4even, !and (modifier, 14)>; def : InstAlias<mnemonic#"\t$V1, $V2, $V3, $V4", (!cast<Instruction>(NAME) tr1.op:$V1, tr2.op:$V2, tr2.op:$V3, tr2.op:$V4, 0)>; let Defs = [CC] in def S : QuaternaryVRRd<mnemonic##"s", opcode, operator_cc, tr1, tr2, type, - imm32zx4even, !add (!and (m6or, 14), 1)>; + imm32zx4even, !add (!and (modifier, 14), 1)>; def : InstAlias<mnemonic#"s\t$V1, $V2, $V3, $V4", (!cast<Instruction>(NAME#"S") tr1.op:$V1, tr2.op:$V2, tr2.op:$V3, tr2.op:$V4, 0)>; } +multiclass QuaternaryOptVRRdSPairGeneric<string mnemonic, bits<16> opcode> { + def "" : InstVRRd<opcode, (outs VR128:$V1), + (ins VR128:$V2, VR128:$V3, VR128:$V4, + imm32zx4:$M5, imm32zx4:$M6), + mnemonic#"\t$V1, $V2, $V3, $V4, $M5, $M6", []>; + def : InstAlias<mnemonic#"\t$V1, $V2, $V3, $V4, $M5", + (!cast<Instruction>(NAME) VR128:$V1, VR128:$V2, VR128:$V3, + VR128:$V4, imm32zx4:$M5, 0)>; +} + +class SideEffectQuaternarySSe<string mnemonic, bits<8> opcode, + RegisterOperand cls> + : InstSSe<opcode, (outs), + (ins cls:$R1, bdaddr12only:$BD2, cls:$R3, bdaddr12only:$BD4), + mnemonic#"\t$R1, $BD2, $R3, $BD4", []>; + class LoadAndOpRSY<string mnemonic, bits<16> opcode, SDPatternOperator operator, RegisterOperand cls, AddressingMode mode = bdaddr20only> - : InstRSY<opcode, (outs cls:$R1), (ins cls:$R3, mode:$BD2), - mnemonic#"\t$R1, $R3, $BD2", - [(set cls:$R1, (operator 
mode:$BD2, cls:$R3))]> { + : InstRSYa<opcode, (outs cls:$R1), (ins cls:$R3, mode:$BD2), + mnemonic#"\t$R1, $R3, $BD2", + [(set cls:$R1, (operator mode:$BD2, cls:$R3))]> { let mayLoad = 1; let mayStore = 1; } class CmpSwapRS<string mnemonic, bits<8> opcode, SDPatternOperator operator, RegisterOperand cls, AddressingMode mode = bdaddr12only> - : InstRS<opcode, (outs cls:$R1), (ins cls:$R1src, cls:$R3, mode:$BD2), - mnemonic#"\t$R1, $R3, $BD2", - [(set cls:$R1, (operator mode:$BD2, cls:$R1src, cls:$R3))]> { + : InstRSa<opcode, (outs cls:$R1), (ins cls:$R1src, cls:$R3, mode:$BD2), + mnemonic#"\t$R1, $R3, $BD2", + [(set cls:$R1, (operator mode:$BD2, cls:$R1src, cls:$R3))]> { let Constraints = "$R1 = $R1src"; let DisableEncoding = "$R1src"; let mayLoad = 1; @@ -2298,9 +3662,9 @@ class CmpSwapRS<string mnemonic, bits<8> opcode, SDPatternOperator operator, class CmpSwapRSY<string mnemonic, bits<16> opcode, SDPatternOperator operator, RegisterOperand cls, AddressingMode mode = bdaddr20only> - : InstRSY<opcode, (outs cls:$R1), (ins cls:$R1src, cls:$R3, mode:$BD2), - mnemonic#"\t$R1, $R3, $BD2", - [(set cls:$R1, (operator mode:$BD2, cls:$R1src, cls:$R3))]> { + : InstRSYa<opcode, (outs cls:$R1), (ins cls:$R1src, cls:$R3, mode:$BD2), + mnemonic#"\t$R1, $R3, $BD2", + [(set cls:$R1, (operator mode:$BD2, cls:$R1src, cls:$R3))]> { let Constraints = "$R1 = $R1src"; let DisableEncoding = "$R1src"; let mayLoad = 1; @@ -2328,21 +3692,31 @@ class RotateSelectRIEf<string mnemonic, bits<16> opcode, RegisterOperand cls1, } class PrefetchRXY<string mnemonic, bits<16> opcode, SDPatternOperator operator> - : InstRXY<opcode, (outs), (ins imm32zx4:$R1, bdxaddr20only:$XBD2), - mnemonic##"\t$R1, $XBD2", - [(operator imm32zx4:$R1, bdxaddr20only:$XBD2)]>; + : InstRXYb<opcode, (outs), (ins imm32zx4:$M1, bdxaddr20only:$XBD2), + mnemonic##"\t$M1, $XBD2", + [(operator imm32zx4:$M1, bdxaddr20only:$XBD2)]>; class PrefetchRILPC<string mnemonic, bits<12> opcode, SDPatternOperator operator> - : 
InstRIL<opcode, (outs), (ins imm32zx4:$R1, pcrel32:$I2), - mnemonic##"\t$R1, $I2", - [(operator imm32zx4:$R1, pcrel32:$I2)]> { + : InstRILc<opcode, (outs), (ins imm32zx4:$M1, pcrel32:$RI2), + mnemonic##"\t$M1, $RI2", + [(operator imm32zx4:$M1, pcrel32:$RI2)]> { // We want PC-relative addresses to be tried ahead of BD and BDX addresses. // However, BDXs have two extra operands and are therefore 6 units more // complex. let AddedComplexity = 7; } +class BranchPreloadSMI<string mnemonic, bits<8> opcode> + : InstSMI<opcode, (outs), + (ins imm32zx4:$M1, brtarget16bpp:$RI2, bdxaddr12only:$BD3), + mnemonic#"\t$M1, $RI2, $BD3", []>; + +class BranchPreloadMII<string mnemonic, bits<8> opcode> + : InstMII<opcode, (outs), + (ins imm32zx4:$M1, brtarget12bpp:$RI2, brtarget24bpp:$RI3), + mnemonic#"\t$M1, $RI2, $RI3", []>; + // A floating-point load-and test operation. Create both a normal unary // operation and one that acts as a comparison against zero. // Note that the comparison against zero operation is not available if we @@ -2371,6 +3745,11 @@ class Pseudo<dag outs, dag ins, list<dag> pattern> let isCodeGenOnly = 1; } +// Like SideEffectBinarySIL, but expanded later. +class SideEffectBinarySILPseudo<SDPatternOperator operator, Immediate imm> + : Pseudo<(outs), (ins bdaddr12only:$BD1, imm:$I2), + [(operator bdaddr12only:$BD1, imm:$I2)]>; + // Like UnaryRI, but expanded after RA depending on the choice of register. 
class UnaryRIPseudo<SDPatternOperator operator, RegisterOperand cls, Immediate imm> @@ -2383,7 +3762,7 @@ class UnaryRXYPseudo<string key, SDPatternOperator operator, AddressingMode mode = bdxaddr20only> : Pseudo<(outs cls:$R1), (ins mode:$XBD2), [(set cls:$R1, (operator mode:$XBD2))]> { - let OpKey = key ## cls; + let OpKey = key#"r"#cls; let OpType = "mem"; let mayLoad = 1; let Has20BitOffset = 1; @@ -2396,7 +3775,7 @@ class UnaryRRPseudo<string key, SDPatternOperator operator, RegisterOperand cls1, RegisterOperand cls2> : Pseudo<(outs cls1:$R1), (ins cls2:$R2), [(set cls1:$R1, (operator cls2:$R2))]> { - let OpKey = key ## cls1; + let OpKey = key#cls1; let OpType = "reg"; } @@ -2430,7 +3809,9 @@ multiclass BinaryRIAndKPseudo<string key, SDPatternOperator operator, // Like CompareRI, but expanded after RA depending on the choice of register. class CompareRIPseudo<SDPatternOperator operator, RegisterOperand cls, Immediate imm> - : Pseudo<(outs), (ins cls:$R1, imm:$I2), [(operator cls:$R1, imm:$I2)]>; + : Pseudo<(outs), (ins cls:$R1, imm:$I2), [(operator cls:$R1, imm:$I2)]> { + let isCompare = 1; +} // Like CompareRXY, but expanded after RA depending on the choice of register. class CompareRXYPseudo<SDPatternOperator operator, RegisterOperand cls, @@ -2444,6 +3825,54 @@ class CompareRXYPseudo<SDPatternOperator operator, RegisterOperand cls, let AccessBytes = bytes; } +// Like CondBinaryRRF, but expanded after RA depending on the choice of +// register. +class CondBinaryRRFPseudo<RegisterOperand cls1, RegisterOperand cls2> + : Pseudo<(outs cls1:$R1), + (ins cls1:$R1src, cls2:$R2, cond4:$valid, cond4:$M3), []> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; + let CCMaskLast = 1; +} + +// Like CondBinaryRIE, but expanded after RA depending on the choice of +// register. 
+class CondBinaryRIEPseudo<RegisterOperand cls, Immediate imm> + : Pseudo<(outs cls:$R1), + (ins cls:$R1src, imm:$I2, cond4:$valid, cond4:$M3), + [(set cls:$R1, (z_select_ccmask imm:$I2, cls:$R1src, + cond4:$valid, cond4:$M3))]> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; + let CCMaskLast = 1; +} + +// Like CondUnaryRSY, but expanded after RA depending on the choice of +// register. +class CondUnaryRSYPseudo<SDPatternOperator operator, RegisterOperand cls, + bits<5> bytes, AddressingMode mode = bdaddr20only> + : Pseudo<(outs cls:$R1), + (ins cls:$R1src, mode:$BD2, cond4:$valid, cond4:$R3), + [(set cls:$R1, + (z_select_ccmask (operator mode:$BD2), cls:$R1src, + cond4:$valid, cond4:$R3))]> { + let Constraints = "$R1 = $R1src"; + let DisableEncoding = "$R1src"; + let mayLoad = 1; + let AccessBytes = bytes; + let CCMaskLast = 1; +} + +// Like CondStoreRSY, but expanded after RA depending on the choice of +// register. +class CondStoreRSYPseudo<RegisterOperand cls, bits<5> bytes, + AddressingMode mode = bdaddr20only> + : Pseudo<(outs), (ins cls:$R1, mode:$BD2, cond4:$valid, cond4:$R3), []> { + let mayStore = 1; + let AccessBytes = bytes; + let CCMaskLast = 1; +} + // Like StoreRXY, but expanded after RA depending on the choice of register. class StoreRXYPseudo<SDPatternOperator operator, RegisterOperand cls, bits<5> bytes, AddressingMode mode = bdxaddr20only> @@ -2509,6 +3938,7 @@ class AtomicLoadBinary<SDPatternOperator operator, RegisterOperand cls, let mayLoad = 1; let mayStore = 1; let usesCustomInserter = 1; + let hasNoSchedulingInfo = 1; } // Specializations of AtomicLoadWBinary. @@ -2535,6 +3965,7 @@ class AtomicLoadWBinary<SDPatternOperator operator, dag pat, let mayLoad = 1; let mayStore = 1; let usesCustomInserter = 1; + let hasNoSchedulingInfo = 1; } // Specializations of AtomicLoadWBinary. @@ -2550,10 +3981,10 @@ class AtomicLoadWBinaryImm<SDPatternOperator operator, Immediate imm> // another instruction to handle the excess. 
multiclass MemorySS<string mnemonic, bits<8> opcode, SDPatternOperator sequence, SDPatternOperator loop> { - def "" : InstSS<opcode, (outs), (ins bdladdr12onlylen8:$BDL1, - bdaddr12only:$BD2), - mnemonic##"\t$BDL1, $BD2", []>; - let usesCustomInserter = 1 in { + def "" : InstSSa<opcode, (outs), (ins bdladdr12onlylen8:$BDL1, + bdaddr12only:$BD2), + mnemonic##"\t$BDL1, $BD2", []>; + let usesCustomInserter = 1, hasNoSchedulingInfo = 1 in { def Sequence : Pseudo<(outs), (ins bdaddr12only:$dest, bdaddr12only:$src, imm64:$length), [(sequence bdaddr12only:$dest, bdaddr12only:$src, @@ -2579,7 +4010,7 @@ multiclass StringRRE<string mnemonic, bits<16> opcode, let Constraints = "$R1 = $R1src, $R2 = $R2src"; let DisableEncoding = "$R1src, $R2src"; } - let usesCustomInserter = 1 in + let usesCustomInserter = 1, hasNoSchedulingInfo = 1 in def Loop : Pseudo<(outs GR64:$end), (ins GR64:$start1, GR64:$start2, GR32:$char), [(set GR64:$end, (operator GR64:$start1, GR64:$start2, diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp index 4084e93..3565d5f 100644 --- a/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp +++ b/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp @@ -88,10 +88,10 @@ void SystemZInstrInfo::splitMove(MachineBasicBlock::iterator MI, void SystemZInstrInfo::splitAdjDynAlloc(MachineBasicBlock::iterator MI) const { MachineBasicBlock *MBB = MI->getParent(); MachineFunction &MF = *MBB->getParent(); - MachineFrameInfo *MFFrame = MF.getFrameInfo(); + MachineFrameInfo &MFFrame = MF.getFrameInfo(); MachineOperand &OffsetMO = MI->getOperand(2); - uint64_t Offset = (MFFrame->getMaxCallFrameSize() + + uint64_t Offset = (MFFrame.getMaxCallFrameSize() + SystemZMC::CallFrameSize + OffsetMO.getImm()); unsigned NewOpcode = getOpcodeForOffset(SystemZ::LA, Offset); @@ -149,6 +149,37 @@ void SystemZInstrInfo::expandRXYPseudo(MachineInstr &MI, unsigned LowOpcode, MI.setDesc(get(Opcode)); } +// MI is a 
load-on-condition pseudo instruction with a single register +// (source or destination) operand. Replace it with LowOpcode if the +// register is a low GR32 and HighOpcode if the register is a high GR32. +void SystemZInstrInfo::expandLOCPseudo(MachineInstr &MI, unsigned LowOpcode, + unsigned HighOpcode) const { + unsigned Reg = MI.getOperand(0).getReg(); + unsigned Opcode = isHighReg(Reg) ? HighOpcode : LowOpcode; + MI.setDesc(get(Opcode)); +} + +// MI is a load-register-on-condition pseudo instruction. Replace it with +// LowOpcode if source and destination are both low GR32s and HighOpcode if +// source and destination are both high GR32s. +void SystemZInstrInfo::expandLOCRPseudo(MachineInstr &MI, unsigned LowOpcode, + unsigned HighOpcode) const { + unsigned DestReg = MI.getOperand(0).getReg(); + unsigned SrcReg = MI.getOperand(2).getReg(); + bool DestIsHigh = isHighReg(DestReg); + bool SrcIsHigh = isHighReg(SrcReg); + + if (!DestIsHigh && !SrcIsHigh) + MI.setDesc(get(LowOpcode)); + else if (DestIsHigh && SrcIsHigh) + MI.setDesc(get(HighOpcode)); + + // If we were unable to implement the pseudo with a single instruction, we + // need to convert it back into a branch sequence. This cannot be done here + // since the caller of expandPostRAPseudo does not handle changes to the CFG + // correctly. This change is deferred to the SystemZExpandPseudo pass. +} + // MI is an RR-style pseudo instruction that zero-extends the low Size bits // of one GRX32 into another. Replace it with LowOpcode if both operands // are low registers, otherwise use RISB[LH]G. 
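Note on the expandLOCRPseudo hunk above: the opcode choice depends only on whether the destination and source registers sit in the same half. The following is a minimal standalone sketch of that decision, not LLVM code. The `Opcode` enum and `selectLOCROpcode` are hypothetical names introduced for illustration; in the patch the two opcodes are passed in by the caller, and the mixed low/high case is left untouched for the SystemZExpandPseudo pass.

```cpp
#include <optional>

// Hypothetical tags standing in for the LowOpcode / HighOpcode arguments
// (for example SystemZ::LOCR and SystemZ::LOCFHR in expandPostRAPseudo).
enum class Opcode { Low, High };

// Return the single-instruction form when both registers are in the same
// half. A mixed pair yields no opcode, mirroring the patch, which leaves
// the pseudo in place in that case.
constexpr std::optional<Opcode> selectLOCROpcode(bool DestIsHigh,
                                                 bool SrcIsHigh) {
  if (!DestIsHigh && !SrcIsHigh)
    return Opcode::Low;
  if (DestIsHigh && SrcIsHigh)
    return Opcode::High;
  return std::nullopt;
}

// Compile-time checks of the cases the hunk describes.
static_assert(selectLOCROpcode(false, false) == Opcode::Low);
static_assert(selectLOCROpcode(true, true) == Opcode::High);
static_assert(!selectLOCROpcode(true, false).has_value());
```

Returning an empty optional rather than asserting reflects the comment in the hunk: the mixed case is legal input, it just cannot be handled at this point because the CFG may not be changed here.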
@@ -172,7 +203,7 @@ void SystemZInstrInfo::expandLoadStackGuard(MachineInstr *MI) const { MachineInstr *Ear1MI = MF.CloneMachineInstr(MI); MBB->insert(MI, Ear1MI); Ear1MI->setDesc(get(SystemZ::EAR)); - MachineInstrBuilder(MF, Ear1MI).addImm(0); + MachineInstrBuilder(MF, Ear1MI).addReg(SystemZ::A0); // sllg <reg>, <reg>, 32 MachineInstr *SllgMI = MF.CloneMachineInstr(MI); @@ -184,7 +215,7 @@ void SystemZInstrInfo::expandLoadStackGuard(MachineInstr *MI) const { MachineInstr *Ear2MI = MF.CloneMachineInstr(MI); MBB->insert(MI, Ear2MI); Ear2MI->setDesc(get(SystemZ::EAR)); - MachineInstrBuilder(MF, Ear2MI).addImm(1); + MachineInstrBuilder(MF, Ear2MI).addReg(SystemZ::A1); // lg <reg>, 40(<reg>) MI->setDesc(get(SystemZ::LG)); @@ -222,6 +253,36 @@ void SystemZInstrInfo::emitGRX32Move(MachineBasicBlock &MBB, .addImm(32 - Size).addImm(128 + 31).addImm(Rotate); } + +MachineInstr *SystemZInstrInfo::commuteInstructionImpl(MachineInstr &MI, + bool NewMI, + unsigned OpIdx1, + unsigned OpIdx2) const { + auto cloneIfNew = [NewMI](MachineInstr &MI) -> MachineInstr & { + if (NewMI) + return *MI.getParent()->getParent()->CloneMachineInstr(&MI); + return MI; + }; + + switch (MI.getOpcode()) { + case SystemZ::LOCRMux: + case SystemZ::LOCFHR: + case SystemZ::LOCR: + case SystemZ::LOCGR: { + auto &WorkingMI = cloneIfNew(MI); + // Invert condition. + unsigned CCValid = WorkingMI.getOperand(3).getImm(); + unsigned CCMask = WorkingMI.getOperand(4).getImm(); + WorkingMI.getOperand(4).setImm(CCMask ^ CCValid); + return TargetInstrInfo::commuteInstructionImpl(WorkingMI, /*NewMI=*/false, + OpIdx1, OpIdx2); + } + default: + return TargetInstrInfo::commuteInstructionImpl(MI, NewMI, OpIdx1, OpIdx2); + } +} + + // If MI is a simple load or store for a frame object, return the register // it loads or stores and set FrameIndex to the index of the frame object. // Return 0 otherwise. 
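Note on the commuteInstructionImpl hunk above: swapping the two register operands of a load-on-condition instruction is only correct if the condition is inverted as well, which the patch does with `CCMask ^ CCValid`. The sketch below shows that arithmetic in isolation. It is not LLVM code: `invertCondition` and the constant names are hypothetical, and the numeric values are an assumption (a 4-bit mask with one bit per condition-code value, with integer comparisons able to produce CC 0, 1 or 2).

```cpp
// Assumed mask layout: one bit per condition-code value.
constexpr unsigned MaskEQ = 8; // CC 0 (assumed value)
constexpr unsigned MaskLT = 4; // CC 1 (assumed value)
constexpr unsigned MaskGT = 2; // CC 2 (assumed value)

// CC values an integer comparison can produce (assumed: CC 0, 1, 2).
constexpr unsigned ValidICmp = MaskEQ | MaskLT | MaskGT;

// Flip the condition within the set of CC values that can actually occur.
// Bits outside CCValid are never set in CCMask, so XOR acts as a
// complement restricted to CCValid.
constexpr unsigned invertCondition(unsigned CCValid, unsigned CCMask) {
  return CCMask ^ CCValid;
}

// "equal" inverts to "less or greater", i.e. "not equal".
static_assert(invertCondition(ValidICmp, MaskEQ) == (MaskLT | MaskGT));
// Inverting twice restores the original mask.
static_assert(invertCondition(ValidICmp,
                              invertCondition(ValidICmp, MaskLT)) == MaskLT);
```

The same XOR appears in reverseBranchCondition elsewhere in this diff, where `Cond[0]` holds the valid mask and `Cond[1]` the condition mask.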
@@ -252,7 +313,7 @@ bool SystemZInstrInfo::isStackSlotCopy(const MachineInstr &MI, int &DestFrameIndex, int &SrcFrameIndex) const { // Check for MVC 0(Length,FI1),0(FI2) - const MachineFrameInfo *MFI = MI.getParent()->getParent()->getFrameInfo(); + const MachineFrameInfo &MFI = MI.getParent()->getParent()->getFrameInfo(); if (MI.getOpcode() != SystemZ::MVC || !MI.getOperand(0).isFI() || MI.getOperand(1).getImm() != 0 || !MI.getOperand(3).isFI() || MI.getOperand(4).getImm() != 0) @@ -262,8 +323,8 @@ bool SystemZInstrInfo::isStackSlotCopy(const MachineInstr &MI, int64_t Length = MI.getOperand(2).getImm(); unsigned FI1 = MI.getOperand(0).getIndex(); unsigned FI2 = MI.getOperand(3).getIndex(); - if (MFI->getObjectSize(FI1) != Length || - MFI->getObjectSize(FI2) != Length) + if (MFI.getObjectSize(FI1) != Length || + MFI.getObjectSize(FI2) != Length) return false; DestFrameIndex = FI1; @@ -363,7 +424,10 @@ bool SystemZInstrInfo::analyzeBranch(MachineBasicBlock &MBB, return false; } -unsigned SystemZInstrInfo::RemoveBranch(MachineBasicBlock &MBB) const { +unsigned SystemZInstrInfo::removeBranch(MachineBasicBlock &MBB, + int *BytesRemoved) const { + assert(!BytesRemoved && "code size not handled"); + // Most of the code and comments here are boilerplate. 
MachineBasicBlock::iterator I = MBB.end(); unsigned Count = 0; @@ -386,25 +450,27 @@ unsigned SystemZInstrInfo::RemoveBranch(MachineBasicBlock &MBB) const { } bool SystemZInstrInfo:: -ReverseBranchCondition(SmallVectorImpl<MachineOperand> &Cond) const { +reverseBranchCondition(SmallVectorImpl<MachineOperand> &Cond) const { assert(Cond.size() == 2 && "Invalid condition"); Cond[1].setImm(Cond[1].getImm() ^ Cond[0].getImm()); return false; } -unsigned SystemZInstrInfo::InsertBranch(MachineBasicBlock &MBB, +unsigned SystemZInstrInfo::insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB, MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond, - const DebugLoc &DL) const { + const DebugLoc &DL, + int *BytesAdded) const { // In this function we output 32-bit branches, which should always // have enough range. They can be shortened and relaxed by later code // in the pipeline, if desired. // Shouldn't be a fall through. - assert(TBB && "InsertBranch must not be told to insert a fallthrough"); + assert(TBB && "insertBranch must not be told to insert a fallthrough"); assert((Cond.size() == 2 || Cond.size() == 0) && "SystemZ branch conditions have one component!"); + assert(!BytesAdded && "code size not handled"); if (Cond.empty()) { // Unconditional branch? @@ -520,30 +586,128 @@ bool SystemZInstrInfo::optimizeCompareInstr( removeIPMBasedCompare(Compare, SrcReg, MRI, &RI); } -// If Opcode is a move that has a conditional variant, return that variant, -// otherwise return 0. -static unsigned getConditionalMove(unsigned Opcode) { - switch (Opcode) { - case SystemZ::LR: return SystemZ::LOCR; - case SystemZ::LGR: return SystemZ::LOCGR; - default: return 0; + +bool SystemZInstrInfo::canInsertSelect(const MachineBasicBlock &MBB, + ArrayRef<MachineOperand> Pred, + unsigned TrueReg, unsigned FalseReg, + int &CondCycles, int &TrueCycles, + int &FalseCycles) const { + // Not all subtargets have LOCR instructions. 
+ if (!STI.hasLoadStoreOnCond()) + return false; + if (Pred.size() != 2) + return false; + + // Check register classes. + const MachineRegisterInfo &MRI = MBB.getParent()->getRegInfo(); + const TargetRegisterClass *RC = + RI.getCommonSubClass(MRI.getRegClass(TrueReg), MRI.getRegClass(FalseReg)); + if (!RC) + return false; + + // We have LOCR instructions for 32 and 64 bit general purpose registers. + if ((STI.hasLoadStoreOnCond2() && + SystemZ::GRX32BitRegClass.hasSubClassEq(RC)) || + SystemZ::GR32BitRegClass.hasSubClassEq(RC) || + SystemZ::GR64BitRegClass.hasSubClassEq(RC)) { + CondCycles = 2; + TrueCycles = 2; + FalseCycles = 2; + return true; } + + // Can't do anything else. + return false; } -static unsigned getConditionalLoadImmediate(unsigned Opcode) { - switch (Opcode) { - case SystemZ::LHI: return SystemZ::LOCHI; - case SystemZ::LGHI: return SystemZ::LOCGHI; - default: return 0; +void SystemZInstrInfo::insertSelect(MachineBasicBlock &MBB, + MachineBasicBlock::iterator I, + const DebugLoc &DL, unsigned DstReg, + ArrayRef<MachineOperand> Pred, + unsigned TrueReg, + unsigned FalseReg) const { + MachineRegisterInfo &MRI = MBB.getParent()->getRegInfo(); + const TargetRegisterClass *RC = MRI.getRegClass(DstReg); + + assert(Pred.size() == 2 && "Invalid condition"); + unsigned CCValid = Pred[0].getImm(); + unsigned CCMask = Pred[1].getImm(); + + unsigned Opc; + if (SystemZ::GRX32BitRegClass.hasSubClassEq(RC)) { + if (STI.hasLoadStoreOnCond2()) + Opc = SystemZ::LOCRMux; + else { + Opc = SystemZ::LOCR; + MRI.constrainRegClass(DstReg, &SystemZ::GR32BitRegClass); + } + } else if (SystemZ::GR64BitRegClass.hasSubClassEq(RC)) + Opc = SystemZ::LOCGR; + else + llvm_unreachable("Invalid register class"); + + BuildMI(MBB, I, DL, get(Opc), DstReg) + .addReg(FalseReg).addReg(TrueReg) + .addImm(CCValid).addImm(CCMask); +} + +bool SystemZInstrInfo::FoldImmediate(MachineInstr &UseMI, MachineInstr &DefMI, + unsigned Reg, + MachineRegisterInfo *MRI) const { + unsigned DefOpc = 
DefMI.getOpcode(); + if (DefOpc != SystemZ::LHIMux && DefOpc != SystemZ::LHI && + DefOpc != SystemZ::LGHI) + return false; + if (DefMI.getOperand(0).getReg() != Reg) + return false; + int32_t ImmVal = (int32_t)DefMI.getOperand(1).getImm(); + + unsigned UseOpc = UseMI.getOpcode(); + unsigned NewUseOpc; + unsigned UseIdx; + int CommuteIdx = -1; + switch (UseOpc) { + case SystemZ::LOCRMux: + if (!STI.hasLoadStoreOnCond2()) + return false; + NewUseOpc = SystemZ::LOCHIMux; + if (UseMI.getOperand(2).getReg() == Reg) + UseIdx = 2; + else if (UseMI.getOperand(1).getReg() == Reg) + UseIdx = 2, CommuteIdx = 1; + else + return false; + break; + case SystemZ::LOCGR: + if (!STI.hasLoadStoreOnCond2()) + return false; + NewUseOpc = SystemZ::LOCGHI; + if (UseMI.getOperand(2).getReg() == Reg) + UseIdx = 2; + else if (UseMI.getOperand(1).getReg() == Reg) + UseIdx = 2, CommuteIdx = 1; + else + return false; + break; + default: + return false; } + + if (CommuteIdx != -1) + if (!commuteInstruction(UseMI, false, CommuteIdx, UseIdx)) + return false; + + bool DeleteDef = MRI->hasOneNonDBGUse(Reg); + UseMI.setDesc(get(NewUseOpc)); + UseMI.getOperand(UseIdx).ChangeToImmediate(ImmVal); + if (DeleteDef) + DefMI.eraseFromParent(); + + return true; } bool SystemZInstrInfo::isPredicable(MachineInstr &MI) const { unsigned Opcode = MI.getOpcode(); - if (STI.hasLoadStoreOnCond() && getConditionalMove(Opcode)) - return true; - if (STI.hasLoadStoreOnCond2() && getConditionalLoadImmediate(Opcode)) - return true; if (Opcode == SystemZ::Return || Opcode == SystemZ::Trap || Opcode == SystemZ::CallJG || @@ -595,26 +759,6 @@ bool SystemZInstrInfo::PredicateInstruction( unsigned CCMask = Pred[1].getImm(); assert(CCMask > 0 && CCMask < 15 && "Invalid predicate"); unsigned Opcode = MI.getOpcode(); - if (STI.hasLoadStoreOnCond()) { - if (unsigned CondOpcode = getConditionalMove(Opcode)) { - MI.setDesc(get(CondOpcode)); - MachineInstrBuilder(*MI.getParent()->getParent(), MI) - .addImm(CCValid) - .addImm(CCMask) 
- .addReg(SystemZ::CC, RegState::Implicit); - return true; - } - } - if (STI.hasLoadStoreOnCond2()) { - if (unsigned CondOpcode = getConditionalLoadImmediate(Opcode)) { - MI.setDesc(get(CondOpcode)); - MachineInstrBuilder(*MI.getParent()->getParent(), MI) - .addImm(CCValid) - .addImm(CCMask) - .addReg(SystemZ::CC, RegState::Implicit); - return true; - } - } if (Opcode == SystemZ::Trap) { MI.setDesc(get(SystemZ::CondTrap)); MachineInstrBuilder(*MI.getParent()->getParent(), MI) @@ -690,6 +834,14 @@ void SystemZInstrInfo::copyPhysReg(MachineBasicBlock &MBB, Opcode = SystemZ::VLR64; else if (SystemZ::VR128BitRegClass.contains(DestReg, SrcReg)) Opcode = SystemZ::VLR; + else if (SystemZ::AR32BitRegClass.contains(DestReg, SrcReg)) + Opcode = SystemZ::CPYA; + else if (SystemZ::AR32BitRegClass.contains(DestReg) && + SystemZ::GR32BitRegClass.contains(SrcReg)) + Opcode = SystemZ::SAR; + else if (SystemZ::GR32BitRegClass.contains(DestReg) && + SystemZ::AR32BitRegClass.contains(SrcReg)) + Opcode = SystemZ::EAR; else llvm_unreachable("Impossible reg-to-reg copy"); @@ -875,8 +1027,8 @@ MachineInstr *SystemZInstrInfo::foldMemoryOperandImpl( MachineBasicBlock::iterator InsertPt, int FrameIndex, LiveIntervals *LIS) const { const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo(); - const MachineFrameInfo *MFI = MF.getFrameInfo(); - unsigned Size = MFI->getObjectSize(FrameIndex); + const MachineFrameInfo &MFI = MF.getFrameInfo(); + unsigned Size = MFI.getObjectSize(FrameIndex); unsigned Opcode = MI.getOpcode(); if (Ops.size() == 2 && Ops[0] == 0 && Ops[1] == 1) { @@ -1077,6 +1229,18 @@ bool SystemZInstrInfo::expandPostRAPseudo(MachineInstr &MI) const { expandRXYPseudo(MI, SystemZ::L, SystemZ::LFH); return true; + case SystemZ::LOCMux: + expandLOCPseudo(MI, SystemZ::LOC, SystemZ::LOCFH); + return true; + + case SystemZ::LOCHIMux: + expandLOCPseudo(MI, SystemZ::LOCHI, SystemZ::LOCHHI); + return true; + + case SystemZ::LOCRMux: + expandLOCRPseudo(MI, SystemZ::LOCR, 
SystemZ::LOCFHR); + return true; + case SystemZ::STCMux: expandRXYPseudo(MI, SystemZ::STC, SystemZ::STCH); return true; @@ -1089,6 +1253,10 @@ bool SystemZInstrInfo::expandPostRAPseudo(MachineInstr &MI) const { expandRXYPseudo(MI, SystemZ::ST, SystemZ::STFH); return true; + case SystemZ::STOCMux: + expandLOCPseudo(MI, SystemZ::STOC, SystemZ::STOCFH); + return true; + case SystemZ::LHIMux: expandRIPseudo(MI, SystemZ::LHI, SystemZ::IIHF, true); return true; @@ -1153,6 +1321,10 @@ bool SystemZInstrInfo::expandPostRAPseudo(MachineInstr &MI) const { expandRIPseudo(MI, SystemZ::AFI, SystemZ::AIH, false); return true; + case SystemZ::CHIMux: + expandRIPseudo(MI, SystemZ::CHI, SystemZ::CIH, false); + return true; + case SystemZ::CFIMux: expandRIPseudo(MI, SystemZ::CFI, SystemZ::CIH, false); return true; @@ -1194,7 +1366,7 @@ bool SystemZInstrInfo::expandPostRAPseudo(MachineInstr &MI) const { } } -uint64_t SystemZInstrInfo::getInstSizeInBytes(const MachineInstr &MI) const { +unsigned SystemZInstrInfo::getInstSizeInBytes(const MachineInstr &MI) const { if (MI.getOpcode() == TargetOpcode::INLINEASM) { const MachineFunction *MF = MI.getParent()->getParent(); const char *AsmStr = MI.getOperand(0).getSymbolName(); @@ -1218,6 +1390,7 @@ SystemZInstrInfo::getBranchInfo(const MachineInstr &MI) const { MI.getOperand(1).getImm(), &MI.getOperand(2)); case SystemZ::BRCT: + case SystemZ::BRCTH: return SystemZII::Branch(SystemZII::BranchCT, SystemZ::CCMASK_ICMP, SystemZ::CCMASK_CMP_NE, &MI.getOperand(2)); @@ -1403,6 +1576,14 @@ unsigned SystemZInstrInfo::getFusedCompare(unsigned Opcode, case SystemZ::CLGFI: if (!(MI && isUInt<8>(MI->getOperand(1).getImm()))) return 0; + break; + case SystemZ::CL: + case SystemZ::CLG: + if (!STI.hasMiscellaneousExtensions()) + return 0; + if (!(MI && MI->getOperand(3).getReg() == 0)) + return 0; + break; } switch (Type) { case SystemZII::CompareAndBranch: @@ -1486,6 +1667,10 @@ unsigned SystemZInstrInfo::getFusedCompare(unsigned Opcode, return 
SystemZ::CLFIT;
     case SystemZ::CLGFI:
       return SystemZ::CLGIT;
+    case SystemZ::CL:
+      return SystemZ::CLT;
+    case SystemZ::CLG:
+      return SystemZ::CLGT;
     default:
       return 0;
     }
@@ -1493,6 +1678,25 @@ unsigned SystemZInstrInfo::getFusedCompare(unsigned Opcode,
   return 0;
 }
+unsigned SystemZInstrInfo::getLoadAndTrap(unsigned Opcode) const {
+  if (!STI.hasLoadAndTrap())
+    return 0;
+  switch (Opcode) {
+  case SystemZ::L:
+  case SystemZ::LY:
+    return SystemZ::LAT;
+  case SystemZ::LG:
+    return SystemZ::LGAT;
+  case SystemZ::LFH:
+    return SystemZ::LFHAT;
+  case SystemZ::LLGF:
+    return SystemZ::LLGFAT;
+  case SystemZ::LLGT:
+    return SystemZ::LLGTAT;
+  }
+  return 0;
+}
+
 void SystemZInstrInfo::loadImmediate(MachineBasicBlock &MBB,
                                      MachineBasicBlock::iterator MBBI,
                                      unsigned Reg, uint64_t Value) const {
@@ -1511,3 +1715,38 @@ void SystemZInstrInfo::loadImmediate(MachineBasicBlock &MBB,
   }
   BuildMI(MBB, MBBI, DL, get(Opcode), Reg).addImm(Value);
 }
+
+bool SystemZInstrInfo::
+areMemAccessesTriviallyDisjoint(MachineInstr &MIa, MachineInstr &MIb,
+                                AliasAnalysis *AA) const {
+
+  if (!MIa.hasOneMemOperand() || !MIb.hasOneMemOperand())
+    return false;
+
+  // If mem-operands show that the same address Value is used by both
+  // instructions, check for non-overlapping offsets and widths. Not
+  // sure if a register based analysis would be an improvement...
+
+  MachineMemOperand *MMOa = *MIa.memoperands_begin();
+  MachineMemOperand *MMOb = *MIb.memoperands_begin();
+  const Value *VALa = MMOa->getValue();
+  const Value *VALb = MMOb->getValue();
+  bool SameVal = (VALa && VALb && (VALa == VALb));
+  if (!SameVal) {
+    const PseudoSourceValue *PSVa = MMOa->getPseudoValue();
+    const PseudoSourceValue *PSVb = MMOb->getPseudoValue();
+    if (PSVa && PSVb && (PSVa == PSVb))
+      SameVal = true;
+  }
+  if (SameVal) {
+    int OffsetA = MMOa->getOffset(), OffsetB = MMOb->getOffset();
+    int WidthA = MMOa->getSize(), WidthB = MMOb->getSize();
+    int LowOffset = OffsetA < OffsetB ?
OffsetA : OffsetB;
+    int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
+    int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
+    if (LowOffset + LowWidth <= HighOffset)
+      return true;
+  }
+
+  return false;
+}
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.h b/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.h
index 010010b..794b193 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.h
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.h
@@ -142,6 +142,10 @@ class SystemZInstrInfo : public SystemZGenInstrInfo {
                        unsigned LowOpcodeK, unsigned HighOpcode) const;
   void expandRXYPseudo(MachineInstr &MI, unsigned LowOpcode,
                        unsigned HighOpcode) const;
+  void expandLOCPseudo(MachineInstr &MI, unsigned LowOpcode,
+                       unsigned HighOpcode) const;
+  void expandLOCRPseudo(MachineInstr &MI, unsigned LowOpcode,
+                        unsigned HighOpcode) const;
   void expandZExtPseudo(MachineInstr &MI, unsigned LowOpcode,
                         unsigned Size) const;
   void expandLoadStackGuard(MachineInstr *MI) const;
@@ -149,7 +153,23 @@ class SystemZInstrInfo : public SystemZGenInstrInfo {
                      const DebugLoc &DL, unsigned DestReg, unsigned SrcReg,
                      unsigned LowLowOpcode, unsigned Size, bool KillSrc) const;
   virtual void anchor();
-
+
+protected:
+  /// Commutes the operands in the given instruction by changing the operands
+  /// order and/or changing the instruction's opcode and/or the immediate value
+  /// operand.
+  ///
+  /// The arguments 'CommuteOpIdx1' and 'CommuteOpIdx2' specify the operands
+  /// to be commuted.
+  ///
+  /// Do not call this method for a non-commutable instruction or
+  /// non-commutable operands.
+  /// Even though the instruction is commutable, the method may still
+  /// fail to commute the operands, null pointer is returned in such cases.
+  MachineInstr *commuteInstructionImpl(MachineInstr &MI, bool NewMI,
+                                       unsigned CommuteOpIdx1,
+                                       unsigned CommuteOpIdx2) const override;
+
 public:
   explicit SystemZInstrInfo(SystemZSubtarget &STI);
@@ -164,15 +184,25 @@ public:
                      MachineBasicBlock *&FBB,
                      SmallVectorImpl<MachineOperand> &Cond,
                      bool AllowModify) const override;
-  unsigned RemoveBranch(MachineBasicBlock &MBB) const override;
-  unsigned InsertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,
+  unsigned removeBranch(MachineBasicBlock &MBB,
+                        int *BytesRemoved = nullptr) const override;
+  unsigned insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,
                         MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,
-                        const DebugLoc &DL) const override;
+                        const DebugLoc &DL,
+                        int *BytesAdded = nullptr) const override;
   bool analyzeCompare(const MachineInstr &MI, unsigned &SrcReg,
                       unsigned &SrcReg2, int &Mask, int &Value) const override;
   bool optimizeCompareInstr(MachineInstr &CmpInstr, unsigned SrcReg,
                             unsigned SrcReg2, int Mask, int Value,
                             const MachineRegisterInfo *MRI) const override;
+  bool canInsertSelect(const MachineBasicBlock&, ArrayRef<MachineOperand> Cond,
+                       unsigned, unsigned, int&, int&, int&) const override;
+  void insertSelect(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
+                    const DebugLoc &DL, unsigned DstReg,
+                    ArrayRef<MachineOperand> Cond, unsigned TrueReg,
+                    unsigned FalseReg) const override;
+  bool FoldImmediate(MachineInstr &UseMI, MachineInstr &DefMI, unsigned Reg,
+                     MachineRegisterInfo *MRI) const override;
   bool isPredicable(MachineInstr &MI) const override;
   bool isProfitableToIfCvt(MachineBasicBlock &MBB, unsigned NumCycles,
                            unsigned ExtraPredCycles,
@@ -212,14 +242,14 @@ public:
       MachineBasicBlock::iterator InsertPt, MachineInstr &LoadMI,
       LiveIntervals *LIS = nullptr) const override;
   bool expandPostRAPseudo(MachineInstr &MBBI) const override;
-  bool ReverseBranchCondition(SmallVectorImpl<MachineOperand> &Cond) const
+  bool
reverseBranchCondition(SmallVectorImpl<MachineOperand> &Cond) const
     override;
   // Return the SystemZRegisterInfo, which this class owns.
   const SystemZRegisterInfo &getRegisterInfo() const { return RI; }
   // Return the size in bytes of MI.
-  uint64_t getInstSizeInBytes(const MachineInstr &MI) const;
+  unsigned getInstSizeInBytes(const MachineInstr &MI) const override;
   // Return true if MI is a conditional or unconditional branch.
   // When returning true, set Cond to the mask of condition-code
@@ -256,11 +286,23 @@ public:
                            SystemZII::FusedCompareType Type,
                            const MachineInstr *MI = nullptr) const;
+  // If Opcode is a LOAD opcode for with an associated LOAD AND TRAP
+  // operation exists, returh the opcode for the latter, otherwise return 0.
+  unsigned getLoadAndTrap(unsigned Opcode) const;
+
   // Emit code before MBBI in MI to move immediate value Value into
   // physical register Reg.
   void loadImmediate(MachineBasicBlock &MBB,
                      MachineBasicBlock::iterator MBBI,
                      unsigned Reg, uint64_t Value) const;
+
+  // Sometimes, it is possible for the target to tell, even without
+  // aliasing information, that two MIs access different memory
+  // addresses. This function returns true if two MIs access different
+  // memory addresses and false otherwise.
+  bool
+  areMemAccessesTriviallyDisjoint(MachineInstr &MIa, MachineInstr &MIb,
+                                  AliasAnalysis *AA = nullptr) const override;
 };
 } // end namespace llvm
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.td b/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.td
index c510ca7..d63525f 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.td
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZInstrInfo.td
@@ -11,10 +11,12 @@
 // Stack allocation
 //===----------------------------------------------------------------------===//
-def ADJCALLSTACKDOWN : Pseudo<(outs), (ins i64imm:$amt),
-    [(callseq_start timm:$amt)]>;
-def ADJCALLSTACKUP : Pseudo<(outs), (ins i64imm:$amt1, i64imm:$amt2),
-    [(callseq_end timm:$amt1, timm:$amt2)]>;
+let hasNoSchedulingInfo = 1 in {
+  def ADJCALLSTACKDOWN : Pseudo<(outs), (ins i64imm:$amt),
+      [(callseq_start timm:$amt)]>;
+  def ADJCALLSTACKUP : Pseudo<(outs), (ins i64imm:$amt1, i64imm:$amt2),
+      [(callseq_end timm:$amt1, timm:$amt2)]>;
+}
 let hasSideEffects = 0 in {
   // Takes as input the value of the stack pointer after a dynamic allocation
@@ -29,348 +31,225 @@ let hasSideEffects = 0 in {
 }
 //===----------------------------------------------------------------------===//
-// Control flow instructions
+// Branch instructions
 //===----------------------------------------------------------------------===//
-// A return instruction (br %r14).
-let isReturn = 1, isTerminator = 1, isBarrier = 1, hasCtrlDep = 1 in
-  def Return : Alias<2, (outs), (ins), [(z_retflag)]>;
-
-// A conditional return instruction (bcr <cond>, %r14).
-let isReturn = 1, isTerminator = 1, hasCtrlDep = 1, CCMaskFirst = 1, Uses = [CC] in
-  def CondReturn : Alias<2, (outs), (ins cond4:$valid, cond4:$R1), []>;
-
-// Fused compare and conditional returns.
-let isReturn = 1, isTerminator = 1, hasCtrlDep = 1 in {
-  def CRBReturn : Alias<6, (outs), (ins GR32:$R1, GR32:$R2, cond4:$M3), []>;
-  def CGRBReturn : Alias<6, (outs), (ins GR64:$R1, GR64:$R2, cond4:$M3), []>;
-  def CIBReturn : Alias<6, (outs), (ins GR32:$R1, imm32sx8:$I2, cond4:$M3), []>;
-  def CGIBReturn : Alias<6, (outs), (ins GR64:$R1, imm64sx8:$I2, cond4:$M3), []>;
-  def CLRBReturn : Alias<6, (outs), (ins GR32:$R1, GR32:$R2, cond4:$M3), []>;
-  def CLGRBReturn : Alias<6, (outs), (ins GR64:$R1, GR64:$R2, cond4:$M3), []>;
-  def CLIBReturn : Alias<6, (outs), (ins GR32:$R1, imm32zx8:$I2, cond4:$M3), []>;
-  def CLGIBReturn : Alias<6, (outs), (ins GR64:$R1, imm64zx8:$I2, cond4:$M3), []>;
-}
-
-// Unconditional branches. R1 is the condition-code mask (all 1s).
-let isBranch = 1, isTerminator = 1, isBarrier = 1, R1 = 15 in {
-  let isIndirectBranch = 1 in
-    def BR : InstRR<0x07, (outs), (ins ADDR64:$R2),
-        "br\t$R2", [(brind ADDR64:$R2)]>;
-
-  // An assembler extended mnemonic for BRC.
-  def J : InstRI<0xA74, (outs), (ins brtarget16:$I2), "j\t$I2",
-      [(br bb:$I2)]>;
-
-  // An assembler extended mnemonic for BRCL. (The extension is "G"
-  // rather than "L" because "JL" is "Jump if Less".)
-  def JG : InstRIL<0xC04, (outs), (ins brtarget32:$I2), "jg\t$I2", []>;
-}
+// Conditional branches.
+let isBranch = 1, isTerminator = 1, Uses = [CC] in {
+  // It's easier for LLVM to handle these branches in their raw BRC/BRCL form
+  // with the condition-code mask being the first operand. It seems friendlier
+  // to use mnemonic forms like JE and JLH when writing out the assembly though.
+  let isCodeGenOnly = 1 in {
+    // An assembler extended mnemonic for BRC.
+    def BRC : CondBranchRI <"j#", 0xA74, z_br_ccmask>;
+    // An assembler extended mnemonic for BRCL. (The extension is "G"
+    // rather than "L" because "JL" is "Jump if Less".)
+    def BRCL : CondBranchRIL<"jg#", 0xC04>;
+    let isIndirectBranch = 1 in {
+      def BC : CondBranchRX<"b#", 0x47>;
+      def BCR : CondBranchRR<"b#r", 0x07>;
+    }
+  }
-// FIXME: This trap instruction should be marked as isTerminator, but there is
-// currently a general bug that allows non-terminators to be placed between
-// terminators. Temporarily leave this unmarked until the bug is fixed.
-let isBarrier = 1, hasCtrlDep = 1 in {
-  def Trap : Alias<4, (outs), (ins), [(trap)]>;
-}
+  // Allow using the raw forms directly from the assembler (and occasional
+  // special code generation needs) as well.
+  def BRCAsm : AsmCondBranchRI <"brc", 0xA74>;
+  def BRCLAsm : AsmCondBranchRIL<"brcl", 0xC04>;
+  let isIndirectBranch = 1 in {
+    def BCAsm : AsmCondBranchRX<"bc", 0x47>;
+    def BCRAsm : AsmCondBranchRR<"bcr", 0x07>;
+  }
-let isTerminator = 1, hasCtrlDep = 1, Uses = [CC] in {
-  def CondTrap : Alias<4, (outs), (ins cond4:$valid, cond4:$R1), []>;
+  // Define AsmParser extended mnemonics for each general condition-code mask
+  // (integer or floating-point)
+  foreach V = [ "E", "NE", "H", "NH", "L", "NL", "HE", "NHE", "LE", "NLE",
+                "Z", "NZ", "P", "NP", "M", "NM", "LH", "NLH", "O", "NO" ] in {
+    def JAsm#V : FixedCondBranchRI <CV<V>, "j#", 0xA74>;
+    def JGAsm#V : FixedCondBranchRIL<CV<V>, "jg#", 0xC04>;
+    let isIndirectBranch = 1 in {
+      def BAsm#V : FixedCondBranchRX <CV<V>, "b#", 0x47>;
+      def BRAsm#V : FixedCondBranchRR <CV<V>, "b#r", 0x07>;
+    }
+  }
 }
-// Conditional branches. It's easier for LLVM to handle these branches
-// in their raw BRC/BRCL form, with the 4-bit condition-code mask being
-// the first operand. It seems friendlier to use mnemonic forms like
-// JE and JLH when writing out the assembly though.
-let isBranch = 1, isTerminator = 1, Uses = [CC] in {
-  let isCodeGenOnly = 1, CCMaskFirst = 1 in {
-    def BRC : InstRI<0xA74, (outs), (ins cond4:$valid, cond4:$R1,
-        brtarget16:$I2), "j$R1\t$I2",
-        [(z_br_ccmask cond4:$valid, cond4:$R1, bb:$I2)]>;
-    def BRCL : InstRIL<0xC04, (outs), (ins cond4:$valid, cond4:$R1,
-        brtarget32:$I2), "jg$R1\t$I2", []>;
-    let isIndirectBranch = 1 in
-      def BCR : InstRR<0x07, (outs), (ins cond4:$valid, cond4:$R1, GR64:$R2),
-          "b${R1}r\t$R2", []>;
-  }
-  def AsmBRC : InstRI<0xA74, (outs), (ins imm32zx4:$R1, brtarget16:$I2),
-      "brc\t$R1, $I2", []>;
-  def AsmBRCL : InstRIL<0xC04, (outs), (ins imm32zx4:$R1, brtarget32:$I2),
-      "brcl\t$R1, $I2", []>;
+// Unconditional branches. These are in fact simply variants of the
+// conditional branches with the condition mask set to "always".
+let isBranch = 1, isTerminator = 1, isBarrier = 1 in {
+  def J : FixedCondBranchRI <CondAlways, "j", 0xA74, br>;
+  def JG : FixedCondBranchRIL<CondAlways, "jg", 0xC04>;
   let isIndirectBranch = 1 in {
-    def AsmBC : InstRX<0x47, (outs), (ins imm32zx4:$R1, bdxaddr12only:$XBD2),
-        "bc\t$R1, $XBD2", []>;
-    def AsmBCR : InstRR<0x07, (outs), (ins imm32zx4:$R1, GR64:$R2),
-        "bcr\t$R1, $R2", []>;
+    def B : FixedCondBranchRX<CondAlways, "b", 0x47>;
+    def BR : FixedCondBranchRR<CondAlways, "br", 0x07, brind>;
   }
 }
-def AsmNop : InstAlias<"nop\t$XBD", (AsmBC 0, bdxaddr12only:$XBD), 0>;
-def AsmNopR : InstAlias<"nopr\t$R", (AsmBCR 0, GR64:$R), 0>;
+// NOPs. These are again variants of the conditional branches,
+// with the condition mask set to "never".
+def NOP : InstAlias<"nop\t$XBD", (BCAsm 0, bdxaddr12only:$XBD), 0>;
+def NOPR : InstAlias<"nopr\t$R", (BCRAsm 0, GR64:$R), 0>;
-// Fused compare-and-branch instructions. As for normal branches,
-// we handle these instructions internally in their raw CRJ-like form,
-// but use assembly macros like CRJE when writing them out.
+// Fused compare-and-branch instructions.
 //
 // These instructions do not use or clobber the condition codes.
-// We nevertheless pretend that they clobber CC, so that we can lower
-// them to separate comparisons and BRCLs if the branch ends up being
-// out of range.
-multiclass CompareBranches<Operand ccmask, string pos1, string pos2> {
-  let isBranch = 1, isTerminator = 1, Defs = [CC] in {
-    def RJ : InstRIEb<0xEC76, (outs), (ins GR32:$R1, GR32:$R2, ccmask:$M3,
-        brtarget16:$RI4),
-        "crj"##pos1##"\t$R1, $R2"##pos2##", $RI4", []>;
-    def GRJ : InstRIEb<0xEC64, (outs), (ins GR64:$R1, GR64:$R2, ccmask:$M3,
-        brtarget16:$RI4),
-        "cgrj"##pos1##"\t$R1, $R2"##pos2##", $RI4", []>;
-    def IJ : InstRIEc<0xEC7E, (outs), (ins GR32:$R1, imm32sx8:$I2, ccmask:$M3,
-        brtarget16:$RI4),
-        "cij"##pos1##"\t$R1, $I2"##pos2##", $RI4", []>;
-    def GIJ : InstRIEc<0xEC7C, (outs), (ins GR64:$R1, imm64sx8:$I2, ccmask:$M3,
-        brtarget16:$RI4),
-        "cgij"##pos1##"\t$R1, $I2"##pos2##", $RI4", []>;
-    def LRJ : InstRIEb<0xEC77, (outs), (ins GR32:$R1, GR32:$R2, ccmask:$M3,
-        brtarget16:$RI4),
-        "clrj"##pos1##"\t$R1, $R2"##pos2##", $RI4", []>;
-    def LGRJ : InstRIEb<0xEC65, (outs), (ins GR64:$R1, GR64:$R2, ccmask:$M3,
-        brtarget16:$RI4),
-        "clgrj"##pos1##"\t$R1, $R2"##pos2##", $RI4", []>;
-    def LIJ : InstRIEc<0xEC7F, (outs), (ins GR32:$R1, imm32zx8:$I2, ccmask:$M3,
-        brtarget16:$RI4),
-        "clij"##pos1##"\t$R1, $I2"##pos2##", $RI4", []>;
-    def LGIJ : InstRIEc<0xEC7D, (outs), (ins GR64:$R1, imm64zx8:$I2, ccmask:$M3,
-        brtarget16:$RI4),
-        "clgij"##pos1##"\t$R1, $I2"##pos2##", $RI4", []>;
-    let isIndirectBranch = 1 in {
-      def RB : InstRRS<0xECF6, (outs), (ins GR32:$R1, GR32:$R2, ccmask:$M3,
-          bdaddr12only:$BD4),
-          "crb"##pos1##"\t$R1, $R2"##pos2##", $BD4", []>;
-      def GRB : InstRRS<0xECE4, (outs), (ins GR64:$R1, GR64:$R2, ccmask:$M3,
-          bdaddr12only:$BD4),
-          "cgrb"##pos1##"\t$R1, $R2"##pos2##", $BD4", []>;
-      def IB : InstRIS<0xECFE, (outs), (ins GR32:$R1, imm32sx8:$I2, ccmask:$M3,
-          bdaddr12only:$BD4),
-          "cib"##pos1##"\t$R1, $I2"##pos2##", $BD4",
[]>;
-      def GIB : InstRIS<0xECFC, (outs), (ins GR64:$R1, imm64sx8:$I2, ccmask:$M3,
-          bdaddr12only:$BD4),
-          "cgib"##pos1##"\t$R1, $I2"##pos2##", $BD4", []>;
-      def LRB : InstRRS<0xECF7, (outs), (ins GR32:$R1, GR32:$R2, ccmask:$M3,
-          bdaddr12only:$BD4),
-          "clrb"##pos1##"\t$R1, $R2"##pos2##", $BD4", []>;
-      def LGRB : InstRRS<0xECE5, (outs), (ins GR64:$R1, GR64:$R2, ccmask:$M3,
-          bdaddr12only:$BD4),
-          "clgrb"##pos1##"\t$R1, $R2"##pos2##", $BD4", []>;
-      def LIB : InstRIS<0xECFF, (outs), (ins GR32:$R1, imm32zx8:$I2, ccmask:$M3,
-          bdaddr12only:$BD4),
-          "clib"##pos1##"\t$R1, $I2"##pos2##", $BD4", []>;
-      def LGIB : InstRIS<0xECFD, (outs), (ins GR64:$R1, imm64zx8:$I2, ccmask:$M3,
-          bdaddr12only:$BD4),
-          "clgib"##pos1##"\t$R1, $I2"##pos2##", $BD4", []>;
-    }
+// We nevertheless pretend that the relative compare-and-branch
+// instructions clobber CC, so that we can lower them to separate
+// comparisons and BRCLs if the branch ends up being out of range.
+let isBranch = 1, isTerminator = 1 in {
+  // As for normal branches, we handle these instructions internally in
+  // their raw CRJ-like form, but use assembly macros like CRJE when writing
+  // them out. Using the *Pair multiclasses, we also create the raw forms.
+  let Defs = [CC] in {
+    defm CRJ : CmpBranchRIEbPair<"crj", 0xEC76, GR32>;
+    defm CGRJ : CmpBranchRIEbPair<"cgrj", 0xEC64, GR64>;
+    defm CIJ : CmpBranchRIEcPair<"cij", 0xEC7E, GR32, imm32sx8>;
+    defm CGIJ : CmpBranchRIEcPair<"cgij", 0xEC7C, GR64, imm64sx8>;
+    defm CLRJ : CmpBranchRIEbPair<"clrj", 0xEC77, GR32>;
+    defm CLGRJ : CmpBranchRIEbPair<"clgrj", 0xEC65, GR64>;
+    defm CLIJ : CmpBranchRIEcPair<"clij", 0xEC7F, GR32, imm32zx8>;
+    defm CLGIJ : CmpBranchRIEcPair<"clgij", 0xEC7D, GR64, imm64zx8>;
   }
-
-  let isTerminator = 1, hasCtrlDep = 1 in {
-    def RT : InstRRFc<0xB972, (outs), (ins GR32:$R1, GR32:$R2, ccmask:$M3),
-        "crt"##pos1##"\t$R1, $R2"##pos2, []>;
-    def GRT : InstRRFc<0xB960, (outs), (ins GR64:$R1, GR64:$R2, ccmask:$M3),
-        "cgrt"##pos1##"\t$R1, $R2"##pos2, []>;
-    def LRT : InstRRFc<0xB973, (outs), (ins GR32:$R1, GR32:$R2, ccmask:$M3),
-        "clrt"##pos1##"\t$R1, $R2"##pos2, []>;
-    def LGRT : InstRRFc<0xB961, (outs), (ins GR64:$R1, GR64:$R2, ccmask:$M3),
-        "clgrt"##pos1##"\t$R1, $R2"##pos2, []>;
-    def IT : InstRIEa<0xEC72, (outs), (ins GR32:$R1, imm32sx16:$I2, ccmask:$M3),
-        "cit"##pos1##"\t$R1, $I2"##pos2, []>;
-    def GIT : InstRIEa<0xEC70, (outs), (ins GR64:$R1, imm32sx16:$I2, ccmask:$M3),
-        "cgit"##pos1##"\t$R1, $I2"##pos2, []>;
-    def LFIT : InstRIEa<0xEC73, (outs), (ins GR32:$R1, imm32zx16:$I2, ccmask:$M3),
-        "clfit"##pos1##"\t$R1, $I2"##pos2, []>;
-    def LGIT : InstRIEa<0xEC71, (outs), (ins GR64:$R1, imm32zx16:$I2, ccmask:$M3),
-        "clgit"##pos1##"\t$R1, $I2"##pos2, []>;
-  }
-}
-let isCodeGenOnly = 1 in
-  defm C : CompareBranches<cond4, "$M3", "">;
-defm AsmC : CompareBranches<imm32zx4, "", ", $M3">;
-
-// Define AsmParser mnemonics for each general condition-code mask
-// (integer or floating-point)
-multiclass CondExtendedMnemonicA<bits<4> ccmask, string name> {
-  let isBranch = 1, isTerminator = 1, R1 = ccmask in {
-    def J : InstRI<0xA74, (outs), (ins brtarget16:$I2),
-        "j"##name##"\t$I2", []>;
-    def JG : InstRIL<0xC04, (outs), (ins brtarget32:$I2),
-
"jg"##name##"\t$I2", []>;
-    def BR : InstRR<0x07, (outs), (ins ADDR64:$R2), "b"##name##"r\t$R2", []>;
+  let isIndirectBranch = 1 in {
+    defm CRB : CmpBranchRRSPair<"crb", 0xECF6, GR32>;
+    defm CGRB : CmpBranchRRSPair<"cgrb", 0xECE4, GR64>;
+    defm CIB : CmpBranchRISPair<"cib", 0xECFE, GR32, imm32sx8>;
+    defm CGIB : CmpBranchRISPair<"cgib", 0xECFC, GR64, imm64sx8>;
+    defm CLRB : CmpBranchRRSPair<"clrb", 0xECF7, GR32>;
+    defm CLGRB : CmpBranchRRSPair<"clgrb", 0xECE5, GR64>;
+    defm CLIB : CmpBranchRISPair<"clib", 0xECFF, GR32, imm32zx8>;
+    defm CLGIB : CmpBranchRISPair<"clgib", 0xECFD, GR64, imm64zx8>;
   }
-  def LOCR : FixedCondUnaryRRF<"locr"##name, 0xB9F2, GR32, GR32, ccmask>;
-  def LOCGR : FixedCondUnaryRRF<"locgr"##name, 0xB9E2, GR64, GR64, ccmask>;
-  def LOCHI : FixedCondUnaryRIE<"lochi"##name, 0xEC42, GR64, imm32sx16,
-      ccmask>;
-  def LOCGHI: FixedCondUnaryRIE<"locghi"##name, 0xEC46, GR64, imm64sx16,
-      ccmask>;
-  def LOC : FixedCondUnaryRSY<"loc"##name, 0xEBF2, GR32, ccmask, 4>;
-  def LOCG : FixedCondUnaryRSY<"locg"##name, 0xEBE2, GR64, ccmask, 8>;
-  def STOC : FixedCondStoreRSY<"stoc"##name, 0xEBF3, GR32, ccmask, 4>;
-  def STOCG : FixedCondStoreRSY<"stocg"##name, 0xEBE3, GR64, ccmask, 8>;
-}
-
-multiclass CondExtendedMnemonic<bits<4> ccmask, string name1, string name2>
-  : CondExtendedMnemonicA<ccmask, name1> {
-  let isAsmParserOnly = 1 in
-    defm Alt : CondExtendedMnemonicA<ccmask, name2>;
-}
-
-defm AsmO : CondExtendedMnemonicA<1, "o">;
-defm AsmH : CondExtendedMnemonic<2, "h", "p">;
-defm AsmNLE : CondExtendedMnemonicA<3, "nle">;
-defm AsmL : CondExtendedMnemonic<4, "l", "m">;
-defm AsmNHE : CondExtendedMnemonicA<5, "nhe">;
-defm AsmLH : CondExtendedMnemonicA<6, "lh">;
-defm AsmNE : CondExtendedMnemonic<7, "ne", "nz">;
-defm AsmE : CondExtendedMnemonic<8, "e", "z">;
-defm AsmNLH : CondExtendedMnemonicA<9, "nlh">;
-defm AsmHE : CondExtendedMnemonicA<10, "he">;
-defm AsmNL : CondExtendedMnemonic<11, "nl", "nm">;
-defm AsmLE : CondExtendedMnemonicA<12, "le">;
-defm
AsmNH : CondExtendedMnemonic<13, "nh", "np">;
-defm AsmNO : CondExtendedMnemonicA<14, "no">;
-
-// Define AsmParser mnemonics for each integer condition-code mask.
-// This is like the list above, except that condition 3 is not possible
-// and that the low bit of the mask is therefore always 0. This means
-// that each condition has two names. Conditions "o" and "no" are not used.
-//
-// We don't make one of the two names an alias of the other because
-// we need the custom parsing routines to select the correct register class.
-multiclass IntCondExtendedMnemonicA<bits<4> ccmask, string name> {
-  let isBranch = 1, isTerminator = 1, M3 = ccmask in {
-    def CRJ : InstRIEb<0xEC76, (outs), (ins GR32:$R1, GR32:$R2,
-        brtarget16:$RI4),
-        "crj"##name##"\t$R1, $R2, $RI4", []>;
-    def CGRJ : InstRIEb<0xEC64, (outs), (ins GR64:$R1, GR64:$R2,
-        brtarget16:$RI4),
-        "cgrj"##name##"\t$R1, $R2, $RI4", []>;
-    def CIJ : InstRIEc<0xEC7E, (outs), (ins GR32:$R1, imm32sx8:$I2,
-        brtarget16:$RI4),
-        "cij"##name##"\t$R1, $I2, $RI4", []>;
-    def CGIJ : InstRIEc<0xEC7C, (outs), (ins GR64:$R1, imm64sx8:$I2,
-        brtarget16:$RI4),
-        "cgij"##name##"\t$R1, $I2, $RI4", []>;
-    def CLRJ : InstRIEb<0xEC77, (outs), (ins GR32:$R1, GR32:$R2,
-        brtarget16:$RI4),
-        "clrj"##name##"\t$R1, $R2, $RI4", []>;
-    def CLGRJ : InstRIEb<0xEC65, (outs), (ins GR64:$R1, GR64:$R2,
-        brtarget16:$RI4),
-        "clgrj"##name##"\t$R1, $R2, $RI4", []>;
-    def CLIJ : InstRIEc<0xEC7F, (outs), (ins GR32:$R1, imm32zx8:$I2,
-        brtarget16:$RI4),
-        "clij"##name##"\t$R1, $I2, $RI4", []>;
-    def CLGIJ : InstRIEc<0xEC7D, (outs), (ins GR64:$R1, imm64zx8:$I2,
-        brtarget16:$RI4),
-        "clgij"##name##"\t$R1, $I2, $RI4", []>;
+
+  // Define AsmParser mnemonics for each integer condition-code mask.
+  foreach V = [ "E", "H", "L", "HE", "LE", "LH",
+                "NE", "NH", "NL", "NHE", "NLE", "NLH" ] in {
+    let Defs = [CC] in {
+      def CRJAsm#V : FixedCmpBranchRIEb<ICV<V>, "crj", 0xEC76, GR32>;
+      def CGRJAsm#V : FixedCmpBranchRIEb<ICV<V>, "cgrj", 0xEC64, GR64>;
+      def CIJAsm#V : FixedCmpBranchRIEc<ICV<V>, "cij", 0xEC7E, GR32,
+          imm32sx8>;
+      def CGIJAsm#V : FixedCmpBranchRIEc<ICV<V>, "cgij", 0xEC7C, GR64,
+          imm64sx8>;
+      def CLRJAsm#V : FixedCmpBranchRIEb<ICV<V>, "clrj", 0xEC77, GR32>;
+      def CLGRJAsm#V : FixedCmpBranchRIEb<ICV<V>, "clgrj", 0xEC65, GR64>;
+      def CLIJAsm#V : FixedCmpBranchRIEc<ICV<V>, "clij", 0xEC7F, GR32,
+          imm32zx8>;
+      def CLGIJAsm#V : FixedCmpBranchRIEc<ICV<V>, "clgij", 0xEC7D, GR64,
+          imm64zx8>;
+    }
     let isIndirectBranch = 1 in {
-      def CRB : InstRRS<0xECF6, (outs), (ins GR32:$R1, GR32:$R2,
-          bdaddr12only:$BD4),
-          "crb"##name##"\t$R1, $R2, $BD4", []>;
-      def CGRB : InstRRS<0xECE4, (outs), (ins GR64:$R1, GR64:$R2,
-          bdaddr12only:$BD4),
-          "cgrb"##name##"\t$R1, $R2, $BD4", []>;
-      def CIB : InstRIS<0xECFE, (outs), (ins GR32:$R1, imm32sx8:$I2,
-          bdaddr12only:$BD4),
-          "cib"##name##"\t$R1, $I2, $BD4", []>;
-      def CGIB : InstRIS<0xECFC, (outs), (ins GR64:$R1, imm64sx8:$I2,
-          bdaddr12only:$BD4),
-          "cgib"##name##"\t$R1, $I2, $BD4", []>;
-      def CLRB : InstRRS<0xECF7, (outs), (ins GR32:$R1, GR32:$R2,
-          bdaddr12only:$BD4),
-          "clrb"##name##"\t$R1, $R2, $BD4", []>;
-      def CLGRB : InstRRS<0xECE5, (outs), (ins GR64:$R1, GR64:$R2,
-          bdaddr12only:$BD4),
-          "clgrb"##name##"\t$R1, $R2, $BD4", []>;
-      def CLIB : InstRIS<0xECFF, (outs), (ins GR32:$R1, imm32zx8:$I2,
-          bdaddr12only:$BD4),
-          "clib"##name##"\t$R1, $I2, $BD4", []>;
-      def CLGIB : InstRIS<0xECFD, (outs), (ins GR64:$R1, imm64zx8:$I2,
-          bdaddr12only:$BD4),
-          "clgib"##name##"\t$R1, $I2, $BD4", []>;
+      def CRBAsm#V : FixedCmpBranchRRS<ICV<V>, "crb", 0xECF6, GR32>;
+      def CGRBAsm#V : FixedCmpBranchRRS<ICV<V>, "cgrb", 0xECE4, GR64>;
+      def CIBAsm#V : FixedCmpBranchRIS<ICV<V>, "cib", 0xECFE, GR32,
+          imm32sx8>;
+      def CGIBAsm#V :
FixedCmpBranchRIS<ICV<V>, "cgib", 0xECFC, GR64,
+          imm64sx8>;
+      def CLRBAsm#V : FixedCmpBranchRRS<ICV<V>, "clrb", 0xECF7, GR32>;
+      def CLGRBAsm#V : FixedCmpBranchRRS<ICV<V>, "clgrb", 0xECE5, GR64>;
+      def CLIBAsm#V : FixedCmpBranchRIS<ICV<V>, "clib", 0xECFF, GR32,
+          imm32zx8>;
+      def CLGIBAsm#V : FixedCmpBranchRIS<ICV<V>, "clgib", 0xECFD, GR64,
+          imm64zx8>;
     }
   }
+}
-  let hasCtrlDep = 1, isTerminator = 1, M3 = ccmask in {
-    def CRT : InstRRFc<0xB972, (outs), (ins GR32:$R1, GR32:$R2),
-        "crt"##name##"\t$R1, $R2", []>;
-    def CGRT : InstRRFc<0xB960, (outs), (ins GR64:$R1, GR64:$R2),
-        "cgrt"##name##"\t$R1, $R2", []>;
-    def CLRT : InstRRFc<0xB973, (outs), (ins GR32:$R1, GR32:$R2),
-        "clrt"##name##"\t$R1, $R2", []>;
-    def CLGRT : InstRRFc<0xB961, (outs), (ins GR64:$R1, GR64:$R2),
-        "clgrt"##name##"\t$R1, $R2", []>;
-    def CIT : InstRIEa<0xEC72, (outs), (ins GR32:$R1, imm32sx16:$I2),
-        "cit"##name##"\t$R1, $I2", []>;
-    def CGIT : InstRIEa<0xEC70, (outs), (ins GR64:$R1, imm32sx16:$I2),
-        "cgit"##name##"\t$R1, $I2", []>;
-    def CLFIT : InstRIEa<0xEC73, (outs), (ins GR32:$R1, imm32zx16:$I2),
-        "clfit"##name##"\t$R1, $I2", []>;
-    def CLGIT : InstRIEa<0xEC71, (outs), (ins GR64:$R1, imm32zx16:$I2),
-        "clgit"##name##"\t$R1, $I2", []>;
+// Decrement a register and branch if it is nonzero. These don't clobber CC,
+// but we might need to split long relative branches into sequences that do.
+let isBranch = 1, isTerminator = 1 in {
+  let Defs = [CC] in {
+    def BRCT : BranchUnaryRI<"brct", 0xA76, GR32>;
+    def BRCTG : BranchUnaryRI<"brctg", 0xA77, GR64>;
   }
+  // This doesn't need to clobber CC since we never need to split it.
+  def BRCTH : BranchUnaryRIL<"brcth", 0xCC6, GRH32>,
+      Requires<[FeatureHighWord]>;
+
+  def BCT : BranchUnaryRX<"bct", 0x46,GR32>;
+  def BCTR : BranchUnaryRR<"bctr", 0x06, GR32>;
+  def BCTG : BranchUnaryRXY<"bctg", 0xE346, GR64>;
+  def BCTGR : BranchUnaryRRE<"bctgr", 0xB946, GR64>;
 }
-multiclass IntCondExtendedMnemonic<bits<4> ccmask, string name1, string name2>
-  : IntCondExtendedMnemonicA<ccmask, name1> {
-  let isAsmParserOnly = 1 in
-    defm Alt : IntCondExtendedMnemonicA<ccmask, name2>;
-}
-defm AsmJH : IntCondExtendedMnemonic<2, "h", "nle">;
-defm AsmJL : IntCondExtendedMnemonic<4, "l", "nhe">;
-defm AsmJLH : IntCondExtendedMnemonic<6, "lh", "ne">;
-defm AsmJE : IntCondExtendedMnemonic<8, "e", "nlh">;
-defm AsmJHE : IntCondExtendedMnemonic<10, "he", "nl">;
-defm AsmJLE : IntCondExtendedMnemonic<12, "le", "nh">;
-// Decrement a register and branch if it is nonzero. These don't clobber CC,
-// but we might need to split long branches into sequences that do.
-let Defs = [CC] in {
-  def BRCT : BranchUnaryRI<"brct", 0xA76, GR32>;
-  def BRCTG : BranchUnaryRI<"brctg", 0xA77, GR64>;
+let isBranch = 1, isTerminator = 1 in {
+  let Defs = [CC] in {
+    def BRXH : BranchBinaryRSI<"brxh", 0x84, GR32>;
+    def BRXLE : BranchBinaryRSI<"brxle", 0x85, GR32>;
+    def BRXHG : BranchBinaryRIEe<"brxhg", 0xEC44, GR64>;
+    def BRXLG : BranchBinaryRIEe<"brxlg", 0xEC45, GR64>;
+  }
+  def BXH : BranchBinaryRS<"bxh", 0x86, GR32>;
+  def BXLE : BranchBinaryRS<"bxle", 0x87, GR32>;
+  def BXHG : BranchBinaryRSY<"bxhg", 0xEB44, GR64>;
+  def BXLEG : BranchBinaryRSY<"bxleg", 0xEB45, GR64>;
 }
 //===----------------------------------------------------------------------===//
-// Select instructions
+// Trap instructions
 //===----------------------------------------------------------------------===//
-def Select32Mux : SelectWrapper<GRX32>, Requires<[FeatureHighWord]>;
-def Select32 : SelectWrapper<GR32>;
-def Select64 : SelectWrapper<GR64>;
+// Unconditional trap.
+// FIXME: This trap instruction should be marked as isTerminator, but there is
+// currently a general bug that allows non-terminators to be placed between
+// terminators. Temporarily leave this unmarked until the bug is fixed.
+let isBarrier = 1, hasCtrlDep = 1 in
+  def Trap : Alias<4, (outs), (ins), [(trap)]>;
-// We don't define 32-bit Mux stores because the low-only STOC should
-// always be used if possible.
-defm CondStore8Mux : CondStores<GRX32, nonvolatile_truncstorei8,
-    nonvolatile_anyextloadi8, bdxaddr20only>,
-    Requires<[FeatureHighWord]>;
-defm CondStore16Mux : CondStores<GRX32, nonvolatile_truncstorei16,
-    nonvolatile_anyextloadi16, bdxaddr20only>,
-    Requires<[FeatureHighWord]>;
-defm CondStore8 : CondStores<GR32, nonvolatile_truncstorei8,
-    nonvolatile_anyextloadi8, bdxaddr20only>;
-defm CondStore16 : CondStores<GR32, nonvolatile_truncstorei16,
-    nonvolatile_anyextloadi16, bdxaddr20only>;
-defm CondStore32 : CondStores<GR32, nonvolatile_store,
-    nonvolatile_load, bdxaddr20only>;
+// Conditional trap.
+let isTerminator = 1, hasCtrlDep = 1, Uses = [CC] in
+  def CondTrap : Alias<4, (outs), (ins cond4:$valid, cond4:$R1), []>;
-defm : CondStores64<CondStore8, CondStore8Inv, nonvolatile_truncstorei8,
-    nonvolatile_anyextloadi8, bdxaddr20only>;
-defm : CondStores64<CondStore16, CondStore16Inv, nonvolatile_truncstorei16,
-    nonvolatile_anyextloadi16, bdxaddr20only>;
-defm : CondStores64<CondStore32, CondStore32Inv, nonvolatile_truncstorei32,
-    nonvolatile_anyextloadi32, bdxaddr20only>;
-defm CondStore64 : CondStores<GR64, nonvolatile_store,
-    nonvolatile_load, bdxaddr20only>;
+// Fused compare-and-trap instructions.
+let isTerminator = 1, hasCtrlDep = 1 in {
+  // These patterns work the same way as for compare-and-branch.
+  defm CRT : CmpBranchRRFcPair<"crt", 0xB972, GR32>;
+  defm CGRT : CmpBranchRRFcPair<"cgrt", 0xB960, GR64>;
+  defm CLRT : CmpBranchRRFcPair<"clrt", 0xB973, GR32>;
+  defm CLGRT : CmpBranchRRFcPair<"clgrt", 0xB961, GR64>;
+  defm CIT : CmpBranchRIEaPair<"cit", 0xEC72, GR32, imm32sx16>;
+  defm CGIT : CmpBranchRIEaPair<"cgit", 0xEC70, GR64, imm64sx16>;
+  defm CLFIT : CmpBranchRIEaPair<"clfit", 0xEC73, GR32, imm32zx16>;
+  defm CLGIT : CmpBranchRIEaPair<"clgit", 0xEC71, GR64, imm64zx16>;
+  let Predicates = [FeatureMiscellaneousExtensions] in {
+    defm CLT : CmpBranchRSYbPair<"clt", 0xEB23, GR32>;
+    defm CLGT : CmpBranchRSYbPair<"clgt", 0xEB2B, GR64>;
+  }
+
+  foreach V = [ "E", "H", "L", "HE", "LE", "LH",
+                "NE", "NH", "NL", "NHE", "NLE", "NLH" ] in {
+    def CRTAsm#V : FixedCmpBranchRRFc<ICV<V>, "crt", 0xB972, GR32>;
+    def CGRTAsm#V : FixedCmpBranchRRFc<ICV<V>, "cgrt", 0xB960, GR64>;
+    def CLRTAsm#V : FixedCmpBranchRRFc<ICV<V>, "clrt", 0xB973, GR32>;
+    def CLGRTAsm#V : FixedCmpBranchRRFc<ICV<V>, "clgrt", 0xB961, GR64>;
+    def CITAsm#V : FixedCmpBranchRIEa<ICV<V>, "cit", 0xEC72, GR32,
+        imm32sx16>;
+    def CGITAsm#V : FixedCmpBranchRIEa<ICV<V>, "cgit", 0xEC70, GR64,
+        imm64sx16>;
+    def CLFITAsm#V : FixedCmpBranchRIEa<ICV<V>, "clfit", 0xEC73, GR32,
+        imm32zx16>;
+    def CLGITAsm#V : FixedCmpBranchRIEa<ICV<V>, "clgit", 0xEC71, GR64,
+        imm64zx16>;
+    let Predicates = [FeatureMiscellaneousExtensions] in {
+      def CLTAsm#V : FixedCmpBranchRSYb<ICV<V>, "clt", 0xEB23, GR32>;
+      def CLGTAsm#V : FixedCmpBranchRSYb<ICV<V>, "clgt", 0xEB2B, GR64>;
+    }
+  }
+}
 //===----------------------------------------------------------------------===//
-// Call instructions
+// Call and return instructions
 //===----------------------------------------------------------------------===//
+// Define the general form of the call instructions for the asm parser.
+// These instructions don't hard-code %r14 as the return address register.
+let isCall = 1, Defs = [CC] in { + def BRAS : CallRI <"bras", 0xA75>; + def BRASL : CallRIL<"brasl", 0xC05>; + def BAS : CallRX <"bas", 0x4D>; + def BASR : CallRR <"basr", 0x0D>; +} + +// Regular calls. let isCall = 1, Defs = [R14D, CC] in { def CallBRASL : Alias<6, (outs), (ins pcrel32:$I2, variable_ops), [(z_call pcrel32:$I2)]>; @@ -378,6 +257,15 @@ let isCall = 1, Defs = [R14D, CC] in { [(z_call ADDR64:$R2)]>; } +// TLS calls. These will be lowered into a call to __tls_get_offset, +// with an extra relocation specifying the TLS symbol. +let isCall = 1, Defs = [R14D, CC] in { + def TLS_GDCALL : Alias<6, (outs), (ins tlssym:$I2, variable_ops), + [(z_tls_gdcall tglobaltlsaddr:$I2)]>; + def TLS_LDCALL : Alias<6, (outs), (ins tlssym:$I2, variable_ops), + [(z_tls_ldcall tglobaltlsaddr:$I2)]>; +} + // Sibling calls. Indirect sibling calls must be via R1, since R2 upwards // are argument registers and since branching to R0 is a no-op. let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1 in { @@ -387,10 +275,10 @@ let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1 in { def CallBR : Alias<2, (outs), (ins), [(z_sibcall R1D)]>; } +// Conditional sibling calls. let CCMaskFirst = 1, isCall = 1, isTerminator = 1, isReturn = 1 in { def CallBRCL : Alias<6, (outs), (ins cond4:$valid, cond4:$R1, pcrel32:$I2), []>; - let Uses = [R1D] in def CallBCR : Alias<2, (outs), (ins cond4:$valid, cond4:$R1), []>; } @@ -407,60 +295,76 @@ let isCall = 1, isTerminator = 1, isReturn = 1, Uses = [R1D] in { def CLGIBCall : Alias<6, (outs), (ins GR64:$R1, imm64zx8:$I2, cond4:$M3), []>; } -// TLS calls. These will be lowered into a call to __tls_get_offset, -// with an extra relocation specifying the TLS symbol. 
-let isCall = 1, Defs = [R14D, CC] in { - def TLS_GDCALL : Alias<6, (outs), (ins tlssym:$I2, variable_ops), - [(z_tls_gdcall tglobaltlsaddr:$I2)]>; - def TLS_LDCALL : Alias<6, (outs), (ins tlssym:$I2, variable_ops), - [(z_tls_ldcall tglobaltlsaddr:$I2)]>; -} +// A return instruction (br %r14). +let isReturn = 1, isTerminator = 1, isBarrier = 1, hasCtrlDep = 1 in + def Return : Alias<2, (outs), (ins), [(z_retflag)]>; -// Define the general form of the call instructions for the asm parser. -// These instructions don't hard-code %r14 as the return address register. -// Allow an optional TLS marker symbol to generate TLS call relocations. -let isCall = 1, Defs = [CC] in { - def BRAS : InstRI<0xA75, (outs), (ins GR64:$R1, brtarget16tls:$I2), - "bras\t$R1, $I2", []>; - def BRASL : InstRIL<0xC05, (outs), (ins GR64:$R1, brtarget32tls:$I2), - "brasl\t$R1, $I2", []>; - def BASR : InstRR<0x0D, (outs), (ins GR64:$R1, ADDR64:$R2), - "basr\t$R1, $R2", []>; +// A conditional return instruction (bcr <cond>, %r14). +let isReturn = 1, isTerminator = 1, hasCtrlDep = 1, CCMaskFirst = 1, Uses = [CC] in + def CondReturn : Alias<2, (outs), (ins cond4:$valid, cond4:$R1), []>; + +// Fused compare and conditional returns. 
+let isReturn = 1, isTerminator = 1, hasCtrlDep = 1 in { + def CRBReturn : Alias<6, (outs), (ins GR32:$R1, GR32:$R2, cond4:$M3), []>; + def CGRBReturn : Alias<6, (outs), (ins GR64:$R1, GR64:$R2, cond4:$M3), []>; + def CIBReturn : Alias<6, (outs), (ins GR32:$R1, imm32sx8:$I2, cond4:$M3), []>; + def CGIBReturn : Alias<6, (outs), (ins GR64:$R1, imm64sx8:$I2, cond4:$M3), []>; + def CLRBReturn : Alias<6, (outs), (ins GR32:$R1, GR32:$R2, cond4:$M3), []>; + def CLGRBReturn : Alias<6, (outs), (ins GR64:$R1, GR64:$R2, cond4:$M3), []>; + def CLIBReturn : Alias<6, (outs), (ins GR32:$R1, imm32zx8:$I2, cond4:$M3), []>; + def CLGIBReturn : Alias<6, (outs), (ins GR64:$R1, imm64zx8:$I2, cond4:$M3), []>; } //===----------------------------------------------------------------------===// +// Select instructions +//===----------------------------------------------------------------------===// + +def Select32Mux : SelectWrapper<GRX32>, Requires<[FeatureHighWord]>; +def Select32 : SelectWrapper<GR32>; +def Select64 : SelectWrapper<GR64>; + +// We don't define 32-bit Mux stores if we don't have STOCFH, because the +// low-only STOC should then always be used if possible. 
+defm CondStore8Mux : CondStores<GRX32, nonvolatile_truncstorei8, + nonvolatile_anyextloadi8, bdxaddr20only>, + Requires<[FeatureHighWord]>; +defm CondStore16Mux : CondStores<GRX32, nonvolatile_truncstorei16, + nonvolatile_anyextloadi16, bdxaddr20only>, + Requires<[FeatureHighWord]>; +defm CondStore32Mux : CondStores<GRX32, nonvolatile_store, + nonvolatile_load, bdxaddr20only>, + Requires<[FeatureLoadStoreOnCond2]>; +defm CondStore8 : CondStores<GR32, nonvolatile_truncstorei8, + nonvolatile_anyextloadi8, bdxaddr20only>; +defm CondStore16 : CondStores<GR32, nonvolatile_truncstorei16, + nonvolatile_anyextloadi16, bdxaddr20only>; +defm CondStore32 : CondStores<GR32, nonvolatile_store, + nonvolatile_load, bdxaddr20only>; + +defm : CondStores64<CondStore8, CondStore8Inv, nonvolatile_truncstorei8, + nonvolatile_anyextloadi8, bdxaddr20only>; +defm : CondStores64<CondStore16, CondStore16Inv, nonvolatile_truncstorei16, + nonvolatile_anyextloadi16, bdxaddr20only>; +defm : CondStores64<CondStore32, CondStore32Inv, nonvolatile_truncstorei32, + nonvolatile_anyextloadi32, bdxaddr20only>; +defm CondStore64 : CondStores<GR64, nonvolatile_store, + nonvolatile_load, bdxaddr20only>; + +//===----------------------------------------------------------------------===// // Move instructions //===----------------------------------------------------------------------===// // Register moves. let hasSideEffects = 0 in { // Expands to LR, RISBHG or RISBLG, depending on the choice of registers. 
- def LRMux : UnaryRRPseudo<"l", null_frag, GRX32, GRX32>, + def LRMux : UnaryRRPseudo<"lr", null_frag, GRX32, GRX32>, Requires<[FeatureHighWord]>; - def LR : UnaryRR <"l", 0x18, null_frag, GR32, GR32>; - def LGR : UnaryRRE<"lg", 0xB904, null_frag, GR64, GR64>; + def LR : UnaryRR <"lr", 0x18, null_frag, GR32, GR32>; + def LGR : UnaryRRE<"lgr", 0xB904, null_frag, GR64, GR64>; } let Defs = [CC], CCValues = 0xE, CompareZeroCCMask = 0xE in { - def LTR : UnaryRR <"lt", 0x12, null_frag, GR32, GR32>; - def LTGR : UnaryRRE<"ltg", 0xB902, null_frag, GR64, GR64>; -} - -// Move on condition. -let isCodeGenOnly = 1, Uses = [CC] in { - def LOCR : CondUnaryRRF<"loc", 0xB9F2, GR32, GR32>; - def LOCGR : CondUnaryRRF<"locg", 0xB9E2, GR64, GR64>; -} -let Uses = [CC] in { - def AsmLOCR : AsmCondUnaryRRF<"loc", 0xB9F2, GR32, GR32>; - def AsmLOCGR : AsmCondUnaryRRF<"locg", 0xB9E2, GR64, GR64>; -} -let isCodeGenOnly = 1, Uses = [CC] in { - def LOCHI : CondUnaryRIE<"lochi", 0xEC42, GR32, imm32sx16>; - def LOCGHI : CondUnaryRIE<"locghi", 0xEC46, GR64, imm64sx16>; -} -let Uses = [CC] in { - def AsmLOCHI : AsmCondUnaryRIE<"lochi", 0xEC42, GR32, imm32sx16>; - def AsmLOCGHI : AsmCondUnaryRIE<"locghi", 0xEC46, GR64, imm64sx16>; + def LTR : UnaryRR <"ltr", 0x12, null_frag, GR32, GR32>; + def LTGR : UnaryRRE<"ltgr", 0xB902, null_frag, GR64, GR64>; } // Immediate moves. @@ -512,14 +416,21 @@ let canFoldAsLoad = 1 in { def LGRL : UnaryRILPC<"lgrl", 0xC48, aligned_load, GR64>; } -// Load on condition. -let isCodeGenOnly = 1, Uses = [CC] in { - def LOC : CondUnaryRSY<"loc", 0xEBF2, nonvolatile_load, GR32, 4>; - def LOCG : CondUnaryRSY<"locg", 0xEBE2, nonvolatile_load, GR64, 8>; +// Load and zero rightmost byte. 
+let Predicates = [FeatureLoadAndZeroRightmostByte] in { + def LZRF : UnaryRXY<"lzrf", 0xE33B, null_frag, GR32, 4>; + def LZRG : UnaryRXY<"lzrg", 0xE32A, null_frag, GR64, 8>; + def : Pat<(and (i32 (load bdxaddr20only:$src)), 0xffffff00), + (LZRF bdxaddr20only:$src)>; + def : Pat<(and (i64 (load bdxaddr20only:$src)), 0xffffffffffffff00), + (LZRG bdxaddr20only:$src)>; } -let Uses = [CC] in { - def AsmLOC : AsmCondUnaryRSY<"loc", 0xEBF2, GR32, 4>; - def AsmLOCG : AsmCondUnaryRSY<"locg", 0xEBE2, GR64, 8>; + +// Load and trap. +let Predicates = [FeatureLoadAndTrap] in { + def LAT : UnaryRXY<"lat", 0xE39F, null_frag, GR32, 4>; + def LFHAT : UnaryRXY<"lfhat", 0xE3C8, null_frag, GRH32, 4>; + def LGAT : UnaryRXY<"lgat", 0xE385, null_frag, GR64, 8>; } // Register stores. @@ -542,16 +453,6 @@ let SimpleBDXStore = 1 in { def STRL : StoreRILPC<"strl", 0xC4F, aligned_store, GR32>; def STGRL : StoreRILPC<"stgrl", 0xC4B, aligned_store, GR64>; -// Store on condition. -let isCodeGenOnly = 1, Uses = [CC] in { - def STOC : CondStoreRSY<"stoc", 0xEBF3, GR32, 4>; - def STOCG : CondStoreRSY<"stocg", 0xEBE3, GR64, 8>; -} -let Uses = [CC] in { - def AsmSTOC : AsmCondStoreRSY<"stoc", 0xEBF3, GR32, 4>; - def AsmSTOCG : AsmCondStoreRSY<"stocg", 0xEBE3, GR64, 8>; -} - // 8-bit immediate stores to 8-bit fields. defm MVI : StoreSIPair<"mvi", 0x92, 0xEB52, truncstorei8, imm32zx8trunc>; @@ -569,6 +470,82 @@ let mayLoad = 1, mayStore = 1, Defs = [CC] in defm MVST : StringRRE<"mvst", 0xB255, z_stpcpy>; //===----------------------------------------------------------------------===// +// Conditional move instructions +//===----------------------------------------------------------------------===// + +let Predicates = [FeatureLoadStoreOnCond2], Uses = [CC] in { + // Load immediate on condition. Matched via DAG pattern and created + // by the PeepholeOptimizer via FoldImmediate. + let hasSideEffects = 0 in { + // Expands to LOCHI or LOCHHI, depending on the choice of register. 
+ def LOCHIMux : CondBinaryRIEPseudo<GRX32, imm32sx16>; + defm LOCHHI : CondBinaryRIEPair<"lochhi", 0xEC4E, GRH32, imm32sx16>; + defm LOCHI : CondBinaryRIEPair<"lochi", 0xEC42, GR32, imm32sx16>; + defm LOCGHI : CondBinaryRIEPair<"locghi", 0xEC46, GR64, imm64sx16>; + } + + // Move register on condition. Expanded from Select* pseudos and + // created by early if-conversion. + let hasSideEffects = 0, isCommutable = 1 in { + // Expands to LOCR or LOCFHR or a branch-and-move sequence, + // depending on the choice of registers. + def LOCRMux : CondBinaryRRFPseudo<GRX32, GRX32>; + defm LOCFHR : CondBinaryRRFPair<"locfhr", 0xB9E0, GRH32, GRH32>; + } + + // Load on condition. Matched via DAG pattern. + // Expands to LOC or LOCFH, depending on the choice of register. + def LOCMux : CondUnaryRSYPseudo<nonvolatile_load, GRX32, 4>; + defm LOCFH : CondUnaryRSYPair<"locfh", 0xEBE0, nonvolatile_load, GRH32, 4>; + + // Store on condition. Expanded from CondStore* pseudos. + // Expands to STOC or STOCFH, depending on the choice of register. + def STOCMux : CondStoreRSYPseudo<GRX32, 4>; + defm STOCFH : CondStoreRSYPair<"stocfh", 0xEBE1, GRH32, 4>; + + // Define AsmParser extended mnemonics for each general condition-code mask. + foreach V = [ "E", "NE", "H", "NH", "L", "NL", "HE", "NHE", "LE", "NLE", + "Z", "NZ", "P", "NP", "M", "NM", "LH", "NLH", "O", "NO" ] in { + def LOCHIAsm#V : FixedCondBinaryRIE<CV<V>, "lochi", 0xEC42, GR32, + imm32sx16>; + def LOCGHIAsm#V : FixedCondBinaryRIE<CV<V>, "locghi", 0xEC46, GR64, + imm64sx16>; + def LOCHHIAsm#V : FixedCondBinaryRIE<CV<V>, "lochhi", 0xEC4E, GRH32, + imm32sx16>; + def LOCFHRAsm#V : FixedCondBinaryRRF<CV<V>, "locfhr", 0xB9E0, GRH32, GRH32>; + def LOCFHAsm#V : FixedCondUnaryRSY<CV<V>, "locfh", 0xEBE0, GRH32, 4>; + def STOCFHAsm#V : FixedCondStoreRSY<CV<V>, "stocfh", 0xEBE1, GRH32, 4>; + } +} + +let Predicates = [FeatureLoadStoreOnCond], Uses = [CC] in { + // Move register on condition. 
Expanded from Select* pseudos and + // created by early if-conversion. + let hasSideEffects = 0, isCommutable = 1 in { + defm LOCR : CondBinaryRRFPair<"locr", 0xB9F2, GR32, GR32>; + defm LOCGR : CondBinaryRRFPair<"locgr", 0xB9E2, GR64, GR64>; + } + + // Load on condition. Matched via DAG pattern. + defm LOC : CondUnaryRSYPair<"loc", 0xEBF2, nonvolatile_load, GR32, 4>; + defm LOCG : CondUnaryRSYPair<"locg", 0xEBE2, nonvolatile_load, GR64, 8>; + + // Store on condition. Expanded from CondStore* pseudos. + defm STOC : CondStoreRSYPair<"stoc", 0xEBF3, GR32, 4>; + defm STOCG : CondStoreRSYPair<"stocg", 0xEBE3, GR64, 8>; + + // Define AsmParser extended mnemonics for each general condition-code mask. + foreach V = [ "E", "NE", "H", "NH", "L", "NL", "HE", "NHE", "LE", "NLE", + "Z", "NZ", "P", "NP", "M", "NM", "LH", "NLH", "O", "NO" ] in { + def LOCRAsm#V : FixedCondBinaryRRF<CV<V>, "locr", 0xB9F2, GR32, GR32>; + def LOCGRAsm#V : FixedCondBinaryRRF<CV<V>, "locgr", 0xB9E2, GR64, GR64>; + def LOCAsm#V : FixedCondUnaryRSY<CV<V>, "loc", 0xEBF2, GR32, 4>; + def LOCGAsm#V : FixedCondUnaryRSY<CV<V>, "locg", 0xEBE2, GR64, 8>; + def STOCAsm#V : FixedCondStoreRSY<CV<V>, "stoc", 0xEBF3, GR32, 4>; + def STOCGAsm#V : FixedCondStoreRSY<CV<V>, "stocg", 0xEBE3, GR64, 8>; + } +} +//===----------------------------------------------------------------------===// // Sign extensions //===----------------------------------------------------------------------===// // @@ -581,18 +558,18 @@ let mayLoad = 1, mayStore = 1, Defs = [CC] in // 32-bit extensions from registers. let hasSideEffects = 0 in { - def LBR : UnaryRRE<"lb", 0xB926, sext8, GR32, GR32>; - def LHR : UnaryRRE<"lh", 0xB927, sext16, GR32, GR32>; + def LBR : UnaryRRE<"lbr", 0xB926, sext8, GR32, GR32>; + def LHR : UnaryRRE<"lhr", 0xB927, sext16, GR32, GR32>; } // 64-bit extensions from registers. 
let hasSideEffects = 0 in { - def LGBR : UnaryRRE<"lgb", 0xB906, sext8, GR64, GR64>; - def LGHR : UnaryRRE<"lgh", 0xB907, sext16, GR64, GR64>; - def LGFR : UnaryRRE<"lgf", 0xB914, sext32, GR64, GR32>; + def LGBR : UnaryRRE<"lgbr", 0xB906, sext8, GR64, GR64>; + def LGHR : UnaryRRE<"lghr", 0xB907, sext16, GR64, GR64>; + def LGFR : UnaryRRE<"lgfr", 0xB914, sext32, GR64, GR32>; } let Defs = [CC], CCValues = 0xE, CompareZeroCCMask = 0xE in - def LTGFR : UnaryRRE<"ltgf", 0xB912, null_frag, GR64, GR32>; + def LTGFR : UnaryRRE<"ltgfr", 0xB912, null_frag, GR64, GR32>; // Match 32-to-64-bit sign extensions in which the source is already // in a 64-bit register. @@ -632,20 +609,20 @@ let Defs = [CC], CCValues = 0xE, CompareZeroCCMask = 0xE in // 32-bit extensions from registers. let hasSideEffects = 0 in { // Expands to LLCR or RISB[LH]G, depending on the choice of registers. - def LLCRMux : UnaryRRPseudo<"llc", zext8, GRX32, GRX32>, + def LLCRMux : UnaryRRPseudo<"llcr", zext8, GRX32, GRX32>, Requires<[FeatureHighWord]>; - def LLCR : UnaryRRE<"llc", 0xB994, zext8, GR32, GR32>; + def LLCR : UnaryRRE<"llcr", 0xB994, zext8, GR32, GR32>; // Expands to LLHR or RISB[LH]G, depending on the choice of registers. - def LLHRMux : UnaryRRPseudo<"llh", zext16, GRX32, GRX32>, + def LLHRMux : UnaryRRPseudo<"llhr", zext16, GRX32, GRX32>, Requires<[FeatureHighWord]>; - def LLHR : UnaryRRE<"llh", 0xB995, zext16, GR32, GR32>; + def LLHR : UnaryRRE<"llhr", 0xB995, zext16, GR32, GR32>; } // 64-bit extensions from registers. 
let hasSideEffects = 0 in { - def LLGCR : UnaryRRE<"llgc", 0xB984, zext8, GR64, GR64>; - def LLGHR : UnaryRRE<"llgh", 0xB985, zext16, GR64, GR64>; - def LLGFR : UnaryRRE<"llgf", 0xB916, zext32, GR64, GR32>; + def LLGCR : UnaryRRE<"llgcr", 0xB984, zext8, GR64, GR64>; + def LLGHR : UnaryRRE<"llghr", 0xB985, zext16, GR64, GR64>; + def LLGFR : UnaryRRE<"llgfr", 0xB916, zext32, GR64, GR32>; } // Match 32-to-64-bit zero extensions in which the source is already @@ -677,6 +654,27 @@ def LLGF : UnaryRXY<"llgf", 0xE316, azextloadi32, GR64, 4>; def LLGHRL : UnaryRILPC<"llghrl", 0xC46, aligned_azextloadi16, GR64>; def LLGFRL : UnaryRILPC<"llgfrl", 0xC4E, aligned_azextloadi32, GR64>; +// 31-to-64-bit zero extensions. +def LLGTR : UnaryRRE<"llgtr", 0xB917, null_frag, GR64, GR64>; +def LLGT : UnaryRXY<"llgt", 0xE317, null_frag, GR64, 4>; +def : Pat<(and GR64:$src, 0x7fffffff), + (LLGTR GR64:$src)>; +def : Pat<(and (i64 (azextloadi32 bdxaddr20only:$src)), 0x7fffffff), + (LLGT bdxaddr20only:$src)>; + +// Load and zero rightmost byte. +let Predicates = [FeatureLoadAndZeroRightmostByte] in { + def LLZRGF : UnaryRXY<"llzrgf", 0xE33A, null_frag, GR64, 4>; + def : Pat<(and (i64 (azextloadi32 bdxaddr20only:$src)), 0xffffff00), + (LLZRGF bdxaddr20only:$src)>; +} + +// Load and trap. +let Predicates = [FeatureLoadAndTrap] in { + def LLGFAT : UnaryRXY<"llgfat", 0xE39D, null_frag, GR64, 4>; + def LLGTAT : UnaryRXY<"llgtat", 0xE39C, null_frag, GR64, 4>; +} + //===----------------------------------------------------------------------===// // Truncations //===----------------------------------------------------------------------===// @@ -729,8 +727,8 @@ def STMH : StoreMultipleRSY<"stmh", 0xEB26, GRH32>; // Byte-swapping register moves. 
let hasSideEffects = 0 in { - def LRVR : UnaryRRE<"lrv", 0xB91F, bswap, GR32, GR32>; - def LRVGR : UnaryRRE<"lrvg", 0xB90F, bswap, GR64, GR64>; + def LRVR : UnaryRRE<"lrvr", 0xB91F, bswap, GR32, GR32>; + def LRVGR : UnaryRRE<"lrvgr", 0xB90F, bswap, GR64, GR64>; } // Byte-swapping loads. Unlike normal loads, these instructions are @@ -749,26 +747,14 @@ def STRVG : StoreRXY<"strvg", 0xE32F, z_strvg, GR64, 8>; //===----------------------------------------------------------------------===// // Load BDX-style addresses. -let hasSideEffects = 0, isAsCheapAsAMove = 1, isReMaterializable = 1, - DispKey = "la" in { - let DispSize = "12" in - def LA : InstRX<0x41, (outs GR64:$R1), (ins laaddr12pair:$XBD2), - "la\t$R1, $XBD2", - [(set GR64:$R1, laaddr12pair:$XBD2)]>; - let DispSize = "20" in - def LAY : InstRXY<0xE371, (outs GR64:$R1), (ins laaddr20pair:$XBD2), - "lay\t$R1, $XBD2", - [(set GR64:$R1, laaddr20pair:$XBD2)]>; -} +let hasSideEffects = 0, isAsCheapAsAMove = 1, isReMaterializable = 1 in + defm LA : LoadAddressRXPair<"la", 0x41, 0xE371, bitconvert>; // Load a PC-relative address. There's no version of this instruction // with a 16-bit offset, so there's no relaxation. let hasSideEffects = 0, isAsCheapAsAMove = 1, isMoveImm = 1, - isReMaterializable = 1 in { - def LARL : InstRIL<0xC00, (outs GR64:$R1), (ins pcrel32:$I2), - "larl\t$R1, $I2", - [(set GR64:$R1, pcrel32:$I2)]>; -} + isReMaterializable = 1 in + def LARL : LoadAddressRIL<"larl", 0xC00, bitconvert>; // Load the Global Offset Table address. 
This will be lowered into a // larl $R1, _GLOBAL_OFFSET_TABLE_ @@ -782,11 +768,11 @@ def GOT : Alias<6, (outs GR64:$R1), (ins), let Defs = [CC] in { let CCValues = 0xF, CompareZeroCCMask = 0x8 in { - def LPR : UnaryRR <"lp", 0x10, z_iabs, GR32, GR32>; - def LPGR : UnaryRRE<"lpg", 0xB900, z_iabs, GR64, GR64>; + def LPR : UnaryRR <"lpr", 0x10, z_iabs, GR32, GR32>; + def LPGR : UnaryRRE<"lpgr", 0xB900, z_iabs, GR64, GR64>; } let CCValues = 0xE, CompareZeroCCMask = 0xE in - def LPGFR : UnaryRRE<"lpgf", 0xB910, null_frag, GR64, GR32>; + def LPGFR : UnaryRRE<"lpgfr", 0xB910, null_frag, GR64, GR32>; } def : Pat<(z_iabs32 GR32:$src), (LPR GR32:$src)>; def : Pat<(z_iabs64 GR64:$src), (LPGR GR64:$src)>; @@ -795,11 +781,11 @@ defm : SXU<z_iabs64, LPGFR>; let Defs = [CC] in { let CCValues = 0xF, CompareZeroCCMask = 0x8 in { - def LNR : UnaryRR <"ln", 0x11, z_inegabs, GR32, GR32>; - def LNGR : UnaryRRE<"lng", 0xB901, z_inegabs, GR64, GR64>; + def LNR : UnaryRR <"lnr", 0x11, z_inegabs, GR32, GR32>; + def LNGR : UnaryRRE<"lngr", 0xB901, z_inegabs, GR64, GR64>; } let CCValues = 0xE, CompareZeroCCMask = 0xE in - def LNGFR : UnaryRRE<"lngf", 0xB911, null_frag, GR64, GR32>; + def LNGFR : UnaryRRE<"lngfr", 0xB911, null_frag, GR64, GR32>; } def : Pat<(z_inegabs32 GR32:$src), (LNR GR32:$src)>; def : Pat<(z_inegabs64 GR64:$src), (LNGR GR64:$src)>; @@ -808,11 +794,11 @@ defm : SXU<z_inegabs64, LNGFR>; let Defs = [CC] in { let CCValues = 0xF, CompareZeroCCMask = 0x8 in { - def LCR : UnaryRR <"lc", 0x13, ineg, GR32, GR32>; - def LCGR : UnaryRRE<"lcg", 0xB903, ineg, GR64, GR64>; + def LCR : UnaryRR <"lcr", 0x13, ineg, GR32, GR32>; + def LCGR : UnaryRRE<"lcgr", 0xB903, ineg, GR64, GR64>; } let CCValues = 0xE, CompareZeroCCMask = 0xE in - def LCGFR : UnaryRRE<"lcgf", 0xB913, null_frag, GR64, GR32>; + def LCGFR : UnaryRRE<"lcgfr", 0xB913, null_frag, GR64, GR32>; } defm : SXU<ineg, LCGFR>; @@ -880,10 +866,10 @@ def : Pat<(or (zext32 GR32:$src), imm64hf32:$imm), let Defs = [CC], CCValues = 0xF, 
CompareZeroCCMask = 0x8 in { // Addition of a register. let isCommutable = 1 in { - defm AR : BinaryRRAndK<"a", 0x1A, 0xB9F8, add, GR32, GR32>; - defm AGR : BinaryRREAndK<"ag", 0xB908, 0xB9E8, add, GR64, GR64>; + defm AR : BinaryRRAndK<"ar", 0x1A, 0xB9F8, add, GR32, GR32>; + defm AGR : BinaryRREAndK<"agr", 0xB908, 0xB9E8, add, GR64, GR64>; } - def AGFR : BinaryRRE<"agf", 0xB918, null_frag, GR64, GR32>; + def AGFR : BinaryRRE<"agfr", 0xB918, null_frag, GR64, GR32>; // Addition of signed 16-bit immediates. defm AHIMux : BinaryRIAndKPseudo<"ahimux", add, GRX32, imm32sx16>; @@ -914,10 +900,10 @@ defm : SXB<add, GR64, AGFR>; let Defs = [CC] in { // Addition of a register. let isCommutable = 1 in { - defm ALR : BinaryRRAndK<"al", 0x1E, 0xB9FA, addc, GR32, GR32>; - defm ALGR : BinaryRREAndK<"alg", 0xB90A, 0xB9EA, addc, GR64, GR64>; + defm ALR : BinaryRRAndK<"alr", 0x1E, 0xB9FA, addc, GR32, GR32>; + defm ALGR : BinaryRREAndK<"algr", 0xB90A, 0xB9EA, addc, GR64, GR64>; } - def ALGFR : BinaryRRE<"algf", 0xB91A, null_frag, GR64, GR32>; + def ALGFR : BinaryRRE<"algfr", 0xB91A, null_frag, GR64, GR32>; // Addition of signed 16-bit immediates. def ALHSIK : BinaryRIE<"alhsik", 0xECDA, addc, GR32, imm32sx16>, @@ -939,8 +925,8 @@ defm : ZXB<addc, GR64, ALGFR>; // Addition producing and using a carry. let Defs = [CC], Uses = [CC] in { // Addition of a register. - def ALCR : BinaryRRE<"alc", 0xB998, adde, GR32, GR32>; - def ALCGR : BinaryRRE<"alcg", 0xB988, adde, GR64, GR64>; + def ALCR : BinaryRRE<"alcr", 0xB998, adde, GR32, GR32>; + def ALCGR : BinaryRRE<"alcgr", 0xB988, adde, GR64, GR64>; // Addition of memory. def ALC : BinaryRXY<"alc", 0xE398, adde, GR32, load, 4>; @@ -955,9 +941,9 @@ let Defs = [CC], Uses = [CC] in { // add-immediate instruction instead. let Defs = [CC], CCValues = 0xF, CompareZeroCCMask = 0x8 in { // Subtraction of a register. 
- defm SR : BinaryRRAndK<"s", 0x1B, 0xB9F9, sub, GR32, GR32>; - def SGFR : BinaryRRE<"sgf", 0xB919, null_frag, GR64, GR32>; - defm SGR : BinaryRREAndK<"sg", 0xB909, 0xB9E9, sub, GR64, GR64>; + defm SR : BinaryRRAndK<"sr", 0x1B, 0xB9F9, sub, GR32, GR32>; + def SGFR : BinaryRRE<"sgfr", 0xB919, null_frag, GR64, GR32>; + defm SGR : BinaryRREAndK<"sgr", 0xB909, 0xB9E9, sub, GR64, GR64>; // Subtraction of memory. defm SH : BinaryRXPair<"sh", 0x4B, 0xE37B, sub, GR32, asextloadi16, 2>; @@ -970,9 +956,9 @@ defm : SXB<sub, GR64, SGFR>; // Subtraction producing a carry. let Defs = [CC] in { // Subtraction of a register. - defm SLR : BinaryRRAndK<"sl", 0x1F, 0xB9FB, subc, GR32, GR32>; - def SLGFR : BinaryRRE<"slgf", 0xB91B, null_frag, GR64, GR32>; - defm SLGR : BinaryRREAndK<"slg", 0xB90B, 0xB9EB, subc, GR64, GR64>; + defm SLR : BinaryRRAndK<"slr", 0x1F, 0xB9FB, subc, GR32, GR32>; + def SLGFR : BinaryRRE<"slgfr", 0xB91B, null_frag, GR64, GR32>; + defm SLGR : BinaryRREAndK<"slgr", 0xB90B, 0xB9EB, subc, GR64, GR64>; // Subtraction of unsigned 32-bit immediates. These don't match // subc because we prefer addc for constants. @@ -989,8 +975,8 @@ defm : ZXB<subc, GR64, SLGFR>; // Subtraction producing and using a carry. let Defs = [CC], Uses = [CC] in { // Subtraction of a register. - def SLBR : BinaryRRE<"slb", 0xB999, sube, GR32, GR32>; - def SLBGR : BinaryRRE<"slbg", 0xB989, sube, GR64, GR64>; + def SLBR : BinaryRRE<"slbr", 0xB999, sube, GR32, GR32>; + def SLBGR : BinaryRRE<"slbgr", 0xB989, sube, GR64, GR64>; // Subtraction of memory. def SLB : BinaryRXY<"slb", 0xE399, sube, GR32, load, 4>; @@ -1004,8 +990,8 @@ let Defs = [CC], Uses = [CC] in { let Defs = [CC] in { // ANDs of a register. 
let isCommutable = 1, CCValues = 0xC, CompareZeroCCMask = 0x8 in { - defm NR : BinaryRRAndK<"n", 0x14, 0xB9F4, and, GR32, GR32>; - defm NGR : BinaryRREAndK<"ng", 0xB980, 0xB9E4, and, GR64, GR64>; + defm NR : BinaryRRAndK<"nr", 0x14, 0xB9F4, and, GR32, GR32>; + defm NGR : BinaryRREAndK<"ngr", 0xB980, 0xB9E4, and, GR64, GR64>; } let isConvertibleToThreeAddress = 1 in { @@ -1063,8 +1049,8 @@ defm : RMWIByte<and, bdaddr20pair, NIY>; let Defs = [CC] in { // ORs of a register. let isCommutable = 1, CCValues = 0xC, CompareZeroCCMask = 0x8 in { - defm OR : BinaryRRAndK<"o", 0x16, 0xB9F6, or, GR32, GR32>; - defm OGR : BinaryRREAndK<"og", 0xB981, 0xB9E6, or, GR64, GR64>; + defm OR : BinaryRRAndK<"or", 0x16, 0xB9F6, or, GR32, GR32>; + defm OGR : BinaryRREAndK<"ogr", 0xB981, 0xB9E6, or, GR64, GR64>; } // ORs of a 16-bit immediate, leaving other bits unaffected. @@ -1120,8 +1106,8 @@ defm : RMWIByte<or, bdaddr20pair, OIY>; let Defs = [CC] in { // XORs of a register. let isCommutable = 1, CCValues = 0xC, CompareZeroCCMask = 0x8 in { - defm XR : BinaryRRAndK<"x", 0x17, 0xB9F7, xor, GR32, GR32>; - defm XGR : BinaryRREAndK<"xg", 0xB982, 0xB9E7, xor, GR64, GR64>; + defm XR : BinaryRRAndK<"xr", 0x17, 0xB9F7, xor, GR32, GR32>; + defm XGR : BinaryRREAndK<"xgr", 0xB982, 0xB9E7, xor, GR64, GR64>; } // XORs of a 32-bit immediate, leaving other bits unaffected. @@ -1159,10 +1145,10 @@ defm : RMWIByte<xor, bdaddr20pair, XIY>; // Multiplication of a register. let isCommutable = 1 in { - def MSR : BinaryRRE<"ms", 0xB252, mul, GR32, GR32>; - def MSGR : BinaryRRE<"msg", 0xB90C, mul, GR64, GR64>; + def MSR : BinaryRRE<"msr", 0xB252, mul, GR32, GR32>; + def MSGR : BinaryRRE<"msgr", 0xB90C, mul, GR64, GR64>; } -def MSGFR : BinaryRRE<"msgf", 0xB91C, null_frag, GR64, GR32>; +def MSGFR : BinaryRRE<"msgfr", 0xB91C, null_frag, GR64, GR32>; defm : SXB<mul, GR64, MSGFR>; // Multiplication of a signed 16-bit immediate. 
@@ -1180,7 +1166,7 @@ def MSGF : BinaryRXY<"msgf", 0xE31C, mul, GR64, asextloadi32, 4>; def MSG : BinaryRXY<"msg", 0xE30C, mul, GR64, load, 8>; // Multiplication of a register, producing two results. -def MLGR : BinaryRRE<"mlg", 0xB986, z_umul_lohi64, GR128, GR64>; +def MLGR : BinaryRRE<"mlgr", 0xB986, z_umul_lohi64, GR128, GR64>; // Multiplication of memory, producing two results. def MLG : BinaryRXY<"mlg", 0xE386, z_umul_lohi64, GR128, load, 8>; @@ -1189,17 +1175,19 @@ def MLG : BinaryRXY<"mlg", 0xE386, z_umul_lohi64, GR128, load, 8>; // Division and remainder //===----------------------------------------------------------------------===// -// Division and remainder, from registers. -def DSGFR : BinaryRRE<"dsgf", 0xB91D, z_sdivrem32, GR128, GR32>; -def DSGR : BinaryRRE<"dsg", 0xB90D, z_sdivrem64, GR128, GR64>; -def DLR : BinaryRRE<"dl", 0xB997, z_udivrem32, GR128, GR32>; -def DLGR : BinaryRRE<"dlg", 0xB987, z_udivrem64, GR128, GR64>; +let hasSideEffects = 1 in { // Do not speculatively execute. + // Division and remainder, from registers. + def DSGFR : BinaryRRE<"dsgfr", 0xB91D, z_sdivrem32, GR128, GR32>; + def DSGR : BinaryRRE<"dsgr", 0xB90D, z_sdivrem64, GR128, GR64>; + def DLR : BinaryRRE<"dlr", 0xB997, z_udivrem32, GR128, GR32>; + def DLGR : BinaryRRE<"dlgr", 0xB987, z_udivrem64, GR128, GR64>; -// Division and remainder, from memory. -def DSGF : BinaryRXY<"dsgf", 0xE31D, z_sdivrem32, GR128, load, 4>; -def DSG : BinaryRXY<"dsg", 0xE30D, z_sdivrem64, GR128, load, 8>; -def DL : BinaryRXY<"dl", 0xE397, z_udivrem32, GR128, load, 4>; -def DLG : BinaryRXY<"dlg", 0xE387, z_udivrem64, GR128, load, 8>; + // Division and remainder, from memory. 
+ def DSGF : BinaryRXY<"dsgf", 0xE31D, z_sdivrem32, GR128, load, 4>; + def DSG : BinaryRXY<"dsg", 0xE30D, z_sdivrem64, GR128, load, 8>; + def DL : BinaryRXY<"dl", 0xE397, z_udivrem32, GR128, load, 4>; + def DLG : BinaryRXY<"dlg", 0xE387, z_udivrem64, GR128, load, 8>; +} //===----------------------------------------------------------------------===// // Shifts @@ -1274,11 +1262,14 @@ let Defs = [CC] in { // of the unsigned forms do. let Defs = [CC], CCValues = 0xE in { // Comparison with a register. - def CR : CompareRR <"c", 0x19, z_scmp, GR32, GR32>; - def CGFR : CompareRRE<"cgf", 0xB930, null_frag, GR64, GR32>; - def CGR : CompareRRE<"cg", 0xB920, z_scmp, GR64, GR64>; + def CR : CompareRR <"cr", 0x19, z_scmp, GR32, GR32>; + def CGFR : CompareRRE<"cgfr", 0xB930, null_frag, GR64, GR32>; + def CGR : CompareRRE<"cgr", 0xB920, z_scmp, GR64, GR64>; - // Comparison with a signed 16-bit immediate. + // Comparison with a signed 16-bit immediate. CHIMux expands to CHI or CIH, + // depending on the choice of register. + def CHIMux : CompareRIPseudo<z_scmp, GRX32, imm32sx16>, + Requires<[FeatureHighWord]>; def CHI : CompareRI<"chi", 0xA7E, z_scmp, GR32, imm32sx16>; def CGHI : CompareRI<"cghi", 0xA7F, z_scmp, GR64, imm64sx16>; @@ -1317,9 +1308,9 @@ defm : SXB<z_scmp, GR64, CGFR>; // Unsigned comparisons. let Defs = [CC], CCValues = 0xE, IsLogical = 1 in { // Comparison with a register. - def CLR : CompareRR <"cl", 0x15, z_ucmp, GR32, GR32>; - def CLGFR : CompareRRE<"clgf", 0xB931, null_frag, GR64, GR32>; - def CLGR : CompareRRE<"clg", 0xB921, z_ucmp, GR64, GR64>; + def CLR : CompareRR <"clr", 0x15, z_ucmp, GR32, GR32>; + def CLGFR : CompareRRE<"clgfr", 0xB931, null_frag, GR64, GR32>; + def CLGR : CompareRRE<"clgr", 0xB921, z_ucmp, GR64, GR64>; // Comparison with an unsigned 32-bit immediate. CLFIMux expands to CLFI // or CLIH, depending on the choice of register. 
@@ -1391,12 +1382,21 @@ def TML : InstAlias<"tml\t$R, $I", (TMLL GR32:$R, imm32ll16:$I), 0>; def TMH : InstAlias<"tmh\t$R, $I", (TMLH GR32:$R, imm32lh16:$I), 0>; //===----------------------------------------------------------------------===// -// Prefetch +// Prefetch and execution hint //===----------------------------------------------------------------------===// def PFD : PrefetchRXY<"pfd", 0xE336, z_prefetch>; def PFDRL : PrefetchRILPC<"pfdrl", 0xC62, z_prefetch>; +let Predicates = [FeatureExecutionHint] in { + // Branch Prediction Preload + def BPP : BranchPreloadSMI<"bpp", 0xC7>; + def BPRP : BranchPreloadMII<"bprp", 0xC5>; + + // Next Instruction Access Intent + def NIAI : SideEffectBinaryIE<"niai", 0xB2FA, imm32zx4, imm32zx4>; +} + //===----------------------------------------------------------------------===// // Atomic operations //===----------------------------------------------------------------------===// @@ -1407,7 +1407,7 @@ let hasSideEffects = 1 in def Serialize : Alias<2, (outs), (ins), [(z_serialize)]>; // A pseudo instruction that serves as a compiler barrier. -let hasSideEffects = 1 in +let hasSideEffects = 1, hasNoSchedulingInfo = 1 in def MemBarrier : Pseudo<(outs), (ins), [(z_membarrier)]>; let Predicates = [FeatureInterlockedAccess1], Defs = [CC] in { @@ -1543,52 +1543,131 @@ def ATOMIC_CMP_SWAPW let mayLoad = 1; let mayStore = 1; let usesCustomInserter = 1; + let hasNoSchedulingInfo = 1; } +// Test and set. +let mayLoad = 1, Defs = [CC] in + def TS : StoreInherentS<"ts", 0x9300, null_frag, 1>; + +// Compare and swap. let Defs = [CC] in { defm CS : CmpSwapRSPair<"cs", 0xBA, 0xEB14, atomic_cmp_swap_32, GR32>; def CSG : CmpSwapRSY<"csg", 0xEB30, atomic_cmp_swap_64, GR64>; } +// Compare double and swap. +let Defs = [CC] in { + defm CDS : CmpSwapRSPair<"cds", 0xBB, 0xEB31, null_frag, GR128>; + def CDSG : CmpSwapRSY<"cdsg", 0xEB3E, null_frag, GR128>; +} + +// Compare and swap and store. 
+let Uses = [R0L, R1D], Defs = [CC], mayStore = 1, mayLoad = 1 in + def CSST : SideEffectTernarySSF<"csst", 0xC82, GR64>; + +// Perform locked operation. +let Uses = [R0L, R1D], Defs = [CC], mayStore = 1, mayLoad = 1 in + def PLO : SideEffectQuaternarySSe<"plo", 0xEE, GR64>; + +// Load/store pair from/to quadword. +def LPQ : UnaryRXY<"lpq", 0xE38F, null_frag, GR128, 16>; +def STPQ : StoreRXY<"stpq", 0xE38E, null_frag, GR128, 16>; + +// Load pair disjoint. +let Predicates = [FeatureInterlockedAccess1], Defs = [CC] in { + def LPD : BinarySSF<"lpd", 0xC84, GR128>; + def LPDG : BinarySSF<"lpdg", 0xC85, GR128>; +} + +//===----------------------------------------------------------------------===// +// Access registers +//===----------------------------------------------------------------------===// + +// Read a 32-bit access register into a GR32. As with all GR32 operations, +// the upper 32 bits of the enclosing GR64 remain unchanged, which is useful +// when a 64-bit address is stored in a pair of access registers. +def EAR : UnaryRRE<"ear", 0xB24F, null_frag, GR32, AR32>; + +// Set access register. +def SAR : UnaryRRE<"sar", 0xB24E, null_frag, AR32, GR32>; + +// Copy access register. +def CPYA : UnaryRRE<"cpya", 0xB24D, null_frag, AR32, AR32>; + +// Load address extended. +defm LAE : LoadAddressRXPair<"lae", 0x51, 0xE375, null_frag>; + +// Load access multiple. +defm LAM : LoadMultipleRSPair<"lam", 0x9A, 0xEB9A, AR32>; + +// Store access multiple. +defm STAM : StoreMultipleRSPair<"stam", 0x9B, 0xEB9B, AR32>; + +//===----------------------------------------------------------------------===// +// Program mask and addressing mode +//===----------------------------------------------------------------------===// + +// Extract CC and program mask into a register. CC ends up in bits 29 and 28. +let Uses = [CC] in + def IPM : InherentRRE<"ipm", 0xB222, GR32, z_ipm>; + +// Set CC and program mask from a register. 
+let hasSideEffects = 1, Defs = [CC] in + def SPM : SideEffectUnaryRR<"spm", 0x04, GR32>; + +// Branch and link - like BAS, but also extracts CC and program mask. +let isCall = 1, Uses = [CC], Defs = [CC] in { + def BAL : CallRX<"bal", 0x45>; + def BALR : CallRR<"balr", 0x05>; +} + +// Test addressing mode. +let Defs = [CC] in + def TAM : SideEffectInherentE<"tam", 0x010B>; + +// Set addressing mode. +let hasSideEffects = 1 in { + def SAM24 : SideEffectInherentE<"sam24", 0x010C>; + def SAM31 : SideEffectInherentE<"sam31", 0x010D>; + def SAM64 : SideEffectInherentE<"sam64", 0x010E>; +} + +// Branch and set mode. Not really a call, but also sets an output register. +let isBranch = 1, isTerminator = 1, isBarrier = 1 in + def BSM : CallRR<"bsm", 0x0B>; + +// Branch and save and set mode. +let isCall = 1, Defs = [CC] in + def BASSM : CallRR<"bassm", 0x0C>; + //===----------------------------------------------------------------------===// // Transactional execution //===----------------------------------------------------------------------===// -let Predicates = [FeatureTransactionalExecution] in { +let hasSideEffects = 1, Predicates = [FeatureTransactionalExecution] in { // Transaction Begin - let hasSideEffects = 1, mayStore = 1, - usesCustomInserter = 1, Defs = [CC] in { - def TBEGIN : InstSIL<0xE560, - (outs), (ins bdaddr12only:$BD1, imm32zx16:$I2), - "tbegin\t$BD1, $I2", - [(z_tbegin bdaddr12only:$BD1, imm32zx16:$I2)]>; - def TBEGIN_nofloat : Pseudo<(outs), (ins bdaddr12only:$BD1, imm32zx16:$I2), - [(z_tbegin_nofloat bdaddr12only:$BD1, - imm32zx16:$I2)]>; - def TBEGINC : InstSIL<0xE561, - (outs), (ins bdaddr12only:$BD1, imm32zx16:$I2), - "tbeginc\t$BD1, $I2", - [(int_s390_tbeginc bdaddr12only:$BD1, - imm32zx16:$I2)]>; + let mayStore = 1, usesCustomInserter = 1, Defs = [CC] in { + def TBEGIN : SideEffectBinarySIL<"tbegin", 0xE560, z_tbegin, imm32zx16>; + def TBEGIN_nofloat : SideEffectBinarySILPseudo<z_tbegin_nofloat, imm32zx16>; + + def TBEGINC : 
SideEffectBinarySIL<"tbeginc", 0xE561, + int_s390_tbeginc, imm32zx16>; } // Transaction End - let hasSideEffects = 1, Defs = [CC], BD2 = 0 in - def TEND : InstS<0xB2F8, (outs), (ins), "tend", [(z_tend)]>; + let Defs = [CC] in + def TEND : SideEffectInherentS<"tend", 0xB2F8, z_tend>; // Transaction Abort - let hasSideEffects = 1, isTerminator = 1, isBarrier = 1 in - def TABORT : InstS<0xB2FC, (outs), (ins bdaddr12only:$BD2), - "tabort\t$BD2", - [(int_s390_tabort bdaddr12only:$BD2)]>; + let isTerminator = 1, isBarrier = 1 in + def TABORT : SideEffectAddressS<"tabort", 0xB2FC, int_s390_tabort>; // Nontransactional Store - let hasSideEffects = 1 in - def NTSTG : StoreRXY<"ntstg", 0xE325, int_s390_ntstg, GR64, 8>; + def NTSTG : StoreRXY<"ntstg", 0xE325, int_s390_ntstg, GR64, 8>; // Extract Transaction Nesting Depth - let hasSideEffects = 1 in - def ETND : InherentRRE<"etnd", 0xB2EC, GR32, (int_s390_etnd)>; + def ETND : InherentRRE<"etnd", 0xB2EC, GR32, int_s390_etnd>; } //===----------------------------------------------------------------------===// @@ -1596,9 +1675,8 @@ let Predicates = [FeatureTransactionalExecution] in { //===----------------------------------------------------------------------===// let Predicates = [FeatureProcessorAssist] in { - let hasSideEffects = 1, R4 = 0 in - def PPA : InstRRF<0xB2E8, (outs), (ins GR64:$R1, GR64:$R2, imm32zx4:$R3), - "ppa\t$R1, $R2, $R3", []>; + let hasSideEffects = 1 in + def PPA : SideEffectTernaryRRFc<"ppa", 0xB2E8, GR64, GR64, imm32zx4>; def : Pat<(int_s390_ppa_txassist GR32:$src), (PPA (INSERT_SUBREG (i64 (IMPLICIT_DEF)), GR32:$src, subreg_l32), 0, 1)>; @@ -1608,33 +1686,18 @@ let Predicates = [FeatureProcessorAssist] in { // Miscellaneous Instructions. //===----------------------------------------------------------------------===// -// Extract CC into bits 29 and 28 of a register. -let Uses = [CC] in - def IPM : InherentRRE<"ipm", 0xB222, GR32, (z_ipm)>; - -// Read a 32-bit access register into a GR32. 
As with all GR32 operations, -// the upper 32 bits of the enclosing GR64 remain unchanged, which is useful -// when a 64-bit address is stored in a pair of access registers. -def EAR : InstRRE<0xB24F, (outs GR32:$R1), (ins access_reg:$R2), - "ear\t$R1, $R2", - [(set GR32:$R1, (z_extract_access access_reg:$R2))]>; - // Find leftmost one, AKA count leading zeros. The instruction actually // returns a pair of GR64s, the first giving the number of leading zeros // and the second giving a copy of the source with the leftmost one bit // cleared. We only use the first result here. -let Defs = [CC] in { - def FLOGR : UnaryRRE<"flog", 0xB983, null_frag, GR128, GR64>; -} +let Defs = [CC] in + def FLOGR : UnaryRRE<"flogr", 0xB983, null_frag, GR128, GR64>; def : Pat<(ctlz GR64:$src), (EXTRACT_SUBREG (FLOGR GR64:$src), subreg_h64)>; // Population count. Counts bits set per byte. -let Predicates = [FeaturePopulationCount], Defs = [CC] in { - def POPCNT : InstRRE<0xB9E1, (outs GR64:$R1), (ins GR64:$R2), - "popcnt\t$R1, $R2", - [(set GR64:$R1, (z_popcnt GR64:$R2))]>; -} +let Predicates = [FeaturePopulationCount], Defs = [CC] in + def POPCNT : UnaryRRE<"popcnt", 0xB9E1, z_popcnt, GR64, GR64>; // Use subregs to populate the "don't care" bits in a 32-bit to 64-bit anyext. 
def : Pat<(i64 (anyext GR32:$src)), @@ -1651,35 +1714,137 @@ let usesCustomInserter = 1 in { let mayLoad = 1, Defs = [CC] in defm SRST : StringRRE<"srst", 0xb25e, z_search_string>; -// Other instructions for inline assembly -let hasSideEffects = 1, Defs = [CC], isCall = 1 in - def SVC : InstI<0x0A, (outs), (ins imm32zx8:$I1), - "svc\t$I1", - []>; -let hasSideEffects = 1, Defs = [CC], mayStore = 1 in - def STCK : InstS<0xB205, (outs), (ins bdaddr12only:$BD2), - "stck\t$BD2", - []>; -let hasSideEffects = 1, Defs = [CC], mayStore = 1 in - def STCKF : InstS<0xB27C, (outs), (ins bdaddr12only:$BD2), - "stckf\t$BD2", - []>; -let hasSideEffects = 1, Defs = [CC], mayStore = 1 in - def STCKE : InstS<0xB278, (outs), (ins bdaddr12only:$BD2), - "stcke\t$BD2", - []>; -let hasSideEffects = 1, Defs = [CC], mayStore = 1 in - def STFLE : InstS<0xB2B0, (outs), (ins bdaddr12only:$BD2), - "stfle\t$BD2", - []>; +// Supervisor call. +let hasSideEffects = 1, isCall = 1, Defs = [CC] in + def SVC : SideEffectUnaryI<"svc", 0x0A, imm32zx8>; + +// Store clock. +let hasSideEffects = 1, Defs = [CC] in { + def STCK : StoreInherentS<"stck", 0xB205, null_frag, 8>; + def STCKF : StoreInherentS<"stckf", 0xB27C, null_frag, 8>; + def STCKE : StoreInherentS<"stcke", 0xB278, null_frag, 16>; +} + +// Store facility list. +let hasSideEffects = 1, Uses = [R0D], Defs = [R0D, CC] in + def STFLE : StoreInherentS<"stfle", 0xB2B0, null_frag, 0>; + +// Extract CPU time. +let Defs = [R0D, R1D], hasSideEffects = 1, mayLoad = 1 in + def ECTG : SideEffectTernarySSF<"ectg", 0xC81, GR64>; +// Execute. let hasSideEffects = 1 in { - def EX : InstRX<0x44, (outs), (ins GR64:$R1, bdxaddr12only:$XBD2), - "ex\t$R1, $XBD2", []>; - def EXRL : InstRIL<0xC60, (outs), (ins GR64:$R1, pcrel32:$I2), - "exrl\t$R1, $I2", []>; + def EX : SideEffectBinaryRX<"ex", 0x44, GR64>; + def EXRL : SideEffectBinaryRILPC<"exrl", 0xC60, GR64>; } +// Program return. 
+let hasSideEffects = 1, Defs = [CC] in + def PR : SideEffectInherentE<"pr", 0x0101>; + +// Move with key. +let mayLoad = 1, mayStore = 1, Defs = [CC] in + def MVCK : MemoryBinarySSd<"mvck", 0xD9, GR64>; + +// Store real address. +def STRAG : StoreSSE<"strag", 0xE502>; + +//===----------------------------------------------------------------------===// +// .insn directive instructions +//===----------------------------------------------------------------------===// + +let isCodeGenOnly = 1 in { + def InsnE : DirectiveInsnE<(outs), (ins imm64zx16:$enc), ".insn e,$enc", []>; + def InsnRI : DirectiveInsnRI<(outs), (ins imm64zx32:$enc, AnyReg:$R1, + imm32sx16:$I2), + ".insn ri,$enc,$R1,$I2", []>; + def InsnRIE : DirectiveInsnRIE<(outs), (ins imm64zx48:$enc, AnyReg:$R1, + AnyReg:$R3, brtarget16:$I2), + ".insn rie,$enc,$R1,$R3,$I2", []>; + def InsnRIL : DirectiveInsnRIL<(outs), (ins imm64zx48:$enc, AnyReg:$R1, + brtarget32:$I2), + ".insn ril,$enc,$R1,$I2", []>; + def InsnRILU : DirectiveInsnRIL<(outs), (ins imm64zx48:$enc, AnyReg:$R1, + uimm32:$I2), + ".insn rilu,$enc,$R1,$I2", []>; + def InsnRIS : DirectiveInsnRIS<(outs), + (ins imm64zx48:$enc, AnyReg:$R1, + imm32sx8:$I2, imm32zx4:$M3, + bdaddr12only:$BD4), + ".insn ris,$enc,$R1,$I2,$M3,$BD4", []>; + def InsnRR : DirectiveInsnRR<(outs), + (ins imm64zx16:$enc, AnyReg:$R1, AnyReg:$R2), + ".insn rr,$enc,$R1,$R2", []>; + def InsnRRE : DirectiveInsnRRE<(outs), (ins imm64zx32:$enc, + AnyReg:$R1, AnyReg:$R2), + ".insn rre,$enc,$R1,$R2", []>; + def InsnRRF : DirectiveInsnRRF<(outs), + (ins imm64zx32:$enc, AnyReg:$R1, AnyReg:$R2, + AnyReg:$R3, imm32zx4:$M4), + ".insn rrf,$enc,$R1,$R2,$R3,$M4", []>; + def InsnRRS : DirectiveInsnRRS<(outs), + (ins imm64zx48:$enc, AnyReg:$R1, + AnyReg:$R2, imm32zx4:$M3, + bdaddr12only:$BD4), + ".insn rrs,$enc,$R1,$R2,$M3,$BD4", []>; + def InsnRS : DirectiveInsnRS<(outs), + (ins imm64zx32:$enc, AnyReg:$R1, + AnyReg:$R3, bdaddr12only:$BD2), + ".insn rs,$enc,$R1,$R3,$BD2", []>; + def InsnRSE : 
DirectiveInsnRSE<(outs), + (ins imm64zx48:$enc, AnyReg:$R1, + AnyReg:$R3, bdaddr12only:$BD2), + ".insn rse,$enc,$R1,$R3,$BD2", []>; + def InsnRSI : DirectiveInsnRSI<(outs), + (ins imm64zx48:$enc, AnyReg:$R1, + AnyReg:$R3, brtarget16:$RI2), + ".insn rsi,$enc,$R1,$R3,$RI2", []>; + def InsnRSY : DirectiveInsnRSY<(outs), + (ins imm64zx48:$enc, AnyReg:$R1, + AnyReg:$R3, bdaddr20only:$BD2), + ".insn rsy,$enc,$R1,$R3,$BD2", []>; + def InsnRX : DirectiveInsnRX<(outs), (ins imm64zx32:$enc, AnyReg:$R1, + bdxaddr12only:$XBD2), + ".insn rx,$enc,$R1,$XBD2", []>; + def InsnRXE : DirectiveInsnRXE<(outs), (ins imm64zx48:$enc, AnyReg:$R1, + bdxaddr12only:$XBD2), + ".insn rxe,$enc,$R1,$XBD2", []>; + def InsnRXF : DirectiveInsnRXF<(outs), + (ins imm64zx48:$enc, AnyReg:$R1, + AnyReg:$R3, bdxaddr12only:$XBD2), + ".insn rxf,$enc,$R1,$R3,$XBD2", []>; + def InsnRXY : DirectiveInsnRXY<(outs), (ins imm64zx48:$enc, AnyReg:$R1, + bdxaddr20only:$XBD2), + ".insn rxy,$enc,$R1,$XBD2", []>; + def InsnS : DirectiveInsnS<(outs), + (ins imm64zx32:$enc, bdaddr12only:$BD2), + ".insn s,$enc,$BD2", []>; + def InsnSI : DirectiveInsnSI<(outs), + (ins imm64zx32:$enc, bdaddr12only:$BD1, + imm32sx8:$I2), + ".insn si,$enc,$BD1,$I2", []>; + def InsnSIY : DirectiveInsnSIY<(outs), + (ins imm64zx48:$enc, + bdaddr20only:$BD1, imm32zx8:$I2), + ".insn siy,$enc,$BD1,$I2", []>; + def InsnSIL : DirectiveInsnSIL<(outs), + (ins imm64zx48:$enc, bdaddr12only:$BD1, + imm32zx16:$I2), + ".insn sil,$enc,$BD1,$I2", []>; + def InsnSS : DirectiveInsnSS<(outs), + (ins imm64zx48:$enc, bdraddr12only:$RBD1, + bdaddr12only:$BD2, AnyReg:$R3), + ".insn ss,$enc,$RBD1,$BD2,$R3", []>; + def InsnSSE : DirectiveInsnSSE<(outs), + (ins imm64zx48:$enc, + bdaddr12only:$BD1,bdaddr12only:$BD2), + ".insn sse,$enc,$BD1,$BD2", []>; + def InsnSSF : DirectiveInsnSSF<(outs), + (ins imm64zx48:$enc, bdaddr12only:$BD1, + bdaddr12only:$BD2, AnyReg:$R3), + ".insn ssf,$enc,$BD1,$BD2,$R3", []>; +} 
//===----------------------------------------------------------------------===// // Peepholes.
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZInstrVector.td b/contrib/llvm/lib/Target/SystemZ/SystemZInstrVector.td
index c101e43..738ea7a 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZInstrVector.td
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZInstrVector.td
@@ -18,12 +18,14 @@ let Predicates = [FeatureVector] in { def VLR64 : UnaryAliasVRR<null_frag, v64db, v64db>; // Load GR from VR element. + def VLGV : BinaryVRScGeneric<"vlgv", 0xE721>; def VLGVB : BinaryVRSc<"vlgvb", 0xE721, null_frag, v128b, 0>; def VLGVH : BinaryVRSc<"vlgvh", 0xE721, null_frag, v128h, 1>; def VLGVF : BinaryVRSc<"vlgvf", 0xE721, null_frag, v128f, 2>; def VLGVG : BinaryVRSc<"vlgvg", 0xE721, z_vector_extract, v128g, 3>; // Load VR element from GR. + def VLVG : TernaryVRSbGeneric<"vlvg", 0xE722>; def VLVGB : TernaryVRSb<"vlvgb", 0xE722, z_vector_insert, v128b, v128b, GR32, 0>; def VLVGH : TernaryVRSb<"vlvgh", 0xE722, z_vector_insert, @@ -60,6 +62,7 @@ let Predicates = [FeatureVector] in { def VGBM : UnaryVRIa<"vgbm", 0xE744, z_byte_mask, v128b, imm32zx16>; // Generate mask. + def VGM : BinaryVRIbGeneric<"vgm", 0xE746>; def VGMB : BinaryVRIb<"vgmb", 0xE746, z_rotate_mask, v128b, 0>; def VGMH : BinaryVRIb<"vgmh", 0xE746, z_rotate_mask, v128h, 1>; def VGMF : BinaryVRIb<"vgmf", 0xE746, z_rotate_mask, v128f, 2>; @@ -85,6 +88,7 @@ let Predicates = [FeatureVector] in { } // Replicate immediate.
+ def VREPI : UnaryVRIaGeneric<"vrepi", 0xE745, imm32sx16>; def VREPIB : UnaryVRIa<"vrepib", 0xE745, z_replicate, v128b, imm32sx16, 0>; def VREPIH : UnaryVRIa<"vrepih", 0xE745, z_replicate, v128h, imm32sx16, 1>; def VREPIF : UnaryVRIa<"vrepif", 0xE745, z_replicate, v128f, imm32sx16, 2>; @@ -119,6 +123,7 @@ let Predicates = [FeatureVector] in { def VLM : LoadMultipleVRSa<"vlm", 0xE736>; // Load and replicate + def VLREP : UnaryVRXGeneric<"vlrep", 0xE705>; def VLREPB : UnaryVRX<"vlrepb", 0xE705, z_replicate_loadi8, v128b, 1, 0>; def VLREPH : UnaryVRX<"vlreph", 0xE705, z_replicate_loadi16, v128h, 2, 1>; def VLREPF : UnaryVRX<"vlrepf", 0xE705, z_replicate_loadi32, v128f, 4, 2>; @@ -136,6 +141,7 @@ let Predicates = [FeatureVector] in { def VL64 : UnaryAliasVRX<load, v64db, bdxaddr12pair>; // Load logical element and zero. + def VLLEZ : UnaryVRXGeneric<"vllez", 0xE704>; def VLLEZB : UnaryVRX<"vllezb", 0xE704, z_vllezi8, v128b, 1, 0>; def VLLEZH : UnaryVRX<"vllezh", 0xE704, z_vllezi16, v128h, 2, 1>; def VLLEZF : UnaryVRX<"vllezf", 0xE704, z_vllezi32, v128f, 4, 2>; @@ -223,6 +229,7 @@ let Predicates = [FeatureVector] in { let Predicates = [FeatureVector] in { // Merge high. + def VMRH: BinaryVRRcGeneric<"vmrh", 0xE761>; def VMRHB : BinaryVRRc<"vmrhb", 0xE761, z_merge_high, v128b, v128b, 0>; def VMRHH : BinaryVRRc<"vmrhh", 0xE761, z_merge_high, v128h, v128h, 1>; def VMRHF : BinaryVRRc<"vmrhf", 0xE761, z_merge_high, v128f, v128f, 2>; @@ -231,6 +238,7 @@ let Predicates = [FeatureVector] in { def : BinaryRRWithType<VMRHG, VR128, z_merge_high, v2f64>; // Merge low. + def VMRL: BinaryVRRcGeneric<"vmrl", 0xE760>; def VMRLB : BinaryVRRc<"vmrlb", 0xE760, z_merge_low, v128b, v128b, 0>; def VMRLH : BinaryVRRc<"vmrlh", 0xE760, z_merge_low, v128h, v128h, 1>; def VMRLF : BinaryVRRc<"vmrlf", 0xE760, z_merge_low, v128f, v128f, 2>; @@ -245,6 +253,7 @@ let Predicates = [FeatureVector] in { def VPDI : TernaryVRRc<"vpdi", 0xE784, z_permute_dwords, v128g, v128g>; // Replicate. 
+ def VREP: BinaryVRIcGeneric<"vrep", 0xE74D>; def VREPB : BinaryVRIc<"vrepb", 0xE74D, z_splat, v128b, v128b, 0>; def VREPH : BinaryVRIc<"vreph", 0xE74D, z_splat, v128h, v128h, 1>; def VREPF : BinaryVRIc<"vrepf", 0xE74D, z_splat, v128f, v128f, 2>; @@ -264,11 +273,13 @@ let Predicates = [FeatureVector] in { let Predicates = [FeatureVector] in { // Pack + def VPK : BinaryVRRcGeneric<"vpk", 0xE794>; def VPKH : BinaryVRRc<"vpkh", 0xE794, z_pack, v128b, v128h, 1>; def VPKF : BinaryVRRc<"vpkf", 0xE794, z_pack, v128h, v128f, 2>; def VPKG : BinaryVRRc<"vpkg", 0xE794, z_pack, v128f, v128g, 3>; // Pack saturate. + def VPKS : BinaryVRRbSPairGeneric<"vpks", 0xE797>; defm VPKSH : BinaryVRRbSPair<"vpksh", 0xE797, int_s390_vpksh, z_packs_cc, v128b, v128h, 1>; defm VPKSF : BinaryVRRbSPair<"vpksf", 0xE797, int_s390_vpksf, z_packs_cc, @@ -277,6 +288,7 @@ let Predicates = [FeatureVector] in { v128f, v128g, 3>; // Pack saturate logical. + def VPKLS : BinaryVRRbSPairGeneric<"vpkls", 0xE795>; defm VPKLSH : BinaryVRRbSPair<"vpklsh", 0xE795, int_s390_vpklsh, z_packls_cc, v128b, v128h, 1>; defm VPKLSF : BinaryVRRbSPair<"vpklsf", 0xE795, int_s390_vpklsf, z_packls_cc, @@ -285,6 +297,7 @@ let Predicates = [FeatureVector] in { v128f, v128g, 3>; // Sign-extend to doubleword. + def VSEG : UnaryVRRaGeneric<"vseg", 0xE75F>; def VSEGB : UnaryVRRa<"vsegb", 0xE75F, z_vsei8, v128g, v128g, 0>; def VSEGH : UnaryVRRa<"vsegh", 0xE75F, z_vsei16, v128g, v128g, 1>; def VSEGF : UnaryVRRa<"vsegf", 0xE75F, z_vsei32, v128g, v128g, 2>; @@ -293,21 +306,25 @@ let Predicates = [FeatureVector] in { def : Pat<(z_vsei32_by_parts (v4i32 VR128:$src)), (VSEGF VR128:$src)>; // Unpack high. + def VUPH : UnaryVRRaGeneric<"vuph", 0xE7D7>; def VUPHB : UnaryVRRa<"vuphb", 0xE7D7, z_unpack_high, v128h, v128b, 0>; def VUPHH : UnaryVRRa<"vuphh", 0xE7D7, z_unpack_high, v128f, v128h, 1>; def VUPHF : UnaryVRRa<"vuphf", 0xE7D7, z_unpack_high, v128g, v128f, 2>; // Unpack logical high. 
+ def VUPLH : UnaryVRRaGeneric<"vuplh", 0xE7D5>; def VUPLHB : UnaryVRRa<"vuplhb", 0xE7D5, z_unpackl_high, v128h, v128b, 0>; def VUPLHH : UnaryVRRa<"vuplhh", 0xE7D5, z_unpackl_high, v128f, v128h, 1>; def VUPLHF : UnaryVRRa<"vuplhf", 0xE7D5, z_unpackl_high, v128g, v128f, 2>; // Unpack low. + def VUPL : UnaryVRRaGeneric<"vupl", 0xE7D6>; def VUPLB : UnaryVRRa<"vuplb", 0xE7D6, z_unpack_low, v128h, v128b, 0>; def VUPLHW : UnaryVRRa<"vuplhw", 0xE7D6, z_unpack_low, v128f, v128h, 1>; def VUPLF : UnaryVRRa<"vuplf", 0xE7D6, z_unpack_low, v128g, v128f, 2>; // Unpack logical low. + def VUPLL : UnaryVRRaGeneric<"vupll", 0xE7D4>; def VUPLLB : UnaryVRRa<"vupllb", 0xE7D4, z_unpackl_low, v128h, v128b, 0>; def VUPLLH : UnaryVRRa<"vupllh", 0xE7D4, z_unpackl_low, v128f, v128h, 1>; def VUPLLF : UnaryVRRa<"vupllf", 0xE7D4, z_unpackl_low, v128g, v128f, 2>; @@ -343,6 +360,7 @@ defm : GenericVectorOps<v2f64, v2i64>; let Predicates = [FeatureVector] in { // Add. + def VA : BinaryVRRcGeneric<"va", 0xE7F3>; def VAB : BinaryVRRc<"vab", 0xE7F3, add, v128b, v128b, 0>; def VAH : BinaryVRRc<"vah", 0xE7F3, add, v128h, v128h, 1>; def VAF : BinaryVRRc<"vaf", 0xE7F3, add, v128f, v128f, 2>; @@ -350,6 +368,7 @@ let Predicates = [FeatureVector] in { def VAQ : BinaryVRRc<"vaq", 0xE7F3, int_s390_vaq, v128q, v128q, 4>; // Add compute carry. + def VACC : BinaryVRRcGeneric<"vacc", 0xE7F1>; def VACCB : BinaryVRRc<"vaccb", 0xE7F1, int_s390_vaccb, v128b, v128b, 0>; def VACCH : BinaryVRRc<"vacch", 0xE7F1, int_s390_vacch, v128h, v128h, 1>; def VACCF : BinaryVRRc<"vaccf", 0xE7F1, int_s390_vaccf, v128f, v128f, 2>; @@ -357,9 +376,11 @@ let Predicates = [FeatureVector] in { def VACCQ : BinaryVRRc<"vaccq", 0xE7F1, int_s390_vaccq, v128q, v128q, 4>; // Add with carry. + def VAC : TernaryVRRdGeneric<"vac", 0xE7BB>; def VACQ : TernaryVRRd<"vacq", 0xE7BB, int_s390_vacq, v128q, v128q, 4>; // Add with carry compute carry. 
+ def VACCC : TernaryVRRdGeneric<"vaccc", 0xE7B9>; def VACCCQ : TernaryVRRd<"vacccq", 0xE7B9, int_s390_vacccq, v128q, v128q, 4>; // And. @@ -369,12 +390,14 @@ let Predicates = [FeatureVector] in { def VNC : BinaryVRRc<"vnc", 0xE769, null_frag, v128any, v128any>; // Average. + def VAVG : BinaryVRRcGeneric<"vavg", 0xE7F2>; def VAVGB : BinaryVRRc<"vavgb", 0xE7F2, int_s390_vavgb, v128b, v128b, 0>; def VAVGH : BinaryVRRc<"vavgh", 0xE7F2, int_s390_vavgh, v128h, v128h, 1>; def VAVGF : BinaryVRRc<"vavgf", 0xE7F2, int_s390_vavgf, v128f, v128f, 2>; def VAVGG : BinaryVRRc<"vavgg", 0xE7F2, int_s390_vavgg, v128g, v128g, 3>; // Average logical. + def VAVGL : BinaryVRRcGeneric<"vavgl", 0xE7F0>; def VAVGLB : BinaryVRRc<"vavglb", 0xE7F0, int_s390_vavglb, v128b, v128b, 0>; def VAVGLH : BinaryVRRc<"vavglh", 0xE7F0, int_s390_vavglh, v128h, v128h, 1>; def VAVGLF : BinaryVRRc<"vavglf", 0xE7F0, int_s390_vavglf, v128f, v128f, 2>; @@ -384,12 +407,14 @@ let Predicates = [FeatureVector] in { def VCKSM : BinaryVRRc<"vcksm", 0xE766, int_s390_vcksm, v128f, v128f>; // Count leading zeros. + def VCLZ : UnaryVRRaGeneric<"vclz", 0xE753>; def VCLZB : UnaryVRRa<"vclzb", 0xE753, ctlz, v128b, v128b, 0>; def VCLZH : UnaryVRRa<"vclzh", 0xE753, ctlz, v128h, v128h, 1>; def VCLZF : UnaryVRRa<"vclzf", 0xE753, ctlz, v128f, v128f, 2>; def VCLZG : UnaryVRRa<"vclzg", 0xE753, ctlz, v128g, v128g, 3>; // Count trailing zeros. + def VCTZ : UnaryVRRaGeneric<"vctz", 0xE752>; def VCTZB : UnaryVRRa<"vctzb", 0xE752, cttz, v128b, v128b, 0>; def VCTZH : UnaryVRRa<"vctzh", 0xE752, cttz, v128h, v128h, 1>; def VCTZF : UnaryVRRa<"vctzf", 0xE752, cttz, v128f, v128f, 2>; @@ -399,134 +424,158 @@ let Predicates = [FeatureVector] in { def VX : BinaryVRRc<"vx", 0xE76D, null_frag, v128any, v128any>; // Galois field multiply sum. 
+ def VGFM : BinaryVRRcGeneric<"vgfm", 0xE7B4>; def VGFMB : BinaryVRRc<"vgfmb", 0xE7B4, int_s390_vgfmb, v128h, v128b, 0>; def VGFMH : BinaryVRRc<"vgfmh", 0xE7B4, int_s390_vgfmh, v128f, v128h, 1>; def VGFMF : BinaryVRRc<"vgfmf", 0xE7B4, int_s390_vgfmf, v128g, v128f, 2>; def VGFMG : BinaryVRRc<"vgfmg", 0xE7B4, int_s390_vgfmg, v128q, v128g, 3>; // Galois field multiply sum and accumulate. + def VGFMA : TernaryVRRdGeneric<"vgfma", 0xE7BC>; def VGFMAB : TernaryVRRd<"vgfmab", 0xE7BC, int_s390_vgfmab, v128h, v128b, 0>; def VGFMAH : TernaryVRRd<"vgfmah", 0xE7BC, int_s390_vgfmah, v128f, v128h, 1>; def VGFMAF : TernaryVRRd<"vgfmaf", 0xE7BC, int_s390_vgfmaf, v128g, v128f, 2>; def VGFMAG : TernaryVRRd<"vgfmag", 0xE7BC, int_s390_vgfmag, v128q, v128g, 3>; // Load complement. + def VLC : UnaryVRRaGeneric<"vlc", 0xE7DE>; def VLCB : UnaryVRRa<"vlcb", 0xE7DE, z_vneg, v128b, v128b, 0>; def VLCH : UnaryVRRa<"vlch", 0xE7DE, z_vneg, v128h, v128h, 1>; def VLCF : UnaryVRRa<"vlcf", 0xE7DE, z_vneg, v128f, v128f, 2>; def VLCG : UnaryVRRa<"vlcg", 0xE7DE, z_vneg, v128g, v128g, 3>; // Load positive. + def VLP : UnaryVRRaGeneric<"vlp", 0xE7DF>; def VLPB : UnaryVRRa<"vlpb", 0xE7DF, z_viabs8, v128b, v128b, 0>; def VLPH : UnaryVRRa<"vlph", 0xE7DF, z_viabs16, v128h, v128h, 1>; def VLPF : UnaryVRRa<"vlpf", 0xE7DF, z_viabs32, v128f, v128f, 2>; def VLPG : UnaryVRRa<"vlpg", 0xE7DF, z_viabs64, v128g, v128g, 3>; // Maximum. + def VMX : BinaryVRRcGeneric<"vmx", 0xE7FF>; def VMXB : BinaryVRRc<"vmxb", 0xE7FF, null_frag, v128b, v128b, 0>; def VMXH : BinaryVRRc<"vmxh", 0xE7FF, null_frag, v128h, v128h, 1>; def VMXF : BinaryVRRc<"vmxf", 0xE7FF, null_frag, v128f, v128f, 2>; def VMXG : BinaryVRRc<"vmxg", 0xE7FF, null_frag, v128g, v128g, 3>; // Maximum logical. 
+ def VMXL : BinaryVRRcGeneric<"vmxl", 0xE7FD>; def VMXLB : BinaryVRRc<"vmxlb", 0xE7FD, null_frag, v128b, v128b, 0>; def VMXLH : BinaryVRRc<"vmxlh", 0xE7FD, null_frag, v128h, v128h, 1>; def VMXLF : BinaryVRRc<"vmxlf", 0xE7FD, null_frag, v128f, v128f, 2>; def VMXLG : BinaryVRRc<"vmxlg", 0xE7FD, null_frag, v128g, v128g, 3>; // Minimum. + def VMN : BinaryVRRcGeneric<"vmn", 0xE7FE>; def VMNB : BinaryVRRc<"vmnb", 0xE7FE, null_frag, v128b, v128b, 0>; def VMNH : BinaryVRRc<"vmnh", 0xE7FE, null_frag, v128h, v128h, 1>; def VMNF : BinaryVRRc<"vmnf", 0xE7FE, null_frag, v128f, v128f, 2>; def VMNG : BinaryVRRc<"vmng", 0xE7FE, null_frag, v128g, v128g, 3>; // Minimum logical. + def VMNL : BinaryVRRcGeneric<"vmnl", 0xE7FC>; def VMNLB : BinaryVRRc<"vmnlb", 0xE7FC, null_frag, v128b, v128b, 0>; def VMNLH : BinaryVRRc<"vmnlh", 0xE7FC, null_frag, v128h, v128h, 1>; def VMNLF : BinaryVRRc<"vmnlf", 0xE7FC, null_frag, v128f, v128f, 2>; def VMNLG : BinaryVRRc<"vmnlg", 0xE7FC, null_frag, v128g, v128g, 3>; // Multiply and add low. + def VMAL : TernaryVRRdGeneric<"vmal", 0xE7AA>; def VMALB : TernaryVRRd<"vmalb", 0xE7AA, z_muladd, v128b, v128b, 0>; def VMALHW : TernaryVRRd<"vmalhw", 0xE7AA, z_muladd, v128h, v128h, 1>; def VMALF : TernaryVRRd<"vmalf", 0xE7AA, z_muladd, v128f, v128f, 2>; // Multiply and add high. + def VMAH : TernaryVRRdGeneric<"vmah", 0xE7AB>; def VMAHB : TernaryVRRd<"vmahb", 0xE7AB, int_s390_vmahb, v128b, v128b, 0>; def VMAHH : TernaryVRRd<"vmahh", 0xE7AB, int_s390_vmahh, v128h, v128h, 1>; def VMAHF : TernaryVRRd<"vmahf", 0xE7AB, int_s390_vmahf, v128f, v128f, 2>; // Multiply and add logical high. + def VMALH : TernaryVRRdGeneric<"vmalh", 0xE7A9>; def VMALHB : TernaryVRRd<"vmalhb", 0xE7A9, int_s390_vmalhb, v128b, v128b, 0>; def VMALHH : TernaryVRRd<"vmalhh", 0xE7A9, int_s390_vmalhh, v128h, v128h, 1>; def VMALHF : TernaryVRRd<"vmalhf", 0xE7A9, int_s390_vmalhf, v128f, v128f, 2>; // Multiply and add even. 
+ def VMAE : TernaryVRRdGeneric<"vmae", 0xE7AE>; def VMAEB : TernaryVRRd<"vmaeb", 0xE7AE, int_s390_vmaeb, v128h, v128b, 0>; def VMAEH : TernaryVRRd<"vmaeh", 0xE7AE, int_s390_vmaeh, v128f, v128h, 1>; def VMAEF : TernaryVRRd<"vmaef", 0xE7AE, int_s390_vmaef, v128g, v128f, 2>; // Multiply and add logical even. + def VMALE : TernaryVRRdGeneric<"vmale", 0xE7AC>; def VMALEB : TernaryVRRd<"vmaleb", 0xE7AC, int_s390_vmaleb, v128h, v128b, 0>; def VMALEH : TernaryVRRd<"vmaleh", 0xE7AC, int_s390_vmaleh, v128f, v128h, 1>; def VMALEF : TernaryVRRd<"vmalef", 0xE7AC, int_s390_vmalef, v128g, v128f, 2>; // Multiply and add odd. + def VMAO : TernaryVRRdGeneric<"vmao", 0xE7AF>; def VMAOB : TernaryVRRd<"vmaob", 0xE7AF, int_s390_vmaob, v128h, v128b, 0>; def VMAOH : TernaryVRRd<"vmaoh", 0xE7AF, int_s390_vmaoh, v128f, v128h, 1>; def VMAOF : TernaryVRRd<"vmaof", 0xE7AF, int_s390_vmaof, v128g, v128f, 2>; // Multiply and add logical odd. + def VMALO : TernaryVRRdGeneric<"vmalo", 0xE7AD>; def VMALOB : TernaryVRRd<"vmalob", 0xE7AD, int_s390_vmalob, v128h, v128b, 0>; def VMALOH : TernaryVRRd<"vmaloh", 0xE7AD, int_s390_vmaloh, v128f, v128h, 1>; def VMALOF : TernaryVRRd<"vmalof", 0xE7AD, int_s390_vmalof, v128g, v128f, 2>; // Multiply high. + def VMH : BinaryVRRcGeneric<"vmh", 0xE7A3>; def VMHB : BinaryVRRc<"vmhb", 0xE7A3, int_s390_vmhb, v128b, v128b, 0>; def VMHH : BinaryVRRc<"vmhh", 0xE7A3, int_s390_vmhh, v128h, v128h, 1>; def VMHF : BinaryVRRc<"vmhf", 0xE7A3, int_s390_vmhf, v128f, v128f, 2>; // Multiply logical high. + def VMLH : BinaryVRRcGeneric<"vmlh", 0xE7A1>; def VMLHB : BinaryVRRc<"vmlhb", 0xE7A1, int_s390_vmlhb, v128b, v128b, 0>; def VMLHH : BinaryVRRc<"vmlhh", 0xE7A1, int_s390_vmlhh, v128h, v128h, 1>; def VMLHF : BinaryVRRc<"vmlhf", 0xE7A1, int_s390_vmlhf, v128f, v128f, 2>; // Multiply low. 
+ def VML : BinaryVRRcGeneric<"vml", 0xE7A2>; def VMLB : BinaryVRRc<"vmlb", 0xE7A2, mul, v128b, v128b, 0>; def VMLHW : BinaryVRRc<"vmlhw", 0xE7A2, mul, v128h, v128h, 1>; def VMLF : BinaryVRRc<"vmlf", 0xE7A2, mul, v128f, v128f, 2>; // Multiply even. + def VME : BinaryVRRcGeneric<"vme", 0xE7A6>; def VMEB : BinaryVRRc<"vmeb", 0xE7A6, int_s390_vmeb, v128h, v128b, 0>; def VMEH : BinaryVRRc<"vmeh", 0xE7A6, int_s390_vmeh, v128f, v128h, 1>; def VMEF : BinaryVRRc<"vmef", 0xE7A6, int_s390_vmef, v128g, v128f, 2>; // Multiply logical even. + def VMLE : BinaryVRRcGeneric<"vmle", 0xE7A4>; def VMLEB : BinaryVRRc<"vmleb", 0xE7A4, int_s390_vmleb, v128h, v128b, 0>; def VMLEH : BinaryVRRc<"vmleh", 0xE7A4, int_s390_vmleh, v128f, v128h, 1>; def VMLEF : BinaryVRRc<"vmlef", 0xE7A4, int_s390_vmlef, v128g, v128f, 2>; // Multiply odd. + def VMO : BinaryVRRcGeneric<"vmo", 0xE7A7>; def VMOB : BinaryVRRc<"vmob", 0xE7A7, int_s390_vmob, v128h, v128b, 0>; def VMOH : BinaryVRRc<"vmoh", 0xE7A7, int_s390_vmoh, v128f, v128h, 1>; def VMOF : BinaryVRRc<"vmof", 0xE7A7, int_s390_vmof, v128g, v128f, 2>; // Multiply logical odd. + def VMLO : BinaryVRRcGeneric<"vmlo", 0xE7A5>; def VMLOB : BinaryVRRc<"vmlob", 0xE7A5, int_s390_vmlob, v128h, v128b, 0>; def VMLOH : BinaryVRRc<"vmloh", 0xE7A5, int_s390_vmloh, v128f, v128h, 1>; def VMLOF : BinaryVRRc<"vmlof", 0xE7A5, int_s390_vmlof, v128g, v128f, 2>; // Nor. def VNO : BinaryVRRc<"vno", 0xE76B, null_frag, v128any, v128any>; + def : InstAlias<"vnot\t$V1, $V2", (VNO VR128:$V1, VR128:$V2, VR128:$V2), 0>; // Or. def VO : BinaryVRRc<"vo", 0xE76A, null_frag, v128any, v128any>; // Population count. - def VPOPCT : BinaryVRRa<"vpopct", 0xE750>; + def VPOPCT : UnaryVRRaGeneric<"vpopct", 0xE750>; def : Pat<(v16i8 (z_popcnt VR128:$x)), (VPOPCT VR128:$x, 0)>; // Element rotate left logical (with vector shift amount). 
+ def VERLLV : BinaryVRRcGeneric<"verllv", 0xE773>; def VERLLVB : BinaryVRRc<"verllvb", 0xE773, int_s390_verllvb, v128b, v128b, 0>; def VERLLVH : BinaryVRRc<"verllvh", 0xE773, int_s390_verllvh, @@ -537,48 +586,56 @@ let Predicates = [FeatureVector] in { v128g, v128g, 3>; // Element rotate left logical (with scalar shift amount). + def VERLL : BinaryVRSaGeneric<"verll", 0xE733>; def VERLLB : BinaryVRSa<"verllb", 0xE733, int_s390_verllb, v128b, v128b, 0>; def VERLLH : BinaryVRSa<"verllh", 0xE733, int_s390_verllh, v128h, v128h, 1>; def VERLLF : BinaryVRSa<"verllf", 0xE733, int_s390_verllf, v128f, v128f, 2>; def VERLLG : BinaryVRSa<"verllg", 0xE733, int_s390_verllg, v128g, v128g, 3>; // Element rotate and insert under mask. + def VERIM : QuaternaryVRIdGeneric<"verim", 0xE772>; def VERIMB : QuaternaryVRId<"verimb", 0xE772, int_s390_verimb, v128b, v128b, 0>; def VERIMH : QuaternaryVRId<"verimh", 0xE772, int_s390_verimh, v128h, v128h, 1>; def VERIMF : QuaternaryVRId<"verimf", 0xE772, int_s390_verimf, v128f, v128f, 2>; def VERIMG : QuaternaryVRId<"verimg", 0xE772, int_s390_verimg, v128g, v128g, 3>; // Element shift left (with vector shift amount). + def VESLV : BinaryVRRcGeneric<"veslv", 0xE770>; def VESLVB : BinaryVRRc<"veslvb", 0xE770, z_vshl, v128b, v128b, 0>; def VESLVH : BinaryVRRc<"veslvh", 0xE770, z_vshl, v128h, v128h, 1>; def VESLVF : BinaryVRRc<"veslvf", 0xE770, z_vshl, v128f, v128f, 2>; def VESLVG : BinaryVRRc<"veslvg", 0xE770, z_vshl, v128g, v128g, 3>; // Element shift left (with scalar shift amount). + def VESL : BinaryVRSaGeneric<"vesl", 0xE730>; def VESLB : BinaryVRSa<"veslb", 0xE730, z_vshl_by_scalar, v128b, v128b, 0>; def VESLH : BinaryVRSa<"veslh", 0xE730, z_vshl_by_scalar, v128h, v128h, 1>; def VESLF : BinaryVRSa<"veslf", 0xE730, z_vshl_by_scalar, v128f, v128f, 2>; def VESLG : BinaryVRSa<"veslg", 0xE730, z_vshl_by_scalar, v128g, v128g, 3>; // Element shift right arithmetic (with vector shift amount). 
+ def VESRAV : BinaryVRRcGeneric<"vesrav", 0xE77A>; def VESRAVB : BinaryVRRc<"vesravb", 0xE77A, z_vsra, v128b, v128b, 0>; def VESRAVH : BinaryVRRc<"vesravh", 0xE77A, z_vsra, v128h, v128h, 1>; def VESRAVF : BinaryVRRc<"vesravf", 0xE77A, z_vsra, v128f, v128f, 2>; def VESRAVG : BinaryVRRc<"vesravg", 0xE77A, z_vsra, v128g, v128g, 3>; // Element shift right arithmetic (with scalar shift amount). + def VESRA : BinaryVRSaGeneric<"vesra", 0xE73A>; def VESRAB : BinaryVRSa<"vesrab", 0xE73A, z_vsra_by_scalar, v128b, v128b, 0>; def VESRAH : BinaryVRSa<"vesrah", 0xE73A, z_vsra_by_scalar, v128h, v128h, 1>; def VESRAF : BinaryVRSa<"vesraf", 0xE73A, z_vsra_by_scalar, v128f, v128f, 2>; def VESRAG : BinaryVRSa<"vesrag", 0xE73A, z_vsra_by_scalar, v128g, v128g, 3>; // Element shift right logical (with vector shift amount). + def VESRLV : BinaryVRRcGeneric<"vesrlv", 0xE778>; def VESRLVB : BinaryVRRc<"vesrlvb", 0xE778, z_vsrl, v128b, v128b, 0>; def VESRLVH : BinaryVRRc<"vesrlvh", 0xE778, z_vsrl, v128h, v128h, 1>; def VESRLVF : BinaryVRRc<"vesrlvf", 0xE778, z_vsrl, v128f, v128f, 2>; def VESRLVG : BinaryVRRc<"vesrlvg", 0xE778, z_vsrl, v128g, v128g, 3>; // Element shift right logical (with scalar shift amount). + def VESRL : BinaryVRSaGeneric<"vesrl", 0xE738>; def VESRLB : BinaryVRSa<"vesrlb", 0xE738, z_vsrl_by_scalar, v128b, v128b, 0>; def VESRLH : BinaryVRSa<"vesrlh", 0xE738, z_vsrl_by_scalar, v128h, v128h, 1>; def VESRLF : BinaryVRSa<"vesrlf", 0xE738, z_vsrl_by_scalar, v128f, v128f, 2>; @@ -608,6 +665,7 @@ let Predicates = [FeatureVector] in { def VSRLB : BinaryVRRc<"vsrlb", 0xE77D, int_s390_vsrlb, v128b, v128b>; // Subtract. 
+ def VS : BinaryVRRcGeneric<"vs", 0xE7F7>; def VSB : BinaryVRRc<"vsb", 0xE7F7, sub, v128b, v128b, 0>; def VSH : BinaryVRRc<"vsh", 0xE7F7, sub, v128h, v128h, 1>; def VSF : BinaryVRRc<"vsf", 0xE7F7, sub, v128f, v128f, 2>; @@ -615,6 +673,7 @@ let Predicates = [FeatureVector] in { def VSQ : BinaryVRRc<"vsq", 0xE7F7, int_s390_vsq, v128q, v128q, 4>; // Subtract compute borrow indication. + def VSCBI : BinaryVRRcGeneric<"vscbi", 0xE7F5>; def VSCBIB : BinaryVRRc<"vscbib", 0xE7F5, int_s390_vscbib, v128b, v128b, 0>; def VSCBIH : BinaryVRRc<"vscbih", 0xE7F5, int_s390_vscbih, v128h, v128h, 1>; def VSCBIF : BinaryVRRc<"vscbif", 0xE7F5, int_s390_vscbif, v128f, v128f, 2>; @@ -622,21 +681,26 @@ let Predicates = [FeatureVector] in { def VSCBIQ : BinaryVRRc<"vscbiq", 0xE7F5, int_s390_vscbiq, v128q, v128q, 4>; // Subtract with borrow indication. + def VSBI : TernaryVRRdGeneric<"vsbi", 0xE7BF>; def VSBIQ : TernaryVRRd<"vsbiq", 0xE7BF, int_s390_vsbiq, v128q, v128q, 4>; // Subtract with borrow compute borrow indication. + def VSBCBI : TernaryVRRdGeneric<"vsbcbi", 0xE7BD>; def VSBCBIQ : TernaryVRRd<"vsbcbiq", 0xE7BD, int_s390_vsbcbiq, v128q, v128q, 4>; // Sum across doubleword. + def VSUMG : BinaryVRRcGeneric<"vsumg", 0xE765>; def VSUMGH : BinaryVRRc<"vsumgh", 0xE765, z_vsum, v128g, v128h, 1>; def VSUMGF : BinaryVRRc<"vsumgf", 0xE765, z_vsum, v128g, v128f, 2>; // Sum across quadword. + def VSUMQ : BinaryVRRcGeneric<"vsumq", 0xE767>; def VSUMQF : BinaryVRRc<"vsumqf", 0xE767, z_vsum, v128q, v128f, 2>; def VSUMQG : BinaryVRRc<"vsumqg", 0xE767, z_vsum, v128q, v128g, 3>; // Sum across word. + def VSUM : BinaryVRRcGeneric<"vsum", 0xE764>; def VSUMB : BinaryVRRc<"vsumb", 0xE764, z_vsum, v128f, v128b, 0>; def VSUMH : BinaryVRRc<"vsumh", 0xE764, z_vsum, v128f, v128h, 1>; } @@ -737,6 +801,7 @@ defm : IntegerMinMaxVectorOps<v2i64, z_vicmphl, VMNLG, VMXLG>; let Predicates = [FeatureVector] in { // Element compare. 
let Defs = [CC] in { + def VEC : CompareVRRaGeneric<"vec", 0xE7DB>; def VECB : CompareVRRa<"vecb", 0xE7DB, null_frag, v128b, 0>; def VECH : CompareVRRa<"vech", 0xE7DB, null_frag, v128h, 1>; def VECF : CompareVRRa<"vecf", 0xE7DB, null_frag, v128f, 2>; @@ -745,6 +810,7 @@ let Predicates = [FeatureVector] in { // Element compare logical. let Defs = [CC] in { + def VECL : CompareVRRaGeneric<"vecl", 0xE7D9>; def VECLB : CompareVRRa<"veclb", 0xE7D9, null_frag, v128b, 0>; def VECLH : CompareVRRa<"veclh", 0xE7D9, null_frag, v128h, 1>; def VECLF : CompareVRRa<"veclf", 0xE7D9, null_frag, v128f, 2>; @@ -752,6 +818,7 @@ let Predicates = [FeatureVector] in { } // Compare equal. + def VCEQ : BinaryVRRbSPairGeneric<"vceq", 0xE7F8>; defm VCEQB : BinaryVRRbSPair<"vceqb", 0xE7F8, z_vicmpe, z_vicmpes, v128b, v128b, 0>; defm VCEQH : BinaryVRRbSPair<"vceqh", 0xE7F8, z_vicmpe, z_vicmpes, @@ -762,6 +829,7 @@ let Predicates = [FeatureVector] in { v128g, v128g, 3>; // Compare high. + def VCH : BinaryVRRbSPairGeneric<"vch", 0xE7FB>; defm VCHB : BinaryVRRbSPair<"vchb", 0xE7FB, z_vicmph, z_vicmphs, v128b, v128b, 0>; defm VCHH : BinaryVRRbSPair<"vchh", 0xE7FB, z_vicmph, z_vicmphs, @@ -772,6 +840,7 @@ let Predicates = [FeatureVector] in { v128g, v128g, 3>; // Compare high logical. + def VCHL : BinaryVRRbSPairGeneric<"vchl", 0xE7F9>; defm VCHLB : BinaryVRRbSPair<"vchlb", 0xE7F9, z_vicmphl, z_vicmphls, v128b, v128b, 0>; defm VCHLH : BinaryVRRbSPair<"vchlh", 0xE7F9, z_vicmphl, z_vicmphls, @@ -798,69 +867,86 @@ multiclass VectorRounding<Instruction insn, TypedReg tr> { def : FPConversion<insn, ffloor, tr, tr, 4, 7>; def : FPConversion<insn, fceil, tr, tr, 4, 6>; def : FPConversion<insn, ftrunc, tr, tr, 4, 5>; - def : FPConversion<insn, frnd, tr, tr, 4, 1>; + def : FPConversion<insn, fround, tr, tr, 4, 1>; } let Predicates = [FeatureVector] in { // Add. 
+ def VFA : BinaryVRRcFloatGeneric<"vfa", 0xE7E3>; def VFADB : BinaryVRRc<"vfadb", 0xE7E3, fadd, v128db, v128db, 3, 0>; def WFADB : BinaryVRRc<"wfadb", 0xE7E3, fadd, v64db, v64db, 3, 8>; // Convert from fixed 64-bit. + def VCDG : TernaryVRRaFloatGeneric<"vcdg", 0xE7C3>; def VCDGB : TernaryVRRa<"vcdgb", 0xE7C3, null_frag, v128db, v128g, 3, 0>; def WCDGB : TernaryVRRa<"wcdgb", 0xE7C3, null_frag, v64db, v64g, 3, 8>; def : FPConversion<VCDGB, sint_to_fp, v128db, v128g, 0, 0>; // Convert from logical 64-bit. + def VCDLG : TernaryVRRaFloatGeneric<"vcdlg", 0xE7C1>; def VCDLGB : TernaryVRRa<"vcdlgb", 0xE7C1, null_frag, v128db, v128g, 3, 0>; def WCDLGB : TernaryVRRa<"wcdlgb", 0xE7C1, null_frag, v64db, v64g, 3, 8>; def : FPConversion<VCDLGB, uint_to_fp, v128db, v128g, 0, 0>; // Convert to fixed 64-bit. + def VCGD : TernaryVRRaFloatGeneric<"vcgd", 0xE7C2>; def VCGDB : TernaryVRRa<"vcgdb", 0xE7C2, null_frag, v128g, v128db, 3, 0>; def WCGDB : TernaryVRRa<"wcgdb", 0xE7C2, null_frag, v64g, v64db, 3, 8>; // Rounding mode should agree with SystemZInstrFP.td. def : FPConversion<VCGDB, fp_to_sint, v128g, v128db, 0, 5>; // Convert to logical 64-bit. + def VCLGD : TernaryVRRaFloatGeneric<"vclgd", 0xE7C0>; def VCLGDB : TernaryVRRa<"vclgdb", 0xE7C0, null_frag, v128g, v128db, 3, 0>; def WCLGDB : TernaryVRRa<"wclgdb", 0xE7C0, null_frag, v64g, v64db, 3, 8>; // Rounding mode should agree with SystemZInstrFP.td. def : FPConversion<VCLGDB, fp_to_uint, v128g, v128db, 0, 5>; // Divide. + def VFD : BinaryVRRcFloatGeneric<"vfd", 0xE7E5>; def VFDDB : BinaryVRRc<"vfddb", 0xE7E5, fdiv, v128db, v128db, 3, 0>; def WFDDB : BinaryVRRc<"wfddb", 0xE7E5, fdiv, v64db, v64db, 3, 8>; // Load FP integer. + def VFI : TernaryVRRaFloatGeneric<"vfi", 0xE7C7>; def VFIDB : TernaryVRRa<"vfidb", 0xE7C7, int_s390_vfidb, v128db, v128db, 3, 0>; def WFIDB : TernaryVRRa<"wfidb", 0xE7C7, null_frag, v64db, v64db, 3, 8>; defm : VectorRounding<VFIDB, v128db>; defm : VectorRounding<WFIDB, v64db>; // Load lengthened. 
+ def VLDE : UnaryVRRaFloatGeneric<"vlde", 0xE7C4>; def VLDEB : UnaryVRRa<"vldeb", 0xE7C4, z_vextend, v128db, v128eb, 2, 0>; - def WLDEB : UnaryVRRa<"wldeb", 0xE7C4, fextend, v64db, v32eb, 2, 8>; + def WLDEB : UnaryVRRa<"wldeb", 0xE7C4, fpextend, v64db, v32eb, 2, 8>; // Load rounded, + def VLED : TernaryVRRaFloatGeneric<"vled", 0xE7C5>; def VLEDB : TernaryVRRa<"vledb", 0xE7C5, null_frag, v128eb, v128db, 3, 0>; def WLEDB : TernaryVRRa<"wledb", 0xE7C5, null_frag, v32eb, v64db, 3, 8>; def : Pat<(v4f32 (z_vround (v2f64 VR128:$src))), (VLEDB VR128:$src, 0, 0)>; - def : FPConversion<WLEDB, fround, v32eb, v64db, 0, 0>; + def : FPConversion<WLEDB, fpround, v32eb, v64db, 0, 0>; // Multiply. + def VFM : BinaryVRRcFloatGeneric<"vfm", 0xE7E7>; def VFMDB : BinaryVRRc<"vfmdb", 0xE7E7, fmul, v128db, v128db, 3, 0>; def WFMDB : BinaryVRRc<"wfmdb", 0xE7E7, fmul, v64db, v64db, 3, 8>; // Multiply and add. + def VFMA : TernaryVRReFloatGeneric<"vfma", 0xE78F>; def VFMADB : TernaryVRRe<"vfmadb", 0xE78F, fma, v128db, v128db, 0, 3>; def WFMADB : TernaryVRRe<"wfmadb", 0xE78F, fma, v64db, v64db, 8, 3>; // Multiply and subtract. + def VFMS : TernaryVRReFloatGeneric<"vfms", 0xE78E>; def VFMSDB : TernaryVRRe<"vfmsdb", 0xE78E, fms, v128db, v128db, 0, 3>; def WFMSDB : TernaryVRRe<"wfmsdb", 0xE78E, fms, v64db, v64db, 8, 3>; - // Load complement, + // Perform sign operation. + def VFPSO : BinaryVRRaFloatGeneric<"vfpso", 0xE7CC>; + def VFPSODB : BinaryVRRa<"vfpsodb", 0xE7CC, null_frag, v128db, v128db, 3, 0>; + def WFPSODB : BinaryVRRa<"wfpsodb", 0xE7CC, null_frag, v64db, v64db, 3, 8>; + + // Load complement. def VFLCDB : UnaryVRRa<"vflcdb", 0xE7CC, fneg, v128db, v128db, 3, 0, 0>; def WFLCDB : UnaryVRRa<"wflcdb", 0xE7CC, fneg, v64db, v64db, 3, 8, 0>; @@ -873,15 +959,18 @@ let Predicates = [FeatureVector] in { def WFLPDB : UnaryVRRa<"wflpdb", 0xE7CC, fabs, v64db, v64db, 3, 8, 2>; // Square root. 
+ def VFSQ : UnaryVRRaFloatGeneric<"vfsq", 0xE7CE>; def VFSQDB : UnaryVRRa<"vfsqdb", 0xE7CE, fsqrt, v128db, v128db, 3, 0>; def WFSQDB : UnaryVRRa<"wfsqdb", 0xE7CE, fsqrt, v64db, v64db, 3, 8>; // Subtract. + def VFS : BinaryVRRcFloatGeneric<"vfs", 0xE7E2>; def VFSDB : BinaryVRRc<"vfsdb", 0xE7E2, fsub, v128db, v128db, 3, 0>; def WFSDB : BinaryVRRc<"wfsdb", 0xE7E2, fsub, v64db, v64db, 3, 8>; // Test data class immediate. let Defs = [CC] in { + def VFTCI : BinaryVRIeFloatGeneric<"vftci", 0xE74A>; def VFTCIDB : BinaryVRIe<"vftcidb", 0xE74A, z_vftci, v128g, v128db, 3, 0>; def WFTCIDB : BinaryVRIe<"wftcidb", 0xE74A, null_frag, v64g, v64db, 3, 8>; } @@ -893,26 +982,33 @@ let Predicates = [FeatureVector] in { let Predicates = [FeatureVector] in { // Compare scalar. - let Defs = [CC] in + let Defs = [CC] in { + def WFC : CompareVRRaFloatGeneric<"wfc", 0xE7CB>; def WFCDB : CompareVRRa<"wfcdb", 0xE7CB, z_fcmp, v64db, 3>; + } // Compare and signal scalar. - let Defs = [CC] in + let Defs = [CC] in { + def WFK : CompareVRRaFloatGeneric<"wfk", 0xE7CA>; def WFKDB : CompareVRRa<"wfkdb", 0xE7CA, null_frag, v64db, 3>; + } // Compare equal. + def VFCE : BinaryVRRcSPairFloatGeneric<"vfce", 0xE7E8>; defm VFCEDB : BinaryVRRcSPair<"vfcedb", 0xE7E8, z_vfcmpe, z_vfcmpes, v128g, v128db, 3, 0>; defm WFCEDB : BinaryVRRcSPair<"wfcedb", 0xE7E8, null_frag, null_frag, v64g, v64db, 3, 8>; // Compare high. + def VFCH : BinaryVRRcSPairFloatGeneric<"vfch", 0xE7EB>; defm VFCHDB : BinaryVRRcSPair<"vfchdb", 0xE7EB, z_vfcmph, z_vfcmphs, v128g, v128db, 3, 0>; defm WFCHDB : BinaryVRRcSPair<"wfchdb", 0xE7EB, null_frag, null_frag, v64g, v64db, 3, 8>; // Compare high or equal. 
+ def VFCHE : BinaryVRRcSPairFloatGeneric<"vfche", 0xE7EA>; defm VFCHEDB : BinaryVRRcSPair<"vfchedb", 0xE7EA, z_vfcmphe, z_vfcmphes, v128g, v128db, 3, 0>; defm WFCHEDB : BinaryVRRcSPair<"wfchedb", 0xE7EA, null_frag, null_frag, @@ -983,11 +1079,13 @@ def : Pat<(v2i64 (z_replicate GR64:$scalar)), // Moving 32-bit values between GPRs and FPRs can be done using VLVGF // and VLGVF. -def LEFR : UnaryAliasVRS<VR32, GR32>; -def LFER : UnaryAliasVRS<GR64, VR32>; -def : Pat<(f32 (bitconvert (i32 GR32:$src))), (LEFR GR32:$src)>; -def : Pat<(i32 (bitconvert (f32 VR32:$src))), - (EXTRACT_SUBREG (LFER VR32:$src), subreg_l32)>; +let Predicates = [FeatureVector] in { + def LEFR : UnaryAliasVRS<VR32, GR32>; + def LFER : UnaryAliasVRS<GR64, VR32>; + def : Pat<(f32 (bitconvert (i32 GR32:$src))), (LEFR GR32:$src)>; + def : Pat<(i32 (bitconvert (f32 VR32:$src))), + (EXTRACT_SUBREG (LFER VR32:$src), subreg_l32)>; +} // Floating-point values are stored in element 0 of the corresponding // vector register. Scalar to vector conversion is just a subreg and @@ -1036,62 +1134,67 @@ let AddedComplexity = 4 in { //===----------------------------------------------------------------------===// let Predicates = [FeatureVector] in { - defm VFAEB : TernaryVRRbSPair<"vfaeb", 0xE782, int_s390_vfaeb, z_vfae_cc, - v128b, v128b, 0, 0>; - defm VFAEH : TernaryVRRbSPair<"vfaeh", 0xE782, int_s390_vfaeh, z_vfae_cc, - v128h, v128h, 1, 0>; - defm VFAEF : TernaryVRRbSPair<"vfaef", 0xE782, int_s390_vfaef, z_vfae_cc, - v128f, v128f, 2, 0>; - defm VFAEZB : TernaryVRRbSPair<"vfaezb", 0xE782, int_s390_vfaezb, z_vfaez_cc, - v128b, v128b, 0, 2>; - defm VFAEZH : TernaryVRRbSPair<"vfaezh", 0xE782, int_s390_vfaezh, z_vfaez_cc, - v128h, v128h, 1, 2>; - defm VFAEZF : TernaryVRRbSPair<"vfaezf", 0xE782, int_s390_vfaezf, z_vfaez_cc, - v128f, v128f, 2, 2>; - - defm VFEEB : BinaryVRRbSPair<"vfeeb", 0xE780, int_s390_vfeeb, z_vfee_cc, - v128b, v128b, 0, 0, 1>; - defm VFEEH : BinaryVRRbSPair<"vfeeh", 0xE780, int_s390_vfeeh, 
z_vfee_cc, - v128h, v128h, 1, 0, 1>; - defm VFEEF : BinaryVRRbSPair<"vfeef", 0xE780, int_s390_vfeef, z_vfee_cc, - v128f, v128f, 2, 0, 1>; - defm VFEEZB : BinaryVRRbSPair<"vfeezb", 0xE780, int_s390_vfeezb, z_vfeez_cc, - v128b, v128b, 0, 2, 3>; - defm VFEEZH : BinaryVRRbSPair<"vfeezh", 0xE780, int_s390_vfeezh, z_vfeez_cc, - v128h, v128h, 1, 2, 3>; - defm VFEEZF : BinaryVRRbSPair<"vfeezf", 0xE780, int_s390_vfeezf, z_vfeez_cc, - v128f, v128f, 2, 2, 3>; - - defm VFENEB : BinaryVRRbSPair<"vfeneb", 0xE781, int_s390_vfeneb, z_vfene_cc, - v128b, v128b, 0, 0, 1>; - defm VFENEH : BinaryVRRbSPair<"vfeneh", 0xE781, int_s390_vfeneh, z_vfene_cc, - v128h, v128h, 1, 0, 1>; - defm VFENEF : BinaryVRRbSPair<"vfenef", 0xE781, int_s390_vfenef, z_vfene_cc, - v128f, v128f, 2, 0, 1>; + defm VFAE : TernaryOptVRRbSPairGeneric<"vfae", 0xE782>; + defm VFAEB : TernaryOptVRRbSPair<"vfaeb", 0xE782, int_s390_vfaeb, + z_vfae_cc, v128b, v128b, 0>; + defm VFAEH : TernaryOptVRRbSPair<"vfaeh", 0xE782, int_s390_vfaeh, + z_vfae_cc, v128h, v128h, 1>; + defm VFAEF : TernaryOptVRRbSPair<"vfaef", 0xE782, int_s390_vfaef, + z_vfae_cc, v128f, v128f, 2>; + defm VFAEZB : TernaryOptVRRbSPair<"vfaezb", 0xE782, int_s390_vfaezb, + z_vfaez_cc, v128b, v128b, 0, 2>; + defm VFAEZH : TernaryOptVRRbSPair<"vfaezh", 0xE782, int_s390_vfaezh, + z_vfaez_cc, v128h, v128h, 1, 2>; + defm VFAEZF : TernaryOptVRRbSPair<"vfaezf", 0xE782, int_s390_vfaezf, + z_vfaez_cc, v128f, v128f, 2, 2>; + + defm VFEE : BinaryExtraVRRbSPairGeneric<"vfee", 0xE780>; + defm VFEEB : BinaryExtraVRRbSPair<"vfeeb", 0xE780, int_s390_vfeeb, + z_vfee_cc, v128b, v128b, 0>; + defm VFEEH : BinaryExtraVRRbSPair<"vfeeh", 0xE780, int_s390_vfeeh, + z_vfee_cc, v128h, v128h, 1>; + defm VFEEF : BinaryExtraVRRbSPair<"vfeef", 0xE780, int_s390_vfeef, + z_vfee_cc, v128f, v128f, 2>; + defm VFEEZB : BinaryVRRbSPair<"vfeezb", 0xE780, int_s390_vfeezb, + z_vfeez_cc, v128b, v128b, 0, 2>; + defm VFEEZH : BinaryVRRbSPair<"vfeezh", 0xE780, int_s390_vfeezh, + z_vfeez_cc, v128h, 
v128h, 1, 2>; + defm VFEEZF : BinaryVRRbSPair<"vfeezf", 0xE780, int_s390_vfeezf, + z_vfeez_cc, v128f, v128f, 2, 2>; + + defm VFENE : BinaryExtraVRRbSPairGeneric<"vfene", 0xE781>; + defm VFENEB : BinaryExtraVRRbSPair<"vfeneb", 0xE781, int_s390_vfeneb, + z_vfene_cc, v128b, v128b, 0>; + defm VFENEH : BinaryExtraVRRbSPair<"vfeneh", 0xE781, int_s390_vfeneh, + z_vfene_cc, v128h, v128h, 1>; + defm VFENEF : BinaryExtraVRRbSPair<"vfenef", 0xE781, int_s390_vfenef, + z_vfene_cc, v128f, v128f, 2>; defm VFENEZB : BinaryVRRbSPair<"vfenezb", 0xE781, int_s390_vfenezb, - z_vfenez_cc, v128b, v128b, 0, 2, 3>; + z_vfenez_cc, v128b, v128b, 0, 2>; defm VFENEZH : BinaryVRRbSPair<"vfenezh", 0xE781, int_s390_vfenezh, - z_vfenez_cc, v128h, v128h, 1, 2, 3>; + z_vfenez_cc, v128h, v128h, 1, 2>; defm VFENEZF : BinaryVRRbSPair<"vfenezf", 0xE781, int_s390_vfenezf, - z_vfenez_cc, v128f, v128f, 2, 2, 3>; - - defm VISTRB : UnaryVRRaSPair<"vistrb", 0xE75C, int_s390_vistrb, z_vistr_cc, - v128b, v128b, 0>; - defm VISTRH : UnaryVRRaSPair<"vistrh", 0xE75C, int_s390_vistrh, z_vistr_cc, - v128h, v128h, 1>; - defm VISTRF : UnaryVRRaSPair<"vistrf", 0xE75C, int_s390_vistrf, z_vistr_cc, - v128f, v128f, 2>; - - defm VSTRCB : QuaternaryVRRdSPair<"vstrcb", 0xE78A, int_s390_vstrcb, - z_vstrc_cc, v128b, v128b, 0, 0>; - defm VSTRCH : QuaternaryVRRdSPair<"vstrch", 0xE78A, int_s390_vstrch, - z_vstrc_cc, v128h, v128h, 1, 0>; - defm VSTRCF : QuaternaryVRRdSPair<"vstrcf", 0xE78A, int_s390_vstrcf, - z_vstrc_cc, v128f, v128f, 2, 0>; - defm VSTRCZB : QuaternaryVRRdSPair<"vstrczb", 0xE78A, int_s390_vstrczb, - z_vstrcz_cc, v128b, v128b, 0, 2>; - defm VSTRCZH : QuaternaryVRRdSPair<"vstrczh", 0xE78A, int_s390_vstrczh, - z_vstrcz_cc, v128h, v128h, 1, 2>; - defm VSTRCZF : QuaternaryVRRdSPair<"vstrczf", 0xE78A, int_s390_vstrczf, - z_vstrcz_cc, v128f, v128f, 2, 2>; + z_vfenez_cc, v128f, v128f, 2, 2>; + + defm VISTR : UnaryExtraVRRaSPairGeneric<"vistr", 0xE75C>; + defm VISTRB : UnaryExtraVRRaSPair<"vistrb", 0xE75C, int_s390_vistrb, 
+ z_vistr_cc, v128b, v128b, 0>; + defm VISTRH : UnaryExtraVRRaSPair<"vistrh", 0xE75C, int_s390_vistrh, + z_vistr_cc, v128h, v128h, 1>; + defm VISTRF : UnaryExtraVRRaSPair<"vistrf", 0xE75C, int_s390_vistrf, + z_vistr_cc, v128f, v128f, 2>; + + defm VSTRC : QuaternaryOptVRRdSPairGeneric<"vstrc", 0xE78A>; + defm VSTRCB : QuaternaryOptVRRdSPair<"vstrcb", 0xE78A, int_s390_vstrcb, + z_vstrc_cc, v128b, v128b, 0>; + defm VSTRCH : QuaternaryOptVRRdSPair<"vstrch", 0xE78A, int_s390_vstrch, + z_vstrc_cc, v128h, v128h, 1>; + defm VSTRCF : QuaternaryOptVRRdSPair<"vstrcf", 0xE78A, int_s390_vstrcf, + z_vstrc_cc, v128f, v128f, 2>; + defm VSTRCZB : QuaternaryOptVRRdSPair<"vstrczb", 0xE78A, int_s390_vstrczb, + z_vstrcz_cc, v128b, v128b, 0, 2>; + defm VSTRCZH : QuaternaryOptVRRdSPair<"vstrczh", 0xE78A, int_s390_vstrczh, + z_vstrcz_cc, v128h, v128h, 1, 2>; + defm VSTRCZF : QuaternaryOptVRRdSPair<"vstrczf", 0xE78A, int_s390_vstrczf, + z_vstrcz_cc, v128f, v128f, 2, 2>; } diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZLDCleanup.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZLDCleanup.cpp index 2cdf2f9..ec8ce6e 100644 --- a/contrib/llvm/lib/Target/SystemZ/SystemZLDCleanup.cpp +++ b/contrib/llvm/lib/Target/SystemZ/SystemZLDCleanup.cpp @@ -33,7 +33,7 @@ public: SystemZLDCleanup(const SystemZTargetMachine &tm) : MachineFunctionPass(ID), TII(nullptr), MF(nullptr) {} - const char *getPassName() const override { + StringRef getPassName() const override { return "SystemZ Local Dynamic TLS Access Clean-up"; } diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZLongBranch.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZLongBranch.cpp index a24d47d..14ff6af 100644 --- a/contrib/llvm/lib/Target/SystemZ/SystemZLongBranch.cpp +++ b/contrib/llvm/lib/Target/SystemZ/SystemZLongBranch.cpp @@ -133,14 +133,12 @@ public: SystemZLongBranch(const SystemZTargetMachine &tm) : MachineFunctionPass(ID), TII(nullptr) {} - const char *getPassName() const override { - return "SystemZ Long Branch"; - } + StringRef 
getPassName() const override { return "SystemZ Long Branch"; }
   bool runOnMachineFunction(MachineFunction &F) override;
   MachineFunctionProperties getRequiredProperties() const override {
     return MachineFunctionProperties().set(
-        MachineFunctionProperties::Property::AllVRegsAllocated);
+        MachineFunctionProperties::Property::NoVRegs);
   }
 private:
@@ -228,6 +226,10 @@ TerminatorInfo SystemZLongBranch::describeTerminator(MachineInstr &MI) {
       // Relaxes to A(G)HI and BRCL, which is 6 bytes longer.
       Terminator.ExtraRelaxSize = 6;
       break;
+    case SystemZ::BRCTH:
+      // Never needs to be relaxed.
+      Terminator.ExtraRelaxSize = 0;
+      break;
     case SystemZ::CRJ:
     case SystemZ::CLRJ:
       // Relaxes to a C(L)R/BRCL sequence, which is 2 bytes longer.
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZMachineScheduler.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZMachineScheduler.cpp
new file mode 100644
index 0000000..ab6020f
--- /dev/null
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZMachineScheduler.cpp
@@ -0,0 +1,153 @@
+//-- SystemZMachineScheduler.cpp - SystemZ Scheduler Interface -*- C++ -*---==//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// -------------------------- Post RA scheduling ---------------------------- //
+// SystemZPostRASchedStrategy is a scheduling strategy which is plugged into
+// the MachineScheduler. It has a sorted Available set of SUs and a pickNode()
+// implementation that looks to optimize decoder grouping and balance the
+// usage of processor resources.
+//===----------------------------------------------------------------------===//
+
+#include "SystemZMachineScheduler.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "misched"
+
+#ifndef NDEBUG
+// Print the set of SUs
+void SystemZPostRASchedStrategy::SUSet::
+dump(SystemZHazardRecognizer &HazardRec) {
+  dbgs() << "{";
+  for (auto &SU : *this) {
+    HazardRec.dumpSU(SU, dbgs());
+    if (SU != *rbegin())
+      dbgs() << ", ";
+  }
+  dbgs() << "}\n";
+}
+#endif
+
+SystemZPostRASchedStrategy::
+SystemZPostRASchedStrategy(const MachineSchedContext *C)
+  : DAG(nullptr), HazardRec(C) {}
+
+void SystemZPostRASchedStrategy::initialize(ScheduleDAGMI *dag) {
+  DAG = dag;
+  HazardRec.setDAG(dag);
+  HazardRec.Reset();
+}
+
+// Pick the next node to schedule.
+SUnit *SystemZPostRASchedStrategy::pickNode(bool &IsTopNode) {
+  // Only scheduling top-down.
+  IsTopNode = true;
+
+  if (Available.empty())
+    return nullptr;
+
+  // If only one choice, return it.
+  if (Available.size() == 1) {
+    DEBUG (dbgs() << "+++ Only one: ";
+           HazardRec.dumpSU(*Available.begin(), dbgs()); dbgs() << "\n";);
+    return *Available.begin();
+  }
+
+  // All nodes that are possible to schedule are stored by in the
+  // Available set.
+  DEBUG(dbgs() << "+++ Available: "; Available.dump(HazardRec););
+
+  Candidate Best;
+  for (auto *SU : Available) {
+
+    // SU is the next candidate to be compared against current Best.
+    Candidate c(SU, HazardRec);
+
+    // Remeber which SU is the best candidate.
+    if (Best.SU == nullptr || c < Best) {
+      Best = c;
+      DEBUG(dbgs() << "+++ Best sofar: ";
+            HazardRec.dumpSU(Best.SU, dbgs());
+            if (Best.GroupingCost != 0)
+              dbgs() << "\tGrouping cost:" << Best.GroupingCost;
+            if (Best.ResourcesCost != 0)
+              dbgs() << " Resource cost:" << Best.ResourcesCost;
+            dbgs() << " Height:" << Best.SU->getHeight();
+            dbgs() << "\n";);
+    }
+
+    // Once we know we have seen all SUs that affect grouping or use unbuffered
+    // resources, we can stop iterating if Best looks good.
+    if (!SU->isScheduleHigh && Best.noCost())
+      break;
+  }
+
+  assert (Best.SU != nullptr);
+  return Best.SU;
+}
+
+SystemZPostRASchedStrategy::Candidate::
+Candidate(SUnit *SU_, SystemZHazardRecognizer &HazardRec) : Candidate() {
+  SU = SU_;
+
+  // Check the grouping cost. For a node that must begin / end a
+  // group, it is positive if it would do so prematurely, or negative
+  // if it would fit naturally into the schedule.
+  GroupingCost = HazardRec.groupingCost(SU);
+
+  // Check the resources cost for this SU.
+  ResourcesCost = HazardRec.resourcesCost(SU);
+}
+
+bool SystemZPostRASchedStrategy::Candidate::
+operator<(const Candidate &other) {
+
+  // Check decoder grouping.
+  if (GroupingCost < other.GroupingCost)
+    return true;
+  if (GroupingCost > other.GroupingCost)
+    return false;
+
+  // Compare the use of resources.
+  if (ResourcesCost < other.ResourcesCost)
+    return true;
+  if (ResourcesCost > other.ResourcesCost)
+    return false;
+
+  // Higher SU is otherwise generally better.
+  if (SU->getHeight() > other.SU->getHeight())
+    return true;
+  if (SU->getHeight() < other.SU->getHeight())
+    return false;
+
+  // If all same, fall back to original order.
+  if (SU->NodeNum < other.SU->NodeNum)
+    return true;
+
+  return false;
+}
+
+void SystemZPostRASchedStrategy::schedNode(SUnit *SU, bool IsTopNode) {
+  DEBUG(dbgs() << "+++ Scheduling SU(" << SU->NodeNum << ")\n";);
+
+  // Remove SU from Available set and update HazardRec.
+  Available.erase(SU);
+  HazardRec.EmitInstruction(SU);
+}
+
+void SystemZPostRASchedStrategy::releaseTopNode(SUnit *SU) {
+  // Set isScheduleHigh flag on all SUs that we want to consider first in
+  // pickNode().
+  const MCSchedClassDesc *SC = DAG->getSchedClass(SU);
+  bool AffectsGrouping = (SC->isValid() && (SC->BeginGroup || SC->EndGroup));
+  SU->isScheduleHigh = (AffectsGrouping || SU->isUnbuffered);
+
+  // Put all released SUs in the Available set.
+  Available.insert(SU);
+}
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZMachineScheduler.h b/contrib/llvm/lib/Target/SystemZ/SystemZMachineScheduler.h
new file mode 100644
index 0000000..b919758
--- /dev/null
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZMachineScheduler.h
@@ -0,0 +1,112 @@
+//==-- SystemZMachineScheduler.h - SystemZ Scheduler Interface -*- C++ -*---==//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// -------------------------- Post RA scheduling ---------------------------- //
+// SystemZPostRASchedStrategy is a scheduling strategy which is plugged into
+// the MachineScheduler. It has a sorted Available set of SUs and a pickNode()
+// implementation that looks to optimize decoder grouping and balance the
+// usage of processor resources.
+//===----------------------------------------------------------------------===//
+
+#include "SystemZInstrInfo.h"
+#include "SystemZHazardRecognizer.h"
+#include "llvm/CodeGen/MachineScheduler.h"
+#include "llvm/Support/Debug.h"
+
+#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H
+#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H
+
+using namespace llvm;
+
+namespace llvm {
+
+/// A MachineSchedStrategy implementation for SystemZ post RA scheduling.
+class SystemZPostRASchedStrategy : public MachineSchedStrategy {
+  ScheduleDAGMI *DAG;
+
+  /// A candidate during instruction evaluation.
+  struct Candidate {
+    SUnit *SU;
+
+    /// The decoding cost.
+    int GroupingCost;
+
+    /// The processor resources cost.
+    int ResourcesCost;
+
+    Candidate() : SU(nullptr), GroupingCost(0), ResourcesCost(0) {}
+    Candidate(SUnit *SU_, SystemZHazardRecognizer &HazardRec);
+
+    // Compare two candidates.
+    bool operator<(const Candidate &other);
+
+    // Check if this node is free of cost ("as good as any").
+    bool inline noCost() {
+      return (GroupingCost <= 0 && !ResourcesCost);
+    }
+  };
+
+  // A sorter for the Available set that makes sure that SUs are considered
+  // in the best order.
+  struct SUSorter {
+    bool operator() (SUnit *lhs, SUnit *rhs) const {
+      if (lhs->isScheduleHigh && !rhs->isScheduleHigh)
+        return true;
+      if (!lhs->isScheduleHigh && rhs->isScheduleHigh)
+        return false;
+
+      if (lhs->getHeight() > rhs->getHeight())
+        return true;
+      else if (lhs->getHeight() < rhs->getHeight())
+        return false;
+
+      return (lhs->NodeNum < rhs->NodeNum);
+    }
+  };
+  // A set of SUs with a sorter and dump method.
+  struct SUSet : std::set<SUnit*, SUSorter> {
+    #ifndef NDEBUG
+    void dump(SystemZHazardRecognizer &HazardRec);
+    #endif
+  };
+
+  /// The set of available SUs to schedule next.
+  SUSet Available;
+
+  // HazardRecognizer that tracks the scheduler state for the current
+  // region.
+  SystemZHazardRecognizer HazardRec;
+
+public:
+  SystemZPostRASchedStrategy(const MachineSchedContext *C);
+
+  /// PostRA scheduling does not track pressure.
+  bool shouldTrackPressure() const override { return false; }
+
+  /// Initialize the strategy after building the DAG for a new region.
+  void initialize(ScheduleDAGMI *dag) override;
+
+  /// Pick the next node to schedule, or return NULL.
+  SUnit *pickNode(bool &IsTopNode) override;
+
+  /// ScheduleDAGMI has scheduled an instruction - tell HazardRec
+  /// about it.
+  void schedNode(SUnit *SU, bool IsTopNode) override;
+
+  /// SU has had all predecessor dependencies resolved. Put it into
+  /// Available.
+  void releaseTopNode(SUnit *SU) override;
+
+  /// Currently only scheduling top-down, so this method is empty.
+  void releaseBottomNode(SUnit *SU) override {};
+};
+
+} // namespace llvm
+
+#endif /* LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H */
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZOperands.td b/contrib/llvm/lib/Target/SystemZ/SystemZOperands.td
index 17b076d..7bb4fe5 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZOperands.td
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZOperands.td
@@ -133,6 +133,13 @@ class BDLMode<string type, string bitsize, string dispsize, string suffix,
                         !cast<Immediate>("disp"##dispsize##"imm"##bitsize),
                         !cast<Immediate>("imm"##bitsize))>;
+// A BDMode paired with a register length operand.
+class BDRMode<string type, string bitsize, string dispsize, string suffix>
+  : AddressingMode<type, bitsize, dispsize, suffix, "", 3, "BDRAddr",
+                   (ops !cast<RegisterOperand>("ADDR"##bitsize),
+                        !cast<Immediate>("disp"##dispsize##"imm"##bitsize),
+                        !cast<RegisterOperand>("GR"##bitsize))>;
+
 // An addressing mode with a base, displacement and a vector index.
 class BDVMode<string bitsize, string dispsize>
   : AddressOperand<bitsize, dispsize, "", "BDVAddr",
@@ -230,6 +237,12 @@ def UIMM32 : SDNodeXForm<imm, [{
                                    MVT::i64);
 }]>;
+// Truncate an immediate to a 48-bit unsigned quantity.
+def UIMM48 : SDNodeXForm<imm, [{
+  return CurDAG->getTargetConstant(uint64_t(N->getZExtValue()) & 0xffffffffffff,
+                                   SDLoc(N), MVT::i64);
+}]>;
+
 // Negate and then truncate an immediate to a 32-bit unsigned quantity.
def NEGIMM32 : SDNodeXForm<imm, [{ return CurDAG->getTargetConstant(uint32_t(-N->getZExtValue()), SDLoc(N), @@ -252,6 +265,7 @@ def S16Imm : ImmediateAsmOperand<"S16Imm">; def U16Imm : ImmediateAsmOperand<"U16Imm">; def S32Imm : ImmediateAsmOperand<"S32Imm">; def U32Imm : ImmediateAsmOperand<"U32Imm">; +def U48Imm : ImmediateAsmOperand<"U48Imm">; //===----------------------------------------------------------------------===// // i32 immediates @@ -425,6 +439,10 @@ def imm64zx32n : Immediate<i64, [{ return isUInt<32>(-N->getSExtValue()); }], NEGIMM32, "U32Imm">; +def imm64zx48 : Immediate<i64, [{ + return isUInt<64>(N->getZExtValue()); +}], UIMM48, "U48Imm">; + def imm64 : ImmLeaf<i64, [{}]>, Operand<i64>; //===----------------------------------------------------------------------===// @@ -442,7 +460,9 @@ def fpimmneg0 : PatLeaf<(fpimm), [{ return N->isExactlyValue(-0.0); }]>; //===----------------------------------------------------------------------===// // PC-relative asm operands. +def PCRel12 : PCRelAsmOperand<"12">; def PCRel16 : PCRelAsmOperand<"16">; +def PCRel24 : PCRelAsmOperand<"24">; def PCRel32 : PCRelAsmOperand<"32">; def PCRelTLS16 : PCRelTLSAsmOperand<"16">; def PCRelTLS32 : PCRelTLSAsmOperand<"32">; @@ -458,6 +478,20 @@ def brtarget32 : PCRelOperand<OtherVT, PCRel32> { let DecoderMethod = "decodePC32DBLBranchOperand"; } +// Variants of brtarget for use with branch prediction preload. +def brtarget12bpp : PCRelOperand<OtherVT, PCRel12> { + let EncoderMethod = "getPC12DBLBPPEncoding"; + let DecoderMethod = "decodePC12DBLBranchOperand"; +} +def brtarget16bpp : PCRelOperand<OtherVT, PCRel16> { + let EncoderMethod = "getPC16DBLBPPEncoding"; + let DecoderMethod = "decodePC16DBLBranchOperand"; +} +def brtarget24bpp : PCRelOperand<OtherVT, PCRel24> { + let EncoderMethod = "getPC24DBLBPPEncoding"; + let DecoderMethod = "decodePC24DBLBranchOperand"; +} + // Variants of brtarget16/32 with an optional additional TLS symbol. 
// These are used to annotate calls to __tls_get_offset. def tlssym : Operand<i64> { } @@ -498,6 +532,7 @@ def BDAddr64Disp20 : AddressAsmOperand<"BDAddr", "64", "20">; def BDXAddr64Disp12 : AddressAsmOperand<"BDXAddr", "64", "12">; def BDXAddr64Disp20 : AddressAsmOperand<"BDXAddr", "64", "20">; def BDLAddr64Disp12Len8 : AddressAsmOperand<"BDLAddr", "64", "12", "Len8">; +def BDRAddr64Disp12 : AddressAsmOperand<"BDRAddr", "64", "12">; def BDVAddr64Disp12 : AddressAsmOperand<"BDVAddr", "64", "12">; // DAG patterns and operands for addressing modes. Each mode has @@ -544,23 +579,13 @@ def dynalloc12only : BDXMode<"DynAlloc", "64", "12", "Only">; def laaddr12pair : BDXMode<"LAAddr", "64", "12", "Pair">; def laaddr20pair : BDXMode<"LAAddr", "64", "20", "Pair">; def bdladdr12onlylen8 : BDLMode<"BDLAddr", "64", "12", "Only", "8">; +def bdraddr12only : BDRMode<"BDRAddr", "64", "12", "Only">; def bdvaddr12only : BDVMode< "64", "12">; //===----------------------------------------------------------------------===// // Miscellaneous //===----------------------------------------------------------------------===// -// Access registers. At present we just use them for accessing the thread -// pointer, so we don't expose them as register to LLVM. -def AccessReg : AsmOperandClass { - let Name = "AccessReg"; - let ParserMethod = "parseAccessReg"; -} -def access_reg : Immediate<i32, [{ return N->getZExtValue() < 16; }], - NOOP_SDNodeXForm, "AccessReg"> { - let ParserMatchClass = AccessReg; -} - // A 4-bit condition-code mask. 
def cond4 : PatLeaf<(i32 imm), [{ return (N->getZExtValue() < 16); }]>, Operand<i32> { diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZOperators.td b/contrib/llvm/lib/Target/SystemZ/SystemZOperators.td index 8d031f1..fde26ed 100644 --- a/contrib/llvm/lib/Target/SystemZ/SystemZOperators.td +++ b/contrib/llvm/lib/Target/SystemZ/SystemZOperators.td @@ -35,9 +35,6 @@ def SDT_ZWrapOffset : SDTypeProfile<1, 2, SDTCisSameAs<0, 2>, SDTCisPtrTy<0>]>; def SDT_ZAdjDynAlloc : SDTypeProfile<1, 0, [SDTCisVT<0, i64>]>; -def SDT_ZExtractAccess : SDTypeProfile<1, 1, - [SDTCisVT<0, i32>, - SDTCisVT<1, i32>]>; def SDT_ZGR128Binary32 : SDTypeProfile<1, 2, [SDTCisVT<0, untyped>, SDTCisVT<1, untyped>, @@ -186,8 +183,6 @@ def z_br_ccmask : SDNode<"SystemZISD::BR_CCMASK", SDT_ZBRCCMask, def z_select_ccmask : SDNode<"SystemZISD::SELECT_CCMASK", SDT_ZSelectCCMask, [SDNPInGlue]>; def z_adjdynalloc : SDNode<"SystemZISD::ADJDYNALLOC", SDT_ZAdjDynAlloc>; -def z_extract_access : SDNode<"SystemZISD::EXTRACT_ACCESS", - SDT_ZExtractAccess>; def z_popcnt : SDNode<"SystemZISD::POPCNT", SDTIntUnaryOp>; def z_umul_lohi64 : SDNode<"SystemZISD::UMUL_LOHI64", SDT_ZGR128Binary64>; def z_sdivrem32 : SDNode<"SystemZISD::SDIVREM32", SDT_ZGR128Binary32>; @@ -387,15 +382,6 @@ def zext8 : PatFrag<(ops node:$src), (and node:$src, 0xff)>; def zext16 : PatFrag<(ops node:$src), (and node:$src, 0xffff)>; def zext32 : PatFrag<(ops node:$src), (zext (i32 node:$src))>; -// Match extensions of an i32 to an i64, followed by an AND of the low -// i8 or i16 part. -def zext8dbl : PatFrag<(ops node:$src), (zext8 (anyext node:$src))>; -def zext16dbl : PatFrag<(ops node:$src), (zext16 (anyext node:$src))>; - -// Typed floating-point loads. -def loadf32 : PatFrag<(ops node:$src), (f32 (load node:$src))>; -def loadf64 : PatFrag<(ops node:$src), (f64 (load node:$src))>; - // Extending loads in which the extension type can be signed. 
def asextload : PatFrag<(ops node:$ptr), (unindexedload node:$ptr), [{ unsigned Type = cast<LoadSDNode>(N)->getExtensionType(); @@ -529,7 +515,7 @@ def inserthf : PatFrag<(ops node:$src1, node:$src2), // ORs that can be treated as insertions. def or_as_inserti8 : PatFrag<(ops node:$src1, node:$src2), (or node:$src1, node:$src2), [{ - unsigned BitWidth = N->getValueType(0).getScalarType().getSizeInBits(); + unsigned BitWidth = N->getValueType(0).getScalarSizeInBits(); return CurDAG->MaskedValueIsZero(N->getOperand(0), APInt::getLowBitsSet(BitWidth, 8)); }]>; @@ -537,7 +523,7 @@ def or_as_inserti8 : PatFrag<(ops node:$src1, node:$src2), // ORs that can be treated as reversed insertions. def or_as_revinserti8 : PatFrag<(ops node:$src1, node:$src2), (or node:$src1, node:$src2), [{ - unsigned BitWidth = N->getValueType(0).getScalarType().getSizeInBits(); + unsigned BitWidth = N->getValueType(0).getScalarSizeInBits(); return CurDAG->MaskedValueIsZero(N->getOperand(1), APInt::getLowBitsSet(BitWidth, 8)); }]>; @@ -584,6 +570,12 @@ class storeu<SDPatternOperator operator, SDPatternOperator store = store> : PatFrag<(ops node:$value, node:$addr), (store (operator node:$value), node:$addr)>; +// Create a store operator that performs the given inherent operation +// and stores the resulting value. +class storei<SDPatternOperator operator, SDPatternOperator store = store> + : PatFrag<(ops node:$addr), + (store (operator), node:$addr)>; + // Vector representation of all-zeros and all-ones. 
 def z_vzero : PatFrag<(ops), (bitconvert (v16i8 (z_byte_mask (i32 0))))>;
 def z_vones : PatFrag<(ops), (bitconvert (v16i8 (z_byte_mask (i32 65535))))>;
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZProcessors.td b/contrib/llvm/lib/Target/SystemZ/SystemZProcessors.td
index 9adc018..1cdc094 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZProcessors.td
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZProcessors.td
@@ -7,96 +7,29 @@
 //
 //===----------------------------------------------------------------------===//
 //
-// Processor and feature definitions.
+// Processor definitions.
+//
+// For compatibility with other compilers on the platform, each model can
+// be identifed either by the system name (e.g. z10) or the level of the
+// architecture the model supports, as identified by the edition level
+// of the z/Architecture Principles of Operation document (e.g. arch8).
+//
+// The minimum architecture level supported by LLVM is as defined in
+// the Eighth Edition of the PoP (i.e. as implemented on z10).
 //
 //===----------------------------------------------------------------------===//
 
-class SystemZFeature<string extname, string intname, string desc>
-  : Predicate<"Subtarget->has"##intname##"()">,
-    AssemblerPredicate<"Feature"##intname, extname>,
-    SubtargetFeature<extname, "Has"##intname, "true", desc>;
-
-class SystemZMissingFeature<string intname>
-  : Predicate<"!Subtarget->has"##intname##"()">;
-
-def FeatureDistinctOps : SystemZFeature<
-  "distinct-ops", "DistinctOps",
-  "Assume that the distinct-operands facility is installed"
->;
-
-def FeatureLoadStoreOnCond : SystemZFeature<
-  "load-store-on-cond", "LoadStoreOnCond",
-  "Assume that the load/store-on-condition facility is installed"
->;
-
-def FeatureLoadStoreOnCond2 : SystemZFeature<
-  "load-store-on-cond-2", "LoadStoreOnCond2",
-  "Assume that the load/store-on-condition facility 2 is installed"
->;
-
-def FeatureHighWord : SystemZFeature<
-  "high-word", "HighWord",
-  "Assume that the high-word facility is installed"
->;
-
-def FeatureFPExtension : SystemZFeature<
-  "fp-extension", "FPExtension",
-  "Assume that the floating-point extension facility is installed"
->;
-
-def FeaturePopulationCount : SystemZFeature<
-  "population-count", "PopulationCount",
-  "Assume that the population-count facility is installed"
->;
-
-def FeatureFastSerialization : SystemZFeature<
-  "fast-serialization", "FastSerialization",
-  "Assume that the fast-serialization facility is installed"
->;
-
-def FeatureInterlockedAccess1 : SystemZFeature<
-  "interlocked-access1", "InterlockedAccess1",
-  "Assume that interlocked-access facility 1 is installed"
->;
-def FeatureNoInterlockedAccess1 : SystemZMissingFeature<"InterlockedAccess1">;
+def : ProcessorModel<"generic", NoSchedModel, []>;
 
-def FeatureMiscellaneousExtensions : SystemZFeature<
-  "miscellaneous-extensions", "MiscellaneousExtensions",
-  "Assume that the miscellaneous-extensions facility is installed"
->;
+def : ProcessorModel<"arch8", NoSchedModel, Arch8SupportedFeatures.List>;
+def : ProcessorModel<"z10", NoSchedModel, Arch8SupportedFeatures.List>;
 
-def FeatureTransactionalExecution : SystemZFeature<
-  "transactional-execution", "TransactionalExecution",
-  "Assume that the transactional-execution facility is installed"
->;
+def : ProcessorModel<"arch9", Z196Model, Arch9SupportedFeatures.List>;
+def : ProcessorModel<"z196", Z196Model, Arch9SupportedFeatures.List>;
 
-def FeatureProcessorAssist : SystemZFeature<
-  "processor-assist", "ProcessorAssist",
-  "Assume that the processor-assist facility is installed"
->;
+def : ProcessorModel<"arch10", ZEC12Model, Arch10SupportedFeatures.List>;
+def : ProcessorModel<"zEC12", ZEC12Model, Arch10SupportedFeatures.List>;
 
-def FeatureVector : SystemZFeature<
-  "vector", "Vector",
-  "Assume that the vectory facility is installed"
->;
-def FeatureNoVector : SystemZMissingFeature<"Vector">;
+def : ProcessorModel<"arch11", Z13Model, Arch11SupportedFeatures.List>;
+def : ProcessorModel<"z13", Z13Model, Arch11SupportedFeatures.List>;
 
-def : Processor<"generic", NoItineraries, []>;
-def : Processor<"z10", NoItineraries, []>;
-def : Processor<"z196", NoItineraries,
-    [FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
-    FeatureFPExtension, FeaturePopulationCount,
-    FeatureFastSerialization, FeatureInterlockedAccess1]>;
-def : Processor<"zEC12", NoItineraries,
-    [FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
-    FeatureFPExtension, FeaturePopulationCount,
-    FeatureFastSerialization, FeatureInterlockedAccess1,
-    FeatureMiscellaneousExtensions,
-    FeatureTransactionalExecution, FeatureProcessorAssist]>;
-def : Processor<"z13", NoItineraries,
-    [FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
-    FeatureFPExtension, FeaturePopulationCount,
-    FeatureFastSerialization, FeatureInterlockedAccess1,
-    FeatureMiscellaneousExtensions,
-    FeatureTransactionalExecution, FeatureProcessorAssist,
-    FeatureVector, FeatureLoadStoreOnCond2]>;
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZRegisterInfo.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZRegisterInfo.cpp
index b5e5fd4..6ef8000 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZRegisterInfo.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZRegisterInfo.cpp
@@ -59,6 +59,11 @@ SystemZRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
   Reserved.set(SystemZ::R15L);
   Reserved.set(SystemZ::R15H);
   Reserved.set(SystemZ::R14Q);
+
+  // A0 and A1 hold the thread pointer.
+  Reserved.set(SystemZ::A0);
+  Reserved.set(SystemZ::A1);
+
   return Reserved;
 }
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZRegisterInfo.td b/contrib/llvm/lib/Target/SystemZ/SystemZRegisterInfo.td
index 0d8b08b..47d2f75 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZRegisterInfo.td
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZRegisterInfo.td
@@ -36,15 +36,16 @@ def subreg_hr32 : ComposedSubRegIndex<subreg_h64, subreg_r32>;
 // associated operand called NAME. SIZE is the size and alignment
 // of the registers and REGLIST is the list of individual registers.
 multiclass SystemZRegClass<string name, list<ValueType> types, int size,
-    dag regList> {
+    dag regList, bit allocatable = 1> {
   def AsmOperand : AsmOperandClass {
     let Name = name;
     let ParserMethod = "parse"##name;
     let RenderMethod = "addRegOperands";
   }
-  def Bit : RegisterClass<"SystemZ", types, size, regList> {
-    let Size = size;
-  }
+  let isAllocatable = allocatable in
+    def Bit : RegisterClass<"SystemZ", types, size, regList> {
+      let Size = size;
+    }
   def "" : RegisterOperand<!cast<RegisterClass>(name##"Bit")> {
     let ParserMatchClass = !cast<AsmOperandClass>(name##"AsmOperand");
   }
@@ -121,6 +122,14 @@ defm ADDR64 : SystemZRegClass<"ADDR64", [i64], 64, (sub GR64Bit, R0D)>;
 // of a GR128.
 defm ADDR128 : SystemZRegClass<"ADDR128", [untyped], 128, (sub GR128Bit, R0Q)>;
 
+// Any type register. Used for .insn directives when we don't know what the
+// register types could be.
+defm AnyReg : SystemZRegClass<"AnyReg",
+    [i64, f64, v8i8, v4i16, v2i32, v2f32], 64,
+    (add (sequence "R%uD", 0, 15),
+    (sequence "F%uD", 0, 15),
+    (sequence "V%u", 0, 15))>;
+
 //===----------------------------------------------------------------------===//
 // Floating-point registers
 //===----------------------------------------------------------------------===//
@@ -284,3 +293,14 @@ def v128any : TypedReg<untyped, VR128>;
 def CC : SystemZReg<"cc">;
 let isAllocatable = 0 in
   def CCRegs : RegisterClass<"SystemZ", [i32], 32, (add CC)>;
+
+// Access registers.
+class ACR32<bits<16> num, string n> : SystemZReg<n> {
+  let HWEncoding = num;
+}
+foreach I = 0-15 in {
+  def A#I : ACR32<I, "a"#I>, DwarfRegNum<[!add(I, 48)]>;
+}
+defm AR32 : SystemZRegClass<"AR32", [i32], 32,
+    (add (sequence "A%u", 0, 15)), 0>;
+
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZSchedule.td b/contrib/llvm/lib/Target/SystemZ/SystemZSchedule.td
new file mode 100644
index 0000000..dbba8ab
--- /dev/null
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZSchedule.td
@@ -0,0 +1,77 @@
+//==-- SystemZSchedule.td - SystemZ Scheduling Definitions ----*- tblgen -*-==//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+
+// Scheduler resources
+// Resources ending with a '2' use that resource for 2 cycles. An instruction
+// using two such resources use the mapped unit for 4 cycles, and 2 is added
+// to the total number of uops of the sched class.
+
+// These three resources are used to express decoder grouping rules.
+// The number of decoder slots needed by an instructions is normally
+// one. For a cracked instruction (BeginGroup && !EndGroup) it is
+// two. Expanded instructions (BeginGroup && EndGroup) group alone.
+def GroupAlone : SchedWrite;
+def BeginGroup : SchedWrite;
+def EndGroup : SchedWrite;
+
+// Latencies, to make code a bit neater. If more than one resource is
+// used for an instruction, the greatest latency (not the sum) will be
+// output by Tablegen. Therefore, in such cases one of these resources
+// is needed.
+def Lat2 : SchedWrite;
+def Lat3 : SchedWrite;
+def Lat4 : SchedWrite;
+def Lat5 : SchedWrite;
+def Lat6 : SchedWrite;
+def Lat7 : SchedWrite;
+def Lat8 : SchedWrite;
+def Lat9 : SchedWrite;
+def Lat10 : SchedWrite;
+def Lat11 : SchedWrite;
+def Lat12 : SchedWrite;
+def Lat15 : SchedWrite;
+def Lat20 : SchedWrite;
+def Lat30 : SchedWrite;
+
+// Fixed-point
+def FXa : SchedWrite;
+def FXa2 : SchedWrite;
+def FXb : SchedWrite;
+def FXU : SchedWrite;
+
+// Load/store unit
+def LSU : SchedWrite;
+
+// Model a return without latency, otherwise if-converter will model
+// extra cost and abort (currently there is an assert that checks that
+// all instructions have at least one uop).
+def LSU_lat1 : SchedWrite;
+
+// Floating point unit (zEC12 and earlier)
+def FPU : SchedWrite;
+def FPU2 : SchedWrite;
+
+// Vector sub units (z13)
+def VecBF : SchedWrite;
+def VecBF2 : SchedWrite;
+def VecDF : SchedWrite;
+def VecDF2 : SchedWrite;
+def VecFPd : SchedWrite; // Blocking BFP div/sqrt unit.
+def VecMul : SchedWrite;
+def VecStr : SchedWrite;
+def VecXsPm : SchedWrite;
+
+// Virtual branching unit
+def VBU : SchedWrite;
+
+
+include "SystemZScheduleZ13.td"
+include "SystemZScheduleZEC12.td"
+include "SystemZScheduleZ196.td"
+
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZScheduleZ13.td b/contrib/llvm/lib/Target/SystemZ/SystemZScheduleZ13.td
new file mode 100644
index 0000000..e97d61d
--- /dev/null
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZScheduleZ13.td
@@ -0,0 +1,1064 @@
+//-- SystemZScheduleZ13.td - SystemZ Scheduling Definitions ----*- tblgen -*-=//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the machine model for Z13 to support instruction
+// scheduling and other instruction cost heuristics.
+//
+//===----------------------------------------------------------------------===//
+
+def Z13Model : SchedMachineModel {
+
+    let UnsupportedFeatures = Arch11UnsupportedFeatures.List;
+
+    let IssueWidth = 8;
+    let MicroOpBufferSize = 60; // Issue queues
+    let LoadLatency = 1; // Optimistic load latency.
+
+    let PostRAScheduler = 1;
+
+    // Extra cycles for a mispredicted branch.
+    let MispredictPenalty = 20;
+}
+
+let SchedModel = Z13Model in {
+
+// These definitions could be put in a subtarget common include file,
+// but it seems the include system in Tablegen currently rejects
+// multiple includes of same file.
+def : WriteRes<GroupAlone, []> {
+    let NumMicroOps = 0;
+    let BeginGroup = 1;
+    let EndGroup = 1;
+}
+def : WriteRes<BeginGroup, []> {
+    let NumMicroOps = 0;
+    let BeginGroup = 1;
+}
+def : WriteRes<EndGroup, []> {
+    let NumMicroOps = 0;
+    let EndGroup = 1;
+}
+def : WriteRes<Lat2, []> { let Latency = 2; let NumMicroOps = 0;}
+def : WriteRes<Lat3, []> { let Latency = 3; let NumMicroOps = 0;}
+def : WriteRes<Lat4, []> { let Latency = 4; let NumMicroOps = 0;}
+def : WriteRes<Lat5, []> { let Latency = 5; let NumMicroOps = 0;}
+def : WriteRes<Lat6, []> { let Latency = 6; let NumMicroOps = 0;}
+def : WriteRes<Lat7, []> { let Latency = 7; let NumMicroOps = 0;}
+def : WriteRes<Lat8, []> { let Latency = 8; let NumMicroOps = 0;}
+def : WriteRes<Lat9, []> { let Latency = 9; let NumMicroOps = 0;}
+def : WriteRes<Lat10, []> { let Latency = 10; let NumMicroOps = 0;}
+def : WriteRes<Lat11, []> { let Latency = 11; let NumMicroOps = 0;}
+def : WriteRes<Lat12, []> { let Latency = 12; let NumMicroOps = 0;}
+def : WriteRes<Lat15, []> { let Latency = 15; let NumMicroOps = 0;}
+def : WriteRes<Lat20, []> { let Latency = 20; let NumMicroOps = 0;}
+def : WriteRes<Lat30, []> { let Latency = 30; let NumMicroOps = 0;}
+
+// Execution units.
+def Z13_FXaUnit : ProcResource<2>;
+def Z13_FXbUnit : ProcResource<2>;
+def Z13_LSUnit : ProcResource<2>;
+def Z13_VecUnit : ProcResource<2>;
+def Z13_VecFPdUnit : ProcResource<2> { let BufferSize = 1; /* blocking */ }
+def Z13_VBUnit : ProcResource<2>;
+
+// Subtarget specific definitions of scheduling resources.
+def : WriteRes<FXa, [Z13_FXaUnit]> { let Latency = 1; }
+def : WriteRes<FXa2, [Z13_FXaUnit, Z13_FXaUnit]> { let Latency = 2; }
+def : WriteRes<FXb, [Z13_FXbUnit]> { let Latency = 1; }
+def : WriteRes<LSU, [Z13_LSUnit]> { let Latency = 4; }
+def : WriteRes<VecBF, [Z13_VecUnit]> { let Latency = 8; }
+def : WriteRes<VecBF2, [Z13_VecUnit, Z13_VecUnit]> { let Latency = 9; }
+def : WriteRes<VecDF, [Z13_VecUnit]> { let Latency = 8; }
+def : WriteRes<VecDF2, [Z13_VecUnit, Z13_VecUnit]> { let Latency = 9; }
+def : WriteRes<VecFPd, [Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
+                        Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
+                        Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
+                        Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
+                        Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
+                        Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
+                        Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
+                        Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
+                        Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit,
+                        Z13_VecFPdUnit, Z13_VecFPdUnit, Z13_VecFPdUnit]>
+    { let Latency = 30; }
+def : WriteRes<VecMul, [Z13_VecUnit]> { let Latency = 5; }
+def : WriteRes<VecStr, [Z13_VecUnit]> { let Latency = 4; }
+def : WriteRes<VecXsPm, [Z13_VecUnit]> { let Latency = 3; }
+def : WriteRes<VBU, [Z13_VBUnit]>; // Virtual Branching Unit
+
+// -------------------------- INSTRUCTIONS ---------------------------------- //
+
+// InstRW constructs have been used in order to preserve the
+// readability of the InstrInfo files.
+
+// For each instruction, as matched by a regexp, provide a list of
+// resources that it needs. These will be combined into a SchedClass.
+
+//===----------------------------------------------------------------------===//
+// Stack allocation
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa], (instregex "ADJDYNALLOC$")>; // Pseudo -> LA / LAY
+
+//===----------------------------------------------------------------------===//
+// Branch instructions
+//===----------------------------------------------------------------------===//
+
+// Branch
+def : InstRW<[VBU], (instregex "(Call)?BRC(L)?(Asm.*)?$")>;
+def : InstRW<[VBU], (instregex "(Call)?J(G)?(Asm.*)?$")>;
+def : InstRW<[FXb], (instregex "(Call)?BC(R)?(Asm.*)?$")>;
+def : InstRW<[FXb], (instregex "(Call)?B(R)?(Asm.*)?$")>;
+def : InstRW<[FXa, EndGroup], (instregex "BRCT(G)?$")>;
+def : InstRW<[FXb, FXa, Lat2, GroupAlone], (instregex "BRCTH$")>;
+def : InstRW<[FXb, FXa, Lat2, GroupAlone], (instregex "BCT(G)?(R)?$")>;
+def : InstRW<[FXa, FXa, FXb, FXb, Lat4, GroupAlone],
+             (instregex "B(R)?X(H|L).*$")>;
+
+// Compare and branch
+def : InstRW<[FXb], (instregex "C(L)?(G)?(I|R)J(Asm.*)?$")>;
+def : InstRW<[FXb, FXb, Lat2, GroupAlone],
+             (instregex "C(L)?(G)?(I|R)B(Call|Return|Asm.*)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Trap instructions
+//===----------------------------------------------------------------------===//
+
+// Trap
+def : InstRW<[VBU], (instregex "(Cond)?Trap$")>;
+
+// Compare and trap
+def : InstRW<[FXb], (instregex "C(G)?(I|R)T(Asm.*)?$")>;
+def : InstRW<[FXb], (instregex "CL(G)?RT(Asm.*)?$")>;
+def : InstRW<[FXb], (instregex "CL(F|G)IT(Asm.*)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "CL(G)?T(Asm.*)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Call and return instructions
+//===----------------------------------------------------------------------===//
+
+// Call
+def : InstRW<[VBU, FXa, FXa, Lat3, GroupAlone], (instregex "(Call)?BRAS$")>;
+def : InstRW<[FXa, FXa, FXb, Lat3, GroupAlone], (instregex "(Call)?BRASL$")>;
+def : InstRW<[FXa, FXa, FXb, Lat3, GroupAlone], (instregex "(Call)?BAS(R)?$")>;
+def : InstRW<[FXa, FXa, FXb, Lat3, GroupAlone], (instregex "TLS_(G|L)DCALL$")>;
+
+// Return
+def : InstRW<[FXb, EndGroup], (instregex "Return$")>;
+def : InstRW<[FXb], (instregex "CondReturn$")>;
+
+//===----------------------------------------------------------------------===//
+// Select instructions
+//===----------------------------------------------------------------------===//
+
+// Select pseudo
+def : InstRW<[FXa], (instregex "Select(32|64|32Mux)$")>;
+
+// CondStore pseudos
+def : InstRW<[FXa], (instregex "CondStore16(Inv)?$")>;
+def : InstRW<[FXa], (instregex "CondStore16Mux(Inv)?$")>;
+def : InstRW<[FXa], (instregex "CondStore32(Inv)?$")>;
+def : InstRW<[FXa], (instregex "CondStore32Mux(Inv)?$")>;
+def : InstRW<[FXa], (instregex "CondStore64(Inv)?$")>;
+def : InstRW<[FXa], (instregex "CondStore8(Inv)?$")>;
+def : InstRW<[FXa], (instregex "CondStore8Mux(Inv)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Move instructions
+//===----------------------------------------------------------------------===//
+
+// Moves
+def : InstRW<[FXb, LSU, Lat5], (instregex "MV(G|H)?HI$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "MVI(Y)?$")>;
+
+// Move character
+def : InstRW<[FXb, LSU, LSU, LSU, Lat8, GroupAlone], (instregex "MVC$")>;
+
+// Pseudo -> reg move
+def : InstRW<[FXa], (instregex "COPY(_TO_REGCLASS)?$")>;
+def : InstRW<[FXa], (instregex "EXTRACT_SUBREG$")>;
+def : InstRW<[FXa], (instregex "INSERT_SUBREG$")>;
+def : InstRW<[FXa], (instregex "REG_SEQUENCE$")>;
+def : InstRW<[FXa], (instregex "SUBREG_TO_REG$")>;
+
+// Loads
+def : InstRW<[LSU], (instregex "L(Y|FH|RL|Mux|CBB)?$")>;
+def : InstRW<[LSU], (instregex "LG(RL)?$")>;
+def : InstRW<[LSU], (instregex "L128$")>;
+
+def : InstRW<[FXa], (instregex "LLIH(F|H|L)$")>;
+def : InstRW<[FXa], (instregex "LLIL(F|H|L)$")>;
+
+def : InstRW<[FXa], (instregex "LG(F|H)I$")>;
+def : InstRW<[FXa], (instregex "LHI(Mux)?$")>;
+def : InstRW<[FXa], (instregex "LR(Mux)?$")>;
+
+// Load and zero rightmost byte
+def : InstRW<[LSU], (instregex "LZR(F|G)$")>;
+
+// Load and trap
+def : InstRW<[FXb, LSU, Lat5], (instregex "L(FH|G)?AT$")>;
+
+// Load and test
+def : InstRW<[FXa, LSU, Lat5], (instregex "LT(G)?$")>;
+def : InstRW<[FXa], (instregex "LT(G)?R$")>;
+
+// Stores
+def : InstRW<[FXb, LSU, Lat5], (instregex "STG(RL)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "ST128$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "ST(Y|FH|RL|Mux)?$")>;
+
+// String moves.
+def : InstRW<[LSU, Lat30, GroupAlone], (instregex "MVST$")>;
+
+//===----------------------------------------------------------------------===//
+// Conditional move instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, Lat2], (instregex "LOCRMux$")>;
+def : InstRW<[FXa, Lat2], (instregex "LOC(G|FH)?R(Asm.*)?$")>;
+def : InstRW<[FXa, Lat2], (instregex "LOC(G|H)?HI(Asm.*)?$")>;
+def : InstRW<[FXa, LSU, Lat6], (instregex "LOC(G|FH|Mux)?(Asm.*)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "STOC(G|FH|Mux)?(Asm.*)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Sign extensions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa], (instregex "L(B|H|G)R$")>;
+def : InstRW<[FXa], (instregex "LG(B|H|F)R$")>;
+
+def : InstRW<[FXa, LSU, Lat5], (instregex "LTGF$")>;
+def : InstRW<[FXa], (instregex "LTGFR$")>;
+
+def : InstRW<[FXa, LSU, Lat5], (instregex "LB(H|Mux)?$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "LH(Y)?$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "LH(H|Mux|RL)$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "LG(B|H|F)$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "LG(H|F)RL$")>;
+
+//===----------------------------------------------------------------------===//
+// Zero extensions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa], (instregex "LLCR(Mux)?$")>;
+def : InstRW<[FXa], (instregex "LLHR(Mux)?$")>;
+def : InstRW<[FXa], (instregex "LLG(C|H|F|T)R$")>;
+def : InstRW<[LSU], (instregex "LLC(Mux)?$")>;
+def : InstRW<[LSU], (instregex "LLH(Mux)?$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "LL(C|H)H$")>;
+def : InstRW<[LSU], (instregex "LLHRL$")>;
+def : InstRW<[LSU], (instregex "LLG(C|H|F|T|HRL|FRL)$")>;
+
+// Load and zero rightmost byte
+def : InstRW<[LSU], (instregex "LLZRGF$")>;
+
+// Load and trap
+def : InstRW<[FXb, LSU, Lat5], (instregex "LLG(F|T)?AT$")>;
+
+//===----------------------------------------------------------------------===//
+// Truncations
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXb, LSU, Lat5], (instregex "STC(H|Y|Mux)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "STH(H|Y|RL|Mux)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Multi-register moves
+//===----------------------------------------------------------------------===//
+
+// Load multiple (estimated average of 5 ops)
+def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone],
+             (instregex "LM(H|Y|G)?$")>;
+
+// Store multiple (estimated average of ceil(5/2) FXb ops)
+def : InstRW<[LSU, LSU, FXb, FXb, FXb, Lat10,
+              GroupAlone], (instregex "STM(G|H|Y)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Byte swaps
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa], (instregex "LRV(G)?R$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "LRV(G|H)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "STRV(G|H)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Load address instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa], (instregex "LA(Y|RL)?$")>;
+
+// Load the Global Offset Table address ( -> larl )
+def : InstRW<[FXa], (instregex "GOT$")>;
+
+//===----------------------------------------------------------------------===//
+// Absolute and Negation
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, Lat2], (instregex "LP(G)?R$")>;
+def : InstRW<[FXa, FXa, Lat3, BeginGroup], (instregex "L(N|P)GFR$")>;
+def : InstRW<[FXa, Lat2], (instregex "LN(R|GR)$")>;
+def : InstRW<[FXa], (instregex "LC(R|GR)$")>;
+def : InstRW<[FXa, FXa, Lat2, BeginGroup], (instregex "LCGFR$")>;
+
+//===----------------------------------------------------------------------===//
+// Insertion
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, LSU, Lat5], (instregex "IC(Y)?$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "IC32(Y)?$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "ICM(H|Y)?$")>;
+def : InstRW<[FXa], (instregex "II(F|H|L)Mux$")>;
+def : InstRW<[FXa], (instregex "IIHF(64)?$")>;
+def : InstRW<[FXa], (instregex "IIHH(64)?$")>;
+def : InstRW<[FXa], (instregex "IIHL(64)?$")>;
+def : InstRW<[FXa], (instregex "IILF(64)?$")>;
+def : InstRW<[FXa], (instregex "IILH(64)?$")>;
+def : InstRW<[FXa], (instregex "IILL(64)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Addition
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, LSU, Lat5], (instregex "A(Y)?$")>;
+def : InstRW<[FXa, LSU, Lat6], (instregex "AH(Y)?$")>;
+def : InstRW<[FXa], (instregex "AIH$")>;
+def : InstRW<[FXa], (instregex "AFI(Mux)?$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "AG$")>;
+def : InstRW<[FXa], (instregex "AGFI$")>;
+def : InstRW<[FXa], (instregex "AGHI(K)?$")>;
+def : InstRW<[FXa], (instregex "AGR(K)?$")>;
+def : InstRW<[FXa], (instregex "AHI(K)?$")>;
+def : InstRW<[FXa], (instregex "AHIMux(K)?$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "AL(Y)?$")>;
+def : InstRW<[FXa], (instregex "AL(FI|HSIK)$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "ALG(F)?$")>;
+def : InstRW<[FXa], (instregex "ALGHSIK$")>;
+def : InstRW<[FXa], (instregex "ALGF(I|R)$")>;
+def : InstRW<[FXa], (instregex "ALGR(K)?$")>;
+def : InstRW<[FXa], (instregex "ALR(K)?$")>;
+def : InstRW<[FXa], (instregex "AR(K)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "A(G)?SI$")>;
+
+// Logical addition with carry
+def : InstRW<[FXa, LSU, Lat6, GroupAlone], (instregex "ALC(G)?$")>;
+def : InstRW<[FXa, Lat2, GroupAlone], (instregex "ALC(G)?R$")>;
+
+// Add with sign extension (32 -> 64)
+def : InstRW<[FXa, LSU, Lat6], (instregex "AGF$")>;
+def : InstRW<[FXa, Lat2], (instregex "AGFR$")>;
+
+//===----------------------------------------------------------------------===//
+// Subtraction
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, LSU, Lat5], (instregex "S(G|Y)?$")>;
+def : InstRW<[FXa, LSU, Lat6], (instregex "SH(Y)?$")>;
+def : InstRW<[FXa], (instregex "SGR(K)?$")>;
+def : InstRW<[FXa], (instregex "SLFI$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "SL(G|GF|Y)?$")>;
+def : InstRW<[FXa], (instregex "SLGF(I|R)$")>;
+def : InstRW<[FXa], (instregex "SLGR(K)?$")>;
+def : InstRW<[FXa], (instregex "SLR(K)?$")>;
+def : InstRW<[FXa], (instregex "SR(K)?$")>;
+
+// Subtraction with borrow
+def : InstRW<[FXa, LSU, Lat6, GroupAlone], (instregex "SLB(G)?$")>;
+def : InstRW<[FXa, Lat2, GroupAlone], (instregex "SLB(G)?R$")>;
+
+// Subtraction with sign extension (32 -> 64)
+def : InstRW<[FXa, LSU, Lat6], (instregex "SGF$")>;
+def : InstRW<[FXa, Lat2], (instregex "SGFR$")>;
+
+//===----------------------------------------------------------------------===//
+// AND
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, LSU, Lat5], (instregex "N(G|Y)?$")>;
+def : InstRW<[FXa], (instregex "NGR(K)?$")>;
+def : InstRW<[FXa], (instregex "NI(FMux|HMux|LMux)$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "NI(Y)?$")>;
+def : InstRW<[FXa], (instregex "NIHF(64)?$")>;
+def : InstRW<[FXa], (instregex "NIHH(64)?$")>;
+def : InstRW<[FXa], (instregex "NIHL(64)?$")>;
+def : InstRW<[FXa], (instregex "NILF(64)?$")>;
+def : InstRW<[FXa], (instregex "NILH(64)?$")>;
+def : InstRW<[FXa], (instregex "NILL(64)?$")>;
+def : InstRW<[FXa], (instregex "NR(K)?$")>;
+def : InstRW<[LSU, LSU, FXb, Lat9, BeginGroup], (instregex "NC$")>;
+
+//===----------------------------------------------------------------------===//
+// OR
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, LSU, Lat5], (instregex "O(G|Y)?$")>;
+def : InstRW<[FXa], (instregex "OGR(K)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "OI(Y)?$")>;
+def : InstRW<[FXa], (instregex "OI(FMux|HMux|LMux)$")>;
+def : InstRW<[FXa], (instregex "OIHF(64)?$")>;
+def : InstRW<[FXa], (instregex "OIHH(64)?$")>;
+def : InstRW<[FXa], (instregex "OIHL(64)?$")>;
+def : InstRW<[FXa], (instregex "OILF(64)?$")>;
+def : InstRW<[FXa], (instregex "OILH(64)?$")>;
+def : InstRW<[FXa], (instregex "OILL(64)?$")>;
+def : InstRW<[FXa], (instregex "OR(K)?$")>;
+def : InstRW<[LSU, LSU, FXb, Lat9, BeginGroup], (instregex "OC$")>;
+
+//===----------------------------------------------------------------------===//
+// XOR
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, LSU, Lat5], (instregex "X(G|Y)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "XI(Y)?$")>;
+def : InstRW<[FXa], (instregex "XIFMux$")>;
+def : InstRW<[FXa], (instregex "XGR(K)?$")>;
+def : InstRW<[FXa], (instregex "XIHF(64)?$")>;
+def : InstRW<[FXa], (instregex "XILF(64)?$")>;
+def : InstRW<[FXa], (instregex "XR(K)?$")>;
+def : InstRW<[LSU, LSU, FXb, Lat9, BeginGroup], (instregex "XC$")>;
+
+//===----------------------------------------------------------------------===//
+// Multiplication
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, LSU, Lat10], (instregex "MS(GF|Y)?$")>;
+def : InstRW<[FXa, Lat6], (instregex "MS(R|FI)$")>;
+def : InstRW<[FXa, LSU, Lat12], (instregex "MSG$")>;
+def : InstRW<[FXa, Lat8], (instregex "MSGR$")>;
+def : InstRW<[FXa, Lat6], (instregex "MSGF(I|R)$")>;
+def : InstRW<[FXa, LSU, Lat15, GroupAlone], (instregex "MLG$")>;
+def : InstRW<[FXa, Lat9, GroupAlone], (instregex "MLGR$")>;
+def : InstRW<[FXa, Lat5], (instregex "MGHI$")>;
+def : InstRW<[FXa, Lat5], (instregex "MHI$")>;
+def : InstRW<[FXa, LSU, Lat9], (instregex "MH(Y)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Division and remainder
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, Lat30, GroupAlone], (instregex "DSG(F)?R$")>;
+def : InstRW<[LSU, FXa, Lat30, GroupAlone], (instregex "DSG(F)?$")>;
+def : InstRW<[FXa2, FXa2, Lat20, GroupAlone], (instregex "DLR$")>;
+def : InstRW<[FXa2, FXa2, Lat30, GroupAlone], (instregex "DLGR$")>;
+def : InstRW<[FXa2, FXa2, LSU, Lat30, GroupAlone], (instregex "DL(G)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Shifts
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa], (instregex "SLL(G|K)?$")>;
+def : InstRW<[FXa], (instregex "SRL(G|K)?$")>;
+def : InstRW<[FXa], (instregex "SRA(G|K)?$")>;
+def : InstRW<[FXa], (instregex "SLA(K)?$")>;
+
+// Rotate
+def : InstRW<[FXa, LSU, Lat6], (instregex "RLL(G)?$")>;
+
+// Rotate and insert
+def : InstRW<[FXa], (instregex "RISBG(N|32)?$")>;
+def : InstRW<[FXa], (instregex "RISBH(G|H|L)$")>;
+def : InstRW<[FXa], (instregex "RISBL(G|H|L)$")>;
+def : InstRW<[FXa], (instregex "RISBMux$")>;
+
+// Rotate and Select
+def : InstRW<[FXa, FXa, Lat3, BeginGroup], (instregex "R(N|O|X)SBG$")>;
+
+//===----------------------------------------------------------------------===//
+// Comparison
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXb, LSU, Lat5], (instregex "C(G|Y|Mux|RL)?$")>;
+def : InstRW<[FXb], (instregex "C(F|H)I(Mux)?$")>;
+def : InstRW<[FXb], (instregex "CG(F|H)I$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "CG(HSI|RL)$")>;
+def : InstRW<[FXb], (instregex "C(G)?R$")>;
+def : InstRW<[FXb], (instregex "CIH$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "CH(F|SI)$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "CL(Y|Mux|FHSI)?$")>;
+def : InstRW<[FXb], (instregex "CLFI(Mux)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "CLG(HRL|HSI)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "CLGF(RL)?$")>;
+def : InstRW<[FXb], (instregex "CLGF(I|R)$")>;
+def : InstRW<[FXb], (instregex "CLGR$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "CLGRL$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "CLH(F|RL|HSI)$")>;
+def : InstRW<[FXb], (instregex "CLIH$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "CLI(Y)?$")>;
+def : InstRW<[FXb], (instregex "CLR$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "CLRL$")>;
+
+// Compare halfword
+def : InstRW<[FXb, LSU, Lat6], (instregex "CH(Y|RL)?$")>;
+def : InstRW<[FXb, LSU, Lat6], (instregex "CGH(RL)?$")>;
+def : InstRW<[FXa, FXb, LSU, Lat6, BeginGroup], (instregex "CHHSI$")>;
+
+// Compare with sign extension (32 -> 64)
+def : InstRW<[FXb, LSU, Lat6], (instregex "CGF(RL)?$")>;
+def : InstRW<[FXb, Lat2], (instregex "CGFR$")>;
+
+// Compare logical character
+def : InstRW<[FXb, LSU, LSU, Lat9, BeginGroup], (instregex "CLC$")>;
+
+def : InstRW<[LSU, Lat30, GroupAlone], (instregex "CLST$")>;
+
+// Test under mask
+def : InstRW<[FXb, LSU, Lat5], (instregex "TM(Y)?$")>;
+def : InstRW<[FXb], (instregex "TM(H|L)Mux$")>;
+def : InstRW<[FXb], (instregex "TMHH(64)?$")>;
+def : InstRW<[FXb], (instregex "TMHL(64)?$")>;
+def : InstRW<[FXb], (instregex "TMLH(64)?$")>;
+def : InstRW<[FXb], (instregex "TMLL(64)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Prefetch and execution hint
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[LSU], (instregex "PFD(RL)?$")>;
+def : InstRW<[FXb, Lat2], (instregex "BPP$")>;
+def : InstRW<[FXb, EndGroup], (instregex "BPRP$")>;
+def : InstRW<[FXb], (instregex "NIAI$")>;
+
+//===----------------------------------------------------------------------===//
+// Atomic operations
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXb, EndGroup], (instregex "Serialize$")>;
+
+def : InstRW<[FXb, LSU, Lat5], (instregex "LAA(G)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "LAAL(G)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "LAN(G)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "LAO(G)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "LAX(G)?$")>;
+
+// Test and set
+def : InstRW<[FXb, LSU, Lat5, EndGroup], (instregex "TS$")>;
+
+// Compare and swap
+def : InstRW<[FXa, FXb, LSU, Lat6, GroupAlone], (instregex "CS(G|Y)?$")>;
+
+// Compare double and swap
+def : InstRW<[FXa, FXa, FXb, FXb, FXa, LSU, Lat10, GroupAlone],
+             (instregex "CDS(Y)?$")>;
+def : InstRW<[FXa, FXa, FXb, FXb, LSU, FXb, FXb, LSU, LSU, Lat20, GroupAlone],
+             (instregex "CDSG$")>;
+
+// Compare and swap and store
+def : InstRW<[FXa, Lat30, GroupAlone], (instregex "CSST$")>;
+
+// Perform locked operation
+def : InstRW<[LSU, Lat30, GroupAlone], (instregex "PLO$")>;
+
+// Load/store pair from/to quadword
+def : InstRW<[LSU, LSU, Lat5, GroupAlone], (instregex "LPQ$")>;
+def : InstRW<[FXb, FXb, LSU, Lat6, GroupAlone], (instregex "STPQ$")>;
+
+// Load pair disjoint
+def : InstRW<[LSU, LSU, Lat5, GroupAlone], (instregex "LPD(G)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Access registers
+//===----------------------------------------------------------------------===//
+
+// Extract/set/copy access register
+def : InstRW<[LSU], (instregex "(EAR|SAR|CPYA)$")>;
+
+// Load address extended
+def : InstRW<[LSU, FXa, Lat5, BeginGroup], (instregex "LAE(Y)?$")>;
+
+// Load/store access multiple (not modeled precisely)
+def : InstRW<[LSU, Lat30, GroupAlone], (instregex "(L|ST)AM(Y)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Program mask and addressing mode
+//===----------------------------------------------------------------------===//
+
+// Insert Program Mask
+def : InstRW<[FXa, Lat3, EndGroup], (instregex "IPM$")>;
+
+// Set Program Mask
+def : InstRW<[LSU, EndGroup], (instregex "SPM$")>;
+
+// Branch and link
+def : InstRW<[FXa, FXa, FXb, Lat5, GroupAlone], (instregex "BAL(R)?$")>;
+
+// Test addressing mode
+def : InstRW<[FXb], (instregex "TAM$")>;
+
+// Set addressing mode
+def : InstRW<[FXb, Lat2, EndGroup], (instregex "SAM(24|31|64)$")>;
+
+// Branch (and save) and set mode.
+def : InstRW<[FXa, FXb, Lat2, GroupAlone], (instregex "BSM$")>;
+def : InstRW<[FXa, FXa, FXb, Lat3, GroupAlone], (instregex "BASSM$")>;
+
+//===----------------------------------------------------------------------===//
+// Transactional execution
+//===----------------------------------------------------------------------===//
+
+// Transaction begin
+def : InstRW<[LSU, LSU, FXb, FXb, FXb, FXb, FXb, Lat15, GroupAlone],
+             (instregex "TBEGIN(C|_nofloat)?$")>;
+
+// Transaction end
+def : InstRW<[FXb, GroupAlone], (instregex "TEND$")>;
+
+// Transaction abort
+def : InstRW<[LSU, GroupAlone], (instregex "TABORT$")>;
+
+// Extract Transaction Nesting Depth
+def : InstRW<[FXa], (instregex "ETND$")>;
+
+// Nontransactional store
+def : InstRW<[FXb, LSU, Lat5], (instregex "NTSTG$")>;
+
+//===----------------------------------------------------------------------===//
+// Processor assist
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXb], (instregex "PPA$")>;
+
+//===----------------------------------------------------------------------===//
+// Miscellaneous Instructions.
+//===----------------------------------------------------------------------===//
+
+// Find leftmost one
+def : InstRW<[FXa, Lat6, GroupAlone], (instregex "FLOGR$")>;
+
+// Population count
+def : InstRW<[FXa, Lat3], (instregex "POPCNT$")>;
+
+// Extend
+def : InstRW<[FXa], (instregex "AEXT128_64$")>;
+def : InstRW<[FXa], (instregex "ZEXT128_(32|64)$")>;
+
+// String instructions
+def : InstRW<[FXa, LSU, Lat30], (instregex "SRST$")>;
+
+// Move with key
+def : InstRW<[FXa, FXa, FXb, LSU, Lat8, GroupAlone], (instregex "MVCK$")>;
+
+// Extract CPU Time
+def : InstRW<[FXa, Lat5, LSU], (instregex "ECTG$")>;
+
+// Execute
+def : InstRW<[FXb, GroupAlone], (instregex "EX(RL)?$")>;
+
+// Program return
+def : InstRW<[FXb, Lat30], (instregex "PR$")>;
+
+// Inline assembly
+def : InstRW<[LSU, LSU, LSU, FXa, FXa, FXb, Lat9, GroupAlone],
+             (instregex "STCK(F)?$")>;
+def : InstRW<[LSU, LSU, LSU, LSU, FXa, FXa, FXb, FXb, Lat11, GroupAlone],
+             (instregex "STCKE$")>;
+def : InstRW<[FXa, LSU, Lat5], (instregex "STFLE$")>;
+def : InstRW<[FXb, Lat30], (instregex "SVC$")>;
+
+// Store real address
+def : InstRW<[FXb, LSU, Lat5], (instregex "STRAG$")>;
+
+//===----------------------------------------------------------------------===//
+// .insn directive instructions
+//===----------------------------------------------------------------------===//
+
+// An "empty" sched-class will be assigned instead of the "invalid sched-class".
+// getNumDecoderSlots() will then return 1 instead of 0.
+def : InstRW<[], (instregex "Insn.*")>;
+
+
+// ----------------------------- Floating point ----------------------------- //
+
+//===----------------------------------------------------------------------===//
+// FP: Select instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa], (instregex "SelectF(32|64|128)$")>;
+def : InstRW<[FXa], (instregex "CondStoreF32(Inv)?$")>;
+def : InstRW<[FXa], (instregex "CondStoreF64(Inv)?$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Move instructions
+//===----------------------------------------------------------------------===//
+
+// Load zero
+def : InstRW<[FXb], (instregex "LZ(DR|ER)$")>;
+def : InstRW<[FXb, FXb, Lat2, BeginGroup], (instregex "LZXR$")>;
+
+// Load
+def : InstRW<[VecXsPm], (instregex "LER$")>;
+def : InstRW<[FXb], (instregex "LD(R|R32|GR)$")>;
+def : InstRW<[FXb, Lat3], (instregex "LGDR$")>;
+def : InstRW<[FXb, FXb, Lat2, GroupAlone], (instregex "LXR$")>;
+
+// Load and Test
+def : InstRW<[VecXsPm, Lat4], (instregex "LT(D|E)BR$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "LTEBRCompare(_VecPseudo)?$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "LTDBRCompare(_VecPseudo)?$")>;
+def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "LTXBR$")>;
+def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone],
+             (instregex "LTXBRCompare(_VecPseudo)?$")>;
+
+// Copy sign
+def : InstRW<[VecXsPm], (instregex "CPSDRd(d|s)$")>;
+def : InstRW<[VecXsPm], (instregex "CPSDRs(d|s)$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Load instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[VecXsPm, LSU, Lat7], (instregex "LE(Y)?$")>;
+def : InstRW<[LSU], (instregex "LD(Y|E32)?$")>;
+def : InstRW<[LSU], (instregex "LX$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Store instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXb, LSU, Lat7], (instregex "STD(Y)?$")>;
+def : InstRW<[FXb, LSU, Lat7], (instregex "STE(Y)?$")>;
+def : InstRW<[FXb, LSU, Lat5], (instregex "STX$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Conversion instructions
+//===----------------------------------------------------------------------===//
+
+// Load rounded
+def : InstRW<[VecBF], (instregex "LEDBR(A)?$")>;
+def : InstRW<[VecDF, VecDF, Lat20], (instregex "LEXBR(A)?$")>;
+def : InstRW<[VecDF, VecDF, Lat20], (instregex "LDXBR(A)?$")>;
+
+// Load lengthened
+def : InstRW<[VecBF, LSU, Lat12], (instregex "LDEB$")>;
+def : InstRW<[VecBF], (instregex "LDEBR$")>;
+def : InstRW<[VecBF2, VecBF2, LSU, Lat12 , GroupAlone], (instregex "LX(D|E)B$")>;
+def : InstRW<[VecBF2, VecBF2, GroupAlone], (instregex "LX(D|E)BR$")>;
+
+// Convert from fixed / logical
+def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CE(F|G)BR(A)?$")>;
+def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CD(F|G)BR(A)?$")>;
+def : InstRW<[FXb, VecDF2, VecDF2, Lat12, GroupAlone], (instregex "CX(F|G)BR(A)?$")>;
+def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CEL(F|G)BR$")>;
+def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CDL(F|G)BR$")>;
+def : InstRW<[FXb, VecDF2, VecDF2, Lat12, GroupAlone], (instregex "CXL(F|G)BR$")>;
+
+// Convert to fixed / logical
+def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CF(E|D)BR(A)?$")>;
+def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CG(E|D)BR(A)?$")>;
+def : InstRW<[FXb, VecDF, VecDF, Lat20, BeginGroup], (instregex "C(F|G)XBR(A)?$")>;
+def : InstRW<[FXb, VecBF, Lat11, GroupAlone], (instregex "CLFEBR$")>;
+def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CLFDBR$")>;
+def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CLG(E|D)BR$")>;
+def : InstRW<[FXb, VecDF, VecDF, Lat20, BeginGroup], (instregex "CL(F|G)XBR$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Unary arithmetic
+//===----------------------------------------------------------------------===//
+
+// Load Complement / Negative / Positive
+def : InstRW<[VecXsPm, Lat4], (instregex "L(C|N|P)DBR$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "L(C|N|P)EBR$")>;
+def : InstRW<[FXb], (instregex "LCDFR(_32)?$")>;
+def : InstRW<[FXb], (instregex "LNDFR(_32)?$")>;
+def : InstRW<[FXb], (instregex "LPDFR(_32)?$")>;
+def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "L(C|N|P)XBR$")>;
+
+// Square root
+def : InstRW<[VecFPd, LSU], (instregex "SQ(E|D)B$")>;
+def : InstRW<[VecFPd], (instregex "SQ(E|D)BR$")>;
+def : InstRW<[VecFPd, VecFPd, GroupAlone], (instregex "SQXBR$")>;
+
+// Load FP integer
+def : InstRW<[VecBF], (instregex "FIEBR(A)?$")>;
+def : InstRW<[VecBF], (instregex "FIDBR(A)?$")>;
+def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "FIXBR(A)?$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Binary arithmetic
+//===----------------------------------------------------------------------===//
+
+// Addition
+def : InstRW<[VecBF, LSU, Lat12], (instregex "A(E|D)B$")>;
+def : InstRW<[VecBF], (instregex "A(E|D)BR$")>;
+def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "AXBR$")>;
+
+// Subtraction
+def : InstRW<[VecBF, LSU, Lat12], (instregex "S(E|D)B$")>;
+def : InstRW<[VecBF], (instregex "S(E|D)BR$")>;
+def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "SXBR$")>;
+
+// Multiply
+def : InstRW<[VecBF, LSU, Lat12], (instregex "M(D|DE|EE)B$")>;
+def : InstRW<[VecBF], (instregex "M(D|DE|EE)BR$")>;
+def : InstRW<[VecBF2, VecBF2, LSU, Lat12, GroupAlone], (instregex "MXDB$")>;
+def : InstRW<[VecBF2, VecBF2, GroupAlone], (instregex "MXDBR$")>;
+def : InstRW<[VecDF2, VecDF2, Lat20, GroupAlone], (instregex "MXBR$")>;
+
+// Multiply and add / subtract
+def : InstRW<[VecBF, LSU, Lat12, GroupAlone], (instregex "M(A|S)EB$")>;
+def : InstRW<[VecBF, GroupAlone], (instregex "M(A|S)EBR$")>;
+def : InstRW<[VecBF, LSU, Lat12, GroupAlone], (instregex "M(A|S)DB$")>;
+def : InstRW<[VecBF], (instregex "M(A|S)DBR$")>;
+
+// Division
+def : InstRW<[VecFPd, LSU], (instregex "D(E|D)B$")>;
+def : InstRW<[VecFPd], (instregex "D(E|D)BR$")>;
+def : InstRW<[VecFPd, VecFPd, GroupAlone], (instregex "DXBR$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Comparisons
+//===----------------------------------------------------------------------===//
+
+// Compare
+def : InstRW<[VecXsPm, LSU, Lat8], (instregex "C(E|D)B$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "C(E|D)BR?$")>;
+def : InstRW<[VecDF, VecDF, Lat20, GroupAlone], (instregex "CXBR$")>;
+
+// Test Data Class
+def : InstRW<[LSU, VecXsPm, Lat9], (instregex "TC(E|D)B$")>;
+def : InstRW<[LSU, VecDF2, VecDF2, Lat15, GroupAlone], (instregex "TCXB$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Floating-point control register instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXa, LSU, Lat4, GroupAlone], (instregex "EFPC$")>;
+def : InstRW<[FXb, LSU, Lat5, GroupAlone], (instregex "STFPC$")>;
+def : InstRW<[LSU, Lat3, GroupAlone], (instregex "SFPC$")>;
+def : InstRW<[LSU, LSU, Lat6, GroupAlone], (instregex "LFPC$")>;
+def : InstRW<[FXa, Lat30, GroupAlone], (instregex "SFASR$")>;
+def : InstRW<[FXa, LSU, Lat30, GroupAlone], (instregex "LFAS$")>;
+def : InstRW<[FXb, Lat3, GroupAlone], (instregex "SRNM(B|T)?$")>;
+
+// --------------------------------- Vector --------------------------------- //
+
+//===----------------------------------------------------------------------===//
+// Vector: Move instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXb], (instregex "VLR(32|64)?$")>;
+def : InstRW<[FXb, Lat4], (instregex "VLGV(B|F|G|H)?$")>;
+def : InstRW<[FXb], (instregex "VLVG(B|F|G|H)?$")>;
+def : InstRW<[FXb, Lat2], (instregex "VLVGP(32)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: Immediate instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[VecXsPm], (instregex "VZERO$")>;
+def : InstRW<[VecXsPm], (instregex "VONE$")>;
+def : InstRW<[VecXsPm], (instregex "VGBM$")>;
+def : InstRW<[VecXsPm], (instregex "VGM(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VLEI(B|F|G|H)$")>;
+def : InstRW<[VecXsPm], (instregex "VREPI(B|F|G|H)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: Loads
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[LSU], (instregex "VL(L|BB)?$")>;
+def : InstRW<[LSU], (instregex "VL(32|64)$")>;
+def : InstRW<[LSU], (instregex "VLLEZ(B|F|G|H)?$")>;
+def : InstRW<[LSU], (instregex "VLREP(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm, LSU, Lat7], (instregex "VLE(B|F|G|H)$")>;
+def : InstRW<[FXb, LSU, VecXsPm, Lat11, BeginGroup], (instregex "VGE(F|G)$")>;
+def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone],
+             (instregex "VLM$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: Stores
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXb, LSU, Lat8], (instregex "VST(L|32|64)?$")>;
+def : InstRW<[FXb, LSU, Lat8], (instregex "VSTE(F|G)$")>;
+def : InstRW<[FXb, LSU, VecXsPm, Lat11, BeginGroup], (instregex "VSTE(B|H)$")>;
+def : InstRW<[LSU, LSU, FXb, FXb, FXb, FXb, FXb, Lat20, GroupAlone],
+             (instregex "VSTM$")>;
+def : InstRW<[FXb, FXb, LSU, Lat12, BeginGroup], (instregex "VSCE(F|G)$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: Selects and permutes
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[VecXsPm], (instregex "VMRH(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VMRL(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VPERM$")>;
+def : InstRW<[VecXsPm], (instregex "VPDI$")>;
+def : InstRW<[VecXsPm], (instregex "VREP(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VSEL$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: Widening and narrowing
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[VecXsPm], (instregex "VPK(F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VPKS(F|G|H)?$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "VPKS(F|G|H)S$")>;
+def : InstRW<[VecXsPm], (instregex "VPKLS(F|G|H)?$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "VPKLS(F|G|H)S$")>;
+def : InstRW<[VecXsPm], (instregex "VSEG(B|F|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VUPH(B|F|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VUPL(B|F)?$")>;
+def : InstRW<[VecXsPm], (instregex "VUPLH(B|F|H|W)?$")>;
+def : InstRW<[VecXsPm], (instregex "VUPLL(B|F|H)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: Integer arithmetic
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[VecXsPm], (instregex "VA(B|F|G|H|Q|C|CQ)?$")>;
+def : InstRW<[VecXsPm], (instregex "VACC(B|F|G|H|Q|C|CQ)?$")>;
+def : InstRW<[VecXsPm], (instregex "VAVG(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VAVGL(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VN(C|O)?$")>;
+def : InstRW<[VecXsPm], (instregex "VO$")>;
+def : InstRW<[VecMul], (instregex "VCKSM$")>;
+def : InstRW<[VecXsPm], (instregex "VCLZ(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VCTZ(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VX$")>;
+def : InstRW<[VecMul], (instregex "VGFM?$")>;
+def : InstRW<[VecMul], (instregex "VGFMA(B|F|G|H)?$")>;
+def : InstRW<[VecMul], (instregex "VGFM(B|F|G|H)$")>;
+def : InstRW<[VecXsPm], (instregex "VLC(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VLP(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VMX(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VMXL(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VMN(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VMNL(B|F|G|H)?$")>;
+def : InstRW<[VecMul], (instregex "VMAL(B|F)?$")>;
+def : InstRW<[VecMul], (instregex "VMALE(B|F|H)?$")>;
+def : InstRW<[VecMul], (instregex "VMALH(B|F|H|W)?$")>;
+def : InstRW<[VecMul], (instregex "VMALO(B|F|H)?$")>;
+def : InstRW<[VecMul], (instregex "VMAO(B|F|H)?$")>;
+def : InstRW<[VecMul], (instregex "VMAE(B|F|H)?$")>;
+def : InstRW<[VecMul], (instregex "VMAH(B|F|H)?$")>;
+def : InstRW<[VecMul], (instregex "VME(B|F|H)?$")>;
+def : InstRW<[VecMul], (instregex "VMH(B|F|H)?$")>;
+def : InstRW<[VecMul], (instregex "VML(B|F)?$")>;
+def : InstRW<[VecMul], (instregex "VMLE(B|F|H)?$")>;
+def : InstRW<[VecMul], (instregex "VMLH(B|F|H|W)?$")>;
+def : InstRW<[VecMul], (instregex "VMLO(B|F|H)?$")>;
+def : InstRW<[VecMul], (instregex "VMO(B|F|H)?$")>;
+
+def : InstRW<[VecXsPm], (instregex "VPOPCT$")>;
+
+def : InstRW<[VecXsPm], (instregex "VERLL(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VERLLV(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VERIM(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VESL(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VESLV(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VESRA(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VESRAV(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VESRL(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VESRLV(B|F|G|H)?$")>;
+
+def : InstRW<[VecXsPm], (instregex "VSL(DB)?$")>;
+def : InstRW<[VecXsPm, VecXsPm, Lat8], (instregex "VSLB$")>;
+def : InstRW<[VecXsPm], (instregex "VSR(A|L)$")>;
+def : InstRW<[VecXsPm, VecXsPm, Lat8], (instregex "VSR(A|L)B$")>;
+
+def : InstRW<[VecXsPm], (instregex "VSB(I|IQ|CBI|CBIQ)?$")>;
+def : InstRW<[VecXsPm], (instregex "VSCBI(B|F|G|H|Q)?$")>;
+def : InstRW<[VecXsPm], (instregex "VS(F|G|H|Q)?$")>;
+
+def : InstRW<[VecMul], (instregex "VSUM(B|H)?$")>;
+def : InstRW<[VecMul], (instregex "VSUMG(F|H)?$")>;
+def : InstRW<[VecMul], (instregex "VSUMQ(F|G)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: Integer comparison
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[VecXsPm, Lat4], (instregex "VEC(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "VECL(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm], (instregex "VCEQ(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "VCEQ(B|F|G|H)S$")>;
+def : InstRW<[VecXsPm], (instregex "VCH(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "VCH(B|F|G|H)S$")>;
+def : InstRW<[VecXsPm], (instregex "VCHL(B|F|G|H)?$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "VCHL(B|F|G|H)S$")>;
+def : InstRW<[VecStr, Lat5], (instregex "VTM$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: Floating-point arithmetic
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[VecBF2], (instregex "VCD(G|GB|LG|LGB)$")>;
+def : InstRW<[VecBF], (instregex "WCD(GB|LGB)$")>;
+def : InstRW<[VecBF2], (instregex "VC(L)?GD$")>;
+def : InstRW<[VecBF2], (instregex "VFADB$")>;
+def : InstRW<[VecBF], (instregex "WFADB$")>;
+def : InstRW<[VecBF2], (instregex "VCGDB$")>;
+def : InstRW<[VecBF], (instregex "WCGDB$")>;
+def : InstRW<[VecBF2], (instregex "VF(I|M|A|S)$")>;
+def : InstRW<[VecBF2], (instregex "VF(I|M|S)DB$")>;
+def : InstRW<[VecBF], (instregex "WF(I|M|S)DB$")>;
+def : InstRW<[VecBF2], (instregex "VCLGDB$")>;
+def : InstRW<[VecBF], (instregex "WCLGDB$")>;
+def : InstRW<[VecXsPm], (instregex "VFL(C|N|P)DB$")>;
+def : InstRW<[VecXsPm], (instregex "WFL(C|N|P)DB$")>;
+def : InstRW<[VecBF2], (instregex "VFM(A|S)$")>;
+def : InstRW<[VecBF2], (instregex "VFM(A|S)DB$")>;
+def : InstRW<[VecBF], (instregex "WFM(A|S)DB$")>;
+def : InstRW<[VecXsPm], (instregex "VFPSO$")>;
+def : InstRW<[VecXsPm], (instregex "(V|W)FPSODB$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "VFTCI(DB)?$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "WFTCIDB$")>;
+def : InstRW<[VecBF2], (instregex "VL(DE|ED)$")>;
+def : InstRW<[VecBF2], (instregex "VL(DE|ED)B$")>;
+def : InstRW<[VecBF], (instregex "WL(DE|ED)B$")>;
+
+// divide / square root
+def : InstRW<[VecFPd], (instregex "VFD$")>;
+def : InstRW<[VecFPd], (instregex "(V|W)FDDB$")>;
+def : InstRW<[VecFPd], (instregex "VFSQ$")>;
+def : InstRW<[VecFPd], (instregex "(V|W)FSQDB$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: Floating-point comparison
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[VecXsPm], (instregex "VFC(E|H|HE)$")>;
+def : InstRW<[VecXsPm], (instregex "VFC(E|H|HE)DB$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "WF(C|K)$")>;
+def : InstRW<[VecXsPm], (instregex "WFC(E|H|HE)DB$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "VFC(E|H|HE)DBS$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "WFC(E|H|HE)DBS$")>;
+def : InstRW<[VecXsPm, Lat4], (instregex "WF(C|K)DB$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: Floating-point insertion and extraction
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXb], (instregex "LEFR$")>;
+def : InstRW<[FXb, Lat4], (instregex "LFER$")>;
+
+//===----------------------------------------------------------------------===//
+// Vector: String instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[VecStr], (instregex "VFAE(B)?$")>;
+def : InstRW<[VecStr, Lat5], (instregex "VFAEBS$")>;
+def : InstRW<[VecStr], (instregex "VFAE(F|H)$")>;
+def : InstRW<[VecStr, Lat5], (instregex "VFAE(F|H)S$")>;
+def : InstRW<[VecStr], (instregex "VFAEZ(B|F|H)$")>;
+def : InstRW<[VecStr, Lat5], (instregex "VFAEZ(B|F|H)S$")>;
+def : InstRW<[VecStr], (instregex "VFEE(B|F|H|ZB|ZF|ZH)?$")>;
+def : InstRW<[VecStr, Lat5], (instregex "VFEE(B|F|H|ZB|ZF|ZH)S$")>;
+def : InstRW<[VecStr], (instregex "VFENE(B|F|H|ZB|ZF|ZH)?$")>;
+def : InstRW<[VecStr, Lat5], (instregex "VFENE(B|F|H|ZB|ZF|ZH)S$")>;
+def : InstRW<[VecStr], (instregex "VISTR(B|F|H)?$")>;
+def : InstRW<[VecStr, Lat5], (instregex "VISTR(B|F|H)S$")>;
+def : InstRW<[VecStr], (instregex "VSTRC(B|F|H)?$")>;
+def : InstRW<[VecStr, Lat5], (instregex "VSTRC(B|F|H)S$")>;
+def : InstRW<[VecStr], (instregex "VSTRCZ(B|F|H)$")>;
+def : InstRW<[VecStr, Lat5], (instregex "VSTRCZ(B|F|H)S$")>;
+
+}
+
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZScheduleZ196.td b/contrib/llvm/lib/Target/SystemZ/SystemZScheduleZ196.td
new file mode 100644
index 0000000..a950e54
--- /dev/null
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZScheduleZ196.td
@@ -0,0 +1,769 @@
+//=- SystemZScheduleZ196.td - SystemZ Scheduling Definitions ---*- tblgen -*-=//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the machine model for Z196 to support instruction
+// scheduling and other instruction cost heuristics.
+//
+//===----------------------------------------------------------------------===//
+
+def Z196Model : SchedMachineModel {
+
+  let UnsupportedFeatures = Arch9UnsupportedFeatures.List;
+
+  let IssueWidth = 5;
+  let MicroOpBufferSize = 40; // Issue queues
+  let LoadLatency = 1; // Optimistic load latency.
+
+  let PostRAScheduler = 1;
+
+  // Extra cycles for a mispredicted branch.
+  let MispredictPenalty = 16;
+}
+
+let SchedModel = Z196Model in {
+
+// These definitions could be put in a subtarget common include file,
+// but it seems the include system in Tablegen currently rejects
+// multiple includes of same file.
+def : WriteRes<GroupAlone, []> {
+  let NumMicroOps = 0;
+  let BeginGroup = 1;
+  let EndGroup = 1;
+}
+def : WriteRes<EndGroup, []> {
+  let NumMicroOps = 0;
+  let EndGroup = 1;
+}
+def : WriteRes<Lat2, []> { let Latency = 2; let NumMicroOps = 0;}
+def : WriteRes<Lat3, []> { let Latency = 3; let NumMicroOps = 0;}
+def : WriteRes<Lat4, []> { let Latency = 4; let NumMicroOps = 0;}
+def : WriteRes<Lat5, []> { let Latency = 5; let NumMicroOps = 0;}
+def : WriteRes<Lat6, []> { let Latency = 6; let NumMicroOps = 0;}
+def : WriteRes<Lat7, []> { let Latency = 7; let NumMicroOps = 0;}
+def : WriteRes<Lat8, []> { let Latency = 8; let NumMicroOps = 0;}
+def : WriteRes<Lat9, []> { let Latency = 9; let NumMicroOps = 0;}
+def : WriteRes<Lat10, []> { let Latency = 10; let NumMicroOps = 0;}
+def : WriteRes<Lat11, []> { let Latency = 11; let NumMicroOps = 0;}
+def : WriteRes<Lat12, []> { let Latency = 12; let NumMicroOps = 0;}
+def : WriteRes<Lat15, []> { let Latency = 15; let NumMicroOps = 0;}
+def : WriteRes<Lat20, []> { let Latency = 20; let NumMicroOps = 0;}
+def : WriteRes<Lat30, []> { let Latency = 30; let NumMicroOps = 0;}
+
+// Execution units.
+def Z196_FXUnit : ProcResource<2>;
+def Z196_LSUnit : ProcResource<2>;
+def Z196_FPUnit : ProcResource<1>;
+
+// Subtarget specific definitions of scheduling resources.
+def : WriteRes<FXU, [Z196_FXUnit]> { let Latency = 1; }
+def : WriteRes<LSU, [Z196_LSUnit]> { let Latency = 4; }
+def : WriteRes<LSU_lat1, [Z196_LSUnit]> { let Latency = 1; }
+def : WriteRes<FPU, [Z196_FPUnit]> { let Latency = 8; }
+def : WriteRes<FPU2, [Z196_FPUnit, Z196_FPUnit]> { let Latency = 9; }
+
+// -------------------------- INSTRUCTIONS ---------------------------------- //
+
+// InstRW constructs have been used in order to preserve the
+// readability of the InstrInfo files.
+
+// For each instruction, as matched by a regexp, provide a list of
+// resources that it needs. These will be combined into a SchedClass.
+
+//===----------------------------------------------------------------------===//
+// Stack allocation
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXU], (instregex "ADJDYNALLOC$")>; // Pseudo -> LA / LAY
+
+//===----------------------------------------------------------------------===//
+// Branch instructions
+//===----------------------------------------------------------------------===//
+
+// Branch
+def : InstRW<[LSU, EndGroup], (instregex "(Call)?BRC(L)?(Asm.*)?$")>;
+def : InstRW<[LSU, EndGroup], (instregex "(Call)?J(G)?(Asm.*)?$")>;
+def : InstRW<[LSU, EndGroup], (instregex "(Call)?BC(R)?(Asm.*)?$")>;
+def : InstRW<[LSU, EndGroup], (instregex "(Call)?B(R)?(Asm.*)?$")>;
+def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "BRCT(G|H)?$")>;
+def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "BCT(G)?(R)?$")>;
+def : InstRW<[FXU, FXU, FXU, LSU, Lat7, GroupAlone],
+             (instregex "B(R)?X(H|L).*$")>;
+
+// Compare and branch
+def : InstRW<[FXU, LSU, Lat5, GroupAlone],
+             (instregex "C(L)?(G)?(I|R)J(Asm.*)?$")>;
+def : InstRW<[FXU, LSU, Lat5, GroupAlone],
+             (instregex "C(L)?(G)?(I|R)B(Call|Return|Asm.*)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Trap instructions
+//===----------------------------------------------------------------------===//
+
+// Trap
+def : InstRW<[LSU, EndGroup], (instregex "(Cond)?Trap$")>;
+
+// Compare and trap
+def : InstRW<[FXU], (instregex "C(G)?(I|R)T(Asm.*)?$")>;
+def : InstRW<[FXU], (instregex "CL(G)?RT(Asm.*)?$")>;
+def : InstRW<[FXU], (instregex "CL(F|G)IT(Asm.*)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Call and return instructions
+//===----------------------------------------------------------------------===//
+
+// Call
+def : InstRW<[LSU, FXU, FXU, Lat6, GroupAlone], (instregex "(Call)?BRAS$")>;
+def : InstRW<[LSU, FXU, FXU, Lat6, GroupAlone], (instregex "(Call)?BRASL$")>;
+def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "(Call)?BAS(R)?$")>;
+def : InstRW<[LSU, FXU, FXU, Lat6, GroupAlone], (instregex "TLS_(G|L)DCALL$")>;
+
+// Return
+def : InstRW<[LSU_lat1, EndGroup], (instregex "Return$")>;
+def : InstRW<[LSU_lat1, EndGroup], (instregex "CondReturn$")>;
+
+//===----------------------------------------------------------------------===//
+// Select instructions
+//===----------------------------------------------------------------------===//
+
+// Select pseudo
+def : InstRW<[FXU], (instregex "Select(32|64|32Mux)$")>;
+
+// CondStore pseudos
+def : InstRW<[FXU], (instregex "CondStore16(Inv)?$")>;
+def : InstRW<[FXU], (instregex "CondStore16Mux(Inv)?$")>;
+def : InstRW<[FXU], (instregex "CondStore32(Inv)?$")>;
+def : InstRW<[FXU], (instregex "CondStore64(Inv)?$")>;
+def : InstRW<[FXU], (instregex "CondStore8(Inv)?$")>;
+def : InstRW<[FXU], (instregex "CondStore8Mux(Inv)?$")>;
+
+//===----------------------------------------------------------------------===//
+// Move instructions
+//===----------------------------------------------------------------------===//
+
+// Moves
+def : InstRW<[FXU, LSU, Lat5], (instregex "MV(G|H)?HI$")>;
+def : InstRW<[FXU, LSU, Lat5], (instregex "MVI(Y)?$")>;
+
+// Move character
+def : InstRW<[LSU, LSU, LSU, FXU, Lat8, GroupAlone], (instregex "MVC$")>;
+
+// Pseudo -> reg move
+def : InstRW<[FXU], (instregex "COPY(_TO_REGCLASS)?$")>;
+def : InstRW<[FXU], (instregex "EXTRACT_SUBREG$")>;
+def : InstRW<[FXU], (instregex "INSERT_SUBREG$")>;
+def : InstRW<[FXU], (instregex "REG_SEQUENCE$")>;
+def : InstRW<[FXU], (instregex "SUBREG_TO_REG$")>;
+
+// Loads
+def : InstRW<[LSU], (instregex "L(Y|FH|RL|Mux)?$")>;
+def : InstRW<[LSU], (instregex "LG(RL)?$")>;
+def : InstRW<[LSU], (instregex "L128$")>;
+
+def : InstRW<[FXU], (instregex "LLIH(F|H|L)$")>;
+def : InstRW<[FXU], (instregex "LLIL(F|H|L)$")>;
+
+def : InstRW<[FXU], (instregex "LG(F|H)I$")>;
+def : InstRW<[FXU], (instregex "LHI(Mux)?$")>;
+def : InstRW<[FXU], (instregex "LR(Mux)?$")>;
+
+// Load and test
+def : InstRW<[FXU, LSU, Lat5], (instregex "LT(G)?$")>;
+def : InstRW<[FXU], (instregex "LT(G)?R$")>;
+
+// Stores
+def : InstRW<[FXU, LSU, Lat5], (instregex "STG(RL)?$")>;
+def : InstRW<[FXU, LSU, Lat5], (instregex "ST128$")>;
+def : InstRW<[FXU, LSU, Lat5], (instregex "ST(Y|FH|RL|Mux)?$")>;
+
+// String moves.
+def : InstRW<[LSU, Lat30, GroupAlone], (instregex "MVST$")>; + +//===----------------------------------------------------------------------===// +// Conditional move instructions +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, Lat2, EndGroup], (instregex "LOC(G)?R(Asm.*)?$")>; +def : InstRW<[FXU, LSU, Lat6, EndGroup], (instregex "LOC(G)?(Asm.*)?$")>; +def : InstRW<[FXU, LSU, Lat5, EndGroup], (instregex "STOC(G)?(Asm.*)?$")>; + +//===----------------------------------------------------------------------===// +// Sign extensions +//===----------------------------------------------------------------------===// +def : InstRW<[FXU], (instregex "L(B|H|G)R$")>; +def : InstRW<[FXU], (instregex "LG(B|H|F)R$")>; + +def : InstRW<[FXU, LSU, Lat5], (instregex "LTGF$")>; +def : InstRW<[FXU], (instregex "LTGFR$")>; + +def : InstRW<[FXU, LSU, Lat5], (instregex "LB(H|Mux)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LH(Y)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LH(H|Mux|RL)$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LG(B|H|F)$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LG(H|F)RL$")>; + +//===----------------------------------------------------------------------===// +// Zero extensions +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "LLCR(Mux)?$")>; +def : InstRW<[FXU], (instregex "LLHR(Mux)?$")>; +def : InstRW<[FXU], (instregex "LLG(C|F|H|T)R$")>; +def : InstRW<[LSU], (instregex "LLC(Mux)?$")>; +def : InstRW<[LSU], (instregex "LLH(Mux)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LL(C|H)H$")>; +def : InstRW<[LSU], (instregex "LLHRL$")>; +def : InstRW<[LSU], (instregex "LLG(C|F|H|T|FRL|HRL)$")>; + +//===----------------------------------------------------------------------===// +// Truncations +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex 
"STC(H|Y|Mux)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "STH(H|Y|RL|Mux)?$")>; + +//===----------------------------------------------------------------------===// +// Multi-register moves +//===----------------------------------------------------------------------===// + +// Load multiple (estimated average of 5 ops) +def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone], + (instregex "LM(H|Y|G)?$")>; + +// Store multiple (estimated average of 3 ops) +def : InstRW<[LSU, LSU, FXU, FXU, FXU, Lat10, GroupAlone], + (instregex "STM(H|Y|G)?$")>; + +//===----------------------------------------------------------------------===// +// Byte swaps +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "LRV(G)?R$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LRV(G|H)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "STRV(G|H)?$")>; + +//===----------------------------------------------------------------------===// +// Load address instructions +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "LA(Y|RL)?$")>; + +// Load the Global Offset Table address +def : InstRW<[FXU], (instregex "GOT$")>; + +//===----------------------------------------------------------------------===// +// Absolute and Negation +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, Lat2], (instregex "LP(G)?R$")>; +def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "L(N|P)GFR$")>; +def : InstRW<[FXU, Lat2], (instregex "LN(R|GR)$")>; +def : InstRW<[FXU], (instregex "LC(R|GR)$")>; +def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LCGFR$")>; + +//===----------------------------------------------------------------------===// +// Insertion +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "IC(Y)?$")>; +def : InstRW<[FXU, LSU, Lat5], 
(instregex "IC32(Y)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "ICM(H|Y)?$")>; +def : InstRW<[FXU], (instregex "II(F|H|L)Mux$")>; +def : InstRW<[FXU], (instregex "IIHF(64)?$")>; +def : InstRW<[FXU], (instregex "IIHH(64)?$")>; +def : InstRW<[FXU], (instregex "IIHL(64)?$")>; +def : InstRW<[FXU], (instregex "IILF(64)?$")>; +def : InstRW<[FXU], (instregex "IILH(64)?$")>; +def : InstRW<[FXU], (instregex "IILL(64)?$")>; + +//===----------------------------------------------------------------------===// +// Addition +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "A(Y|SI)?$")>; +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "AH(Y)?$")>; +def : InstRW<[FXU], (instregex "AIH$")>; +def : InstRW<[FXU], (instregex "AFI(Mux)?$")>; +def : InstRW<[FXU], (instregex "AGFI$")>; +def : InstRW<[FXU], (instregex "AGHI(K)?$")>; +def : InstRW<[FXU], (instregex "AGR(K)?$")>; +def : InstRW<[FXU], (instregex "AHI(K)?$")>; +def : InstRW<[FXU], (instregex "AHIMux(K)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "AL(Y)?$")>; +def : InstRW<[FXU], (instregex "AL(FI|HSIK)$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "ALG(F)?$")>; +def : InstRW<[FXU], (instregex "ALGHSIK$")>; +def : InstRW<[FXU], (instregex "ALGF(I|R)$")>; +def : InstRW<[FXU], (instregex "ALGR(K)?$")>; +def : InstRW<[FXU], (instregex "ALR(K)?$")>; +def : InstRW<[FXU], (instregex "AR(K)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "AG(SI)?$")>; + +// Logical addition with carry +def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "ALC(G)?$")>; +def : InstRW<[FXU, Lat3, GroupAlone], (instregex "ALC(G)?R$")>; + +// Add with sign extension (32 -> 64) +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "AGF$")>; +def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "AGFR$")>; + +//===----------------------------------------------------------------------===// +// Subtraction 
+//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "S(G|Y)?$")>; +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "SH(Y)?$")>; +def : InstRW<[FXU], (instregex "SGR(K)?$")>; +def : InstRW<[FXU], (instregex "SLFI$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "SL(G|GF|Y)?$")>; +def : InstRW<[FXU], (instregex "SLGF(I|R)$")>; +def : InstRW<[FXU], (instregex "SLGR(K)?$")>; +def : InstRW<[FXU], (instregex "SLR(K)?$")>; +def : InstRW<[FXU], (instregex "SR(K)?$")>; + +// Subtraction with borrow +def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "SLB(G)?$")>; +def : InstRW<[FXU, Lat3, GroupAlone], (instregex "SLB(G)?R$")>; + +// Subtraction with sign extension (32 -> 64) +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "SGF$")>; +def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "SGFR$")>; + +//===----------------------------------------------------------------------===// +// AND +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "N(G|Y)?$")>; +def : InstRW<[FXU], (instregex "NGR(K)?$")>; +def : InstRW<[FXU], (instregex "NI(FMux|HMux|LMux)$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "NI(Y)?$")>; +def : InstRW<[FXU], (instregex "NIHF(64)?$")>; +def : InstRW<[FXU], (instregex "NIHH(64)?$")>; +def : InstRW<[FXU], (instregex "NIHL(64)?$")>; +def : InstRW<[FXU], (instregex "NILF(64)?$")>; +def : InstRW<[FXU], (instregex "NILH(64)?$")>; +def : InstRW<[FXU], (instregex "NILL(64)?$")>; +def : InstRW<[FXU], (instregex "NR(K)?$")>; +def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "NC$")>; + +//===----------------------------------------------------------------------===// +// OR +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "O(G|Y)?$")>; +def : InstRW<[FXU], (instregex "OGR(K)?$")>; +def : InstRW<[FXU, LSU, 
Lat5], (instregex "OI(Y)?$")>; +def : InstRW<[FXU], (instregex "OI(FMux|HMux|LMux)$")>; +def : InstRW<[FXU], (instregex "OIHF(64)?$")>; +def : InstRW<[FXU], (instregex "OIHH(64)?$")>; +def : InstRW<[FXU], (instregex "OIHL(64)?$")>; +def : InstRW<[FXU], (instregex "OILF(64)?$")>; +def : InstRW<[FXU], (instregex "OILH(64)?$")>; +def : InstRW<[FXU], (instregex "OILL(64)?$")>; +def : InstRW<[FXU], (instregex "OR(K)?$")>; +def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "OC$")>; + +//===----------------------------------------------------------------------===// +// XOR +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "X(G|Y)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "XI(Y)?$")>; +def : InstRW<[FXU], (instregex "XIFMux$")>; +def : InstRW<[FXU], (instregex "XGR(K)?$")>; +def : InstRW<[FXU], (instregex "XIHF(64)?$")>; +def : InstRW<[FXU], (instregex "XILF(64)?$")>; +def : InstRW<[FXU], (instregex "XR(K)?$")>; +def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "XC$")>; + +//===----------------------------------------------------------------------===// +// Multiplication +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat10], (instregex "MS(GF|Y)?$")>; +def : InstRW<[FXU, Lat6], (instregex "MS(R|FI)$")>; +def : InstRW<[FXU, LSU, Lat12], (instregex "MSG$")>; +def : InstRW<[FXU, Lat8], (instregex "MSGR$")>; +def : InstRW<[FXU, Lat6], (instregex "MSGF(I|R)$")>; +def : InstRW<[FXU, LSU, Lat15, GroupAlone], (instregex "MLG$")>; +def : InstRW<[FXU, Lat9, GroupAlone], (instregex "MLGR$")>; +def : InstRW<[FXU, Lat5], (instregex "MGHI$")>; +def : InstRW<[FXU, Lat5], (instregex "MHI$")>; +def : InstRW<[FXU, LSU, Lat9], (instregex "MH(Y)?$")>; + +//===----------------------------------------------------------------------===// +// Division and remainder 
+//===----------------------------------------------------------------------===// + +def : InstRW<[FPU2, FPU2, FXU, FXU, FXU, FXU, Lat30, GroupAlone], + (instregex "DSG(F)?R$")>; +def : InstRW<[FPU2, FPU2, LSU, FXU, FXU, FXU, Lat30, GroupAlone], + (instregex "DSG(F)?$")>; +def : InstRW<[FPU2, FPU2, FXU, FXU, FXU, FXU, FXU, Lat30, GroupAlone], + (instregex "DL(G)?R$")>; +def : InstRW<[FPU2, FPU2, LSU, FXU, FXU, FXU, FXU, Lat30, GroupAlone], + (instregex "DL(G)?$")>; + +//===----------------------------------------------------------------------===// +// Shifts +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "SLL(G|K)?$")>; +def : InstRW<[FXU], (instregex "SRL(G|K)?$")>; +def : InstRW<[FXU], (instregex "SRA(G|K)?$")>; +def : InstRW<[FXU, Lat2], (instregex "SLA(K)?$")>; + +// Rotate +def : InstRW<[FXU, LSU, Lat6], (instregex "RLL(G)?$")>; + +// Rotate and insert +def : InstRW<[FXU], (instregex "RISBG(32)?$")>; +def : InstRW<[FXU], (instregex "RISBH(G|H|L)$")>; +def : InstRW<[FXU], (instregex "RISBL(G|H|L)$")>; +def : InstRW<[FXU], (instregex "RISBMux$")>; + +// Rotate and Select +def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "R(N|O|X)SBG$")>; + +//===----------------------------------------------------------------------===// +// Comparison +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "C(G|Y|Mux|RL)?$")>; +def : InstRW<[FXU], (instregex "C(F|H)I(Mux)?$")>; +def : InstRW<[FXU], (instregex "CG(F|H)I$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CG(HSI|RL)$")>; +def : InstRW<[FXU], (instregex "C(G)?R$")>; +def : InstRW<[FXU], (instregex "CIH$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CH(F|SI)$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CL(Y|Mux|FHSI)?$")>; +def : InstRW<[FXU], (instregex "CLFI(Mux)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CLG(HRL|HSI)?$")>; +def : InstRW<[FXU, LSU, Lat5], 
(instregex "CLGF(RL)?$")>; +def : InstRW<[FXU], (instregex "CLGF(I|R)$")>; +def : InstRW<[FXU], (instregex "CLGR$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CLGRL$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CLH(F|RL|HSI)$")>; +def : InstRW<[FXU], (instregex "CLIH$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CLI(Y)?$")>; +def : InstRW<[FXU], (instregex "CLR$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CLRL$")>; + +// Compare halfword +def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CH(Y|RL)?$")>; +def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CGH(RL)?$")>; +def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CHHSI$")>; + +// Compare with sign extension (32 -> 64) +def : InstRW<[FXU, FXU, LSU, Lat6, Lat2, GroupAlone], (instregex "CGF(RL)?$")>; +def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "CGFR$")>; + +// Compare logical character +def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "CLC$")>; + +def : InstRW<[LSU, Lat30, GroupAlone], (instregex "CLST$")>; + +// Test under mask +def : InstRW<[FXU, LSU, Lat5], (instregex "TM(Y)?$")>; +def : InstRW<[FXU], (instregex "TM(H|L)Mux$")>; +def : InstRW<[FXU], (instregex "TMHH(64)?$")>; +def : InstRW<[FXU], (instregex "TMHL(64)?$")>; +def : InstRW<[FXU], (instregex "TMLH(64)?$")>; +def : InstRW<[FXU], (instregex "TMLL(64)?$")>; + +//===----------------------------------------------------------------------===// +// Prefetch +//===----------------------------------------------------------------------===// + +def : InstRW<[LSU, GroupAlone], (instregex "PFD(RL)?$")>; + +//===----------------------------------------------------------------------===// +// Atomic operations +//===----------------------------------------------------------------------===// + +def : InstRW<[LSU, EndGroup], (instregex "Serialize$")>; + +def : InstRW<[FXU, LSU, Lat5], (instregex "LAA(G)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LAAL(G)?$")>; +def : InstRW<[FXU, LSU, Lat5], 
(instregex "LAN(G)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LAO(G)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LAX(G)?$")>; + +// Test and set +def : InstRW<[FXU, LSU, Lat5, EndGroup], (instregex "TS$")>; + +// Compare and swap +def : InstRW<[FXU, LSU, FXU, Lat6, GroupAlone], (instregex "CS(G|Y)?$")>; + +// Compare double and swap +def : InstRW<[FXU, FXU, FXU, FXU, FXU, LSU, Lat10, GroupAlone], + (instregex "CDS(Y)?$")>; +def : InstRW<[FXU, FXU, FXU, FXU, FXU, FXU, LSU, LSU, Lat12, GroupAlone], + (instregex "CDSG$")>; + +// Compare and swap and store +def : InstRW<[FXU, Lat30, GroupAlone], (instregex "CSST$")>; + +// Perform locked operation +def : InstRW<[LSU, Lat30, GroupAlone], (instregex "PLO$")>; + +// Load/store pair from/to quadword +def : InstRW<[LSU, LSU, Lat5, GroupAlone], (instregex "LPQ$")>; +def : InstRW<[FXU, FXU, LSU, LSU, Lat6, GroupAlone], (instregex "STPQ$")>; + +// Load pair disjoint +def : InstRW<[LSU, LSU, Lat5, GroupAlone], (instregex "LPD(G)?$")>; + +//===----------------------------------------------------------------------===// +// Access registers +//===----------------------------------------------------------------------===// + +// Extract/set/copy access register +def : InstRW<[LSU], (instregex "(EAR|SAR|CPYA)$")>; + +// Load address extended +def : InstRW<[LSU, FXU, Lat5, GroupAlone], (instregex "LAE(Y)?$")>; + +// Load/store access multiple (not modeled precisely) +def : InstRW<[LSU, Lat30, GroupAlone], (instregex "(L|ST)AM(Y)?$")>; + +//===----------------------------------------------------------------------===// +// Program mask and addressing mode +//===----------------------------------------------------------------------===// + +// Insert Program Mask +def : InstRW<[FXU, Lat3, EndGroup], (instregex "IPM$")>; + +// Set Program Mask +def : InstRW<[LSU, EndGroup], (instregex "SPM$")>; + +// Branch and link +def : InstRW<[FXU, FXU, LSU, Lat8, GroupAlone], (instregex "BAL(R)?$")>; + +// Test addressing mode +def : 
InstRW<[FXU], (instregex "TAM$")>; + +// Set addressing mode +def : InstRW<[LSU, EndGroup], (instregex "SAM(24|31|64)$")>; + +// Branch (and save) and set mode. +def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "BSM$")>; +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "BASSM$")>; + +//===----------------------------------------------------------------------===// +// Miscellaneous Instructions. +//===----------------------------------------------------------------------===// + +// Find leftmost one +def : InstRW<[FXU, Lat7, GroupAlone], (instregex "FLOGR$")>; + +// Population count +def : InstRW<[FXU, Lat3], (instregex "POPCNT$")>; + +// Extend +def : InstRW<[FXU], (instregex "AEXT128_64$")>; +def : InstRW<[FXU], (instregex "ZEXT128_(32|64)$")>; + +// String instructions +def : InstRW<[FXU, LSU, Lat30], (instregex "SRST$")>; + +// Move with key +def : InstRW<[LSU, Lat8, GroupAlone], (instregex "MVCK$")>; + +// Extract CPU Time +def : InstRW<[FXU, Lat5, LSU], (instregex "ECTG$")>; + +// Execute +def : InstRW<[LSU, GroupAlone], (instregex "EX(RL)?$")>; + +// Program return +def : InstRW<[FXU, Lat30], (instregex "PR$")>; + +// Inline assembly +def : InstRW<[FXU, LSU, Lat15], (instregex "STCK$")>; +def : InstRW<[FXU, LSU, Lat12], (instregex "STCKF$")>; +def : InstRW<[LSU, FXU, Lat5], (instregex "STCKE$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "STFLE$")>; +def : InstRW<[FXU, Lat30], (instregex "SVC$")>; + +// Store real address +def : InstRW<[FXU, LSU, Lat5], (instregex "STRAG$")>; + +//===----------------------------------------------------------------------===// +// .insn directive instructions +//===----------------------------------------------------------------------===// + +// An "empty" sched-class will be assigned instead of the "invalid sched-class". +// getNumDecoderSlots() will then return 1 instead of 0. 
+def : InstRW<[], (instregex "Insn.*")>; + + +// ----------------------------- Floating point ----------------------------- // + +//===----------------------------------------------------------------------===// +// FP: Select instructions +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "SelectF(32|64|128)$")>; +def : InstRW<[FXU], (instregex "CondStoreF32(Inv)?$")>; +def : InstRW<[FXU], (instregex "CondStoreF64(Inv)?$")>; + +//===----------------------------------------------------------------------===// +// FP: Move instructions +//===----------------------------------------------------------------------===// + +// Load zero +def : InstRW<[FXU], (instregex "LZ(DR|ER)$")>; +def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LZXR$")>; + +// Load +def : InstRW<[FXU], (instregex "LER$")>; +def : InstRW<[FXU], (instregex "LD(R|R32|GR)$")>; +def : InstRW<[FXU, Lat3], (instregex "LGDR$")>; +def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LXR$")>; + +// Load and Test +def : InstRW<[FPU], (instregex "LT(D|E)BR$")>; +def : InstRW<[FPU], (instregex "LTEBRCompare(_VecPseudo)?$")>; +def : InstRW<[FPU], (instregex "LTDBRCompare(_VecPseudo)?$")>; +def : InstRW<[FPU2, FPU2, Lat9, GroupAlone], (instregex "LTXBR$")>; +def : InstRW<[FPU2, FPU2, Lat9, GroupAlone], + (instregex "LTXBRCompare(_VecPseudo)?$")>; + +// Copy sign +def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRd(d|s)$")>; +def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRs(d|s)$")>; + +//===----------------------------------------------------------------------===// +// FP: Load instructions +//===----------------------------------------------------------------------===// + +def : InstRW<[LSU], (instregex "LE(Y)?$")>; +def : InstRW<[LSU], (instregex "LD(Y|E32)?$")>; +def : InstRW<[LSU], (instregex "LX$")>; + +//===----------------------------------------------------------------------===// +// FP: Store instructions 
+//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat7], (instregex "STD(Y)?$")>; +def : InstRW<[FXU, LSU, Lat7], (instregex "STE(Y)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "STX$")>; + +//===----------------------------------------------------------------------===// +// FP: Conversion instructions +//===----------------------------------------------------------------------===// + +// Load rounded +def : InstRW<[FPU], (instregex "LEDBR(A)?$")>; +def : InstRW<[FPU, FPU, Lat20], (instregex "LEXBR(A)?$")>; +def : InstRW<[FPU, FPU, Lat20], (instregex "LDXBR(A)?$")>; + +// Load lengthened +def : InstRW<[FPU, LSU, Lat12], (instregex "LDEB$")>; +def : InstRW<[FPU], (instregex "LDEBR$")>; +def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "LX(D|E)B$")>; +def : InstRW<[FPU2, FPU2, Lat10, GroupAlone], (instregex "LX(D|E)BR$")>; + +// Convert from fixed / logical +def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CE(F|G)BR(A)?$")>; +def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CD(F|G)BR(A)?$")>; +def : InstRW<[FXU, FPU2, FPU2, Lat11, GroupAlone], (instregex "CX(F|G)BR(A)?$")>; +def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CEL(F|G)BR$")>; +def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CDL(F|G)BR$")>; +def : InstRW<[FXU, FPU2, FPU2, Lat11, GroupAlone], (instregex "CXL(F|G)BR$")>; + +// Convert to fixed / logical +def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CF(E|D)BR(A)?$")>; +def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CG(E|D)BR(A)?$")>; +def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "C(F|G)XBR(A)?$")>; +def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLF(E|D)BR$")>; +def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLG(E|D)BR$")>; +def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "CL(F|G)XBR$")>; + +//===----------------------------------------------------------------------===// +// FP: Unary arithmetic 
+//===----------------------------------------------------------------------===// + +// Load Complement / Negative / Positive +def : InstRW<[FPU], (instregex "L(C|N|P)DBR$")>; +def : InstRW<[FPU], (instregex "L(C|N|P)EBR$")>; +def : InstRW<[FXU], (instregex "LCDFR(_32)?$")>; +def : InstRW<[FXU], (instregex "LNDFR(_32)?$")>; +def : InstRW<[FXU], (instregex "LPDFR(_32)?$")>; +def : InstRW<[FPU2, FPU2, Lat9, GroupAlone], (instregex "L(C|N|P)XBR$")>; + +// Square root +def : InstRW<[FPU, LSU, Lat30], (instregex "SQ(E|D)B$")>; +def : InstRW<[FPU, Lat30], (instregex "SQ(E|D)BR$")>; +def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "SQXBR$")>; + +// Load FP integer +def : InstRW<[FPU], (instregex "FIEBR(A)?$")>; +def : InstRW<[FPU], (instregex "FIDBR(A)?$")>; +def : InstRW<[FPU2, FPU2, Lat15, GroupAlone], (instregex "FIXBR(A)?$")>; + +//===----------------------------------------------------------------------===// +// FP: Binary arithmetic +//===----------------------------------------------------------------------===// + +// Addition +def : InstRW<[FPU, LSU, Lat12], (instregex "A(E|D)B$")>; +def : InstRW<[FPU], (instregex "A(E|D)BR$")>; +def : InstRW<[FPU2, FPU2, Lat20, GroupAlone], (instregex "AXBR$")>; + +// Subtraction +def : InstRW<[FPU, LSU, Lat12], (instregex "S(E|D)B$")>; +def : InstRW<[FPU], (instregex "S(E|D)BR$")>; +def : InstRW<[FPU2, FPU2, Lat20, GroupAlone], (instregex "SXBR$")>; + +// Multiply +def : InstRW<[FPU, LSU, Lat12], (instregex "M(D|DE|EE)B$")>; +def : InstRW<[FPU], (instregex "M(D|DE|EE)BR$")>; +def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "MXDB$")>; +def : InstRW<[FPU2, FPU2, Lat10, GroupAlone], (instregex "MXDBR$")>; +def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "MXBR$")>; + +// Multiply and add / subtract +def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)EB$")>; +def : InstRW<[FPU, GroupAlone], (instregex "M(A|S)EBR$")>; +def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)DB$")>; +def 
: InstRW<[FPU, GroupAlone], (instregex "M(A|S)DBR$")>;
+
+// Division
+def : InstRW<[FPU, LSU, Lat30], (instregex "D(E|D)B$")>;
+def : InstRW<[FPU, Lat30], (instregex "D(E|D)BR$")>;
+def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "DXBR$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Comparisons
+//===----------------------------------------------------------------------===//
+
+// Compare
+def : InstRW<[FPU, LSU, Lat12], (instregex "C(E|D)B$")>;
+def : InstRW<[FPU], (instregex "C(E|D)BR$")>;
+def : InstRW<[FPU, FPU, Lat30], (instregex "CXBR$")>;
+
+// Test Data Class
+def : InstRW<[FPU, LSU, Lat15], (instregex "TC(E|D)B$")>;
+def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "TCXB$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Floating-point control register instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXU, LSU, Lat4, GroupAlone], (instregex "EFPC$")>;
+def : InstRW<[LSU, Lat3, GroupAlone], (instregex "SFPC$")>;
+def : InstRW<[LSU, LSU, Lat6, GroupAlone], (instregex "LFPC$")>;
+def : InstRW<[LSU, Lat3, GroupAlone], (instregex "STFPC$")>;
+def : InstRW<[FXU, Lat30, GroupAlone], (instregex "SFASR$")>;
+def : InstRW<[FXU, LSU, Lat30, GroupAlone], (instregex "LFAS$")>;
+def : InstRW<[FXU, Lat2, GroupAlone], (instregex "SRNM(B|T)?$")>;
+
+}
+
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZScheduleZEC12.td b/contrib/llvm/lib/Target/SystemZ/SystemZScheduleZEC12.td
new file mode 100644
index 0000000..8ab6c82
--- /dev/null
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZScheduleZEC12.td
@@ -0,0 +1,807 @@
+//=- SystemZScheduleZEC12.td - SystemZ Scheduling Definitions --*- tblgen -*-=//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the machine model for ZEC12 to support instruction
+// scheduling and other instruction cost heuristics.
+//
+//===----------------------------------------------------------------------===//
+
+def ZEC12Model : SchedMachineModel {
+
+  let UnsupportedFeatures = Arch10UnsupportedFeatures.List;
+
+  let IssueWidth = 5;
+  let MicroOpBufferSize = 40; // Issue queues
+  let LoadLatency = 1; // Optimistic load latency.
+
+  let PostRAScheduler = 1;
+
+  // Extra cycles for a mispredicted branch.
+  let MispredictPenalty = 16;
+}
+
+let SchedModel = ZEC12Model in {
+
+// These definitions could be put in a subtarget common include file,
+// but it seems the include system in Tablegen currently rejects
+// multiple includes of same file.
+def : WriteRes<GroupAlone, []> {
+  let NumMicroOps = 0;
+  let BeginGroup = 1;
+  let EndGroup = 1;
+}
+def : WriteRes<EndGroup, []> {
+  let NumMicroOps = 0;
+  let EndGroup = 1;
+}
+def : WriteRes<Lat2, []> { let Latency = 2; let NumMicroOps = 0;}
+def : WriteRes<Lat3, []> { let Latency = 3; let NumMicroOps = 0;}
+def : WriteRes<Lat4, []> { let Latency = 4; let NumMicroOps = 0;}
+def : WriteRes<Lat5, []> { let Latency = 5; let NumMicroOps = 0;}
+def : WriteRes<Lat6, []> { let Latency = 6; let NumMicroOps = 0;}
+def : WriteRes<Lat7, []> { let Latency = 7; let NumMicroOps = 0;}
+def : WriteRes<Lat8, []> { let Latency = 8; let NumMicroOps = 0;}
+def : WriteRes<Lat9, []> { let Latency = 9; let NumMicroOps = 0;}
+def : WriteRes<Lat10, []> { let Latency = 10; let NumMicroOps = 0;}
+def : WriteRes<Lat11, []> { let Latency = 11; let NumMicroOps = 0;}
+def : WriteRes<Lat12, []> { let Latency = 12; let NumMicroOps = 0;}
+def : WriteRes<Lat15, []> { let Latency = 15; let NumMicroOps = 0;}
+def : WriteRes<Lat20, []> { let Latency = 20; let NumMicroOps = 0;}
+def : WriteRes<Lat30, []> { let Latency = 30; let NumMicroOps = 0;}
+
+// Execution units.
+def ZEC12_FXUnit : ProcResource<2>; +def ZEC12_LSUnit : ProcResource<2>; +def ZEC12_FPUnit : ProcResource<1>; +def ZEC12_VBUnit : ProcResource<1>; + +// Subtarget specific definitions of scheduling resources. +def : WriteRes<FXU, [ZEC12_FXUnit]> { let Latency = 1; } +def : WriteRes<LSU, [ZEC12_LSUnit]> { let Latency = 4; } +def : WriteRes<LSU_lat1, [ZEC12_LSUnit]> { let Latency = 1; } +def : WriteRes<FPU, [ZEC12_FPUnit]> { let Latency = 8; } +def : WriteRes<FPU2, [ZEC12_FPUnit, ZEC12_FPUnit]> { let Latency = 9; } +def : WriteRes<VBU, [ZEC12_VBUnit]>; // Virtual Branching Unit + +// -------------------------- INSTRUCTIONS ---------------------------------- // + +// InstRW constructs have been used in order to preserve the +// readability of the InstrInfo files. + +// For each instruction, as matched by a regexp, provide a list of +// resources that it needs. These will be combined into a SchedClass. + +//===----------------------------------------------------------------------===// +// Stack allocation +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "ADJDYNALLOC$")>; // Pseudo -> LA / LAY + +//===----------------------------------------------------------------------===// +// Branch instructions +//===----------------------------------------------------------------------===// + +// Branch +def : InstRW<[VBU], (instregex "(Call)?BRC(L)?(Asm.*)?$")>; +def : InstRW<[VBU], (instregex "(Call)?J(G)?(Asm.*)?$")>; +def : InstRW<[LSU, Lat4], (instregex "(Call)?BC(R)?(Asm.*)?$")>; +def : InstRW<[LSU, Lat4], (instregex "(Call)?B(R)?(Asm.*)?$")>; +def : InstRW<[FXU, EndGroup], (instregex "BRCT(G)?$")>; +def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "BRCTH$")>; +def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "BCT(G)?(R)?$")>; +def : InstRW<[FXU, FXU, FXU, LSU, Lat7, GroupAlone], + (instregex "B(R)?X(H|L).*$")>; + +// Compare and branch +def : InstRW<[FXU], (instregex "C(L)?(G)?(I|R)J(Asm.*)?$")>; 
+def : InstRW<[FXU, LSU, Lat5, GroupAlone], + (instregex "C(L)?(G)?(I|R)B(Call|Return|Asm.*)?$")>; + +//===----------------------------------------------------------------------===// +// Trap instructions +//===----------------------------------------------------------------------===// + +// Trap +def : InstRW<[VBU], (instregex "(Cond)?Trap$")>; + +// Compare and trap +def : InstRW<[FXU], (instregex "C(G)?(I|R)T(Asm.*)?$")>; +def : InstRW<[FXU], (instregex "CL(G)?RT(Asm.*)?$")>; +def : InstRW<[FXU], (instregex "CL(F|G)IT(Asm.*)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CL(G)?T(Asm.*)?$")>; + +//===----------------------------------------------------------------------===// +// Call and return instructions +//===----------------------------------------------------------------------===// + +// Call +def : InstRW<[VBU, FXU, FXU, Lat3, GroupAlone], (instregex "(Call)?BRAS$")>; +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "(Call)?BRASL$")>; +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "(Call)?BAS(R)?$")>; +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "TLS_(G|L)DCALL$")>; + +// Return +def : InstRW<[LSU_lat1, EndGroup], (instregex "Return$")>; +def : InstRW<[LSU_lat1], (instregex "CondReturn$")>; + +//===----------------------------------------------------------------------===// +// Select instructions +//===----------------------------------------------------------------------===// + +// Select pseudo +def : InstRW<[FXU], (instregex "Select(32|64|32Mux)$")>; + +// CondStore pseudos +def : InstRW<[FXU], (instregex "CondStore16(Inv)?$")>; +def : InstRW<[FXU], (instregex "CondStore16Mux(Inv)?$")>; +def : InstRW<[FXU], (instregex "CondStore32(Inv)?$")>; +def : InstRW<[FXU], (instregex "CondStore64(Inv)?$")>; +def : InstRW<[FXU], (instregex "CondStore8(Inv)?$")>; +def : InstRW<[FXU], (instregex "CondStore8Mux(Inv)?$")>; + +//===----------------------------------------------------------------------===// +// Move instructions 
+//===----------------------------------------------------------------------===// + +// Moves +def : InstRW<[FXU, LSU, Lat5], (instregex "MV(G|H)?HI$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "MVI(Y)?$")>; + +// Move character +def : InstRW<[LSU, LSU, LSU, FXU, Lat8, GroupAlone], (instregex "MVC$")>; + +// Pseudo -> reg move +def : InstRW<[FXU], (instregex "COPY(_TO_REGCLASS)?$")>; +def : InstRW<[FXU], (instregex "EXTRACT_SUBREG$")>; +def : InstRW<[FXU], (instregex "INSERT_SUBREG$")>; +def : InstRW<[FXU], (instregex "REG_SEQUENCE$")>; +def : InstRW<[FXU], (instregex "SUBREG_TO_REG$")>; + +// Loads +def : InstRW<[LSU], (instregex "L(Y|FH|RL|Mux)?$")>; +def : InstRW<[LSU], (instregex "LG(RL)?$")>; +def : InstRW<[LSU], (instregex "L128$")>; + +def : InstRW<[FXU], (instregex "LLIH(F|H|L)$")>; +def : InstRW<[FXU], (instregex "LLIL(F|H|L)$")>; + +def : InstRW<[FXU], (instregex "LG(F|H)I$")>; +def : InstRW<[FXU], (instregex "LHI(Mux)?$")>; +def : InstRW<[FXU], (instregex "LR(Mux)?$")>; + +// Load and trap +def : InstRW<[FXU, LSU, Lat5], (instregex "L(FH|G)?AT$")>; + +// Load and test +def : InstRW<[FXU, LSU, Lat5], (instregex "LT(G)?$")>; +def : InstRW<[FXU], (instregex "LT(G)?R$")>; + +// Stores +def : InstRW<[FXU, LSU, Lat5], (instregex "STG(RL)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "ST128$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "ST(Y|FH|RL|Mux)?$")>; + +// String moves. 
+def : InstRW<[LSU, Lat30, GroupAlone], (instregex "MVST$")>; + +//===----------------------------------------------------------------------===// +// Conditional move instructions +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, Lat2], (instregex "LOC(G)?R(Asm.*)?$")>; +def : InstRW<[FXU, LSU, Lat6], (instregex "LOC(G)?(Asm.*)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "STOC(G)?(Asm.*)?$")>; + +//===----------------------------------------------------------------------===// +// Sign extensions +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "L(B|H|G)R$")>; +def : InstRW<[FXU], (instregex "LG(B|H|F)R$")>; + +def : InstRW<[FXU, LSU, Lat5], (instregex "LTGF$")>; +def : InstRW<[FXU], (instregex "LTGFR$")>; + +def : InstRW<[FXU, LSU, Lat5], (instregex "LB(H|Mux)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LH(Y)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LH(H|Mux|RL)$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LG(B|H|F)$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LG(H|F)RL$")>; + +//===----------------------------------------------------------------------===// +// Zero extensions +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "LLCR(Mux)?$")>; +def : InstRW<[FXU], (instregex "LLHR(Mux)?$")>; +def : InstRW<[FXU], (instregex "LLG(C|H|F|T)R$")>; +def : InstRW<[LSU], (instregex "LLC(Mux)?$")>; +def : InstRW<[LSU], (instregex "LLH(Mux)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LL(C|H)H$")>; +def : InstRW<[LSU], (instregex "LLHRL$")>; +def : InstRW<[LSU], (instregex "LLG(C|H|F|T|HRL|FRL)$")>; + +// Load and trap +def : InstRW<[FXU, LSU, Lat5], (instregex "LLG(F|T)?AT$")>; + +//===----------------------------------------------------------------------===// +// Truncations +//===----------------------------------------------------------------------===// + 
+def : InstRW<[FXU, LSU, Lat5], (instregex "STC(H|Y|Mux)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "STH(H|Y|RL|Mux)?$")>; + +//===----------------------------------------------------------------------===// +// Multi-register moves +//===----------------------------------------------------------------------===// + +// Load multiple (estimated average of 5 ops) +def : InstRW<[LSU, LSU, LSU, LSU, LSU, Lat10, GroupAlone], + (instregex "LM(H|Y|G)?$")>; + +// Store multiple (estimated average of 3 ops) +def : InstRW<[LSU, LSU, FXU, FXU, FXU, Lat10, GroupAlone], + (instregex "STM(H|Y|G)?$")>; + +//===----------------------------------------------------------------------===// +// Byte swaps +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "LRV(G)?R$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LRV(G|H)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "STRV(G|H)?$")>; + +//===----------------------------------------------------------------------===// +// Load address instructions +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "LA(Y|RL)?$")>; + +// Load the Global Offset Table address +def : InstRW<[FXU], (instregex "GOT$")>; + +//===----------------------------------------------------------------------===// +// Absolute and Negation +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, Lat2], (instregex "LP(G)?R$")>; +def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "L(N|P)GFR$")>; +def : InstRW<[FXU, Lat2], (instregex "LN(R|GR)$")>; +def : InstRW<[FXU], (instregex "LC(R|GR)$")>; +def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LCGFR$")>; + +//===----------------------------------------------------------------------===// +// Insertion +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex 
"IC(Y)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "IC32(Y)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "ICM(H|Y)?$")>; +def : InstRW<[FXU], (instregex "II(F|H|L)Mux$")>; +def : InstRW<[FXU], (instregex "IIHF(64)?$")>; +def : InstRW<[FXU], (instregex "IIHH(64)?$")>; +def : InstRW<[FXU], (instregex "IIHL(64)?$")>; +def : InstRW<[FXU], (instregex "IILF(64)?$")>; +def : InstRW<[FXU], (instregex "IILH(64)?$")>; +def : InstRW<[FXU], (instregex "IILL(64)?$")>; + +//===----------------------------------------------------------------------===// +// Addition +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "A(Y|SI)?$")>; +def : InstRW<[FXU, LSU, Lat6], (instregex "AH(Y)?$")>; +def : InstRW<[FXU], (instregex "AIH$")>; +def : InstRW<[FXU], (instregex "AFI(Mux)?$")>; +def : InstRW<[FXU], (instregex "AGFI$")>; +def : InstRW<[FXU], (instregex "AGHI(K)?$")>; +def : InstRW<[FXU], (instregex "AGR(K)?$")>; +def : InstRW<[FXU], (instregex "AHI(K)?$")>; +def : InstRW<[FXU], (instregex "AHIMux(K)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "AL(Y)?$")>; +def : InstRW<[FXU], (instregex "AL(FI|HSIK)$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "ALG(F)?$")>; +def : InstRW<[FXU], (instregex "ALGHSIK$")>; +def : InstRW<[FXU], (instregex "ALGF(I|R)$")>; +def : InstRW<[FXU], (instregex "ALGR(K)?$")>; +def : InstRW<[FXU], (instregex "ALR(K)?$")>; +def : InstRW<[FXU], (instregex "AR(K)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "AG(SI)?$")>; + +// Logical addition with carry +def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "ALC(G)?$")>; +def : InstRW<[FXU, Lat3, GroupAlone], (instregex "ALC(G)?R$")>; + +// Add with sign extension (32 -> 64) +def : InstRW<[FXU, LSU, Lat6], (instregex "AGF$")>; +def : InstRW<[FXU, Lat2], (instregex "AGFR$")>; + +//===----------------------------------------------------------------------===// +// Subtraction 
+//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "S(G|Y)?$")>; +def : InstRW<[FXU, LSU, Lat6], (instregex "SH(Y)?$")>; +def : InstRW<[FXU], (instregex "SGR(K)?$")>; +def : InstRW<[FXU], (instregex "SLFI$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "SL(G|GF|Y)?$")>; +def : InstRW<[FXU], (instregex "SLGF(I|R)$")>; +def : InstRW<[FXU], (instregex "SLGR(K)?$")>; +def : InstRW<[FXU], (instregex "SLR(K)?$")>; +def : InstRW<[FXU], (instregex "SR(K)?$")>; + +// Subtraction with borrow +def : InstRW<[FXU, LSU, Lat7, GroupAlone], (instregex "SLB(G)?$")>; +def : InstRW<[FXU, Lat3, GroupAlone], (instregex "SLB(G)?R$")>; + +// Subtraction with sign extension (32 -> 64) +def : InstRW<[FXU, LSU, Lat6], (instregex "SGF$")>; +def : InstRW<[FXU, Lat2], (instregex "SGFR$")>; + +//===----------------------------------------------------------------------===// +// AND +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "N(G|Y)?$")>; +def : InstRW<[FXU], (instregex "NGR(K)?$")>; +def : InstRW<[FXU], (instregex "NI(FMux|HMux|LMux)$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "NI(Y)?$")>; +def : InstRW<[FXU], (instregex "NIHF(64)?$")>; +def : InstRW<[FXU], (instregex "NIHH(64)?$")>; +def : InstRW<[FXU], (instregex "NIHL(64)?$")>; +def : InstRW<[FXU], (instregex "NILF(64)?$")>; +def : InstRW<[FXU], (instregex "NILH(64)?$")>; +def : InstRW<[FXU], (instregex "NILL(64)?$")>; +def : InstRW<[FXU], (instregex "NR(K)?$")>; +def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "NC$")>; + +//===----------------------------------------------------------------------===// +// OR +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "O(G|Y)?$")>; +def : InstRW<[FXU], (instregex "OGR(K)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "OI(Y)?$")>; +def : InstRW<[FXU], 
(instregex "OI(FMux|HMux|LMux)$")>; +def : InstRW<[FXU], (instregex "OIHF(64)?$")>; +def : InstRW<[FXU], (instregex "OIHH(64)?$")>; +def : InstRW<[FXU], (instregex "OIHL(64)?$")>; +def : InstRW<[FXU], (instregex "OILF(64)?$")>; +def : InstRW<[FXU], (instregex "OILH(64)?$")>; +def : InstRW<[FXU], (instregex "OILL(64)?$")>; +def : InstRW<[FXU], (instregex "OR(K)?$")>; +def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "OC$")>; + +//===----------------------------------------------------------------------===// +// XOR +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "X(G|Y)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "XI(Y)?$")>; +def : InstRW<[FXU], (instregex "XIFMux$")>; +def : InstRW<[FXU], (instregex "XGR(K)?$")>; +def : InstRW<[FXU], (instregex "XIHF(64)?$")>; +def : InstRW<[FXU], (instregex "XILF(64)?$")>; +def : InstRW<[FXU], (instregex "XR(K)?$")>; +def : InstRW<[LSU, LSU, FXU, Lat9, GroupAlone], (instregex "XC$")>; + +//===----------------------------------------------------------------------===// +// Multiplication +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat10], (instregex "MS(GF|Y)?$")>; +def : InstRW<[FXU, Lat6], (instregex "MS(R|FI)$")>; +def : InstRW<[FXU, LSU, Lat12], (instregex "MSG$")>; +def : InstRW<[FXU, Lat8], (instregex "MSGR$")>; +def : InstRW<[FXU, Lat6], (instregex "MSGF(I|R)$")>; +def : InstRW<[FXU, LSU, Lat15, GroupAlone], (instregex "MLG$")>; +def : InstRW<[FXU, Lat9, GroupAlone], (instregex "MLGR$")>; +def : InstRW<[FXU, Lat5], (instregex "MGHI$")>; +def : InstRW<[FXU, Lat5], (instregex "MHI$")>; +def : InstRW<[FXU, LSU, Lat9], (instregex "MH(Y)?$")>; + +//===----------------------------------------------------------------------===// +// Division and remainder +//===----------------------------------------------------------------------===// + +def : InstRW<[FPU2, FPU2, FXU, FXU, 
FXU, FXU, Lat30, GroupAlone], + (instregex "DSG(F)?R$")>; +def : InstRW<[FPU2, FPU2, LSU, FXU, FXU, FXU, Lat30, GroupAlone], + (instregex "DSG(F)?$")>; +def : InstRW<[FPU2, FPU2, FXU, FXU, FXU, FXU, FXU, Lat30, GroupAlone], + (instregex "DL(G)?R$")>; +def : InstRW<[FPU2, FPU2, LSU, FXU, FXU, FXU, FXU, Lat30, GroupAlone], + (instregex "DL(G)?$")>; + +//===----------------------------------------------------------------------===// +// Shifts +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "SLL(G|K)?$")>; +def : InstRW<[FXU], (instregex "SRL(G|K)?$")>; +def : InstRW<[FXU], (instregex "SRA(G|K)?$")>; +def : InstRW<[FXU], (instregex "SLA(K)?$")>; + +// Rotate +def : InstRW<[FXU, LSU, Lat6], (instregex "RLL(G)?$")>; + +// Rotate and insert +def : InstRW<[FXU], (instregex "RISBG(N|32)?$")>; +def : InstRW<[FXU], (instregex "RISBH(G|H|L)$")>; +def : InstRW<[FXU], (instregex "RISBL(G|H|L)$")>; +def : InstRW<[FXU], (instregex "RISBMux$")>; + +// Rotate and Select +def : InstRW<[FXU, FXU, Lat3, GroupAlone], (instregex "R(N|O|X)SBG$")>; + +//===----------------------------------------------------------------------===// +// Comparison +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat5], (instregex "C(G|Y|Mux|RL)?$")>; +def : InstRW<[FXU], (instregex "C(F|H)I(Mux)?$")>; +def : InstRW<[FXU], (instregex "CG(F|H)I$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CG(HSI|RL)$")>; +def : InstRW<[FXU], (instregex "C(G)?R$")>; +def : InstRW<[FXU], (instregex "CIH$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CH(F|SI)$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CL(Y|Mux|FHSI)?$")>; +def : InstRW<[FXU], (instregex "CLFI(Mux)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CLG(HRL|HSI)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CLGF(RL)?$")>; +def : InstRW<[FXU], (instregex "CLGF(I|R)$")>; +def : InstRW<[FXU], (instregex "CLGR$")>; +def : 
InstRW<[FXU, LSU, Lat5], (instregex "CLGRL$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CLH(F|RL|HSI)$")>; +def : InstRW<[FXU], (instregex "CLIH$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CLI(Y)?$")>; +def : InstRW<[FXU], (instregex "CLR$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "CLRL$")>; + +// Compare halfword +def : InstRW<[FXU, LSU, Lat6], (instregex "CH(Y|RL)?$")>; +def : InstRW<[FXU, LSU, Lat6], (instregex "CGH(RL)?$")>; +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "CHHSI$")>; + +// Compare with sign extension (32 -> 64) +def : InstRW<[FXU, LSU, Lat6], (instregex "CGF(RL)?$")>; +def : InstRW<[FXU, Lat2], (instregex "CGFR$")>; + +// Compare logical character +def : InstRW<[FXU, LSU, LSU, Lat9, GroupAlone], (instregex "CLC$")>; + +def : InstRW<[LSU, Lat30, GroupAlone], (instregex "CLST$")>; + +// Test under mask +def : InstRW<[FXU, LSU, Lat5], (instregex "TM(Y)?$")>; +def : InstRW<[FXU], (instregex "TM(H|L)Mux$")>; +def : InstRW<[FXU], (instregex "TMHH(64)?$")>; +def : InstRW<[FXU], (instregex "TMHL(64)?$")>; +def : InstRW<[FXU], (instregex "TMLH(64)?$")>; +def : InstRW<[FXU], (instregex "TMLL(64)?$")>; + +//===----------------------------------------------------------------------===// +// Prefetch and execution hint +//===----------------------------------------------------------------------===// + +def : InstRW<[LSU], (instregex "PFD(RL)?$")>; +def : InstRW<[LSU], (instregex "BP(R)?P$")>; +def : InstRW<[FXU], (instregex "NIAI$")>; + +//===----------------------------------------------------------------------===// +// Atomic operations +//===----------------------------------------------------------------------===// + +def : InstRW<[LSU, EndGroup], (instregex "Serialize$")>; + +def : InstRW<[FXU, LSU, Lat5], (instregex "LAA(G)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LAAL(G)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LAN(G)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "LAO(G)?$")>; +def : InstRW<[FXU, LSU, 
Lat5], (instregex "LAX(G)?$")>; + +// Test and set +def : InstRW<[FXU, LSU, Lat5, EndGroup], (instregex "TS$")>; + +// Compare and swap +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "CS(G|Y)?$")>; + +// Compare double and swap +def : InstRW<[FXU, FXU, FXU, FXU, FXU, LSU, Lat10, GroupAlone], + (instregex "CDS(Y)?$")>; +def : InstRW<[FXU, FXU, FXU, FXU, FXU, FXU, LSU, LSU, Lat12, GroupAlone], + (instregex "CDSG$")>; + +// Compare and swap and store +def : InstRW<[FXU, Lat30, GroupAlone], (instregex "CSST$")>; + +// Perform locked operation +def : InstRW<[LSU, Lat30, GroupAlone], (instregex "PLO$")>; + +// Load/store pair from/to quadword +def : InstRW<[LSU, LSU, Lat5, GroupAlone], (instregex "LPQ$")>; +def : InstRW<[FXU, FXU, LSU, LSU, Lat6, GroupAlone], (instregex "STPQ$")>; + +// Load pair disjoint +def : InstRW<[LSU, LSU, Lat5, GroupAlone], (instregex "LPD(G)?$")>; + +//===----------------------------------------------------------------------===// +// Access registers +//===----------------------------------------------------------------------===// + +// Extract/set/copy access register +def : InstRW<[LSU], (instregex "(EAR|SAR|CPYA)$")>; + +// Load address extended +def : InstRW<[LSU, FXU, Lat5, GroupAlone], (instregex "LAE(Y)?$")>; + +// Load/store access multiple (not modeled precisely) +def : InstRW<[LSU, Lat30, GroupAlone], (instregex "(L|ST)AM(Y)?$")>; + +//===----------------------------------------------------------------------===// +// Program mask and addressing mode +//===----------------------------------------------------------------------===// + +// Insert Program Mask +def : InstRW<[FXU, Lat3, EndGroup], (instregex "IPM$")>; + +// Set Program Mask +def : InstRW<[LSU, EndGroup], (instregex "SPM$")>; + +// Branch and link +def : InstRW<[FXU, FXU, LSU, Lat8, GroupAlone], (instregex "BAL(R)?$")>; + +// Test addressing mode +def : InstRW<[FXU], (instregex "TAM$")>; + +// Set addressing mode +def : InstRW<[LSU, EndGroup], (instregex 
"SAM(24|31|64)$")>; + +// Branch (and save) and set mode. +def : InstRW<[FXU, LSU, Lat5, GroupAlone], (instregex "BSM$")>; +def : InstRW<[FXU, FXU, LSU, Lat6, GroupAlone], (instregex "BASSM$")>; + +//===----------------------------------------------------------------------===// +// Transactional execution +//===----------------------------------------------------------------------===// + +// Transaction begin +def : InstRW<[LSU, LSU, FXU, FXU, FXU, FXU, FXU, Lat15, GroupAlone], + (instregex "TBEGIN(C|_nofloat)?$")>; + +// Transaction end +def : InstRW<[LSU, GroupAlone], (instregex "TEND$")>; + +// Transaction abort +def : InstRW<[LSU, GroupAlone], (instregex "TABORT$")>; + +// Extract Transaction Nesting Depth +def : InstRW<[FXU], (instregex "ETND$")>; + +// Nontransactional store +def : InstRW<[FXU, LSU, Lat5], (instregex "NTSTG$")>; + +//===----------------------------------------------------------------------===// +// Processor assist +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "PPA$")>; + +//===----------------------------------------------------------------------===// +// Miscellaneous Instructions. 
+//===----------------------------------------------------------------------===// + +// Find leftmost one +def : InstRW<[FXU, Lat7, GroupAlone], (instregex "FLOGR$")>; + +// Population count +def : InstRW<[FXU, Lat3], (instregex "POPCNT$")>; + +// Extend +def : InstRW<[FXU], (instregex "AEXT128_64$")>; +def : InstRW<[FXU], (instregex "ZEXT128_(32|64)$")>; + +// String instructions +def : InstRW<[FXU, LSU, Lat30], (instregex "SRST$")>; + +// Move with key +def : InstRW<[LSU, Lat8, GroupAlone], (instregex "MVCK$")>; + +// Extract CPU Time +def : InstRW<[FXU, Lat5, LSU], (instregex "ECTG$")>; + +// Execute +def : InstRW<[LSU, GroupAlone], (instregex "EX(RL)?$")>; + +// Program return +def : InstRW<[FXU, Lat30], (instregex "PR$")>; + +// Inline assembly +def : InstRW<[FXU, LSU, LSU, Lat9, GroupAlone], (instregex "STCK(F)?$")>; +def : InstRW<[LSU, LSU, LSU, LSU, FXU, FXU, Lat20, GroupAlone], + (instregex "STCKE$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "STFLE$")>; +def : InstRW<[FXU, Lat30], (instregex "SVC$")>; + +// Store real address +def : InstRW<[FXU, LSU, Lat5], (instregex "STRAG$")>; + +//===----------------------------------------------------------------------===// +// .insn directive instructions +//===----------------------------------------------------------------------===// + +// An "empty" sched-class will be assigned instead of the "invalid sched-class". +// getNumDecoderSlots() will then return 1 instead of 0. 
+def : InstRW<[], (instregex "Insn.*")>; + + +// ----------------------------- Floating point ----------------------------- // + +//===----------------------------------------------------------------------===// +// FP: Select instructions +//===----------------------------------------------------------------------===// + +def : InstRW<[FXU], (instregex "SelectF(32|64|128)$")>; +def : InstRW<[FXU], (instregex "CondStoreF32(Inv)?$")>; +def : InstRW<[FXU], (instregex "CondStoreF64(Inv)?$")>; + +//===----------------------------------------------------------------------===// +// FP: Move instructions +//===----------------------------------------------------------------------===// + +// Load zero +def : InstRW<[FXU], (instregex "LZ(DR|ER)$")>; +def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LZXR$")>; + +// Load +def : InstRW<[FXU], (instregex "LER$")>; +def : InstRW<[FXU], (instregex "LD(R|R32|GR)$")>; +def : InstRW<[FXU, Lat3], (instregex "LGDR$")>; +def : InstRW<[FXU, FXU, Lat2, GroupAlone], (instregex "LXR$")>; + +// Load and Test +def : InstRW<[FPU], (instregex "LT(D|E)BR$")>; +def : InstRW<[FPU], (instregex "LTEBRCompare(_VecPseudo)?$")>; +def : InstRW<[FPU], (instregex "LTDBRCompare(_VecPseudo)?$")>; +def : InstRW<[FPU2, FPU2, Lat9, GroupAlone], (instregex "LTXBR$")>; +def : InstRW<[FPU2, FPU2, Lat9, GroupAlone], + (instregex "LTXBRCompare(_VecPseudo)?$")>; + +// Copy sign +def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRd(d|s)$")>; +def : InstRW<[FXU, FXU, Lat5, GroupAlone], (instregex "CPSDRs(d|s)$")>; + +//===----------------------------------------------------------------------===// +// FP: Load instructions +//===----------------------------------------------------------------------===// + +def : InstRW<[LSU], (instregex "LE(Y)?$")>; +def : InstRW<[LSU], (instregex "LD(Y|E32)?$")>; +def : InstRW<[LSU], (instregex "LX$")>; + +//===----------------------------------------------------------------------===// +// FP: Store instructions 
+//===----------------------------------------------------------------------===// + +def : InstRW<[FXU, LSU, Lat7], (instregex "STD(Y)?$")>; +def : InstRW<[FXU, LSU, Lat7], (instregex "STE(Y)?$")>; +def : InstRW<[FXU, LSU, Lat5], (instregex "STX$")>; + +//===----------------------------------------------------------------------===// +// FP: Conversion instructions +//===----------------------------------------------------------------------===// + +// Load rounded +def : InstRW<[FPU], (instregex "LEDBR(A)?$")>; +def : InstRW<[FPU, FPU, Lat20], (instregex "LEXBR(A)?$")>; +def : InstRW<[FPU, FPU, Lat20], (instregex "LDXBR(A)?$")>; + +// Load lengthened +def : InstRW<[FPU, LSU, Lat12], (instregex "LDEB$")>; +def : InstRW<[FPU], (instregex "LDEBR$")>; +def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "LX(D|E)B$")>; +def : InstRW<[FPU2, FPU2, Lat10, GroupAlone], (instregex "LX(D|E)BR$")>; + +// Convert from fixed / logical +def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CE(F|G)BR(A?)$")>; +def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CD(F|G)BR(A?)$")>; +def : InstRW<[FXU, FPU2, FPU2, Lat11, GroupAlone], (instregex "CX(F|G)BR(A?)$")>; +def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CEL(F|G)BR$")>; +def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CDL(F|G)BR$")>; +def : InstRW<[FXU, FPU2, FPU2, Lat11, GroupAlone], (instregex "CXL(F|G)BR$")>; + +// Convert to fixed / logical +def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CF(E|D)BR(A?)$")>; +def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CG(E|D)BR(A?)$")>; +def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "C(F|G)XBR(A?)$")>; +def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLF(E|D)BR$")>; +def : InstRW<[FXU, FPU, Lat11, GroupAlone], (instregex "CLG(E|D)BR$")>; +def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "CL(F|G)XBR$")>; + +//===----------------------------------------------------------------------===// +// FP: Unary arithmetic 
+//===----------------------------------------------------------------------===// + +// Load Complement / Negative / Positive +def : InstRW<[FPU], (instregex "L(C|N|P)DBR$")>; +def : InstRW<[FPU], (instregex "L(C|N|P)EBR$")>; +def : InstRW<[FXU], (instregex "LCDFR(_32)?$")>; +def : InstRW<[FXU], (instregex "LNDFR(_32)?$")>; +def : InstRW<[FXU], (instregex "LPDFR(_32)?$")>; +def : InstRW<[FPU2, FPU2, Lat9, GroupAlone], (instregex "L(C|N|P)XBR$")>; + +// Square root +def : InstRW<[FPU, LSU, Lat30], (instregex "SQ(E|D)B$")>; +def : InstRW<[FPU, Lat30], (instregex "SQ(E|D)BR$")>; +def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "SQXBR$")>; + +// Load FP integer +def : InstRW<[FPU], (instregex "FIEBR(A)?$")>; +def : InstRW<[FPU], (instregex "FIDBR(A)?$")>; +def : InstRW<[FPU2, FPU2, Lat15, GroupAlone], (instregex "FIXBR(A)?$")>; + +//===----------------------------------------------------------------------===// +// FP: Binary arithmetic +//===----------------------------------------------------------------------===// + +// Addition +def : InstRW<[FPU, LSU, Lat12], (instregex "A(E|D)B$")>; +def : InstRW<[FPU], (instregex "A(E|D)BR$")>; +def : InstRW<[FPU2, FPU2, Lat20, GroupAlone], (instregex "AXBR$")>; + +// Subtraction +def : InstRW<[FPU, LSU, Lat12], (instregex "S(E|D)B$")>; +def : InstRW<[FPU], (instregex "S(E|D)BR$")>; +def : InstRW<[FPU2, FPU2, Lat20, GroupAlone], (instregex "SXBR$")>; + +// Multiply +def : InstRW<[FPU, LSU, Lat12], (instregex "M(D|DE|EE)B$")>; +def : InstRW<[FPU], (instregex "M(D|DE|EE)BR$")>; +def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "MXDB$")>; +def : InstRW<[FPU2, FPU2, Lat10, GroupAlone], (instregex "MXDBR$")>; +def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "MXBR$")>; + +// Multiply and add / subtract +def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)EB$")>; +def : InstRW<[FPU, GroupAlone], (instregex "M(A|S)EBR$")>; +def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)DB$")>; +def 
: InstRW<[FPU, GroupAlone], (instregex "M(A|S)DBR$")>;
+
+// Division
+def : InstRW<[FPU, LSU, Lat30], (instregex "D(E|D)B$")>;
+def : InstRW<[FPU, Lat30], (instregex "D(E|D)BR$")>;
+def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "DXBR$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Comparisons
+//===----------------------------------------------------------------------===//
+
+// Compare
+def : InstRW<[FPU, LSU, Lat12], (instregex "C(E|D)B$")>;
+def : InstRW<[FPU], (instregex "C(E|D)BR$")>;
+def : InstRW<[FPU, FPU, Lat30], (instregex "CXBR$")>;
+
+// Test Data Class
+def : InstRW<[FPU, LSU, Lat15], (instregex "TC(E|D)B$")>;
+def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "TCXB$")>;
+
+//===----------------------------------------------------------------------===//
+// FP: Floating-point control register instructions
+//===----------------------------------------------------------------------===//
+
+def : InstRW<[FXU, LSU, Lat4, GroupAlone], (instregex "EFPC$")>;
+def : InstRW<[LSU, Lat3, GroupAlone], (instregex "SFPC$")>;
+def : InstRW<[LSU, LSU, Lat6, GroupAlone], (instregex "LFPC$")>;
+def : InstRW<[LSU, Lat3, GroupAlone], (instregex "STFPC$")>;
+def : InstRW<[FXU, Lat30, GroupAlone], (instregex "SFASR$")>;
+def : InstRW<[FXU, LSU, Lat30, GroupAlone], (instregex "LFAS$")>;
+def : InstRW<[FXU, Lat2, GroupAlone], (instregex "SRNM(B|T)?$")>;
+
+}
+
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZShortenInst.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZShortenInst.cpp
index 7f26a35..83882fc 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZShortenInst.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZShortenInst.cpp
@@ -29,7 +29,7 @@ public:
   static char ID;
   SystemZShortenInst(const SystemZTargetMachine &tm);
 
-  const char *getPassName() const override {
+  StringRef getPassName() const override {
     return "SystemZ Instruction Shortening";
   }
 
@@ -37,7 +37,7 @@ public:
   bool runOnMachineFunction(MachineFunction &F) override;
   MachineFunctionProperties getRequiredProperties() const override {
     return MachineFunctionProperties().set(
-        MachineFunctionProperties::Property::AllVRegsAllocated);
+        MachineFunctionProperties::Property::NoVRegs);
   }
 
 private:
@@ -275,7 +275,7 @@ bool SystemZShortenInst::runOnMachineFunction(MachineFunction &F) {
   const SystemZSubtarget &ST = F.getSubtarget<SystemZSubtarget>();
   TII = ST.getInstrInfo();
   TRI = ST.getRegisterInfo();
-  LiveRegs.init(TRI);
+  LiveRegs.init(*TRI);
 
   bool Changed = false;
   for (auto &MBB : F)
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZSubtarget.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZSubtarget.cpp
index 67d5e01..ce07ea3 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZSubtarget.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZSubtarget.cpp
@@ -39,10 +39,12 @@ SystemZSubtarget::SystemZSubtarget(const Triple &TT, const std::string &CPU,
       HasLoadStoreOnCond(false), HasHighWord(false), HasFPExtension(false),
       HasPopulationCount(false), HasFastSerialization(false),
       HasInterlockedAccess1(false), HasMiscellaneousExtensions(false),
+      HasExecutionHint(false), HasLoadAndTrap(false),
       HasTransactionalExecution(false), HasProcessorAssist(false),
-      HasVector(false), HasLoadStoreOnCond2(false), TargetTriple(TT),
-      InstrInfo(initializeSubtargetDependencies(CPU, FS)), TLInfo(TM, *this),
-      TSInfo(), FrameLowering() {}
+      HasVector(false), HasLoadStoreOnCond2(false),
+      HasLoadAndZeroRightmostByte(false),
+      TargetTriple(TT), InstrInfo(initializeSubtargetDependencies(CPU, FS)),
+      TLInfo(TM, *this), TSInfo(), FrameLowering() {}
 
 bool SystemZSubtarget::isPC32DBLSymbol(const GlobalValue *GV,
                                        CodeModel::Model CM) const {
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZSubtarget.h b/contrib/llvm/lib/Target/SystemZ/SystemZSubtarget.h
index 6007f6f..cdb6132 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZSubtarget.h
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZSubtarget.h
@@ -42,10 +42,13 @@ protected:
   bool HasFastSerialization;
   bool HasInterlockedAccess1;
   bool HasMiscellaneousExtensions;
+  bool HasExecutionHint;
+  bool HasLoadAndTrap;
   bool HasTransactionalExecution;
   bool HasProcessorAssist;
   bool HasVector;
   bool HasLoadStoreOnCond2;
+  bool HasLoadAndZeroRightmostByte;
 
 private:
   Triple TargetTriple;
@@ -77,6 +80,9 @@ public:
   // This is important for reducing register pressure in vector code.
   bool useAA() const override { return true; }
 
+  // Always enable the early if-conversion pass.
+  bool enableEarlyIfConversion() const override { return true; }
+
   // Automatically generated by tblgen.
   void ParseSubtargetFeatures(StringRef CPU, StringRef FS);
 
@@ -109,12 +115,23 @@ public:
     return HasMiscellaneousExtensions;
   }
 
+  // Return true if the target has the execution-hint facility.
+  bool hasExecutionHint() const { return HasExecutionHint; }
+
+  // Return true if the target has the load-and-trap facility.
+  bool hasLoadAndTrap() const { return HasLoadAndTrap; }
+
   // Return true if the target has the transactional-execution facility.
   bool hasTransactionalExecution() const { return HasTransactionalExecution; }
 
   // Return true if the target has the processor-assist facility.
   bool hasProcessorAssist() const { return HasProcessorAssist; }
 
+  // Return true if the target has the load-and-zero-rightmost-byte facility.
+  bool hasLoadAndZeroRightmostByte() const {
+    return HasLoadAndZeroRightmostByte;
+  }
+
   // Return true if the target has the vector facility.
   bool hasVector() const { return HasVector; }
 
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZTargetMachine.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZTargetMachine.cpp
index 85a3f6f..33fdb8f 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZTargetMachine.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZTargetMachine.cpp
@@ -9,6 +9,7 @@
 
 #include "SystemZTargetMachine.h"
 #include "SystemZTargetTransformInfo.h"
+#include "SystemZMachineScheduler.h"
 #include "llvm/CodeGen/Passes.h"
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/Support/TargetRegistry.h"
@@ -17,10 +18,9 @@
 
 using namespace llvm;
 
-extern cl::opt<bool> MISchedPostRA;
 extern "C" void LLVMInitializeSystemZTarget() {
   // Register the target.
-  RegisterTargetMachine<SystemZTargetMachine> X(TheSystemZTarget);
+  RegisterTargetMachine<SystemZTargetMachine> X(getTheSystemZTarget());
 }
 
 // Determine whether we use the vector ABI.
@@ -114,8 +114,15 @@ public:
     return getTM<SystemZTargetMachine>();
   }
 
+  ScheduleDAGInstrs *
+  createPostMachineScheduler(MachineSchedContext *C) const override {
+    return new ScheduleDAGMI(C, make_unique<SystemZPostRASchedStrategy>(C),
+                             /*RemoveKillFlags=*/true);
+  }
+
   void addIRPasses() override;
   bool addInstSelector() override;
+  bool addILPOpts() override;
   void addPreSched2() override;
   void addPreEmitPass() override;
 };
@@ -137,7 +144,14 @@ bool SystemZPassConfig::addInstSelector() {
   return false;
 }
 
+bool SystemZPassConfig::addILPOpts() {
+  addPass(&EarlyIfConverterID);
+  return true;
+}
+
 void SystemZPassConfig::addPreSched2() {
+  addPass(createSystemZExpandPseudoPass(getSystemZTargetMachine()));
+
   if (getOptLevel() != CodeGenOpt::None)
     addPass(&IfConverterID);
 }
@@ -180,12 +194,8 @@ void SystemZPassConfig::addPreEmitPass() {
   // Do final scheduling after all other optimizations, to get an
   // optimal input for the decoder (branch relaxation must happen
   // after block placement).
-  if (getOptLevel() != CodeGenOpt::None) {
-    if (MISchedPostRA)
-      addPass(&PostMachineSchedulerID);
-    else
-      addPass(&PostRASchedulerID);
-  }
+  if (getOptLevel() != CodeGenOpt::None)
+    addPass(&PostMachineSchedulerID);
 }
 
 TargetPassConfig *SystemZTargetMachine::createPassConfig(PassManagerBase &PM) {
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp b/contrib/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
index 5ff5b21..b10c0e0 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.cpp
@@ -238,6 +238,63 @@ SystemZTTIImpl::getPopcntSupport(unsigned TyWidth) {
   return TTI::PSK_Software;
 }
 
+void SystemZTTIImpl::getUnrollingPreferences(Loop *L,
+                                             TTI::UnrollingPreferences &UP) {
+  // Find out if L contains a call, what the machine instruction count
+  // estimate is, and how many stores there are.
+  bool HasCall = false;
+  unsigned NumStores = 0;
+  for (auto &BB : L->blocks())
+    for (auto &I : *BB) {
+      if (isa<CallInst>(&I) || isa<InvokeInst>(&I)) {
+        ImmutableCallSite CS(&I);
+        if (const Function *F = CS.getCalledFunction()) {
+          if (isLoweredToCall(F))
+            HasCall = true;
+          if (F->getIntrinsicID() == Intrinsic::memcpy ||
+              F->getIntrinsicID() == Intrinsic::memset)
+            NumStores++;
+        } else { // indirect call.
+          HasCall = true;
+        }
+      }
+      if (isa<StoreInst>(&I)) {
+        NumStores++;
+        Type *MemAccessTy = I.getOperand(0)->getType();
+        if((MemAccessTy->isIntegerTy() || MemAccessTy->isFloatingPointTy()) &&
+           (getDataLayout().getTypeSizeInBits(MemAccessTy) == 128))
+          NumStores++; // 128 bit fp/int stores get split.
+      }
+    }
+
+  // The z13 processor will run out of store tags if too many stores
+  // are fed into it too quickly. Therefore make sure there are not
+  // too many stores in the resulting unrolled loop.
+  unsigned const Max = (NumStores ? (12 / NumStores) : UINT_MAX);
+
+  if (HasCall) {
+    // Only allow full unrolling if loop has any calls.
+    UP.FullUnrollMaxCount = Max;
+    UP.MaxCount = 1;
+    return;
+  }
+
+  UP.MaxCount = Max;
+  if (UP.MaxCount <= 1)
+    return;
+
+  // Allow partial and runtime trip count unrolling.
+  UP.Partial = UP.Runtime = true;
+
+  UP.PartialThreshold = 75;
+  UP.DefaultUnrollRuntimeCount = 4;
+
+  // Allow expensive instructions in the pre-header of the loop.
+  UP.AllowExpensiveTripCount = true;
+
+  UP.Force = true;
+}
+
 unsigned SystemZTTIImpl::getNumberOfRegisters(bool Vector) {
   if (!Vector)
     // Discount the stack pointer. Also leave out %r0, since it can't
diff --git a/contrib/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h b/contrib/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h
index 9ae736d..f7d2d82 100644
--- a/contrib/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h
+++ b/contrib/llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h
@@ -32,13 +32,6 @@ public:
       : BaseT(TM, F.getParent()->getDataLayout()), ST(TM->getSubtargetImpl(F)),
         TLI(ST->getTargetLowering()) {}
 
-  // Provide value semantics. MSVC requires that we spell all of these out.
-  SystemZTTIImpl(const SystemZTTIImpl &Arg)
-      : BaseT(static_cast<const BaseT &>(Arg)), ST(Arg.ST), TLI(Arg.TLI) {}
-  SystemZTTIImpl(SystemZTTIImpl &&Arg)
-      : BaseT(std::move(static_cast<BaseT &>(Arg))), ST(std::move(Arg.ST)),
-        TLI(std::move(Arg.TLI)) {}
-
   /// \name Scalar TTI Implementations
   /// @{
 
@@ -50,6 +43,8 @@ public:
 
   TTI::PopcntSupportKind getPopcntSupport(unsigned TyWidth);
 
+  void getUnrollingPreferences(Loop *L, TTI::UnrollingPreferences &UP);
+
   /// @}
 
   /// \name Vector TTI Implementations
diff --git a/contrib/llvm/lib/Target/SystemZ/TargetInfo/SystemZTargetInfo.cpp b/contrib/llvm/lib/Target/SystemZ/TargetInfo/SystemZTargetInfo.cpp
index 8f9aa28..d3c53a4 100644
--- a/contrib/llvm/lib/Target/SystemZ/TargetInfo/SystemZTargetInfo.cpp
+++ b/contrib/llvm/lib/Target/SystemZ/TargetInfo/SystemZTargetInfo.cpp
@@ -12,9 +12,12 @@
 
 using namespace llvm;
 
-Target llvm::TheSystemZTarget;
+Target &llvm::getTheSystemZTarget() {
+  static Target TheSystemZTarget;
+  return TheSystemZTarget;
+}
 
 extern "C" void LLVMInitializeSystemZTargetInfo() {
-  RegisterTarget<Triple::systemz, /*HasJIT=*/true>
-    X(TheSystemZTarget, "systemz", "SystemZ");
+  RegisterTarget<Triple::systemz, /*HasJIT=*/true> X(getTheSystemZTarget(),
+                                                     "systemz", "SystemZ");
 }
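
The store-tag cap computed by `getUnrollingPreferences` in the SystemZTargetTransformInfo.cpp hunk can be looked at in isolation. What follows is a standalone sketch, not the LLVM code: `StoreBudget`, `storeWeight`, and `maxUnrollCount` are hypothetical names introduced here, and only the constant 12, the double counting of 128-bit scalar stores, and the `UINT_MAX` fallback are taken from the diff.

```cpp
#include <climits>

// Assumption: the name StoreBudget is illustrative; the diff hard-codes 12.
constexpr unsigned StoreBudget = 12;

// How much one store adds to the count in the diff's loop: every store
// counts once, and a 128-bit integer or floating-point store counts twice
// because it gets split.
constexpr unsigned storeWeight(unsigned SizeInBits, bool IsScalarIntOrFP) {
  return (IsScalarIntOrFP && SizeInBits == 128) ? 2u : 1u;
}

// Mirrors `NumStores ? (12 / NumStores) : UINT_MAX` from the diff:
// a loop without stores is not capped, otherwise the unroll count is the
// store budget divided (integer division) by the stores per iteration.
constexpr unsigned maxUnrollCount(unsigned NumStores) {
  return NumStores ? (StoreBudget / NumStores) : UINT_MAX;
}
```

With five counted stores per iteration, `maxUnrollCount(5)` is 2. With 13 or more, integer division gives 0, which is the case the `UP.MaxCount <= 1` early return in the diff handles by leaving partial and runtime unrolling disabled.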