author     dim <dim@FreeBSD.org>  2017-04-02 17:24:58 +0000
committer  dim <dim@FreeBSD.org>  2017-04-02 17:24:58 +0000
commit     60b571e49a90d38697b3aca23020d9da42fc7d7f
tree       99351324c24d6cb146b6285b6caffa4d26fce188 /contrib/llvm/tools/clang/lib/Headers/intrin.h
parent     bea1b22c7a9bce1dfdd73e6e5b65bc4752215180
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release:
MFC r309142 (by emaste):
Add WITH_LLD_AS_LD build knob
If set, it installs LLD as /usr/bin/ld. LLD (as of version 3.9) is not
capable of linking the world and kernel, but can self-host and link many
substantial applications. GNU ld continues to be used for the world and
kernel build, regardless of how this knob is set.
It is on by default for arm64, and off for all other CPU architectures.
Sponsored by: The FreeBSD Foundation
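For reference, the knob is a regular src.conf(5) option; a hedged sketch of
opting in on an architecture where it is not the default (the "=yes" value is
the conventional spelling):

    # /etc/src.conf -- illustrative example, see src.conf(5)
    # Install LLD as /usr/bin/ld during installworld.
    WITH_LLD_AS_LD=yes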
MFC r310840:
Reapply r310775; now it also builds correctly if lldb is disabled:
Move llvm-objdump from CLANG_EXTRAS to installed by default
We currently install three tools from binutils 2.17.50: as, ld, and
objdump. Work is underway to migrate to a permissively-licensed
toolchain, with one goal being the retirement of binutils 2.17.50.
LLVM's llvm-objdump is intended to be compatible with GNU objdump
although it is currently missing some options and may have formatting
differences. Enable it by default for testing and further investigation.
It may later be changed to install as /usr/bin/objdump once it becomes
a fully viable replacement.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D8879
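As a rough illustration of the intended drop-in compatibility (hello.o is a
placeholder object file, not something from this commit), the common
disassembly invocation is spelled the same for both tools:

    # GNU binutils 2.17.50 objdump
    objdump -d hello.o
    # LLVM's replacement, installed by default after this change
    llvm-objdump -d hello.o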
MFC r312855 (by emaste):
Rename LLD_AS_LD to LLD_IS_LD, for consistency with CLANG_IS_CC
Reported by: Dan McGregor <dan.mcgregor usask.ca>
MFC r313559 (by glebius):
Don't check struct rtentry on FreeBSD; it is an internal kernel structure.
On other systems it may be an API structure for SIOCADDRT/SIOCDELRT.
Reviewed by: emaste, dim
MFC r314152 (by jkim):
Remove an assembler flag, which has been redundant since r309124. Upstream
took care of it by introducing the NO_EXEC_STACK_DIRECTIVE macro.
http://llvm.org/viewvc/llvm-project?rev=273500&view=rev
Reviewed by: dim
MFC r314564:
Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to
4.0.0 (branches/release_40 296509). The release will follow soon.
Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
support to build; see UPDATING for more information.
Also note that as of 4.0.0, lld should be able to link the base system
on amd64 and aarch64. See the WITH_LLD_IS_LD setting in src.conf(5),
but please be aware that this is still a work in progress.
Release notes for llvm, clang and lld will be available here:
<http://releases.llvm.org/4.0.0/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/clang/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/lld/docs/ReleaseNotes.html>
Thanks to Ed Maste, Jan Beich, Antoine Brodin and Eric Fiselier for
their help.
Relnotes: yes
Exp-run: antoine
PR: 215969, 216008
MFC r314708:
For now, revert r287232 from upstream llvm trunk (by Daniil Fukalov):
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ levels on a fairly big unrolled
loop) and runs for an almost unbounded time.
Added a cache of "equal" SCEV pairs to cut off further estimation earlier.
A recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
This commit is the cause of excessive compile times on skein_block.c
(and possibly other files) during kernel builds on amd64.
We never saw the problematic behavior described in this upstream commit,
so for now it is better to revert it. An upstream bug has been filed
here: https://bugs.llvm.org/show_bug.cgi?id=32142
Reported by: mjg
MFC r314795:
Reapply r287232 from upstream llvm trunk (by Daniil Fukalov):
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ levels on a fairly big unrolled
loop) and runs for an almost unbounded time.
Added a cache of "equal" SCEV pairs to cut off further estimation earlier.
A recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
Pull in r296992 from upstream llvm trunk (by Sanjoy Das):
[SCEV] Decrease the recursion threshold for CompareValueComplexity
Fixes PR32142.
r287232 accidentally increased the recursion threshold for
CompareValueComplexity from 2 to 32. This change reverses that
change by introducing a separate flag for CompareValueComplexity's
threshold.
The latter revision fixes the excessive compile times for skein_block.c.
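Taken together, the two upstream changes bound a recursive structural
comparison with an explicit depth budget (plus a cache of already-proven-equal
pairs, omitted here for brevity). A minimal C sketch of that general pattern,
with an invented node type and cutoff constant -- this is not LLVM's actual
SCEV code:

    /* Hypothetical expression node; real SCEVs are much richer. */
    struct node {
        int kind;
        struct node *lhs, *rhs;
    };

    #define MAX_COMPARE_DEPTH 32  /* illustrative knob, like D26389's parameter */

    /* Comparator that gives up and reports "equal" once the depth budget
     * is spent, so a deeply unrolled loop cannot drive compile time
     * without bound. */
    static int compare_bounded(const struct node *a, const struct node *b,
                               unsigned depth)
    {
        if (a == b)
            return 0;
        if (depth >= MAX_COMPARE_DEPTH)
            return 0;                  /* cutoff: conservatively "equal" */
        if (a->kind != b->kind)
            return a->kind < b->kind ? -1 : 1;
        if (a->lhs && b->lhs) {
            int c = compare_bounded(a->lhs, b->lhs, depth + 1);
            if (c != 0)
                return c;
        }
        if (a->rhs && b->rhs)
            return compare_bounded(a->rhs, b->rhs, depth + 1);
        return 0;
    }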
MFC r314907 (by mmel):
Unbreak ARMv6 world.
The new compiler-rt library imported with clang 4.0.0 has several fatal
issues (a non-functional __udivsi3, for example) in its ARM-specific
intrinsic functions. As a temporary workaround, until upstream solves
these problems, disable all Thumb-1/Thumb-2 related features.
MFC r315016:
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release.
We were already very close to the last release candidate, so this is a
pretty minor update.
Relnotes: yes
MFC r316005:
Revert r314907, and pull in r298713 from upstream compiler-rt trunk (by
Weiming Zhao):
builtins: Select correct code fragments when compiling for Thumb1/Thumb2/ARM ISA.
Summary:
The value of __ARM_ARCH_ISA_THUMB isn't based on the actual compilation
mode (-mthumb, -marm); it reflects the capability of the given CPU.
Due to this:
- use __thumb__ and __thumb2__ instead of __ARM_ARCH_ISA_THUMB
- use '.thumb' directive consistently in all affected files
- decorate all thumb functions using
DEFINE_COMPILERRT_THUMB_FUNCTION()
---------
Note: This patch doesn't fix the broken Thumb1 variant of __udivsi3!
Reviewers: weimingz, rengolin, compnerd
Subscribers: aemerson, dim
Differential Revision: https://reviews.llvm.org/D30938
Discussed with: mmel
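For readers unfamiliar with the macros involved: __ARM_ARCH_ISA_THUMB is
fixed by the target CPU's capability, while __thumb__ and __thumb2__ are only
predefined when the translation unit is actually compiled in Thumb mode. A
small probe illustrating the distinction (the pragma messages are invented;
the macros themselves are the standard predefined ones):

    /* Compile with -marm vs. -mthumb: __ARM_ARCH_ISA_THUMB stays the same
     * for a given CPU, while __thumb__/__thumb2__ track the actual mode,
     * which is what assembly selection in compiler-rt must key on. */
    #if defined(__thumb2__)
    # pragma message "generating Thumb-2 code"
    #elif defined(__thumb__)
    # pragma message "generating Thumb-1 code"
    #else
    # pragma message "generating ARM code"
    #endif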
Diffstat (limited to 'contrib/llvm/tools/clang/lib/Headers/intrin.h')
-rw-r--r--  contrib/llvm/tools/clang/lib/Headers/intrin.h | 772
1 file changed, 418 insertions, 354 deletions
diff --git a/contrib/llvm/tools/clang/lib/Headers/intrin.h b/contrib/llvm/tools/clang/lib/Headers/intrin.h
index f18711a..a35262a 100644
--- a/contrib/llvm/tools/clang/lib/Headers/intrin.h
+++ b/contrib/llvm/tools/clang/lib/Headers/intrin.h
@@ -34,6 +34,10 @@
 #include <x86intrin.h>
 #endif
 
+#if defined(__arm__)
+#include <armintr.h>
+#endif
+
 /* For the definition of jmp_buf. */
 #if __STDC_HOSTED__
 #include <setjmp.h>
@@ -61,8 +65,9 @@
 static __inline__
 void __cpuid(int[4], int);
 static __inline__
 void __cpuidex(int[4], int, int);
-void __debugbreak(void);
+static __inline__ __int64 __emul(int, int);
+static __inline__ unsigned __int64 __emulu(unsigned int, unsigned int);
 void __cdecl __fastfail(unsigned int);
 unsigned int __getcallerseflags(void);
@@ -93,6 +98,7 @@
 static __inline__
 void __movsd(unsigned long *, unsigned long const *, size_t);
 static __inline__
 void __movsw(unsigned short *, unsigned short const *, size_t);
+static __inline__ void __nop(void);
 void __nvreg_restore_fence(void);
 void __nvreg_save_fence(void);
@@ -102,10 +108,6 @@ void __outdword(unsigned short, unsigned long);
 void __outdwordstring(unsigned short, unsigned long *, unsigned long);
 void __outword(unsigned short, unsigned short);
 void __outwordstring(unsigned short, unsigned short *, unsigned long);
-static __inline__
-unsigned int __popcnt(unsigned int);
-static __inline__
-unsigned short __popcnt16(unsigned short);
 unsigned long __readcr0(void);
 unsigned long __readcr2(void);
 static __inline__
@@ -117,8 +119,6 @@ unsigned int __readdr(unsigned int);
 static __inline__
 unsigned char __readfsbyte(unsigned long);
 static __inline__
-unsigned long __readfsdword(unsigned long);
-static __inline__
 unsigned __int64 __readfsqword(unsigned long);
 static __inline__
 unsigned short __readfsword(unsigned long);
@@ -172,106 +172,34 @@ static __inline__
 unsigned char _bittestandreset(long *, long);
 static __inline__
 unsigned char _bittestandset(long *, long);
-unsigned __int64 __cdecl _byteswap_uint64(unsigned __int64);
-unsigned long __cdecl _byteswap_ulong(unsigned long);
-unsigned short __cdecl _byteswap_ushort(unsigned short);
 void __cdecl _disable(void);
 void __cdecl _enable(void);
 long _InterlockedAddLargeStatistic(__int64 volatile *_Addend, long _Value);
-static __inline__
-long _InterlockedAnd(long volatile *_Value, long _Mask);
-static __inline__
-short _InterlockedAnd16(short volatile *_Value, short _Mask);
-static __inline__
-char _InterlockedAnd8(char volatile *_Value, char _Mask);
 unsigned char _interlockedbittestandreset(long volatile *, long);
 static __inline__
 unsigned char _interlockedbittestandset(long volatile *, long);
-static __inline__
-long __cdecl _InterlockedCompareExchange(long volatile *_Destination,
-                                         long _Exchange, long _Comparand);
 long _InterlockedCompareExchange_HLEAcquire(long volatile *, long, long);
 long _InterlockedCompareExchange_HLERelease(long volatile *, long, long);
-static __inline__
-short _InterlockedCompareExchange16(short volatile *_Destination,
-                                    short _Exchange, short _Comparand);
-static __inline__
-__int64 _InterlockedCompareExchange64(__int64 volatile *_Destination,
-                                      __int64 _Exchange, __int64 _Comparand);
 __int64 _InterlockedcompareExchange64_HLEAcquire(__int64 volatile *, __int64,
                                                  __int64);
 __int64 _InterlockedCompareExchange64_HLERelease(__int64 volatile *, __int64,
                                                  __int64);
-static __inline__
-char _InterlockedCompareExchange8(char volatile *_Destination, char _Exchange,
-                                  char _Comparand);
 void *_InterlockedCompareExchangePointer_HLEAcquire(void *volatile *, void *,
                                                     void *);
 void *_InterlockedCompareExchangePointer_HLERelease(void *volatile *, void *,
                                                     void *);
-static __inline__
-long __cdecl _InterlockedDecrement(long volatile *_Addend);
-static __inline__
-short _InterlockedDecrement16(short volatile *_Addend);
-long _InterlockedExchange(long volatile *_Target, long _Value);
-static __inline__
-short _InterlockedExchange16(short volatile *_Target, short _Value);
-static __inline__
-char _InterlockedExchange8(char volatile *_Target, char _Value);
-static __inline__
-long __cdecl _InterlockedExchangeAdd(long volatile *_Addend, long _Value);
 long _InterlockedExchangeAdd_HLEAcquire(long volatile *, long);
 long _InterlockedExchangeAdd_HLERelease(long volatile *, long);
-static __inline__
-short _InterlockedExchangeAdd16(short volatile *_Addend, short _Value);
 __int64 _InterlockedExchangeAdd64_HLEAcquire(__int64 volatile *, __int64);
 __int64 _InterlockedExchangeAdd64_HLERelease(__int64 volatile *, __int64);
-static __inline__
-char _InterlockedExchangeAdd8(char volatile *_Addend, char _Value);
-static __inline__
-long __cdecl _InterlockedIncrement(long volatile *_Addend);
-static __inline__
-short _InterlockedIncrement16(short volatile *_Addend);
-static __inline__
-long _InterlockedOr(long volatile *_Value, long _Mask);
-static __inline__
-short _InterlockedOr16(short volatile *_Value, short _Mask);
-static __inline__
-char _InterlockedOr8(char volatile *_Value, char _Mask);
-static __inline__
-long _InterlockedXor(long volatile *_Value, long _Mask);
-static __inline__
-short _InterlockedXor16(short volatile *_Value, short _Mask);
-static __inline__
-char _InterlockedXor8(char volatile *_Value, char _Mask);
 void __cdecl _invpcid(unsigned int, void *);
-static __inline__
-unsigned long __cdecl _lrotl(unsigned long, int);
-static __inline__
-unsigned long __cdecl _lrotr(unsigned long, int);
-static __inline__
-void _ReadBarrier(void);
-static __inline__
-void _ReadWriteBarrier(void);
-static __inline__
-void *_ReturnAddress(void);
+static __inline__ void
+__attribute__((__deprecated__("use other intrinsics or C++11 atomics instead")))
+_ReadBarrier(void);
+static __inline__ void
+__attribute__((__deprecated__("use other intrinsics or C++11 atomics instead")))
+_ReadWriteBarrier(void);
 unsigned int _rorx_u32(unsigned int, const unsigned int);
-static __inline__
-unsigned int __cdecl _rotl(unsigned int _Value, int _Shift);
-static __inline__
-unsigned short _rotl16(unsigned short _Value, unsigned char _Shift);
-static __inline__
-unsigned __int64 __cdecl _rotl64(unsigned __int64 _Value, int _Shift);
-static __inline__
-unsigned char _rotl8(unsigned char _Value, unsigned char _Shift);
-static __inline__
-unsigned int __cdecl _rotr(unsigned int _Value, int _Shift);
-static __inline__
-unsigned short _rotr16(unsigned short _Value, unsigned char _Shift);
-static __inline__
-unsigned __int64 __cdecl _rotr64(unsigned __int64 _Value, int _Shift);
-static __inline__
-unsigned char _rotr8(unsigned char _Value, unsigned char _Shift);
 int _sarx_i32(int, unsigned int);
 #if __STDC_HOSTED__
 int __cdecl _setjmp(jmp_buf);
@@ -281,8 +209,9 @@ unsigned int _shrx_u32(unsigned int, unsigned int);
 void _Store_HLERelease(long volatile *, long);
 void _Store64_HLERelease(__int64 volatile *, __int64);
 void _StorePointer_HLERelease(void *volatile *, void *);
-static __inline__
-void _WriteBarrier(void);
+static __inline__ void
+__attribute__((__deprecated__("use other intrinsics or C++11 atomics instead")))
+_WriteBarrier(void);
 unsigned __int32 xbegin(void);
 void _xend(void);
 static __inline__
@@ -307,9 +236,6 @@ void __lwpval64(unsigned __int64, unsigned int, unsigned int);
 unsigned __int64 __lzcnt64(unsigned __int64);
 static __inline__
 void __movsq(unsigned long long *, unsigned long long const *, size_t);
-__int64 __mulh(__int64, __int64);
-static __inline__
-unsigned __int64 __popcnt64(unsigned __int64);
 static __inline__
 unsigned char __readgsbyte(unsigned long);
 static __inline__
@@ -348,7 +274,6 @@ static __inline__
 unsigned char _bittestandreset64(__int64 *, __int64);
 static __inline__
 unsigned char _bittestandset64(__int64 *, __int64);
-unsigned __int64 __cdecl _byteswap_uint64(unsigned __int64);
 long _InterlockedAnd_np(long volatile *_Value, long _Mask);
 short _InterlockedAnd16_np(short volatile *_Value, short _Mask);
 __int64 _InterlockedAnd64_np(__int64 volatile *_Value, __int64 _Mask);
@@ -374,154 +299,58 @@ __int64 _InterlockedCompareExchange64_HLERelease(__int64 volatile *, __int64,
                                                  __int64);
 __int64 _InterlockedCompareExchange64_np(__int64 volatile *_Destination,
                                          __int64 _Exchange, __int64 _Comparand);
-void *_InterlockedCompareExchangePointer(void *volatile *_Destination,
-                                         void *_Exchange, void *_Comparand);
 void *_InterlockedCompareExchangePointer_np(void *volatile *_Destination,
                                             void *_Exchange, void *_Comparand);
-static __inline__
-__int64 _InterlockedDecrement64(__int64 volatile *_Addend);
-static __inline__
-__int64 _InterlockedExchange64(__int64 volatile *_Target, __int64 _Value);
-static __inline__
-__int64 _InterlockedExchangeAdd64(__int64 volatile *_Addend, __int64 _Value);
-void *_InterlockedExchangePointer(void *volatile *_Target, void *_Value);
-static __inline__
-__int64 _InterlockedIncrement64(__int64 volatile *_Addend);
 long _InterlockedOr_np(long volatile *_Value, long _Mask);
 short _InterlockedOr16_np(short volatile *_Value, short _Mask);
-static __inline__
-__int64 _InterlockedOr64(__int64 volatile *_Value, __int64 _Mask);
 __int64 _InterlockedOr64_np(__int64 volatile *_Value, __int64 _Mask);
 char _InterlockedOr8_np(char volatile *_Value, char _Mask);
 long _InterlockedXor_np(long volatile *_Value, long _Mask);
 short _InterlockedXor16_np(short volatile *_Value, short _Mask);
-static __inline__
-__int64 _InterlockedXor64(__int64 volatile *_Value, __int64 _Mask);
 __int64 _InterlockedXor64_np(__int64 volatile *_Value, __int64 _Mask);
 char _InterlockedXor8_np(char volatile *_Value, char _Mask);
-static __inline__
-__int64 _mul128(__int64 _Multiplier, __int64 _Multiplicand,
-                __int64 *_HighProduct);
 unsigned __int64 _rorx_u64(unsigned __int64, const unsigned int);
 __int64 _sarx_i64(__int64, unsigned int);
-#if __STDC_HOSTED__
-int __cdecl _setjmpex(jmp_buf);
-#endif
 unsigned __int64 _shlx_u64(unsigned __int64, unsigned int);
 unsigned __int64 _shrx_u64(unsigned __int64, unsigned int);
-/*
- * Multiply two 64-bit integers and obtain a 64-bit result.
- * The low-half is returned directly and the high half is in an out parameter.
- */
-static __inline__ unsigned __int64 __DEFAULT_FN_ATTRS
-_umul128(unsigned __int64 _Multiplier, unsigned __int64 _Multiplicand,
-         unsigned __int64 *_HighProduct) {
-  unsigned __int128 _FullProduct =
-      (unsigned __int128)_Multiplier * (unsigned __int128)_Multiplicand;
-  *_HighProduct = _FullProduct >> 64;
-  return _FullProduct;
-}
-static __inline__ unsigned __int64 __DEFAULT_FN_ATTRS
-__umulh(unsigned __int64 _Multiplier, unsigned __int64 _Multiplicand) {
-  unsigned __int128 _FullProduct =
-      (unsigned __int128)_Multiplier * (unsigned __int128)_Multiplicand;
-  return _FullProduct >> 64;
-}
+static __inline__
+__int64 __mulh(__int64, __int64);
+static __inline__
+unsigned __int64 __umulh(unsigned __int64, unsigned __int64);
+static __inline__
+__int64 _mul128(__int64, __int64, __int64*);
+static __inline__
+unsigned __int64 _umul128(unsigned __int64,
+                          unsigned __int64,
+                          unsigned __int64*);
 
 #endif /* __x86_64__ */
 
-/*----------------------------------------------------------------------------*\
-|* Multiplication
-\*----------------------------------------------------------------------------*/
-static __inline__ __int64 __DEFAULT_FN_ATTRS
-__emul(int __in1, int __in2) {
-  return (__int64)__in1 * (__int64)__in2;
-}
-static __inline__ unsigned __int64 __DEFAULT_FN_ATTRS
-__emulu(unsigned int __in1, unsigned int __in2) {
-  return (unsigned __int64)__in1 * (unsigned __int64)__in2;
-}
-/*----------------------------------------------------------------------------*\
-|* Bit Twiddling
-\*----------------------------------------------------------------------------*/
-static __inline__ unsigned char __DEFAULT_FN_ATTRS
-_rotl8(unsigned char _Value, unsigned char _Shift) {
-  _Shift &= 0x7;
-  return _Shift ? (_Value << _Shift) | (_Value >> (8 - _Shift)) : _Value;
-}
-static __inline__ unsigned char __DEFAULT_FN_ATTRS
-_rotr8(unsigned char _Value, unsigned char _Shift) {
-  _Shift &= 0x7;
-  return _Shift ? (_Value >> _Shift) | (_Value << (8 - _Shift)) : _Value;
-}
-static __inline__ unsigned short __DEFAULT_FN_ATTRS
-_rotl16(unsigned short _Value, unsigned char _Shift) {
-  _Shift &= 0xf;
-  return _Shift ? (_Value << _Shift) | (_Value >> (16 - _Shift)) : _Value;
-}
-static __inline__ unsigned short __DEFAULT_FN_ATTRS
-_rotr16(unsigned short _Value, unsigned char _Shift) {
-  _Shift &= 0xf;
-  return _Shift ? (_Value >> _Shift) | (_Value << (16 - _Shift)) : _Value;
-}
-static __inline__ unsigned int __DEFAULT_FN_ATTRS
-_rotl(unsigned int _Value, int _Shift) {
-  _Shift &= 0x1f;
-  return _Shift ? (_Value << _Shift) | (_Value >> (32 - _Shift)) : _Value;
-}
-static __inline__ unsigned int __DEFAULT_FN_ATTRS
-_rotr(unsigned int _Value, int _Shift) {
-  _Shift &= 0x1f;
-  return _Shift ? (_Value >> _Shift) | (_Value << (32 - _Shift)) : _Value;
-}
-static __inline__ unsigned long __DEFAULT_FN_ATTRS
-_lrotl(unsigned long _Value, int _Shift) {
-  _Shift &= 0x1f;
-  return _Shift ? (_Value << _Shift) | (_Value >> (32 - _Shift)) : _Value;
-}
-static __inline__ unsigned long __DEFAULT_FN_ATTRS
-_lrotr(unsigned long _Value, int _Shift) {
-  _Shift &= 0x1f;
-  return _Shift ? (_Value >> _Shift) | (_Value << (32 - _Shift)) : _Value;
-}
-static
-__inline__ unsigned __int64 __DEFAULT_FN_ATTRS
-_rotl64(unsigned __int64 _Value, int _Shift) {
-  _Shift &= 0x3f;
-  return _Shift ? (_Value << _Shift) | (_Value >> (64 - _Shift)) : _Value;
-}
-static
-__inline__ unsigned __int64 __DEFAULT_FN_ATTRS
-_rotr64(unsigned __int64 _Value, int _Shift) {
-  _Shift &= 0x3f;
-  return _Shift ? (_Value >> _Shift) | (_Value << (64 - _Shift)) : _Value;
-}
+#if defined(__x86_64__) || defined(__arm__)
+
+static __inline__
+__int64 _InterlockedDecrement64(__int64 volatile *_Addend);
+static __inline__
+__int64 _InterlockedExchange64(__int64 volatile *_Target, __int64 _Value);
+static __inline__
+__int64 _InterlockedExchangeAdd64(__int64 volatile *_Addend, __int64 _Value);
+static __inline__
+__int64 _InterlockedExchangeSub64(__int64 volatile *_Subend, __int64 _Value);
+static __inline__
+__int64 _InterlockedIncrement64(__int64 volatile *_Addend);
+static __inline__
+__int64 _InterlockedOr64(__int64 volatile *_Value, __int64 _Mask);
+static __inline__
+__int64 _InterlockedXor64(__int64 volatile *_Value, __int64 _Mask);
+static __inline__
+__int64 _InterlockedAnd64(__int64 volatile *_Value, __int64 _Mask);
+
+#endif
+
 /*----------------------------------------------------------------------------*\
 |* Bit Counting and Testing
 \*----------------------------------------------------------------------------*/
 static __inline__ unsigned char __DEFAULT_FN_ATTRS
-_BitScanForward(unsigned long *_Index, unsigned long _Mask) {
-  if (!_Mask)
-    return 0;
-  *_Index = __builtin_ctzl(_Mask);
-  return 1;
-}
-static __inline__ unsigned char __DEFAULT_FN_ATTRS
-_BitScanReverse(unsigned long *_Index, unsigned long _Mask) {
-  if (!_Mask)
-    return 0;
-  *_Index = 31 - __builtin_clzl(_Mask);
-  return 1;
-}
-static __inline__ unsigned short __DEFAULT_FN_ATTRS
-__popcnt16(unsigned short _Value) {
-  return __builtin_popcount((int)_Value);
-}
-static __inline__ unsigned int __DEFAULT_FN_ATTRS
-__popcnt(unsigned int _Value) {
-  return __builtin_popcount(_Value);
-}
-static __inline__ unsigned char __DEFAULT_FN_ATTRS
 _bittest(long const *_BitBase, long _BitPos) {
   return (*_BitBase >> _BitPos) & 1;
 }
@@ -548,26 +377,24 @@ _interlockedbittestandset(long volatile *_BitBase, long _BitPos) {
   long _PrevVal = __atomic_fetch_or(_BitBase, 1l << _BitPos, __ATOMIC_SEQ_CST);
   return (_PrevVal >> _BitPos) & 1;
 }
-#ifdef __x86_64__
+#if defined(__arm__) || defined(__aarch64__)
 static __inline__ unsigned char __DEFAULT_FN_ATTRS
-_BitScanForward64(unsigned long *_Index, unsigned __int64 _Mask) {
-  if (!_Mask)
-    return 0;
-  *_Index = __builtin_ctzll(_Mask);
-  return 1;
+_interlockedbittestandset_acq(long volatile *_BitBase, long _BitPos) {
+  long _PrevVal = __atomic_fetch_or(_BitBase, 1l << _BitPos, __ATOMIC_ACQUIRE);
+  return (_PrevVal >> _BitPos) & 1;
 }
 static __inline__ unsigned char __DEFAULT_FN_ATTRS
-_BitScanReverse64(unsigned long *_Index, unsigned __int64 _Mask) {
-  if (!_Mask)
-    return 0;
-  *_Index = 63 - __builtin_clzll(_Mask);
-  return 1;
+_interlockedbittestandset_nf(long volatile *_BitBase, long _BitPos) {
+  long _PrevVal = __atomic_fetch_or(_BitBase, 1l << _BitPos, __ATOMIC_RELAXED);
+  return (_PrevVal >> _BitPos) & 1;
 }
-static __inline__
-unsigned __int64 __DEFAULT_FN_ATTRS
-__popcnt64(unsigned __int64 _Value) {
-  return __builtin_popcountll(_Value);
+static __inline__ unsigned char __DEFAULT_FN_ATTRS
+_interlockedbittestandset_rel(long volatile *_BitBase, long _BitPos) {
+  long _PrevVal = __atomic_fetch_or(_BitBase, 1l << _BitPos, __ATOMIC_RELEASE);
+  return (_PrevVal >> _BitPos) & 1;
 }
+#endif
+#ifdef __x86_64__
 static __inline__ unsigned char __DEFAULT_FN_ATTRS
 _bittest64(__int64 const *_BitBase, __int64 _BitPos) {
   return (*_BitBase >> _BitPos) & 1;
@@ -600,196 +427,449 @@ _interlockedbittestandset64(__int64 volatile *_BitBase, __int64 _BitPos) {
 /*----------------------------------------------------------------------------*\
 |* Interlocked Exchange Add
 \*----------------------------------------------------------------------------*/
+#if defined(__arm__) || defined(__aarch64__)
 static __inline__ char __DEFAULT_FN_ATTRS
-_InterlockedExchangeAdd8(char volatile *_Addend, char _Value) {
-  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_SEQ_CST);
-}
-static __inline__ short __DEFAULT_FN_ATTRS
-_InterlockedExchangeAdd16(short volatile *_Addend, short _Value) {
-  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_SEQ_CST);
+_InterlockedExchangeAdd8_acq(char volatile *_Addend, char _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_ACQUIRE);
 }
-#ifdef __x86_64__
-static __inline__ __int64 __DEFAULT_FN_ATTRS
-_InterlockedExchangeAdd64(__int64 volatile *_Addend, __int64 _Value) {
-  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_SEQ_CST);
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedExchangeAdd8_nf(char volatile *_Addend, char _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_RELAXED);
}
-#endif
-/*----------------------------------------------------------------------------*\
-|* Interlocked Exchange Sub
-\*----------------------------------------------------------------------------*/
 static __inline__ char __DEFAULT_FN_ATTRS
-_InterlockedExchangeSub8(char volatile *_Subend, char _Value) {
-  return __atomic_fetch_sub(_Subend, _Value, __ATOMIC_SEQ_CST);
+_InterlockedExchangeAdd8_rel(char volatile *_Addend, char _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_RELAXED);
 }
 static __inline__ short __DEFAULT_FN_ATTRS
-_InterlockedExchangeSub16(short volatile *_Subend, short _Value) {
-  return __atomic_fetch_sub(_Subend, _Value, __ATOMIC_SEQ_CST);
+_InterlockedExchangeAdd16_acq(short volatile *_Addend, short _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_ACQUIRE);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedExchangeAdd16_nf(short volatile *_Addend, short _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_RELAXED);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedExchangeAdd16_rel(short volatile *_Addend, short _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_RELEASE);
 }
 static __inline__ long __DEFAULT_FN_ATTRS
-_InterlockedExchangeSub(long volatile *_Subend, long _Value) {
-  return __atomic_fetch_sub(_Subend, _Value, __ATOMIC_SEQ_CST);
+_InterlockedExchangeAdd_acq(long volatile *_Addend, long _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_ACQUIRE);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedExchangeAdd_nf(long volatile *_Addend, long _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_RELAXED);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedExchangeAdd_rel(long volatile *_Addend, long _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_RELEASE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedExchangeAdd64_acq(__int64 volatile *_Addend, __int64 _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_ACQUIRE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedExchangeAdd64_nf(__int64 volatile *_Addend, __int64 _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_RELAXED);
 }
-#ifdef __x86_64__
 static __inline__ __int64 __DEFAULT_FN_ATTRS
-_InterlockedExchangeSub64(__int64 volatile *_Subend, __int64 _Value) {
-  return __atomic_fetch_sub(_Subend, _Value, __ATOMIC_SEQ_CST);
+_InterlockedExchangeAdd64_rel(__int64 volatile *_Addend, __int64 _Value) {
+  return __atomic_fetch_add(_Addend, _Value, __ATOMIC_RELEASE);
 }
 #endif
 /*----------------------------------------------------------------------------*\
 |* Interlocked Increment
 \*----------------------------------------------------------------------------*/
+#if defined(__arm__) || defined(__aarch64__)
 static __inline__ short __DEFAULT_FN_ATTRS
-_InterlockedIncrement16(short volatile *_Value) {
-  return __atomic_add_fetch(_Value, 1, __ATOMIC_SEQ_CST);
+_InterlockedIncrement16_acq(short volatile *_Value) {
+  return __atomic_add_fetch(_Value, 1, __ATOMIC_ACQUIRE);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedIncrement16_nf(short volatile *_Value) {
+  return __atomic_add_fetch(_Value, 1, __ATOMIC_RELAXED);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedIncrement16_rel(short volatile *_Value) {
+  return __atomic_add_fetch(_Value, 1, __ATOMIC_RELEASE);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedIncrement_acq(long volatile *_Value) {
+  return __atomic_add_fetch(_Value, 1, __ATOMIC_ACQUIRE);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedIncrement_nf(long volatile *_Value) {
+  return __atomic_add_fetch(_Value, 1, __ATOMIC_RELAXED);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedIncrement_rel(long volatile *_Value) {
+  return __atomic_add_fetch(_Value, 1, __ATOMIC_RELEASE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedIncrement64_acq(__int64 volatile *_Value) {
+  return __atomic_add_fetch(_Value, 1, __ATOMIC_ACQUIRE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedIncrement64_nf(__int64 volatile *_Value) {
+  return __atomic_add_fetch(_Value, 1, __ATOMIC_RELAXED);
 }
-#ifdef __x86_64__
 static __inline__ __int64 __DEFAULT_FN_ATTRS
-_InterlockedIncrement64(__int64 volatile *_Value) {
-  return __atomic_add_fetch(_Value, 1, __ATOMIC_SEQ_CST);
+_InterlockedIncrement64_rel(__int64 volatile *_Value) {
+  return __atomic_add_fetch(_Value, 1, __ATOMIC_RELEASE);
 }
 #endif
 /*----------------------------------------------------------------------------*\
 |* Interlocked Decrement
 \*----------------------------------------------------------------------------*/
+#if defined(__arm__) || defined(__aarch64__)
 static __inline__ short __DEFAULT_FN_ATTRS
-_InterlockedDecrement16(short volatile *_Value) {
-  return __atomic_sub_fetch(_Value, 1, __ATOMIC_SEQ_CST);
+_InterlockedDecrement16_acq(short volatile *_Value) {
+  return __atomic_sub_fetch(_Value, 1, __ATOMIC_ACQUIRE);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedDecrement16_nf(short volatile *_Value) {
+  return __atomic_sub_fetch(_Value, 1, __ATOMIC_RELAXED);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedDecrement16_rel(short volatile *_Value) {
+  return __atomic_sub_fetch(_Value, 1, __ATOMIC_RELEASE);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedDecrement_acq(long volatile *_Value) {
+  return __atomic_sub_fetch(_Value, 1, __ATOMIC_ACQUIRE);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedDecrement_nf(long volatile *_Value) {
+  return __atomic_sub_fetch(_Value, 1, __ATOMIC_RELAXED);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedDecrement_rel(long volatile *_Value) {
+  return __atomic_sub_fetch(_Value, 1, __ATOMIC_RELEASE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedDecrement64_acq(__int64 volatile *_Value) {
+  return __atomic_sub_fetch(_Value, 1, __ATOMIC_ACQUIRE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedDecrement64_nf(__int64 volatile *_Value) {
+  return __atomic_sub_fetch(_Value, 1, __ATOMIC_RELAXED);
 }
-#ifdef __x86_64__
 static __inline__ __int64 __DEFAULT_FN_ATTRS
-_InterlockedDecrement64(__int64 volatile *_Value) {
-  return __atomic_sub_fetch(_Value, 1, __ATOMIC_SEQ_CST);
+_InterlockedDecrement64_rel(__int64 volatile *_Value) {
+  return __atomic_sub_fetch(_Value, 1, __ATOMIC_RELEASE);
 }
 #endif
 /*----------------------------------------------------------------------------*\
 |* Interlocked And
 \*----------------------------------------------------------------------------*/
+#if defined(__arm__) || defined(__aarch64__)
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedAnd8_acq(char volatile *_Value, char _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedAnd8_nf(char volatile *_Value, char _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_RELAXED);
+}
 static __inline__ char __DEFAULT_FN_ATTRS
-_InterlockedAnd8(char volatile *_Value, char _Mask) {
-  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedAnd8_rel(char volatile *_Value, char _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_RELEASE);
 }
 static __inline__ short __DEFAULT_FN_ATTRS
-_InterlockedAnd16(short volatile *_Value, short _Mask) {
-  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedAnd16_acq(short volatile *_Value, short _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedAnd16_nf(short volatile *_Value, short _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_RELAXED);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedAnd16_rel(short volatile *_Value, short _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_RELEASE);
 }
 static __inline__ long __DEFAULT_FN_ATTRS
-_InterlockedAnd(long volatile *_Value, long _Mask) {
-  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedAnd_acq(long volatile *_Value, long _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedAnd_nf(long volatile *_Value, long _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_RELAXED);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedAnd_rel(long volatile *_Value, long _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_RELEASE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedAnd64_acq(__int64 volatile *_Value, __int64 _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedAnd64_nf(__int64 volatile *_Value, __int64 _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_RELAXED);
 }
-#ifdef __x86_64__
 static __inline__ __int64 __DEFAULT_FN_ATTRS
-_InterlockedAnd64(__int64 volatile *_Value, __int64 _Mask) {
-  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedAnd64_rel(__int64 volatile *_Value, __int64 _Mask) {
+  return __atomic_fetch_and(_Value, _Mask, __ATOMIC_RELEASE);
 }
 #endif
 /*----------------------------------------------------------------------------*\
 |* Interlocked Or
 \*----------------------------------------------------------------------------*/
+#if defined(__arm__) || defined(__aarch64__)
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedOr8_acq(char volatile *_Value, char _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
 static __inline__ char __DEFAULT_FN_ATTRS
-_InterlockedOr8(char volatile *_Value, char _Mask) {
-  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedOr8_nf(char volatile *_Value, char _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_RELAXED);
+}
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedOr8_rel(char volatile *_Value, char _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_RELEASE);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedOr16_acq(short volatile *_Value, short _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedOr16_nf(short volatile *_Value, short _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_RELAXED);
 }
 static __inline__ short __DEFAULT_FN_ATTRS
-_InterlockedOr16(short volatile *_Value, short _Mask) {
-  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedOr16_rel(short volatile *_Value, short _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_RELEASE);
 }
 static __inline__ long __DEFAULT_FN_ATTRS
-_InterlockedOr(long volatile *_Value, long _Mask) {
-  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedOr_acq(long volatile *_Value, long _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedOr_nf(long volatile *_Value, long _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_RELAXED);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedOr_rel(long volatile *_Value, long _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_RELEASE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedOr64_acq(__int64 volatile *_Value, __int64 _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedOr64_nf(__int64 volatile *_Value, __int64 _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_RELAXED);
 }
-#ifdef __x86_64__
 static __inline__ __int64 __DEFAULT_FN_ATTRS
-_InterlockedOr64(__int64 volatile *_Value, __int64 _Mask) {
-  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedOr64_rel(__int64 volatile *_Value, __int64 _Mask) {
+  return __atomic_fetch_or(_Value, _Mask, __ATOMIC_RELEASE);
 }
 #endif
 /*----------------------------------------------------------------------------*\
 |* Interlocked Xor
 \*----------------------------------------------------------------------------*/
+#if defined(__arm__) || defined(__aarch64__)
 static __inline__ char __DEFAULT_FN_ATTRS
-_InterlockedXor8(char volatile *_Value, char _Mask) {
-  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedXor8_acq(char volatile *_Value, char _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedXor8_nf(char volatile *_Value, char _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_RELAXED);
+}
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedXor8_rel(char volatile *_Value, char _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_RELEASE);
 }
 static __inline__ short __DEFAULT_FN_ATTRS
-_InterlockedXor16(short volatile *_Value, short _Mask) {
-  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedXor16_acq(short volatile *_Value, short _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedXor16_nf(short volatile *_Value, short _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_RELAXED);
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedXor16_rel(short volatile *_Value, short _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_RELEASE);
 }
 static __inline__ long __DEFAULT_FN_ATTRS
-_InterlockedXor(long volatile *_Value, long _Mask) {
-  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedXor_acq(long volatile *_Value, long _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_ACQUIRE);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedXor_nf(long volatile *_Value, long _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_RELAXED);
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedXor_rel(long volatile *_Value, long _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_RELEASE);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedXor64_acq(__int64 volatile *_Value, __int64 _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_ACQUIRE);
 }
-#ifdef __x86_64__
 static __inline__ __int64 __DEFAULT_FN_ATTRS
-_InterlockedXor64(__int64 volatile *_Value, __int64 _Mask) {
-  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_SEQ_CST);
+_InterlockedXor64_nf(__int64 volatile *_Value, __int64 _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_RELAXED);
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedXor64_rel(__int64 volatile *_Value, __int64 _Mask) {
+  return __atomic_fetch_xor(_Value, _Mask, __ATOMIC_RELEASE);
 }
 #endif
 /*----------------------------------------------------------------------------*\
 |* Interlocked Exchange
 \*----------------------------------------------------------------------------*/
+#if defined(__arm__) || defined(__aarch64__)
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedExchange8_acq(char volatile *_Target, char _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_ACQUIRE);
+  return _Value;
+}
 static __inline__ char __DEFAULT_FN_ATTRS
-_InterlockedExchange8(char volatile *_Target, char _Value) {
-  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_SEQ_CST);
+_InterlockedExchange8_nf(char volatile *_Target, char _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_RELAXED);
+  return _Value;
+}
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedExchange8_rel(char volatile *_Target, char _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_RELEASE);
   return _Value;
 }
 static __inline__ short __DEFAULT_FN_ATTRS
-_InterlockedExchange16(short volatile *_Target, short _Value) {
-  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_SEQ_CST);
+_InterlockedExchange16_acq(short volatile *_Target, short _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_ACQUIRE);
+  return _Value;
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedExchange16_nf(short volatile *_Target, short _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_RELAXED);
+  return _Value;
+}
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedExchange16_rel(short volatile *_Target, short _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_RELEASE);
+  return _Value;
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedExchange_acq(long volatile *_Target, long _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_ACQUIRE);
+  return _Value;
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedExchange_nf(long volatile *_Target, long _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_RELAXED);
+  return _Value;
+}
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedExchange_rel(long volatile *_Target, long _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_RELEASE);
+  return _Value;
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedExchange64_acq(__int64 volatile *_Target, __int64 _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_ACQUIRE);
+  return _Value;
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedExchange64_nf(__int64 volatile *_Target, __int64 _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_RELAXED);
   return _Value;
 }
-#ifdef __x86_64__
 static __inline__ __int64 __DEFAULT_FN_ATTRS
-_InterlockedExchange64(__int64 volatile *_Target, __int64 _Value) {
-  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_SEQ_CST);
+_InterlockedExchange64_rel(__int64 volatile *_Target, __int64 _Value) {
+  __atomic_exchange(_Target, &_Value, &_Value, __ATOMIC_RELEASE);
   return _Value;
 }
 #endif
 /*----------------------------------------------------------------------------*\
 |* Interlocked Compare Exchange
 \*----------------------------------------------------------------------------*/
+#if defined(__arm__) || defined(__aarch64__)
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedCompareExchange8_acq(char volatile *_Destination,
+                                 char _Exchange, char _Comparand) {
+  __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
+                            __ATOMIC_SEQ_CST, __ATOMIC_ACQUIRE);
+  return _Comparand;
+}
+static __inline__ char __DEFAULT_FN_ATTRS
+_InterlockedCompareExchange8_nf(char volatile *_Destination,
+                                char _Exchange, char _Comparand) {
+  __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
+                            __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
+  return _Comparand;
+}
 static __inline__ char __DEFAULT_FN_ATTRS
-_InterlockedCompareExchange8(char volatile *_Destination,
+_InterlockedCompareExchange8_rel(char volatile *_Destination,
                              char _Exchange, char _Comparand) {
   __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
-                            __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
+                            __ATOMIC_SEQ_CST, __ATOMIC_RELEASE);
   return _Comparand;
 }
 static __inline__ short __DEFAULT_FN_ATTRS
-_InterlockedCompareExchange16(short volatile *_Destination,
+_InterlockedCompareExchange16_acq(short volatile *_Destination,
                               short _Exchange, short _Comparand) {
   __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
-                            __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
+                            __ATOMIC_SEQ_CST, __ATOMIC_ACQUIRE);
   return _Comparand;
 }
-static __inline__ __int64 __DEFAULT_FN_ATTRS
-_InterlockedCompareExchange64(__int64 volatile *_Destination,
-                              __int64 _Exchange, __int64 _Comparand) {
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedCompareExchange16_nf(short volatile *_Destination,
+                                 short _Exchange, short _Comparand) {
   __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
-                            __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
+                            __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
   return _Comparand;
 }
-/*----------------------------------------------------------------------------*\
-|* Barriers
-\*----------------------------------------------------------------------------*/
-static __inline__ void __DEFAULT_FN_ATTRS
-__attribute__((__deprecated__("use other intrinsics or C++11 atomics instead")))
-_ReadWriteBarrier(void) {
-  __atomic_signal_fence(__ATOMIC_SEQ_CST);
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedCompareExchange16_rel(short volatile *_Destination,
+                                  short _Exchange, short _Comparand) {
+  __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
+                            __ATOMIC_SEQ_CST, __ATOMIC_RELEASE);
+  return _Comparand;
 }
-static __inline__ void __DEFAULT_FN_ATTRS
-__attribute__((__deprecated__("use other intrinsics or C++11 atomics instead")))
-_ReadBarrier(void) {
-  __atomic_signal_fence(__ATOMIC_SEQ_CST);
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedCompareExchange_acq(long volatile *_Destination,
+                                long _Exchange, long _Comparand) {
+  __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
+                            __ATOMIC_SEQ_CST, __ATOMIC_ACQUIRE);
+  return _Comparand;
 }
-static __inline__ void __DEFAULT_FN_ATTRS
-__attribute__((__deprecated__("use other intrinsics or C++11 atomics instead")))
-_WriteBarrier(void) {
-  __atomic_signal_fence(__ATOMIC_SEQ_CST);
+static __inline__ long __DEFAULT_FN_ATTRS
+_InterlockedCompareExchange_nf(long volatile *_Destination,
+                               long _Exchange, long _Comparand) {
+  __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
+                            __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
+  return _Comparand;
 }
-#ifdef __x86_64__
-static __inline__ void __DEFAULT_FN_ATTRS
-__faststorefence(void) {
-  __atomic_thread_fence(__ATOMIC_SEQ_CST);
+static __inline__ short __DEFAULT_FN_ATTRS
+_InterlockedCompareExchange_rel(long volatile *_Destination,
+                                long _Exchange, long _Comparand) {
+  __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
+                            __ATOMIC_SEQ_CST, __ATOMIC_RELEASE);
+  return _Comparand;
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedCompareExchange64_acq(__int64 volatile *_Destination,
+                                  __int64 _Exchange, __int64 _Comparand) {
+  __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
+                            __ATOMIC_SEQ_CST, __ATOMIC_ACQUIRE);
+  return _Comparand;
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedCompareExchange64_nf(__int64 volatile *_Destination,
+                                 __int64 _Exchange, __int64 _Comparand) {
+  __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
+                            __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
+  return _Comparand;
+}
+static __inline__ __int64 __DEFAULT_FN_ATTRS
+_InterlockedCompareExchange64_rel(__int64 volatile *_Destination,
+                                  __int64 _Exchange, __int64 _Comparand) {
+  __atomic_compare_exchange(_Destination, &_Comparand, &_Exchange, 0,
+                            __ATOMIC_SEQ_CST, __ATOMIC_RELEASE);
+  return _Comparand;
 }
 #endif
 /*----------------------------------------------------------------------------*\
@@ -840,59 +920,39 @@ __readgsqword(unsigned long __offset) {
 
 #if defined(__i386__) || defined(__x86_64__)
 static __inline__ void __DEFAULT_FN_ATTRS
 __movsb(unsigned char *__dst, unsigned char const *__src, size_t __n) {
-  __asm__("rep movsb" : : "D"(__dst), "S"(__src), "c"(__n)
-          : "%edi", "%esi", "%ecx");
+  __asm__("rep movsb" : : "D"(__dst), "S"(__src), "c"(__n));
 }
 static __inline__ void __DEFAULT_FN_ATTRS
 __movsd(unsigned long *__dst, unsigned long const *__src, size_t __n) {
-  __asm__("rep movsl" : : "D"(__dst), "S"(__src), "c"(__n)
-          : "%edi", "%esi", "%ecx");
+  __asm__("rep movsl" : : "D"(__dst), "S"(__src), "c"(__n));
 }
 static __inline__ void __DEFAULT_FN_ATTRS
 __movsw(unsigned short *__dst, unsigned short const *__src, size_t __n) {
-  __asm__("rep movsw" : : "D"(__dst), "S"(__src), "c"(__n)
-          : "%edi", "%esi", "%ecx");
-}
-static __inline__ void __DEFAULT_FN_ATTRS
-__stosb(unsigned char *__dst, unsigned char __x, size_t __n) {
-  __asm__("rep stosb" : : "D"(__dst), "a"(__x), "c"(__n)
-          : "%edi", "%ecx");
+  __asm__("rep movsw" : : "D"(__dst), "S"(__src), "c"(__n));
 }
 static __inline__ void __DEFAULT_FN_ATTRS
 __stosd(unsigned long *__dst, unsigned long __x, size_t __n) {
-  __asm__("rep stosl" : : "D"(__dst), "a"(__x), "c"(__n)
-          : "%edi", "%ecx");
+  __asm__("rep stosl" : : "D"(__dst), "a"(__x), "c"(__n));
 }
 static __inline__ void __DEFAULT_FN_ATTRS
 __stosw(unsigned short *__dst, unsigned short __x, size_t __n) {
-  __asm__("rep stosw" : : "D"(__dst), "a"(__x), "c"(__n)
-          : "%edi", "%ecx");
+  __asm__("rep stosw" : : "D"(__dst), "a"(__x), "c"(__n));
 }
 #endif
 #ifdef __x86_64__
 static __inline__ void __DEFAULT_FN_ATTRS
 __movsq(unsigned long long *__dst, unsigned long long const *__src, size_t __n) {
-  __asm__("rep movsq" : : "D"(__dst), "S"(__src), "c"(__n)
-          : "%edi", "%esi", "%ecx");
+  __asm__("rep movsq" : : "D"(__dst), "S"(__src), "c"(__n));
 }
 static __inline__ void __DEFAULT_FN_ATTRS
 __stosq(unsigned __int64 *__dst, unsigned __int64 __x, size_t __n) {
-  __asm__("rep stosq" : : "D"(__dst), "a"(__x), "c"(__n)
-          : "%edi", "%ecx");
+  __asm__("rep stosq" : : "D"(__dst), "a"(__x), "c"(__n));
 }
 #endif
 
 /*----------------------------------------------------------------------------*\
 |* Misc
 \*----------------------------------------------------------------------------*/
-static __inline__ void * __DEFAULT_FN_ATTRS
-_AddressOfReturnAddress(void) {
-  return (void*)((char*)__builtin_frame_address(0) + sizeof(void*));
-}
-static __inline__ void * __DEFAULT_FN_ATTRS
-_ReturnAddress(void) {
-  return __builtin_return_address(0);
-}
 #if defined(__i386__) || defined(__x86_64__)
 static __inline__ void __DEFAULT_FN_ATTRS
 __cpuid(int __info[4], int __level) {
@@ -914,6 +974,10 @@ static __inline__ void __DEFAULT_FN_ATTRS
 __halt(void) {
   __asm__ volatile ("hlt");
 }
+static __inline__ void __DEFAULT_FN_ATTRS
+__nop(void) {
+  __asm__ volatile ("nop");
+}
 #endif
 
 /*----------------------------------------------------------------------------*\
 |
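The bulk of the new ARM/AArch64 intrinsics in the hunks above follow a single
pattern: each _acq/_rel/_nf suffix simply selects the memory order passed to
the underlying GCC/Clang __atomic builtin. A hedged usage sketch of that same
pattern outside the header (function and variable names invented; the builtins
and memory-order constants are the real ones):

    /* Equivalents of _InterlockedExchangeAdd_acq/_rel/_nf: fetch-and-add
     * with an explicit memory order instead of the default SEQ_CST. */
    static long fetch_add_acquire(volatile long *addend, long value) {
        return __atomic_fetch_add(addend, value, __ATOMIC_ACQUIRE);
    }
    static long fetch_add_release(volatile long *addend, long value) {
        return __atomic_fetch_add(addend, value, __ATOMIC_RELEASE);
    }
    static long fetch_add_relaxed(volatile long *addend, long value) {
        return __atomic_fetch_add(addend, value, __ATOMIC_RELAXED); /* "_nf" */
    }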