vecmathlib - vecmathlib forked from https://bitbucket.org/eschnett/vecmathlib for POWER8 improvements

	Commit message	Author	Age	Files	Lines
...
*	Shorten floatprops<T> to FP	Erik Schnetter	2013-02-14	1	-4/+4
\|
*	Rename "pseudovectors" to "libm"-vectors	Erik Schnetter	2013-02-14	1	-1/+1
\|
*	Move definition of all and any into class	Erik Schnetter	2013-02-14	1	-20/+12
\|
*	Fold Chebyshev versions of sin and cos back into the main versions	Erik Schnetter	2013-02-14	6	-207/+46
\|
*	Use SSE 3 and SSE 4.1 only when available	Erik Schnetter	2013-02-14	2	-8/+72
\|
*	Introduce VML_NODEBUG	Erik Schnetter	2013-02-14	2	-1/+9
\|
*	Remove unused functions	Erik Schnetter	2013-02-14	1	-13/+0
\|
*	Improve pow()	Erik Schnetter	2013-02-14	1	-1/+2
\|
*	Correct ceil() and floor()	Erik Schnetter	2013-02-14	1	-4/+6
\|
*	Provide scalbn with scalar int argument	Erik Schnetter	2013-02-14	6	-0/+26
\|
*	Test conversions and rounding also for 0	Erik Schnetter	2013-02-14	1	-6/+13
\|
*	Correct vector types used in math functions	Erik Schnetter	2013-02-14	1	-2/+2
\|
*	Merged in upcaste/vecmathlib (pull request #2)	Erik Schnetter	2013-02-11	8	-17/+165
\|\ \| \| \| \|	Re-generate pull request
\| *	Merged eschnett/vecmathlib into master	Erik Schnetter	2013-02-11	1	-0/+64
\| \|\ \| \|/ \|/\|
* \|	Add beginning of documentation	Erik Schnetter	2013-02-11	1	-0/+64
\| \|
\| *	Added optimized versions of sin and cos.	Jesse W. Towner	2013-02-11	8	-17/+165
\|/	The new functions have been added to the mathfuncs template class, and are named vml_sin_chebyshev_single, vml_sin_chebyshev_double, vml_cos_chebyshev_single, and vml_cos_chebyshev_double. The corresponding sin and cos member functions in the vector template structs have been updated to call into the new implementations. These functions use float-optimized minimaxed Chebyshev polynomial approximations. They have good relative error distributions for IEEE-754 floating-point numbers, as the highest contributing coefficient is selected to precisely map to either a 32-bit or 64-bit IEEE number for the _single and _double function variants respectively. The _single variants produce approximately 30 bits of precision in the mantissa, and the _double variants produce around 60 bits, which is more than enough to produce accurate values. The vml_tan function hasn't been updated, so it calls both sin and cos as it used to, and thus relies on the compiler to factor out common code. It's possible to implement a sincos function using these polynomials that interleaves the fmas, and since the fma instructions in the sin and cos paths don't have any dependencies on one another, one of the paths is computed essentially for free on x86-64 platforms due to instruction parallelism. Alternatively, tan can be implemented in terms of a specifically optimized Chebyshev rational function with good performance and properties.
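The interleaved sincos idea mentioned in the commit message above can be sketched roughly as follows. This is only an illustration, not code from vecmathlib: the names `SinCos` and `sincos_poly` are hypothetical, the input is assumed to be already range-reduced to a small interval (roughly |x| <= pi/4), and plain Taylor coefficients stand in for the minimax coefficients the commit describes, so the accuracy is lower than what the commit reports.

```cpp
#include <cmath>

// Hypothetical result type for this sketch (not a vecmathlib type).
struct SinCos {
  double s;  // approximation of sin(x)
  double c;  // approximation of cos(x)
};

// Hypothetical helper: evaluates sin and cos polynomials in x*x using
// Horner's scheme with fused multiply-adds. Assumes |x| is small
// (roughly |x| <= pi/4); no range reduction is performed here.
// The coefficients are Taylor terms used as placeholders, NOT the
// minimax coefficients from the commit.
inline SinCos sincos_poly(double x) {
  const double x2 = x * x;

  // sin(x) ~ x * (1 - x2/3! + x2^2/5! - x2^3/7! + x2^4/9!)
  // cos(x) ~      1 - x2/2! + x2^2/4! - x2^3/6! + x2^4/8!
  double ps = 1.0 / 362880.0;
  double pc = 1.0 / 40320.0;

  // The two chains are independent of each other, so the fma of one
  // chain can issue while the fma of the other is still in flight.
  ps = std::fma(ps, x2, -1.0 / 5040.0);
  pc = std::fma(pc, x2, -1.0 / 720.0);
  ps = std::fma(ps, x2, 1.0 / 120.0);
  pc = std::fma(pc, x2, 1.0 / 24.0);
  ps = std::fma(ps, x2, -1.0 / 6.0);
  pc = std::fma(pc, x2, -1.0 / 2.0);
  ps = std::fma(ps, x2, 1.0);
  pc = std::fma(pc, x2, 1.0);

  return SinCos{x * ps, pc};
}
```

A caller would write something like `SinCos r = sincos_poly(0.5);` and read `r.s` and `r.c`. Whether the second chain is really computed "for free" depends on the target's fma latency and throughput, so it is worth measuring on the platform of interest.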
*	Remove unused file	Erik Schnetter	2013-02-06	1	-0/+0
\|
*	Find out good clang options for using a modern STL	Erik Schnetter	2013-02-06	3	-107/+44
\| \| \| \| \|	Remove clang-specific work-arounds. Switch to building with clang by default.
*	Avoid more gcc extensions	Erik Schnetter	2013-02-05	2	-16/+24
\|
*	Avoid gcc extension	Erik Schnetter	2013-02-05	1	-4/+4
\|
*	Output timings for looping example	Erik Schnetter	2013-02-05	1	-9/+116
\|
*	Remove superfluous instructions	Erik Schnetter	2013-02-05	1	-4/+0
\|
*	Correct termination condition in mask class	Erik Schnetter	2013-02-05	1	-5/+8
\|
*	Add lots of work-arounds for missing clang++ functionality	Erik Schnetter	2013-02-05	3	-45/+111
\|
*	Add small comment	Erik Schnetter	2013-02-05	1	-0/+1
\|
*	Activate #defines handling IEEE compliance	Erik Schnetter	2013-02-05	1	-4/+4
\|
*	Provide missing shift operators in pseudovec classes	Erik Schnetter	2013-02-05	1	-1/+27
\|
*	Improve scalbn()	Erik Schnetter	2013-02-05	1	-1/+1
\|
*	Update clang build instructions	Erik Schnetter	2013-02-05	1	-1/+6
\|
*	Handle rint() overflow correctly	Erik Schnetter	2013-02-05	1	-2/+18
\|
*	Use correct version of rint()	Erik Schnetter	2013-02-05	1	-6/+6
\|
*	Omit unused header file	Erik Schnetter	2013-02-05	1	-1/+0
\|
*	Correct clang instructions (doesn't work yet). Add Intel instructions (works).	Erik Schnetter	2013-02-05	1	-2/+6
\|
*	Don't use constexpr; Intel compiler doesn't handle it well	Erik Schnetter	2013-02-05	1	-44/+47
\|
*	Comment out unused variable	Erik Schnetter	2013-02-05	1	-1/+1
\|
*	Provide intvec_t::iota()	Erik Schnetter	2013-02-04	6	-0/+11
\|
*	Add example that uses a manually vectorised loop	Erik Schnetter	2013-02-04	1	-0/+107
\|
*	Provide memory access functions	Erik Schnetter	2013-02-04	11	-0/+858
\|
*	Add flag VML_DEBUG	Erik Schnetter	2013-02-04	1	-4/+16
\| \| \| \|	Also define VML_ASSERT for error-checking
*	White space change	Erik Schnetter	2013-02-04	1	-0/+2
\|
*	Update build instructions	Erik Schnetter	2013-02-04	3	-8/+14
\|
*	Improve indentation	Erik Schnetter	2013-02-03	1	-27/+38
\|
*	Use pseudovec template	Erik Schnetter	2013-02-03	2	-63/+26
\|
*	Add new template {bool\|int\|real}pseudovec that scalarizes all operations	Erik Schnetter	2013-02-03	2	-0/+1104
\|
*	Whitespace change	Erik Schnetter	2013-02-03	1	-0/+2
\|
*	Use memcpy instead of a union to re-interpret values	Erik Schnetter	2013-02-03	1	-8/+21
\|
*	Correct indentation	Erik Schnetter	2013-02-03	1	-8/+8
\|
*	Describe how to build with make and with ninja, respectively	Erik Schnetter	2013-02-03	1	-2/+3
\|
*	Build with make instead of ninja by default	Erik Schnetter	2013-02-03	1	-1/+6
\|
*	Check for non-finite number instead of nan after running benchmarks	Erik Schnetter	2013-02-01	1	-2/+2
\|