vecmathlib - vecmathlib forked from https://bitbucket.org/eschnett/vecmathlib for POWER8 improvements

	Commit message	Author	Age	Files	Lines
*	Implement frexp	Erik Schnetter	2013-06-28	1	-0/+1
\|
*	Implement nextafter	Erik Schnetter	2013-06-09	1	-0/+1
\|
*	Implement atan2 routine	Erik Schnetter	2013-06-06	1	-0/+1
\|
*	Implement IEEE-versions of isnan etc. that are not optimized away	Erik Schnetter	2013-04-22	1	-0/+4
\|
*	Correct rounding and conversion functions	Erik Schnetter	2013-03-21	1	-0/+1
\|
*	Add rint(), correct round()	Erik Schnetter	2013-02-19	1	-0/+1
\|
*	Add cbrt, hypot, trunc; rename scalbn to ldexp	Erik Schnetter	2013-02-16	1	-1/+4
\|
*	Fold Chebyshev versions of sin and cos back into the main versions	Erik Schnetter	2013-02-14	1	-4/+0
\|
*	Correct vector types used in math functions	Erik Schnetter	2013-02-14	1	-2/+2
\|
*	Added optimized versions of sin and cos.	Jesse W. Towner	2013-02-11	1	-0/+4
\|	The new functions have been added to the mathfuncs template class, and are named vml_sin_chebyshev_single, vml_sin_chebyshev_double, vml_cos_chebyshev_single, and vml_cos_chebyshev_double. The corresponding sin and cos member functions in the vector template structs have been updated to call into the new implementations. These functions use minimax Chebyshev polynomial approximations optimized for floating point. They have good relative error distributions for IEEE-754 floating-point numbers, as the highest contributing coefficient is selected to map precisely to either a 32-bit or a 64-bit IEEE number for the _single and _double function variants respectively. The _single variants produce approximately 30 bits of precision in the mantissa, and the _double variants produce around 60 bits, which is more than enough to produce accurate values. The vml_tan function hasn't been updated, so it calls both sin and cos as it used to, and thus relies on the compiler to factor out common code. It's possible to implement a sincos function using these polynomials that interleaves the fmas; since the fma instructions in the sin and cos paths have no dependencies on one another, one of the paths is computed essentially for free on x86-64 platforms due to instruction-level parallelism. Alternatively, tan can be implemented in terms of a specifically optimized Chebyshev rational function with good performance and properties.
*	Add fdim fmax fmin, fma, isfinite isinf isnan isnormal	Erik Schnetter	2012-12-18	1	-0/+8
\|
*	Implement sin, make optimised builds work	Erik Schnetter	2012-12-01	1	-0/+6
\|
*	Implement cosh sinh tanh	Erik Schnetter	2012-12-01	1	-0/+10
\|
*	Implement ceil floor fmod pow remainder round	Erik Schnetter	2012-12-01	1	-0/+8
\|
*	Implement asinh and exp	Erik Schnetter	2012-12-01	1	-0/+11
\|
*	Import initial version	Erik Schnetter	2012-11-30	1	-0/+67
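
Note on the 2013-02-11 commit message: the interleaved sincos idea it mentions can be sketched roughly as below. This is only an illustration under stated assumptions, not code from vecmathlib. The name sincos_poly_sketch is hypothetical, the input is assumed to be already reduced to |x| <= pi/4, and the coefficients are plain Taylor terms standing in for the minimax Chebyshev coefficients the commit describes (which are not reproduced here).

```cpp
#include <cmath>
#include <utility>

// Hypothetical helper, NOT part of vecmathlib.
// Returns {sin(x), cos(x)} for an argument assumed to satisfy |x| <= pi/4.
// Coefficients are Taylor terms (an assumption for illustration), not the
// minimax Chebyshev coefficients used by the actual commit.
inline std::pair<double, double> sincos_poly_sketch(double x) {
    const double x2 = x * x;

    // Two independent Horner chains in x^2. Each fma in the sin chain has no
    // data dependency on the cos chain, so a superscalar CPU can overlap them.
    double s = 1.0 / 6227020800.0;   // +1/13!
    double c = 1.0 / 479001600.0;    // +1/12!

    s = std::fma(s, x2, -1.0 / 39916800.0);  // -1/11!
    c = std::fma(c, x2, -1.0 / 3628800.0);   // -1/10!

    s = std::fma(s, x2, 1.0 / 362880.0);     // +1/9!
    c = std::fma(c, x2, 1.0 / 40320.0);      // +1/8!

    s = std::fma(s, x2, -1.0 / 5040.0);      // -1/7!
    c = std::fma(c, x2, -1.0 / 720.0);       // -1/6!

    s = std::fma(s, x2, 1.0 / 120.0);        // +1/5!
    c = std::fma(c, x2, 1.0 / 24.0);         // +1/4!

    s = std::fma(s, x2, -1.0 / 6.0);         // -1/3!
    c = std::fma(c, x2, -0.5);               // -1/2!

    s = std::fma(s, x2, 1.0);                // sin(x)/x approximation
    c = std::fma(c, x2, 1.0);                // cos(x) approximation

    return {s * x, c};
}
```

Calling sincos_poly_sketch(0.5) yields a pair whose members should agree closely with std::sin(0.5) and std::cos(0.5). A real implementation would also need argument reduction and quadrant handling, which this sketch leaves out.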