Commit message | Author | Age | Files | Lines | ||
---|---|---|---|---|---|---|
... | ||||||
* | Shorten floatprops<T> to FP | Erik Schnetter | 2013-02-14 | 1 | -4/+4 | |
| | ||||||
* | Rename "pseudovectors" to "libm"-vectors | Erik Schnetter | 2013-02-14 | 1 | -1/+1 | |
| | ||||||
* | Move definition of all and any into class | Erik Schnetter | 2013-02-14 | 1 | -20/+12 | |
| | ||||||
* | Fold Chebyshev versions of sin and cos back into the main versions | Erik Schnetter | 2013-02-14 | 6 | -207/+46 | |
| | ||||||
* | Use SSE 3 and SSE 4.1 only when available | Erik Schnetter | 2013-02-14 | 2 | -8/+72 | |
| | ||||||
* | Introduce VML_NODEBUG | Erik Schnetter | 2013-02-14 | 2 | -1/+9 | |
| | ||||||
* | Remove unused functions | Erik Schnetter | 2013-02-14 | 1 | -13/+0 | |
| | ||||||
* | Improve pow() | Erik Schnetter | 2013-02-14 | 1 | -1/+2 | |
| | ||||||
* | Correct ceil() and floor() | Erik Schnetter | 2013-02-14 | 1 | -4/+6 | |
| | ||||||
* | Provide scalbn with scalar int argument | Erik Schnetter | 2013-02-14 | 6 | -0/+26 | |
| | ||||||
* | Test conversions and rounding also for 0 | Erik Schnetter | 2013-02-14 | 1 | -6/+13 | |
| | ||||||
* | Correct vector types used in math functions | Erik Schnetter | 2013-02-14 | 1 | -2/+2 | |
| | ||||||
* | Merged in upcaste/vecmathlib (pull request #2) | Erik Schnetter | 2013-02-11 | 8 | -17/+165 | |
|\ | | | | | Re-generate pull request | |||||
| * | Merged eschnett/vecmathlib into master | Erik Schnetter | 2013-02-11 | 1 | -0/+64 | |
| |\ | |/ |/| | ||||||
* | | Add beginning of documentation | Erik Schnetter | 2013-02-11 | 1 | -0/+64 | |
| | | ||||||
| * | Added optimized versions of sin and cos. | Jesse W. Towner | 2013-02-11 | 8 | -17/+165 | |
|/ | | | | | The new functions have been added to the mathfuncs template class, and are named vml_sin_chebyshev_single, vml_sin_chebyshev_double, vml_cos_chebyshev_single, and vml_cos_chebyshev_double. The corresponding sin and cos member functions in the vector template structs have been updated to call into the new implementations. These functions use float-optimized minimax Chebyshev polynomial approximations. They have good relative error distributions for IEEE-754 floating point numbers, as the highest contributing coefficient is selected to map precisely to either a 32-bit or 64-bit IEEE number for the _single and _double function variants respectively. The _single variants produce approximately 30 bits of precision in the mantissa, and the _double variants produce around 60 bits, which is more than enough to produce accurate values. The vml_tan function hasn't been updated, so it calls both sin and cos as it used to, and thus relies on the compiler to factor out common code. It's possible to implement a sincos function using these polynomials that interleaves the fmas, and since the fma instructions in the sin and cos paths don't have any dependencies on one another, one of the paths is computed essentially for free on x86-64 platforms due to instruction-level parallelism. Alternatively, tan can be implemented in terms of a specifically optimized Chebyshev rational function with good performance and properties. | |||||
* | Remove unused file | Erik Schnetter | 2013-02-06 | 1 | -0/+0 | |
| | ||||||
* | Find out good clang options for using a modern STL | Erik Schnetter | 2013-02-06 | 3 | -107/+44 | |
| | | | | | Remove clang-specific work-arounds. Switch to building with clang by default. | |||||
* | Avoid more gcc extensions | Erik Schnetter | 2013-02-05 | 2 | -16/+24 | |
| | ||||||
* | Avoid gcc extension | Erik Schnetter | 2013-02-05 | 1 | -4/+4 | |
| | ||||||
* | Output timings for looping example | Erik Schnetter | 2013-02-05 | 1 | -9/+116 | |
| | ||||||
* | Remove superfluous instructions | Erik Schnetter | 2013-02-05 | 1 | -4/+0 | |
| | ||||||
* | Correct termination condition in mask class | Erik Schnetter | 2013-02-05 | 1 | -5/+8 | |
| | ||||||
* | Add lots of work-arounds for missing clang++ functionality | Erik Schnetter | 2013-02-05 | 3 | -45/+111 | |
| | ||||||
* | Add small comment | Erik Schnetter | 2013-02-05 | 1 | -0/+1 | |
| | ||||||
* | Activate #defines handling IEEE compliance | Erik Schnetter | 2013-02-05 | 1 | -4/+4 | |
| | ||||||
* | Provide missing shift operators in pseudovec classes | Erik Schnetter | 2013-02-05 | 1 | -1/+27 | |
| | ||||||
* | Improve scalbn() | Erik Schnetter | 2013-02-05 | 1 | -1/+1 | |
| | ||||||
* | Update clang build instructions | Erik Schnetter | 2013-02-05 | 1 | -1/+6 | |
| | ||||||
* | Handle rint() overflow correctly | Erik Schnetter | 2013-02-05 | 1 | -2/+18 | |
| | ||||||
* | Use correct version of rint() | Erik Schnetter | 2013-02-05 | 1 | -6/+6 | |
| | ||||||
* | Omit unused header file | Erik Schnetter | 2013-02-05 | 1 | -1/+0 | |
| | ||||||
* | Correct clang instructions (doesn't work yet). Add Intel instructions (works). | Erik Schnetter | 2013-02-05 | 1 | -2/+6 | |
| | ||||||
* | Don't use constexpr; Intel compiler doesn't handle it well | Erik Schnetter | 2013-02-05 | 1 | -44/+47 | |
| | ||||||
* | Comment out unused variable | Erik Schnetter | 2013-02-05 | 1 | -1/+1 | |
| | ||||||
* | Provide intvec_t::iota() | Erik Schnetter | 2013-02-04 | 6 | -0/+11 | |
| | ||||||
* | Add example that uses a manually vectorised loop | Erik Schnetter | 2013-02-04 | 1 | -0/+107 | |
| | ||||||
* | Provide memory access functions | Erik Schnetter | 2013-02-04 | 11 | -0/+858 | |
| | ||||||
* | Add flag VML_DEBUG | Erik Schnetter | 2013-02-04 | 1 | -4/+16 | |
| | | | | Also define VML_ASSERT for error-checking | |||||
* | White space change | Erik Schnetter | 2013-02-04 | 1 | -0/+2 | |
| | ||||||
* | Update build instructions | Erik Schnetter | 2013-02-04 | 3 | -8/+14 | |
| | ||||||
* | Improve indentation | Erik Schnetter | 2013-02-03 | 1 | -27/+38 | |
| | ||||||
* | Use pseudovec template | Erik Schnetter | 2013-02-03 | 2 | -63/+26 | |
| | ||||||
* | Add new template {bool|int|real}pseudovec that scalarizes all operations | Erik Schnetter | 2013-02-03 | 2 | -0/+1104 | |
| | ||||||
* | Whitespace change | Erik Schnetter | 2013-02-03 | 1 | -0/+2 | |
| | ||||||
* | Use memcpy instead of a union to re-interpret values | Erik Schnetter | 2013-02-03 | 1 | -8/+21 | |
| | ||||||
* | Correct indentation | Erik Schnetter | 2013-02-03 | 1 | -8/+8 | |
| | ||||||
* | Describe how to build with make and with ninja, respectively | Erik Schnetter | 2013-02-03 | 1 | -2/+3 | |
| | ||||||
* | Build with make instead of ninja by default | Erik Schnetter | 2013-02-03 | 1 | -1/+6 | |
| | ||||||
* | Check for non-finite number instead of nan after running benchmarks | Erik Schnetter | 2013-02-01 | 1 | -2/+2 | |
| |
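
The commit "Added optimized versions of sin and cos" above notes that a combined sincos could interleave the fmas of the two polynomial paths. The following is only a rough sketch of that idea, under several assumptions: `sincos_poly_sketch` is a hypothetical name that does not exist in vecmathlib, the coefficients are plain Taylor terms rather than the minimax Chebyshev coefficients the commit describes, the code is scalar rather than vectorised, and the argument is assumed to be already reduced to |x| <= pi/4.

```cpp
#include <cmath>
#include <utility>

// Hypothetical helper, NOT part of vecmathlib.
// Returns {sin(x), cos(x)} for |x| <= pi/4 (no range reduction is done here).
// Assumption: Taylor coefficients stand in for the commit's minimax
// Chebyshev coefficients, which are not reproduced in this log.
inline std::pair<double, double> sincos_poly_sketch(double x) {
    const double x2 = x * x;

    // Two independent Horner chains in x2. Each fma in the sin chain
    // depends only on the previous sin value, and likewise for cos, so
    // the two chains can be issued side by side.
    double s = -1.0 / 39916800.0;          // -1/11!
    double c = -1.0 / 3628800.0;           // -1/10!
    s = std::fma(s, x2, 1.0 / 362880.0);   // +1/9!
    c = std::fma(c, x2, 1.0 / 40320.0);    // +1/8!
    s = std::fma(s, x2, -1.0 / 5040.0);    // -1/7!
    c = std::fma(c, x2, -1.0 / 720.0);     // -1/6!
    s = std::fma(s, x2, 1.0 / 120.0);      // +1/5!
    c = std::fma(c, x2, 1.0 / 24.0);       // +1/4!
    s = std::fma(s, x2, -1.0 / 6.0);       // -1/3!
    c = std::fma(c, x2, -0.5);             // -1/2!
    s = std::fma(s, x2, 1.0);
    c = std::fma(c, x2, 1.0);

    // sin is odd: multiply the even polynomial in x2 by x.
    return {s * x, c};
}
```

The two chains share only `x2`, which is why a superscalar core may overlap them. Whether one path really comes out close to free depends on the target hardware and on how the compiler schedules the code.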