Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | neon_static_x4_f and neon_static_x4_i don't use the second passed argument, and reschedule instructions for possible dual issue | Jukka Ojanen | 2016-03-15 | 4 | -126/+140 |
* | Coverage analysis shows unused if-else branches | Jukka Ojanen | 2016-03-14 | 1 | -18/+26 |
| | |||||
* | Unroll to minimize recursive function call depth (overhead) | Jukka Ojanen | 2016-03-14 | 1 | -45/+91 |
| | |||||
* | Peel off top-level only if-case from ARM NEON recursive implementation | Jukka Ojanen | 2016-03-14 | 2 | -99/+120 |
| | |||||
* | Restore ARM NEON optimized recursive version | Jukka Ojanen | 2016-03-11 | 2 | -13/+85 |
| | |||||
* | Enable building shared library and start version numbering from 0.9.0. On Windows when using FFTS as a DLL, define FFTS_SHARED. This is not mandatory, but it offers a little performance increase. Hide symbols when possible to improve compiler optimization and binary size. Use CMake target alias "ffts" to choose between static and shared library, preferring static | Jukka Ojanen | 2015-11-30 | 4 | -8/+14 |
* | Fix MSVC error C2719 | Jukka Ojanen | 2015-10-13 | 2 | -20/+21 |
| | |||||
* | Add SSE2 optimized ffts_generate_cosine_sine_pow2_32f | Jukka Ojanen | 2015-09-17 | 1 | -14/+84 |
| | |||||
* | Add double-double arithmetic to generate "exact" double precision cosine and sine tables. Correct rounding verified using MPFR up to 2^28. SSE2 optimized ffts_generate_cosine_sine_pow2_64f takes twice as long as ffts_generate_cosine_sine_pow2_32f. | Jukka Ojanen | 2015-09-16 | 3 | -0/+379 |
* | Change the order of constants; cos_hi, cos_lo, sin_hi, sin_lo -> cos_hi, sin_hi, cos_lo, sin_lo to support 128 bit vectorization | Jukka Ojanen | 2015-09-16 | 1 | -68/+68 |
* | Extended constant tables to double-double arithmetic | Jukka Ojanen | 2015-09-15 | 1 | -49/+115 |
| | |||||
* | No need to display the size of transform | Jukka Ojanen | 2015-08-28 | 2 | -27/+35 |
| | |||||
* | Control reaches end of non-void function | Jukka Ojanen | 2015-07-30 | 1 | -1/+1 |
| | |||||
* | Detect presence of malloc.h, fixes anthonix/ffts#40 | Jukka Ojanen | 2015-07-30 | 1 | -0/+3 |
| | |||||
* | Define [pa] and [pb] as constant input variables, not writable outputs | Jukka Ojanen | 2015-07-16 | 2 | -8/+5 |
| | |||||
* | Remove unreferenced header | Jukka Ojanen | 2015-07-15 | 1 | -2/+0 |
| | |||||
* | Improve compiler optimization by turning "patterns.c" to "patterns.h" | Jukka Ojanen | 2015-07-15 | 2 | -232/+506 |
| | |||||
* | Remove some dead code | Jukka Ojanen | 2015-07-15 | 1 | -19/+0 |
| | |||||
* | FFTS no longer depends on any other math library, and this should help to verify its numerical accuracy. | Jukka Ojanen | 2015-07-14 | 5 | -82/+135 |
* | Move trigonometric stuff to separate file. Implemented Oscar Buneman's method for generating a sequence of sines and cosines. | Jukka Ojanen | 2015-07-14 | 4 | -54/+272 |
* | Unroll loops to process 64 byte cache line per iteration | Jukka Ojanen | 2015-07-09 | 1 | -39/+205 |
| | |||||
* | Add new attributes to control/improve branch predictions | Jukka Ojanen | 2015-07-09 | 1 | -0/+12 |
| | |||||
* | Halve the number of calls to sin/cos functions in ffts_init_1d_real | Jukka Ojanen | 2015-07-08 | 1 | -12/+68 |
| | |||||
* | Add SSE3 optimized version of ffts_execute_1d_real_inv | Jukka Ojanen | 2015-07-07 | 1 | -20/+78 |
| | |||||
* | Add SSE3 optimized version of ffts_execute_1d_real | Jukka Ojanen | 2015-07-07 | 1 | -13/+80 |
| | |||||
* | To silence warning 'possible loss of data', use explicit casting to float | Jukka Ojanen | 2015-07-06 | 1 | -8/+8 |
| | |||||
* | SSE optimized versions of ffts_execute_1d_real and ffts_execute_1d_real_inv | Jukka Ojanen | 2015-07-06 | 1 | -4/+100 |
| | |||||
* | Add new attributes to help auto-vectorization | Jukka Ojanen | 2015-07-06 | 2 | -19/+54 |
| | |||||
* | Avoid allocating array of single pointer | Jukka Ojanen | 2015-07-06 | 1 | -10/+8 |
| | |||||
* | Fix ffts_aligned_free MinGW crash | Jukka Ojanen | 2015-07-06 | 1 | -1/+1 |
| | |||||
* | Incorrect stride with GCC flags "-march=native -ffast-math". Note that N/leaf_N is always a multiple of 2 | Jukka Ojanen | 2015-07-02 | 2 | -3/+5 |
* | Fix assertion failed in ffts_compare_offsets | Jukka Ojanen | 2015-07-02 | 1 | -4/+5 |
| | |||||
* | Generate cosine and sine table without using C math library. About 100 times faster on ARM and 15 times faster on x86. | Jukka Ojanen | 2015-03-31 | 2 | -15/+54 |
* | ffts_nd.c is using SSE2 intrinsics, detect and include emmintrin.h instead of xmmintrin.h, and fix GCC error: inlining failed in call to always_inline '_mm_load_pd': target specific option mismatch by adding "-msse2" instead of "-msse" | Jukka Ojanen | 2015-03-19 | 1 | -2/+2 |
* | To support building for Windows with MinGW, don't assume MSVC to be the compiler | Jukka Ojanen | 2015-03-19 | 1 | -1/+1 |
| | |||||
* | Minimize sin/cos calculations by calculating all factors once and generate lookup tables by mapping | Jukka Ojanen | 2015-03-18 | 2 | -17/+31 |
* | Remove unused sse.s | Jukka Ojanen | 2015-03-18 | 1 | -885/+0 |
| | |||||
* | Always run-time generate x64 dynamic code | Jukka Ojanen | 2015-03-18 | 1 | -841/+1335 |
| | |||||
* | Remove dependency on YASM as Windows dynamic code is run-time generated | Jukka Ojanen | 2015-03-17 | 1 | -828/+0 |
| | |||||
* | Determine lookup table size using closed-form expression | Jukka Ojanen | 2015-03-16 | 1 | -22/+4 |
| | |||||
* | Remove dead code | Jukka Ojanen | 2015-03-16 | 1 | -98/+47 |
| | |||||
* | Don't generate lookup tables when size is less than 32 | Jukka Ojanen | 2015-03-16 | 1 | -5/+5 |
| | |||||
* | Merge ffts_small with ffts_static, and define small transforms "fully" constant | Jukka Ojanen | 2015-03-16 | 5 | -510/+583 |
| | |||||
* | Add string.h to fix implicit declaration of function 'memcpy' | Jukka Ojanen | 2015-03-13 | 1 | -0/+4 |
| | |||||
* | One more macro fix | Jukka Ojanen | 2015-03-13 | 1 | -2/+2 |
| | |||||
* | Forgot to rename some V macros | Jukka Ojanen | 2015-03-13 | 4 | -26/+28 |
| | |||||
* | Rename vector V as V4SF; vector of 4 single precision floats. Rename all vector V macros accordingly. Redefine ffts_constants as ffts_constants_32f and ffts_constants_64f. | Jukka Ojanen | 2015-03-12 | 8 | -673/+796 |
* | Replace data_t with float | Jukka Ojanen | 2015-03-12 | 1 | -31/+53 |
| | |||||
* | Remove unused neon_float.h header | Jukka Ojanen | 2015-03-12 | 1 | -1127/+0 |
| | |||||
* | Remove unused variable 'i' from 'ffts_generate_func_code' | Jukka Ojanen | 2015-03-12 | 1 | -4/+0 |
| |
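
Several of the commits above refer to double-double arithmetic and to constants stored as hi/lo pairs (cos_hi, sin_hi, cos_lo, sin_lo). As a rough illustration of what such a pair holds, here is a minimal sketch of the error-free addition that double-double code is usually built on. The names `dd_t`, `two_sum` and `dd_add_double` are hypothetical and are not taken from the FFTS sources. The sketch assumes IEEE 754 doubles, round-to-nearest, and a build without `-ffast-math`, which could optimize the error term away.

```c
/* Hypothetical double-double value: hi carries the rounded result,
   lo carries the rounding error that did not fit into hi. */
typedef struct {
    double hi;
    double lo;
} dd_t;

/* Error-free addition (Knuth's two-sum): hi + lo equals a + b exactly. */
static dd_t two_sum(double a, double b)
{
    dd_t r;
    double bb;

    r.hi = a + b;
    bb = r.hi - a;                        /* the part of b that reached hi */
    r.lo = (a - (r.hi - bb)) + (b - bb);  /* what was lost to rounding */
    return r;
}

/* Add a plain double to a double-double value and renormalize. */
static dd_t dd_add_double(dd_t x, double y)
{
    dd_t s = two_sum(x.hi, y);

    s.lo += x.lo;
    return two_sum(s.hi, s.lo);
}
```

With these helpers, adding 1e-17 to 1.0 twice leaves `hi` at 1.0 while `lo` keeps the accumulated 2e-17, whereas plain double addition would discard both increments.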