path: root/src
Commit message | Author | Age | Files | Lines
* neon_static_x4_f and neon_static_x4_i don't use the second passed argument, and reschedule instructions for possible dual issue | Jukka Ojanen | 2016-03-15 | 4 | -126/+140
* Coverage analysis shows unused if-else branches | Jukka Ojanen | 2016-03-14 | 1 | -18/+26
* Unroll to minimize recursive function call depth (overhead) | Jukka Ojanen | 2016-03-14 | 1 | -45/+91
* Peel off top-level only if-case from ARM NEON recursive implementation | Jukka Ojanen | 2016-03-14 | 2 | -99/+120
* Restore ARM NEON optimized recursive version | Jukka Ojanen | 2016-03-11 | 2 | -13/+85
* Enable building shared library and start version numbering from 0.9.0. On Windows, when using FFTS as a DLL, define FFTS_SHARED. This is not mandatory, but it offers a small performance increase. Hide symbols when possible to improve compiler optimization and binary size. Use CMake target alias "ffts" to choose between static and shared library, preferring static | Jukka Ojanen | 2015-11-30 | 4 | -8/+14
* Fix MSVC error C2719 | Jukka Ojanen | 2015-10-13 | 2 | -20/+21
* Add SSE2 optimized ffts_generate_cosine_sine_pow2_32f | Jukka Ojanen | 2015-09-17 | 1 | -14/+84
* Add double-double arithmetic to generate "exact" double precision cosine and sine tables. Correct rounding verified using MPFR up to 2^28. SSE2 optimized ffts_generate_cosine_sine_pow2_64f takes twice as long as ffts_generate_cosine_sine_pow2_32f. | Jukka Ojanen | 2015-09-16 | 3 | -0/+379
* Change the order of constants; cos_hi, cos_lo, sin_hi, sin_lo -> cos_hi, sin_hi, cos_lo, sin_lo to support 128 bit vectorization | Jukka Ojanen | 2015-09-16 | 1 | -68/+68
* Extended constant tables to double-double arithmetic | Jukka Ojanen | 2015-09-15 | 1 | -49/+115
* No need to display the size of transform | Jukka Ojanen | 2015-08-28 | 2 | -27/+35
* Control reaches end of non-void function | Jukka Ojanen | 2015-07-30 | 1 | -1/+1
* Detect presence of malloc.h, fixes anthonix/ffts#40 | Jukka Ojanen | 2015-07-30 | 1 | -0/+3
* Define [pa] and [pb] as constant input variables, not writable outputs | Jukka Ojanen | 2015-07-16 | 2 | -8/+5
* Remove unreferenced header | Jukka Ojanen | 2015-07-15 | 1 | -2/+0
* Improve compiler optimization by turning "patterns.c" to "patterns.h" | Jukka Ojanen | 2015-07-15 | 2 | -232/+506
* Remove some dead code | Jukka Ojanen | 2015-07-15 | 1 | -19/+0
* FFTS no longer depends on any other math library, and this should help to verify its numerical accuracy. | Jukka Ojanen | 2015-07-14 | 5 | -82/+135
* Move trigonometric stuff to separate file. Implemented Oscar Buneman's method for generating a sequence of sines and cosines. | Jukka Ojanen | 2015-07-14 | 4 | -54/+272
* Unroll loops to process 64 byte cache line per iteration | Jukka Ojanen | 2015-07-09 | 1 | -39/+205
* Add new attributes to control/improve branch predictions | Jukka Ojanen | 2015-07-09 | 1 | -0/+12
* Halve the number of calls to sin/cos functions in ffts_init_1d_real | Jukka Ojanen | 2015-07-08 | 1 | -12/+68
* Add SSE3 optimized version of ffts_execute_1d_real_inv | Jukka Ojanen | 2015-07-07 | 1 | -20/+78
* Add SSE3 optimized version of ffts_execute_1d_real | Jukka Ojanen | 2015-07-07 | 1 | -13/+80
* To silence warning 'possible loss of data', use explicit casting to float | Jukka Ojanen | 2015-07-06 | 1 | -8/+8
* SSE optimized versions of ffts_execute_1d_real and ffts_execute_1d_real_inv | Jukka Ojanen | 2015-07-06 | 1 | -4/+100
* Add new attributes to help auto-vectorization | Jukka Ojanen | 2015-07-06 | 2 | -19/+54
* Avoid allocating array of single pointer | Jukka Ojanen | 2015-07-06 | 1 | -10/+8
* Fix ffts_aligned_free MinGW crash | Jukka Ojanen | 2015-07-06 | 1 | -1/+1
* Incorrect stride with GCC flags "-march=native -ffast-math". Note that N/leaf_N is always a multiple of 2 | Jukka Ojanen | 2015-07-02 | 2 | -3/+5
* Fix assertion failed in ffts_compare_offsets | Jukka Ojanen | 2015-07-02 | 1 | -4/+5
* Generate cosine and sine table without using C math library. About 100 times faster on ARM and 15 times faster on x86. | Jukka Ojanen | 2015-03-31 | 2 | -15/+54
* ffts_nd.c is using SSE2 intrinsics, detect and include emmintrin.h instead of xmmintrin.h, and fix GCC error: inlining failed in call to always_inline '_mm_load_pd': target specific option mismatch by adding "-msse2" instead of "-msse" | Jukka Ojanen | 2015-03-19 | 1 | -2/+2
* To support building for Windows with MinGW, don't assume MSVC to be the compiler | Jukka Ojanen | 2015-03-19 | 1 | -1/+1
* Minimize sin/cos calculations by calculating all factors once and generating lookup tables by mapping | Jukka Ojanen | 2015-03-18 | 2 | -17/+31
* Remove unused sse.s | Jukka Ojanen | 2015-03-18 | 1 | -885/+0
* Always run-time generate x64 dynamic code | Jukka Ojanen | 2015-03-18 | 1 | -841/+1335
* Remove dependency on YASM as Windows dynamic code is run-time generated | Jukka Ojanen | 2015-03-17 | 1 | -828/+0
* Determine lookup table size using closed-form expression | Jukka Ojanen | 2015-03-16 | 1 | -22/+4
* Remove dead code | Jukka Ojanen | 2015-03-16 | 1 | -98/+47
* Don't generate lookup tables when size is less than 32 | Jukka Ojanen | 2015-03-16 | 1 | -5/+5
* Merge ffts_small with ffts_static, and define small transforms "fully" constant | Jukka Ojanen | 2015-03-16 | 5 | -510/+583
* Add string.h to fix implicit declaration of function 'memcpy' | Jukka Ojanen | 2015-03-13 | 1 | -0/+4
* One more macro fix | Jukka Ojanen | 2015-03-13 | 1 | -2/+2
* Forgot to rename some V macros | Jukka Ojanen | 2015-03-13 | 4 | -26/+28
* Rename vector V as V4SF; vector of 4 single precision floats. Rename all vector V macros accordingly. Redefine ffts_constants as ffts_constants_32f and ffts_constants_64f. | Jukka Ojanen | 2015-03-12 | 8 | -673/+796
* Replace data_t with float | Jukka Ojanen | 2015-03-12 | 1 | -31/+53
* Remove unused neon_float.h header | Jukka Ojanen | 2015-03-12 | 1 | -1127/+0
* Remove unused variable 'i' from 'ffts_generate_func_code' | Jukka Ojanen | 2015-03-12 | 1 | -4/+0
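
Note on the double-double commits of 2015-09-15 and 2015-09-16: the log does not show the code itself, so the following is only a minimal sketch of the general technique, not the FFTS implementation. A double-double value stores a number as an unevaluated sum hi + lo of two doubles, and the building block is an error-free addition (Knuth's TwoSum). The names dd_t, two_sum, dd_add_double and dd_self_check are hypothetical and chosen for illustration. The sketch assumes strict IEEE 754 double evaluation; flags such as "-ffast-math" may break the error term.

```c
#include <assert.h>

/* Hypothetical type: the value represented is hi + lo, with |lo| tiny relative to |hi|. */
typedef struct {
    double hi;
    double lo;
} dd_t;

/* Knuth's TwoSum: hi = fl(a + b), lo = the exact rounding error,
 * so that a + b == hi + lo holds exactly (barring overflow). */
dd_t two_sum(double a, double b)
{
    dd_t r;
    double bb;

    r.hi = a + b;
    bb = r.hi - a;                        /* the part of b that made it into hi */
    r.lo = (a - (r.hi - bb)) + (b - bb);  /* what was rounded away */
    return r;
}

/* Add a plain double to a double-double value and renormalize. */
dd_t dd_add_double(dd_t x, double y)
{
    dd_t s = two_sum(x.hi, y);

    s.lo += x.lo;                 /* fold in the old low word */
    return two_sum(s.hi, s.lo);   /* renormalize so lo stays small */
}

/* Small self-check: 1e-17 is lost in a plain double sum with 1.0,
 * but is preserved in the low word. */
void dd_self_check(void)
{
    dd_t r = two_sum(1.0, 1e-17);

    assert(r.hi == 1.0);
    assert(r.lo == 1e-17);
}
```

Keeping the rounding error in a second word is what lets a table generator carry roughly twice the working precision of a double and round to double only at the end, which is consistent with the commit's claim of correctly rounded table entries.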