path: root/src/codegen_sse.h
Commit message (author, date, files changed, lines removed/added)
* Try to remove some of the hard coded offsets to _ffts_plan_t (Jukka Ojanen, 2016-04-06, 1 file, -4/+8)
* No need to display the size of transform (Jukka Ojanen, 2015-08-28, 1 file, -10/+18)
* Minimize sin/cos calculations by calculating all factors ones and generate lo... (Jukka Ojanen, 2015-03-18, 1 file, -5/+4)
* Always run-time generate x64 dynamic code (Jukka Ojanen, 2015-03-18, 1 file, -841/+1335)
* Don't use long NOPs, instead add extra prefix to extend op codes to align bra... (Jukka Ojanen, 2014-11-17, 1 file, -84/+119)
* Add comments to SSE constants (Jukka Ojanen, 2014-11-17, 1 file, -0/+10)
* Define externals only when needed (Jukka Ojanen, 2014-11-16, 1 file, -12/+13)
* Optionally define SSE constants in header (Jukka Ojanen, 2014-11-16, 1 file, -0/+18)
* Add some comments to macro assembly (Jukka Ojanen, 2014-11-16, 1 file, -2/+47)
* Remove unused "neon" labels and mark external function as "extern" (Jukka Ojanen, 2014-11-14, 1 file, -18/+13)
* Take care of unreferenced parameters (Jukka Ojanen, 2014-11-14, 1 file, -0/+9)
* benchFFTS is computing the correct answer with these (Jukka Ojanen, 2014-11-11, 1 file, -24/+24)
|\
| * Damn AT&T syntax [fix_generate_size4_base_case] (Jukka Ojanen, 2014-11-11, 1 file, -24/+24)
* | generate_leaf_init, generate_leaf_ee, generate_leaf_eo, generate_leaf_oe and ... (Jukka Ojanen, 2014-11-11, 1 file, -2/+507)
* | Replace movdqa with movaps which is one byte shorter. Don't need RDI register... (Jukka Ojanen, 2014-11-10, 1 file, -41/+42)
|/
* Generate function in "generate_size4_base_case" (Jukka Ojanen, 2014-11-10, 1 file, -10/+79)
* Removed last bits of magic from "generate_size8_base_case". (Jukka Ojanen, 2014-11-09, 1 file, -138/+120)
* Replace "magic bytes" with various macros (Jukka Ojanen, 2014-11-09, 1 file, -211/+59)
* Replace SHUFPS with x64_sse_shufps_reg_reg_imm (Jukka Ojanen, 2014-11-09, 1 file, -79/+9)
* Replace MULPS with x64_sse_mulps_reg_reg (Jukka Ojanen, 2014-11-09, 1 file, -31/+12)
* Replace MOVDQA with x64_sse_movdqa_reg_membase/64_sse_movdqa_membase_reg (Jukka Ojanen, 2014-11-09, 1 file, -98/+20)
* Replace MOVAPS with x64_sse_movaps_reg_membase (Jukka Ojanen, 2014-11-09, 1 file, -76/+1)
* Replace SUBPS with x64_sse_subps_reg_reg (Jukka Ojanen, 2014-11-09, 1 file, -30/+12)
* Replace ADDPS with x64_sse_addps_reg_reg (Jukka Ojanen, 2014-11-09, 1 file, -30/+13)
* Replace XORPS with x64_sse_xorps_reg_reg (Jukka Ojanen, 2014-11-09, 1 file, -21/+4)
* Replace XOR2 with x86_clear_reg, MOV_D with x64_mov_membase_reg/x86_mov_reg_m... (Jukka Ojanen, 2014-11-09, 1 file, -112/+12)
* Replace MOV_I with x86_mov_reg_imm, SHIFT with x86_shift_reg_imm, CALL with x... (Jukka Ojanen, 2014-11-09, 1 file, -106/+17)
* Replace add/sub immediate value with x64_alu_reg_imm_size_body (Jukka Ojanen, 2014-11-09, 1 file, -50/+3)
* Replace register names with new definitions (Jukka Ojanen, 2014-11-08, 1 file, -83/+54)
* Win64 actually "generate_size8_base_case" instead of copying (Jukka Ojanen, 2014-11-06, 1 file, -315/+744)
* Reorder functions to alphabetical order (Jukka Ojanen, 2014-11-05, 1 file, -122/+206)
* Generate leaf_ee_init and x_init instead of copying (Jukka Ojanen, 2014-11-04, 1 file, -47/+160)
* Replace _M_AMD64 with _M_X64 as it is equal and "neutral" (Jukka Ojanen, 2014-11-04, 1 file, -3/+3)
* Refactor generate_func_code (Jukka Ojanen, 2014-11-04, 1 file, -0/+242)
* MOVDQA "intrinsic", two operand MOVDQA2, three operand MOVDQA3 helpers (Jukka Ojanen, 2014-11-03, 1 file, -16/+110)
* XMM6:XMM15 Nonvolatile, must be preserved as needed by callee. http://msdn.mi... (Jukka Ojanen, 2014-11-01, 1 file, -5/+29)
* Add CMake as an alternative build system (Jukka Ojanen, 2014-10-31, 1 file, -103/+137)
* Adding in Vim modelines to all .c and .h files. (Robert Massaioli, 2013-12-05, 1 file, -0/+1)
* Transforms for N>=32 are now thread safe (Anthony Blake, 2012-10-20, 1 file, -0/+1)
* NEON backwards transforms work correctly (Anthony Blake, 2012-10-20, 1 file, -4/+0)
* Added copyright notice (Anthony Blake, 2012-10-19, 1 file, -0/+34)
* Fixed gcc compile issue (Anthony Blake, 2012-10-18, 1 file, -2/+2)
* SSE working (Anthony Blake, 2012-08-30, 1 file, -12/+36)
* SSE working in GDB (Anthony Blake, 2012-08-30, 1 file, -0/+32)
* SSE Leaves working (Anthony Blake, 2012-08-29, 1 file, -4/+8)
* SSE LEAF EE works (Anthony Blake, 2012-08-29, 1 file, -0/+104)