path: root/libavcodec/x86
| Commit message | Author | Date | Files | Lines (-/+) |
|---|---|---|---|---|
| VP8: move zeroing of luma DC block into the WHT | Jason Garrett-Glaser | 2010-08-02 | 2 | -2/+20 |
| Use word-writing instead of dword-writing (with two cached but otherwise | Ronald S. Bultje | 2010-07-31 | 2 | -105/+98 |
| Remove x86/mmx.h. It is not used anymore and has been deprecated for years. | Vitor Sessak | 2010-07-31 | 1 | -267/+0 |
| Convert deinterlacing MMX code to YASM | Vitor Sessak | 2010-07-31 | 3 | -0/+95 |
| Fix compilation in x86_64. I broke it with r24580. | Vitor Sessak | 2010-07-29 | 1 | -2/+2 |
| Translate libmpeg2 MMX IDCT to plain asm | Vitor Sessak | 2010-07-29 | 1 | -208/+237 |
| Use pmaddubsw for the mbedge_filter (>=ssse3), 6-10 cycles faster. | Ronald S. Bultje | 2010-07-26 | 1 | -2/+78 |
| VP8: Much faster SSE2 MC | Jason Garrett-Glaser | 2010-07-26 | 1 | -88/+78 |
| Enable no-loop memory/register saving for ssse3/sse4 also. | Ronald S. Bultje | 2010-07-26 | 1 | -2/+2 |
| Save a register (or regsize of stackspace for x86-32) for the no-loop | Ronald S. Bultje | 2010-07-26 | 1 | -16/+24 |
| Use nested ifs instead of &&, which appears to not work with %ifidn (i.e. this | Ronald S. Bultje | 2010-07-26 | 1 | -3/+9 |
| Split pextrw macro-spaghetti into several opt-specific macros, this will make | Ronald S. Bultje | 2010-07-26 | 1 | -30/+49 |
| Fix obvious bug in assignment. Somehow, the test vectors don't test this... | Ronald S. Bultje | 2010-07-25 | 1 | -1/+1 |
| Fix SPLATB_REG mess. Used to be a if/elseif/elseif/elseif spaghetti, so this | Ronald S. Bultje | 2010-07-24 | 1 | -33/+52 |
| Inline asm for VP56 arith coder | Eli Friedman | 2010-07-23 | 1 | -0/+54 |
| VP8: optimize DC-only chroma case in the same way as luma. | Jason Garrett-Glaser | 2010-07-23 | 2 | -10/+53 |
| VP8 asm: cosmetics (spacing) | Jason Garrett-Glaser | 2010-07-23 | 1 | -2/+2 |
| VP8: 30% faster idct_mb | Jason Garrett-Glaser | 2010-07-23 | 2 | -54/+132 |
| VP8: clear DCT blocks in iDCT instead of using clear_blocks. | Jason Garrett-Glaser | 2010-07-23 | 2 | -4/+24 |
| Use pextrw for SSE4 mbedge filter result writing, speedup 5-10cycles on | Ronald S. Bultje | 2010-07-22 | 2 | -5/+34 |
| Fix and enable horizontal >=SSE2 mbedge loopfilter. | Ronald S. Bultje | 2010-07-22 | 2 | -8/+8 |
| relicense h264 deblock sse2 to lgpl | Loren Merritt | 2010-07-22 | 3 | -15/+19 |
| sync yasm macros from x264 | Loren Merritt | 2010-07-21 | 1 | -12/+23 |
| Eliminate one instruction in VP8 dc_add_sse4 | Jason Garrett-Glaser | 2010-07-21 | 1 | -2/+1 |
| Various VP8 x86 deblocking speedups | Jason Garrett-Glaser | 2010-07-21 | 2 | -92/+107 |
| Make mmx VP8 WHT faster | Jason Garrett-Glaser | 2010-07-21 | 2 | -19/+24 |
| Add header declarations for mmx/sse constants missing them | David Conrad | 2010-07-21 | 1 | -0/+6 |
| Move ff_pw_* from vc1dsp_mmx.c to dsputil_mmx.c | David Conrad | 2010-07-21 | 2 | -7/+1 |
| VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16) | Ronald S. Bultje | 2010-07-20 | 4 | -2/+687 |
| Chroma (width=8) inner loopfilter MMX/MMX2/SSE2 for VP8 decoder. | Ronald S. Bultje | 2010-07-20 | 2 | -77/+150 |
| Revert r24339 (it causes fate failures on x86-64) - I'll figure out what's | Ronald S. Bultje | 2010-07-19 | 2 | -127/+32 |
| Remove FF_MM_SSE2/3 flags for CPUs where this is generally not faster than | Ronald S. Bultje | 2010-07-19 | 3 | -6/+25 |
| Implement chroma (width=8) inner loopfilter MMX/MMX2/SSE2 functions. | Ronald S. Bultje | 2010-07-19 | 2 | -32/+127 |
| Be more efficient with registers or stack memory. Saves 8/16 bytes stack | Ronald S. Bultje | 2010-07-19 | 1 | -16/+16 |
| Change function prototypes for width=8 inner and mbedge loopfilter functions | Ronald S. Bultje | 2010-07-19 | 2 | -19/+19 |
| more credits to D. J. Bernstein for fft | Loren Merritt | 2010-07-18 | 1 | -0/+3 |
| Attempt to fix x86-64 testsuite on fate. | Ronald S. Bultje | 2010-07-16 | 1 | -1/+1 |
| Remove duplicate define. | Ronald S. Bultje | 2010-07-16 | 1 | -1/+0 |
| Revert 24270, it contained some stuff that shouldn't have been in there. | Ronald S. Bultje | 2010-07-16 | 1 | -1/+2 |
| Remove duplicate define. | Ronald S. Bultje | 2010-07-16 | 1 | -2/+1 |
| Give x86 r%d registers names, this will simplify implementation of the chroma | Ronald S. Bultje | 2010-07-16 | 1 | -58/+81 |
| Change return statement, the REP_RET is a mistake since the else case (x86-64, | Ronald S. Bultje | 2010-07-16 | 1 | -3/+1 |
| VP8 H/V inner loopfilter MMX/MMXEXT/SSE2 optimizations. | Ronald S. Bultje | 2010-07-15 | 4 | -15/+488 |
| MMX/SSE VC1 loop filter | David Conrad | 2010-07-11 | 4 | -0/+424 |
| Make ff_pw_4 128 bits | David Conrad | 2010-07-11 | 4 | -4/+4 |
| Move SSE optimized 32-point DCT to its own file. Should fix breakage with YASM | Vitor Sessak | 2010-07-06 | 4 | -266/+298 |
| SSE optimized 32-point DCT | Vitor Sessak | 2010-07-06 | 3 | -0/+275 |
| Simple H/V loopfilter for VP8 in MMX, MMX2 and SSE2 (yay for yasm macros). | Ronald S. Bultje | 2010-07-03 | 4 | -0/+334 |
| SSSE3 versions of vp8 width4 bilinear MC functions | Jason Garrett-Glaser | 2010-07-03 | 2 | -4/+34 |
| SSSE3 versions of width4 VP8 6-tap MC functions | Jason Garrett-Glaser | 2010-07-02 | 2 | -161/+192 |