path: root/libavfilter/x86
Commit message | Author | Age | Files | Lines
* vf_blend: Reduce number of arguments for kernel function | Timothy Gu | 2016-02-14 | 2 | -3/+2
* x86/vf_blend: Add SSE2 optimization for screen | Timothy Gu | 2016-02-10 | 2 | -0/+31
    10x faster than C.
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
* x86/vf_blend: Move multiplying to a macro | Timothy Gu | 2016-02-10 | 1 | -6/+10
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
* vf_blend: Add SSE2 optimization for multiply | Timothy Gu | 2016-02-08 | 2 | -0/+31
    5 times faster than C, 3 times overall.
* x86/vf_w3fdif: 32-bit compatibility for w3fdif_simple_high | Hendrik Leppkes | 2016-01-08 | 2 | -3/+34
* x86/vf_stereo3d: remove a few unnecessary movas | James Almer | 2016-01-03 | 1 | -12/+12
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_stereo3d: make ff_anaglyph_sse4 work on x86_32 | James Almer | 2015-12-28 | 2 | -4/+45
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_stereo3d: optimize register usage | James Almer | 2015-12-28 | 1 | -78/+86
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_blend: add sse2 versions of blend_difference and blend_negation | James Almer | 2015-12-24 | 2 | -3/+13
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_blend: make all functions work on x86_32 | James Almer | 2015-12-24 | 2 | -55/+52
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_blend: simplify using macros | James Almer | 2015-12-24 | 2 | -325/+53
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_maskedmerge: make ff_maskedmerge8_sse2 work on x86_32 | James Almer | 2015-12-24 | 2 | -12/+19
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* avfilter/x86/vf_maskedmerge: Clear upper part of width | Michael Niedermayer | 2015-12-23 | 1 | -0/+1
    Fixes crash
    Fixes: Ticket5055
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avfilter/x86/vf_maskedmerge: move %define out of .nextrow | Paul B Mahol | 2015-12-10 | 1 | -2/+2
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* x86/vf_w3fdif: use aligned loads in w3fdif_complex_high | James Almer | 2015-10-27 | 1 | -4/+2
    Found-by: Ronald S. Bultje <rsbultje@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_w3fdif: use aligned loads in w3fdif_simple_high | James Almer | 2015-10-11 | 1 | -4/+2
    Found-by: Ronald S. Bultje <rsbultje@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_w3fdif: simplify w3fdif_simple_high | James Almer | 2015-10-11 | 1 | -9/+7
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_w3fdif: move pxor outside the loop in w3fdif_complex_low | James Almer | 2015-10-11 | 1 | -4/+4
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* avfilter/x86/vf_w3fdif: add colons after labels | Paul B Mahol | 2015-10-10 | 1 | -5/+5
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/vf_w3fdif: add x86 SIMD | Paul B Mahol | 2015-10-10 | 3 | -0/+298
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* doc: fix spelling errors | Andreas Cadhalpun | 2015-10-09 | 1 | -2/+2
    Reviewed-by: Lou Logan <lou@lrcd.com>
    Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
* avfilter/x86/vf_blend.asm: hardmix: do same with two pxor instructions less | Paul B Mahol | 2015-10-07 | 1 | -3/+4
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/x86/vf_blend.asm: 11th register is used, update functions | Paul B Mahol | 2015-10-07 | 1 | -14/+14
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/x86/vf_blend.asm: add hardmix and phoenix sse2 SIMD | Paul B Mahol | 2015-10-07 | 2 | -0/+78
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/vf_stereo3d: add x86 SIMD for anaglyph outputs | Paul B Mahol | 2015-10-06 | 3 | -0/+206
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/vf_blend: Fix argument types, fix segfault in asm | Michael Niedermayer | 2015-10-03 | 1 | -12/+12
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* avfilter/vf_blend: add x86 SIMD for some modes | Paul B Mahol | 2015-10-03 | 3 | -0/+493
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/vf_maskedmerge: add SIMD for maskedmerge with 8 bit depth input | Paul B Mahol | 2015-10-02 | 3 | -0/+115
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/x86/vf_psnr.asm: fix typo | Paul B Mahol | 2015-10-01 | 1 | -1/+1
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* Replace all remaining occurances of step/depth_minus1 and offset_plus1 | Hendrik Leppkes | 2015-09-08 | 1 | -1/+1
* options: mark av_get_{int,double,q} as deprecated. | Ronald S. Bultje | 2015-08-18 | 1 | -1/+3
    Convert last users to av_opt_get_*() counterparts.
* x86inc: Drop SECTION_TEXT macro | Henrik Gramner | 2015-08-04 | 4 | -4/+4
    The .text section is already 16-byte aligned by default on all supported platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
* x86/vf_interlace: add missing colon to labels | James Almer | 2015-07-26 | 1 | -1/+1
    Silences warnings with Nasm
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_ssim: add ff_ssim_4x4_line_xop | James Almer | 2015-07-20 | 2 | -3/+64
    ~20% faster than ssse3. Also enabled for x86_32
    Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_ssim: fix some instruction comments | James Almer | 2015-07-20 | 1 | -2/+2
    Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
    Signed-off-by: James Almer <jamrial@gmail.com>
* avfilter/x86/vf_psnr.asm: split one line of license text into two | Paul B Mahol | 2015-07-14 | 1 | -1/+2
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/vf_removegrain: add x86 and x86_64 SSE2 functions | James Darnley | 2015-07-14 | 3 | -0/+1310
    Speed of all modes increased by a factor between 7.4 and 19.8 largely depending on whether bytes are unpacked into words. Modes 2, 3, and 4 have been sped-up by a factor of 43 (thanks quick sort!)
    All modes are available on x86_64 but only modes 1, 10, 11, 12, 13, 14, 19, 20, 21, and 22 are available on x86 due to the number of SIMD registers used.
    With a contribution from James Almer <jamrial@gmail.com>
* vf_psnr: sse2 optimizations for sum-squared-error. | Ronald S. Bultje | 2015-07-14 | 3 | -0/+180
    The internal line accumulator for 16bit can overflow, so I changed that from int to uint64_t in the C code. The matching assembly looks a little weird but output looks correct.
    (avx2 should be trivial to add later.)
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Reviewed-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* vf_ssim: x86 simd for ssim_4x4xN and ssim_endN. | Ronald S. Bultje | 2015-07-14 | 3 | -0/+231
    Both are 2-2.5x faster than their C counterpart.
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Reviewed-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* x86: check for AV_CPU_FLAG_AVXSLOW where useful | James Almer | 2015-06-01 | 1 | -1/+1
    Signed-off-by: James Almer <jamrial@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avfilter/x86/vf_hqdn3d: Fix register types | Michael Niedermayer | 2015-05-27 | 1 | -2/+2
    Fixes Ticket4301
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avfilter/x86/vf_fspp: Fix invalid combination of opcode and operands | Michael Niedermayer | 2015-05-26 | 1 | -4/+4
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avfilter/x86/vf_fspp: Fix loop condition for column_fidct() | Michael Niedermayer | 2015-01-28 | 1 | -2/+2
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avfilter/vf_eq: mark src as const | Michael Niedermayer | 2015-01-27 | 1 | -1/+1
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avfilter/vf_eq: Fix clipping code | Michael Niedermayer | 2015-01-26 | 1 | -1/+1
    Found-by: Christophe Gisquet <christophe.gisquet@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* avfilter: Port mp=eq/eq2 to lavfi | Arwa Arif | 2015-01-26 | 2 | -0/+97
    Code adapted from James Darnley's port
    Some fixes from Paul B Mahol <onemda@gmail.com>
    Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* x86/vf_pp7: port dctB_mmx to yasm | James Almer | 2015-01-09 | 4 | -69/+93
    Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
    Signed-off-by: James Almer <jamrial@gmail.com>
* lavfi: port mp=pp7 to libavfilter | Arwa Arif | 2015-01-09 | 2 | -0/+69
    The only difference with mp=pp7 is that default mode is "medium", as stated in the MPlayer docs, rather than "hard".
    Signed-off-by: Stefano Sabatini <stefasab@gmail.com>
* x86/vf_fspp: move pxor in store slice functions out of the loop | James Almer | 2014-12-26 | 1 | -2/+2
    m7 is not overwritten, so we only need to clear it once.
    Found by Christophe Gisquet.
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_fspp: port inline asm to yasm | James Almer | 2014-12-26 | 4 | -1410/+778
    Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
    Signed-off-by: James Almer <jamrial@gmail.com>
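
Note on the vf_psnr entry (2015-07-14): its message says the 16-bit line accumulator could overflow an int and was widened to uint64_t. The C sketch below illustrates why that can happen; it is not the FFmpeg code. The function name `sse_line_16bit` and its signature are hypothetical, chosen only for illustration. A single squared difference between 16-bit samples can reach 65535 * 65535 = 4294836225, which already exceeds the largest 32-bit signed int (2147483647), so the sum over a line needs a wider type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not FFmpeg's actual function): sum of squared
 * differences over one line of 16-bit samples. The accumulator is
 * 64 bits wide because a single term can exceed INT32_MAX. */
static uint64_t sse_line_16bit(const uint16_t *a, const uint16_t *b, size_t n)
{
    uint64_t sum = 0;

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Widen before subtracting so the difference keeps its sign. */
        int64_t d = (int64_t)a[i] - (int64_t)b[i];
        sum += (uint64_t)(d * d);
    }
    return sum;
}
```

Each term is at most about 2^32, so a 64-bit accumulator leaves room for roughly 2^32 worst-case samples before it could wrap, far more than one line of video.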