path: root/libavfilter/x86/vf_blend.asm
Commit message | Author | Age | Files | Lines
* avfilter/x86/vf_blend : add SIMD for 16 bit version of | Martin Vignali | 2018-04-05 | 1 | -60/+108
    grainextract grainmerge average extremity negation
* avfilter/x86/vf_blend : reorganize DIFFERENCE macro to reduce line duplication between 8bit and 16 bit version | Martin Vignali | 2018-04-05 | 1 | -22/+16
* avfilter/x86/vf_blend : add 16 bit version for BLEND_SIMPLE, phoenix, difference for SSE and AVX2 (x86_64) | Martin Vignali | 2018-02-24 | 1 | -13/+62
* avfilter/x86/vf_blend : avfilter/x86/vf_blend : add AVX2 version for each func except divide and optimize average, grainextract, multiply, screen, grain merge | Martin Vignali | 2018-01-28 | 1 | -84/+145
* avfilter/vf_blend: rename addition128 and difference128 to grainmerge and grainextract | Paul B Mahol | 2017-08-24 | 1 | -2/+2
* x86/vf_blend: use ABS2 macro | James Almer | 2017-06-27 | 1 | -6/+3
* x86/vf_blend: optimize difference and negation functions | James Almer | 2017-06-27 | 1 | -16/+24
    Process more pixels per loop.
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_blend: add sse and ssse3 extremity functions | James Almer | 2017-06-27 | 1 | -0/+25
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_blend: Add SSE2 optimization for divide | Timothy Gu | 2016-02-28 | 1 | -0/+30
    4.5x faster than C float version with autovectorization
    10 x faster than C int version
    25 x faster than C float version without autovectorization
* vf_blend: Reduce number of arguments for kernel function | Timothy Gu | 2016-02-14 | 1 | -2/+1
* x86/vf_blend: Add SSE2 optimization for screen | Timothy Gu | 2016-02-10 | 1 | -0/+29
    10x faster than C.
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
* x86/vf_blend: Move multiplying to a macro | Timothy Gu | 2016-02-10 | 1 | -6/+10
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
* vf_blend: Add SSE2 optimization for multiply | Timothy Gu | 2016-02-08 | 1 | -0/+29
    5 times faster than C, 3 times overall.
* x86/vf_blend: add sse2 versions of blend_difference and blend_negation | James Almer | 2015-12-24 | 1 | -3/+9
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_blend: make all functions work on x86_32 | James Almer | 2015-12-24 | 1 | -53/+50
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_blend: simplify using macros | James Almer | 2015-12-24 | 1 | -243/+33
    Reviewed-by: Paul B Mahol <onemda@gmail.com>
    Signed-off-by: James Almer <jamrial@gmail.com>
* avfilter/x86/vf_blend.asm: hardmix: do same with two pxor instructions less | Paul B Mahol | 2015-10-07 | 1 | -3/+4
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/x86/vf_blend.asm: 11th register is used, update functions | Paul B Mahol | 2015-10-07 | 1 | -14/+14
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/x86/vf_blend.asm: add hardmix and phoenix sse2 SIMD | Paul B Mahol | 2015-10-07 | 1 | -0/+64
    Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/vf_blend: add x86 SIMD for some modes | Paul B Mahol | 2015-10-03 | 1 | -0/+367
    Signed-off-by: Paul B Mahol <onemda@gmail.com>