path: root/libavfilter/x86
Commit message [Author, Age, Files, Lines]
* avfilter/vf_convolution: add x86 SIMD for filter_3x3() [Ruiling Song, 2019-08-07, 3 files, -0/+204]
|   Tested using a simple command (apply edge enhance):
|   ./ffmpeg_g -i ~/Downloads/bbb_sunflower_1080p_30fps_normal.mp4 \
|     -vf convolution="0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:5:1:1:1:0:128:128:128" \
|     -an -vframes 1000 -f null /dev/null
|
|   The fps increases from 151 to 270 on my local machine.
|
|   Signed-off-by: Ruiling Song <ruiling.song@intel.com>
* avfilter/vf_gblur: add missing preprocessor check [James Almer, 2019-06-12, 1 file, -0/+4]
|   Fixes compilation on x86_32
|
|   Signed-off-by: James Almer <jamrial@gmail.com>
* avfilter/vf_gblur: add x86 SIMD optimizations [Ruiling Song, 2019-06-12, 3 files, -0/+223]
|   The horizontal pass gets ~2x performance with the patch under single thread.
|
|   Tested overall performance using the command (avx2 enabled):
|   ./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
|   ./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
|   For single thread, the fps improves from 43 to 60, about 40%.
|   For multi-thread, the fps improves from 110 to 130, about 20%.
|
|   Signed-off-by: Ruiling Song <ruiling.song@intel.com>
* avfilter: add anlmdn filter x86 SIMD optimizations [Paul B Mahol, 2019-01-10, 3 files, -0/+117]
|
* x86/af_afir: use three operand form for some instructions [James Almer, 2019-01-03, 1 file, -10/+10]
|   Fixes compilation with old yasm versions.
|
|   Signed-off-by: James Almer <jamrial@gmail.com>
* x86/af_afir: add ff_fcmul_add_avx() [James Almer, 2019-01-03, 2 files, -1/+12]
|   fcmul_add_c: 1228.8
|   fcmul_add_sse3: 334.3
|   fcmul_add_avx: 186.3
|
|   Tested on a Core i5 4460 @ 3.2GHz
|
|   Reviewed-by: Paul B Mahol <onemda@gmail.com>
|   Signed-off-by: James Almer <jamrial@gmail.com>
* avfilter/af_afir: split off fcmul_add into a DSP context [James Almer, 2019-01-03, 1 file, -1/+1]
|   Reviewed-by: Paul B Mahol <onemda@gmail.com>
|   Signed-off-by: James Almer <jamrial@gmail.com>
* x86/af_afir: fix processing the last element [James Almer, 2019-01-03, 1 file, -2/+5]
|   ff_fcmul_add_sse3() is now identical to the C version.
|
|   Reviewed-by: Paul B Mahol <onemda@gmail.com>
|   Signed-off-by: James Almer <jamrial@gmail.com>
* x86/scene_sad: fix link errors when HAVE_X86ASM is not defined [James Almer, 2018-11-21, 1 file, -1/+9]
|   Reviewed-by: Haihao Xiang <haihao.xiang@intel.com>
|   Signed-off-by: James Almer <jamrial@gmail.com>
* avfilter/vf_blend: add 10bit support [Paul B Mahol, 2018-11-15, 1 file, -3/+3]
|
* avfilter/vf_bwdif: Use common yadif frame management logic [Philip Langdale, 2018-11-14, 1 file, -1/+2]
|   After adding field type management to the common yadif logic, we can
|   remove the duplicate copy of that logic from bwdif.
* avfilter/vf_framerate: factorize SAD functions which compute SAD for a whole frame [Marton Balint, 2018-11-11, 3 files, -0/+130]
|   Also add SIMD which works on lines because it is faster than calculating it on
|   8x8 blocks using pixelutils.
|
|   Signed-off-by: Marton Balint <cus@passwd.hu>
* avfilter/vf_overlay: exclude nv12/nv21 formats from x86 asm check [Paul B Mahol, 2018-05-03, 1 file, -1/+3]
|   They are yet to be supported.
|
|   Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/vf_overlay: add x86 SIMD [Paul B Mahol, 2018-05-02, 3 files, -0/+209]
|   Specifically for yuv444, yuv422, yuv420 format when main stream has no
|   alpha, and alpha is straight.
|
|   Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter/vf_interlace: remove duplicate code with same functionality [Vasile Toncu, 2018-04-23, 2 files, -89/+1]
|
* avfilter/x86/vf_blend : add SIMD for 16 bit version of [Martin Vignali, 2018-04-05, 2 files, -60/+128]
|   grainextract
|   grainmerge
|   average
|   extremity
|   negation
* avfilter/x86/vf_blend : reorganize DIFFERENCE macro to reduce line duplication between 8bit and 16 bit version [Martin Vignali, 2018-04-05, 1 file, -22/+16]
* avfilter/x86/vf_blend : add 16 bit version for BLEND_SIMPLE, phoenix, difference for SSE and AVX2 (x86_64) [Martin Vignali, 2018-02-24, 2 files, -13/+116]
* avfilter/x86/vf_blend : indent [Martin Vignali, 2018-02-24, 1 file, -47/+47]
|
* avfilter/x86/vf_blend : reorganize init in order to add 16 bit version [Martin Vignali, 2018-02-24, 1 file, -3/+5]
|
* avfilter/x86/vf_blend : add AVX2 version for each func except divide [Martin Vignali, 2018-01-28, 2 files, -84/+184]
|   and optimize average, grainextract, multiply, screen, grain merge
* avfilter/vf_framerate: add SIMD functions for frame blending [Marton Balint, 2018-01-28, 3 files, -0/+178]
|   Blend function speedups on x86_64 Core i5 4460:
|
|   ffmpeg -f lavfi -i allyuv -vf framerate=60:threads=1 -f null none
|
|   C:     447548411 decicycles in Blend, 2048 runs, 0 skips
|   SSSE3: 130020087 decicycles in Blend, 2048 runs, 0 skips
|   AVX2:  128508221 decicycles in Blend, 2048 runs, 0 skips
|
|   ffmpeg -f lavfi -i allyuv -vf format=yuv420p12,framerate=60:threads=1 -f null none
|
|   C:     228932745 decicycles in Blend, 2048 runs, 0 skips
|   SSE4:  123357781 decicycles in Blend, 2048 runs, 0 skips
|   AVX2:  121215353 decicycles in Blend, 2048 runs, 0 skips
|
|   Signed-off-by: Marton Balint <cus@passwd.hu>
* avfilter/x86/vf_interlace : add AVX2 version [Martin Vignali, 2018-01-11, 3 files, -1/+50]
|
* Revert "avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16" [James Almer, 2017-12-19, 3 files, -33/+0]
|   This reverts commits 1a5865b6dcc97754a1d7eedc130fb58237d2a715
|   and 8fb1d63d919286971b8e6afad372730d6d6f25c8.
|
|   They made fate interlace tests fail when AVX2 was used.
|
|   Signed-off-by: James Almer <jamrial@gmail.com>
* avfilter/x86/vf_hflip : indent [Martin Vignali, 2017-12-19, 1 file, -5/+5]
|   based on patch by Paul B Mahol
* avfilter/x86/vf_hflip : add avx2 version for hflip_byte and hflip_short [Martin Vignali, 2017-12-19, 2 files, -5/+27]
|
* avfilter/x86/vf_hflip : merge hflip byte and hflip short to one macro [Martin Vignali, 2017-12-19, 1 file, -44/+17]
|
* avfilter/vf_tinterlace : add AVX2 func for lowpass_line 8 and 16 [Martin Vignali, 2017-12-19, 1 file, -0/+16]
|
* avfilter/vf_interlace : add AVX2 for lowpass_line 8 and 16 [Martin Vignali, 2017-12-19, 2 files, -0/+17]
|
* avfilter/vf_interlace : move func init in ff_interlace_init and add depth arg for ff_interlace_init_x86 [Martin Vignali, 2017-12-19, 1 file, -2/+2]
* avfilter/x86/vf_interlace : fix crash when using unaligned data in low_pass complex [Martin Vignali, 2017-12-15, 1 file, -22/+25]
|   related to ticket 6491
* avfilter/x86/vf_interlace : avoid crash when data are unaligned [Martin Vignali, 2017-12-15, 1 file, -2/+2]
|   ticket 6491
* avfilter/x86/vf_threshold : add threshold16 SIMD (SSE4 and AVX2) [Martin Vignali, 2017-12-09, 2 files, -21/+34]
|
* x86/vf_hflip: use xor to zero initialize registers [James Almer, 2017-12-07, 1 file, -2/+2]
|   Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_hflip: don't load the width argument twice [James Almer, 2017-12-07, 1 file, -3/+2]
|   Signed-off-by: James Almer <jamrial@gmail.com>
* x86/vf_threshold: make threshold8 functions work on x86_32 [James Almer, 2017-12-04, 2 files, -12/+17]
|   Signed-off-by: James Almer <jamrial@gmail.com>
* avfilter/x86/vf_hflip.asm: fix building on x32 [Paul B Mahol, 2017-12-04, 1 file, -6/+6]
|   Signed-off-by: Paul B Mahol <onemda@gmail.com>
* avfilter: add hflip x86 SIMD [Paul B Mahol, 2017-12-04, 3 files, -0/+151]
|   Signed-off-by: Paul B Mahol <onemda@gmail.com>
* x86/vf_threshold: use the PBLENDVB macro [James Almer, 2017-12-04, 1 file, -1/+1]
|   Fixes building with yasm
|
|   Tested-by: stevenliu
|   Signed-off-by: James Almer <jamrial@gmail.com>
* avfilter/x86/vf_threshold : cosmetic indent [Martin Vignali, 2017-12-03, 1 file, -8/+8]
|
* avfilter/x86/vf_threshold : add avx2 version for threshold 8 [Martin Vignali, 2017-12-03, 2 files, -3/+22]
|
* avfilter/x86/vf_threshold : make macro for threshold8 in order to add avx2 version [Martin Vignali, 2017-12-03, 1 file, -1/+5]
* avfilter/vf_threshold: add x86 SIMD [Paul B Mahol, 2017-12-02, 3 files, -0/+112]
|   Signed-off-by: Paul B Mahol <onemda@gmail.com>
* Merge commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2' [James Almer, 2017-10-21, 1 file, -24/+0]
|\
| |   * commit '994c4bc10751e39c7ed9f67ffd0c0dea5223daf2':
| |     x86util: Port all macros to cpuflags
| |
| |   See d5f8a642f6eb1c6e305c41dabddd0fd36ffb3f77
| |
| |   Merged-by: James Almer <jamrial@gmail.com>
| * build: Generalize yasm/nasm-related variable names [Diego Biurrun, 2017-03-01, 3 files, -11/+11]
| |   None of them are specific to the YASM assembler.
| * x86: Add missing colons after assembly labels [Diego Biurrun, 2016-10-17, 1 file, -1/+1]
| |   This fixes many warnings of the sort
| |     warning: label alone on a line without a colon might be in error
* | avfilter/interlace: add support for 10 and 12 bit [Thomas Mundt, 2017-09-23, 3 files, -35/+147]
| |   Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
| |   Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
| |   Signed-off-by: James Almer <jamrial@gmail.com>
* | avfilter/interlace: prevent over-sharpening with the complex low-pass filter [Thomas Mundt, 2017-09-15, 1 file, -22/+33]
| |   The complex vertical low-pass filter slightly over-sharpens the picture. This becomes visible when several transcodings are cascaded and the error potentises, e.g. some generations of HD->SD SD->HD.
| |   To prevent this behaviour the destination pixel must not exceed the source pixel when the average of the pixels above and below is less than the source pixel. And the other way around.
| |
| |   Tested and approved in a visual transcoding cascade test by video professionals.
| |   SSIM/PSNR test with the first generation of an HD->SD file as a reference against the 6th generation (3 x SD->HD HD->SD):
| |
| |   Results without the patch:
| |   SSIM Y:0.956508 (13.615881) U:0.991601 (20.757750) V:0.993004 (21.551382) All:0.974405 (15.918463)
| |   PSNR y:31.838009 u:48.424280 v:48.962711 average:34.759466 min:31.699297 max:40.857847
| |
| |   Results with the patch:
| |   SSIM Y:0.970051 (15.236232) U:0.991883 (20.905857) V:0.993174 (21.658049) All:0.981290 (17.279202)
| |   PSNR y:34.412108 u:48.504454 v:48.969496 average:37.264644 min:34.310637 max:42.373392
| |
| |   Signed-off-by: Thomas Mundt <tmundt75@gmail.com>
| |   Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
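The clamping rule described in the commit message above can be sketched in C as follows. This is only an illustration of the stated rule, not the code from vf_interlace: the function name limit_overshoot, the 8-bit pixel type, and the filtered parameter (standing in for whatever value the complex low-pass filter produced before clamping) are assumptions made for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not FFmpeg's implementation).
 *
 * src      : source pixel on the current line
 * above    : pixel on the line above
 * below    : pixel on the line below
 * filtered : assumed output of the low-pass filter before clamping
 *
 * If the neighbour average is below src, the result may not exceed src;
 * if it is above src, the result may not fall below src.
 */
static uint8_t limit_overshoot(uint8_t src, uint8_t above, uint8_t below,
                               int filtered)
{
    /* Compare doubled values so the neighbour average needs no rounding. */
    int neighbours2 = above + below;
    int src2        = 2 * src;

    if (neighbours2 < src2 && filtered > src)
        filtered = src;   /* neighbours darker: do not exceed the source */
    else if (neighbours2 > src2 && filtered < src)
        filtered = src;   /* neighbours brighter: do not drop below the source */

    /* Clip to the 8-bit range before narrowing. */
    if (filtered < 0)
        filtered = 0;
    if (filtered > 255)
        filtered = 255;
    return (uint8_t)filtered;
}
```

With a source pixel of 100, neighbours of 80 and 90, and a filtered value of 110, the helper returns 100, because the neighbour average is below the source and the filtered value overshoots it.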
* | avfilter/vf_blend: rename addition128 and difference128 to grainmerge and grainextract [Paul B Mahol, 2017-08-24, 2 files, -9/+9]
* | x86/vf_limiter: make limiter functions work on x86_32 [James Almer, 2017-07-13, 2 files, -18/+14]
| |   Signed-off-by: James Almer <jamrial@gmail.com>