ffmpeg-streaming - Raptor Engineering's fork of FFmpeg with streaming enhancements https://git.ffmpeg.org/ffmpeg.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	arm: use a local label instead of the function symbol in ff_prefetch_arm	Janne Grunau	2015-07-20	1	-1/+2
\|	Avoids a relocation which might end out of range for thumb2.
\|	Reported-By: Ludovic Fauvet <etix@videolan.org>
\|	Bug-Id: https://bugs.webkit.org/show_bug.cgi?id=137022
\|	CC: libav-stable@libav.org
*	h264: arm: use intra pred8x8 functions only for chroma_format_idc <= 1	Janne Grunau	2015-07-18	1	-14/+16
\|
*	configure: Factor out g722dsp module	Vittorio Giovara	2015-07-17	1	-4/+2
\|
*	configure: Factor out vp8dsp module	Vittorio Giovara	2015-07-17	1	-12/+6
\|
*	configure: Factor out rv34dsp module	Vittorio Giovara	2015-07-17	1	-3/+2
\|
*	configure: Factor out flacdsp module	Vittorio Giovara	2015-07-17	1	-2/+2
\|
*	lavc: do not compile fmtconvert unconditionally	Anton Khirnov	2015-02-28	1	-4/+3
\| \| \| \|	Only ac3dec and dcadec use it.
*	fmtconvert: drop unused functions	Anton Khirnov	2015-02-28	4	-434/+0
\|
*	g722: Add ARM NEON implementation for g722_apply_qmf()	Peter Meerwald	2015-02-15	3	-0/+108
\|	Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	arm: mlpdsp: handle pic offset calculation in a macro	Janne Grunau	2014-12-09	1	-16/+20
\| \| \| \| \|	Makes the code easier to read since it hides different offset calculations for arm and thumb mode.
*	arm: make ff_mlp_filter_channel_arm and ff_mlp_rematrix_channel_arm position independent	Janne Grunau	2014-12-09	1	-10/+13
\|	No significant difference in used cpu cycles on a cortex-a9.
*	arm: Use .data.rel.ro for const data with relocations	Martin Storsjö	2014-12-09	3	-3/+3
\| \| \| \|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	arm: fft_vfp: Unify the behaviour in ff_fft_calc_vfp between arm/thumb	Martin Storsjö	2014-12-08	1	-10/+5
\|	Don't include the function pointer table in the code segment in arm mode.
\|	This shouldn't have any significant performance effect. It does end up as a few more instructions than before, for ARM, but only at the entry to this function, not within the fft functions themselves.
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	arm: fft_vfp: Add a missing "endconst" when building in thumb mode	Martin Storsjö	2014-12-08	1	-0/+1
\| \| \| \|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	motion_est: convert stride to ptrdiff_t	Vittorio Giovara	2014-11-24	1	-5/+5
\| \| \| \| \|	CC: libav-stable@libav.org Bug-Id: CID 700556 / CID 700557 / CID 700558
*	idctdsp: Add global function pointers for {add\|put}_pixels_clamped functions	Diego Biurrun	2014-09-02	1	-7/+0
\| \| \| \| \| \|	These function pointers already existed in the ARM code. Adding them globally allows calls to the function pointers to access arch-optimized versions of the functions transparently.
*	build: Add explanatory comments to (optimization) blocks in the Makefiles	Diego Biurrun	2014-08-15	1	-0/+18
\|
*	mpegvideo: cosmetics: Lowercase ugly uppercase MPV_ function name prefixes	Diego Biurrun	2014-08-15	3	-4/+4
\|
*	vc-1: Add platform-specific start code search routine to VC1DSPContext.	Ben Avison	2014-08-04	1	-0/+3
\|	Initialise VC1DSPContext for parser as well as for decoder.
\|	Note, the VC-1 code doesn't actually use the function pointer yet.
\|	Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
*	h264: Move start code search functions into separate source files.	Ben Avison	2014-08-04	4	-6/+31
\|	This permits re-use with parsers for codecs which use similar start codes.
\|	Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
*	qpeldsp: Mark source pointer in qpel_mc_func function pointer const	Diego Biurrun	2014-07-25	2	-66/+68
\|
*	arm: Macroize the test for 'setend' CPU instruction support	Ben Avison	2014-07-21	1	-5/+1
\| \| \| \|	Signed-off-by: Diego Biurrun <diego@biurrun.de>
*	dct-test: Move arch-specific bits into arch-specific subdirectories	Diego Biurrun	2014-07-21	1	-0/+40
\|
*	idct: Move arm-specific declarations to a header in the arm directory	Diego Biurrun	2014-07-20	5	-15/+44
\|
*	idctdsp: prettyprinting cosmetics	Diego Biurrun	2014-07-18	4	-20/+20
\|
*	idct: Convert IDCT permutation #defines to an enum	Diego Biurrun	2014-07-18	4	-5/+5
\| \| \| \|	Also rename the enum values to be consistent with other DCT permutations.
*	arm: cosmetics: Consistently use lowercase for shift operators	Martin Storsjö	2014-07-18	2	-3/+3
\| \| \| \|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	arm: cosmetics: Fix a misaligned asm operand	Martin Storsjö	2014-07-18	1	-1/+1
\| \| \| \|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	armv6: Accelerate ff_fft_calc for general case (nbits != 4)	Ben Avison	2014-07-18	2	-17/+255
\|	The previous implementation targeted DTS Coherent Acoustics, which only requires nbits == 4 (fft16()). This case was (and still is) linked directly rather than being indirected through ff_fft_calc_vfp(), but now the full range from radix-4 up to radix-65536 is available. This benefits other codecs such as AAC and AC3.
\|	The implementation is based upon the C version, with each routine larger than radix-16 calling a hierarchy of smaller FFT functions, then performing a post-processing pass. This pass benefits a lot from loop unrolling to counter the long pipelines in the VFP. A relaxed calling standard also reduces the overhead of the call hierarchy, and avoiding the excessive inlining performed by GCC probably helps with I-cache utilisation too.
\|	I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in the FFT routines (fft4() to fft512() and pass()) for the same sample AAC stream:
\|	                 Before           After
\|	                 Mean    StdDev   Mean    StdDev  Confidence  Change
\|	Audio decode     2245.5  53.1     1599.6  43.8    100.0%      +40.4%
\|	FFT routines     940.6   22.0     348.1   20.8    100.0%      +170.2%
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
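The structure described in the commit message above (a larger FFT calling a hierarchy of smaller FFTs, then running a combining pass) can be illustrated with a generic split-radix FFT. This is only a minimal sketch under stated assumptions: it is a textbook recursive formulation in Python, not the VFP assembly or FFmpeg's C code, and the name split_radix_fft is hypothetical. The input length is assumed to be a power of two.

```python
import cmath

def split_radix_fft(x):
    """Textbook split-radix FFT sketch (hypothetical helper, not FFmpeg code).

    Assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    if n == 2:
        return [x[0] + x[1], x[0] - x[1]]

    # Hierarchy of smaller transforms: one half-size, two quarter-size.
    u = split_radix_fft(x[0::2])   # even-indexed samples, size n/2
    z = split_radix_fft(x[1::4])   # samples 1, 5, 9, ...,  size n/4
    zp = split_radix_fft(x[3::4])  # samples 3, 7, 11, ..., size n/4

    # Post-processing pass: combine the three sub-results with twiddles.
    q = n // 4
    out = [0j] * n
    for k in range(q):
        w1 = cmath.exp(-2j * cmath.pi * k / n)
        w3 = cmath.exp(-2j * cmath.pi * 3 * k / n)
        s = w1 * z[k] + w3 * zp[k]
        d = -1j * (w1 * z[k] - w3 * zp[k])
        out[k] = u[k] + s
        out[k + 2 * q] = u[k] - s
        out[k + q] = u[k + q] + d
        out[k + 3 * q] = u[k + q] - d
    return out
```

The loop at the end plays the role of the combining pass; in an optimised implementation that loop is the natural candidate for unrolling, since the recursive calls only set up its inputs.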
*	armv6: Accelerate ff_imdct_half for general case (mdct_bits != 6)	Ben Avison	2014-07-18	1	-2/+144
\|	The previous implementation targeted DTS Coherent Acoustics, which only requires mdct_bits == 6. This relatively small size lent itself to unrolling the loops a small number of times, and encoding offsets calculated at assembly time within the load/store instructions of each iteration.
\|	In the more general case (codecs such as AAC and AC3) much larger arrays are used - mdct_bits == [8, 9, 11]. The old method does not scale for these cases, so more integer registers are used with non-unrolled versions of the loops (and with some stack spillage). The postrotation filter loop is still unrolled by a factor of 2 to permit the double-buffering of some VFP registers to facilitate overlap of neighbouring iterations.
\|	I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in ff_imdct_half_c / ff_imdct_half_vfp, for the same example AAC stream:
\|	                   Before           After
\|	                   Mean    StdDev   Mean    StdDev  Confidence  Change
\|	aac_decode_frame   2368.1  35.8     2117.2  35.3    100.0%      +11.8%
\|	ff_imdct_half_*    457.5   22.4     251.2   16.2    100.0%      +82.1%
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
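The "Change" figures in these benchmark tables appear consistent with a speedup computed as (Before mean / After mean) - 1, rather than as the relative reduction in sample count. The following is a small sketch of that arithmetic; the formula is inferred from the published means, not stated in the commit message, and the helper name speedup_percent is hypothetical.

```python
def speedup_percent(before_mean, after_mean):
    """Speedup as a percentage: how much more work per sample after the change.

    Assumption: this is the formula behind the "Change" column; it is
    inferred from the numbers, not documented in the commit message.
    """
    return (before_mean / after_mean - 1.0) * 100.0

# Means taken from the ff_imdct_half benchmark above.
whole_decoder = speedup_percent(2368.1, 2117.2)
imdct_only = speedup_percent(457.5, 251.2)
print(f"aac_decode_frame: {whole_decoder:+.1f}%")
print(f"ff_imdct_half_*:  {imdct_only:+.1f}%")
```

Because the table lists means already rounded to one decimal place, the recomputed values can differ from the reported ones in the last digit.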
*	dsputil: Split motion estimation compare bits off into their own context	Diego Biurrun	2014-07-17	3	-5/+4
\|
*	arm: dsputil: Coalesce all init files	Diego Biurrun	2014-07-16	4	-91/+28
\|
*	dsputil: Drop unused bit_depth parameter from all init functions	Diego Biurrun	2014-07-11	3	-7/+4
\|
*	dsputil: Split off pixel block routines into their own context	Diego Biurrun	2014-07-09	5	-63/+120
\|
*	arm: Avoid using the 'setend' instruction on ARMv7 and newer	Martin Storsjö	2014-07-08	1	-1/+5
\|	This instruction is deprecated on ARMv8, and it is serializing on some ARMv7 cores as well [1].
\|	[1] http://article.gmane.org/gmane.linux.ports.arm.kernel/339293
\|	CC: libav-stable@libav.org
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	dsputil: Move pix_sum, pix_norm1, shrink function pointers to mpegvideoenc	Diego Biurrun	2014-07-06	5	-61/+116
\|
*	dsputil: Split off IDCT bits into their own context	Diego Biurrun	2014-06-30	13	-128/+250
\|
*	h264: avoid using uninitialized memory in NEON chroma mc	Janne Grunau	2014-06-23	1	-4/+56
\| \| \| \| \|	Adapt commit 982b596ea6640bfe218a31f6c3fc542d9fe61c31 for the arm and aarch64 NEON asm. 5-10% faster on Cortex-A9.
*	dsputil: Split audio operations off into a separate context	Diego Biurrun	2014-06-22	7	-55/+168
\|
*	dsputil: Split clear_block/fill_block off into a separate context	Diego Biurrun	2014-06-18	7	-24/+137
\|
*	arm: check if AS supports .dn	Janne Grunau	2014-06-03	2	-0/+8
\|	Move the GNU as check before the arch specific asm checks since the .dn check requires gas compatible assembler. Disable the VC-1 motion compensation NEON asm which is the only part using that directive.
\|	The integrated assembler in the upcoming clang 3.5 does not support .dn/.qn without plans to change that. Too much effort to implement it while it is rarely used. http://llvm.org/bugs/show_bug.cgi?id=18199.
*	dsputil: Move APE-specific bits into apedsp	Diego Biurrun	2014-05-29	5	-45/+102
\|
*	mpegvideo: move the MpegEncContext fields used from arm asm to the beginning	Anton Khirnov	2014-04-29	1	-6/+6
\| \| \| \| \|	This should reduce the frequency with which the offsets need to be updated.
*	lavu: add CHK_OFFS as AV_CHECK_OFFSET to check struct member offsets	Janne Grunau	2014-04-24	2	-13/+8
\|
*	Remove a number of unnecessary dsputil.h #includes	Diego Biurrun	2014-04-04	1	-1/+0
\|
*	arm: asm decode_block_coeffs_internal is vp8 specific	Janne Grunau	2014-04-04	1	-1/+1
\| \| \| \| \|	Unbreaks compilation on arm due to conflicting types for 'ff_decode_block_coeffs_armv6'.
*	On2 VP7 decoder	Peter Ross	2014-04-04	6	-46/+72
\|	Further performance improvements and security fixes by Vittorio Giovara, Luca Barbato and Diego Biurrun.
\|	Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
\|	Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
\|	Signed-off-by: Diego Biurrun <diego@biurrun.de>
*	arm: build: Maintain decoder objects separate from infrastructure objects	Diego Biurrun	2014-03-27	1	-3/+4
\|
*	truehd: add hand-scheduled ARM asm version of ff_mlp_pack_output.	Ben Avison	2014-03-26	3	-0/+628
\|	Profiling results for overall decode and the output_data function in particular are as follows:
\|	                Before          After
\|	                Mean   StdDev   Mean   StdDev  Confidence  Change
\|	6:2 total       339.6  15.1     329.3  16.0    95.8%       +3.1%   (insignificant)
\|	6:2 function    24.6   6.0      9.9    3.1     100.0%      +148.5%
\|	8:2 total       324.5  15.5     323.6  14.3    15.2%       +0.3%   (insignificant)
\|	8:2 function    20.4   3.9      9.9    3.4     100.0%      +104.7%
\|	6:6 total       572.8  20.6     539.9  24.2    100.0%      +6.1%
\|	6:6 function    54.5   5.6      16.0   3.8     100.0%      +240.9%
\|	8:8 total       741.5  21.2     702.5  18.5    100.0%      +5.6%
\|	8:8 function    63.9   7.6      18.4   4.8     100.0%      +247.3%
\|	The assembly version has also been tested with a fuzz tester to ensure that any combinations of inputs not exercised by my available test streams still generate mathematically identical results to the C version.
\|	Signed-off-by: Martin Storsjö <martin@martin.st>
*	truehd: add hand-scheduled ARM asm version of ff_mlp_rematrix_channel.	Ben Avison	2014-03-26	2	-0/+234
\|	Profiling results for overall audio decode and the rematrix_channels function in particular are as follows:
\|	                Before          After
\|	                Mean   StdDev   Mean   StdDev  Confidence  Change
\|	6:2 total       370.8  17.0     348.8  20.1    99.9%       +6.3%
\|	6:2 function    46.4   8.4      45.8   6.6     18.0%       +1.2%   (insignificant)
\|	8:2 total       343.2  19.0     339.1  15.4    54.7%       +1.2%   (insignificant)
\|	8:2 function    38.9   3.9      40.2   6.9     52.4%       -3.2%   (insignificant)
\|	6:6 total       658.4  15.7     604.6  20.8    100.0%      +8.9%
\|	6:6 function    109.0  8.7      59.5   5.4     100.0%      +83.3%
\|	8:8 total       896.2  24.5     766.4  17.6    100.0%      +16.9%
\|	8:8 function    223.4  12.8     93.8   5.0     100.0%      +138.3%
\|	The assembly version has also been tested with a fuzz tester to ensure that any combinations of inputs not exercised by my available test streams still generate mathematically identical results to the C version.
\|	Signed-off-by: Martin Storsjö <martin@martin.st>