| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Fix related register order issue in ff_h264_idct_add_neon.
Found-by: zjh8890 <243186085@qq.com>
|
|
|
|
|
|
|
|
| |
This macro unconditionally used out[-1], which causes an out of bounds
read, if out is the very beginning of the buffer.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
|
|
|
|
|
|
|
|
|
|
| |
Due to this typo max_center can be too large, causing nlsf to be set to
too large values, which in turn can cause nlsf[i - 1] + min_delta[i] to
overflow to a negative value, which is not allowed for nlsf and can
cause an out of bounds read in silk_lsf2lpc.
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Quite a bit faster than int32_to_float_fmul_array8_c calling
ff_int32_to_float_fmul_scalar_neon through FmtConvertContext.
Number of cycles per int32_to_float_fmul_array8 call while decoding
padded.dts on exynos5422:
before after change
cortex-a7: 1270 951 -25%
cortex-a15: 434 285 -34%
checkasm --bench cycle counts: cortex-a15 cortex-a7
int32_to_float_fmul_array8_c: 1730.4 4384.5
int32_to_float_fmul_array8_neon_c: 571.5 1694.3
int32_to_float_fmul_array8_neon: 374.0 1448.8
Interesting are the differences between
int32_to_float_fmul_array8_neon_c and int32_to_float_fmul_array8_neon.
The former is current behaviour of calling
ff_int32_to_float_fmul_scalar_neon repeatedly from the c function,
The raw numbers differ since checkasm uses different lengths than the
dca decoder.
|
|
|
|
|
|
|
|
|
|
| |
3% faster dts decoding on a cortex-a57.
cortex-a57 cortex-a53
int32_to_float_fmul_array8_c: 1270.9 4475.6
int32_to_float_fmul_array8_neon: 328.6 569.2
int32_to_float_fmul_scalar_c: 928.5 4119.6
int32_to_float_fmul_scalar_neon: 309.1 524.1
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
~25% faster dts decoding overall. The checkasm CPU cycles numbers are
not that useful since synth_filter_float() calls FFTContext.imdct_half().
cortex-a57 cortex-a53
synth_filter_float_c: 1866.2 3490.9
synth_filter_float_neon: 915.0 1531.5
With fftc.imdct_half forced to imdct_half_neon:
cortex-a57 cortex-a53
synth_filter_float_c: 1718.4 3025.3
synth_filter_float_neon: 926.2 1530.1
|
|
|
|
|
|
|
|
|
|
|
|
| |
~2% faster dts decoding overall.
cortex-a57 cortex-a53
dca_decode_hf_c: 474.8 1659.9
dca_decode_hf_neon: 225.2 301.1
dca_lfe_fir0_c: 913.2 1537.7
dca_lfe_fir0_neon: 286.8 451.9
dca_lfe_fir1_c: 848.7 1711.5
dca_lfe_fir1_neon: 387.1 506.4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu
implementations do not support it in hardware. Vector mode code will
depending the OS either be emulated in software or result in an illegal
instruction on cpus which does not support it. This was not really
problem in practice since NEON implementations of the same functions are
preferred. It will however become a problem for checkasm which tests
every cpu flag separately.
Since this is a cpu feature newer cpu do not support anymore the
behaviour of this flag differs from the other flags. It can be only
activated by runtime cpu feature selection.
|
| |
|
|
|
|
| |
It's not used by anything outside of lavc anymore.
|
|
|
|
|
|
| |
Almost all the places from which this function is called already check
the header manually and in the two that don't (the mp3 muxer) the check
should not cause any problems.
|
| |
|
|
|
|
|
| |
The profiles are a property of the codec, so it makes sense to export
them through AVCodecDescriptors, not just the codec implementations.
|
| |
|
|
|
|
| |
Fixes 2507b5dd674834be7261772996f47ae3b95cca69
|
|
|
|
|
|
|
| |
fixes assembling on OS/2
Signed-off-by: Dave Yeo <dave.r.yeo@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
|
|
|
|
|
| |
Make easier to avoid compile failure when reworking the internal
headers.
|
|
|
|
| |
Avoid the warning `-Wempty-body`.
|
|
|
|
|
|
|
| |
Fix fate tests with asan. Introduced during bytestream2 porting
(in revision 62cc8f4d79dad119e8efeaae080a58a8dcb1e89d).
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These variables are coming from mpegvideoenc where are supposedly used
as bit counters on various frame properties. However their use is
unclear as they lack documentation, are available only from a very small
subset of encoders, and they are hardly used in the wild. Also frame_bits
in aacenc is employed in a similar way.
Remove this functionality from AVCodecContex, these variable are mostly
frame properties, and too few encoders support setting them with anything
useful.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
Most option values are simply unused or ignored and in practice the
majory of codecs only need to check whether to enable rle or not.
Add appropriate codec private options which better expose the allowed
features.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
|
|
|
|
|
| |
We do not need to do a full setup like for a real frame, just allocate a
buffer and set cur_pic(_ptr).
|
| |
|
|
|
|
| |
EOVERFLOW seems to be unavailable on certain platforms.
|
|
|
|
| |
Also, stop using AVCodecContext for storing the stream parameters.
|
| |
|
| |
|
|
|
|
| |
Deprecate AVCodecContext.vbv_delay
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
This is similar to what is done for AVStream.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Fall back to maximum DPB size if the level is unknown.
This should be more spec-compliant and does not depend on the caller
setting has_b_frames before opening the decoder.
The old behaviour, when the delay is supplied by the caller setting
has_b_frames, can still be obtained by setting strict_std_compliance
below normal.
|
|
|
|
|
| |
That is a more appropriate place for it, since it is not allowed to
change between slices.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the spec, the reference list for a slice should be
constructed by first generating an initial (what we now call "default")
reference list and then optionally applying modifications to it.
Our code has an optimization where the initial reference list is
constructed for the first inter slice and then rebuilt for other slices
if needed. This, however, adds complexity to the code, requires an extra
2.5kB array in the codec context and there is no reason to think that it
has any positive effect on performance. Therefore, simplify the code by
generating the reference list from scratch for each slice.
|
| |
|
|
|
|
|
|
| |
Currently, the frame stride is passed in bytes, while the MC buffer size
is in int16_t elements, This can be confusing, so pass both strides in
bytes.
|
|
|
|
| |
This should allow for more efficient SIMD.
|
|
|
|
| |
This should allow for more efficient SIMD.
|
|
|
|
|
|
|
| |
This should allow for more efficient SIMD.
Keep the C versions as they are now, to allow the compiler to inline the
interpolation coefficients.
|
|
|
|
| |
Avoid the warning `-Wempty-body`.
|
|
|
|
|
|
| |
The end-marked was typoed in
f7edcac040f73635fc1127489c9bb29ca8b43532
|
| |
|
|
|
|
| |
Bug-Id: 761
|
|
|
|
| |
Unbreak make check.
|
|
|
|
|
|
| |
Additional improvements by Michael Niedermayer <michaelni@gmx.at>.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
|
|
|
|
| |
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
|