| Commit message (Collapse) | Author | Age | Files | Lines |
| |
This supersedes the fix for the old algorithm in rev.1.8 of k_cosf.c.
I want this change mainly because it is an optimization. It helps
make software cos[f](x) and sin[f](x) faster than the i387 hardware
versions for small x. It is also a simplification, and reduces the
maximum relative error for cosf() and sinf() on machines like amd64
from about 0.87 ulps to about 0.80 ulps. It was validated for cosf()
and sinf() by exhaustive testing. Exhaustive testing is not possible
for cos() and sin(), but ucbtest reports a similar reduction for the
worst case found by non-exhaustive testing. ucbtest's non-exhaustive
testing seems to be good enough to find problems in algorithms but not
maximum relative errors when there are spikes. E.g., short runs of
it find only a 3 ulp error where the i387 hardware cos() has an error
of about 2**40 ulps near pi/2.
| |
{cos_sin}[f](x) so that x doesn't need to be reclassified in the
"kernel" functions to determine if it is tiny (it still needs to be
reclassified in the cosine case for other reasons that will go away).
This optimization is quite large for exponentially distributed x, since
x is tiny for almost half of the domain, but it is a pessimization for
uniformly distributed x since it takes a little time for all cases
but rarely applies. Arg reduction on exponentially distributed x
rarely gives a tiny x unless the reduction is null, so it is best to
only do the optimization if the initial x is tiny, which is what this
commit arranges. The immediate result is an average optimization of
1.4% relative to the previous version in a case that doesn't favour
the optimization (double cos(x) on all float x) and a large
pessimization for the relatively unimportant cases of lgamma[f][_r](x)
on tiny, negative, exponentially distributed x. The optimization should
be recovered for lgamma*() as part of fixing lgamma*()'s low-quality
arg reduction.
Fixed various wrong constants for the cutoff for "tiny". For cosine,
the cutoff is when x**2/2! == {FLT or DBL}_EPSILON/2. We round down
to an integral power of 2 (and for cos() reduce the power by another
1) because the exact cutoff doesn't matter and would take more work
to determine. For sine, the exact cutoff is larger due to the ratio
of terms being x**2/3! instead of x**2/2!, but we use the same cutoff
as for cosine. We now use a cutoff of 2**-27 for double precision and
2**-12 for single precision. 2**-27 was used in all cases but was
misspelled 2**27 in comments. Wrong and sloppy cutoffs just cause
missed optimizations (provided the rounding mode is to nearest --
other modes just aren't supported).
| |
- float ynf(int n, float x) /* wrapper ynf */
+float
+ynf(int n, float x) /* wrapper ynf */
This is because the __STDC__ stuff was indented.
Reviewed by: md5
| |
Reviewed by: md5
| |
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
|
-- Begin comments from J.T. Conklin:
The most significant improvement is the addition of "float" versions
of the math functions that take float arguments, return floats, and do
all operations in floating point. This doesn't help (performance)
much on the i386, but they are still nice to have.
The float versions were originally done by Cygnus' Ian Taylor when
fdlibm was integrated into the libm we support for embedded systems.
I gave Ian a copy of my libm as a starting point since I had already
fixed a lot of bugs & problems in Sun's original code. After he was
done, I cleaned it up a bit and integrated the changes back into my
libm.
-- End comments
Reviewed by: jkh
Submitted by: jtc
|