path: root/lib/msun/src/s_expm1f.c
Commit message | Author | Age | Files | Lines
* As usual, use a minimax polynomial that is specialized for float | bde | 2008-02-09 | 1 | -7/+8
  precision. The new polynomial has degree 4 instead of 10, and a maximum error of 2**-30.04 ulps instead of 2**-33.15. This doesn't affect the final error significantly; the maximum error was and is about 0.5015 ulps on i386 -O1, and the number of cases with an error of > 0.5 ulps is increased from 13851 to 14407.

  Note that the error is only this close to 0.5 ulps due to excessive extra precision caused by compiler bugs on i386. The extra precision could be obtained intentionally, and is useful for keeping the error of the hyperbolic float functions below 1 ulp, since these functions are implemented using expm1f. My recent change for scaling by 2**k had the unintentional side effect of retaining extra precision for longer, so callers of expm1f see errors of more like 0.0015 ulps than 0.5015 ulps, and for the hyperbolic functions this reduces the maximum error from nearly 2 ulps to about 0.75 ulps.

  This is about 10% faster on i386 (A64). expm1* is still very slow, but now the float version is actually significantly faster. The algorithm is very sophisticated but not very good except on machines with fast division.
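The ulp figures quoted above can be made concrete with a small measurement helper. This is only a sketch, not the test program the author used: `ulp_error` and `count_over_half_ulp` are hypothetical names, the reference values are assumed to be double-precision results (themselves rounded, so readings very close to 0.5 ulps are approximate), and subnormal floats are ignored.

```c
#include <math.h>
#include <stddef.h>

/*
 * Error of a float result, measured in float ulps at the reference value.
 * Assumes `want` lies in the normal float range (subnormals are ignored).
 */
static double
ulp_error(float got, double want)
{
	int e;
	double ulp;

	frexp(want, &e);		/* want = m * 2**e with 0.5 <= |m| < 1 */
	ulp = ldexp(1.0, e - 24);	/* float spacing in [2**(e-1), 2**e) */
	return (fabs((double)got - want) / ulp);
}

/* Count the samples whose error exceeds 0.5 ulps. */
static size_t
count_over_half_ulp(const float *got, const double *want, size_t n)
{
	size_t i, over = 0;

	for (i = 0; i < n; i++)
		if (ulp_error(got[i], want[i]) > 0.5)
			over++;
	return (over);
}
```

Filling `got` with `expm1f(x)` and `want` with `expm1((double)x)` over a set of arguments would give a count comparable in spirit to the 13851 and 14407 figures, although the exact numbers depend on the arguments tested, the compiler, and the optimization level.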
* Use a better method of scaling by 2**k. Instead of adding to the | bde | 2008-02-07 | 1 | -13/+8
  exponent bits of the reduced result, construct 2**k (hopefully in parallel with the construction of the reduced result) and multiply by it. This tends to be much faster if the construction of 2**k is actually in parallel, and might be faster even with no parallelism since adjustment of the exponent requires a read-modify-write at an unfortunate time for pipelines.

  In some cases involving exp2* on amd64 (A64), this change saves about 40 cycles or 30%. I think it is inherently only about 12 cycles faster in these cases and the rest of the speedup is from partly-accidentally avoiding compiler pessimizations (the construction of 2**k is now manually scheduled for good results, and -O2 doesn't always mess this up). In most cases on amd64 (A64) and i386 (A64) the speedup is about 20 cycles. The worst case that I found is expf on ia64 where this change is a pessimization of about 10 cycles or 5%. The manual scheduling for plain exp[f] is harder and not as tuned.

  Details specific to expm1*:
  - the saving is closer to 12 cycles than to 40 for expm1* on i386 (A64). For some reason it is much larger for negative args.
  - also convert to __FBSDID().
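The two scaling strategies described in this commit can be sketched in portable C as follows. This is an illustration under stated assumptions, not the code from s_expm1f.c: the function names are hypothetical, `memcpy` stands in for libm's internal word-access macros, and both versions assume IEEE 754 single precision with a normal result and -126 <= k <= 127.

```c
#include <stdint.h>
#include <string.h>

/*
 * Old style: add k to the exponent field of the reduced result y.
 * The bits of y must be read, modified and written back, so the
 * adjustment cannot start until y itself is available.
 */
static float
scale_by_exponent_add(float y, int k)
{
	uint32_t bits;

	memcpy(&bits, &y, sizeof(bits));
	bits += (uint32_t)k << 23;	/* exponent field starts at bit 23 */
	memcpy(&y, &bits, sizeof(y));
	return (y);
}

/*
 * New style: build 2**k from k alone, then multiply.  The construction
 * of twopk does not depend on y, so it can overlap with computing y.
 */
static float
scale_by_multiply(float y, int k)
{
	uint32_t bits = (uint32_t)(0x7f + k) << 23;	/* biased exponent */
	float twopk;

	memcpy(&twopk, &bits, sizeof(twopk));
	return (y * twopk);
}
```

For example, both `scale_by_exponent_add(1.5f, 4)` and `scale_by_multiply(1.5f, 4)` yield 24.0f. Whether the multiply form is actually faster depends on the machine and on how the compiler schedules it, as the cycle counts above indicate.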
* Fix formatting, this is hard to explain, so I'll show one example. | alfred | 2002-05-28 | 1 | -1/+2
    -	float ynf(int n, float x) /* wrapper ynf */
    +float
    +ynf(int n, float x) /* wrapper ynf */

  This is because the __STDC__ stuff was indented.

  Reviewed by: md5
* Assume __STDC__, remove non-__STDC__ code. | alfred | 2002-05-28 | 1 | -9/+0
  Reviewed by: md5
* $Id$ -> $FreeBSD$ | peter | 1999-08-28 | 1 | -1/+1
* Revert $FreeBSD$ to $Id$ | peter | 1997-02-22 | 1 | -1/+1
* Make the long-awaited change from $Id$ to $FreeBSD$ | jkh | 1997-01-14 | 1 | -1/+1
  This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.

  Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
* Remove trailing whitespace. | rgrimes | 1995-05-30 | 1 | -6/+6
* J.T. Conklin's latest version of the Sun math library. | jkh | 1994-08-19 | 1 | -0/+133
  -- Begin comments from J.T. Conklin:

  The most significant improvement is the addition of "float" versions of the math functions that take float arguments, return floats, and do all operations in floating point. This doesn't help (performance) much on the i386, but they are still nice to have.

  The float versions were originally done by Cygnus' Ian Taylor when fdlibm was integrated into the libm we support for embedded systems. I gave Ian a copy of my libm as a starting point since I had already fixed a lot of bugs & problems in Sun's original code. After he was done, I cleaned it up a bit and integrated the changes back into my libm.

  -- End comments

  Reviewed by: jkh
  Submitted by: jtc