path: root/lib/msun/src/e_expf.c
Commit message | Author | Age | Files | Lines
* Fix some regressions caused by the switch from gcc to clang. The fixes are workarounds for various symptoms of the problem described in clang bugs 3929, 8100, 8241, 10409, and 12958.
  The regression tests did their job: they failed, someone brought it up on the mailing lists, and then the issue got ignored for 6 months. Oops. There may still be some regressions for functions we don't have test coverage for yet.
  Author: das | Age: 2013-05-27 | Files: 1 | Lines: -2/+3
* Revert r241756
  Author: imp | Age: 2012-10-22 | Files: 1 | Lines: -62/+0
* Document the method used to compute expf. Taken from exp, with changes to reflect differences in computation between the two.
  Author: imp | Age: 2012-10-19 | Files: 1 | Lines: -0/+62
* Use STRICT_ASSIGN() to ensure that the compiler doesn't screw things up by storing x in a wider type than it's supposed to.
  Submitted by: bde
  Author: das | Age: 2011-10-21 | Files: 1 | Lines: -2/+4
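The effect described in this commit can be illustrated with a small sketch. This is not the FreeBSD STRICT_ASSIGN() macro itself; the helper name `sub_rounded_to_float` is hypothetical, and the sketch assumes that a store to a `volatile float` object forces the value to be rounded to float precision even when the compiler evaluates float expressions in a wider format.

```c
/*
 * Hypothetical helper, not the FreeBSD macro: compute hi - lo and force the
 * result to float precision by storing it through a volatile float object.
 * Without the volatile store, a compiler that evaluates float expressions in
 * a wider format may keep the extra bits in a register.
 */
static float
sub_rounded_to_float(float hi, float lo)
{
    volatile float tmp;

    tmp = hi - lo;      /* the store rounds the difference to float */
    return (tmp);
}
```

For example, `sub_rounded_to_float(1.0f, 0x1p-30f)` should yield exactly `1.0f`, because 1 - 2**-30 is not representable in float and rounds to 1.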
* Fix a bogus threshold that was copied from the double precision version. This commit should have no effect on correctness; it merely changes the threshold at which a simpler approximation can be used.
  Reviewed by: bde
  Author: das | Age: 2011-02-10 | Files: 1 | Lines: -1/+1
* s/rcsid/__FBSDID/
  Author: das | Age: 2008-02-22 | Files: 1 | Lines: -3/+2
* Use a better method of scaling by 2**k. Instead of adding to the exponent bits of the reduced result, construct 2**k (hopefully in parallel with the construction of the reduced result) and multiply by it. This tends to be much faster if the construction of 2**k is actually in parallel, and might be faster even with no parallelism since adjustment of the exponent requires a read-modify-write at an unfortunate time for pipelines.
  In some cases involving exp2* on amd64 (A64), this change saves about 40 cycles or 30%. I think it is inherently only about 12 cycles faster in these cases and the rest of the speedup is from partly-accidentally avoiding compiler pessimizations (the construction of 2**k is now manually scheduled for good results, and -O2 doesn't always mess this up). In most cases on amd64 (A64) and i386 (A64) the speedup is about 20 cycles. The worst case that I found is expf on ia64 where this change is a pessimization of about 10 cycles or 5%. The manual scheduling for plain exp[f] is harder and not as tuned.
  This change to ld128/s_exp2l.c has not been tested.
  Author: bde | Age: 2008-02-07 | Files: 1 | Lines: -9/+8
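A minimal sketch of the "construct 2**k and multiply" idea follows. It is not the code from e_expf.c: the function names `two_to_the_k` and `scale_by_2k` are hypothetical, IEEE 754 single-precision floats are assumed, and k is restricted to the range where 2**k is a normal float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Build 2**k directly from its bit pattern: a zero sign bit, a biased
 * exponent of k + 127, and an all-zero fraction. Only valid when 2**k is a
 * normal float, that is for -126 <= k <= 127.
 */
static float
two_to_the_k(int k)
{
    uint32_t bits;
    float twopk;

    assert(k >= -126 && k <= 127);
    bits = (uint32_t)(k + 127) << 23;
    memcpy(&twopk, &bits, sizeof(twopk));
    return (twopk);
}

/*
 * Scale the reduced result y by 2**k with a multiplication instead of
 * patching the exponent bits of y in place.
 */
static float
scale_by_2k(float y, int k)
{
    return (y * two_to_the_k(k));
}
```

Because `two_to_the_k` depends only on k, it can be computed independently of y, which is the parallelism the commit message refers to; `scale_by_2k(1.5f, 3)` gives `12.0f`.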
* As for the float trig functions and logf, use a minimax polynomial that is specialized for float precision. The new polynomial has degree 5 instead of 11, and a maximum error of 2**-27.74 ulps instead of 2**-30.64. This doesn't affect the final error significantly; the maximum error was and is about 0.9101 ulps on amd64 -O1 and the number of cases with an error of > 0.5 ulps is actually reduced by epsilon despite the larger error in the polynomial.
  This is about 15% faster on amd64 (A64), i386 (A64) and ia64. The asm version is still used instead of this on i386 since it is faster and more accurate.
  Author: bde | Age: 2008-02-06 | Files: 1 | Lines: -6/+7
* Use volatile hacks to make sure these functions generate an underflow exception when they're supposed to. Previously, gcc -O2 was optimizing away the statement that generated it.
  Author: das | Age: 2008-01-18 | Files: 1 | Lines: -1/+2
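The kind of volatile hack this commit describes can be sketched as below. The names `tiny` and `underflow_to_zero` and the value 1.0e-30f are illustrative assumptions, not taken from e_expf.c. The point is only that reading the operand from a `volatile` object stops the compiler from folding the product to a constant zero at compile time, so the multiplication (and the exception it raises) happens at run time.

```c
/*
 * Illustrative constant, not the one used in e_expf.c. Because it is
 * volatile, each use is a real load and the compiler cannot precompute
 * tiny * tiny.
 */
static volatile float tiny = 1.0e-30f;

/*
 * Return a result that underflows to zero. The product 1e-60 is far below
 * the smallest subnormal float, so the run-time multiplication underflows.
 */
static float
underflow_to_zero(void)
{
    return (tiny * tiny);
}
```

On platforms that implement the IEEE exception flags, calling `fetestexcept(FE_UNDERFLOW)` from `<fenv.h>` after `underflow_to_zero()` should report the flag as set.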
* Fixed the hi+lo approximation to log(2). The normal 17+24 bit decomposition that was used doesn't work normally here, since we want to be able to multiply `hi' by the exponent of x _exactly_, and the exponent of x has more than 7 significant bits for most denormal x's, so the multiplication was not always exact despite a cloned comment claiming that it was. (The comment is correct in the double precision case -- with the normal 33+53 bit decomposition the exponent can have 20 significant bits and the extra bit for denormals is only the 11th.) Fixing this had little or no effect for denormals (I think because more precision is inherently lost for denormals than is lost by roundoff errors in the multiplication).
  The fix is to reduce the precision of the decomposition to 16+24 bits. Due to 2 bugs in the old decomposition and numerical accidents, reducing the precision actually increased the precision of hi+lo. The old hi+lo had about 39 bits instead of at least 41 like it should have had. There were off-by-1-bit errors in each of hi and lo, apparently due to mistranslation from the double precision hi and lo. The correct 16 bit hi happens to give about 19 bits of precision, so the correct hi+lo gives about 43 bits instead of at least 40.
  The end result is that expf() is now perfectly rounded (to nearest) except in 52561 cases instead of except in 67027 cases, and the maximum error is 0.5013 ulps instead of 0.5023 ulps.
  Author: bde | Age: 2005-11-30 | Files: 1 | Lines: -4/+4
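The exactness argument can be checked with a short sketch. It builds a 16+24 bit split of log(2) by truncating the float value of log(2) to 16 significant bits, then tests whether k * hi is exact for every exponent k in a given range. All names here (`LN2`, `truncate_significand`, `split_ln2`, `product_is_exact`) are hypothetical, IEEE 754 single precision is assumed, and the construction is an illustration rather than the procedure used to derive the constants in e_expf.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* log(2) rounded to double precision (literal supplied for this sketch). */
static const double LN2 = 0.6931471805599453;

/* Keep only the leading `bits` significant bits of a positive normal float. */
static float
truncate_significand(float x, int bits)
{
    uint32_t u;

    assert(bits >= 1 && bits <= 24);
    memcpy(&u, &x, sizeof(u));
    u &= ~((UINT32_C(1) << (24 - bits)) - 1);   /* clear the low bits */
    memcpy(&x, &u, sizeof(x));
    return (x);
}

/* Split log(2) into a 16-bit hi part and a float lo part. */
static void
split_ln2(float *hi, float *lo)
{
    *hi = truncate_significand((float)LN2, 16);
    *lo = (float)(LN2 - (double)*hi);
}

/*
 * Return 1 if the float product k * hi is exact for every |k| <= kmax.
 * The reference product is computed in double, where it is always exact
 * because both factors together need far fewer than 53 bits.
 */
static int
product_is_exact(float hi, int kmax)
{
    for (int k = -kmax; k <= kmax; k++) {
        volatile float p = (float)k * hi;

        if ((double)p != (double)k * (double)hi)
            return (0);
    }
    return (1);
}
```

With a 16-bit `hi`, an 8-bit exponent such as k = 150 still gives a product that fits in the 24-bit float significand, so `product_is_exact(hi, 150)` holds; using the full-precision `(float)LN2` as `hi` instead makes the check fail.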
* Revert rev 1.8, which causes small (e.g. 2 ulp) errors for some inputs. The trouble with replacing two floats with a double is that the latter has 6 extra bits of precision, which actually hurts accuracy in many cases. All of the constants are optimal when float arithmetic is used, and would need to be recomputed to do this right.
  Noticed by: bde (ucbtest)
  Author: das | Age: 2005-02-24 | Files: 1 | Lines: -8/+13
* Use double arithmetic instead of simulating it with two floats. This results in a performance gain on the order of 10% for amd64 (sledge), ia64 (pluto1), i386+SSE (Pentium 4), and sparc64 (panther), and a negligible improvement for i386 without SSE. (The i386 port still uses the hardware instruction, though.)
  Author: das | Age: 2005-02-21 | Files: 1 | Lines: -13/+8
* Assume __STDC__, remove non-__STDC__ code.
  Submitted by: keramida
  Author: alfred | Age: 2002-05-28 | Files: 1 | Lines: -10/+2
* $Id$ -> $FreeBSD$
  Author: peter | Age: 1999-08-28 | Files: 1 | Lines: -1/+1
* Revert $FreeBSD$ to $Id$
  Author: peter | Age: 1997-02-22 | Files: 1 | Lines: -1/+1
* Make the long-awaited change from $Id$ to $FreeBSD$
  This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
  Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
  Author: jkh | Age: 1997-01-14 | Files: 1 | Lines: -1/+1
* General -Wall warning cleanup, part I.
  Submitted-By: Kent Vander Velden <graphix@iastate.edu>
  Author: jkh | Age: 1996-07-12 | Files: 1 | Lines: -3/+3
* Remove trailing whitespace.
  Author: rgrimes | Age: 1995-05-30 | Files: 1 | Lines: -5/+5
* J.T. Conklin's latest version of the Sun math library.
  -- Begin comments from J.T. Conklin:
  The most significant improvement is the addition of "float" versions of the math functions that take float arguments, return floats, and do all operations in floating point. This doesn't help (performance) much on the i386, but they are still nice to have.
  The float versions were originally done by Cygnus' Ian Taylor when fdlibm was integrated into the libm we support for embedded systems. I gave Ian a copy of my libm as a starting point since I had already fixed a lot of bugs & problems in Sun's original code. After he was done, I cleaned it up a bit and integrated the changes back into my libm.
  -- End comments
  Reviewed by: jkh
  Submitted by: jtc
  Author: jkh | Age: 1994-08-19 | Files: 1 | Lines: -0/+103