Added comments about the apparently-magic rational function used in

the second step of approximating cbrt(x). It turns out to be neither very magic not nor very good. It is just the (2,2) Pade approximation to 1/cbrt(r) at r = 1, arranged in a strange way to use fewer operations at a cost of replacing 4 multiplications by 1 division, which is an especially bad tradeoff on machines where some of the multiplications can be done in parallel. A Remez rational approximation would give at least 2 more bits of accuracy, but the (2,2) Pade approximation already gives 6 more bits than needed. (Changed the comment which essentially says that it gives 3 more bits.) Lower order Pade approximations are not quite accurate enough for double precision but are plenty for float precision. A lower order Remez rational approximation might be enough for double precision too. However, rational approximations inherently require an extra division, and polynomial approximations work well for 1/cbrt(r) at r = 1, so I plan to switch to using the latter. There are some technical complications that tend to cost a division in another way.
author: bde <bde@FreeBSD.org> 2005-12-15 16:23:22 +0000
committer: bde <bde@FreeBSD.org> 2005-12-15 16:23:22 +0000
commit: 3abe21faf25d6ec4ac2be98d89714858d87f7203 (patch)
tree: 119a3be481e5e7be1179141205fe4bdefe22cea4 /lib/msun
parent: deb97ff8b5e4bc0566a5bba69abc633e59f7460e (diff)
download: FreeBSD-src-3abe21faf25d6ec4ac2be98d89714858d87f7203.zip
FreeBSD-src-3abe21faf25d6ec4ac2be98d89714858d87f7203.tar.gz
1 files changed, 15 insertions, 1 deletions
diff --git a/lib/msun/src/s_cbrt.c b/lib/msun/src/s_cbrt.c
index 4e7eaa8..6d90078 100644
--- a/lib/msun/src/s_cbrt.c
+++ b/lib/msun/src/s_cbrt.c
@@ -72,7 +72,21 @@ cbrt(double x)
 	} else
 	    SET_HIGH_WORD(t,sign|(hx/3+B1));
 
-    /* new cbrt to 23 bits; may be implemented in single precision */
+    /*
+     * New cbrt to 26 bits; may be implemented in single precision:
+     *    cbrt(x) = t*cbrt(x/t**3) ~= t*R(x/t**3)
+     * where R(r) = (14*r**2 + 35*r + 5)/(5*r**2 + 35*r + 14) is the
+     * (2,2) Pade approximation to cbrt(r) at r = 1.  We replace
+     * r = x/t**3 by 1/r = t**3/x since the latter can be evaluated
+     * more efficiently, and rearrange the expression for R(r) to use
+     * 4 additions and 2 divisions instead of the 4 additions, 4
+     * multiplications and 1 division that would be required using
+     * Horner's rule on the numerator and denominator.  t being good
+     * to 32 bits means that |t/cbrt(x)-1| < 1/32, so |x/t**3-1| < 0.1
+     * and for R(r) we can use any approximation to cbrt(r) that is good
+     * to 20 bits on [0.9, 1.1].  The (2,2) Pade approximation is not an
+     * especially good choice.
+     */
 	r=t*t/x;
 	s=C+r*t;
 	t*=G+F/(s+E+D/s);
author	bde <bde@FreeBSD.org>	2005-12-15 16:23:22 +0000
committer	bde <bde@FreeBSD.org>	2005-12-15 16:23:22 +0000
commit	3abe21faf25d6ec4ac2be98d89714858d87f7203 (patch)
tree	119a3be481e5e7be1179141205fe4bdefe22cea4 /lib/msun
parent	deb97ff8b5e4bc0566a5bba69abc633e59f7460e (diff)
download	FreeBSD-src-3abe21faf25d6ec4ac2be98d89714858d87f7203.zip FreeBSD-src-3abe21faf25d6ec4ac2be98d89714858d87f7203.tar.gz