Evaluating the Normal Distribution

This article provides a little table-free C function that evaluates the normal distribution with absolute error less than 8 × 10−16. A small extension provides relative error near the limit available in double precision: 14 to 16 digits, the limits determined mainly by the computer’s ability to evaluate exp(-t) for large t. Results are compared with those provided by calls to erf or erfc functions, the best of which compare favorably, others do not, and all appear to be much more complicated than need be to get either absolute accuracy less than 10−15 or relative accuracy to the exp()-limited 14–16 digits. Also provided: A short history of the error function erf and its intended use, as well as, in the ‘browse files’ attachment, various erf or erfc versions used for comparison.


Introduction
Let φ(t) be the standard normal density function, φ(t) = e − 1 2 t 2 / √ 2π and let Φ(x), Phi(x) and cPhi(x) be the corresponding distribution and complementary distribution functions: I support the common, but unfortunately not universal, notation Φ(x) for the standard normal distribution when math fonts are available, or Phi(x) and cPhi(x) for Φ(x) and 1 − Φ(x) when referring to programming language functions.This little C function: double Phi(double x) {long double s=x,t=0,b=x,q=x*x,i=1; while(s!=t) s=(t=s)+(b*=q/(i+=2)); return .5+s*exp(-.5*q-.91893853320467274178L);}produces the following output, with the true value displayed below the value provided by Phi(x): x A study over the full range for Phi(x) shows a worst error of about 8 × 10 −16 , but that is absolute, not relative error.Concerning relative error, here is a little, but not-quite-as-little C function that uses nine tabled values to provide tail values of the normal distribution, i.e. cPhi(x) = 1 − Φ(x), to about the 14-16 digit limit available in double precision for a function that uses exp(): double cPhi(double x) {int i,j=.Values for the standard normal distribution are often obtained by means of library functions erf or erfc, with obligatory changes of scale and argument, such as .5+.5*erf(x/sqrt(2.)),or by using functions based on erf or erfc, with adjustments to the many coefficients in the polynomial or rational function approximations used in most erf or erfc implementations.
We will compare accuracies of the two short C functions above with those obtained from various erf and erfc implementations.Absolute error will be displayed as − log 10 |tru-approx|, that is, the negative of the exponent in expressing the absolute error as a power of 10.Relative error will be displayed as − log 10 |(tru-approx)/tru|.
Results concerning absolute or relative error can be misleading, or mistaken for "digits of accuracy".For example, the above Phi function returns Phi(7)=0.9999999999987196(format 19.16f), compared to .999999999998720187456..., the true value, and to many users, those ten 9's would be considered significant.For true values near 1, absolute and relative error are much the same.But in some applications, particularly for solutions of diffusion or heat equations, the region of interest may not be the statistician's 0 < x < 5, say, but rather 10 < x < 14, where greater accuracy may be provided by means of the complementary normal function cPhi(x) = ∞ x φ(t) dt.For that, use of 1-Phi(x) will not do as well, so many users may want to have an alternative function such as cPhi(x) that provides greater relative accuracy.
Our approach is to have two functions available: Phi(x) and cPhi(x).They happen to use the same method, differing only in that Phi(x) is simpler with no tables, a Taylor series about zero, while cPhi(x) evaluates a Taylor series about x = z + h for tabled values cPhi(z), z = 0, 2, 4, 6, . . ., 16.Some implementations that rely on a single function-for example, the function pnorm(x) mentioned below-will show a marked difference between 1-pnorm(x) and pnorm(-x), the latter providing much greater relative accuracy.See that discussion in Section 3 and its impact on interpretation of Figure 4. Precision for the above cPhi function seems limited by the accuracy of the exp() value in the final step.Section 3 shows that the little Phi and cPhi functions provide results at least as good as-and often better-than those obtained by conversion of erf or erfc from C compiler libraries or their elaborate listings from internet sources.
The graph of y = R(x) looks much like that of y = 2/(x+ x 2 + 8/π), so much so that they are difficult to separate in a plot of this size, where both are plotted for 0 ≤ x < 15: With merely the values of R(z) for z = 0, 2, 4, 6, 8, 10, 12, 14, 16, we can use the Taylor series to evaluate R(z + h), with |h| ≤ 1.This provides the basis for the C function cPhi(x) above.
It turns out that even the Taylor expansion for R(z + h) about z = 0 provides a simple and accurate way to evaluate cPhi(x) = 1 − Φ(x) with absolute error < 10 −15 , for −7 < x < 7, a range likely to cover most x's encountered in probability and statistics.I pointed this out in response to a newsgroup query seeking methods for evaluating the normal distribution, displaying a table much like the one above for Φ(.1), Φ(1.2), Φ(2.3), . . ., Φ(6.7), Φ(7.8), (see Newsgroup postings 2002Newsgroup postings , 2004)).Several suggestions led to simplifications, particularly by Bill Daly, who pointed out that the Taylor series about zero provides even-order terms whose sum is 1/φ(0), and odd-order terms with a simple recursion.
Rather than use simplifications of the Taylor expansion about zero for R(x), we might start with a different function, say B(x) = x 0 φ(t) dt/φ(x), so that B(x)φ(x) = x 0 φ(t) dt.As with R(x) above, the recursion readily follows, and in particular for the Taylor series about zero: It is surprising that summing this series until the new term seems to vanish, as in our little C function: double Phi(double x) {long double s=x,t=0,b=x,q=x*x,i=1; while(s!=t) s=(t=s)+(b*=q/(i+=2)); return .5+s*exp(-.5*q-.91893853320467274178L);} provides greatest absolute error of about 8 × 10 −16 , making it a method of choice for many applications in probability and statistics.The compact form resulted from another suggestion by Bill Daly in the newsgroups postings .
For this tiny Phi function, Figure 5 shows -log 10 (|tru − approx|), the 'digits of accuracy' for absolute error, along with those for two popular but far more elaborate methods for evaluating the normal probability distribution, (descriptions below).
For applications requiring relative accuracy to the limit available in double precision, cPhi(x) above may be used.For either requirement: absolute-or relative-accuracy, Phi or cPhi provide values for the normal distribution function by simple and easily understood methods, avoiding the need for special function libraries with ponderous listings for erf or erfc, and the necessary argument change from x to x/sqrt(2).
The cPhi function expects, and returns, an ordinary double, but internal calculations may be made more accurate by use of C's long double, which for many implementations invokes an 80-bit floating point processor.One rarely needs values of the normal CDF Φ(x) or its complement cPhi(x) accurate to as many as 15 digits, or for values x beyond 6 or so, but an easily implemented Taylor series for R(x) = cPhi(x)/φ(x) makes such accuracy available with only a few stored values of R(x) and few lines of code.
Since cPhi(x) comes from the product of R(x) and the normal density φ(x), one would expect that the error could be no better for cPhi(x) than for φ(x).As Figure 2 shows, the above C code evaluates R(x) more accurately than it can evaluate exp(-.5*x*x-.91893853320467274178L)(the constant is 1 2 ln(2π) ), and thus the digits-of-accuracy measure for the product R(x)φ(x) is pretty much that of the exp() part.Occasionally, there is constructive cancellation and the accuracy of cPhi(x) is better than that for evaluating the normal density.There, 'digits of accuracy' means − log 10 (|tru − approx|/tru).Both cPhi(x) and φ(x) average 14.7 digits of accuracy; both lose around two digits of accuracy as x increases from 0 to 16.Sometimes cPhi is a little better, sometimes not.Since cPhi(x) comes from R(z + h) with z = 0, 2, 4, . . ..16, we can expect greater errors near 1,3,5,...15, when h is near ±1.This only shows up near 9,11,13 and 15 in Figure 2; otherwise, errors in exp() seem to dominate.

The error function erf
In 1871, J.W. Glaisher published an article on definite integrals in which he comments that while there is scarcely a function that cannot be put in the form of a definite integral, for the evaluation of those that cannot be put in the form of a tolerable series we are limited to combinations of algebraic, circular, logarithmic and exponential-the elementary or primary functions.See Glaisher (1871).He writes: "The chief point of importance, therefore, is the choice of the elementary functions; and this is a work of some difficulty.One function however, viz. the integral ∞ x e −x 2 dx, well known for its use in physics, is so obviously suitable for the purpose, that, with the exception of receiving a name and a fixed notation, it may almost be said to have already become primary. . . .As it is necessary that the function should have a name, and as I do not know that any has been suggested, I propose to call it the Error-function, on account of its earliest and still most important use being in connexion with the theory of Probability, and notably with the theory of Errors, and to write It is unfortunate that changes from Glaisher's original Erf : the switch of limits, names and the standardizing factor, did not apply to what Glaisher acknowledged was its most important application: the normal distribution function, and thus 1 √ 2π e − 1 2 t 2 dt did not become the basic integral form.So those of us interested in its most important application are stuck with conversions: A search of the internet will show many applications of what we now call erf or erfc to problems of the type that seemed of more interest to Glaisher and his famous colleagues: integral solutions of differential equations.These include the telegrapher's equation, studied by Lord Kelvin in connection with the Atlantic cable, and Kelvin's estimate of the age of the earth (25 million years), based on the solution of a heat equation for a molten sphere (it was far off because of then unknown contributions from radioactive decay).More recent internet mentions of the use of erf or erfc for solving differential equations include short-circuit power dissipation in electrical engineering, current as a function of time in a switching diode, thermal spreading of impedance in electrical components, diffusion of a unidirectional magnetic field, recovery times of junction diodes and the Mars Orbiter Laser Altimeter.

Implementations and variations of erf and erfc
Although Glaisher advocated erf as a means to represent integrals that could not be put in the form of a tolerable series or combinations of algebraic, circular, logarithmic or exponential functions, Section 2 shows that x 0 φ(t) dt, ∞ x φ(t) dt and hence erf, can be represented by means of a quite tolerable Taylor series-times an exponential function.This invites comparison with-and possibly replacement for-some of the many available erf implementations, or variations of them tailored for evaluating the normal CDF Φ.Some of Newsgroup postings (2002) or Newsgroup postings ( 2004) point to Fortran or C versions of erf based on numerical approximations from Cody (1993), Hart et al. (1968) and Hill (1973).They are labelled 'Cody', 'Hart' and 'AS66' below.An erfc developed at Sun Microsystems (1993) ('Sun' in figures below) was also cited.The Sun erfc is available under GNU on linux systems.All of these are rather elaborate programs, often handling the range in several pieces.
The best C version of erfc that I have found is derfc.cby Ooura (1998).It uses a rational function times e −x 2 , with explicit coefficients (no tables) for the ratio of polynomials of degree 12 and 13.It too provides 14-16 digit accuracy; a plot of its 'digits of accuracy' vs. x in 0 < x < 16 looks very much like that for cPhi(x) and for the Sun erfc, (Figure 3).
Several implementations seem to be based on the polynomial approximations Hart et al. (1968).Compiled and printed results show that, like the pnorm function based on Cody, relative accuracy is a nice 12-15 digits for early positive x, but trail off to 1 or so around x=8, (Figure 4), for values derived as 1 − Φ(x) rather than Φ(−x) Some of the most widely used methods for erf or erfc are based on Algorithm AS66 by Hill (1973).A plot of the relative error for a double precision C version shows digits of accuracy around 11 at x=0, dropping to 8 at x=12.
Figure 3 shows relative errors for cPhi, derfc and AS66.That last one pales in comparison.
If interpreted as Φ(−x), the Cody version has that same limiting relative accuracy exhibited by cPhi, derfc and Sun.
Relative errors using 1 − Φ(x) are shown in Figure 4, for versions based on Hart, Cody and a Fortran version of AS66, which returns zero for x > 12.6.Note that the legend there uses cHart and cCody to indicate that 1-Hart and 1-Cody were used.They have the property ascribed above: Φ(−x) yields better relative accuracy than 1 − Φ(x), where the particular function used to evaluate Φ is based on Hart or Cody.Using Hart(−x) would not show the rapid decline to zero beyond x = 5, but would taper off to about 9 digits at x = 12.Similarly, if the Cody implementation is used to get Φ(−x), rather than 1 − Φ(x) as shown, the relative error plot compares favorably with the excellent results for cPhi and derfc in Figure 3, or the above table for cPhi, Sun and Ooura's derfc.There was no difference between 1 − Φ(x) and Φ(−x) for the AS66 implementation of the normal distribution function.Both are poor.
Plots comparing absolute and relative errors for Phi, cPhi and other sources: The plots in Figures 2,3,4,5 were patched from .psversions of earlier, no longer available sources.Details of the plots as they appear here may be easily magnified by most pdf viewers, such as GSview or Adobe reader.(In GSview, just right-click the mouse.)Some tentative conclusions can be drawn from the above discussion and plots: For applications in probability and statistics, where absolute accuracy may be more important, for example when determining that Φ(5.3) is .999999989282410compared to the true value of .999999989282409741..., the absolute accuracy provided by the tiny Phi(x) provides an attractive alternative to sources based on erf, comparable or better than all but Cody, which averages about 1/3 more digit of absolute accuracy but requires an elaborate, mysterious program to squeeze out that extra 1/3 of a digit.
For applications requiring relative accuracy, as in the extreme tails, there seems no better method than the short cPhi function listed above.It is better than many, and provides the limiting accuracy of the some of the best: Ooura, Sun or (properly applied), Cody.All of these rely on a multiple of exp(−cx 2 ) and suffer from the same limit on accuracy as cPhi: evaluation of exp(-t) for large t.

Summary
If you want values Φ(x) of the standard normal distribution for use in probability and statistics, you are likely to find no better method than this little C function: double Phi(double x) {long double s=x,t=0,b=x,q=x*x,i=1; while(s!=t) s=(t=s)+(b*=q/(i+=2)); return .5+s*exp(-.5*q-.91893853320467274178L);}which will provide probabilities with an absolute error less than 8 × 10 −16 .(You may want to add a line that will return 0 for x < −8 or 1 for x > 8, since an error of 10 −16 can make a true result near 0 negative, or near 1, exceed 1.) If you want relative errors (digits of precision) near the limit available in double precision for extreme tails of the distribution, such as for solving differential equations, you might evaluate 1−Φ(x) = cPhi(x) as .5*erfc(x/sqrt(2.)) by invoking the complementary error function erfc available from some compilers, or try one of many available on the web, with all the attendant mysteries over content and the nuisance of the two changes of scale.You will find that most of them have long listings, with constants involved in various polynomial, rational function or continued fraction approximations and that some fall short of the 14-16 digit accuracy limit on exp(−t) for large t. to the 14-16 digits available from the exp() function, as well as have an understanding of the mathematics behind the method.