Derivatives of the Incomplete Beta Function

All three version of INBEDER yielded at least seven signi(cid:12)cant digits of accuracy for each test case except when the derivative is zero.


Description and Purpose
The incomplete beta function is defined as where Beta(p, q) is the beta function. Dutka (1981) gave a history of the development and numerical evaluation of this function. In this article, an algorithm for computing first and second derivatives of I x,p,q with respect to p and q is described. The algorithm is useful, for example, when fitting parameters to a censored beta, truncated beta, or a truncated beta-binomial model.
The current algorithm is not intended to be used for such extreme arguments. Accordingly, tail-to-head computing is unnecessary and forward recurrence relations will be used to compute approximants in a head-to-tail fashion. Using forward recurrence relations, the n th approximant can be computed as where K x,p,q is given in (1) and the terms A n and B n are defined recursively. For n ≥ 1, A n and B n are given by A n = a n A n−2 + b n A n−1 and B n = a n B n−2 + b n B n−1 , where A −1 = 1; A 0 = 1; B −1 = 0; B 0 = 1; Denote the first and second derivatives of the incomplete beta function with respect to p and/or q by I p x,p,q , I pp x,p,q , I q x,p,q , I qq x,p,q , and I pq x,p,q . The proposed algorithm computes approximations to these derivatives by differentiating the approximants in (3). This technique was successfully used by Moore (1982) to obtain derivatives of the incomplete gamma function. For example, the n th approximant to I p x,p,q is computed as follows: ∂A n ∂p = ∂a n ∂p A n−2 + a n ∂A n−2 ∂p + ∂b n ∂p A n−1 + b n ∂A n−1 ∂p ; and ∂B n ∂p = ∂a n ∂p B n−2 + a n ∂B n−2 ∂p + ∂b n ∂p B n−1 + b n ∂B n−1 ∂p .
Expressions for the derivatives of K x,p,q , a n , and b n were obtained using MAPLE V (Char et al., 1991) and are given in the appendix. To avoid overflows due to large p and/or q, the derivatives of a n , and b n were converted to partial fractions prior to numerical evaluation. For example, ∂b n /∂q can be evaluated as .
Also, when x > p/(p + q), the incomplete beta function and its derivatives were computed using the relation I x,p,q = 1 − I 1−x,q,p .

Structure
Fortran 77 SUBROUTINE INBEDER(X,P,Q,PSI,DER,NAPPX,ERRAPX,IFAULT) Input Parameters X Real input: the value x at which the incomplete beta function is to be evaluated P Real input: a shape parameter Q Real input: a shape parameter PSI Real array (7) input: vector containing log gamma functions and their derivatives PSI(1) = ln {Beta(p, q)} PSI(2) = ψ p , the digamma function: ψ p = d ln{Γ(p)}/d p PSI(3) = ψ p , the trigamma function: Real array (6)  Integer output: a fault indicator = 0 if completion is successful = 1 if X is less than 0 or greater than 1 = 2 if P is less than 0 = 3 if Q is less than 0 = 4 if derivatives were set to 0 because I x,p,q was evaluated to 0 or 1 = 5 if evaluation stopped after MAXAPPX terms.

Constants
Three constants, ERR, MAXAPPX, and MINAPPX, are set in a data statement. The constant ERR should be set to 10 −j , where j = round {b f log 10 (2)} − 4 and b f is the number of bits used to represent the fractional component of floating point numbers. For example, suppose that floating point numbers are represented using 64 bits -one bit for the sign, 11 bits for the exponent, and 52 bits for the fraction. Then ERR should be set to 10 −12 . If b f is not known, then b f can be computed by function N2DIGIT given in Boik (1993, pp. 571). The constant MAXAPPX controls the highest order approximant that will be evaluated. The constant MINAPPX controls the minimum order approximant that will be evaluated.

Auxiliary Algorithms
No auxiliary algorithms are called by subroutine INBEDER. The user, however, is required to input values for the log of the complete beta function and first and second derivatives of the log gamma function. The algorithms of Lanczos (1964), Bernardo (1976), and Schneider (1978) are included and may be used to compute the required values. Alternative algorithms for the log gamma function and its first derivative were constructed by Cody (1993). Cody's algorithms are more accuarte than the older algorithms. FORTRAN source code for these algorithms can be obtained from Netlib.

Use of Fortran Driver Program
As an illustration, a diary of a double precision session appears below.

Number of Bits for Fractional Component of Floating Point Numbers is 52
The constant ERR should be set to 0.1D-11 To exit program, type Control-C Input p, q, and x 1.5 11 .001 Highest order approximant evaluated = 3 Approximate maximum absolute error = 0.28102520E-15, Ifault = 0 Input p, q, and x C Matlab function [der,psi,nappx,errapx] = inbeder(x,p,q)

Input Parameters
x Input: vector of length k containing values to which beta function is to be integrated p, q Input: beta shape parameters, either vectors with same dimension as x or scalars der output: matrix of dimension k × 6 der(:,1) = I x,p,q der(:,2) = I p

Optional Arguments
Function inbeder can have one or two additional input arguments. Using [der,psi,nappx] = inbeder(x,p,q,minappx,maxappx) sets parameters MINAPPX and MAXAPPX which control the minimum and maximum order approximant that will be returned. Default values are 3 and 200. Using inbeder(x,p,q,minappx) changes MINAPPX only and using inbeder(x,p,q,[ ],maxappx) changes MAXAPPX only.

Constants
One constant, err, is set in function inbeder. The constant is set to err = eps × 10 4 , where eps is an internal Matlab constant.

Additional Functions
The following Matlab functions are called by function inbeder.
1. function derconf: computes derivatives of a n and b n with respect to p and/or q when n = 1.
2. function subd: computes derivatives of a n and b n with respect to p and/or q when n ≥ 2.
3. function distinct: finds distinct entries in a vector.
4. function digam: computes the digamma function for an input vector.
5. function trigam: computes the trigamma function for an input vector.
6. function vec: stacks the elements of a matrix.
7. function devec: arranges the elements of a vector in a matrix.

S-Plus
ibd <inc.beta.deriv(x,p,q) Input Parameters x vector of length k containing values to which beta function is to be integrated p, q beta shape parameters, either vectors with same dimension as x or scalars

Output Parameters
The output of function inc.beta.deriv, namely ibd, is a list of 15 elements. Each of the first 13 elements in the list is a vector having the same length as x.

Additional Optional Inputs
In addition to the required arguments above, the user may specify the following additional arguments. err approximation control variable, iteration stops when maximum relative change is less than err minappx minimum order of approximation (default is 3) maxappx maximum order of approximation (default is 200)

Additional Functions
The following S-Plus functions are called by function inc.beta.deriv.
1. function digamma: computes the digamma function for an input vector.
2. function trigamma: computes the trigamma function for an input vector.

Help Files
Help files are included for three of the defined functions. On unix systems these files should be stored in a .Data/.Help directory with filenames inc.beta.deriv, digamma, and trigamma.

Accuracy
Benchmarks for evaluating the accuracy of the algorithm were obtained by computing the derivatives I p x,p,q , I q x,p,q , etc by means of numerical integration. Derivatives for several test cases were computed to 32 significant digits of accuracy using Maple V (Char et al., 1991) numerical integration routines. The output was compared to output from an quadruple precision version (floating point numbers represented using 128 bits) of Fortran subroutine INBEDER. The Fortran and Maple results agreed to at least 23 significant digits, except when the exact derivative is zero. It can be shown that I p,q .5,p,q = 0 whenever p = q. The derivatives to 8 significant digits are displayed in Table 1.
To gage the accuracy of the Fortran (double precision), Matlab, and S-plus versions of the algorithm, an additional list of test values was used. The values were obtained by equating I x,p,q , in turn to 0.01, 0.10, and 0.50 and solving for x, for each p, q ∈ {0.10, 1.0, 10000.0}. The 48 x values were then rounded to 8 significant digits and the derivatives were computed using the quadruple precision Fortran version of INBEDER. All three version of INBEDER yielded at least seven significant digits of accuracy for each test case except when the derivative is zero.

Applications
Maximum Likelihood Estimation of Beta Parameters from Censored Data Gnanadesikan, Pinkham, and Hughes (1967) described an algorithm for maximum likelihood estimation of the parameters of the beta distribution from the smallest order statistics. Ignoring additive constants, the log likelihood function given the smallest k order statistics from a sample of size n is the following: Using subroutine INBEDER, the log likelihood function is readily maximized using a Newton-Raphson algorithm. To solve the likelihood equations, Gnanadesikan et al approximated first derivatives of the incomplete beta function using a seven term Laplacian expansion.
Gnanadesikan et al illustrated their algorithm using a sample of size n = 20 from a beta distribution having parameters p = 1.5 and q = 11.0. Their results along with results based on INBEDER are given in Table 2. It is apparent from Table 2 that the Laplacian expansion used by Gnanadesikan et al is not sufficiently accurate. Their estimates do not maximize the likelihood function and are especially inaccurate when the ratio k/n is small. Also, from Table 2 it appears that the maximum likelihood estimates are positively biased when k/n is small. The variances and covariances of the maximum likelihood estimators can be estimated by the inverse of the observed information matrix.

Maximum Likelihood Estimation in Truncated Beta and Truncated Beta-Binomial Distributions
The beta-binomial model is sometimes used to model binary responses that display greater variation than would be expected under a binomial model. Smith (1983) and Smith and Ridout (1995) described an algorithm for maximum likelihood estimation of the beta-binomial parameters.
The usual beta-binomial pmf is obtained as the limit of (5) as τ 1 → 0 and τ 2 → 1. The truncated beta-binomial model in this article (5) differs from the truncated beta-binomial model described by Tripathi et al. (1994). Their model is obtained by excluding Y = 0 from the conventional beta-binomial model. That is, Tripathi et al. truncate the support of the discrete random variable Y , whereas the model in (5) truncates the support of the continuous random variable Π. Maximum likelihood estimates of the parameters of the truncated beta distribution in (4) or of the truncated beta-binomial distribution in (5) can be computed using a Newton-Raphson algorithm. As an illustration, N = 30 random variates π 1 , . . . , π 30 , were generated from the truncated beta distribution having parameters τ 1 = 0.2, τ 2 = 0.7, p = 3, and q = 2. Conditional on Π = π i , a binomial variate was generated from the Binomial(n i , π i ) distribution, where the sample sizes n i for i = 1, . . . , 30 were 25, 50, or 75. The data appear in Table 3.
Assuming that the values π i for i = 1, . . . , N have been observed, the MLEs of p and q can be obtained by maximizing the log likelihood function L(p, q|π 1 , . . . , π N ; τ 1 , τ 2 ) = with respect to p and q. For the data in Table 3, the MLEs are p = 4.153 and q = 1.680. The variances and covariances of the maximum likelihood estimators can be estimated by the inverse of the observed information matrix. If truncation is ignored, the MLEs of the beta parameters are p = 10.052 and q = 8.328.
Assuming that the values y i for i = 1, . . . , N have been observed, but π i for i = 1, . . . , N are unobservable, the MLEs of p and q can be obtained by maximizing the log likelihood function L(p, q|n 1 , . . . , n N ; y 1 , . . . , y N ; τ 1 , τ 2 ) = with respect to p and q. For the data in Table 3, the MLEs are p = 6.451 and q = 4.248. For a sample size of n = 10, the maximum likelihood estimates of Pr(Y = y) for y = 0, . . . , 10 are given in Table 4. For comparison, estimates computed under the conventional beta-binomial model and the conventional binomial model also are reported. For the conventional beta-binomial model (ignoring truncation), the MLEs of the beta parameters are p = 11.749 and q = 10.075. For the conventional binomial model,π = 0.533. The Kullback-Leibler distances between the true distribution and the three estimates are 0.048 (truncated beta-binomial), 0.200 (beta-binomial), and 0.447 (binomial). It is apparent that precision is lost if truncation and/or extra variation are ignored.
Sample π i n i y i Sample π i n i y i Sample π i n i y i 1 0.  Table 4 Maximum likelihood estimates of Pr(Y = y) from data in Table 3. MLE: estimates were computed assuming a truncated beta-binomial distribution with τ 1 = 0.2 and τ 2 = 0.7; MLE bb : estimates were computed assuming a beta-binomial distribution; MLE bi : estimates were computed assuming a binomial distribution.