A Software Tool for the Exponential Power Distribution: The normalp Package

In this paper we present the normalp package, a package for the statistical environment R providing a set of tools for dealing with the exponential power distribution. The package contains functions to compute the density function, the distribution function and the quantiles of an exponential power distribution, and to generate pseudo-random numbers from the same distribution. Moreover, methods for estimating the distribution parameters are described and implemented. It is also possible to estimate linear regression models when the random errors are assumed to follow an exponential power distribution. A set of functions is designed to perform simulation studies to assess the suitability of the estimators used. Some examples of use of this package are provided.


Introduction
The Exponential Power Distribution (EPD) can be considered a general distribution for random errors. In statistical inference the usual hypothesis on random errors is that they follow a normal distribution, but often this hypothesis is not suitable for the study of natural phenomena. Usually, two alternative approaches are used in the literature. The first is related to outlier theory and deals with robust methods, which are often disputable. The second approach looks for suitable distributional models more general than the Gaussian one; in this paper we consider the latter approach.
The first formulation of the Exponential Power Distribution can be ascribed to Subbotin (1923). Subbotin started from the following two axioms:
1. the probability of a random error ε depends only on the absolute value of the error itself and can be expressed by a function ϕ(ε) having continuous first derivative almost everywhere;
2. the most likely value of a quantity, of which direct measurements are known, must not depend on the unit of measure used;
and obtained a probability distribution with density function

f(ε) = (m h) / (2 Γ(1/m)) exp(−h^m |ε|^m),

with −∞ < ε < +∞, h > 0 and m ≥ 1; this distribution is also known as the General Error Distribution (GED).
The second axiom considered by Subbotin is the same as the second axiom used by Gauss to derive the usual normal error distribution; the first axiom used by Subbotin is more general: indeed, Gauss assumes that the arithmetic mean represents the best way to combine observations, instead of the condition on the first derivative of the function ϕ(ε) used by Subbotin.
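As a quick numerical check of this formulation, Subbotin's density can be coded directly from the formula above and integrated; this is an illustrative sketch (the function name dsubbotin is ours, not part of the normalp package):

```r
# Subbotin's density, coded directly from the formula above
# (illustrative sketch; dsubbotin is not a normalp function).
dsubbotin <- function(eps, h = 1, m = 2) {
  (m * h) / (2 * gamma(1 / m)) * exp(-(h * abs(eps))^m)
}

# The density integrates to one for any admissible h and m:
integrate(dsubbotin, -Inf, Inf, h = 0.7, m = 1.5)$value  # approximately 1
```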

The exponential power distribution
Following the procedure introduced by Pearson (1895), Lunetta (1963) derived a different parametrization of the exponential power distribution by solving the differential equation

d log f(x) / dx = p (log f(x) − log a) / (x − c),

which gives the probability density function

f(x) = 1 / (2 σ_p p^(1/p) Γ(1 + 1/p)) exp(−|x − µ|^p / (p σ_p^p)),    (1)

with −∞ < x < +∞, −∞ < µ < +∞, σ_p > 0 and p > 0. This distribution, also called by Vianelli (1963) the normal distribution of order p (which explains the name of the package described in the following paragraphs), is characterized by three parameters: µ the location parameter, σ_p the scale parameter and p the shape parameter. The last parameter determines the shape of the curve; in this way, it is linked to the thickness of the tails, and thus to the kurtosis, of the distribution. In fact, by changing the shape parameter p, the exponential power distribution describes both leptokurtic (0 < p < 2) and platykurtic (p > 2) distributions; in particular, we obtain the Laplace distribution for p = 1, the normal distribution for p = 2 and the uniform distribution for p → ∞. A recent review of this distribution family has been produced by Chiodi (2000). In previous papers, Chiodi (1986, 1995) presented some procedures for the generation of pseudo-random numbers from a standardized exponential power distribution, i.e. an EPD with µ = 0 and σ_p = 1. These procedures, based on a generalization of the well-known Box and Muller (1958) method for generating normal pseudo-random numbers, or on acceptance-rejection rules, are, of course, very useful in simulation studies concerning the behavior of estimators under departures from normality. Other procedures for generating pseudo-random numbers from this distribution have been presented by Johnson (1979) and Tadikamalla (1980), among others.
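The density (1) can be written down directly; the following sketch (ours, not the package implementation) also checks the two special cases p = 2 and p = 1:

```r
# Density of the normal distribution of order p, written from formula (1)
# (illustrative sketch; the package function dnormp() implements the same).
dep <- function(x, mu = 0, sigmap = 1, p = 2) {
  k <- 2 * sigmap * p^(1 / p) * gamma(1 + 1 / p)   # normalizing constant
  exp(-abs(x - mu)^p / (p * sigmap^p)) / k
}

all.equal(dep(0.3, p = 2), dnorm(0.3))      # TRUE: the normal for p = 2
all.equal(dep(0.3, p = 1), exp(-0.3) / 2)   # TRUE: the Laplace for p = 1
```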
Finally, we note that, although with a different parametrization, the exponential power distribution is also used in Bayesian inference, when a suitable prior distribution has to be specified (see, among others, Box and Tiao 1992, Choy and Smith 1997, and Achcar and De Araujo Pereira 1999). A bivariate exponential power distribution has been introduced by De Simone (1968) and Taguchi (1978), while a multivariate formulation of this distribution can be found in Fang, Kotz, and Ng (1989) and Krzanowski and Marriott (1994).

Estimation of the exponential power distribution parameters
By assuming that the shape parameter p is known, the location and scale parameters can easily be obtained by using the maximum likelihood estimation method. The estimation of the shape parameter p is, so far, an open problem, although several procedures have been proposed, some of which are very interesting. In the literature, the main proposals are based on the maximum likelihood estimation method or on the computation of kurtosis indices. The derivation of the maximum likelihood estimators does not, formally, raise many problems; moreover, the maximum likelihood estimators have suitable properties, at least asymptotically. However, in the case of the estimation of the shape parameter p, we can meet computational difficulties when using numerical methods to solve the resulting equation. A new approach to estimating the location and scale parameters of this distribution, based on a genetic algorithm, has recently been proposed by Vitrano and Baragona (2004), but it does not present substantial improvements in comparison with other numerical methods based on the direct evaluation of the likelihood function.
The problem of estimating the linear regression parameters when we have random errors distributed according to an exponential power distribution has been treated by Zeckhauser and Thompson (1970) and A.M. Mineo (1995).

Estimation of the location and scale parameters
Let us assume we have a sample of n i.i.d. observations drawn from (1); then the likelihood function is

L(µ, σ_p, p) = [2 σ_p p^(1/p) Γ(1 + 1/p)]^(−n) exp(−Σ_{i=1}^n |x_i − µ|^p / (p σ_p^p))    (2)

and the log-likelihood function is

log L(µ, σ_p, p) = −n log[2 σ_p p^(1/p) Γ(1 + 1/p)] − Σ_{i=1}^n |x_i − µ|^p / (p σ_p^p).    (3)

By differentiating the log-likelihood function with respect to µ and σ_p and equating the resulting expressions to zero, we obtain the equations

Σ_{i=1}^n |x_i − µ|^(p−1) sign(x_i − µ) = 0,    (4)

−n/σ_p + Σ_{i=1}^n |x_i − µ|^p / σ_p^(p+1) = 0.    (5)

Equation (4) does not have, in general, an explicit solution and is solved by means of numerical methods, while from (5) we get the maximum likelihood estimator of σ_p:

σ̂_p = (Σ_{i=1}^n |x_i − µ̂|^p / n)^(1/p).    (6)

Vianelli (1963) called σ_p the power deviation of order p; it can be seen as a variability index that generalizes the standard deviation.
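A minimal sketch of this two-step scheme (our code, with p assumed known, not the package's paramp()): equation (4) is solved numerically by minimizing Σ|x_i − µ|^p, and the scale estimate then follows in closed form:

```r
# Maximum likelihood estimation of mu and sigma_p with p known
# (illustrative sketch, not the package's paramp() implementation).
mle_mu_sigmap <- function(x, p) {
  mu <- optimize(function(m) sum(abs(x - m)^p), interval = range(x))$minimum
  sigmap <- (sum(abs(x - mu)^p) / length(x))^(1 / p)  # closed-form estimator
  c(mu = mu, sigmap = sigmap)
}

set.seed(1)
mle_mu_sigmap(rnorm(200), p = 2)  # for p = 2: the sample mean and the
                                  # (biased) maximum likelihood std. deviation
```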
Chiodi (1988b) shows that the maximum likelihood estimator of µ is unbiased, with asymptotic variance given by

var(µ̂) = σ_p^2 Γ(1/p) / [n p^(2−2/p) Γ(2 − 1/p)],    (7)

while Lunetta (1966), when µ is known, obtains the sampling distribution of σ̂_p^p:

f(σ̂_p^p) = λ^c / Γ(c) (σ̂_p^p)^(c−1) exp(−λ σ̂_p^p),    (8)

i.e. a Gamma distribution with λ = n/(p σ_p^p), c = n/p, E(σ̂_p^p) = σ_p^p and var(σ̂_p^p) = (p σ_p^(2p))/n. If µ is unknown, these results hold only asymptotically, and σ̂_p^p is biased for small samples. An unbiased estimator, when p is known and µ̂ is the maximum likelihood estimator of µ, is given by Chiodi:

σ̃_p^p = Σ_{i=1}^n |x_i − µ̂|^p / (n − p/2),    (9)

who also provides its asymptotic sampling distribution, i.e. the Gamma distribution (8) but with c = n/p − 1/2. Some procedures for testing hypotheses of homoskedasticity when samples are drawn from an exponential power distribution have been presented by Chiodi (1988a) as well. Finally, Chiodi (1994) uses saddle point approximations to obtain asymptotic sampling distributions of the maximum likelihood estimators of the exponential power distribution parameters. The sampling distributions obtained in this way work well even for small samples, for example n = 5.
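A small numerical illustration (a sketch under our reading of Chiodi's correction, namely dividing Σ|x_i − µ̂|^p by n − p/2, consistent with the shape parameter c = n/p − 1/2 above): for p = 2 the divisor becomes the familiar n − 1 of the unbiased variance estimator.

```r
# For p = 2, dividing by n - p/2 = n - 1 recovers the usual unbiased
# variance estimator (sketch; the divisor n - p/2 is our reading of
# Chiodi's correction, consistent with c = n/p - 1/2).
x <- c(1.2, 0.7, -0.3, 2.1, 0.5)
n <- length(x); p <- 2
mu_hat <- mean(x)                   # maximum likelihood estimate of mu for p = 2
sigma_tilde_p <- sum(abs(x - mu_hat)^p) / (n - p / 2)
all.equal(sigma_tilde_p, var(x))    # TRUE
```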
Moreover, we can compute the Fisher information matrix for µ and σ_p (Agrò 1995):

I(µ, σ_p) = n diag( p^(2−2/p) Γ(2 − 1/p) / (σ_p^2 Γ(1/p)),  p / σ_p^2 ),

with zero off-diagonal elements. The parameters µ and σ_p are therefore orthogonal according to the Fisher information matrix: for the estimation of the parameters, this fact makes it possible to use the conditional profile log-likelihood, which has better properties than the ordinary profile log-likelihood (Cox and Reid 1987).
By inverting the information matrix, we obtain the asymptotic variance-covariance matrix of the maximum likelihood estimators (µ̂, σ̂_p):

var(µ̂, σ̂_p) = diag( σ_p^2 Γ(1/p) / [n p^(2−2/p) Γ(2 − 1/p)],  σ_p^2 / (n p) ),

which for p = 1 and p = 2 becomes, respectively, the asymptotic variance-covariance matrix of the maximum likelihood estimators of the Laplace and the Gauss distribution parameters.
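A quick numerical check of the two asymptotic variances, written out explicitly below as σ_p^2 Γ(1/p)/(n p^(2−2/p) Γ(2−1/p)) for µ̂ and σ_p^2/(np) for σ̂_p (our reconstruction from the information matrix): for p = 2 they collapse to the familiar normal-theory results σ^2/n and σ^2/(2n).

```r
# Asymptotic variances of the ML estimators of mu and sigma_p
# (sketch; expressions as reconstructed from the information matrix).
avar_mu     <- function(p, sigmap, n)
  sigmap^2 * gamma(1 / p) / (n * p^(2 - 2 / p) * gamma(2 - 1 / p))
avar_sigmap <- function(p, sigmap, n) sigmap^2 / (n * p)

all.equal(avar_mu(2, sigmap = 3, n = 50), 3^2 / 50)       # TRUE: sigma^2 / n
all.equal(avar_sigmap(2, sigmap = 3, n = 50), 3^2 / 100)  # TRUE: sigma^2 / (2n)
```

For p = 1 both variances reduce to σ_1^2/n, the Laplace-theory result.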

Estimation of the shape parameter p
The methods presented in the literature are based on the likelihood function and on indices of kurtosis. In particular, a comparison among different approaches, i.e. the whole log-likelihood, the profile log-likelihood (Barndorff-Nielsen 1988) and the conditional profile log-likelihood (Cox and Reid 1987), proposed by Agrò (1999), and an index of kurtosis, proposed by A.M. Mineo (1994), has been made by A.M. Mineo (2003b). According to the simulation study reported in that paper, the best method seems to be the one based on the index of kurtosis, while the whole log-likelihood approach generally works better than those based on the profile log-likelihood and the conditional profile log-likelihood. Indeed, the method based on the index of kurtosis VI (see formula 15) gives estimates of p that seem less biased and more efficient than the estimates obtained by using the three methods based on the log-likelihood function. In some way, these results are expected, given the link between the shape parameter p and the kurtosis (cf. Section 3.2.2).

Estimation of p by means of the maximum likelihood method
If we want to determine the maximum likelihood estimator of the shape parameter p, the equation obtained by differentiating the log-likelihood function (3) with respect to p is

n/p^2 [log p − 1 + Ψ(1 + 1/p)] + (1/p^2) Σ_{i=1}^n (|x_i − µ|/σ_p)^p − (1/p) Σ_{i=1}^n (|x_i − µ|/σ_p)^p log(|x_i − µ|/σ_p) = 0,    (13)

where Ψ(·) is the digamma function, i.e. the first derivative of the logarithm of the gamma function.
Equation (13) can be solved by using numerical methods. Agrò (1992) uses this method, showing that it does not work well for small samples, even though it provides good results for samples of size greater than 50-100.
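A sketch of the direct numerical approach (our code, not the package's): rather than solving (13), one can maximize the full log-likelihood (3) over (µ, σ_p, p), using log transforms to keep σ_p and p positive:

```r
# Joint ML estimation of (mu, sigma_p, p) by direct numerical maximization
# of the log-likelihood (illustrative sketch).
loglik_ep <- function(par, x) {
  mu <- par[1]; sigmap <- exp(par[2]); p <- exp(par[3])
  -length(x) * log(2 * sigmap * p^(1 / p) * gamma(1 + 1 / p)) -
    sum(abs(x - mu)^p) / (p * sigmap^p)
}

set.seed(2)
x <- rnorm(500)                                  # true values: mu = 0, p = 2
fit <- optim(c(mean(x), log(sd(x)), log(2)), loglik_ep, x = x,
             control = list(fnscale = -1))       # fnscale = -1: maximize
exp(fit$par[3])                                  # estimate of p, roughly 2
```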
In a subsequent paper, Agrò (1995) computes the inverse of the Fisher information matrix for (µ, σ_p, p), which gives the asymptotic variance-covariance matrix of the maximum likelihood estimators (µ̂, σ̂_p, p̂); the entries of this matrix involve the trigamma function Ψ′(·), i.e. the second derivative of the logarithm of the gamma function. From this matrix we can see that the shape parameter p is orthogonal to the location parameter µ, but is not orthogonal to the scale parameter σ_p. Capobianco (2000) observes that the asymptotic variance of the scale parameter estimator is larger than the asymptotic variance of the same estimator obtained for the Gaussian or the Laplace distribution. This is an expected result: the Laplace and the Gaussian distribution are exponential power distributions with p = 1 and p = 2, respectively, and if we substitute these values into the expression of the asymptotic variance we recover the same results. The apparent loss of efficiency is just due to the need of estimating p.

Estimation of p by means of indices of kurtosis
These estimation procedures exploit the relationship between the shape parameter p and the kurtosis. The indices of kurtosis usually used are

VI = √µ_2 / µ_1,    β_2 = µ_4 / µ_2^2,    β_p = µ_{2p} / µ_p^2,    (15)

where µ_r = E(|x − µ|^r) = σ_p^r p^(r/p) Γ((r+1)/p) / Γ(1/p) is the absolute moment of order r. The index β_p, called the generalized index of kurtosis, has been proposed by Lunetta (1963) and gives for p = 2 the Pearson index of kurtosis β_2. The estimators of the indices of kurtosis described above are given by

V̂I = √(n Σ(x_i − M)^2) / Σ|x_i − M|,    β̂_2 = n Σ(x_i − M)^4 / [Σ(x_i − M)^2]^2,    β̂_p = n Σ|x_i − M|^{2p} / [Σ|x_i − M|^p]^2,

where M is the arithmetic mean. A first solution to this problem was given by Lunetta (1967), who observed how the kurtosis indices could be used for estimating p. Afterwards, A. Mineo (1980) proposed the solution of an equation based on a suitable combination of sampling indices of kurtosis. Again, A. Mineo (1986) shows by simulation studies the lower efficiency of some well-known robust estimators, compared to the maximum likelihood estimator of the location parameter µ. In the same paper, the author estimates p by using an inverse interpolation procedure. In a later paper, A. Mineo (1989), by implementing an iterative procedure, obtains estimators for a simple linear regression model with errors distributed as a normal distribution of order p, and an estimate of p by solving the equation

n Σ_{i=1}^n |x_i − M_p|^{2p} / [Σ_{i=1}^n |x_i − M_p|^p]^2 = p + 1,    (23)

which derives from the theoretical value β_p = p + 1 of the generalized index of kurtosis. In this formula, M_p is an estimate of the location parameter µ obtained by solving equation (4): this is necessary to make the iterative estimation of all three parameters µ, σ_p and p possible.
An estimation method based on the index of kurtosis VI has been proposed by A.M. Mineo (1994): the theoretical value of the index is compared with the empirical value computed on the sample. The index VI has been preferred to β_2 because it is less affected by extreme values in the data, so as to obtain better estimates even when small samples (n ≤ 50) are considered. Moreover, equation (23), obtained from β_p, has not been considered, since its expression is computationally more demanding. The validity of the proposed procedure has been assessed by simulation studies. Indeed, the obtained results seem to be very interesting and better than the ones obtained by methods based on the likelihood function (A.M. Mineo, 2003b).
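The idea can be sketched as follows (our code, not the package's estimatep()): the theoretical value VI = √(Γ(1/p)Γ(3/p))/Γ(2/p), which follows from the absolute moments of the distribution, is matched to the sample value V̂I by root finding:

```r
# Estimation of p by matching the theoretical and empirical values of the
# kurtosis index VI (illustrative sketch of the approach).
vi_theor <- function(p) sqrt(gamma(1 / p) * gamma(3 / p)) / gamma(2 / p)

estimate_p_vi <- function(x) {
  m <- mean(x)
  vi_hat <- sqrt(sum((x - m)^2) / length(x)) / mean(abs(x - m))
  uniroot(function(p) vi_theor(p) - vi_hat, interval = c(0.5, 20))$root
}

set.seed(3)
estimate_p_vi(rnorm(1000))  # roughly 2 for a normal sample
```

For p = 2 the theoretical value is √(π/2), the well-known ratio between the standard deviation and the mean absolute deviation of a normal distribution.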

The normalp package
The normalp package (A.M. Mineo, 2003a) is an R (R Development Core Team 2004) package containing a collection of tools related to the exponential power distribution. The implemented functions generalize some commands already available in the base package for normally distributed observations. Moreover, useful commands deal with estimation problems for linear regression models when the random errors are drawn from an exponential power distribution. The normalp package is available on the Comprehensive R Archive Network (CRAN).

Computation of the exponential power distribution
The normalp package has four functions, dnormp(), pnormp(), qnormp() and rnormp(), to compute the typical quantities of a probability distribution, i.e.:
• dnormp(), which gives the density function of an exponential power distribution;
• pnormp(), which gives the probability function of an exponential power distribution;
• qnormp(), which gives the quantiles of an exponential power distribution;
• rnormp(), which generates a vector of n pseudo-random numbers from an exponential power distribution.
The use of these functions is the following:

dnormp(x, mu = 0, sigmap = 1, p = 2, log = FALSE)

with arguments:
x: vector of quantiles;
mu: vector of location parameters;
sigmap: vector of scale parameters;
p: shape parameter;
log: logical; if TRUE, the density is returned as log(density).
pnormp(q, mu = 0, sigmap = 1, p = 2, lower.tail = TRUE, log.pr = FALSE)

rnormp(n, mu = 0, sigmap = 1, p = 2, method = "def")

with arguments:
n: number of observations;
mu: vector of location parameters;
sigmap: vector of scale parameters;
p: shape parameter;
method: if set to "chiodi", an algorithm based on a generalization of the Marsaglia (1964) formula, suggested by Chiodi (1986), is used; otherwise a faster method based on the relationship linking an exponential power distribution and a Gamma distribution (Lunetta 1963) is used.
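The gamma relationship exploited by the default method can be sketched as follows (our code, not the package's rnormp() internals): if U ~ Gamma(1/p, 1), then (pU)^(1/p) has the distribution of |X − µ|/σ_p, so a random sign and a rescaling yield exponential power variates.

```r
# Pseudo-random numbers from an exponential power distribution via the
# gamma relationship (illustrative sketch of the idea behind method "def").
rep_gamma <- function(n, mu = 0, sigmap = 1, p = 2) {
  u <- rgamma(n, shape = 1 / p)              # U ~ Gamma(1/p, 1)
  s <- sample(c(-1, 1), n, replace = TRUE)   # random sign
  mu + s * sigmap * (p * u)^(1 / p)
}

set.seed(4)
x <- rep_gamma(10000, mu = 1, sigmap = 2, p = 1.5)
mean(x)                          # close to mu = 1
mean(abs(x - 1)^1.5)^(1 / 1.5)   # close to sigmap = 2, since E|X - mu|^p = sigmap^p
```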

Graphical functions
Besides the functions described above, some useful graphical functions have been implemented. In particular, these functions are:
• graphnp(), which plots from one to five different exponential power distributions, each one drawn with a different color;
• qqnormp(), which produces an exponential power distribution Q-Q plot;
• qqlinep(), which sketches a line through the first and the third quartile in an exponential power distribution Q-Q plot.
The functions qqnormp() and qqlinep() thus allow one to check graphically whether a vector of observations can be considered as drawn from an exponential power distribution.
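The theoretical quantiles behind such a Q-Q plot can be obtained from the same gamma relationship used for random number generation; a sketch (ours, not the package's qnormp()):

```r
# Quantile function of an exponential power distribution via qgamma()
# (illustrative sketch): for pr > 1/2,
#   q(pr) = mu + sigmap * (p * qgamma(2 pr - 1, shape = 1/p))^(1/p),
# and the lower tail follows by symmetry.
qep <- function(pr, mu = 0, sigmap = 1, p = 2) {
  s <- sign(pr - 0.5)
  mu + s * sigmap * (p * qgamma(abs(2 * pr - 1), shape = 1 / p))^(1 / p)
}

qep(0.975, p = 2)   # 1.96..., the familiar normal quantile
```

A Q-Q plot is then obtained by plotting sort(x) against qep(ppoints(length(x)), ...).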

Functions to estimate the exponential power distribution parameters
Another set of functions is related to the estimation of the parameters of an exponential power distribution. In particular, we have the following functions:
• estimatep(), which estimates the shape parameter p from a vector of observations by means of the index of kurtosis VI (A.M. Mineo, 1994);
• paramp(), which estimates the location parameter µ and the scale parameter σ_p by means of the maximum likelihood method, considering the two cases when p is known and when p is unknown; in the latter case, an estimate of p is obtained from the estimatep() function. This function returns a list of elements: the arithmetic mean, the standard deviation, and the estimated values of the location, scale and shape parameters of an exponential power distribution. Moreover, if the message iter=1 is printed out, convergence problems have arisen in the computation;
• kurtosis(), which computes the theoretical and the empirical values of the kurtosis indices VI, β_2 and β_p;
• simul.mp(), which performs a Monte Carlo simulation to assess the behavior of the estimators of the parameters µ, σ_p and p; this behavior can be inspected graphically by using the function plot.simul.mp(), which returns a histogram for each set of the estimates produced by simul.mp(). The function simul.mp() returns a list containing the following components: dat, an m × 5 matrix reporting the result of paramp() for each sample; table, a 2 × 5 matrix of the means and the variances of the five estimator values.
The use of these functions is described below:

estimatep(x, mu = 0, p = 2, method = c("inverse", "direct"))

with arguments:
x: vector of observations;
mu: the location parameter;
p: starting value of the shape parameter;
method: the user can choose an inverse interpolation (faster) or a direct solution of the equation VI = V̂I.
paramp(x, p = NULL, ...)

with arguments:
x: vector of observations;
p: if specified, the algorithm uses the specified value instead of an estimate of p;
...: other arguments passed to or from other methods.

kurtosis(x, p = 3, value = "parameter")

with arguments:
x: a sample of observations; if not specified, the theoretical indices are computed;
p: the shape parameter;
value: if set to "parameter", the value specified in p is used; otherwise an estimate of p is computed and estimates of the three indices are given.

Functions to deal with linear regression problems
The normalp package also contains a set of functions to fit linear regression models when the random errors are distributed according to an exponential power distribution. The implemented functions are:
• lmp(), which estimates the coefficients of the linear regression model by means of the maximum likelihood method, either when p is known or when p is unknown (A.M. Mineo, 1995); in the latter case an estimate of p is obtained from estimatep(). The lmp() function returns a list containing the following components: coefficients, the coefficient estimates; residuals, the residuals; fitted.values, the fitted values; rank, the numeric rank of the fitted linear model; df.residual, the residual degrees of freedom computed as in lm(); call, the matched call; terms, the terms object used; p, the estimate of p computed on the residuals; knp, a logical parameter used by summary(); model, the model frame used; iter, whose value is 1 if convergence difficulties arise.
• plot.lmp(), which performs an analysis of residuals by returning a set of plots: a plot of residuals against fitted values, an exponential power distribution Q-Q plot, and a Scale-Location plot.
• summary.lmp(), which produces a list of summary statistics related to the fitted linear model given by lmp(); among them, the power deviation of order p computed on the residuals,

s_p = (Σ_{i=1}^n |e_i|^p / (n − q))^(1/p),

where e_i are the residuals and q is the number of estimated regression coefficients if p is known, otherwise the number of estimated regression coefficients plus 1.
• simul.lmp(), which allows one to assess the behavior of the regression parameter estimators and of the estimators of σ_p and p by means of a Monte Carlo simulation study. This function returns a list containing a matrix reporting the result of lmp() for all the samples and a matrix of the means and variances of the m estimates of each parameter. The regressors are drawn from a uniform distribution.
• summary.simul.lmp(), which summarizes the results of simul.lmp();
• plot.simul.lmp(), which returns a histogram for each set of the estimates computed by simul.lmp().
The use of these functions is the following (it is very similar to that of lm() and related functions of the stats package):

lmp(formula, data = list(), p = NULL)

with arguments:
formula: the model to be fitted;
data: an optional data frame containing the variables in the model; by default the variables are taken from the environment;
p: the shape parameter; it is estimated if not specified.
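With p held fixed, the fitting criterion behind lmp() can be sketched as an L_p regression (our illustration; lmp() itself also iterates on the estimate of p):

```r
# L_p regression sketch: with p fixed, minimize sum |y - a - b x|^p;
# for p = 2 this is ordinary least squares.
lp_fit <- function(x, y, p) {
  obj <- function(b) sum(abs(y - b[1] - b[2] * x)^p)
  optim(coef(lm(y ~ x)), obj)$par    # OLS estimates as starting values
}

set.seed(5)
x <- runif(50)
y <- 2 + 3 * x + rnorm(50, sd = 0.2)
lp_fit(x, y, p = 2)                  # essentially the lm() coefficients
```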

Some examples of use
In this section different examples of use of the normalp package are shown. All the examples have the sole purpose of illustrating some possible ways to use the implemented functions.

Estimating the exponential power distribution parameters
Let us estimate the shape parameter p from a vector of observations:

R> x <- rnormp(100, 1, 2, 4)
R> p <- estimatep(x, 1, 2)
R> p

Fitting a linear regression model with random errors distributed according to an exponential power distribution
In this subsection we consider an example whose only purpose is to show the use of the implemented functions to fit a linear regression model when we have to assume that the random errors are distributed according to an exponential power distribution. Let us consider the following data set, taken from Levine, Krehbiel, and Berenson (2000); this data set reports the profits at the box office in millions of dollars (Gross) and the number of sold home videos in thousands (Videos) for a sample of 30 movies; the idea is that the number of sold home videos is related to the profits at the box office:

R> dataset <- read.

Let us report a summary of the main results, with a plot showing the two straight lines derived by using the OLS method and the lmp() method:

Figure 2: Exponential power distribution Q-Q plot for a sample of size n = 100.