The Exponentiated Generalized Gumbel Distribution

A class of univariate distributions called the exponentiated generalized class was recently proposed in the literature. A four-parameter model within this class named the exponentiated generalized Gumbel distribution is defined. We discuss the shapes of its density function and obtain explicit expressions for the ordinary moments, generating and quantile functions, mean deviations, Bonferroni and Lorenz curves and Rényi entropy. The density function of the order statistic is derived. The method of maximum likelihood is used to estimate model parameters. We determine the observed information matrix. We provide a Monte Carlo simulation study to evaluate the maximum likelihood estimates of model parameters and two applications to real data to illustrate the importance of the new model.


Introduction
The Gumbel distribution is a very popular statistical model due to its wide applicability. An extensive list of applications of the Gumbel model can be found in Kotz & Nadarajah (2000). In the area of climate modeling, for example, applications of the Gumbel model include global warming problems, offshore modeling, and rainfall and wind speed modeling (Nadarajah 2006). The model is also applied in various areas of engineering such as flood frequency analysis, network, nuclear, risk-based, space, software reliability, structural and wind engineering (Cordeiro, Nadarajah & Ortega 2012). Because of this wide applicability, several works have aimed at extending the Gumbel model; some examples are Nadarajah & Kotz (2004), Nadarajah (2006) and Cordeiro et al. (2012). The Gumbel cdf and pdf, with location parameter µ ∈ R and scale parameter σ > 0, are
$$G(x) = \exp\left\{-\exp\left(-\frac{x-\mu}{\sigma}\right)\right\} \tag{1}$$
and
$$g(x) = \frac{1}{\sigma}\exp\left\{-\frac{x-\mu}{\sigma} - \exp\left(-\frac{x-\mu}{\sigma}\right)\right\}, \tag{2}$$
respectively, for x ∈ R.
In recent years, several generalizations of continuous distributions have received great attention in the literature. Here, we refer to the papers: Marshall & Olkin (1997) for the Marshall-Olkin class, Eugene, Lee & Famoye (2002) for the Beta class, Zografos & Balakrishnan (2009) and Ristic & Balakrishnan (2011) for the Gamma class and Cordeiro & de Castro (2011) for the Kumaraswamy class of distributions. In a similar manner, for any baseline cdf G(x) with pdf g(x), and x ∈ R, Cordeiro, Ortega & Cunha (2013) defined the exponentiated generalized (EG) class of distributions with two extra parameters α > 0 and β > 0 and cdf F(x) and pdf f(x) given by
$$F(x) = \left[1 - \{1 - G(x)\}^{\alpha}\right]^{\beta} \tag{3}$$
and
$$f(x) = \alpha\beta\,\{1 - G(x)\}^{\alpha-1}\left[1 - \{1 - G(x)\}^{\alpha}\right]^{\beta-1} g(x), \tag{4}$$
respectively. In this paper, we study the so-called exponentiated generalized Gumbel ("EGGu" for short) distribution, obtained by inserting the Gumbel cdf (1) in equation (3).
The rest of the paper is organized as follows. In Section 2, we define the EGGu distribution. Shapes of the density function are discussed in Section 3. Explicit expressions for the cumulative and density functions, quantile function, ordinary moments, mean deviations, Bonferroni and Lorenz curves, generating function, Rényi entropy and order statistics are derived in Section 4. We discuss maximum likelihood estimation and present a Monte Carlo simulation experiment to evaluate the maximum likelihood estimates (MLEs) of the model parameters in Section 5. Two applications in Section 6 illustrate the usefulness of the new distribution for data modeling. Lastly, concluding remarks are given in Section 7.

The EGGu distribution
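The EGGu distribution was proposed by Cordeiro et al. (2013), but they did not study its mathematical properties. The cdf and pdf of the EGGu distribution are given by
$$F(x) = \left[1 - \left\{1 - \exp\left[-\exp\left(-\frac{x-\mu}{\sigma}\right)\right]\right\}^{\alpha}\right]^{\beta} \tag{5}$$
and
$$f(x) = \frac{\alpha\beta}{\sigma}\exp\left\{-\frac{x-\mu}{\sigma} - \exp\left(-\frac{x-\mu}{\sigma}\right)\right\}\left\{1 - \exp\left[-\exp\left(-\frac{x-\mu}{\sigma}\right)\right]\right\}^{\alpha-1}\left[1 - \left\{1 - \exp\left[-\exp\left(-\frac{x-\mu}{\sigma}\right)\right]\right\}^{\alpha}\right]^{\beta-1}. \tag{6}$$
Henceforth, a random variable X having density function (6) is denoted by X ∼ EGGu(α, β, µ, σ). To simplify the notation, we write F(x) for F(x; α, β, µ, σ), leaving the dependence on the model parameters implicit. In this model, µ ∈ R and σ > 0 are the location and scale parameters, respectively, whereas α > 0 and β > 0 are the shape parameters. The Gumbel distribution is clearly a special case of (5) when α = β = 1. Setting β = 1, we obtain the exponentiated Gumbel distribution defined by Nadarajah (2006).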
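The definition above can be checked numerically. The following minimal sketch assumes the EG construction F(x) = [1 − {1 − G(x)}^α]^β of Cordeiro et al. (2013) with a Gumbel baseline G; the function names (`gumbel_cdf`, `eggu_cdf`, `eggu_pdf`, `integrate`) are illustrative and do not come from the paper.

```python
import math

def gumbel_cdf(x, mu=0.0, sigma=1.0):
    """Gumbel cdf G(x) = exp(-exp(-(x - mu) / sigma))."""
    return math.exp(-math.exp(-(x - mu) / sigma))

def eggu_cdf(x, alpha, beta, mu=0.0, sigma=1.0):
    """EGGu cdf F(x) = [1 - {1 - G(x)}^alpha]^beta (assumed EG construction)."""
    u = math.exp(-(x - mu) / sigma)
    s = -math.expm1(-u)  # 1 - G(x), computed without cancellation
    return (1.0 - s ** alpha) ** beta

def eggu_pdf(x, alpha, beta, mu=0.0, sigma=1.0):
    """EGGu pdf, the derivative of eggu_cdf with respect to x."""
    u = math.exp(-(x - mu) / sigma)
    g = u * math.exp(-u) / sigma  # Gumbel pdf
    s = -math.expm1(-u)           # 1 - G(x)
    # Note: for alpha < 1 or beta < 1 the powers below can hit 0 ** negative
    # in the extreme tails; this sketch does not guard against that.
    return alpha * beta * g * s ** (alpha - 1.0) * (1.0 - s ** alpha) ** (beta - 1.0)

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h

if __name__ == "__main__":
    # The density should integrate to (approximately) one.
    print(integrate(lambda x: eggu_pdf(x, 1.5, 3.0), -10.0, 40.0))
```

With α = β = 1 both functions reduce to their Gumbel counterparts, which gives a quick sanity check of any implementation.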

Shape
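The main features of the density shape can be perceived through the study of its first and second derivatives. For the EGGu distribution, the first derivative of log{f(x)} is
$$\frac{d\log f(x)}{dx} = \frac{1}{\sigma}\left\{-1 - \log z + (\alpha-1)\frac{z\log z}{1-z} - \alpha(\beta-1)\frac{z\log z\,(1-z)^{\alpha-1}}{1-(1-z)^{\alpha}}\right\},$$
where $z = \exp\{-\exp[-(x-\mu)/\sigma]\}$. Here, 0 < z < 1.
The critical values of f(x) are the roots of the equation
$$(\alpha-1)\frac{z\log z}{1-z} - \alpha(\beta-1)\frac{z\log z\,(1-z)^{\alpha-1}}{1-(1-z)^{\alpha}} = 1 + \log z. \tag{7}$$
If the point $x = x_0$ is a root of (7), then we can classify it as a local maximum, a local minimum or an inflection point when we have, respectively, $\lambda(x_0) < 0$, $\lambda(x_0) > 0$ and $\lambda(x_0) = 0$, where $\lambda(x) = d^2\log f(x)/dx^2$. Plots of the EGGu density function for selected parameter values are displayed in Figure 1.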

Properties
In this section, we study some structural properties of the EGGu distribution.
For an arbitrary baseline cdf G(x), a random variable is said to have the exponentiated-G ("exp-G" for short) distribution with power parameter a > 0, say Y ∼ exp-G(a), if its cdf and pdf are $H_a(x) = G(x)^a$ and $h_a(x) = a\,g(x)\,G(x)^{a-1}$, respectively. We consider the generalized binomial expansion
$$(1 - z)^{b} = \sum_{k=0}^{\infty} (-1)^{k}\binom{b}{k} z^{k}, \tag{8}$$
which holds for any real non-integer b and |z| < 1. Using expansion (8) twice in equation (5), we can express the EGGu cdf as
$$F(x) = \sum_{j=0}^{\infty} w_{j+1}\, H_{j+1}(x), \tag{9}$$
where
$$w_{j+1} = \sum_{i=1}^{\infty} (-1)^{i+j+1}\binom{\beta}{i}\binom{\alpha i}{j+1}$$
and $H_{j+1}(x) = G(x)^{j+1}$ is the exponentiated Gumbel (exp-Gu) cdf with power parameter j + 1. By differentiating (9), we obtain
$$f(x) = \sum_{j=0}^{\infty} w_{j+1}\, h_{j+1}(x), \tag{10}$$
where $h_{j+1}(x)$ is the exp-Gu pdf with power parameter j + 1 given by
$$h_{j+1}(x) = \frac{j+1}{\sigma}\exp\left\{-\frac{x-\mu}{\sigma} - (j+1)\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}. \tag{11}$$
Equation (10) reveals that the EGGu density function is a linear combination of exp-Gu densities. This result is important to derive some structural properties of the new distribution, such as ordinary and incomplete moments, generating function and mean deviations, from those of the exp-Gu distribution.

Quantile function
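In applied work, we are often interested in the quantile function (qf) of a continuous distribution. Based on the qf, we can generate occurrences of the distribution and obtain measures of skewness and kurtosis. The EGGu qf, say x = Q(u), follows by inverting the EGGu cdf (5) as
$$Q(u) = \mu - \sigma\log\left\{-\log\left[1 - \left(1 - u^{1/\beta}\right)^{1/\alpha}\right]\right\}, \quad 0 < u < 1. \tag{12}$$
The median of X is simply
$$Q(1/2) = \mu - \sigma\log\left\{-\log\left[1 - \left(1 - 2^{-1/\beta}\right)^{1/\alpha}\right]\right\}.$$
Further, it is possible to generate EGGu variates by X = Q(U), where U is a uniform variate on the unit interval (0, 1).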
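The inversion method described above can be sketched as follows. The closed form used for Q(u) is obtained by inverting F(x) = [1 − {1 − G(x)}^α]^β with a Gumbel baseline; the function names and the choice of `random.Random` as the uniform generator are assumptions made for illustration only.

```python
import math
import random

def eggu_cdf(x, alpha, beta, mu=0.0, sigma=1.0):
    """EGGu cdf F(x) = [1 - {1 - G(x)}^alpha]^beta with Gumbel baseline G."""
    u = math.exp(-(x - mu) / sigma)
    return (1.0 - (-math.expm1(-u)) ** alpha) ** beta

def eggu_quantile(p, alpha, beta, mu=0.0, sigma=1.0):
    """Q(p) = mu - sigma*log(-log(1 - (1 - p^(1/beta))^(1/alpha))), 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # log(1 - p^(1/beta)), using expm1 to limit cancellation when p is near 1
    log_one_minus_t = math.log(-math.expm1(math.log(p) / beta))
    # Baseline probability G = 1 - (1 - p^(1/beta))^(1/alpha)
    g = -math.expm1(log_one_minus_t / alpha)
    # Extreme tails with very small alpha or beta may still round g to 0 or 1;
    # this sketch does not handle those cases.
    return mu - sigma * math.log(-math.log(g))

def eggu_sample(n, alpha, beta, mu=0.0, sigma=1.0, rng=None):
    """Draw n EGGu variates as X = Q(U), with U uniform on (0, 1)."""
    rng = rng or random.Random()
    out = []
    while len(out) < n:
        p = rng.random()
        if p > 0.0:  # random() can return exactly 0.0, which Q does not accept
            out.append(eggu_quantile(p, alpha, beta, mu, sigma))
    return out

if __name__ == "__main__":
    draws = eggu_sample(5, 1.5, 3.0, rng=random.Random(2015))
    print(draws)
```

A round trip F(Q(p)) = p is a convenient check that the cdf and qf are consistent with each other.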
The effect of the additional shape parameters α and β on the skewness and kurtosis of the new distribution can be studied through quantile-based measures. In this sense, two important measures are the Bowley skewness (B) and the Moors kurtosis (M). Recent papers have used these measures to quantify skewness and kurtosis; for example, Rêgo, Cintra & Cordeiro (2012), Zea, Silva, Bourguignon, Santos & Cordeiro (2012) and Ramos, Marinho, Silva & Cordeiro (2013) derived the B and M measures for the Beta normal, Beta exponentiated Pareto and exponentiated Lomax Poisson distributions, respectively.
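The Bowley skewness (Kenney & Keeping 1962), based on quartiles, is given by
$$B = \frac{Q(3/4) + Q(1/4) - 2\,Q(1/2)}{Q(3/4) - Q(1/4)}.$$
On the other hand, the Moors kurtosis (Moors 1988), based on octiles, is given by
$$M = \frac{Q(3/8) - Q(1/8) + Q(7/8) - Q(5/8)}{Q(6/8) - Q(2/8)}.$$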
These measures are less sensitive to outliers and they exist even for distributions without moments. For the normal distribution, B = 0 and M ≈ 1.233. Plots of these skewness and kurtosis measures for some choices of the parameter β as functions of α, and for some choices of α as functions of β, for µ = 0, σ = 1, are displayed in Figure 2. These plots indicate that skewness and kurtosis decrease when β increases for fixed α and when α increases for fixed β.
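The two quantile-based measures are simple to evaluate for any distribution with a computable qf. The sketch below uses the standard quartile and octile definitions of B and M and the EGGu qf obtained by inverting the assumed EG construction; the helper names are illustrative. The normal distribution is included only as a reference point.

```python
import math
from statistics import NormalDist

def eggu_quantile(p, alpha, beta, mu=0.0, sigma=1.0):
    """EGGu qf obtained by inverting F(x) = [1 - {1 - G(x)}^alpha]^beta."""
    log_one_minus_t = math.log(-math.expm1(math.log(p) / beta))
    g = -math.expm1(log_one_minus_t / alpha)
    return mu - sigma * math.log(-math.log(g))

def bowley(q):
    """Bowley skewness from a quantile function q(p)."""
    return (q(0.75) + q(0.25) - 2.0 * q(0.5)) / (q(0.75) - q(0.25))

def moors(q):
    """Moors kurtosis from a quantile function q(p), based on octiles."""
    return (q(3 / 8) - q(1 / 8) + q(7 / 8) - q(5 / 8)) / (q(6 / 8) - q(2 / 8))

def eggu_shape(alpha, beta, mu=0.0, sigma=1.0):
    """Return (B, M) for the EGGu(alpha, beta, mu, sigma) distribution."""
    q = lambda p: eggu_quantile(p, alpha, beta, mu, sigma)
    return bowley(q), moors(q)

if __name__ == "__main__":
    normal = NormalDist()
    print("normal:", bowley(normal.inv_cdf), moors(normal.inv_cdf))
    for beta in (0.5, 1.0, 3.0):
        print("alpha=1.5, beta=%s:" % beta, eggu_shape(1.5, beta))
```

Both measures are ratios of quantile differences, so they do not depend on µ and σ; only the shape parameters α and β matter.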

Moments
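It is hardly necessary to emphasize the importance of calculating the moments of a random variable in statistical analysis, particularly in applied work. Some key features of a distribution, such as skewness and kurtosis, can be studied through its moments. The nth moment of X can be determined from (10) as
$$E(X^n) = \sum_{j=0}^{\infty} w_{j+1}\int_{-\infty}^{\infty} x^n\, h_{j+1}(x)\,dx,$$
which, on setting $u = \exp\{-(x-\mu)/\sigma\}$, reduces to
$$E(X^n) = \sum_{j=0}^{\infty} (j+1)\, w_{j+1}\int_{0}^{\infty} (\mu - \sigma\log u)^n \exp\{-(j+1)u\}\,du.$$
Using the binomial expansion for $(\mu - \sigma\log u)^n$, we obtain
$$E(X^n) = \sum_{j=0}^{\infty}\sum_{k=0}^{n} (j+1)\, w_{j+1}\binom{n}{k}\mu^{n-k}(-\sigma)^{k}\int_{0}^{\infty} (\log u)^{k}\exp\{-(j+1)u\}\,du. \tag{13}$$
Using a result by Nadarajah (2006),
$$\int_{0}^{\infty} (\log u)^{k}\exp\{-(j+1)u\}\,du = \left.\frac{\partial^{k}}{\partial p^{k}}\left[(j+1)^{-p}\,\Gamma(p)\right]\right|_{p=1}. \tag{14}$$
By combining (13) and (14), the nth moment of X becomes
$$E(X^n) = \sum_{j=0}^{\infty}\sum_{k=0}^{n} (j+1)\, w_{j+1}\binom{n}{k}\mu^{n-k}(-\sigma)^{k}\left.\frac{\partial^{k}}{\partial p^{k}}\left[(j+1)^{-p}\,\Gamma(p)\right]\right|_{p=1}.$$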
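As a numerical cross-check of moment calculations, the nth moment can also be computed from the identity E(X^n) = ∫_0^1 Q(p)^n dp. The sketch below uses a plain midpoint rule, which is an illustrative choice and not the series method of this section. It relies on two facts that follow from the assumed construction: α = β = 1 gives the Gumbel distribution (mean µ + γσ, variance π²σ²/6, with γ the Euler constant), and α = 1, β = a gives G(x)^a, a Gumbel distribution with location µ + σ log a.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def eggu_quantile(p, alpha, beta, mu=0.0, sigma=1.0):
    """EGGu qf obtained by inverting F(x) = [1 - {1 - G(x)}^alpha]^beta."""
    log_one_minus_t = math.log(-math.expm1(math.log(p) / beta))
    g = -math.expm1(log_one_minus_t / alpha)
    return mu - sigma * math.log(-math.log(g))

def eggu_moment(n, alpha, beta, mu=0.0, sigma=1.0, steps=100_000):
    """Approximate E(X^n) = integral over (0, 1) of Q(p)^n by the midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        total += eggu_quantile((k + 0.5) * h, alpha, beta, mu, sigma) ** n
    return h * total

if __name__ == "__main__":
    mean = eggu_moment(1, 1.0, 1.0)
    var = eggu_moment(2, 1.0, 1.0) - mean ** 2
    print("Gumbel mean, numeric vs exact:", mean, EULER_GAMMA)
    print("Gumbel variance, numeric vs exact:", var, math.pi ** 2 / 6.0)
```

The integrand has only logarithmic singularities at the endpoints, so the midpoint rule converges, although slowly; a production implementation would likely use an adaptive quadrature routine.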

Generating Function
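The moment generating function (mgf) of X can be obtained using the fact that the EGGu density function is a linear combination of exp-Gu densities. Thus,
$$M(t) = \sum_{j=0}^{\infty} w_{j+1}\, M_{j+1}(t),$$
where $M_{j+1}(t)$ is the mgf of the exp-Gu distribution with power parameter j + 1. Using a result by Cordeiro et al. (2012), we have
$$M_{j+1}(t) = \exp(\mu t)\,(j+1)^{t\sigma}\,\Gamma(1 - \sigma t), \quad t < 1/\sigma,$$
and then
$$M(t) = \exp(\mu t)\,\Gamma(1 - \sigma t)\sum_{j=0}^{\infty} (j+1)^{t\sigma}\, w_{j+1}.$$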

Mean Deviations
Generally, there has been a great interest in obtaining the first incomplete moment of a distribution.Based on this quantity, we can calculate, for example, mean deviations which provide important information about the characteristics of a population.Indeed, the amount of dispersion in a population may be measured to some extent by all the deviations from the mean and the median.
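For calculating the mean deviations from the mean and the median, we require the first incomplete moment of X given by
$$T(z) = \int_{-\infty}^{z} x\, f(x)\,dx = \sum_{j=0}^{\infty} w_{j+1}\int_{-\infty}^{z} x\, h_{j+1}(x)\,dx. \tag{15}$$
The mean deviations from the mean and the median are defined by
$$\delta_1 = E(|X - \mu'_1|) = 2\mu'_1 F(\mu'_1) - 2\,T(\mu'_1)$$
and
$$\delta_2 = E(|X - M|) = \mu'_1 - 2\,T(M),$$
respectively, where $\mu'_1 = E(X)$, the median M of X is determined from the qf by M = Q(1/2), F(M) and $F(\mu'_1)$ are easily obtained from (5) and T(z) is given by (15).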
Another important application of the first incomplete moment is to determine Bonferroni and Lorenz curves, which are commonly used in applied work in areas such as economics, reliability, demography, insurance and medicine. For a given probability π, these curves are defined by $B(\pi) = T(q)/(\pi\mu'_1)$ and $L(\pi) = T(q)/\mu'_1$, where $\mu'_1 = E(X)$ and q = Q(π) is calculated from (12).
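The quantities of this section can be approximated numerically through the identity T(Q(p)) = ∫_0^p Q(v) dv, which avoids the series representation altogether. The sketch below assumes the EG construction with Gumbel baseline and the standard identities δ1 = 2µ'_1 F(µ'_1) − 2T(µ'_1) and δ2 = µ'_1 − 2T(M); the function names and the midpoint rule are illustrative choices.

```python
import math

def eggu_cdf(x, alpha, beta, mu=0.0, sigma=1.0):
    """EGGu cdf F(x) = [1 - {1 - G(x)}^alpha]^beta with Gumbel baseline G."""
    u = math.exp(-(x - mu) / sigma)
    return (1.0 - (-math.expm1(-u)) ** alpha) ** beta

def eggu_quantile(p, alpha, beta, mu=0.0, sigma=1.0):
    """EGGu qf obtained by inverting eggu_cdf."""
    log_one_minus_t = math.log(-math.expm1(math.log(p) / beta))
    g = -math.expm1(log_one_minus_t / alpha)
    return mu - sigma * math.log(-math.log(g))

def partial_mean(p, alpha, beta, mu=0.0, sigma=1.0, steps=20_000):
    """T(Q(p)) = integral over (0, p) of Q(v) dv, by the midpoint rule."""
    h = p / steps
    return h * sum(eggu_quantile((k + 0.5) * h, alpha, beta, mu, sigma)
                   for k in range(steps))

def mean_deviations(alpha, beta, mu=0.0, sigma=1.0):
    """Return (delta1, delta2): mean deviations about the mean and the median."""
    mean = partial_mean(1.0, alpha, beta, mu, sigma)
    p_mean = eggu_cdf(mean, alpha, beta, mu, sigma)
    delta1 = 2.0 * mean * p_mean - 2.0 * partial_mean(p_mean, alpha, beta, mu, sigma)
    delta2 = mean - 2.0 * partial_mean(0.5, alpha, beta, mu, sigma)
    return delta1, delta2

def lorenz(pi, alpha, beta, mu=0.0, sigma=1.0):
    """Lorenz curve L(pi) = T(Q(pi)) / E(X)."""
    mean = partial_mean(1.0, alpha, beta, mu, sigma)
    return partial_mean(pi, alpha, beta, mu, sigma) / mean

def bonferroni(pi, alpha, beta, mu=0.0, sigma=1.0):
    """Bonferroni curve B(pi) = L(pi) / pi."""
    return lorenz(pi, alpha, beta, mu, sigma) / pi

if __name__ == "__main__":
    print(mean_deviations(1.5, 3.0))
    print(lorenz(0.4, 1.5, 3.0), bonferroni(0.4, 1.5, 3.0))
```

The Lorenz and Bonferroni curves are normally interpreted for positive data, so the example uses parameter values for which the mean is positive.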

Rényi Entropy
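The entropy of a random variable X with density function f(x) is a measure of variation of the uncertainty. For any real parameter λ > 0 and λ ≠ 1, the Rényi entropy is given by
$$I_R(\lambda) = \frac{1}{1-\lambda}\log\left\{\int_{-\infty}^{\infty} f(x)^{\lambda}\,dx\right\}.$$
Using the binomial expansion (8) twice in equation (4), we can write
$$f(x)^{\lambda} = \sum_{j=0}^{\infty} \delta_j\, g(x)^{\lambda}\, G(x)^{j}, \tag{16}$$
where $\delta_j$ is given by
$$\delta_j = (\alpha\beta)^{\lambda}\sum_{i=0}^{\infty} (-1)^{i+j}\binom{\lambda(\beta-1)}{i}\binom{\alpha i + \lambda(\alpha-1)}{j}.$$
Inserting (1) and (2) in equation (16) and after some algebra, we obtain
$$\int_{-\infty}^{\infty} f(x)^{\lambda}\,dx = \sigma^{1-\lambda}\,\Gamma(\lambda)\sum_{j=0}^{\infty} \frac{\delta_j}{(\lambda + j)^{\lambda}}.$$
Finally,
$$I_R(\lambda) = \frac{1}{1-\lambda}\log\left\{\sigma^{1-\lambda}\,\Gamma(\lambda)\sum_{j=0}^{\infty} \frac{\delta_j}{(\lambda + j)^{\lambda}}\right\}.$$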
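The Rényi entropy can also be evaluated by direct numerical integration of f(x)^λ, which is a useful check on any series-based implementation. The sketch below assumes the EGGu density obtained from the EG construction with Gumbel baseline. The closed form used for the Gumbel case (α = β = 1), I_R(λ) = log σ + [log Γ(λ) − λ log λ]/(1 − λ), is derived here from ∫ g(x)^λ dx = σ^{1−λ} Γ(λ) λ^{−λ} and is not quoted from the paper; the integration limits and step count are illustrative.

```python
import math

def eggu_pdf(x, alpha, beta, mu=0.0, sigma=1.0):
    """EGGu pdf for the EG construction with Gumbel baseline."""
    u = math.exp(-(x - mu) / sigma)
    g = u * math.exp(-u) / sigma  # Gumbel pdf
    s = -math.expm1(-u)           # 1 - G(x)
    return alpha * beta * g * s ** (alpha - 1.0) * (1.0 - s ** alpha) ** (beta - 1.0)

def renyi_entropy(lam, alpha, beta, mu=0.0, sigma=1.0, lo=-12.0, hi=60.0, steps=40_000):
    """(1/(1-lam)) * log(integral of f^lam), trapezoidal rule on [mu+lo*sigma, mu+hi*sigma]."""
    if lam <= 0.0 or lam == 1.0:
        raise ValueError("lam must be positive and different from 1")
    a, b = mu + lo * sigma, mu + hi * sigma
    h = (b - a) / steps
    f = lambda x: eggu_pdf(x, alpha, beta, mu, sigma) ** lam
    total = 0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, steps))
    return math.log(total * h) / (1.0 - lam)

def gumbel_renyi_closed_form(lam, sigma=1.0):
    """Gumbel case alpha = beta = 1 (derived for this sketch, see the lead-in)."""
    return math.log(sigma) + (math.lgamma(lam) - lam * math.log(lam)) / (1.0 - lam)

if __name__ == "__main__":
    print(renyi_entropy(2.0, 1.0, 1.0), gumbel_renyi_closed_form(2.0))
    print(renyi_entropy(2.0, 1.5, 3.0))
```

The default integration range assumes shape parameters of moderate size; very small α or λ produce heavier right tails of f^λ and would need a wider upper limit.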

Order statistics
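We derive an explicit expression for the density of the ith order statistic $X_{i:n}$, say $f_{i:n}(x)$, in a random sample of size n from the EGGu distribution. It is well known that
$$f_{i:n}(x) = \frac{1}{B(i, n-i+1)}\, f(x)\, F(x)^{i-1}\,[1 - F(x)]^{n-i}, \tag{17}$$
where B(·, ·) denotes the beta function. Substituting (3) and (4) in equation (17), expanding $[1 - F(x)]^{n-i}$ by the binomial theorem and applying the binomial expansion (8) twice, we can write
$$f_{i:n}(x) = \sum_{\ell=0}^{\infty} \vartheta_{\ell}\, g(x)\, G(x)^{\ell},$$
where $\vartheta_{\ell}$ is given by
$$\vartheta_{\ell} = \frac{\alpha\beta}{B(i, n-i+1)}\sum_{k=0}^{n-i}\sum_{m=0}^{\infty} (-1)^{k+m+\ell}\binom{n-i}{k}\binom{\beta(i+k)-1}{m}\binom{\alpha(m+1)-1}{\ell}.$$
Thus, replacing G(x) and g(x) by the cdf and pdf of the Gumbel distribution given by (1) and (2), respectively, we can write $f_{i:n}(x)$ as
$$f_{i:n}(x) = \sum_{\ell=0}^{\infty} \frac{\vartheta_{\ell}}{\sigma}\exp\left\{-\frac{x-\mu}{\sigma} - (\ell+1)\exp\left(-\frac{x-\mu}{\sigma}\right)\right\}.$$
After simple algebraic manipulation, we can rewrite the last equation as
$$f_{i:n}(x) = \sum_{\ell=0}^{\infty} \vartheta^{*}_{\ell}\, h_{\ell+1}(x), \tag{18}$$
where $\vartheta^{*}_{\ell} = \vartheta_{\ell}/(\ell+1)$ and $h_{\ell+1}(x)$ is given by (11).
Equation (18) reveals that the density function of the EGGu order statistic is a linear combination of exp-Gu densities. A direct application of (18) is to calculate the moments and the mgf of the EGGu order statistics.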
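For numerical work, the order-statistic density can be evaluated directly from the standard formula f_{i:n}(x) = f(x) F(x)^{i−1} [1 − F(x)]^{n−i} / B(i, n−i+1), without expanding it in a series. The sketch below assumes the EGGu cdf and pdf from the EG construction with Gumbel baseline; the function names are illustrative.

```python
import math

def eggu_cdf(x, alpha, beta, mu=0.0, sigma=1.0):
    """EGGu cdf F(x) = [1 - {1 - G(x)}^alpha]^beta with Gumbel baseline G."""
    u = math.exp(-(x - mu) / sigma)
    return (1.0 - (-math.expm1(-u)) ** alpha) ** beta

def eggu_pdf(x, alpha, beta, mu=0.0, sigma=1.0):
    """EGGu pdf, the derivative of eggu_cdf."""
    u = math.exp(-(x - mu) / sigma)
    g = u * math.exp(-u) / sigma
    s = -math.expm1(-u)
    return alpha * beta * g * s ** (alpha - 1.0) * (1.0 - s ** alpha) ** (beta - 1.0)

def order_stat_pdf(x, i, n, alpha, beta, mu=0.0, sigma=1.0):
    """Density of the ith order statistic in a sample of size n."""
    if not 1 <= i <= n:
        raise ValueError("need 1 <= i <= n")
    big_f = eggu_cdf(x, alpha, beta, mu, sigma)
    const = i * math.comb(n, i)  # equals 1 / B(i, n - i + 1) for integer i, n
    return (const * eggu_pdf(x, alpha, beta, mu, sigma)
            * big_f ** (i - 1) * (1.0 - big_f) ** (n - i))

def integrate(f, a, b, steps=20_000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, steps)))

if __name__ == "__main__":
    # Expected value of the sample maximum for n = 5, by numerical integration.
    mean_max = integrate(lambda x: x * order_stat_pdf(x, 5, 5, 1.5, 3.0), -10.0, 60.0)
    print(mean_max)
```

Moments of order statistics then follow by integrating x^r against `order_stat_pdf`, as in the usage example.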
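The rth moment of $X_{i:n}$ is given by
$$E(X_{i:n}^{r}) = \sum_{\ell=0}^{\infty} \vartheta^{*}_{\ell}\int_{-\infty}^{\infty} x^{r}\, h_{\ell+1}(x)\,dx.$$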

Revista Colombiana de Estadística 38 (2015) 123-143
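From the results presented in Section 4.3, the last equation reduces to
$$E(X_{i:n}^{r}) = \sum_{\ell=0}^{\infty}\sum_{k=0}^{r} (\ell+1)\,\vartheta^{*}_{\ell}\binom{r}{k}\mu^{r-k}(-\sigma)^{k}\left.\frac{\partial^{k}}{\partial p^{k}}\left[(\ell+1)^{-p}\,\Gamma(p)\right]\right|_{p=1}.$$
The mgf of $X_{i:n}$ is given by
$$M_{i:n}(t) = \sum_{\ell=0}^{\infty} \vartheta^{*}_{\ell}\int_{-\infty}^{\infty} \exp(tx)\, h_{\ell+1}(x)\,dx.$$
Finally, based on the results in Section 4.4, the last equation can be rewritten as
$$M_{i:n}(t) = \exp(\mu t)\,\Gamma(1 - \sigma t)\sum_{\ell=0}^{\infty} (\ell+1)^{t\sigma}\,\vartheta^{*}_{\ell}.$$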

Estimation
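Several approaches for parameter point estimation have been proposed in the literature, but the maximum likelihood method is the most commonly employed. The MLEs enjoy desirable properties and can be used to construct confidence intervals and regions and also test statistics. Large sample theory for these estimates delivers simple approximations that work well in finite samples, and the resulting distributional approximation is easily handled either analytically or numerically. So, we consider the estimation of the unknown parameters α, β, µ and σ of the EGGu distribution from complete samples only by the method of maximum likelihood. Let $x_1, \ldots, x_n$ be a sample of size n from X. The log-likelihood function for the vector of parameters $\theta = (\alpha, \beta, \mu, \sigma)^{\top}$ can be expressed as
$$\ell(\theta) = n\log\left(\frac{\alpha\beta}{\sigma}\right) - \sum_{i=1}^{n} (y_i + u_i) + (\alpha-1)\sum_{i=1}^{n}\log(1 - z_i) + (\beta-1)\sum_{i=1}^{n}\log\left[1 - (1 - z_i)^{\alpha}\right],$$
where $y_i = (x_i - \mu)/\sigma$, $u_i = \exp(-y_i)$ and $z_i = \exp(-u_i)$. The elements of the score vector are given by
$$U_{\alpha}(\theta) = \frac{n}{\alpha} + \sum_{i=1}^{n}\log(1 - z_i) - (\beta-1)\sum_{i=1}^{n}\frac{(1 - z_i)^{\alpha}\log(1 - z_i)}{1 - (1 - z_i)^{\alpha}},$$
$$U_{\beta}(\theta) = \frac{n}{\beta} + \sum_{i=1}^{n}\log\left[1 - (1 - z_i)^{\alpha}\right],$$
$$U_{\mu}(\theta) = \frac{1}{\sigma}\sum_{i=1}^{n} c_i \quad\text{and}\quad U_{\sigma}(\theta) = -\frac{n}{\sigma} + \frac{1}{\sigma}\sum_{i=1}^{n} y_i\, c_i,$$
where
$$c_i = 1 - u_i + (\alpha-1)\frac{z_i u_i}{1 - z_i} - \alpha(\beta-1)\frac{z_i u_i (1 - z_i)^{\alpha-1}}{1 - (1 - z_i)^{\alpha}}.$$
The maximum likelihood estimate (MLE) $\hat\theta$ of θ is obtained by solving the nonlinear equations $U_{\alpha}(\theta) = 0$, $U_{\beta}(\theta) = 0$, $U_{\mu}(\theta) = 0$ and $U_{\sigma}(\theta) = 0$. They cannot be solved analytically and require statistical software with iterative numerical techniques. Many maximization methods are available in R, such as NR (Newton-Raphson), BFGS (Broyden-Fletcher-Goldfarb-Shanno), BHHH (Berndt-Hall-Hall-Hausman), SANN (Simulated-Annealing), NM (Nelder-Mead) and L-BFGS-B. For interval estimation and hypothesis tests on the parameters α, β, µ and σ, we determine the 4 × 4 observed information matrix $J(\theta) = \{-U_{rs}\}$, where $U_{rs} = \partial^2\ell(\theta)/(\partial\theta_r\,\partial\theta_s)$ for r, s ∈ {α, β, µ, σ}. The elements of J(θ) are given in the Appendix.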
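The log-likelihood and its score vector can be coded directly and checked against each other by finite differences before being passed to an optimizer. The sketch below assumes the EGGu density from the EG construction with Gumbel baseline; the score expressions are obtained by differentiating that log-likelihood and are not copied from the paper. The function names, the trial parameter values and the six observations (the first six values of the first data set used in the applications) are illustrative.

```python
import math

def loglik(theta, data):
    """EGGu log-likelihood for theta = (alpha, beta, mu, sigma)."""
    alpha, beta, mu, sigma = theta
    total = len(data) * math.log(alpha * beta / sigma)
    for x in data:
        y = (x - mu) / sigma
        u = math.exp(-y)
        s = -math.expm1(-u)  # 1 - z_i, with z_i = exp(-u_i)
        total += -y - u + (alpha - 1.0) * math.log(s) + (beta - 1.0) * math.log1p(-s ** alpha)
    return total

def score(theta, data):
    """Analytic score vector (U_alpha, U_beta, U_mu, U_sigma)."""
    alpha, beta, mu, sigma = theta
    n = len(data)
    u_alpha, u_beta, sum_c, sum_yc = n / alpha, n / beta, 0.0, 0.0
    for x in data:
        y = (x - mu) / sigma
        u = math.exp(-y)
        z = math.exp(-u)
        s = -math.expm1(-u)  # 1 - z
        sa = s ** alpha
        u_alpha += math.log(s) - (beta - 1.0) * sa * math.log(s) / (1.0 - sa)
        u_beta += math.log1p(-sa)
        c = (1.0 - u + (alpha - 1.0) * z * u / s
             - alpha * (beta - 1.0) * z * u * s ** (alpha - 1.0) / (1.0 - sa))
        sum_c += c
        sum_yc += y * c
    return [u_alpha, u_beta, sum_c / sigma, -n / sigma + sum_yc / sigma]

def numeric_score(theta, data, h=1e-6):
    """Central finite-difference approximation to the score vector."""
    grads = []
    for k in range(4):
        up, dn = list(theta), list(theta)
        up[k] += h
        dn[k] -= h
        grads.append((loglik(up, data) - loglik(dn, data)) / (2.0 * h))
    return grads

if __name__ == "__main__":
    sample = [0.77, 1.74, 0.81, 1.20, 1.95, 1.20]
    theta0 = (1.5, 3.0, 1.0, 0.8)  # arbitrary trial values, not estimates
    print(score(theta0, sample))
    print(numeric_score(theta0, sample))
```

Agreement between the two vectors at a few trial points is a cheap safeguard against sign errors before the score is handed to a gradient-based routine such as BFGS.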
Next, a small Monte Carlo simulation experiment based on 10,000 replications is conducted to evaluate the MLEs of the parameters of the EGGu distribution. We set the sample size at n = 100, 200, 400 and 800, the parameter α at α = 1.5 and 3.0, and the parameter β at β = 1.5 and 3.0. The location and scale parameters were fixed at µ = 0 and σ = 1, respectively, without loss of generality. The Monte Carlo simulation experiments are performed using the R programming language; see http://www.r-project.org. Table 1 reports the empirical means and the mean squared errors (in parentheses) of the corresponding estimators. From the figures in this table, we note that, as the sample size increases, the empirical biases and mean squared errors decrease in all the cases analyzed, as expected.

Applications
In this section, we provide two applications to real data sets to illustrate the importance of the EGGu distribution. The MLEs of the parameters are computed (as discussed in Section 5) and the goodness-of-fit statistics for this model are compared with those of other competing models. All computations were performed using the SAS subroutine NLMixed. The four-parameter Beta Gumbel (BGu) (Nadarajah & Kotz 2004) and Kumaraswamy Gumbel (KumGu) (Cordeiro et al. 2012) distributions are used to make a comparison with the EGGu model. Their pdfs are given by
$$f_{BGu}(x) = \frac{u}{\sigma B(a, b)}\exp(-a u)\left[1 - \exp(-u)\right]^{b-1}$$
and
$$f_{KumGu}(x) = \frac{a b\, u}{\sigma}\exp(-a u)\left[1 - \exp(-a u)\right]^{b-1},$$
respectively, where $u = \exp\{-(x-\mu)/\sigma\}$, a > 0 and b > 0 are shape parameters and B(·, ·) is the beta function. The first data set is obtained from Hinkley (1977). It consists of thirty successive values of March precipitation (in inches) in Minneapolis/St Paul. The data are: 0.77, 1.74, 0.81, 1.20, 1.95, 1.20, 0.47, 1.43, 3.37, 2.20, 3.00, 3.09, 1.51, 2.10, 0.52, 1.62, 1.31, 0.32, 0.59, 0.81, 2.81, 1.87, 1.18, 1.35, 4.75, 2.48, 0.96, 1.89, 0.90, 2.05. Table 2 gives some descriptive statistics for these data, including measures of central tendency and the variance, among others. Table 3 lists the MLEs of the model parameters (standard errors in parentheses) for all fitted models, together with the values of the Akaike information criterion (AIC), Bayesian information criterion (BIC) and consistent Akaike information criterion (CAIC). Plots of the estimated pdf and cdf of the EGGu, BGu and KumGu models fitted to these data are displayed in Figure 3. They indicate that the EGGu distribution is superior to the other distributions in terms of model fitting.
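The descriptive statistics for the precipitation data and the information criteria used to compare models are straightforward to compute. The sketch below uses the thirty values listed above and the usual definitions AIC = −2ℓ + 2k and BIC = −2ℓ + k log n, where ℓ is the maximized log-likelihood, k the number of parameters and n the sample size. The CAIC is omitted because its exact definition is not stated in the text, and the function names are illustrative.

```python
import math
import statistics

# March precipitation (in inches) in Minneapolis/St Paul, Hinkley (1977)
HINKLEY = [0.77, 1.74, 0.81, 1.20, 1.95, 1.20, 0.47, 1.43, 3.37, 2.20,
           3.00, 3.09, 1.51, 2.10, 0.52, 1.62, 1.31, 0.32, 0.59, 0.81,
           2.81, 1.87, 1.18, 1.35, 4.75, 2.48, 0.96, 1.89, 0.90, 2.05]

def describe(data):
    """Basic descriptive statistics (sample variance uses the n - 1 divisor)."""
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "variance": statistics.variance(data),
        "min": min(data),
        "max": max(data),
    }

def aic(loglik, k):
    """Akaike information criterion for maximized log-likelihood loglik."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian information criterion."""
    return -2.0 * loglik + k * math.log(n)

if __name__ == "__main__":
    for name, value in describe(HINKLEY).items():
        print(name, value)
```

Smaller AIC and BIC values indicate a better trade-off between fit and the number of parameters, which is how the fitted EGGu, BGu and KumGu models are ranked.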
Next, we apply formal goodness-of-fit tests in order to verify which distribution better fits the current data. We consider the Cramér-von Mises (CM) and Anderson-Darling (AD) statistics, which are described in Chen & Balakrishnan (1995). Table 4 gives the values of the CM and AD statistics (and the p-values of the tests in parentheses) for the fitted models. According to these formal tests, the EGGu model fits the current data better than the other models; that is, the null hypothesis that the data follow the EGGu distribution is clearly not rejected. Based on the plots of Figure 3, we conclude that the EGGu distribution provides a better fit to these data than the BGu and KumGu models. The second data set is given by Murthy, Xie & Jiang (2004). The data refer to the times between failures of a repairable item: 1.43, 0.11, 0.71, 0.77, 2.63, 1.49, 3.46, 2.46, 0.59, 0.74, 1.23, 0.94, 4.36, 0.40, 1.74, 4.73, 2.23, 0.45, 0.70, 1.06, 1.46, 0.30, 1.82, 2.37, 0.63, 1.23, 1.24, 1.97, 1.86, 1.17. Table 5 gives some descriptive statistics for these data. Table 6 gives the MLEs of the model parameters (standard errors in parentheses) for all fitted models and the values of the AIC, BIC and CAIC statistics. Plots of the estimated pdfs and cdfs of the EGGu, BGu and KumGu models fitted to the current data are displayed in Figure 4. Table 7 gives the values of the CM and AD statistics (p-values in parentheses). Again, according to these formal tests, the EGGu model fits the current data better than the other models; that is, the null hypothesis that the data follow the EGGu distribution is clearly not rejected.

Conclusions
In this paper, we study a new four-parameter model named the exponentiated generalized Gumbel (EGGu) distribution. This model generalizes the Gumbel distribution, which is one of the most important models for fitting data with support in R. We provide some mathematical properties of the EGGu distribution, including the ordinary moments, the moment generating and quantile functions, mean deviations, Bonferroni and Lorenz curves and the Rényi entropy. The density function of the order statistics is obtained as a linear combination of exponentiated Gumbel densities. We discuss parameter estimation by maximum likelihood and provide the observed information matrix. We provide a Monte Carlo simulation study to evaluate the maximum likelihood estimates of the model parameters. Two applications to real data indicate that the EGGu distribution provides a good fit and can be used as a competitive model to fit real data.

Figure 2: Plots of the EGGu skewness and kurtosis as functions of α for some values of β and as functions of β for some values of α.

Figure 3: (a) Plots of the fitted EGGu, BGu and KumGu densities; (b) Plots of the estimated cdfs of the EGGu, BGu and KumGu models.

Figure 4: (a) Plots of the fitted EGGu, BGu and KumGu densities; (b) Plots of the estimated cdfs of the EGGu, BGu and KumGu models.

Table 2: Descriptive statistics for Hinkley's data set.

Table 3: MLEs (and the corresponding standard errors in parentheses), AIC, BIC and CAIC statistics for Hinkley's data.
a Denotes the standard deviations of the MLEs of α, β, µ and σ.

Table 4: Goodness-of-fit tests.
a Denotes the p-value of the test.

Table 5: Descriptive statistics for the times between failures.

Table 6: MLEs (and the corresponding standard errors in parentheses) and the AIC, BIC and CAIC statistics for the times between failures.
a Denotes the standard deviations of the MLEs of α, β, µ and σ.
a Denotes the p-value of the test.