
THE TRANSMUTED HALF-NORMAL DISTRIBUTION WITH APPLICATION TO PRECIPITATION DATA

ABSTRACT

The Half-Normal distribution has been intensively extended in recent years. A review of the literature showed that at least 10 extensions of the Half-Normal distribution were introduced between 2008 and 2016. These extensions generalize the behavior of the density and hazard functions, which for the Half-Normal distribution are restricted to being monotonically decreasing and monotonically increasing, respectively. In this paper we propose a new extension, called the transmuted Half-Normal distribution, using the quadratic rank transmutation map introduced by Shaw & Buckley (2009). A comprehensive account of the mathematical properties of the new distribution is presented. We provide explicit expressions for the moments, moment-generating function, Shannon's entropy, mean deviations, Bonferroni and Lorenz curves, order statistics, and reliability. The parameters are estimated by the maximum likelihood method. The bias and accuracy of the estimators are assessed by Monte Carlo simulations. The proposed distribution allows us to incorporate covariates directly in the mean and consequently to quantify their influence on the average of the response variable. Experiments with two real data sets show its usefulness and its value as a good alternative to several extensions of the Half-Normal distribution in data modeling with and without covariates.

Keywords:
Half-Normal distribution; moments; parametric estimation; precipitation data; transmutation

1 INTRODUCTION

Underlying any parametric inference procedure is a probability distribution used to describe the behavior of a random variable in the population. In recent years, a very large number of probability distributions, mostly supported on the positive real numbers, have been proposed in the literature. Many strategies can be used to generate or extend a probability distribution. Most of these strategies add one or more parameters to some baseline distribution (Normal, Gumbel, Exponential, Weibull, Gamma, Log-Normal, among many others). In general, introducing one or more parameters brings greater flexibility to the behavior of the density and hazard functions. Recent surveys of the main methods used to extend a baseline distribution are available, for example, in Nadarajah & Rocha (2016), Tahir & Nadarajah (2015), de Brito et al. (2015), Aljarrah et al. (2014), Lee et al. (2013), Lai (2011) and Gupta & Kundu (2009).

It is important to emphasize that transforming a random variable Z into another random variable X, in the form X = g(Z), is the simplest way to generate or extend a baseline probability distribution. An example of a distribution obtained by transformation is the one-parameter Half-Normal (HN) distribution, which results from the transformation X = θ|Z|, where Z has the standard Normal distribution and θ > 0 is a scale parameter.

It should be mentioned that the Half-Normal distribution is a special case of the Nakagami-m distribution introduced by Nakagami (1960). In recent years several extensions of the Half-Normal distribution have been proposed. These include: the general Half-Normal distribution (Pewsey, 2002, 2004), the generalized Half-Normal distribution (Cooray & Ananda, 2008), the Beta (log-Beta) generalized Half-Normal distribution (Pescim et al., 2010; Cordeiro et al., 2013; Pescim et al., 2013), the Kumaraswamy generalized Half-Normal distribution (Cordeiro et al., 2012), the Beta generalized Half-Normal geometric distribution (Ramires et al., 2013), the extension of the generalized Half-Normal distribution (Olmos et al., 2014), the Power Half-Normal distribution (Gómez & Bolfarine, 2015), the extended generalized Half-Normal distribution (Sanchez et al., 2016), the Odd Log-Logistic generalized Half-Normal distribution (Cordeiro et al., 2017) and the transmuted generalized Gamma distribution (Saboor et al., 2019).

This article introduces the transmuted Half-Normal distribution (THN), derived from the HN distribution. The extension of a baseline distribution via transmutation was proposed by Shaw & Buckley (2009). Tahir & Cordeiro (2016) have enumerated 51 distributions extended by transmutation. In Section 2 we present the HN distribution used as the baseline distribution to be extended. Its transmuted version is presented in Section 3. Various statistical and reliability properties of the THN are explored and discussed in Section 4. A few characterizations are considered in Section 5. Estimation by the maximum likelihood method is presented in Section 6. In Section 7, Monte Carlo simulations are conducted to investigate the bias and accuracy of the maximum likelihood estimators. Two applications of the proposed distribution are presented in Section 8. Section 9 closes the paper with some concluding remarks.

2 THE HALF-NORMAL DISTRIBUTION

If a nonnegative random variable X has HN distribution with scale parameter θ>0, then the probability density function (p.d.f.) and the cumulative distribution function (c.d.f.) are written, respectively, as:

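$$g(x \mid \theta) = \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right) \quad \text{and} \quad G(x \mid \theta) = 2\,\Phi\!\left(\frac{x}{\theta}\right) - 1, \tag{1}$$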

where ϕ(·) and Φ(·) denote, respectively, the p.d.f. and c.d.f. of the standard Normal distribution. The corresponding hazard rate function (h.r.f.) is written as:

$$h(x \mid \theta) = \frac{1}{\theta}\,\phi\!\left(\frac{x}{\theta}\right)\left[\Phi\!\left(-\frac{x}{\theta}\right)\right]^{-1},$$

which is monotonically increasing for every θ. It is worth remembering that the p.d.f. is monotonically decreasing for all θ.

Formally, if a random variable Z is normally distributed with mean zero and variance one, then the HN distribution is the distribution of X = θ|Z|. For Z normally distributed with mean µ and variance one, the transformation X = θ|Z| leads to the folded-Normal distribution (Leone et al., 1961; Tsagris et al., 2014). The folded-Normal distribution arises from the Normal distribution whenever the sign of the measured variable is unknown, lost or not relevant. In other words, it is used when one is interested in the size of the measured variable and not in its direction or sign (Chou & Liu, 1998). It is clear that the HN distribution is a particular case of the folded-Normal distribution when μ = 0, which also coincides with the Normal distribution truncated at zero (Nadarajah & Kotz, 2006b). In addition, the expressions in (1) can be obtained as a particular case of the generalized Rayleigh distribution (Vodã, 1976a,b), of the generalized Gamma distribution (Stacy, 1962) and as the distribution of the square root of a chi-square random variable with one degree of freedom (Johnson et al., 1994).

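For X ~ HN(θ) the k-th moment of X is $E(X^k) = \frac{1}{\sqrt{\pi}}\,2^{k/2}\,\theta^{k}\,\Gamma\!\left(\frac{k+1}{2}\right)$, such that for k = 1 we have $E(X) = \sqrt{2/\pi}\,\theta$ and for k = 2, $E(X^2) = \theta^2$. The third and fourth standardized moments (asymmetry and kurtosis, respectively) do not depend on θ and are written as:

$$\beta_1 = \frac{\sqrt{2}\,(4 - \pi)}{(\pi - 2)^{3/2}} = 0.9952 \quad \text{and} \quad \beta_2 = \frac{3\pi^2 - 4\pi - 12}{(\pi - 2)^{2}} = 3.8691.$$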

The quantile function can be written in the form:

$$Q(p \mid \theta) = G^{-1}(p \mid \theta) = \theta\,\Phi^{-1}\!\left(\frac{p + 1}{2}\right), \tag{2}$$

where 0 < p < 1 and Φ^{-1}(·) is the quantile function of the standard Normal distribution. Note that (2) can be used to generate pseudo-random values of X. An alternative strategy can be found in Singh (1994), as a consequence of the application of the Box-Muller transformation (Singh, 1995).
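As a minimal sketch of the inverse-transform sampler implied by (2), the following Python code draws pseudo-random HN(θ) values using only the standard library. The function names `hn_quantile` and `hn_sample` are illustrative assumptions and are not taken from the original paper.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution (cdf, pdf, inv_cdf)


def hn_quantile(p, theta):
    """Quantile function (2): Q(p | theta) = theta * Phi^{-1}((p + 1) / 2)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    return theta * _STD.inv_cdf((p + 1.0) / 2.0)


def hn_sample(n, theta, seed=None):
    """Draw n pseudo-random HN(theta) values by inverse transform."""
    rng = random.Random(seed)
    values = []
    while len(values) < n:
        u = rng.random()
        if u > 0.0:  # random() can return exactly 0.0, which (2) excludes
            values.append(hn_quantile(u, theta))
    return values
```

For large samples the sample mean should be close to $\sqrt{2/\pi}\,\theta$, which gives a quick sanity check of the sampler.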

3 THE TRANSMUTED HALF-NORMAL DISTRIBUTION

Motivated by the need for more versatile density and hazard functions, Shaw & Buckley (2009) proposed a strategy that has proved useful in extending a baseline distribution. The distribution obtained by this process is called the transmuted-baseline distribution, for example, the transmuted Weibull distribution. In this section we use the transmutation procedure to introduce the transmuted HN distribution (THN).

Following the strategy proposed by Shaw & Buckley (2009), a random variable X is said to have a transmuted distribution if its c.d.f. is written in the form:

$$F(x \mid \theta, \lambda) = (1 + \lambda)\,G(x \mid \theta) - \lambda\,G(x \mid \theta)^{2}, \qquad |\lambda| \le 1.$$

Consequently:

$$f(x \mid \theta, \lambda) = g(x \mid \theta)\,\big[(1 + \lambda) - 2\lambda\,G(x \mid \theta)\big],$$

where G(x | θ) and g(x | θ) are, respectively, the c.d.f. and p.d.f. of the baseline distribution, indexed by a parameter vector θ. For λ = 0, we have the distribution of the baseline random variable as a particular case.

Therefore, a non-negative random variable X has THN distribution, with scale parameter θ > 0 and −1 ≤ λ ≤ 1, if its p.d.f. and c.d.f. are given, respectively, by

$$f(x \mid \theta, \lambda) = \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]\right\} \tag{3}$$

and

$$F(x \mid \theta, \lambda) = (1 + \lambda)\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] - \lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]^{2}. \tag{4}$$
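The expressions (3) and (4) translate directly into code. The sketch below is only an illustration using the Python standard library; the names `thn_pdf` and `thn_cdf` are our own assumptions.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_pdf(x, theta, lam):
    """Density (3) of THN(theta, lam); zero for negative x."""
    if x < 0.0:
        return 0.0
    g_cdf = 2.0 * _STD.cdf(x / theta) - 1.0          # baseline HN c.d.f. in (1)
    g_pdf = (2.0 / theta) * _STD.pdf(x / theta)      # baseline HN p.d.f. in (1)
    return g_pdf * (1.0 + lam - 2.0 * lam * g_cdf)


def thn_cdf(x, theta, lam):
    """Distribution function (4) of THN(theta, lam)."""
    if x <= 0.0:
        return 0.0
    g_cdf = 2.0 * _STD.cdf(x / theta) - 1.0
    return (1.0 + lam) * g_cdf - lam * g_cdf * g_cdf
```

Setting `lam = 0` recovers the HN(θ) density and distribution function, in agreement with the remark that the baseline distribution is a particular case.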

Figure 1 illustrates how the parameter λ influences the behavior of (3). Since θ is the scale parameter it is set to θ=1.

Figure 1
Behavior of the p.d.f. of the THN distribution.

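It can be checked that the c.d.f. F(x | θ, λ) of THN(θ, λ) is a convex combination (finite mixture) of the c.d.f.s of the minimum and the maximum of two i.i.d. HN(θ) random variables, by writing F(x | θ, λ) as

$$F(x \mid \theta, \lambda) = \frac{1 + \lambda}{2}\left\{2\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] - \left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]^{2}\right\} + \frac{1 - \lambda}{2}\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]^{2}.$$

Consequently, the following result holds:

$$\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]^{2} \le F(x \mid \theta, \lambda) \le 2\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] - \left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]^{2}.$$

Therefore a THN(θ, λ) distribution is stochastically larger (smaller) than the distribution of the minimum (maximum) of two i.i.d. HN(θ) random variables.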

4 STATISTICAL AND RELIABILITY PROPERTIES

In this section, we provide some important statistical and reliability properties of the THN distribution.

4.1 Survival, hazard rate and residual life functions

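The survival function S(x | θ, λ) describes the probability of an item or individual surviving beyond time x. For the THN distribution we have:

$$S(x \mid \theta, \lambda) = 1 - (1 + \lambda)\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] + \lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]^{2}. \tag{5}$$

The hazard rate function (h.r.f.) specifies the instantaneous rate of death or failure at time x, given that the individual has survived up to x. Mathematically, h(x | θ, λ) = f(x | θ, λ) / S(x | θ, λ); consequently, for the THN distribution,

$$h(x \mid \theta, \lambda) = \frac{\dfrac{2}{\theta}\,\phi\!\left(\dfrac{x}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\dfrac{x}{\theta}\right) - 1\right]\right\}}{1 - (1 + \lambda)\left[2\,\Phi\!\left(\dfrac{x}{\theta}\right) - 1\right] + \lambda\left[2\,\Phi\!\left(\dfrac{x}{\theta}\right) - 1\right]^{2}}. \tag{6}$$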

Figure 2 illustrates how the parameter λ influences the behavior of (6). Since θ is the scale parameter it is set to θ=1.

Figure 2
Behavior of the h.r.f. of the THN distribution.

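The h.r.f. of THN(θ, λ) can be written in terms of that of HN(θ) as

$$h(x \mid \theta, \lambda) = \frac{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]}{1 - \lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]} \cdot \frac{\frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right)}{2 - 2\,\Phi\!\left(\frac{x}{\theta}\right)} = \frac{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]}{1 - \lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]}\; h_{HN}(x \mid \theta),$$

where $h_{HN}(x \mid \theta)$ is the hazard function of the HN(θ) distribution. This follows because the survival function (5) factorizes as $S(x \mid \theta, \lambda) = [1 - G(x \mid \theta)]\,[1 - \lambda\,G(x \mid \theta)]$.

Using the above relation we can see that for all 0 ≤ λ ≤ 1

$$h_{HN}(x \mid \theta) \le h(x \mid \theta, \lambda) \le (1 + \lambda)\,h_{HN}(x \mid \theta)$$

and for all −1 ≤ λ ≤ 0

$$(1 + \lambda)\,h_{HN}(x \mid \theta) \le h(x \mid \theta, \lambda) \le h_{HN}(x \mid \theta).$$

That is, $\text{HN} \ge_{hr} \text{THN}$ for all 0 ≤ λ ≤ 1 and $\text{HN} \le_{hr} \text{THN}$ for all −1 ≤ λ ≤ 0, where $\le_{hr}$ denotes the hazard rate order.

In fact it can be shown that $\text{THN} \le_{lr} \text{HN}$ for all 0 ≤ λ ≤ 1 and $\text{THN} \ge_{lr} \text{HN}$ for all −1 ≤ λ ≤ 0, where $\le_{lr}$ denotes the likelihood ratio order.

Moreover, if X ~ THN(θ, λ1) and Y ~ THN(θ, λ2), then $X \le_{lr} Y$ if λ2 < λ1.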
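The relation between the two hazard functions can be checked numerically. The following sketch (standard library only; all function names are illustrative assumptions) evaluates the THN hazard both directly as f/S and through the multiplicative factor applied to the HN hazard.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def hn_hazard(x, theta):
    """Hazard of HN(theta): (1/theta) * phi(x/theta) / (1 - Phi(x/theta))."""
    z = x / theta
    return _STD.pdf(z) / (theta * (1.0 - _STD.cdf(z)))


def thn_hazard(x, theta, lam):
    """Hazard (6), computed directly as density (3) over survival (5)."""
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    dens = (2.0 / theta) * _STD.pdf(x / theta) * (1.0 + lam - 2.0 * lam * g)
    surv = 1.0 - (1.0 + lam) * g + lam * g * g
    return dens / surv


def thn_hazard_via_hn(x, theta, lam):
    """Same hazard, written as a multiplicative factor times the HN hazard."""
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    factor = (1.0 + lam - 2.0 * lam * g) / (1.0 - lam * g)
    return factor * hn_hazard(x, theta)
```

Both routes should agree up to floating-point error for moderate values of x; far in the tail the direct ratio loses precision because the survival function is computed by subtraction.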

The mean residual life for individuals at age t is the average remaining lifetime and corresponds to the ratio between the area under the survival curve to the right of t and S(t | θ, λ). From (5) we have:

$$m(t \mid \theta, \lambda) = E(X - t \mid X > t) = \frac{1}{S(t \mid \theta, \lambda)} \int_{t}^{\infty} S(x \mid \theta, \lambda)\,dx. \tag{7}$$

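Theorem 4.1. The mean residual life of the THN distribution is given by:

$$m(t \mid \theta, \lambda) = \frac{\pi^{-1/2}}{S(t \mid \theta, \lambda)} \left\{ -2^{3/2}\,\theta\lambda\, e^{-\frac{t^2}{2\theta^2}}\,\gamma(t) + (\lambda + 1)\left[\theta\sqrt{2}\, e^{-\frac{t^2}{2\theta^2}} + 2t\sqrt{\pi}\left(\Phi\!\left(\frac{t}{\theta}\right) - 1\right)\right] + 4\theta\lambda\left[\Phi\!\left(\frac{t\sqrt{2}}{\theta}\right) - 1\right] + \lambda t \sqrt{\pi}\,\big[1 - \gamma(t)^{2}\big] \right\},$$

where $\gamma(t) = 2\,\Phi\!\left(\frac{t}{\theta}\right) - 1$.

Proof. Replacing equation (5) in (7), we obtain

$$m(t \mid \theta, \lambda) = \frac{1}{S(t \mid \theta, \lambda)} \int_{t}^{\infty} \left\{1 - (1 + \lambda)\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] + \lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]^{2}\right\} dx.$$

Solving the integral, we obtain:

$$m(t \mid \theta, \lambda) = \frac{\pi^{-1/2}}{S(t \mid \theta, \lambda)} \Big\{ -2^{3/2}\,\theta\lambda\, e^{-\frac{t^2}{2\theta^2}}\left[2\,\Phi\!\left(\tfrac{t}{\theta}\right) - 1\right] + 2\theta\lambda\left[2\,\Phi\!\left(\tfrac{t\sqrt{2}}{\theta}\right) - 1\right] - \lambda t\sqrt{\pi}\left[2\,\Phi\!\left(\tfrac{t}{\theta}\right) - 1\right]^{2} - t\sqrt{\pi} + t\sqrt{\pi}\left[2\,\Phi\!\left(\tfrac{t}{\theta}\right) - 1\right] + \lambda t\sqrt{\pi}\left[2\,\Phi\!\left(\tfrac{t}{\theta}\right) - 1\right] + \lambda\theta\sqrt{2}\, e^{-\frac{t^2}{2\theta^2}} + \theta\sqrt{2}\, e^{-\frac{t^2}{2\theta^2}} - 2\lambda\theta \Big\}.$$

After algebraic manipulation we arrive at the following expression:

$$m(t \mid \theta, \lambda) = \frac{\pi^{-1/2}}{S(t \mid \theta, \lambda)} \left\{ -2^{3/2}\,\theta\lambda\, e^{-\frac{t^2}{2\theta^2}}\left[2\,\Phi\!\left(\tfrac{t}{\theta}\right) - 1\right] + (\lambda + 1)\left[\theta\sqrt{2}\, e^{-\frac{t^2}{2\theta^2}} + 2t\sqrt{\pi}\left(\Phi\!\left(\tfrac{t}{\theta}\right) - 1\right)\right] + 4\theta\lambda\left[\Phi\!\left(\tfrac{t\sqrt{2}}{\theta}\right) - 1\right] + \lambda t\sqrt{\pi}\left(1 - \left[2\,\Phi\!\left(\tfrac{t}{\theta}\right) - 1\right]^{2}\right) \right\}.$$

Replacing $2\,\Phi\!\left(\frac{t}{\theta}\right) - 1 = \gamma(t)$, we get the result. As a check, for t = 0 the expression reduces to $m(0 \mid \theta, \lambda) = \frac{\theta}{\sqrt{\pi}}\big[(1 + \lambda)\sqrt{2} - 2\lambda\big] = E(X)$. □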
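The closed form for the mean residual life can be compared with a direct numerical evaluation of definition (7). The sketch below is an illustration only: it uses the standard library, a composite Simpson rule truncated at t + 12θ (an assumption that is harmless because the survival function is negligible there), and function names of our own choosing. The closed form is written with the signs obtained by integrating (5) term by term.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_survival(x, theta, lam):
    """Survival function (5)."""
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return 1.0 - (1.0 + lam) * g + lam * g * g


def simpson(func, a, b, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def mrl_numeric(t, theta, lam):
    """Definition (7), with the upper limit truncated at t + 12 * theta."""
    area = simpson(lambda x: thn_survival(x, theta, lam), t, t + 12.0 * theta)
    return area / thn_survival(t, theta, lam)


def mrl_closed(t, theta, lam):
    """Closed-form mean residual life of THN(theta, lam)."""
    gam = 2.0 * _STD.cdf(t / theta) - 1.0
    e = math.exp(-t * t / (2.0 * theta * theta))
    rp = math.sqrt(math.pi)
    inner = (
        -(2.0 ** 1.5) * theta * lam * e * gam
        + (lam + 1.0) * (theta * math.sqrt(2.0) * e
                         + 2.0 * t * rp * (_STD.cdf(t / theta) - 1.0))
        + 4.0 * theta * lam * (_STD.cdf(t * math.sqrt(2.0) / theta) - 1.0)
        + lam * t * rp * (1.0 - gam * gam)
    )
    return inner / (rp * thn_survival(t, theta, lam))
```

At t = 0 both functions should return the mean of the distribution.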

4.2 Asymptotic behavior of the tails

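The behavior of THN(θ, λ) in the tails, for x → 0 and for x → ∞, can be stated respectively as follows:

$$f(x \mid \theta, \lambda) \sim (1 + \lambda)\,\frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right), \qquad F(x \mid \theta, \lambda) \sim (1 + \lambda)\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right], \qquad h(x \mid \theta, \lambda) \sim (1 + \lambda)\,h_{HN}(x \mid \theta),$$

and

$$f(x \mid \theta, \lambda) \sim (1 - \lambda)\,\frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right), \qquad F(x \mid \theta, \lambda) \sim (1 + \lambda)\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] - \lambda, \qquad h(x \mid \theta, \lambda) \sim h_{HN}(x \mid \theta).$$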

4.3 Quantile function

The p-th quantile of THN(θ, λ) is given by

$$Q(p \mid \theta, \lambda) = \theta\,\Phi^{-1}\!\left(\frac{1 + 3\lambda - \sqrt{(1 + \lambda)^{2} - 4\lambda p}}{4\lambda}\right).$$
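The quantile function gives an inverse-transform sampler for the THN distribution. The sketch below (standard library only, illustrative function names) solves the quadratic $(1 + \lambda)G - \lambda G^{2} = p$ in the algebraically equivalent form $G = 2p / \big(1 + \lambda + \sqrt{(1 + \lambda)^{2} - 4\lambda p}\big)$, which is a choice made here so that the case λ = 0 needs no special treatment; the argument of Φ^{-1} is then (G + 1)/2, which equals the expression in the quantile formula for λ ≠ 0.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_cdf(x, theta, lam):
    """Distribution function (4)."""
    if x <= 0.0:
        return 0.0
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return (1.0 + lam) * g - lam * g * g


def thn_quantile(p, theta, lam):
    """p-th quantile of THN(theta, lam), 0 < p < 1 and -1 <= lam <= 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    root = math.sqrt((1.0 + lam) ** 2 - 4.0 * lam * p)
    g = 2.0 * p / (1.0 + lam + root)   # value of the baseline c.d.f. at the quantile
    return theta * _STD.inv_cdf((g + 1.0) / 2.0)


def thn_sample(n, theta, lam, seed=None):
    """Draw n pseudo-random THN(theta, lam) values by inverse transform."""
    rng = random.Random(seed)
    values = []
    while len(values) < n:
        u = rng.random()
        if u > 0.0:  # random() can return exactly 0.0
            values.append(thn_quantile(u, theta, lam))
    return values
```

A simple consistency check is that `thn_cdf(thn_quantile(p, theta, lam), theta, lam)` returns p.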

4.4 Moments and associated measures

Moments are measures capable of characterizing a probability distribution, with the first four moments indicative of central tendency, dispersion, asymmetry and kurtosis in that order. Unlike many extensions of the HN distribution, the k-th moment of the THN distribution can be obtained analytically.

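Theorem 4.2. If X has THN distribution then the k-th moment can be written as follows:

$$E(X^{k}) = \theta^{k}\, 2^{k/2}\, \pi^{-3/2} \left[ (\lambda + 1)\,\pi\,\Gamma\!\left(\frac{k + 1}{2}\right) - 2\lambda k \sqrt{\pi}\; H_{geo}\!\left(\frac{1}{2}, \frac{k}{2} + 1; \frac{3}{2}; -1\right) \Gamma\!\left(\frac{k}{2}\right) \right],$$

where $H_{geo}$ denotes the hypergeometric function (Feller, 1968).

Proof. From the definition of the k-th moment of a continuous random variable:

$$E(X^{k}) = \int_{0}^{\infty} x^{k}\, \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]\right\} dx = (1 + \lambda) \int_{0}^{\infty} \frac{2}{\theta}\, x^{k}\, \phi\!\left(\frac{x}{\theta}\right) dx - 2\lambda \int_{0}^{\infty} \frac{2}{\theta}\, x^{k}\, \phi\!\left(\frac{x}{\theta}\right)\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] dx,$$

which can be written as:

$$E(X^{k}) = \theta^{k}\, 2^{k/2}\, \pi^{-3/2} \left[ (\lambda + 1)\,\pi\,\Gamma\!\left(\frac{k + 1}{2}\right) - 2\lambda k \sqrt{\pi}\; H_{geo}\!\left(\frac{1}{2}, \frac{k}{2} + 1; \frac{3}{2}; -1\right) \Gamma\!\left(\frac{k}{2}\right) \right]. \tag{8}$$

From (8) we have:

$$E(X) = \frac{\theta}{\sqrt{\pi}}\left[(1 + \lambda)\sqrt{2} - 2\lambda\right]. \tag{9}$$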

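The expressions for the coefficient of variation, skewness and kurtosis are obtained from the following relations:

$$CV(X) = \frac{\sqrt{Var(X)}}{E(X)} = \frac{\left[\pi - 2 + (4\sqrt{2} - 6)(\lambda + \lambda^{2})\right]^{1/2}}{\sqrt{2}\,(\lambda + 1) - 2\lambda},$$

$$Skw(X) = \frac{E\big[(X - E(X))^{3}\big]}{Var(X)^{3/2}} = \left[\sqrt{2}\,(4 - \pi) + \lambda\left(\sqrt{2}\,(18 - \pi) + \pi - 24\right) + \lambda^{2}\left(42\sqrt{2} - 60\right) + \lambda^{3}\left(28\sqrt{2} - 40\right)\right] \times \left[\pi - 2 + (4\sqrt{2} - 6)(\lambda + \lambda^{2})\right]^{-3/2}$$

and

$$Kur(X) = \frac{E\big[(X - E(X))^{4}\big]}{Var(X)^{2}} = \left[3\pi^{2} - 4\pi - 12 + \lambda\left(4\pi(3\sqrt{2} - 4) + 48\sqrt{2} - 72\right) + \lambda^{2}\left(4\pi(3\sqrt{2} - 5) + 192\sqrt{2} - 264\right) + \lambda^{3}\left(288\sqrt{2} - 408\right) + \lambda^{4}\left(144\sqrt{2} - 204\right)\right] \times \left[\pi - 2 + (4\sqrt{2} - 6)(\lambda + \lambda^{2})\right]^{-2}.$$

None of these three measures depends on θ, and for λ = 0 they reduce to the corresponding HN values.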
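The moment expressions of this subsection can be checked by numerical integration of the density (3). The sketch below is illustrative only: it relies on the standard library, a composite Simpson rule truncated at 12θ (an assumption, harmless because the density is negligible beyond that point), and function names chosen here. It compares the closed-form mean (9) with the numerically integrated first moment.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_pdf(x, theta, lam):
    """Density (3)."""
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return (2.0 / theta) * _STD.pdf(x / theta) * (1.0 + lam - 2.0 * lam * g)


def simpson(func, a, b, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def thn_raw_moment_numeric(k, theta, lam):
    """k-th raw moment by quadrature over [0, 12 * theta]."""
    return simpson(lambda x: x ** k * thn_pdf(x, theta, lam), 0.0, 12.0 * theta)


def thn_mean_closed(theta, lam):
    """Mean (9): theta / sqrt(pi) * [(1 + lam) * sqrt(2) - 2 * lam]."""
    return theta / math.sqrt(math.pi) * ((1.0 + lam) * math.sqrt(2.0) - 2.0 * lam)
```

The same quadrature routine can be reused with k = 2, 3, 4 to evaluate the coefficient of variation, skewness and kurtosis numerically for given θ and λ.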

4.5 Moment generating function

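Theorem 4.3. If X has THN distribution then the moment generating function of X, $M_X(t)$, is given by:

$$M_X(t) = -2\lambda \sum_{n=0}^{\infty} \frac{t^{n}}{n!}\; H_{geo}\!\left(\frac{1}{2}, \frac{n}{2} + 1; \frac{3}{2}; -1\right) \Gamma\!\left(\frac{n}{2} + 1\right) \frac{\theta^{n}\, 2^{n/2 + 1}}{\pi} + (1 + \lambda) \sum_{n=0}^{\infty} \frac{t^{n}}{n!}\; \Gamma\!\left(\frac{n + 1}{2}\right) \frac{\theta^{n}\, 2^{n/2}}{\sqrt{\pi}}.$$

Proof. Let X be a random variable with THN distribution. Then the moment generating function of X is given by:

$$M_X(t) = \int_{0}^{\infty} e^{xt}\, \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]\right\} dx. \tag{10}$$

From the Taylor series expansion of the function $e^{tx}$, (10) can be written in the form:

$$M_X(t) = (1 + \lambda) \sum_{n=0}^{\infty} \frac{t^{n}}{n!} \int_{0}^{\infty} \frac{2}{\theta}\, x^{n}\, \phi\!\left(\frac{x}{\theta}\right) dx - 2\lambda \sum_{n=0}^{\infty} \frac{t^{n}}{n!} \int_{0}^{\infty} \frac{2}{\theta}\, x^{n}\, \phi\!\left(\frac{x}{\theta}\right)\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] dx.$$

Solving the integrals, we get:

$$M_X(t) = -2\lambda \sum_{n=0}^{\infty} \frac{t^{n}}{n!}\; H_{geo}\!\left(\frac{1}{2}, \frac{n}{2} + 1; \frac{3}{2}; -1\right) \Gamma\!\left(\frac{n}{2} + 1\right) \frac{\theta^{n}\, 2^{n/2 + 1}}{\pi} + (1 + \lambda) \sum_{n=0}^{\infty} \frac{t^{n}}{n!}\; \Gamma\!\left(\frac{n + 1}{2}\right) \frac{\theta^{n}\, 2^{n/2}}{\sqrt{\pi}}. \tag{11}$$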

4.6 Differential entropy

Here we investigate the differential entropy of a continuous random variable. Entropy is a measure of uncertainty, and a large entropy value indicates greater uncertainty in the data. One of the most popular measures is the Shannon entropy (Shannon, 1951).

The concept of Shannon’s entropy refers to the uncertainty of a probability distribution. An important fact is that the entropy Sh is not a function of the random variable X , but rather of the probability distribution of that variable.

Definition 4.1. The differential entropy ℋ Sh of a continuous random variable X with a probability density function f (x) is defined as

$$\mathcal{H}_{Sh}(X) = -\int_{S} f(x) \log f(x)\, dx,$$

where S is the support set of the random variable.

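Theorem 4.4. The Shannon entropy of a continuous random variable X with THN distribution is given by:

$$\mathcal{H}_{Sh}(X) = -\log\!\left(\frac{2}{\theta}\right) + \frac{\pi\left[1 + \log(2\pi)\right] - 2\lambda}{2\pi} + \frac{2\lambda\left\{1 - \log\left[(1 + \lambda)(1 - \lambda)\right]\right\} - (\lambda^{2} + 1)\log\!\left(\dfrac{1 + \lambda}{1 - \lambda}\right)}{4\lambda}.$$

Proof. Writing f(x | θ, λ) for the density in (3), it follows directly that

$$\mathcal{H}_{Sh}(X) = -\int_{0}^{\infty} f(x \mid \theta, \lambda)\, \log\!\left[\frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]\right\}\right] dx.$$

Using the properties of the logarithmic function we have:

$$\mathcal{H}_{Sh}(X) = -\int_{0}^{\infty} f(x \mid \theta, \lambda) \left[\log\!\left(\frac{2}{\theta}\right) + \log \phi\!\left(\frac{x}{\theta}\right) + \log\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]\right\}\right] dx.$$

Since the functions in question are all integrable, we have

$$\mathcal{H}_{Sh}(X) = -\int_{0}^{\infty} \log\!\left(\frac{2}{\theta}\right) f(x \mid \theta, \lambda)\, dx - \int_{0}^{\infty} \log \phi\!\left(\frac{x}{\theta}\right) f(x \mid \theta, \lambda)\, dx - \int_{0}^{\infty} \log\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]\right\} f(x \mid \theta, \lambda)\, dx.$$

Solving the integrals, we have that the Shannon entropy of the THN distribution is given by:

$$\mathcal{H}_{Sh}(X) = -\log\!\left(\frac{2}{\theta}\right) + \frac{\pi\left[1 + \log(2\pi)\right] - 2\lambda}{2\pi} + \frac{2\lambda\left\{1 - \log\left[(1 + \lambda)(1 - \lambda)\right]\right\} - (\lambda^{2} + 1)\log\!\left(\dfrac{1 + \lambda}{1 - \lambda}\right)}{4\lambda}.$$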
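The entropy expression can be verified against the definition by numerical integration. The following sketch is an illustration under stated assumptions: it uses the standard library, a composite Simpson rule truncated at 12θ, and function names chosen here. The closed form is coded as $-\log(2/\theta) + \frac{1}{2}\log(2\pi) + 1 - \frac{\lambda}{\pi} - \frac{(1+\lambda)^2\log(1+\lambda) - (1-\lambda)^2\log(1-\lambda)}{4\lambda}$, which is our own algebraic rearrangement of the three integrals in the proof, and is evaluated only for λ ≠ 0 and |λ| < 1.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_pdf(x, theta, lam):
    """Density (3)."""
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return (2.0 / theta) * _STD.pdf(x / theta) * (1.0 + lam - 2.0 * lam * g)


def simpson(func, a, b, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def thn_entropy_numeric(theta, lam):
    """Shannon entropy from the definition, truncated at 12 * theta."""
    def integrand(x):
        f = thn_pdf(x, theta, lam)
        return -f * math.log(f) if f > 0.0 else 0.0
    return simpson(integrand, 0.0, 12.0 * theta)


def thn_entropy_closed(theta, lam):
    """Closed-form Shannon entropy, for lam != 0 and |lam| < 1."""
    if lam == 0.0 or abs(lam) >= 1.0:
        raise ValueError("this sketch covers 0 < |lam| < 1 only")
    log_term = ((1.0 + lam) ** 2 * math.log(1.0 + lam)
                - (1.0 - lam) ** 2 * math.log(1.0 - lam)) / (4.0 * lam)
    return (-math.log(2.0 / theta) + 0.5 * math.log(2.0 * math.pi)
            + 1.0 - lam / math.pi - log_term)
```

As λ → 0 the closed form tends to $\log\theta + \frac{1}{2}\log(\pi/2) + \frac{1}{2}$, the entropy of the HN(θ) distribution.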

4.7 Mean deviations

The amount of dispersion in a population can be measured by deviations around the mean and median, defined by:

$$\delta_1 = \int_{0}^{\infty} |x - \mu|\, f(x)\, dx \quad \text{and} \quad \delta_2 = \int_{0}^{\infty} |x - M|\, f(x)\, dx,$$

where µ and M denote the mean and median, respectively.

To calculate these measures we can use the following relations presented in Nadarajah & Kotz (2006a):

$$\delta_1 = 2\mu F(\mu) - 2\int_{0}^{\mu} x f(x)\, dx \quad \text{and} \quad \delta_2 = \mu + 2 M F(M) - M - 2\int_{0}^{M} x f(x)\, dx,$$

where F(µ) and F(M) can be calculated according to equation (4). Taking f(x) as the p.d.f. of the THN distribution, we have:

$$\int_{0}^{\mu} x f(x \mid \theta, \lambda)\, dx = \frac{\theta}{\sqrt{\pi}}\left\{\sqrt{2}\,(1 + \lambda) - 2\lambda\left[2\,\Phi\!\left(\frac{\mu\sqrt{2}}{\theta}\right) - 1\right]\right\} - \theta^{2}\, \frac{2}{\theta}\,\phi\!\left(\frac{\mu}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{\mu}{\theta}\right) - 1\right]\right\}.$$

In an analogous way we obtain $\int_{0}^{M} x f(x \mid \theta, \lambda)\, dx$.
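The mean deviation about the mean can be evaluated from the incomplete first moment. The sketch below (standard library only, illustrative names) codes the incomplete first moment with the signs that make it vanish at zero and converge to E(X) as the upper limit grows, which is how we read the expression above, and then applies the relation of Nadarajah & Kotz (2006a).

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_cdf(x, theta, lam):
    """Distribution function (4)."""
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return (1.0 + lam) * g - lam * g * g


def thn_mean(theta, lam):
    """Mean (9)."""
    return theta / math.sqrt(math.pi) * ((1.0 + lam) * math.sqrt(2.0) - 2.0 * lam)


def thn_partial_first_moment(q, theta, lam):
    """Integral of x * f(x | theta, lam) over [0, q]."""
    z = q / theta
    g = 2.0 * _STD.cdf(z) - 1.0
    g_root2 = 2.0 * _STD.cdf(z * math.sqrt(2.0)) - 1.0
    first = theta / math.sqrt(math.pi) * (math.sqrt(2.0) * (1.0 + lam) - 2.0 * lam * g_root2)
    second = 2.0 * theta * _STD.pdf(z) * (1.0 + lam - 2.0 * lam * g)
    return first - second


def thn_mean_deviation_about_mean(theta, lam):
    """delta_1 = 2 * mu * F(mu) - 2 * integral_0^mu x f(x) dx."""
    mu = thn_mean(theta, lam)
    return 2.0 * mu * thn_cdf(mu, theta, lam) - 2.0 * thn_partial_first_moment(mu, theta, lam)
```

The deviation about the median follows in the same way once M is obtained from the quantile function with p = 1/2.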

4.8 Bonferroni and Lorenz curves

The Bonferroni and Lorenz curves, proposed by Bonferroni (1930), are commonly used in areas such as reliability, demography, economics, medicine and insurance. They are applications of the mean deviations and are considered by economists as a measure of social inequality, since they relate the accumulated percentages of income and population. The Bonferroni and Lorenz curves are defined as follows.

Definition 4.2. Suppose that X is a nonnegative random variable with probability density function f (x) and cumulative distribution function F(x). The Bonferroni and Lorenz curves denoted by B(p) and L(p), respectively, are defined as:

$$B(p) = \frac{1}{p\mu} \int_{0}^{q} x f(x)\, dx \quad \text{and} \quad L(p) = \frac{1}{\mu} \int_{0}^{q} x f(x)\, dx,$$

where μ = E(X), q = F^{-1}(p) and p ∈ [0, 1].

In particular, for the THN distribution we have:

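Theorem 4.5. Let X be a nonnegative continuous random variable that has THN distribution. The Bonferroni and Lorenz curves are given by:

$$B(p) = \frac{\theta^{2}}{p\mu} \left\{ -\frac{2}{\theta}\,\phi\!\left(\frac{q}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{q}{\theta}\right) - 1\right]\right\} + \frac{1}{\theta\sqrt{\pi}}\left[\sqrt{2}\,(1 + \lambda) - 2\lambda\left(2\,\Phi\!\left(\frac{q\sqrt{2}}{\theta}\right) - 1\right)\right] \right\}$$

and

$$L(p) = \frac{\theta^{2}}{\mu} \left\{ -\frac{2}{\theta}\,\phi\!\left(\frac{q}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{q}{\theta}\right) - 1\right]\right\} + \frac{1}{\theta\sqrt{\pi}}\left[\sqrt{2}\,(1 + \lambda) - 2\lambda\left(2\,\Phi\!\left(\frac{q\sqrt{2}}{\theta}\right) - 1\right)\right] \right\}.$$

Proof. Using the p.d.f. of the THN distribution we have

$$B(p) = \frac{1}{p\mu} \int_{0}^{q} x\, \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]\right\} dx$$

$$= \frac{\theta^{2}}{p\mu} \left[ \frac{2}{\theta}\,\phi\!\left(\frac{q}{\theta}\right) 2\lambda\left[2\,\Phi\!\left(\frac{q}{\theta}\right) - 1\right] - \lambda\,\frac{2}{\theta}\,\phi\!\left(\frac{q}{\theta}\right) - \frac{2}{\theta}\,\phi\!\left(\frac{q}{\theta}\right) - \frac{2\lambda}{\theta\sqrt{\pi}}\left[2\,\Phi\!\left(\frac{q\sqrt{2}}{\theta}\right) - 1\right] + \frac{\sqrt{2}\,\lambda + \sqrt{2}}{\theta\sqrt{\pi}} \right]$$

$$= \frac{\theta^{2}}{p\mu} \left\{ -\frac{2}{\theta}\,\phi\!\left(\frac{q}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{q}{\theta}\right) - 1\right]\right\} + \frac{1}{\theta\sqrt{\pi}}\left[\sqrt{2}\,(1 + \lambda) - 2\lambda\left(2\,\Phi\!\left(\frac{q\sqrt{2}}{\theta}\right) - 1\right)\right] \right\}.$$

In an analogous way we obtain the Lorenz curve.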

4.9 Order statistics

Order statistics, as well as sample moments, play an important role in statistical inference (David & Nagaraja, 2003). Moments of order statistics are also important in quality control and reliability testing, where they are used to predict the failure of future items based on the times of some initial failures.

Let $X_{(1)}, \ldots, X_{(n)}$ be the order statistics of a random sample $X_1, \ldots, X_n$ obtained from a population with p.d.f. f(x) and c.d.f. F(x). Then the p.d.f. of the j-th order statistic is given by:

$$f_j(x) = \frac{n!}{(j - 1)!\,(n - j)!}\, f(x)\, F(x)^{j - 1}\, \left[1 - F(x)\right]^{n - j}$$

for j = 1, …, n.

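Applying equations (3) and (4) we obtain the density of the j-th order statistic of the THN distribution:

$$f_j(x \mid \theta, \lambda) = \frac{n!}{(j - 1)!\,(n - j)!}\, \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right)\left\{1 + \lambda - 2\lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]\right\} \times \left\{(1 + \lambda)\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] - \lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]^{2}\right\}^{j - 1} \times \left\{1 - (1 + \lambda)\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right] + \lambda\left[2\,\Phi\!\left(\frac{x}{\theta}\right) - 1\right]^{2}\right\}^{n - j}.$$

Considering the substitution $\gamma(x) = 2\,\Phi\!\left(\frac{x}{\theta}\right) - 1$ we have:

$$f_j(x \mid \theta, \lambda) = \frac{n!}{(j - 1)!\,(n - j)!}\, \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right)\left[1 + \lambda - 2\lambda\gamma(x)\right] \times \left[(1 + \lambda)\gamma(x) - \lambda\gamma(x)^{2}\right]^{j - 1} \left[1 - (1 + \lambda)\gamma(x) + \lambda\gamma(x)^{2}\right]^{n - j}.$$

Reordering the terms we obtain the general formula for the density of the j-th order statistic of the THN distribution:

$$f_j(x \mid \theta, \lambda) = \frac{n!}{(j - 1)!\,(n - j)!}\, \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right) \gamma(x)^{j - 1} \left[1 + \lambda - 2\lambda\gamma(x)\right] \times \left[1 + \lambda - \lambda\gamma(x)\right]^{j - 1} \left[\gamma(x)\lambda - 1\right]^{n - j} \left[\gamma(x) - 1\right]^{n - j}.$$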

We can use the THN distribution to model maximum or minimum events, so in what follows we obtain the densities of the maximum and minimum order statistics. For the n-th order statistic, we set j = n in the general formula above. The n-th order statistic, which represents the maximum of a THN sample, therefore has p.d.f. given by:

$$f_n(x) = n\, \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right) \gamma(x)^{n - 1} \left[1 + \lambda - 2\lambda\gamma(x)\right] \left[1 + \lambda - \lambda\gamma(x)\right]^{n - 1}. \tag{12}$$

For the first order statistic, we set j = 1 in the general formula. The result is the p.d.f. of the minimum of a THN sample, given by:

$$f_1(x) = n\, \frac{2}{\theta}\,\phi\!\left(\frac{x}{\theta}\right) \left[1 + \lambda - 2\lambda\gamma(x)\right] \left[\gamma(x)\lambda - 1\right]^{n - 1} \left[\gamma(x) - 1\right]^{n - 1}.$$
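The general order-statistic density is straightforward to evaluate. The sketch below (standard library only; the function name `thn_order_stat_pdf` is an illustrative assumption) computes it from the generic formula $f_j = \frac{n!}{(j-1)!(n-j)!} f F^{j-1}(1 - F)^{n-j}$ with f and F taken from (3) and (4); j = n and j = 1 give the densities of the maximum and the minimum.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_order_stat_pdf(x, j, n, theta, lam):
    """Density of the j-th order statistic (1 <= j <= n) of a THN sample of size n."""
    if not 1 <= j <= n:
        raise ValueError("j must satisfy 1 <= j <= n")
    if x < 0.0:
        return 0.0
    gam = 2.0 * _STD.cdf(x / theta) - 1.0                       # gamma(x)
    dens = (2.0 / theta) * _STD.pdf(x / theta) * (1.0 + lam - 2.0 * lam * gam)
    cdf = (1.0 + lam) * gam - lam * gam * gam
    const = math.factorial(n) / (math.factorial(j - 1) * math.factorial(n - j))
    return const * dens * cdf ** (j - 1) * (1.0 - cdf) ** (n - j)
```

Each of these densities should integrate to one over (0, ∞), which offers a simple numerical check.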

4.10 Extreme Values

The HN(θ) distribution belongs to the max domain of attraction of the Gumbel extreme value distribution. Hence, there must exist a strictly positive function, say h(t), such that

$$\lim_{t \to \infty} \frac{1 - G(t + x\,h(t))}{1 - G(t)} = e^{-x},$$

for every x ∈ (0, ∞) (Leadbetter et al., 1987).

Then it can be shown that

$$\lim_{t \to \infty} \frac{1 - F(t + x\,h(t))}{1 - F(t)} = e^{-a x},$$

for every x ∈ (0, ∞) and some constant a > 0. Therefore THN(θ, λ) belongs to the max domain of attraction of the Gumbel extreme value distribution, with

$$\lim_{n \to \infty} P\left[a_n\,(X_{n:n} - b_n) \le x\right] = \exp\{-\exp(-x)\}$$

for some suitable norming constants $a_n > 0$ and $b_n$, where $X_{n:n}$ is the maximum order statistic.

4.11 Stress strength reliability

Stress-strength reliability estimation is of great interest in engineering, being used in stress-strength models or as a measure of performance in electrical and electronic systems. However, it can also be applied in other areas, as it provides a general measure of the difference between two populations (Asgharzadeh et al., 2011).

Theorem 4.6. Let X1 ~ THN(θ1, λ1) and X2 ~ THN(θ2, λ2) be independent. Then

R = P ( X 2 < X 1 ) = 0 f 1 ( x θ , λ ) F 2 ( x θ , λ ) d x = 1 + λ 1 1 + λ 2 4 θ 1 π j = 1 1 j + 1 2 2 θ 2 j + 1 θ 2 j + 2 2 1 2 + j Γ j + 1 2 j 1 j 1 ! + 1 + λ 1 λ 2 8 θ 1 π 3 2 j = 1 l = 1 1 j + l + 2 2 2 θ 2 2 j + 2 l + 2 θ 2 j + 2 l + 3 2 j + l Γ j + l + 3 2 2 j 1 j 1 ! 2 l 1 l 1 ! + λ 1 1 + λ 2 16 θ 1 π 3 2 k = 1 j = 1 1 k + j + 2 2 2 θ 1 2 k + 1 2 2 θ 2 2 j + 1 θ 2 k + 2 l + 3 2 k + l Γ k + l + 3 2 2 k 1 k 1 ! 2 j 1 j 1 ! + + λ 1 λ 2 32 θ 1 π 2 k = 1 j = 1 l = 1 1 k + j + l + 3 2 2 θ 1 2 k + 1 2 2 θ 2 2 j + 2 l + 2 θ 2 k + 2 l + 2 j + 4 2 1 2 + k + l + j 2 k 1 k 1 ! 2 j 1 1 ! 2 l 1 l 1 ! × × Γ ( k + l + j + 2 ) .

Proof. Using the p.d.f. and c.d.f. defined in (3) and (4) we have:

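$$R = P(X_2 < X_1) = \int_{0}^{\infty} \frac{2}{\theta_1}\,\phi\!\left(\frac{x}{\theta_1}\right)\left\{1 + \lambda_1 - 2\lambda_1\left[2\,\Phi\!\left(\frac{x}{\theta_1}\right) - 1\right]\right\} \times \left\{(1 + \lambda_2)\left[2\,\Phi\!\left(\frac{x}{\theta_2}\right) - 1\right] - \lambda_2\left[2\,\Phi\!\left(\frac{x}{\theta_2}\right) - 1\right]^{2}\right\} dx.$$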

Replacing 2Φxθ1 by 2πk=1(1)k+122θ2k+1x2k+12k1k1!, we have to

R = 0 2 θ 1 ϕ x θ 1 1 + λ 1 2 λ 1 2 π k = 1 ( 1 ) k + 1 2 2 θ 1 2 k + 1 x 2 k + 1 2 k 1 k 1 ! × × 1 + λ 2 2 Φ x θ 2 1 λ 2 2 π j = 1 ( 1 ) j + 1 2 2 θ 2 2 j + 1 x 2 j + 1 2 j 1 j 1 ! 2 d x .

Applying the distributive law, using properties of integration and isolating terms independent of x, we get:

R = 1 + λ 1 1 + λ 2 4 θ 1 π j = 1 1 j + 1 2 2 θ 2 2 j + 1 0 ϕ x θ 1 x 2 j + 1 d x 2 j 1 j 1 ! +

1 + λ 1 λ 2 8 θ 1 π j = 1 l = 1 1 j + l + 2 2 2 θ 2 2 j + 2 l + 2 0 ϕ x θ 1 x 2 j + 2 l + 2 d x 2 j 1 j 1 ! 2 l 1 l 1 ! +

λ 1 1 + λ 2 16 θ 1 π k = 1 j = 1 1 k + j + 2 2 2 θ 1 2 k + 1 2 2 θ 2 2 j + 1 0 ϕ x θ 1 x 2 k + 2 j + 2 d x 2 k 1 k 1 ! 2 j 1 j 1 ! +

+ λ 1 λ 2 32 θ 1 π 3 2 k = 1 j = 1 l = 1 1 k + j + l + 3 2 2 θ 1 2 k + 1 2 2 θ 2 2 j + 2 l + 2 0 ϕ x θ 1 x 2 k + 2 j + 2 l + 3 d x 2 k 1 k 1 ! 2 j 1 j 1 ! 2 l 1 l 1 !

On solving the integrals we arrive at the desired result. □

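Note that for the case where θ1 = θ2, it immediately follows that

$$R = \frac{1}{6}\,(\lambda_2 - \lambda_1) + \frac{1}{2}, \qquad \frac{1}{6} \le R \le \frac{5}{6},$$

where the bounds follow from −1 ≤ λ1, λ2 ≤ 1 and are attained at (λ1, λ2) = (1, −1) and (λ1, λ2) = (−1, 1), respectively.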
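For equal scale parameters the stress-strength reliability can be checked by integrating $f_1(x)\,F_2(x)$ numerically. The sketch below is illustrative only: it uses the standard library, a composite Simpson rule truncated at 12θ1 (an assumption), and function names chosen here. The general-θ function evaluates the defining integral directly and does not use the series expansion of Theorem 4.6.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard Normal distribution


def thn_pdf(x, theta, lam):
    """Density (3)."""
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return (2.0 / theta) * _STD.pdf(x / theta) * (1.0 + lam - 2.0 * lam * g)


def thn_cdf(x, theta, lam):
    """Distribution function (4)."""
    g = 2.0 * _STD.cdf(x / theta) - 1.0
    return (1.0 + lam) * g - lam * g * g


def simpson(func, a, b, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def stress_strength_numeric(theta1, lam1, theta2, lam2):
    """R = P(X2 < X1) as the integral of f1(x) * F2(x), truncated at 12 * theta1."""
    return simpson(lambda x: thn_pdf(x, theta1, lam1) * thn_cdf(x, theta2, lam2),
                   0.0, 12.0 * theta1)


def stress_strength_equal_scale(lam1, lam2):
    """Closed form for theta1 = theta2: R = (lam2 - lam1) / 6 + 1 / 2."""
    return (lam2 - lam1) / 6.0 + 0.5
```

With θ1 = θ2 the two functions should agree, and R = 1/2 whenever λ1 = λ2.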

5 CHARACTERIZATION

Here we discuss characterizations of THN(θ, λ) distribution based on: (i) simple relationship between two truncated moments and (ii) maximum and minimum order statistics.

5.1 Characterizations by two truncated moments

Here we present a characterization of the THN(θ, λ) distribution based on a simple relationship between two truncated moments, using the following theorem due to Glänzel (1987) and used in Hamedani et al. (2017) and Yousof et al. (2017). This theorem also holds if the interval I is not closed and when the c.d.f. F does not have a closed form. This characterization is stable in the sense of weak convergence.

Theorem 5.1. Let (Ω, ℱ, P) be a probability space and I = [l, u] be an interval for some l < u (l = −∞, u = ∞ might as well be allowed). Let X : Ω → I be a continuous random variable with c.d.f. F and let w1 and w2 be two real-valued functions defined on I such that

$$E\left[w_2(X) \mid X \ge x\right] = E\left[w_1(X) \mid X \ge x\right]\, \xi(x), \qquad x \in I,$$

is defined for some real function ξ. Assume that w1, w2 ∈ C¹(I), ξ ∈ C²(I) and that F is a twice continuously differentiable and strictly monotone function on I. Moreover, assume that the equation ξ w1 = w2 has no real solution in the interior of I. Then F is uniquely determined by the functions w1, w2 and ξ; in particular,

$$F(x) = \int_{l}^{x} K \left|\frac{\xi'(u)}{\xi(u)\,w_1(u) - w_2(u)}\right| \exp\left(-s(u)\right) du,$$

where the function s is a solution of the differential equation $s'(x) = \dfrac{\xi'(x)\,w_1(x)}{\xi(x)\,w_1(x) - w_2(x)}$ and K is the normalization constant, such that $\int_{I} dF = 1$.

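Proposition 5.1. Let $X: \Omega \to (0, \infty)$ be a continuous random variable and let $w_1(x) = \{1 + \lambda - 2\lambda[2\Phi(x/\theta) - 1]\}^{-1}$ and $w_2(x) = w_1(x)[2\Phi(x/\theta) - 1]$ for $x \in (0, \infty)$. Then the random variable X follows THN(θ, λ) with p.d.f. (3) if and only if the function ξ defined in Theorem 5.1 has the form

$$\xi(x) = \Phi(x/\theta), \quad x \ge 0.$$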

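Proof. Let X be a random variable with pdf (3). Then

$$[1 - F(x)]\,E[w_1(X) \mid X \ge x] = 1 - [2\Phi(x/\theta) - 1], \quad x \ge 0,$$

and

$$[1 - F(x)]\,E[w_2(X) \mid X \ge x] = \frac{1}{2}\left\{1 - [2\Phi(x/\theta) - 1]^2\right\}, \quad x \ge 0,$$

hence

$$\xi(x) = \frac{E[w_2(X) \mid X \ge x]}{E[w_1(X) \mid X \ge x]} = \Phi(x/\theta), \quad x \ge 0.$$

Conversely, if $\xi(x) = \Phi(x/\theta)$, then

$$s'(x) = \frac{\frac{1}{\theta}\phi(x/\theta)\, w_1(x)}{\Phi(x/\theta)\, w_1(x) - w_1(x)[2\Phi(x/\theta) - 1]} = -\frac{d}{dx} \log[1 - \Phi(x/\theta)], \quad x \ge 0,$$

and hence

$$s(x) = -\log[1 - \Phi(x/\theta)], \quad x \ge 0.$$

Now, in view of Theorem 5.1, X has density (3).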
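The direct part of Proposition 5.1 can be verified numerically. The following sketch evaluates the two truncated moments by quadrature and compares their ratio with Φ(x/θ). It assumes the THN density implied by the log-likelihood (13) and the functions w1, w2 of the proposition; the function names are illustrative, and λ is kept strictly inside (−1, 1) so that w1 stays finite.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm


def thn_pdf(t, theta, lam):
    """THN density: (2/theta) phi(t/theta) {1 + lam - 2 lam [2 Phi(t/theta) - 1]}."""
    g = 2.0 * norm.cdf(t / theta) - 1.0
    return (2.0 / theta) * norm.pdf(t / theta) * (1.0 + lam - 2.0 * lam * g)


def w1(t, theta, lam):
    """w1(t) = {1 + lam - 2 lam [2 Phi(t/theta) - 1]}^(-1)."""
    g = 2.0 * norm.cdf(t / theta) - 1.0
    return 1.0 / (1.0 + lam - 2.0 * lam * g)


def w2(t, theta, lam):
    """w2(t) = w1(t) [2 Phi(t/theta) - 1]."""
    return w1(t, theta, lam) * (2.0 * norm.cdf(t / theta) - 1.0)


def xi_numeric(x, theta, lam):
    """Ratio E[w2(X) | X >= x] / E[w1(X) | X >= x]; the factor 1 - F(x) cancels."""
    num, _ = quad(lambda t: w2(t, theta, lam) * thn_pdf(t, theta, lam), x, np.inf)
    den, _ = quad(lambda t: w1(t, theta, lam) * thn_pdf(t, theta, lam), x, np.inf)
    return num / den
```

For example, `xi_numeric(0.7, 2.0, 0.5)` should agree with `norm.cdf(0.35)` up to quadrature error.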

Corollary 5.1. Let $X: \Omega \to (0, \infty)$ be a continuous random variable and let $w_1(x)$ be as in Proposition 5.1. Then X follows THN(θ, λ) with p.d.f. (3) if and only if there exist functions $w_2$ and $\xi$ defined in Theorem 5.1 satisfying the differential equation

$$\frac{\xi'(x)\, w_1(x)}{\xi(x)\, w_1(x) - w_2(x)} = \frac{\frac{1}{\theta}\phi(x/\theta)}{1 - \Phi(x/\theta)}, \quad x \ge 0.$$

Remark: The general solution of the differential equation in Corollary 5.1 is

$$\xi(x) = [1 - \Phi(x/\theta)]^{-1} \left[ -\int \frac{1}{\theta}\phi(x/\theta)\, (w_1(x))^{-1} w_2(x)\, dx + C \right],$$

where C is a constant. The set of functions given in Proposition 5.1 satisfies this differential equation when C = 0. It should be noted that there are other triplets $(w_1, w_2, \xi)$ satisfying the conditions of Theorem 5.1.

5.2 Characterizations by order statistics

Here we use two results from Hamedani et al. (2017), stated in the theorem below, to characterize the THN(θ, λ) distribution by the maximum ($X_{n:n}$) and minimum ($X_{1:n}$) order statistics of a sample of size n from the THN(θ, λ) distribution.

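Theorem 5.2. Let $X: \Omega \to (0, \infty)$ be a continuous random variable with cdf $F$ and let $\xi(x)$ and $w(x)$ be two differentiable functions on $(0, \infty)$.

  • (i) If $\lim_{x \to 0} \xi(x)[F(x)]^n = 0$ and $\int_0^\infty \frac{w'(t)}{\xi(t) - w(t)}\, dt = \infty$, then

$$E[\xi(X_{n:n}) \mid X_{n:n} < t] = w(t), \quad t > 0,$$

    implies

$$F(x) = \exp\left\{ -\int_x^\infty \frac{w'(t)}{n[\xi(t) - w(t)]}\, dt \right\}, \quad x \ge 0.$$

  • (ii) If $\lim_{x \to \infty} \xi(x)[1 - F(x)]^n = 0$ and $\int_0^\infty \frac{w'(t)}{w(t) - \xi(t)}\, dt = \infty$, then

$$E[\xi(X_{1:n}) \mid X_{1:n} > t] = w(t), \quad t > 0,$$

    implies

$$F(x) = 1 - \exp\left\{ -\int_0^x \frac{w'(t)}{n[w(t) - \xi(t)]}\, dt \right\}, \quad x \ge 0.$$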

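Proposition 5.2. If we take $\xi(x) = \left\{(1 + \lambda)[2\Phi(x/\theta) - 1] - \lambda[2\Phi(x/\theta) - 1]^2\right\}^n$ and $w(x) = \frac{1}{2}\xi(x)$ in Theorem 5.2(i), then the random variable X follows THN(θ, λ) with c.d.f. (4).

Proposition 5.3. If we take $\xi(x) = \left\{1 - (1 + \lambda)[2\Phi(x/\theta) - 1] + \lambda[2\Phi(x/\theta) - 1]^2\right\}^n$ and $w(x) = \frac{1}{2}\xi(x)$ in Theorem 5.2(ii), then the random variable X follows THN(θ, λ) with c.d.f. (4).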
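To see why a choice of the form w = ξ/2 works, the following short derivation sketches case (i), writing F for the c.d.f. in (4) and f for its density, so that ξ(x) = [F(x)]^n. Case (ii) is analogous with 1 − F in place of F. The derivation is a reading aid and not part of the original proofs.

```latex
\xi(x) = [F(x)]^{n}, \qquad w(x) = \tfrac{1}{2}\,\xi(x)
\;\Longrightarrow\;
\frac{w'(t)}{n\,[\xi(t) - w(t)]}
  = \frac{\tfrac{1}{2}\,\xi'(t)}{\tfrac{n}{2}\,\xi(t)}
  = \frac{n\,[F(t)]^{n-1} f(t)}{n\,[F(t)]^{n}}
  = \frac{f(t)}{F(t)},
\\[4pt]
\exp\left\{ -\int_{x}^{\infty} \frac{f(t)}{F(t)}\,dt \right\}
  = \exp\left\{ \log F(x) - \log F(\infty) \right\}
  = F(x).
```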

6 MAXIMUM LIKELIHOOD ESTIMATOR

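Let $x = (x_1, \ldots, x_n)$ be a random sample from the THN distribution with p.d.f. given by (3). Then the log-likelihood function, apart from constant terms, can be written as

$$l(\theta, \lambda \mid x) \propto -n \log\theta + \sum_{i=1}^{n} \log\phi\left(\frac{x_i}{\theta}\right) + \sum_{i=1}^{n} \log\left\{1 + \lambda - 2\lambda\left[2\Phi\left(\frac{x_i}{\theta}\right) - 1\right]\right\}. \quad (13)$$

The MLEs $\hat{\theta}$ and $\hat{\lambda}$ of θ and λ, respectively, may be obtained by maximizing (13) or by solving the following likelihood equations:

$$\frac{\partial l(\theta, \lambda \mid x)}{\partial \lambda} = \sum_{i=1}^{n} \frac{-4\Phi(x_i/\theta) + 3}{-4\lambda\Phi(x_i/\theta) + 3\lambda + 1} = 0,$$

$$\frac{\partial l(\theta, \lambda \mid x)}{\partial \theta} = -\frac{n}{\theta} - \frac{1}{\theta^2}\sum_{i=1}^{n} \frac{\phi'(x_i/\theta)}{\phi(x_i/\theta)}\, x_i + \frac{4\lambda}{\theta^2}\sum_{i=1}^{n} \frac{\phi(x_i/\theta)}{-4\lambda\Phi(x_i/\theta) + 3\lambda + 1}\, x_i = 0,$$

where $\phi'(\cdot)$ denotes the derivative of $\phi$ with respect to its argument, so that $\phi'(u) = -u\,\phi(u)$.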
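As an illustration of how (13) and the likelihood equations can be used in practice, the sketch below codes the log-likelihood and its score vector and maximizes the likelihood with SciPy's bounded L-BFGS-B optimizer. This is an assumption-laden stand-in and not the authors' implementation: the function names and the starting value (the Half-Normal MLE of θ with λ = 0) are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def thn_loglik(params, x):
    """Log-likelihood (13), up to additive constants."""
    theta, lam = params
    u = x / theta
    return (-x.size * np.log(theta)
            + np.sum(norm.logpdf(u))
            + np.sum(np.log(1.0 + 3.0 * lam - 4.0 * lam * norm.cdf(u))))


def thn_score(params, x):
    """Score vector (d/dtheta, d/dlambda), using phi'(u)/phi(u) = -u."""
    theta, lam = params
    u = x / theta
    d = 1.0 + 3.0 * lam - 4.0 * lam * norm.cdf(u)
    d_theta = (-x.size / theta
               + np.sum(u * x) / theta**2
               + 4.0 * lam / theta**2 * np.sum(norm.pdf(u) * x / d))
    d_lam = np.sum((3.0 - 4.0 * norm.cdf(u)) / d)
    return np.array([d_theta, d_lam])


def thn_fit(x):
    """Maximize (13) subject to theta > 0 and |lambda| <= 1."""
    x = np.asarray(x, dtype=float)
    start = np.array([np.sqrt(np.mean(x**2)), 0.0])  # HN estimate of theta, lambda = 0
    res = minimize(lambda p: -thn_loglik(p, x), start,
                   jac=lambda p: -thn_score(p, x),
                   method="L-BFGS-B",
                   bounds=[(1e-8, None), (-1.0, 1.0)])
    return res.x, res
```

Supplying the analytic score avoids finite-difference gradients; comparing `thn_score` with a numerical derivative of `thn_loglik` is a quick way to check the two likelihood equations against each other.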

7 SIMULATION STUDY

In this section we present the results of a Monte Carlo simulation used to evaluate the bias and mean square error of the estimates obtained by the maximum likelihood method. Samples of size n = 20, 50,..., 170, 200 with λ = −0.9, −0.7,..., 0.7, 0.9 and θ = 1.0 were generated. For each combination of n, θ and λ, the inverse transformation method was used to generate N = 10,000 pseudo-random samples from the THN distribution. The results are reported in Figures 3 and 4.
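The inverse transformation method mentioned above can be sketched as follows. Solving u = (1 + λ)G − λG² for the Half-Normal cdf value G gives a quadratic whose root in [0, 1] is written here in a numerically stable form; the Half-Normal quantile is then θΦ⁻¹((1 + G)/2). The function names are illustrative and the code is a sketch, not the authors' simulation program.

```python
import numpy as np
from scipy.stats import norm


def thn_cdf(x, theta, lam):
    """THN cdf: (1 + lam) G(x) - lam G(x)^2, with G the Half-Normal cdf."""
    g = 2.0 * norm.cdf(np.asarray(x, dtype=float) / theta) - 1.0
    return (1.0 + lam) * g - lam * g**2


def thn_ppf(u, theta, lam):
    """THN quantile function for probabilities u in [0, 1)."""
    u = np.atleast_1d(np.asarray(u, dtype=float))
    # Root of lam*g^2 - (1 + lam)*g + u = 0 lying in [0, 1], stable for lam near 0.
    den = (1.0 + lam) + np.sqrt((1.0 + lam) ** 2 - 4.0 * lam * u)
    g = np.divide(2.0 * u, den, out=np.zeros_like(u), where=den > 0)
    # Invert the Half-Normal cdf G(x) = 2 Phi(x/theta) - 1.
    return theta * norm.ppf(0.5 * (1.0 + g))


def thn_rvs(n, theta, lam, rng):
    """Draw n THN variates by applying the quantile function to uniforms."""
    return thn_ppf(rng.random(n), theta, lam)
```

A call such as `thn_rvs(50, 1.0, -0.7, np.random.default_rng(123))` produces one sample of the kind used in a single replicate of a simulation scenario.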

Figure 3
Estimated bias and mean square error of θ (θ = 1, 1 : λ = −0.9, 2 : λ = −0.7, 3 : λ = −0.5, 4 : λ = −0.3, 5 : λ = 0, 6 : λ = 0.3, 7 : λ = 0.5, 8 : λ = 0.7 and 9 : λ = 0.9).

Figure 4
Estimated bias and mean square error of λ (θ = 1, 1 : λ = −0.9, 2 : λ = −0.7, 3 : λ = −0.5, 4 : λ = −0.3, 5 : λ = 0, 6 : λ = 0.3, 7 : λ = 0.5, 8 : λ = 0.7 and 9 : λ = 0.9).

Regarding the bias of $\hat{\theta}$, for λ = −0.9, −0.7, −0.5, −0.3, 0, 0.3 and 0.5 the bias is close to zero even for small samples. For λ = 0.7 and 0.9 the convergence is slower, but as the sample size increases the estimated bias remains small, around −0.04 and −0.08, respectively.

We also note that for λ = −0.9, −0.7, −0.5, −0.3, 0, 0.3 and 0.5, $\hat{\lambda}$ is estimated accurately. As in the case of $\hat{\theta}$, convergence is slower for λ = 0.7 and 0.9, with the estimated bias close to −0.11 and −0.02, respectively.

Although the bias is small for both estimators, it is comparatively larger for $\hat{\lambda}$. Moreover, the larger the magnitude of λ, the larger the bias in the estimates of both parameters, which shows that the transmutation parameter λ influences the estimation of the scale parameter θ.

The mean square errors of $\hat{\theta}$ are very low, and the positive values of λ show the most regular convergence. Although for positive λ the errors do not go directly to zero, the graph shows that they stay below 0.03. We can therefore conclude that the errors in the estimates are practically negligible.

Finally, the mean square error of $\hat{\lambda}$ shows that θ influences its estimation. As before, in all scenarios the errors tend to zero, the largest being close to 0.1.

In general, we can conclude that the estimators are asymptotically unbiased, since the bias tends to zero as n increases, while the trend in the mean squared error indicates consistency, since the errors also tend to zero as n increases.
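The bias and mean square error discussed above can be estimated along the following lines. This is a reduced sketch under stated assumptions: it uses far fewer replicates than the N = 10,000 of the study, fits by SciPy's L-BFGS-B with numerical gradients (the study used other software), and all names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def thn_rvs(n, theta, lam, rng):
    """Inverse-transform sampler for THN(theta, lam)."""
    u = rng.random(n)
    den = (1.0 + lam) + np.sqrt((1.0 + lam) ** 2 - 4.0 * lam * u)
    g = np.divide(2.0 * u, den, out=np.zeros_like(u), where=den > 0)
    return theta * norm.ppf(0.5 * (1.0 + g))


def thn_negloglik(params, x):
    """Negative of the log-likelihood (13), up to additive constants."""
    theta, lam = params
    u = x / theta
    return -(-x.size * np.log(theta)
             + np.sum(norm.logpdf(u))
             + np.sum(np.log(1.0 + 3.0 * lam - 4.0 * lam * norm.cdf(u))))


def monte_carlo(n, theta, lam, n_rep, seed=0):
    """Estimate bias and MSE of (theta_hat, lambda_hat) from n_rep replicates."""
    rng = np.random.default_rng(seed)
    est = np.empty((n_rep, 2))
    for r in range(n_rep):
        x = thn_rvs(n, theta, lam, rng)
        start = np.array([np.sqrt(np.mean(x**2)), 0.0])
        res = minimize(thn_negloglik, start, args=(x,), method="L-BFGS-B",
                       bounds=[(1e-6, None), (-1.0, 1.0)])
        est[r] = res.x
    err = est - np.array([theta, lam])
    # Bias is the mean error; MSE is the mean squared error, per parameter.
    return {"bias": err.mean(axis=0), "mse": (err**2).mean(axis=0)}
```

Running `monte_carlo` over a grid of n and λ and plotting the two outputs against n would give curves of the same kind as those in Figures 3 and 4, although the numbers need not match the published ones.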

8 REAL DATA ANALYSIS

In this section, we illustrate the applicability of the THN distribution using two real data sets that have not been analyzed before in the literature. Our objective is to evaluate its fit relative to other distributions already presented in the literature.

In both applications the data were compiled from the daily precipitation series obtained from the portal of the National Institute of Meteorology (http://www.inmet.gov.br). In fitting the distributions, the cumulative total for the period was taken as the response. Cumulative totals (monthly, quarterly, and so forth) are widely used in the calculation of the Standardized Precipitation Index (SPI).

8.1 Monthly Precipitation

For the first application, we consider the data for the month of February of each year, from 1974 to 2016, for the city of Chapecó, located in the state of Santa Catarina. It is important to note that, because of recording failures, no measurement is available for February of some years in this period. Table 1 shows the summary measures of the data used. Figure 5 suggests that the hazard rate is increasing, an indication that the THN distribution may be an appropriate model for these data.

Table 1
Descriptive statistics.

Figure 5
TTT Plot.
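For readers who wish to reproduce a TTT plot such as Figure 5, the scaled total time on test transform can be computed as in the sketch below. For the ordered sample x(1) ≤ ... ≤ x(n), the plotted points are (r/n, [Σ_{i≤r} x(i) + (n − r) x(r)] / Σ_i x(i)); a concave curve above the diagonal is the usual indication of an increasing hazard rate. The function name is illustrative.

```python
import numpy as np


def scaled_ttt(x):
    """Return the points (r/n, scaled TTT value) for r = 1, ..., n."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    r = np.arange(1, n + 1)
    # Total time on test up to the r-th order statistic, scaled by the sample total.
    ttt = (np.cumsum(x) + (n - r) * x) / np.sum(x)
    return r / n, ttt
```

Plotting the second output against the first, together with the diagonal from (0, 0) to (1, 1), gives the TTT plot.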

For this data set, we fit the following models:

  • 1. Half-Normal (HN) due to Daniel (1959).

$$f_1(x) = \frac{2}{\theta}\,\phi\left(\frac{x}{\theta}\right), \quad \theta > 0.$$

  • 2. General Half-Normal (GHN) due to Pewsey (2002, 2004).

$$f_2(x) = \frac{2}{\theta}\,\phi\left(\frac{x - \alpha}{\theta}\right), \quad \alpha = \min(x), \ \theta > 0.$$

  • 3. Power Half-Normal (PHN) due to Gómez & Bolfarine (2015).

$$f_3(x) = \frac{2\alpha}{\theta}\,\phi\left(\frac{x}{\theta}\right)\left[2\Phi\left(\frac{x}{\theta}\right) - 1\right]^{\alpha - 1}, \quad \theta, \alpha > 0.$$

  • 4. Generalized Half-Normal (GHN II) due to Cooray & Ananda (2008).

$$f_4(x) = \frac{2\alpha}{\theta}\left(\frac{x}{\theta}\right)^{\alpha - 1}\phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right), \quad \theta, \alpha > 0.$$

  • 5. Gamma Half-Normal (GMHN) due to Alzaatreh & Knight (2013).

$$f_5(x) = \frac{\frac{2}{\theta}\,\phi\left(\frac{x}{\theta}\right)}{\Gamma(\alpha)\, b^{\alpha}}\left[-\log\left(2\Phi\left(-\frac{x}{\theta}\right)\right)\right]^{\alpha - 1}\left[2\Phi\left(-\frac{x}{\theta}\right)\right]^{\frac{1}{b} - 1}, \quad \theta, \alpha, b > 0.$$

  • 6. Odd Log-Logistic Generalized Half-Normal (OLLGHN) due to Cordeiro et al. (2017).

$$f_6(x) = a\sqrt{\frac{2}{\pi}}\,\frac{\alpha}{x}\left(\frac{x}{\theta}\right)^{\alpha} e^{-\frac{1}{2}(x/\theta)^{2\alpha}}\, \frac{\left\{\left[2\Phi\left((x/\theta)^{\alpha}\right) - 1\right]\left[2 - 2\Phi\left((x/\theta)^{\alpha}\right)\right]\right\}^{a - 1}}{\left\{\left[2\Phi\left((x/\theta)^{\alpha}\right) - 1\right]^{a} + \left[2 - 2\Phi\left((x/\theta)^{\alpha}\right)\right]^{a}\right\}^{2}}, \quad \theta, \alpha, a > 0.$$

  • 7. Transmuted Generalized Half-Normal (TGHN).

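$$f_7(x) = \frac{2\alpha}{\theta}\left(\frac{x}{\theta}\right)^{\alpha - 1}\phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right)\left\{(1 + \lambda) - 2\lambda\left[2\Phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right) - 1\right]\right\}, \quad \theta, \alpha > 0, \ |\lambda| \le 1.$$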

  • 8. Beta Generalized Half-Normal (BGHN) due to Pescim et al. (2010).

$$f_8(x) = \frac{2^{b}\alpha}{\theta\, B(a, b)}\left(\frac{x}{\theta}\right)^{\alpha - 1}\phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right)\left[2\Phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right) - 1\right]^{a - 1}\left[1 - \Phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right)\right]^{b - 1}, \quad \theta, \alpha, a, b > 0.$$

  • 9. Extended Generalized Half-Normal (EGHN) due to Sanchez et al. (2016).

$$f_9(x) = a b\sqrt{\frac{2}{\pi}}\,\frac{\alpha}{x}\left(\frac{x}{\theta}\right)^{\alpha}\exp\left\{-\frac{1}{2}\left(\frac{x}{\theta}\right)^{2\alpha}\right\}\left[2 - 2\Phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right)\right]^{a - 1} \times \left\{1 - \left[2 - 2\Phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right)\right]^{a}\right\}^{b - 1}, \quad \theta, \alpha, a, b > 0.$$

  • 10. Kumaraswamy Generalized Half-Normal (KGHN) due to Cordeiro et al. (2012).

$$f_{10}(x) = a b\sqrt{\frac{2}{\pi}}\,\frac{\alpha}{x}\left(\frac{x}{\theta}\right)^{\alpha}\exp\left\{-\frac{1}{2}\left(\frac{x}{\theta}\right)^{2\alpha}\right\}\left[2\Phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right) - 1\right]^{a - 1} \times \left\{1 - \left[2\Phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right) - 1\right]^{a}\right\}^{b - 1}, \quad \theta, \alpha, a, b > 0.$$

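  • 11. Beta Generalized Half-Normal Geometric (BGHNG) due to Ramires et al. (2013).

$$f_{11}(x) = \frac{(1 - p)\, 2^{b - 1}\sqrt{2}\,\frac{\alpha}{x}\left(\frac{x}{\theta}\right)^{\alpha}\exp\left\{-\frac{1}{2}\left(\frac{x}{\theta}\right)^{2\alpha}\right\}}{\sqrt{\pi}\, B(a, b)}\left[2\Phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right) - 1\right]^{a - 1} \times \left[1 - \Phi\left(\left(\frac{x}{\theta}\right)^{\alpha}\right)\right]^{b - 1}\left\{1 - p\left[1 - I_{2\Phi((x/\theta)^{\alpha}) - 1}(a, b)\right]\right\}^{-2}, \quad \theta, \alpha, a, b > 0, \ 0 \le p < 1,$$

    where $I_{2\Phi((x/\theta)^{\alpha}) - 1}(a, b) = B_{2\Phi((x/\theta)^{\alpha}) - 1}(a, b) / B(a, b)$.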

Table 2 shows the maximum likelihood estimates (standard errors) of the fitted distributions. All estimates were obtained with the NLMIXED procedure of SAS (SAS, 2010), applying the Newton-Raphson optimization technique. Although, for all distributions, the resulting variance-covariance matrices were positive definite and $\max \frac{\partial}{\partial\Theta} l(\Theta \mid x)\big|_{\Theta = \hat{\Theta}} < 0.000001$, we observed atypical standard errors for some parameters in the GMHN, BGHN, EGHN and KGHN distributions. Our guess is that the optimization converged to a local minimum or, most likely, that the parameters are linear functions of each other (or almost collinear) for the data in question. In fact, from the correlation matrices we obtain $\mathrm{corr}(\hat{\theta}, \hat{\alpha}) = 0.9594$, $\mathrm{corr}(\hat{b}, \hat{\alpha}) = 0.9707$ and $\mathrm{corr}(\hat{\theta}, \hat{b}) = 0.9984$ for the GMHN distribution; $\mathrm{corr}(\hat{\alpha}, \hat{b}) = 0.7179$, $\mathrm{corr}(\hat{\alpha}, \hat{a}) = 0.8255$, $\mathrm{corr}(\hat{\theta}, \hat{\alpha}) = 0.8587$ and $\mathrm{corr}(\hat{\theta}, \hat{\alpha}) = 0.9711$ for the BGHN distribution; $\mathrm{corr}(\hat{\theta}, \hat{b}) = 0.9817$, $\mathrm{corr}(\hat{a}, \hat{b}) = 0.9896$ and $\mathrm{corr}(\hat{\theta}, \hat{a}) = 0.9988$ for the EGHN distribution; and $\mathrm{corr}(\hat{\alpha}, \hat{a}) = 0.8899$, $\mathrm{corr}(\hat{\theta}, \hat{a}) = 0.8977$, $\mathrm{corr}(\hat{\alpha}, \hat{b}) = 0.9535$, $\mathrm{corr}(\hat{a}, \hat{b}) = 0.9568$, $\mathrm{corr}(\hat{\theta}, \hat{b}) = 0.9726$ and $\mathrm{corr}(\hat{\theta}, \hat{\alpha}) = 0.9807$ for the KGHN distribution. It was not possible to estimate the standard errors of all the parameters of the BGHNG distribution, and its variance-covariance matrix was not completely filled, but we can observe that $\mathrm{corr}(\hat{\alpha}, \hat{p}) = 0.7391$, $\mathrm{corr}(\hat{\alpha}, \hat{a}) = 0.7480$, $\mathrm{corr}(\hat{\alpha}, \hat{b}) = 1.0000$ and $\mathrm{corr}(\hat{b}, \hat{p}) = 1.0000$.

Table 2
Maximum likelihood estimates (standard errors).
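The correlations between estimators quoted above are obtained from the estimated variance-covariance matrix. A minimal sketch of that conversion is given below; the function name is illustrative and any matrix passed to it in an example is hypothetical, since the fitted covariance matrices are not reproduced in the paper.

```python
import numpy as np


def cov_to_corr(cov):
    """Convert a variance-covariance matrix into a correlation matrix."""
    cov = np.asarray(cov, dtype=float)
    sd = np.sqrt(np.diag(cov))  # standard errors of the estimators
    return cov / np.outer(sd, sd)
```

Off-diagonal entries close to ±1 indicate that the corresponding parameters are nearly collinear for the data at hand, which is the symptom described for the GMHN, BGHN, EGHN, KGHN and BGHNG fits.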

To compare the distributions, we consider the likelihood-based statistics −2×Log-Like, AIC, AICc and BIC and the goodness-of-fit measures KS, AD and CvM. The best model is the one with the smallest values of these criteria. Table 3 shows these values, with the index indicating the rank obtained by each distribution under each criterion. The last column gives a total score (sum of ranks) for each distribution. Since there are large uncertainties in the estimates of the GMHN, BGHN, EGHN, KGHN and BGHNG distributions, they are not considered. The table shows that the THN distribution ranked first, followed by the TGHN and PHN distributions. The THN and PHN distributions both have two parameters. The TGHN distribution has three parameters, and the confidence interval of its additional parameter (λ) contains zero. In view of this, we can infer that the models with the best fit were THN (17) and PHN (24).

Table 3
Comparison criterion.
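The likelihood-based criteria in Table 3 can be computed from the maximized log-likelihood, the number of parameters k and the sample size n. The sketch below uses the standard textbook definitions, which the paper does not spell out, so they are an assumption here; the function name is illustrative.

```python
import math


def information_criteria(loglik, k, n):
    """Return -2 log L, AIC, AICc and BIC for a fitted model."""
    m2ll = -2.0 * loglik
    aic = m2ll + 2.0 * k
    # Small-sample correction; requires n > k + 1.
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    bic = m2ll + k * math.log(n)
    return {"-2logL": m2ll, "AIC": aic, "AICc": aicc, "BIC": bic}
```

For a two-parameter model such as THN or PHN, k = 2; smaller values indicate a better trade-off between fit and complexity.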

8.2 Ten-Day Precipitation

The data consist of the ten-day accumulated precipitation between April and June at station WMO 83498, located in the state of Bahia, Brazil. We considered the historical series from 1961 to 2017. In this application we adopted a regression structure for the scale parameter, that is,

$$\log\theta_i = \beta_0 + \beta_1 x_i, \quad (14)$$

where $x_i$ denotes the $i$th observation, associated with the $i$th ten-day period. The ten-day periods were considered for the months of April, May, June and July. We considered the HN, GHN II and THN distributions, since only these have closed-form expressions for the mean.
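The regression structure (14) enters the likelihood by replacing θ with θ_i = exp(β0 + β1 x_i) for each observation. The sketch below shows one way to code and fit this for the THN model with SciPy; it is an illustration under stated assumptions (illustrative function names, a bounded quasi-Newton optimizer, a simple starting value) and not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def thn_reg_negloglik(params, x, y):
    """Negative THN log-likelihood with log(theta_i) = b0 + b1 * x_i."""
    b0, b1, lam = params
    theta = np.exp(b0 + b1 * x)
    u = y / theta
    loglik = (np.sum(np.log(2.0) - np.log(theta) + norm.logpdf(u))
              + np.sum(np.log(1.0 + 3.0 * lam - 4.0 * lam * norm.cdf(u))))
    return -loglik


def thn_reg_start(y):
    """Starting point: constant Half-Normal scale, no covariate effect, lambda = 0."""
    return np.array([0.5 * np.log(np.mean(np.asarray(y, dtype=float) ** 2)), 0.0, 0.0])


def fit_thn_regression(x, y):
    """Fit (b0, b1, lambda) by maximum likelihood with |lambda| <= 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return minimize(thn_reg_negloglik, thn_reg_start(y), args=(x, y),
                    method="L-BFGS-B",
                    bounds=[(None, None), (None, None), (-1.0, 1.0)])
```

The returned object's `x` attribute holds the estimates of β0, β1 and λ. A negative β1 corresponds to a scale, and hence a mean, that decreases with the period index.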

Table 4 reports the parameter estimates and standard errors. Since $\hat{\beta}_1$ is negative for all distributions, the accumulated precipitation decreases as time passes. The empirical and estimated means (95% confidence intervals) are presented in Table 5 and Figure 6. Table 6 presents several criteria to discriminate between the HN, GHN II and THN distributions. From these results we can conclude that GHN II and THN provide a better fit than the HN distribution, and that THN has the lowest values of these criteria.

Table 4
Parameter estimates and 95% confidence intervals.

Table 5
Empirical and estimated means for each ten-day period.

Figure 6
The empirical and estimated means.

Table 6
Model selection measures.

9 CONCLUSION

In this paper, we present the THN distribution, formulated from the quadratic rank transmutation map proposed by Shaw & Buckley (2009). Some characteristics and mathematical properties of the proposed distribution are studied. It is important to note that the moment generating function, the moment of order k, the mean, variance, skewness and kurtosis have explicit analytic expressions, which depend only on the parameters of the THN distribution. Owing to the simplicity of the distribution, it was possible to calculate uncertainty measures such as the Shannon entropy and the mean deviations. The Bonferroni and Lorenz curves, as well as the reliability characteristic, are also presented, so that areas such as engineering can benefit from the proposed distribution. The distribution of the order statistics was obtained for the THN distribution, together with the densities of the maximum and the minimum. A Monte Carlo simulation study showed that the parameters are efficiently estimated by the maximum likelihood method, with low bias and low mean squared error even for small sample sizes, which indicates the potential of the new distribution for modeling. In the first application, using precipitation data compiled from daily series, 11 models proposed in the literature (derived from the HN distribution) were compared with the distribution proposed in this work. Five models were withdrawn from the analysis because of their inconsistent estimates and, according to the likelihood-based statistics −2×Log-Like, AIC, AICc and BIC, as well as KS, AD and CvM, the THN distribution showed a better fit than the competing models. Since the distribution has an explicit and simple expression for the mean, it was possible to use it in an application with a regression structure. According to the −2×Log-Like, AIC, BIC and SSR criteria, the THN distribution presented a good fit, reinforcing its advantage over the models considered here.

It is important to mention that during the peer review process the THN distribution was considered by Balaswamy (2018). Although both works propose the same distribution, we emphasize that our paper presents a more comprehensive account of the mathematical properties of the new distribution (survival, hazard rate and residual life functions and their properties; asymptotic behavior of the tails; moments, associated measures and moment generating function; differential entropy; mean deviations; Bonferroni and Lorenz curves; order statistics; extreme values; stress-strength reliability; characterizations by two truncated moments; characterizations by order statistics). In addition, we studied the bias and accuracy of the maximum likelihood estimators and illustrated the applicability of the THN distribution using two real data sets that have not been analyzed before in the literature.

References

  • 1
    ALJARRAH MA, LEE C & FAMOYE F. 2014. On generating T-X family of distributions using quantile functions. Journal of Statistical Distributions and Applications, 1(1): 1-17.
  • 2
    ALZAATREH A & KNIGHT K. 2013. On The Gamma-Half Normal Distribution and Its Applications, 12(1): 103-119.
  • 3
    ASGHARZADEH A, VALIOLLAHI R & RAQAB MZ. 2011. Stress-strength reliability of Weibull distribution based on progressively censored samples. SORT-Statistics and Operations Research Transactions, 35(2): 103-124.
  • 4
    BALASWAMY S. 2018. Transmuted Half Normal Distribution. International Journal of Scientific Research in Mathematical and Statistical Sciences, 5(4): 163-170.
  • 5
    BONFERRONI CE. 1930. Elementi di statistica generale. Libereria Seber Firenze.
  • 6
    CHOU CY & LIU HR. 1998. Properties of the Half-Normal distribution and its application to quality control. Journal of Industrial Technology, 14(3): 4-7.
  • 7
    COORAY K & ANANDA MMA. 2008. A generalization of the Half-Normal distribution with applications to lifetime data. Communications in Statistics - Theory and Methods, 37(9): 1323-1337.
  • 8
    CORDEIRO GM, ALIZADEH M, PESCIM RR & ORTEGA EMM. 2017. The odd log-logistic generalized Half-Normal lifetime distribution: Properties and applications. Communications in Statistics Theory and Methods, 46(9): 4195-4214.
  • 9
    CORDEIRO GM, PESCIM RR & ORTEGA EMM. 2012. The Kumaraswamy Generalized Half-Normal Distribution for Skewed Positive Data. Journal of Data Science, 10: 195-224.
  • 10
    CORDEIRO GM, PESCIM RR, ORTEGA EMM & DEMÉTRIO CGB. 2013. The Beta generalized Half-Normal distribution: New properties. Journal of Probability and Statistics, 2013: 1-18.
  • 11
    DANIEL C. 1959. Use of Half-Normal plots in interpreting factorial two-level experiments. Technometrics, 1(4): 311-341.
  • 12
    DAVID HA & NAGARAJA HN. 2003. Order statistics. 3rd ed.. Wiley Series in Probability and Statistics. John Wiley & Sons.
  • 13
    DE BRITO CCR, RÊGO LC & DE OLIVEIRA WR. 2015. Method for Generating Distributions and Classes of Probability Distributions: The Univariate Case. arXiv:1504.01062, pp. 1-50.
  • 14
    FELLER W. 1968. An introduction to probability theory and its applications. vol. 1. 3rd ed.. John Wiley & Sons.
  • 15
    GLÄNZEL W. 1987. A Characterization Theorem Based on Truncated Moments and its Application to Some Distribution Families. Dordrecht: Springer Netherlands.
  • 16
    GÓMEZ YM & BOLFARINE H. 2015. Likelihood-based inference for the power HalfNormal distribution. Journal of Statistical Theory and Applications, 14(4): 383-398.
  • 17
    GUPTA RD & KUNDU D. 2009. Introduction of Shape/Skewness Parameter(s) in a Probability Distribution. Journal of Applied Statistical Science, 7(2): 153-171.
  • 18
    HAMEDANI GG, CORDEIRO GM, LIMA MCS & NASCIMENTO ADC. 2017. Some Extended Classes of Distributions: Characterizations and Properties. Pakistan Journal of Statistics and Operation Research, 13(4): 893-908.
  • 19
    JOHNSON NL, KOTZ S & BALAKRISHNAN N. 1994. Continuous univariate distributions. Vol. 1. Second ed.. New York: John Wiley & Sons Inc.
  • 20
    LAI DC. 2011. Constructions and applications of lifetime distributions. Applied Stochastic Models in Business and Industry, 29(2): 127-140.
  • 21
    LEADBETTER MR, LINDGREN G & ROOTZÉ NH. 1987. Extremes and related properties of random sequences and processes. 1st ed.. Springer Science & Business Media.
  • 22
    LEE C, FAMOYE F & ALZAATREH AY. 2013. Methods for generating families of univariate continuous distributions in the recent decades. Wiley Interdisciplinary Reviews: Computational Statistics, 5(3): 219-238.
  • 23
    LEONE FC, NELSON LS & NOTTINGHAM RB. 1961. The Folded Normal Distribution. Technometrics, 3(4): 543-550.
  • 24
    NADARAJAH S & KOTZ S. 2006a. The beta exponential distribution. Reliability Engineering & System Safety, 91(6): 689-697.
  • 25
    NADARAJAH S & KOTZ S. 2006b. R Programs for Truncated Distributions. Journal of Statistical Software, 16(1): 1-8.
  • 26
    NADARAJAH S & ROCHA R. 2016. Newdistns: An R Package for New Families of Distributions. Journal of Statistical Software, 69(1): 1-32.
  • 27
    NAKAGAMI N. 1960. The m-distribution a general formulation of intensity distribution of rapid fading. Proc. Symp. Statist. Methods Radio Wave Propag.
  • 28
    OLMOS NM, VARELA H, BOLFARINE H & GÓMEZ HW. 2014. An extension of the generalized Half-Normal distribution. Statistical Papers, 55(4): 967-981.
  • 29
    PESCIM RR, DEMÉTRIO CGB, CORDEIRO GM, ORTEGA EMM & URBANO MR. 2010. The Beta generalized Half-Normal distribution. Computational Statistics & Data Analysis, 54(4): 945-957.
  • 30
    PESCIM RR, ORTEGA EMM, CORDEIRO GM & DEMÉTRIO CGB. 2013. The Log-Beta Generalized Half-Normal Regression Model. Journal of Statistical Theory and Applications, 12(4): 330-347.
  • 31
    PEWSEY A. 2002. Large-sample inference for the general Half-Normal distribution. Communications in Statistics Theory and Methods, 31(7): 1045-1054.
  • 32
    PEWSEY A. 2004. Improved likelihood based inference for the general Half-Normal distribution. Communications in Statistics Theory and Methods, 33(2): 197-204.
  • 33
    RAMIRES TG, ORTEGA EMM, CORDEIRO GM & HAMEDANI GG. 2013. The Beta generalized Half-Normal geometric distribution. Studia Scientiarum Mathematicarum Hungarica, 50(4): 523-554.
  • 34
    SABOOR A, KHAN MN, CORDEIRO GM, PASCOA MA, RAMOS PL & KAMAL M. 2019. Some new results for the transmuted generalized gamma distribution. Journal of Computational and Applied Mathematics, 352: 165-180.
  • 35
    SANCHEZ JJD, FREITAS WWL & CORDEIRO GM. 2016. The extended generalized Half-Normal distribution. Brazilian Journal of Probability and Statistics, 30: 366-384.
  • 36
    SAS. 2010. The NLMIXED Procedure, SAS/STAT® User’s Guide, Version 9.22. Cary, NC: SAS Institute Inc.
  • 37
    SHANNON CE. 1951. Prediction and entropy of printed English. Bell Labs Technical Journal, 30(1): 50-64.
  • 38
    SHAW W & BUCKLEY IRC. 2009. The alchemy of probability distributions: Beyond Gram-Charlier expansions, and a skew-kurtotic-normal distribution from a rank transmutation map. arXiv:0901.0434v1 [q-fin.ST], pp. 1-8.
  • 39
    SINGH R. 1994. Simulation of Observations for the Half-Normal Distribution. Sankhyã: The Indian Journal of Statistics, Series B, 56(2): 137-139.
  • 40
    SINGH R. 1995. Editorial Note on the Paper “Simulation of observations for the HalfNormal distribution”. Sankhyã: The Indian Journal of Statistics, Series B, 57(3): 461- 461.
  • 41
    STACY EW. 1962. A generalization of the Gamma distribution. The Annals of Mathematical Statistics, 33(3): 1187-1192.
  • 42
    TAHIR MH & CORDEIRO GM. 2016. Compounding of distributions: A survey and new generalized classes. Journal of Statistical Distributions and Applications, 3(1): 3-13.
  • 43
    TAHIR MH & NADARAJAH S. 2015. Parameter induction in continuous univariate distributions: Well-established G families. Journal of Probability and Statistics, 87(2): 539-568.
  • 44
    TSAGRIS M, BENEKI C & HASSANI H. 2014. On the Folded Normal Distribution. Mathematics, 2(1): 12-28.
  • 45
    VODÃ VG. 1976a. Inferential procedures on a generalized Rayleigh variate I. Aplikace Matematiky, 21(6): 395-412.
  • 46
    VODÃ VG. 1976b. Inferential procedures on a generalized Rayleigh variate II. Aplikace Matematiky, 21(6): 413-419.
  • 47
    YOUSOF HM, ALIZADEH M, JAHANSHAHI SMA, RAMIRES TG, GHOSH I & HAMEDANI G. 2017. The Transmuted Topp-Leone G Family of Distributions: Theory, Characterizations and Applications. Journal of Data Science, 15(4): 723-740.

Publication Dates

  • Publication in this collection
    10 Aug 2020
  • Date of issue
    2020

History

  • Received
    23 Nov 2018
  • Accepted
    28 Mar 2020
Sociedade Brasileira de Pesquisa Operacional Rua Mayrink Veiga, 32 - sala 601 - Centro, 20090-050 Rio de Janeiro RJ - Brasil, Tel.: +55 21 2263-0499, Fax: +55 21 2263-0501 - Rio de Janeiro - RJ - Brazil
E-mail: sobrapo@sobrapo.org.br