Statistical hybridization of normal and Weibull distributions with its properties and applications

The normal distribution is one of the most popular probability distributions with applications to real-life data. In this research paper, an extension of this distribution based on the Weibull distribution, called the Weimal distribution, is proposed; it is believed to provide greater flexibility for modelling skewed data. The probability density function and cumulative distribution function of the new distribution can be represented as linear combinations of exponentiated normal density and distribution functions, respectively. Analytical expressions for some mathematical quantities, comprising moments, the moment generating function, the characteristic function and order statistics, are presented. The parameters of the proposed distribution are estimated by the method of maximum likelihood. Two data sets were used for illustration and performance evaluation of the proposed model. The results of the comparative analysis against other baseline models show that the proposed distribution would be more appropriate when dealing with skewed data.


Introduction
One of the main motivations for studying new families of statistical distributions lies in the increased flexibility of fitting various datasets that cannot be properly fitted by existing distributions [1]. In many applied areas, such as environmental and medical sciences, engineering, biological studies, lifetime analysis, actuarial science, economics, finance and insurance, there is a clear need for extended forms of these distributions [2,3]. The normal distribution is one of the most popular probability models, with wide applications in solving real-life problems. When the number of observations is large, it can serve as an approximation to other probability models [4,5].
The normal distribution is also called the Gaussian distribution, named after the German mathematician Carl Friedrich Gauss (1777-1855), who introduced it in connection with the theory of errors [6]. The probability density function (pdf) and the cumulative distribution function (cdf) of the normal distribution with location parameter −∞<μ<∞ and scale parameter σ>0 are given by [7,8]:
$$f(x;\mu,\sigma)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\} \tag{1}$$
and
$$F(x;\mu,\sigma)=\Phi\left(\frac{x-\mu}{\sigma}\right)=\int_{-\infty}^{x}\frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{(t-\mu)^2}{2\sigma^2}\right\}dt \tag{2}$$
where Φ denotes the standard normal cdf.
Attempts to generalize the normal distribution have led to the development of the Skewed Normal distribution [9], the Beta-Normal distribution [10], the Generalized Normal distribution [11], the Kumaraswamy-Normal distribution [12], the McDonald-Normal distribution [13], the modified Beta-Normal distribution [14], the Gamma-Normal distribution [15], the modified Gamma-Normal distribution [16], the Kummer Beta-Normal distribution [17], and a host of others. These distributions have been shown to be more flexible than the classical normal distribution when applied to real-life datasets [18].
It has been argued that a four-parameter distribution should be sufficient for most practical purposes and that at least three parameters are needed in such distributions [34]; the same source doubted that including a fifth or sixth parameter would bring any noticeable improvement. The Weibull-G family of distributions has been adopted by several researchers to generate theoretical models such as the Weibull-Exponential distribution [35], the Weibull-Rayleigh distribution [25], and the Weibull-Frechet distribution [26]. This research paper proposes a probability model called the Weibull-Normal distribution, also to be known as the Weimal distribution, which results from hybridizing the Weibull and normal distributions by means of the Weibull-G family generator [21].

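The Weibull-Generalized (Weibull-G) Family of Distributions
A Weibull-generalized family of distributions, according to Bourguignon et al. [21], has a cdf F(x;α,β,ξ) defined by:
$$F(x;\alpha,\beta,\xi)=\int_{0}^{\frac{G(x;\xi)}{1-G(x;\xi)}}\alpha\beta\,t^{\beta-1}e^{-\alpha t^{\beta}}\,dt \tag{3}$$
The above integral yields:
$$F(x;\alpha,\beta,\xi)=1-\exp\left\{-\alpha\left[\frac{G(x;\xi)}{1-G(x;\xi)}\right]^{\beta}\right\} \tag{4}$$
The corresponding pdf is given by:
$$f(x;\alpha,\beta,\xi)=\alpha\beta\,g(x;\xi)\,\frac{G(x;\xi)^{\beta-1}}{[1-G(x;\xi)]^{\beta+1}}\exp\left\{-\alpha\left[\frac{G(x;\xi)}{1-G(x;\xi)}\right]^{\beta}\right\} \tag{5}$$
where g(x;ξ) and G(x;ξ) are the respective pdf and cdf of the baseline distribution indexed by the parameter vector ξ, and α>0 and β>0 are the scale and shape parameters respectively.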
Eqn. (5) gives the pdf of any member of the Weibull-G family of distributions and is most tractable when both the baseline cdf and pdf have simple analytic expressions. The major benefit of the Weibull generator expressed in eqn. (3) lies in its ability to offer more flexibility in the extremes of the pdf, which makes it more suitable for analyzing data with a high degree of asymmetry.
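As a minimal sketch of how the generator acts on a baseline, the Python functions below wrap an arbitrary baseline cdf G and pdf g. The function names are illustrative only, and the formulas assume the standard Weibull-G form of Bourguignon et al. [21], F = 1 − exp{−α[G/(1−G)]^β}; only the Python standard library is used.

```python
import math
from statistics import NormalDist


def weibull_g_cdf(x, alpha, beta, base_cdf):
    """Weibull-G cdf: 1 - exp(-alpha * (G / (1 - G))**beta) (assumed form)."""
    G = base_cdf(x)
    if G <= 0.0:
        return 0.0
    if G >= 1.0:
        return 1.0
    odds = G / (1.0 - G)  # baseline odds ratio G / (1 - G)
    return 1.0 - math.exp(-alpha * odds ** beta)


def weibull_g_pdf(x, alpha, beta, base_cdf, base_pdf):
    """Weibull-G pdf, the derivative of weibull_g_cdf with respect to x."""
    G = base_cdf(x)
    if G <= 0.0 or G >= 1.0:
        return 0.0
    odds = G / (1.0 - G)
    return (alpha * beta * base_pdf(x) * G ** (beta - 1.0)
            / (1.0 - G) ** (beta + 1.0)
            * math.exp(-alpha * odds ** beta))


if __name__ == "__main__":
    std = NormalDist()  # standard normal baseline, chosen for illustration
    print(weibull_g_cdf(0.0, 1.0, 1.0, std.cdf))
    print(weibull_g_pdf(0.0, 1.0, 1.0, std.cdf, std.pdf))
```

With a standard normal baseline and α=β=1, the baseline odds at x=0 equal one, so the cdf there is 1 − e^{−1}. Passing the baseline as a pair of callables keeps the sketch independent of any particular baseline distribution.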

The Weibull-Normal distribution
Taking the pdf and cdf of the normal distribution given in eqns. (1) and (2), with location parameter μ∈ℝ and dispersion parameter σ>0, as the baseline, the respective cdf and pdf of the proposed four-parameter Weibull-Normal distribution are obtained from eqns. (4) and (5) as follows:
$$F(x;\alpha,\beta,\mu,\sigma)=1-\exp\left\{-\alpha\left[\frac{\Phi(z)}{1-\Phi(z)}\right]^{\beta}\right\} \tag{6}$$
and
$$f(x;\alpha,\beta,\mu,\sigma)=\frac{\alpha\beta}{\sigma}\,\phi(z)\,\frac{\Phi(z)^{\beta-1}}{[1-\Phi(z)]^{\beta+1}}\exp\left\{-\alpha\left[\frac{\Phi(z)}{1-\Phi(z)}\right]^{\beta}\right\} \tag{7}$$
where z=(x−μ)/σ, and φ and Φ denote the standard normal pdf and cdf.
The pdf and cdf of the new Weibull-Normal distribution were plotted for selected parameter values. The cdf increases as x increases and approaches one as x gets larger. The pdf plots under different parameter values indicate that the Weimal distribution is negatively skewed, and hence it should be appropriate for modelling skewed real-life datasets, unlike the symmetric normal distribution. It is worth noting that setting α=β=1 in eqn. (6) gives F(x)=1−exp{−Φ(z)/[1−Φ(z)]}, which simplifies the Weimal cdf but does not coincide with the normal cdf Φ(z), and that all parameter values affect the graph of the pdf in different directions and at different rates.
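The cdf and pdf described above can be sketched in Python as follows. This is an illustrative implementation under the assumption that the Weimal cdf is 1 − exp{−α[Φ(z)/(1−Φ(z))]^β} with z=(x−μ)/σ; the function names, and the quantile function obtained by inverting that cdf, are not part of the original paper.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal baseline


def weimal_cdf(x, alpha, beta, mu=0.0, sigma=1.0):
    """Assumed Weimal cdf: 1 - exp(-alpha * odds**beta)."""
    p = _STD.cdf((x - mu) / sigma)
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return 1.0 - math.exp(-alpha * (p / (1.0 - p)) ** beta)


def weimal_pdf(x, alpha, beta, mu=0.0, sigma=1.0):
    """Assumed Weimal pdf, the derivative of weimal_cdf in x."""
    z = (x - mu) / sigma
    p = _STD.cdf(z)
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return (alpha * beta / sigma * _STD.pdf(z) * p ** (beta - 1.0)
            / (1.0 - p) ** (beta + 1.0)
            * math.exp(-alpha * (p / (1.0 - p)) ** beta))


def weimal_quantile(q, alpha, beta, mu=0.0, sigma=1.0):
    """Invert the cdf for 0 < q < 1: solve for the odds, then for x."""
    odds = (-math.log(1.0 - q) / alpha) ** (1.0 / beta)
    return mu + sigma * _STD.inv_cdf(odds / (1.0 + odds))


if __name__ == "__main__":
    for q in (0.05, 0.25, 0.5, 0.75, 0.95):
        print(q, weimal_quantile(q, alpha=1.0, beta=1.0))
```

Printing a few quantiles for chosen parameter values is a quick way to see the asymmetry of the distribution around its median without plotting.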

Useful extensions
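Extensions of eqns. (6) and (7) can be derived using the concept of exponentiated distributions as follows. Consider the exponentiated normal (EN) distribution as stated by Nadarajah and Kotz [6] with power parameter a>0, denoted by Y∼EN(a,μ,σ), with cdf and pdf given respectively by:
$$H_a(x)=\Phi(z)^{a}$$
and
$$h_a(x)=\frac{a}{\sigma}\,\phi(z)\,\Phi(z)^{a-1}$$
where z=(x−μ)/σ.
By expanding the exponential term in eqn. (7) as a power series and applying the generalized binomial theorem to the resulting negative power of 1−Φ(z), we can rewrite eqn. (7) as:
$$f(x)=\frac{\alpha\beta}{\sigma}\,\phi(z)\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\frac{(-1)^{k}\alpha^{k}\,\Gamma(\beta(k+1)+j+1)}{k!\,j!\,\Gamma(\beta(k+1)+1)}\,\Phi(z)^{\beta(k+1)+j-1} \tag{8}$$
Using the result in eqn. (8), we can now express the pdf of the Weimal distribution as a linear combination of exponentiated normal density functions:
$$f(x)=\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}w_{j,k}\,h_{\beta(k+1)+j}(x) \tag{9}$$
where
$$h_{\beta(k+1)+j}(x)=\frac{\beta(k+1)+j}{\sigma}\,\phi(z)\,\Phi(z)^{\beta(k+1)+j-1} \tag{10}$$
denotes the exponentiated normal pdf of X∼EN(β(k+1)+j,μ,σ), and the coefficient w_{j,k} is given by:
$$w_{j,k}=\frac{(-1)^{k}\,\beta\,\alpha^{k+1}\,\Gamma(\beta(k+1)+j+1)}{k!\,j!\,[\beta(k+1)+j]\,\Gamma(\beta(k+1)+1)} \tag{11}$$
By integrating the pdf in eqn. (9) with respect to x, we obtain the corresponding cdf as:
$$F(x)=\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}w_{j,k}\,\Phi(z)^{\beta(k+1)+j} \tag{12}$$
If β>0 is a real number (positive non-integer), we can expand the last term in eqn. (12) as:
$$\Phi(z)^{\beta(k+1)+j}=\sum_{r=0}^{\infty}s_r(\beta(k+1)+j)\,\Phi(z)^{r} \tag{13}$$
where
$$s_r(a)=\sum_{l=r}^{\infty}(-1)^{r+l}\binom{a}{l}\binom{l}{r}$$
Combining eqns. (12) and (13), the Weimal cdf can be expressed as:
$$F(x)=\sum_{r=0}^{\infty}b_r\,\Phi(z)^{r} \tag{14}$$
By differentiating eqn. (14) and changing indices, we obtain the pdf of the Weimal distribution as:
$$f(x)=\sum_{r=0}^{\infty}b_{r+1}\,h_{r+1}(x) \tag{15}$$
where
$$b_r=\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}w_{j,k}\,s_r(\beta(k+1)+j)$$
and h_{r+1}(x) is the EN pdf with power parameter r+1.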

Ordinary moments
Moments are used to study some of the most important features and characteristics of a random variable such as mean (central tendency measure), variance (dispersion measure), skewness (Sk) and kurtosis (ku).
The nth moment of X can be obtained as:
$$\mu'_n=E(X^{n})=\int_{-\infty}^{\infty}x^{n}f(x)\,dx \tag{16}$$
Substituting the series representation of f(x), written in terms of φ(x) and powers of Φ(x), into eqn. (16), using the binomial expansion and simplifying gives eqn. (17), in which the moment is a weighted sum of the quantities τ(n,p). Here τ(n,p) represents the (n,p)th probability weighted moment (PWM) of the standard normal distribution for any positive integers n and p:
$$\tau(n,p)=\int_{-\infty}^{\infty}x^{n}\,\Phi(x)^{p}\,\phi(x)\,dx$$
According to Nadarajah [27], this PWM can be written in terms of the Lauricella function of type A [28], as in eqn. (19); using these definitions in eqns. (19) and (18) leads to eqn. (21), given that p+n is even.
Combining eqns. (17) and (21), the nth moment of the standard Weimal distribution can be expressed in terms of the Lauricella function of type A [6,8,9], which gives eqn. (22).

The central moments
The nth central moment, or moment about the mean, of X, say μ_n, can be obtained as:
$$\mu_n=E\left[(X-\mu'_1)^{n}\right]=\sum_{i=0}^{n}\binom{n}{i}(-\mu'_1)^{n-i}\,\mu'_i \tag{23}$$
where μ'_i=E(X^i) denotes the ith ordinary moment. The variance of X is the central moment of order two (n=2). For n=1, eqn. (22) gives the mean of the standard Weimal distribution, and for n=2 it gives the second moment. Hence the variance of X∼Weimal(α,β,0,1), which is the second central moment of X, is obtained from eqns. (26) and (27) as:
$$\sigma^{2}=\mu'_2-(\mu'_1)^{2} \tag{28}$$
The coefficient of skewness is the standardized third central moment of X about the mean and can be obtained using the expression:
$$Sk=\frac{\mu_3}{\sigma^{3}}$$
whereas the coefficient of kurtosis is the standardized fourth central moment of X about the mean and is given by:
$$Ku=\frac{\mu_4}{\sigma^{4}}$$
where σ can be obtained using eqn. (28), while μ_3 and μ_4 are obtained using eqn. (23).
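As a numerical cross-check on the moment formulas, the mean, variance, skewness and kurtosis of the standard Weimal distribution can also be approximated by direct quadrature. The sketch below is not the Lauricella-function series of the paper: it swaps in a plain trapezoidal rule, assumes the Weibull-G form of the density, and uses illustrative function names and integration limits.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal baseline


def std_weimal_pdf(x, alpha, beta):
    """Standard Weimal pdf (mu = 0, sigma = 1), assumed Weibull-G form."""
    p = _STD.cdf(x)
    if p <= 0.0 or p >= 1.0:
        return 0.0
    odds = p / (1.0 - p)
    return (alpha * beta * _STD.pdf(x) * p ** (beta - 1.0)
            / (1.0 - p) ** (beta + 1.0)
            * math.exp(-alpha * odds ** beta))


def raw_moment(n, alpha, beta, lo=-12.0, hi=12.0, steps=24000):
    """Trapezoidal approximation of E(X^n) on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x ** n * std_weimal_pdf(x, alpha, beta)
    return total * h


def summary(alpha, beta):
    """Mean, variance, skewness and kurtosis from the raw moments."""
    m1, m2, m3, m4 = (raw_moment(n, alpha, beta) for n in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return {"mean": m1, "variance": var,
            "skewness": mu3 / var ** 1.5, "kurtosis": mu4 / var ** 2}


if __name__ == "__main__":
    print(summary(alpha=1.0, beta=1.0))
```

The central moments are built from the raw moments with the binomial identity for moments about the mean, so the same routine can be reused to check any closed-form expression for a given (α, β).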

Moment generating function (mgf)
A general way of organizing all the moments into one mathematical object is the moment generating function (mgf), M_X(t)=E(e^{tX}). In other words, the mgf generates the moments of X by differentiation: for any positive integer k, the kth derivative of M_X(t) evaluated at t=0 is the kth ordinary moment E(X^k) of X.
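The differentiation property can be illustrated numerically. The sketch below approximates M_X(t)=E(e^{tX}) for the standard Weimal distribution by a trapezoidal rule and recovers the first two moments from central finite differences at t=0. The density is the assumed Weibull-G form, and the function names, integration limits and step size are illustrative choices rather than part of the paper.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal baseline


def std_weimal_pdf(x, alpha, beta):
    """Standard Weimal pdf (mu = 0, sigma = 1), assumed Weibull-G form."""
    p = _STD.cdf(x)
    if p <= 0.0 or p >= 1.0:
        return 0.0
    odds = p / (1.0 - p)
    return (alpha * beta * _STD.pdf(x) * p ** (beta - 1.0)
            / (1.0 - p) ** (beta + 1.0)
            * math.exp(-alpha * odds ** beta))


def weimal_mgf(t, alpha, beta, lo=-12.0, hi=12.0, steps=24000):
    """Trapezoidal approximation of E(exp(t X)) on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(t * x) * std_weimal_pdf(x, alpha, beta)
    return total * h


def mgf_derivative_at_zero(k, alpha, beta, step=1e-2):
    """First (k=1) or second (k=2) derivative of the mgf at t = 0."""
    plus = weimal_mgf(step, alpha, beta)
    minus = weimal_mgf(-step, alpha, beta)
    if k == 1:
        return (plus - minus) / (2.0 * step)
    if k == 2:
        return (plus - 2.0 * weimal_mgf(0.0, alpha, beta) + minus) / step ** 2
    raise ValueError("only k = 1 or k = 2 is sketched here")


if __name__ == "__main__":
    print(mgf_derivative_at_zero(1, 1.0, 1.0))  # approximates E(X)
    print(mgf_derivative_at_zero(2, 1.0, 1.0))  # approximates E(X^2)
```

Finite differences are used only to keep the sketch short; they lose accuracy quickly for higher derivatives, where direct integration of x^k f(x) is the safer route.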

Characteristic function (cf)
The characteristic function (cf) has many useful and important properties which give it a central role in statistical theory. It is particularly useful in the analysis of linear combinations of random variables.
A representation for the cf is given by:
$$\varphi_X(t)=E\left(e^{itX}\right)=\int_{-\infty}^{\infty}e^{itx}f(x)\,dx$$
where i=√−1. Simple algebra and a power series expansion of the exponential show that:
$$\varphi_X(t)=\sum_{n=0}^{\infty}\frac{(it)^{n}}{n!}\,E(X^{n})$$

Order statistics
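Order statistics have been used in a wide range of problems, including robust statistical estimation, detection of outliers, characterization of probability distributions, goodness-of-fit tests, entropy estimation, analysis of censored samples, reliability analysis, quality control and even research on the strength of materials. In this section, a closed-form expression for the pdf of the ith order statistic of the Weibull-Normal (that is, Weimal) distribution is derived.
Suppose X_1, X_2, …, X_n is a random sample from the standard Weimal distribution and let X_{1:n}, X_{2:n}, …, X_{n:n} denote the corresponding order statistics obtained from this sample. The pdf, f_{i:n}(x), of the ith order statistic can be obtained by:
$$f_{i:n}(x)=\frac{n!}{(i-1)!\,(n-i)!}\,f(x)\,F(x)^{i-1}\,[1-F(x)]^{n-i} \tag{37}$$
Using eqns. (6) and (7) with μ=0 and σ=1, the pdf of the ith order statistic X_{i:n} can be expressed from eqn. (37) as:
$$f_{i:n}(x)=\frac{n!}{(i-1)!\,(n-i)!}\,\alpha\beta\,\phi(x)\,\frac{\Phi(x)^{\beta-1}}{[1-\Phi(x)]^{\beta+1}}\exp\left\{-\alpha(n-i+1)\left[\frac{\Phi(x)}{1-\Phi(x)}\right]^{\beta}\right\}\left[1-\exp\left\{-\alpha\left[\frac{\Phi(x)}{1-\Phi(x)}\right]^{\beta}\right\}\right]^{i-1}$$
Setting i=1 and i=n gives the densities of the sample minimum and maximum respectively.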
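The general order-statistic density f_{i:n}(x) = n!/((i−1)!(n−i)!) f(x) F(x)^{i−1} [1−F(x)]^{n−i} can be evaluated directly for the standard Weimal distribution. The sketch below assumes the Weibull-G form of the Weimal cdf, so that the survival function 1−F(x) equals exp{−α[Φ(x)/(1−Φ(x))]^β}; the function name is illustrative.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal baseline


def weimal_order_stat_pdf(x, i, n, alpha, beta):
    """pdf of the i-th order statistic in a sample of size n (standard Weimal)."""
    if not 1 <= i <= n:
        raise ValueError("need 1 <= i <= n")
    p = _STD.cdf(x)
    if p <= 0.0 or p >= 1.0:
        return 0.0
    odds = p / (1.0 - p)
    surv = math.exp(-alpha * odds ** beta)  # 1 - F(x), computed directly
    dens = (alpha * beta * _STD.pdf(x) * p ** (beta - 1.0)
            / (1.0 - p) ** (beta + 1.0) * surv)  # parent pdf f(x)
    coef = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    return coef * dens * (1.0 - surv) ** (i - 1) * surv ** (n - i)


if __name__ == "__main__":
    # density of the minimum, median and maximum of a sample of five at x = 0
    for i in (1, 3, 5):
        print(i, weimal_order_stat_pdf(0.0, i, 5, alpha=1.0, beta=1.0))
```

Computing the survival function directly, instead of as one minus the cdf, avoids cancellation in the right tail where F(x) is very close to one.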

Estimation of parameters of Weimal distribution
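The estimation of the parameters of the Weimal distribution by the method of maximum likelihood is presented in this section. Let X_1, X_2, …, X_n be a random sample from the Weimal distribution with unknown parameter vector θ=(α,β,μ,σ)^T. Writing z_i=(x_i−μ)/σ and u_i=Φ(z_i)/[1−Φ(z_i)], the total log-likelihood function for θ is obtained from f(x) as follows:
$$l(\theta)=n\log\alpha+n\log\beta-n\log\sigma+\sum_{i=1}^{n}\log\phi(z_i)+(\beta-1)\sum_{i=1}^{n}\log\Phi(z_i)-(\beta+1)\sum_{i=1}^{n}\log[1-\Phi(z_i)]-\alpha\sum_{i=1}^{n}u_i^{\beta} \tag{42}$$
Differentiating l(θ) partially with respect to each of the parameters α, β, μ and σ and setting the results to zero gives the maximum likelihood estimates of the respective parameters. The components of the score vector U(θ)=(U_α,U_β,U_μ,U_σ)^T are:
$$U_\alpha=\frac{n}{\alpha}-\sum_{i=1}^{n}u_i^{\beta}$$
$$U_\beta=\frac{n}{\beta}+\sum_{i=1}^{n}\log u_i-\alpha\sum_{i=1}^{n}u_i^{\beta}\log u_i$$
$$U_\mu=\frac{1}{\sigma}\sum_{i=1}^{n}\left\{z_i-(\beta-1)\frac{\phi(z_i)}{\Phi(z_i)}-(\beta+1)\frac{\phi(z_i)}{1-\Phi(z_i)}+\alpha\beta\,\frac{\phi(z_i)\,\Phi(z_i)^{\beta-1}}{[1-\Phi(z_i)]^{\beta+1}}\right\}$$
$$U_\sigma=-\frac{n}{\sigma}+\frac{1}{\sigma}\sum_{i=1}^{n}z_i\left\{z_i-(\beta-1)\frac{\phi(z_i)}{\Phi(z_i)}-(\beta+1)\frac{\phi(z_i)}{1-\Phi(z_i)}+\alpha\beta\,\frac{\phi(z_i)\,\Phi(z_i)^{\beta-1}}{[1-\Phi(z_i)]^{\beta+1}}\right\}$$
Maximization of eqn. (42) can be performed using well-established routines such as nlm or optim in the R statistical package. Setting these equations to zero (that is, U(θ)=0) and solving them simultaneously yields the maximum likelihood estimate θ̂ of θ. These equations cannot be solved analytically, and therefore statistical software is used to solve them numerically by means of iterative techniques such as the Newton-Raphson method.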
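One step of this estimation problem does have a closed form: with β, μ and σ held fixed, setting the α-component of the score to zero gives α̂ = n/Σu_i^β, where u_i = Φ(z_i)/[1−Φ(z_i)]. The sketch below illustrates the log-likelihood and this profile step on simulated data. It is not the fitting procedure of the paper; it assumes the Weibull-G form of the Weimal density, the function names are hypothetical, and a full fit of all four parameters would still need a numerical optimizer.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal baseline


def simulate_weimal(n, alpha, beta, mu, sigma, seed=0):
    """Draw n values by inversion: odds**beta is Exponential with rate alpha."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        odds = rng.expovariate(alpha) ** (1.0 / beta)
        sample.append(mu + sigma * _STD.inv_cdf(odds / (1.0 + odds)))
    return sample


def weimal_loglik(data, alpha, beta, mu, sigma):
    """Log-likelihood under the assumed Weimal density."""
    n = len(data)
    total = n * (math.log(alpha) + math.log(beta) - math.log(sigma))
    for x in data:
        z = (x - mu) / sigma
        p = _STD.cdf(z)
        total += (-0.5 * z * z - 0.5 * math.log(2.0 * math.pi)  # log phi(z)
                  + (beta - 1.0) * math.log(p)
                  - (beta + 1.0) * math.log(1.0 - p)
                  - alpha * (p / (1.0 - p)) ** beta)
    return total


def profile_alpha(data, beta, mu, sigma):
    """Closed-form maximizer in alpha for fixed beta, mu and sigma."""
    s = 0.0
    for x in data:
        p = _STD.cdf((x - mu) / sigma)
        s += (p / (1.0 - p)) ** beta
    return len(data) / s


if __name__ == "__main__":
    data = simulate_weimal(500, alpha=1.5, beta=2.0, mu=0.0, sigma=1.0, seed=1)
    a_hat = profile_alpha(data, beta=2.0, mu=0.0, sigma=1.0)
    print(a_hat, weimal_loglik(data, a_hat, 2.0, 0.0, 1.0))
```

Profiling out α in this way reduces the numerical search to three parameters, which can make iterative routines such as Newton-Raphson easier to start.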

Conclusion
In this research article, a new four-parameter probability model named the Weibull-Normal distribution (also to be known as the Weimal distribution), resulting from the hybridization of two well-known probability models, namely the Weibull distribution and the normal distribution, is introduced. The new probability model extends the classical normal distribution by adding skewness to it. An obvious reason for generalizing a classical distribution is that the generalization provides more flexibility for analyzing real-life data. The new distribution has proved to be versatile and analytically tractable. The Weimal density function can be expressed as a linear combination of exponentiated normal density functions, thereby enabling the derivation of vital mathematical properties comprising moments, the moment generating function, the characteristic function and order statistics. The estimation of the parameters has been approached by the method of maximum likelihood.