An extension of Rayleigh distribution and applications

Abstract: In this article, we derive a new distribution, named the Rayleigh-Rayleigh distribution (RRD), motivated by the Transformed-Transformer technique of Alzaatreh, Lee, and Famoye (2013). The statistical properties of RRD, comprising explicit expressions for the quantile function, moments, moment generating function, mean deviation, skewness, kurtosis, reliability measures, measures of uncertainty, distributions of order statistics and L-moments, are derived. Parameters are estimated by the method of maximum likelihood, and the Fisher information matrix is derived. The flexibility of the new distribution is assessed by applying it to four real data sets. A comparison of RRD with the Rayleigh, Generalized Rayleigh, Exponentiated Rayleigh, Weibull Rayleigh and Alpha Power Rayleigh distributions provides evidence that it outperforms these competing distributions on the selected data sets.


PUBLIC INTEREST STATEMENT
One of the basic tasks of statistics is to model real-life phenomena in the form of statistical distributions. Adding parameters to an existing distribution and combining two or more existing distributions are two common techniques for generalizing a distribution. These generalized distributions accommodate the changing circumstances and complexities of real life. In this study, we extend the Rayleigh distribution by adding the parameter of another Rayleigh distribution. The resulting distribution has two parameters and is called the Rayleigh-Rayleigh distribution (RRD). A simulation study revealed that the estimators of the parameters are asymptotically unbiased and efficient. The application of RRD to four real data sets showed that RRD is more flexible than some other lifetime distributions. It is hoped that the proposed distribution will attract the attention of researchers in the fields of life sciences, physical sciences and social sciences.

Introduction
Uncertainty and risk are two main realities of real-life phenomena. Probability theory is used to handle these uncertainties and to model such phenomena. Because of the variation, complexity and diversity of real life, a large number of statistical distributions have been derived. Still, there are many important problems in which real-life data do not follow any standard probability distribution. This leads to extensions and to the development of generalized statistical distributions.
In the literature, numerous generalized distributions have been developed, with the common feature of having more parameters. Introducing additional parameters into an existing distribution improves its goodness of fit and makes its tail behavior more flexible.
The Rayleigh distribution (RD), introduced by Lord Rayleigh in 1880, plays a crucial role in modelling and analyzing lifetime data, with uses in project effort loading models, life testing experiments, reliability analysis, communication theory, physical sciences, engineering, medical imaging science, applied statistics and clinical studies. Let the random variable X have an RD with scale parameter β, i.e. X ~ RD(x; β). Then the probability density function (PDF) and the cumulative distribution function (CDF) of RD are defined as
$$f(x) = \frac{x}{\beta^{2}} \exp\left(-\frac{x^{2}}{2\beta^{2}}\right), \quad x > 0,\ \beta > 0, \tag{1}$$
$$F(x) = 1 - \exp\left(-\frac{x^{2}}{2\beta^{2}}\right), \quad x > 0. \tag{2}$$
Due to the importance of the Rayleigh distribution in a variety of fields, a wide range of extensions of it has been established. Kundu and Raqab (2005) proposed the generalized RD and estimated its unknown parameters by different estimation methods. Abd Elfattah, Hassan, and Ziedan (2006) studied the estimation of the unknown parameter of RD under different censored sampling schemes. Voda (2007) proposed a new generalization of RD using the conservability approach. In this technique the PDF of the generalized distribution is obtained as $g(y) = y f(y) / E(y)$, where f(y) and E(y) are the PDF and the finite mean of a positive continuous random variable. Dey (2009) derived the Bayes estimators for the parameter of RD using squared error and LINEX loss functions. Merovci (2013) developed the transmuted RD using the quadratic rank transmutation technique. Merovci (2014) proposed the transmuted generalized RD and described its mathematical properties. Merovci and Elbatal (2015) studied the Weibull RD. Mahmoud and Ghazal (2017) discussed parameter estimation for the exponentiated Rayleigh distribution based on type II censored data.
In this article we derive a generalization of the RD, named the Rayleigh-Rayleigh distribution (RRD), using the Transformed-Transformer technique proposed by Alzaatreh, Lee, and Famoye (2013). The main motivation of this study is to generate a new distribution by adding a scale parameter to the RD, so that the generalized distribution performs better than the original one.
According to the Transformed-Transformer method, a functional form of the CDF of one random variable is used to transform the PDF of another random variable into a new distribution. Let T be a continuous random variable with PDF φ(t) and CDF Φ(t), for −∞ ≤ a < t < b ≤ ∞. Let X be another random variable with PDF ψ(x) and CDF Ψ(x). W[Ψ(x)] is a functional form of the CDF of X that takes values in the support of T, is differentiable and monotonically non-decreasing, and satisfies W[Ψ(x)] → a as x → −∞ and W[Ψ(x)] → b as x → ∞. The PDF of T is transformed into a function of X through the transformer W[Ψ(x)]. The new generalized family of distributions is called the T-X family of distributions. Let G(x) and g(x) be the CDF and PDF of the generalized family of distributions, respectively; then
$$G(x) = \int_{a}^{W[\Psi(x)]} \varphi(t)\,dt = \Phi\big(W[\Psi(x)]\big), \qquad g(x) = \left\{\frac{d}{dx} W[\Psi(x)]\right\} \varphi\big(W[\Psi(x)]\big).$$
By choosing different forms of W[Ψ(x)], a large number of generalized families of distributions can be formed. For a discrete random variable X, discrete families of distributions can be derived. The T-Geometric family of discrete distributions was proposed by Alzaatreh, Lee, and Famoye (2012).
If the support of the random variable T is [0, ∞), then W[Ψ(x)] may be defined as −log[1 − Ψ(x)], which satisfies all three conditions above. The CDF of the new generalized distribution is defined as
$$G(x) = \int_{0}^{-\log[1-\Psi(x)]} \varphi(t)\,dt = \Phi\big(-\log[1-\Psi(x)]\big)$$
and the PDF is
$$g(x) = \frac{\psi(x)}{1-\Psi(x)}\, \varphi\big(-\log[1-\Psi(x)]\big).$$
In the literature, a large number of generalized families of distributions have been derived and studied using the Transformed-Transformer technique, e.g. the T-geometric family of discrete distributions by Alzaatreh et al. (2012), the T-normal family by Alzaatreh, Lee, and Famoye (2014), the Logistic-X family by Tahir, Cordeiro, Alzaatreh, Mansoor, and Zubair (2016), the Weibull-G family, the Gompertz-G family of distributions by Alizadeh, Cordeiro, Pinho, and Ghosh (2017), etc.
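The construction above can be sketched in a few lines of Python. This is a minimal illustration, assuming the transformer W[Ψ(x)] = −log[1 − Ψ(x)] and a T supported on [0, ∞); the helper names `tx_cdf`, `rayleigh_cdf` and `exponential_cdf` are hypothetical and are not taken from the paper or from any library.

```python
import math

def tx_cdf(Phi, Psi):
    """Build G(x) = Phi(-log(1 - Psi(x))) from the CDF Phi of T and the CDF Psi of X."""
    def G(x):
        p = Psi(x)
        if p <= 0.0:
            return 0.0
        if p >= 1.0:
            return 1.0
        # W[Psi(x)] = -log(1 - Psi(x)) maps (0, 1) onto (0, inf), the support of T
        return Phi(-math.log1p(-p))
    return G

def rayleigh_cdf(scale):
    """Rayleigh CDF 1 - exp(-t^2 / (2 * scale^2)) for t > 0 (illustrative helper)."""
    def F(t):
        return 1.0 - math.exp(-t * t / (2.0 * scale * scale)) if t > 0 else 0.0
    return F

def exponential_cdf(rate):
    """Exponential CDF 1 - exp(-rate * t) for t > 0 (illustrative helper)."""
    def F(t):
        return 1.0 - math.exp(-rate * t) if t > 0 else 0.0
    return F

# Rayleigh T (scale sigma) composed with Rayleigh X (scale beta); values are arbitrary
beta, sigma = 1.5, 0.8
G = tx_cdf(rayleigh_cdf(sigma), rayleigh_cdf(beta))
print(G(1.0), G(2.0), G(3.0))
```

A useful sanity check of the construction: if T is standard exponential, then Φ(−log[1 − Ψ(x)]) = Ψ(x), so the transformer returns the baseline CDF unchanged.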
The rest of the paper is organized as follows. Section 2 presents the derivation of RRD along with the shapes of its PDF and CDF. Reliability analysis, with the shapes of the survival and hazard rate functions, is studied in Section 3. In Section 4, statistical properties such as the quantile function, moments, moment generating function, skewness and kurtosis are investigated. Section 5 presents the Shannon and Rényi entropies. Order statistics and L-moments are derived in Section 6. Maximum-likelihood estimators and the information matrix are defined in Section 7. In Section 8, a simulation study is carried out to examine the performance of the maximum-likelihood estimators of the parameters of RRD. In Section 9, four real-life data sets are considered to examine the application of RRD to real-life phenomena and to compare the proposed distribution with its parent and other existing distributions. Finally, the study is concluded in Section 10.

Rayleigh-Rayleigh distribution
In this section, the RRD is derived. Let the random variable T follow an RD with the PDF given in expression (1) and scale parameter σ. The functional form W[Ψ(x)] is defined as −log[1 − Ψ(x)], since the support of the Rayleigh variate T is [0, ∞). Then the PDF of the Rayleigh-X family of distributions is given as
$$g(x) = \frac{\psi(x)}{1-\Psi(x)} \cdot \frac{-\log[1-\Psi(x)]}{\sigma^{2}} \exp\left(-\frac{\{\log[1-\Psi(x)]\}^{2}}{2\sigma^{2}}\right) \tag{5}$$
and the corresponding CDF is defined as
$$G(x) = 1 - \exp\left(-\frac{\{\log[1-\Psi(x)]\}^{2}}{2\sigma^{2}}\right).$$
By taking different distributions for the random variable X, a large number of Rayleigh-X distributions can be obtained, such as Rayleigh-Gamma, Rayleigh-Pareto, Rayleigh-Gumbel, Rayleigh-Exponential, etc. In our study, we assume that X is another Rayleigh variate with scale parameter β, i.e. X ~ RD(x; β). Substituting ψ(x) and Ψ(x), defined in (1) and (2) with parameter β, into expression (5), the PDF of the RRD is obtained as
$$g(x) = \frac{x^{3}}{2\beta^{4}\sigma^{2}} \exp\left(-\frac{x^{4}}{8\beta^{4}\sigma^{2}}\right), \quad x > 0,\ \beta > 0,\ \sigma > 0.$$
The CDF of RRD is
$$G(x) = 1 - \exp\left(-\frac{x^{4}}{8\beta^{4}\sigma^{2}}\right), \quad x > 0.$$
By adding a scale parameter to the baseline distribution, the generalized distribution is expected to be more flexible in modelling complicated real-life phenomena than the original one.
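As a quick numerical companion to this derivation, the sketch below codes the RRD density and distribution function and checks that the density integrates to one. It assumes the forms g(x) = x³/(2β⁴σ²)·exp(−x⁴/(8β⁴σ²)) and G(x) = 1 − exp(−x⁴/(8β⁴σ²)) obtained from the substitution described above; the function names and the midpoint integrator are hypothetical helpers introduced only for illustration.

```python
import math

def rrd_theta(beta, sigma):
    """Shorthand theta = 8 * beta^4 * sigma^2 that appears in the exponent."""
    return 8.0 * beta ** 4 * sigma ** 2

def rrd_pdf(x, beta, sigma):
    """Assumed RRD density: x^3 / (2 beta^4 sigma^2) * exp(-x^4 / theta)."""
    if x <= 0:
        return 0.0
    theta = rrd_theta(beta, sigma)
    return 4.0 * x ** 3 / theta * math.exp(-x ** 4 / theta)

def rrd_cdf(x, beta, sigma):
    """Assumed RRD distribution function: 1 - exp(-x^4 / theta)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-x ** 4 / rrd_theta(beta, sigma))

def midpoint_integral(f, a, b, n=20000):
    """Plain midpoint rule, adequate for this smooth integrand."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

beta, sigma = 1.2, 0.7                      # arbitrary illustrative values
upper = 6.0 * rrd_theta(beta, sigma) ** 0.25  # the tail beyond this is negligible
total = midpoint_integral(lambda x: rrd_pdf(x, beta, sigma), 0.0, upper)
print("integral of pdf:", total)
print("G(1.0):", rrd_cdf(1.0, beta, sigma))
```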
Figures 1 and 2 illustrate some of the possible shapes of the PDF and CDF of RRD for selected values of the parameters β and σ. The PDF plots in Figure 1 reveal that the RRD is unimodal, first increasing and then decreasing.

Reliability analysis
The reliability function S(x) gives the probability that a unit survives beyond time x without failure. The CDF and S(x) are complements of each other. The S(x) for RRD is defined as
$$S(x) = 1 - G(x) = \exp\left(-\frac{x^{4}}{8\beta^{4}\sigma^{2}}\right), \quad x > 0.$$
For various values of β and σ, the reliability function is monotonically decreasing, as shown in Figure 3.
The hazard rate function, another quantity of interest in reliability analysis, is the ratio of the PDF to the reliability function. The hazard rate function for RRD is given as
$$h(x) = \frac{g(x)}{S(x)} = \frac{x^{3}}{2\beta^{4}\sigma^{2}}, \quad x > 0.$$
The plots of the hazard rate function are increasing, as shown in Figure 4.
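The two reliability quantities can be evaluated directly. The sketch below assumes S(x) = exp(−x⁴/(8β⁴σ²)) and h(x) = x³/(2β⁴σ²), which follow from the RRD density and distribution function under the parametrization used here; `rrd_survival` and `rrd_hazard` are hypothetical names.

```python
import math

def rrd_survival(x, beta, sigma):
    """Assumed reliability function S(x) = exp(-x^4 / (8 beta^4 sigma^2))."""
    if x <= 0:
        return 1.0
    return math.exp(-x ** 4 / (8.0 * beta ** 4 * sigma ** 2))

def rrd_hazard(x, beta, sigma):
    """Assumed hazard rate h(x) = g(x) / S(x) = x^3 / (2 beta^4 sigma^2)."""
    if x <= 0:
        return 0.0
    return x ** 3 / (2.0 * beta ** 4 * sigma ** 2)

# Tabulate both functions on a small grid (parameter values are arbitrary)
beta, sigma = 1.0, 1.0
for x in (0.5, 1.0, 1.5, 2.0):
    print(x, rrd_survival(x, beta, sigma), rrd_hazard(x, beta, sigma))
```

Under these assumptions the hazard grows with the cube of x, so it is increasing for every choice of β and σ, while the survival function falls from 1 toward 0.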

Statistical properties
In this section, we have derived the statistical properties of the RRD, specifically quantile function, random number generator, moments, moment generating function, skewness, kurtosis and mean deviation.

Quantile function and simulation
Here and hereafter, let the random variable X follow RRD with parameters β and σ, i.e. X ~ RRD(x; β, σ). The quantile function corresponding to the CDF of RRD is
$$Q(p) = \left[-8\beta^{4}\sigma^{2} \ln(1-p)\right]^{1/4}, \quad 0 < p < 1.$$
The median and the first and third quartiles of RRD can conveniently be obtained by substituting p = 1/2, 1/4 and 3/4, respectively. An RRD random variate can easily be simulated by taking U as a uniform random variate on the unit interval. Using the technique proposed by Alzaatreh et al. (2013), the random variate X is generated as
$$X = \left[-8\beta^{4}\sigma^{2} \ln(1-U)\right]^{1/4}.$$
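Inverse-transform sampling for RRD can be sketched as follows. The code assumes the quantile function Q(p) = [−8β⁴σ² ln(1 − p)]^{1/4} implied by the RRD CDF; the function names, the seed and the sample size are illustrative choices, not values from the paper.

```python
import math
import random

def rrd_quantile(p, beta, sigma):
    """Assumed quantile function Q(p) = (-8 beta^4 sigma^2 ln(1 - p))^(1/4), 0 <= p < 1."""
    return (-8.0 * beta ** 4 * sigma ** 2 * math.log1p(-p)) ** 0.25

def rrd_sample(n, beta, sigma, seed=0):
    """Draw n variates by plugging uniform draws into the quantile function."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - U stays strictly positive
    return [rrd_quantile(rng.random(), beta, sigma) for _ in range(n)]

beta, sigma = 1.0, 1.0
quartiles = [rrd_quantile(p, beta, sigma) for p in (0.25, 0.5, 0.75)]
sample = sorted(rrd_sample(20000, beta, sigma, seed=1))
sample_median = 0.5 * (sample[9999] + sample[10000])
print("theoretical quartiles:", quartiles)
print("sample median:", sample_median)
```

With a sample of this size the empirical median should sit close to Q(1/2), which gives a simple check that the generator and the quantile function agree.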

Moments and moment generating function
Moments are necessary and important in any statistical analysis, especially in applications. They can be used to study the most important features and characteristics of a distribution, e.g. central tendency, dispersion, skewness and kurtosis.

rth Moment
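By definition, the rth moment about the origin of X is
$$\mu'_{r} = E(X^{r}) = \int_{0}^{\infty} x^{r} g(x)\,dx = \left(8\beta^{4}\sigma^{2}\right)^{r/4}\, \Gamma\left(1 + \frac{r}{4}\right).$$
For r = 1, 2, 3 and 4, the first four non-central moments of RRD are specified as
$$\mu'_{1} = \left(8\beta^{4}\sigma^{2}\right)^{1/4} \Gamma\left(\tfrac{5}{4}\right), \quad \mu'_{2} = \left(8\beta^{4}\sigma^{2}\right)^{1/2} \Gamma\left(\tfrac{3}{2}\right), \quad \mu'_{3} = \left(8\beta^{4}\sigma^{2}\right)^{3/4} \Gamma\left(\tfrac{7}{4}\right), \quad \mu'_{4} = 8\beta^{4}\sigma^{2}.$$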
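The closed form for the rth moment can be checked against direct numerical integration. The sketch assumes the RRD density g(x) = x³/(2β⁴σ²)·exp(−x⁴/(8β⁴σ²)), for which the substitution u = x⁴/(8β⁴σ²) gives E(X^r) = (8β⁴σ²)^{r/4} Γ(1 + r/4); the helper names are hypothetical.

```python
import math

def rrd_moment(r, beta, sigma):
    """Closed form E(X^r) = theta^(r/4) * Gamma(1 + r/4), with theta = 8 beta^4 sigma^2."""
    theta = 8.0 * beta ** 4 * sigma ** 2
    return theta ** (r / 4.0) * math.gamma(1.0 + r / 4.0)

def rrd_moment_numeric(r, beta, sigma, n=20000):
    """Midpoint-rule approximation of the integral of x^r g(x) over (0, 6 theta^(1/4))."""
    theta = 8.0 * beta ** 4 * sigma ** 2
    upper = 6.0 * theta ** 0.25
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x ** r * 4.0 * x ** 3 / theta * math.exp(-x ** 4 / theta)
    return total * h

beta, sigma = 0.9, 1.3   # arbitrary illustrative values
for r in (1, 2, 3, 4):
    print(r, rrd_moment(r, beta, sigma), rrd_moment_numeric(r, beta, sigma))
```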

Negative moments
The negative moment generating function is defined as For the random variable of RRD

Moment generating function
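The moment generating function of X is
$$M_{X}(t) = E\left(e^{tX}\right) = \sum_{n=0}^{\infty} \frac{t^{n}}{n!} \left(8\beta^{4}\sigma^{2}\right)^{n/4}\, \Gamma\left(1 + \frac{n}{4}\right).$$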

Characteristic function
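The characteristic function of the random variable X is defined as
$$\phi_{X}(t) = E\left(e^{itX}\right) = \sum_{n=0}^{\infty} \frac{(it)^{n}}{n!} \left(8\beta^{4}\sigma^{2}\right)^{n/4}\, \Gamma\left(1 + \frac{n}{4}\right).$$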

Skewness and kurtosis
The coefficient of Skewness and kurtosis of RRD is given as Hence, RRD is platykurtic and negatively skewed distribution.
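The shape claim can be illustrated numerically from the raw moments. The sketch below assumes E(X^r) = (8β⁴σ²)^{r/4} Γ(1 + r/4) for RRD and uses the moment-ratio definitions of skewness, μ₃/μ₂^{3/2}, and kurtosis, μ₄/μ₂²; these definitions and the function names are assumptions made for illustration and may differ in form from the coefficients used in the paper.

```python
import math

def rrd_raw_moment(r, theta):
    """Assumed raw moment E(X^r) = theta^(r/4) * Gamma(1 + r/4)."""
    return theta ** (r / 4.0) * math.gamma(1.0 + r / 4.0)

def rrd_skewness_kurtosis(beta, sigma):
    """Moment-ratio skewness and kurtosis computed from the first four raw moments."""
    theta = 8.0 * beta ** 4 * sigma ** 2
    m1, m2, m3, m4 = (rrd_raw_moment(r, theta) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2

# Two different parameter settings (arbitrary values)
print(rrd_skewness_kurtosis(1.0, 1.0))
print(rrd_skewness_kurtosis(2.5, 0.4))
```

Because β and σ enter only through a common scale factor, both ratios come out the same for every parameter setting: a slightly negative skewness and a kurtosis below 3, in line with the statement that RRD is negatively skewed and platykurtic.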

Mean deviation
The mean deviation of "X" is

Measure of uncertainty
Entropy measures the dynamical uncertainty of a probability distribution, that is, the unpredictability of the state or the disorder of a system.

Shannon entropy
Shannon (1948) proposed the idea of entropy. The Shannon entropy of X is defined as
$$H(X) = -E\left[\ln g(X)\right] = 1 + \frac{3\gamma}{4} + \frac{1}{4} \ln\left(8\beta^{4}\sigma^{2}\right) - \ln 4,$$
where γ is the Euler constant, whose value is approximately 0.5772.
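A numerical check of the Shannon entropy is sketched below. It assumes the RRD density g(x) = x³/(2β⁴σ²)·exp(−x⁴/(8β⁴σ²)); taking −E[ln g(X)] term by term under that density gives H = 1 + 3γ/4 + (1/4)·ln(8β⁴σ²) − ln 4, which the code compares with a midpoint-rule integral of −g·ln g. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler constant, approximately 0.5772

def rrd_shannon_entropy(beta, sigma):
    """Closed form under the assumed density: 1 + 3*gamma/4 + ln(theta)/4 - ln 4."""
    theta = 8.0 * beta ** 4 * sigma ** 2
    return 1.0 + 0.75 * EULER_GAMMA + 0.25 * math.log(theta) - math.log(4.0)

def rrd_shannon_entropy_numeric(beta, sigma, n=20000):
    """Midpoint-rule approximation of the integral of -g(x) ln g(x) over (0, 5 theta^(1/4))."""
    theta = 8.0 * beta ** 4 * sigma ** 2
    upper = 5.0 * theta ** 0.25
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        g = 4.0 * x ** 3 / theta * math.exp(-x ** 4 / theta)
        if g > 0.0:  # skip points where the density underflows to zero
            total -= g * math.log(g)
    return total * h

beta, sigma = 1.1, 0.6   # arbitrary illustrative values
print(rrd_shannon_entropy(beta, sigma), rrd_shannon_entropy_numeric(beta, sigma))
```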

Rényi entropy
The generalized form of Shannon entropy is the Rényi entropy, proposed by Rényi (1961). The Rényi entropy of X, denoted by I_R(x), can be defined as

Order statistics
Order statistics and their moments play an important role in data analysis relating to quality control, reliability, hydrology and extreme values. In this section, we derive the PDFs of the kth, maximum and minimum order statistics from the RRD.

The PDF of the smallest order statistic
Let X_(1) be the first order statistic of a random sample X_1, X_2, ..., X_m from RRD. The PDF of X_(1) is defined as
$$g_{1:m}(x) = m\, g(x) \left[1 - G(x)\right]^{m-1} = \frac{m x^{3}}{2\beta^{4}\sigma^{2}} \exp\left(-\frac{m x^{4}}{8\beta^{4}\sigma^{2}}\right), \quad x > 0.$$

The PDF of the largest order statistic
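For a sample drawn from RRD, the PDF of the largest order statistic X_(m) is given as
$$g_{m:m}(x) = m\, g(x) \left[G(x)\right]^{m-1} = \frac{m x^{3}}{2\beta^{4}\sigma^{2}} \exp\left(-\frac{x^{4}}{8\beta^{4}\sigma^{2}}\right) \left[1 - \exp\left(-\frac{x^{4}}{8\beta^{4}\sigma^{2}}\right)\right]^{m-1}, \quad x > 0.$$

The joint PDF of ith and jth order statistics
Let the joint PDF of the ith and jth order statistics be denoted by g_{i:j:m}(x, y); then, using the standard formula, it can be derived as
$$g_{i:j:m}(x, y) = \frac{m!}{(i-1)!\,(j-i-1)!\,(m-j)!}\; g(x)\, g(y) \left[G(x)\right]^{i-1} \left[G(y) - G(x)\right]^{j-i-1} \left[\exp\left(-\frac{y^{4}}{8\beta^{4}\sigma^{2}}\right)\right]^{m-j},$$
for 1 ≤ i < j ≤ m and 0 ≤ x ≤ y < +∞, where g and G are the PDF and CDF of RRD.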
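The marginal order-statistic densities can be evaluated with one generic function. The sketch below uses the standard formula for the kth order statistic of a sample of size m, g_{k:m}(x) = m!/((k−1)!(m−k)!)·[G(x)]^{k−1}[1 − G(x)]^{m−k} g(x), together with the assumed RRD forms g(x) = x³/(2β⁴σ²)·exp(−x⁴/(8β⁴σ²)) and G(x) = 1 − exp(−x⁴/(8β⁴σ²)); the function names are hypothetical.

```python
import math

def rrd_order_stat_pdf(x, k, m, beta, sigma):
    """PDF of the kth order statistic (1 <= k <= m) from an RRD sample of size m."""
    if x <= 0:
        return 0.0
    theta = 8.0 * beta ** 4 * sigma ** 2
    surv = math.exp(-x ** 4 / theta)        # 1 - G(x)
    cdf = 1.0 - surv                        # G(x)
    pdf = 4.0 * x ** 3 / theta * surv       # g(x)
    # m! / ((k-1)! (m-k)!) equals k * C(m, k)
    return k * math.comb(m, k) * cdf ** (k - 1) * surv ** (m - k) * pdf

def integrate(f, a, b, n=20000):
    """Midpoint rule used only to check normalization."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

beta, sigma, m = 1.0, 1.0, 5   # arbitrary illustrative values
upper = 6.0 * (8.0 * beta ** 4 * sigma ** 2) ** 0.25
areas = {k: integrate(lambda x: rrd_order_stat_pdf(x, k, m, beta, sigma), 0.0, upper)
         for k in (1, 3, 5)}
print(areas)
```

Setting k = 1 gives the density of the minimum and k = m the density of the maximum, so the same function covers both special cases treated above.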

L-moments
In statistics, conventional moments are of great importance in describing the shape of a distribution, but they perform inadequately in the presence of extreme values because of their sensitivity to extreme observations. Moreover, the conventional moments are asymptotically inefficient for fat-tailed distributions. In such situations, many empirical studies show that the L-moments, which are linear combinations of order statistics, outperform the conventional moments. As with the conventional moments, estimation can be carried out using the population L-moments and sample L-moments of a distribution. The measures of skewness and kurtosis derived in terms of L-moments are named L-skewness and L-kurtosis, respectively.
In this study, the L-moments of X are derived through the probability weighted moments (PWM), a method introduced by Hosking (1990). The PWM, denoted by β_r, are given by
$$\beta_{r} = E\left\{X \left[G(X)\right]^{r}\right\} = \int_{0}^{\infty} x \left[G(x)\right]^{r} g(x)\,dx, \quad r = 0, 1, 2, \ldots$$
The rth L-moment, denoted by λ_r, is a linear combination of the PWM. The first four L-moments of X are
$$\lambda_{1} = \beta_{0}, \quad \lambda_{2} = 2\beta_{1} - \beta_{0}, \quad \lambda_{3} = 6\beta_{2} - 6\beta_{1} + \beta_{0}, \quad \lambda_{4} = 20\beta_{3} - 30\beta_{2} + 12\beta_{1} - \beta_{0}.$$
Consequently, the L-skewness of RRD is $\tau_{3} = \lambda_{3}/\lambda_{2}$ and the L-kurtosis of RRD is $\tau_{4} = \lambda_{4}/\lambda_{2}$.
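The PWM route to the L-moments can be sketched as below. The code assumes the RRD forms G(x) = 1 − exp(−x⁴/θ) with θ = 8β⁴σ² and E(X) = θ^{1/4} Γ(5/4); expanding [G(x)]^r with the binomial theorem and using E{X[1 − G(X)]^j} = E(X)/(j + 1)^{5/4} gives a finite sum for each PWM. The L-moment combinations are Hosking's (1990). The names `rrd_pwm` and `rrd_l_moments` are hypothetical, and the PWM symbol β_r is unrelated to the scale parameter β.

```python
import math

def rrd_pwm(r, beta, sigma):
    """PWM E{X [G(X)]^r} via the binomial expansion of (1 - S)^r, where S = 1 - G."""
    theta = 8.0 * beta ** 4 * sigma ** 2
    mean = theta ** 0.25 * math.gamma(1.25)
    # E{X S^j} = mean / (j + 1)^(1 + 1/4) under the assumed RRD density
    return sum((-1) ** j * math.comb(r, j) * mean / (j + 1) ** 1.25
               for j in range(r + 1))

def rrd_l_moments(beta, sigma):
    """First four L-moments plus L-skewness and L-kurtosis from the PWMs."""
    b0, b1, b2, b3 = (rrd_pwm(r, beta, sigma) for r in range(4))
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    l4 = 20.0 * b3 - 30.0 * b2 + 12.0 * b1 - b0
    return l1, l2, l3, l4, l3 / l2, l4 / l2

beta, sigma = 1.4, 0.9   # arbitrary illustrative values
l1, l2, l3, l4, t3, t4 = rrd_l_moments(beta, sigma)
print("L-moments:", l1, l2, l3, l4)
print("L-skewness, L-kurtosis:", t3, t4)
```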

Maximum-likelihood estimation and Fisher information matrix
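Because they possess the asymptotic properties of normality and efficiency, maximum-likelihood estimators are of great importance in statistical inference. In this section, the maximum-likelihood estimators of the parameters of RRD are derived. Let X_1, X_2, X_3, …, X_n be a random sample from X ~ RRD(x; β, σ). Then the likelihood function of the observed sample is given as
$$L(\beta, \sigma) = \prod_{i=1}^{n} \frac{x_{i}^{3}}{2\beta^{4}\sigma^{2}} \exp\left(-\frac{x_{i}^{4}}{8\beta^{4}\sigma^{2}}\right)$$
and the corresponding log-likelihood function is
$$\ell(\beta, \sigma) = 3\sum_{i=1}^{n} \ln x_{i} - n\ln 2 - 4n\ln\beta - 2n\ln\sigma - \frac{1}{8\beta^{4}\sigma^{2}} \sum_{i=1}^{n} x_{i}^{4}. \tag{17}$$
Following the maximum-likelihood procedure, expression (17) is partially differentiated with respect to β and σ and the derivatives are equated to zero. The corresponding normal equations are given as
$$\frac{\partial \ell}{\partial \beta} = -\frac{4n}{\beta} + \frac{1}{2\beta^{5}\sigma^{2}} \sum_{i=1}^{n} x_{i}^{4} = 0, \tag{18}$$
$$\frac{\partial \ell}{\partial \sigma} = -\frac{2n}{\sigma} + \frac{1}{4\beta^{4}\sigma^{3}} \sum_{i=1}^{n} x_{i}^{4} = 0. \tag{19}$$
Expressions (18) and (19) cannot be solved analytically; they are solved numerically in R using the Newton-Raphson method.
To obtain the Fisher information matrix (FIM), the second derivatives of the log-likelihood function are derived as
$$\frac{\partial^{2} \ell}{\partial \beta^{2}} = \frac{4n}{\beta^{2}} - \frac{5}{2\beta^{6}\sigma^{2}} \sum_{i=1}^{n} x_{i}^{4}, \qquad \frac{\partial^{2} \ell}{\partial \sigma^{2}} = \frac{2n}{\sigma^{2}} - \frac{3}{4\beta^{4}\sigma^{4}} \sum_{i=1}^{n} x_{i}^{4}, \qquad \frac{\partial^{2} \ell}{\partial \beta\, \partial \sigma} = -\frac{1}{\beta^{5}\sigma^{3}} \sum_{i=1}^{n} x_{i}^{4}.$$
The MLEs are asymptotically unbiased and normally distributed, with the variance-covariance matrix obtained from the inverse of the FIM. Hence, interval estimation and hypothesis testing for the model parameters can readily be carried out.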
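One way to illustrate the estimation step is sketched below in Python. It assumes the RRD density g(x) = x³/(2β⁴σ²)·exp(−x⁴/(8β⁴σ²)); under that form the log-likelihood involves β and σ only through θ = 8β⁴σ², so the sketch maximizes over θ directly, where setting the derivative to zero gives θ̂ = (1/n)·Σx_i⁴. This is a simplification made for illustration and is not the Newton-Raphson routine run in R; all function names, the seed and the sample size are hypothetical.

```python
import math
import random

def rrd_loglik(xs, beta, sigma):
    """Log-likelihood of an RRD sample under the assumed density."""
    theta = 8.0 * beta ** 4 * sigma ** 2
    n = len(xs)
    # ln g(x) = ln 4 + 3 ln x - ln theta - x^4 / theta
    return (n * math.log(4.0 / theta)
            + 3.0 * sum(math.log(x) for x in xs)
            - sum(x ** 4 for x in xs) / theta)

def rrd_theta_mle(xs):
    """Maximizer of the log-likelihood in theta = 8 beta^4 sigma^2: the mean of x^4."""
    return sum(x ** 4 for x in xs) / len(xs)

def rrd_sample(n, beta, sigma, seed=0):
    """Inverse-transform sample, X = (-theta ln(1 - U))^(1/4)."""
    rng = random.Random(seed)
    theta = 8.0 * beta ** 4 * sigma ** 2
    return [(-theta * math.log1p(-rng.random())) ** 0.25 for _ in range(n)]

beta, sigma = 1.0, 2.0                 # arbitrary true values
xs = [x for x in rrd_sample(5000, beta, sigma, seed=7) if x > 0.0]
theta_true = 8.0 * beta ** 4 * sigma ** 2
theta_hat = rrd_theta_mle(xs)
print("true theta:", theta_true, "estimated theta:", theta_hat)
```

Any pair (β, σ) with the same value of β⁴σ² yields the same log-likelihood under these assumptions, which is why the sketch reports θ̂ rather than separate estimates of β and σ.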

Simulation study
In this section, a simulation study is carried out to check the performance of the estimators. Using the R statistical package, 5,000 replications of samples of sizes n = 50, 100, 150, 200, 300 and 400 have been generated from RRD using the random number generator given in expression (9). Tables 1-4, which cover four sets of parameter values, demonstrate that the average bias of the estimates of both parameters is negative, which indicates that the parameters are underestimated, and that the bias approaches zero as the sample size increases. Hence, the estimators of the parameters of RRD are asymptotically unbiased. The maximum-likelihood estimates remain underestimates across the different values of both parameters. The average root mean square error decreases as the sample size increases, indicating that the estimators are asymptotically efficient. Increasing or decreasing the values of the parameters has no effect on the root mean square error. Both pieces of evidence show that the maximum-likelihood estimators of the parameters of RRD perform well and that the estimates are precise and accurate.
The comparison is carried out using the following four real data sets:
(1) Lifetime data on the survival times (in years) of 46 patients given chemotherapy treatment, previously used by Bekker, Roux, and Mosteit (2000) and Fundi, Njenga, and Keitany (2017).
(2) Data on the strengths of 1.5 cm glass fibers, measured at the National Physical Laboratory in England and used by Smith and Naylor (1987).
The data sets and their sources can be found in the respective references. Hereafter we refer to the above data sets as Dataset 1, Dataset 2, Dataset 3 and Dataset 4, respectively. To compare the distributions, the goodness-of-fit criteria used are −2lnL, the Akaike information criterion (AIC) of Akaike (1974), the Consistent Akaike information criterion (CAIC) of Bozdogan (1987), the Bayesian information criterion (BIC) of Schwarz (1978) and the Hannan-Quinn information criterion (HQIC) of Hannan and Quinn (1979). AIC estimates the performance of a model relative to other models. CAIC provides a consistent and asymptotically unbiased estimate of the order of the true model. HQIC is a consistent model selection criterion. The distribution with the smallest values of −2lnL, AIC, BIC, CAIC and HQIC is considered the best distribution. In the specifications of these criteria, k is the number of estimated parameters in the distribution and ln L is the maximized log-likelihood of the distribution under consideration. The results given in Tables 5-8 show that the values of −2lnL, AIC, BIC, CAIC and HQIC are smallest for RRD compared with the other distributions under consideration. These results lead us to conclude that the proposed distribution outperforms the RD, GRD, ERD, WRD and APRD for the selected data sets.
Hence, for the given data sets, RRD is chosen as the best-fitting model among the competing models.
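The model-selection criteria can be computed from a maximized log-likelihood as sketched below. The formulas are the commonly cited forms, AIC = −2lnL + 2k, BIC = −2lnL + k·ln n and HQIC = −2lnL + 2k·ln(ln n); for CAIC the sketch assumes Bozdogan's form −2lnL + k(ln n + 1), which may differ from the specification used for Tables 5-8. The function name and the input numbers are hypothetical and are not results from the paper.

```python
import math

def information_criteria(loglik, k, n):
    """Return -2lnL, AIC, BIC, CAIC and HQIC for a fitted model.

    loglik: maximized log-likelihood, k: number of estimated parameters,
    n: sample size. CAIC follows the assumed form -2lnL + k (ln n + 1).
    """
    neg2ll = -2.0 * loglik
    return {
        "-2lnL": neg2ll,
        "AIC": neg2ll + 2.0 * k,
        "BIC": neg2ll + k * math.log(n),
        "CAIC": neg2ll + k * (math.log(n) + 1.0),
        "HQIC": neg2ll + 2.0 * k * math.log(math.log(n)),
    }

# Hypothetical fits of two candidate models to the same n = 50 observations
fits = {"model A": (-100.0, 2), "model B": (-98.5, 3)}
table = {name: information_criteria(ll, k, 50) for name, (ll, k) in fits.items()}
for name, row in table.items():
    print(name, {key: round(val, 3) for key, val in row.items()})
best_by_aic = min(table, key=lambda name: table[name]["AIC"])
print("smallest AIC:", best_by_aic)
```

Smaller values are preferred for every criterion, so a model is favored only when its gain in log-likelihood outweighs the penalty for its extra parameters.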

Conclusion
In this study, the RD is generalized by adding one scale parameter from another RD. Explicit expressions for the probability density and cumulative distribution functions are derived. The effect of the parameters is examined through PDF and CDF plots. A comprehensive study of the statistical properties of the new distribution is presented. The reliability behavior of RRD is investigated by varying the values of the parameters. Order statistics, the distributions of the order statistics and L-moments are also derived. The parameters are estimated by the maximum-likelihood approach. The results of the simulation study show that the maximum-likelihood estimators of the proposed model are asymptotically unbiased and that the root mean square error decreases as the sample size increases. The application of the suggested distribution to four real-life data sets showed that RRD outperformed some other existing distributions. In all four data sets, the proposed RRD performs better than the original RD. Hence, the introduction of one or more parameters improves the performance of a distribution. It is hoped that the proposed distribution will attract the attention of researchers and practitioners in the fields of physical sciences, biological sciences, actuarial studies and social sciences.