Article

New Lifetime Distribution for Modeling Data on the Unit Interval: Properties, Applications and Quantile Regression

by Suleman Nasiru 1, Abdul Ghaniyyu Abubakari 1 and Christophe Chesneau 2,*
1 Department of Statistics, School of Mathematical Sciences, C. K. Tedam University of Technology and Applied Sciences, Kassena-Nankana Navrongo-Kolgo Road, Navrongo, Upper East, Ghana
2 Department of Mathematics, LMNO, CNRS-Université de Caen, Campus II, Science 3, 14032 Caen, France
* Author to whom correspondence should be addressed.
Math. Comput. Appl. 2022, 27(6), 105; https://doi.org/10.3390/mca27060105
Submission received: 31 October 2022 / Revised: 18 November 2022 / Accepted: 2 December 2022 / Published: 3 December 2022
(This article belongs to the Special Issue Statistical Inference in Linear Models)

Abstract

Probability distributions are very useful in modeling lifetime datasets. However, no specific distribution is suitable for all kinds of datasets. In this study, the bounded truncated Cauchy power exponential distribution is proposed for modeling datasets on the unit interval. The probability density function exhibits desirable shapes, such as left-skewed, right-skewed, reversed J, and bathtub shapes, whereas the hazard rate function displays J and bathtub shapes. For the purpose of modeling dependence between measures in a dataset, a bivariate extension of the proposed distribution is developed. The bivariate probability density function displays monotonic and non-monotonic shapes, making it suitable for modeling complex bivariate relations. Subsequently, the applications of the distribution are illustrated using COVID-19 data. The results revealed that the new distribution provides a better fit to the datasets compared to other existing distributions. Finally, a new quantile regression model is developed and its application demonstrated. The generated quantile regression model offers a decent fit to the data, according to the residual analysis.

1. Introduction

Disease modeling and prediction are primary tasks of epidemiologists and researchers interested in the estimation of disease occurrences. To perform these tasks, modeling the variability in disease occurrences using probability distributions is essential. With the emergence of the novel coronavirus disease (COVID-19) in late 2019 and its negative impact on humanity, many researchers have proposed new probability distributions (discrete or continuous) for modeling the number of infections, mortality rates, and recovery rates, among others. Some of the proposed probability distributions or families of distributions include: Marshall–Olkin reduced Kies distribution [1], modified inverse Weibull distribution [2], weighted Weibull distribution [3], type I half logistic Burr X-G family [4], unit power Weibull distribution [5], new extended exponentiated Weibull distribution [6], discrete extended odd Weibull exponential distribution [7], odd Weibull inverse Topp–Leone distribution [8], log-logistic tangent distribution [9], discrete-type half-logistic exponential distribution [10], and unit Johnson $S_U$ distribution [11].
Among these probability distributions used for modeling diseases, those defined on the unit interval play a major role due to their usefulness in areas such as health, psychology, and epidemiology, among others. For instance, researchers may be interested in modeling mortality or recovery rates. Observations measured on these variables are usually proportions, fractions, or rates, which are defined in the unit interval. Although the beta distribution is the oldest for modeling datasets measured on the unit interval, the intractability of its cumulative distribution function (CDF) and quantile function has called for the development of new distributions with tractable CDFs and quantile functions that are also capable of modeling data on the unit interval. Unit distributions proposed recently in literature include: unit Gamma/Gompertz distribution [12], bounded odd inverse Pareto exponential distribution [13], bounded shifted Gompertz distribution [14], unit modified Burr-III distribution [15], unit generalized half normal distribution [16], unit Lindley distribution [17], unit Gompertz distribution [18], logit slash distribution [19], unit Weibull distribution [20] and unit inverse Gaussian distribution [21].
Despite the existence of many unit distributions in the literature, no single distribution is capable of modeling all forms of data, since the data generating process produces data with different characteristics such as symmetry, skewness, varied degrees of kurtosis, and monotonic and non-monotonic failure rates. This study thus proposes a new unit distribution called the bounded truncated Cauchy power exponential (BTCPE) distribution. The motivations for developing the new distribution are as follows: to provide a model capable of modeling complex data on the unit interval that exhibit platykurtic, leptokurtic, reversed J, left-skewed, right-skewed, bathtub, and J shapes; to develop a bivariate distribution for modeling interdependence between random data on the unit interval; and to develop a quantile regression model for understanding the relationship between a response variable and given covariates.
The remainder of the paper is organized as follows: Section 2 presents the development of the BTCPE distribution, Section 3 describes some of its important properties, Section 4 focuses on a special bivariate extension of the BTCPE distribution, Section 5 is devoted to the parametric estimation methods, Section 6 presents a Monte Carlo simulation study of the nine frequentist estimation methods, Section 7 contains the univariate applications of the BTCPE distribution, Section 8 is about the quantile regression model and its application, and finally the conclusion of the paper is presented in Section 9.

2. Bounded Truncated Cauchy Power Exponential Distribution

A random variable $X$ follows the truncated Cauchy power exponential (TCPE) distribution if its CDF and probability density function (PDF), respectively, are defined as
$$F_X(x; \alpha, \lambda) = \frac{4}{\pi} \arctan\left[\left(1 - e^{-\lambda x}\right)^{\alpha}\right], \quad \alpha > 0, \ \lambda > 0, \ x > 0,$$
and
$$f_X(x; \alpha, \lambda) = \frac{4 \alpha \lambda e^{-\lambda x} \left(1 - e^{-\lambda x}\right)^{\alpha - 1}}{\pi \left[1 + \left(1 - e^{-\lambda x}\right)^{2\alpha}\right]}, \quad x > 0.$$
The TCPE distribution can be presented as a special case of the TCP Weibull distribution proposed by [22]. Now, we define a new unit distribution, called the BTCPE distribution, corresponding to the distribution of $Y = e^{-X}$. The associated CDF is obtained as follows:
$$F_Y(y; \alpha, \lambda) = P\left(e^{-X} \le y\right) = P\left(X \ge -\log(y)\right) = 1 - P\left(X \le -\log(y)\right) = 1 - F_X\left(-\log(y); \alpha, \lambda\right).$$
Hence, the CDF of the BTCPE distribution is expressed as
$$F_Y(y; \alpha, \lambda) = 1 - \frac{4}{\pi} \arctan\left[\left(1 - y^{\lambda}\right)^{\alpha}\right], \quad 0 < y < 1, \tag{3}$$
where $\alpha > 0$ and $\lambda > 0$ are the shape parameters that have to be estimated. The associated PDF of the BTCPE distribution is obtained by differentiating Equation (3), and it is given by
$$f_Y(y; \alpha, \lambda) = \frac{4 \alpha \lambda y^{\lambda - 1} \left(1 - y^{\lambda}\right)^{\alpha - 1}}{\pi \left[1 + \left(1 - y^{\lambda}\right)^{2\alpha}\right]}, \quad 0 < y < 1.$$
Often, PDFs are expressed in expanded form to ease the derivation of the statistical properties of the proposed distribution. The expanded form of the PDF of the BTCPE distribution is mainly obtained using the generalized binomial expansion, $(y + a)^n = \sum_{k=0}^{\infty} \binom{n}{k} y^k a^{n-k}$, $|y| < a$, where $n$ is any real number. Thus, it is given by
$$f_Y(y; \alpha, \lambda) = \frac{4 \alpha \lambda}{\pi} \sum_{i=0}^{\infty} (-1)^i y^{\lambda - 1} \left(1 - y^{\lambda}\right)^{\alpha(2i+1) - 1}, \quad 0 < y < 1. \tag{5}$$
The corresponding hazard rate function (HRF) is given by
$$h_Y(y; \alpha, \lambda) = \frac{\alpha \lambda y^{\lambda - 1} \left(1 - y^{\lambda}\right)^{\alpha - 1}}{\left[1 + \left(1 - y^{\lambda}\right)^{2\alpha}\right] \arctan\left[\left(1 - y^{\lambda}\right)^{\alpha}\right]}, \quad 0 < y < 1.$$
The shapes of the PDF and HRF for some given parameter values are shown in Figure 1. The PDF exhibits symmetric, bathtub, left-skewed and right-skewed shapes for the given parameter values. The HRF displays bathtub and increasing failure rates.
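As a minimal sketch, assuming only the closed forms of the CDF, PDF and HRF stated in this section, the three functions can be evaluated in Python as follows. The function names and the parameter values in the usage lines are illustrative choices and are not taken from the original study, which relied on R.

```python
import math


def btcpe_cdf(y, alpha, lam):
    """CDF: 1 - (4/pi) * arctan[(1 - y^lam)^alpha], for 0 < y < 1."""
    return 1.0 - (4.0 / math.pi) * math.atan((1.0 - y**lam) ** alpha)


def btcpe_pdf(y, alpha, lam):
    """PDF, i.e., the derivative of the CDF with respect to y."""
    s = 1.0 - y**lam
    num = 4.0 * alpha * lam * y ** (lam - 1.0) * s ** (alpha - 1.0)
    return num / (math.pi * (1.0 + s ** (2.0 * alpha)))


def btcpe_hrf(y, alpha, lam):
    """Hazard rate pdf / (1 - cdf), written in its simplified closed form."""
    s = 1.0 - y**lam
    num = alpha * lam * y ** (lam - 1.0) * s ** (alpha - 1.0)
    return num / ((1.0 + s ** (2.0 * alpha)) * math.atan(s**alpha))


# Illustrative parameter values (assumed, not taken from the paper).
alpha, lam = 2.0, 3.0
for y in (0.2, 0.5, 0.8):
    print(y, btcpe_cdf(y, alpha, lam), btcpe_pdf(y, alpha, lam), btcpe_hrf(y, alpha, lam))
```

Evaluating the functions on a grid of y values for several parameter pairs is one simple way to explore the shapes described above.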

3. Some Important Properties

This section presents some relevant properties of the BTCPE distribution.

3.1. Distribution Inequalities

This subsection investigates some desirable inequalities satisfied by the CDF of the BTCPE distribution. These inequalities are very essential in determining the first order stochastic dominance of random variables [23].
Proposition 1. 
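The CDF of the BTCPE distribution is increasing with respect to the parameter $\alpha$. The CDF of the BTCPE distribution is decreasing with respect to the parameter $\lambda$.
Proof. 
For the first point, since $\left(1 - y^{\lambda}\right)^{\alpha} \log\left(1 - y^{\lambda}\right) < 0$ for $y \in (0, 1)$, we have
$$\frac{\partial F_Y(y; \alpha, \lambda)}{\partial \alpha} = -\frac{4 \left(1 - y^{\lambda}\right)^{\alpha} \log\left(1 - y^{\lambda}\right)}{\pi \left[1 + \left(1 - y^{\lambda}\right)^{2\alpha}\right]} \ge 0.$$
This means that $F_Y(y; \alpha, \lambda)$ is increasing with respect to $\alpha$. For the second point, since $y^{\lambda} \left(1 - y^{\lambda}\right)^{\alpha - 1} \log(y) < 0$ for $y \in (0, 1)$, we have
$$\frac{\partial F_Y(y; \alpha, \lambda)}{\partial \lambda} = \frac{4 \alpha y^{\lambda} \left(1 - y^{\lambda}\right)^{\alpha - 1} \log(y)}{\pi \left[1 + \left(1 - y^{\lambda}\right)^{2\alpha}\right]} \le 0.$$
This implies that $F_Y(y; \alpha, \lambda)$ is decreasing with respect to $\lambda$. This completes the proof of the proposition. From Proposition 1, the following stochastic ordering property follows immediately: if $\alpha_1 \le \alpha_2$ then $F_Y(y; \alpha_1, \lambda) \le F_Y(y; \alpha_2, \lambda)$. Also, if $\lambda_1 \le \lambda_2$ then $F_Y(y; \alpha, \lambda_2) \le F_Y(y; \alpha, \lambda_1)$. ☐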

3.2. Quantile Function

The quantile function or the inverse CDF is simply the solution $Q(u; \alpha, \lambda)$ of the following nonlinear equation: $F_Y(Q(u; \alpha, \lambda); \alpha, \lambda) = u$, for all $u \in (0, 1)$. Thus, after some algebraic manipulation, we have
$$Q(u; \alpha, \lambda) = \left\{1 - \left(\tan\left[\frac{\pi}{4}(1 - u)\right]\right)^{1/\alpha}\right\}^{1/\lambda}, \quad u \in (0, 1).$$
The median is obtained by substituting $u = 0.5$. The quantile function plays an important role in the generation of random observations from the BTCPE distribution. The quantile function values are also useful in computing measures of skewness and kurtosis. As a classical quantile measure, the MacGillivray measure of skewness [24] is given by
$$\rho(u; \alpha, \lambda) = \frac{Q(1 - u; \alpha, \lambda) + Q(u; \alpha, \lambda) - 2 Q(0.5; \alpha, \lambda)}{Q(1 - u; \alpha, \lambda) - Q(u; \alpha, \lambda)}, \quad u \in (0, 1).$$
In particular, the MacGillivray measure of skewness can be used to efficiently describe the effect of the parameters ( α , λ ) on the skewness. The more the shapes of ρ ( u ; α , λ ) vary according to the parameters, the more flexible the skewness is. Figure 2 shows the plot of this skewness measure for a fixed value of λ while α varies and for a fixed value of α while λ varies. From Figure 2, the wider variations seen imply that both parameters have a strong influence on the skewness of the BTCPE distribution. In addition, as the values of α or λ increase, ρ ( u ; α , λ ) gets closer to the horizontal line. This shows that utilizing higher values of the parameter can result in a symmetrical distribution.
The kurtosis of the BTCPE distribution can be studied using the Moors kurtosis [25]. The Moors (coefficient of) kurtosis is usually given by
$$K(\alpha, \lambda) = \frac{Q(7/8; \alpha, \lambda) - Q(5/8; \alpha, \lambda) + Q(3/8; \alpha, \lambda) - Q(1/8; \alpha, \lambda)}{Q(3/4; \alpha, \lambda) - Q(1/4; \alpha, \lambda)}.$$
Large values of the Moors kurtosis imply that the distribution has a heavy tail, and small values are indications of a light tail. Figure 3 displays the Moors kurtosis for the BTCPE distribution. It can be observed that the BTCPE distribution exhibits various degrees of kurtosis. When the parameters α and λ are equal, the distribution displays a platykurtic shape. The overall shapes show how flexible the BTCPE distribution is with regards to modeling datasets having different degrees of kurtosis and skewness.
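The quantile-based quantities of this subsection can be sketched in a few lines of Python. The code below assumes the closed-form quantile function given above; the function names, the seed and the parameter values are illustrative assumptions. Random observations are generated by the inverse-transform method, i.e., by applying the quantile function to uniform draws.

```python
import math
import random


def btcpe_cdf(y, alpha, lam):
    """CDF of the BTCPE distribution, used here only to check the inversion."""
    return 1.0 - (4.0 / math.pi) * math.atan((1.0 - y**lam) ** alpha)


def btcpe_quantile(u, alpha, lam):
    """Quantile function Q(u) for 0 < u <= 1."""
    t = math.tan(math.pi * (1.0 - u) / 4.0)
    return (1.0 - t ** (1.0 / alpha)) ** (1.0 / lam)


def btcpe_sample(n, alpha, lam, rng):
    """Inverse-transform sampling: apply Q to uniform draws."""
    # 1 - rng.random() lies in (0, 1], which avoids evaluating Q at u = 0.
    return [btcpe_quantile(1.0 - rng.random(), alpha, lam) for _ in range(n)]


def macgillivray_skewness(u, alpha, lam):
    """MacGillivray skewness at level u, built from three quantiles."""
    q_low = btcpe_quantile(u, alpha, lam)
    q_high = btcpe_quantile(1.0 - u, alpha, lam)
    median = btcpe_quantile(0.5, alpha, lam)
    return (q_high + q_low - 2.0 * median) / (q_high - q_low)


def moors_kurtosis(alpha, lam):
    """Moors kurtosis, built from octiles and quartiles."""
    def q(p):
        return btcpe_quantile(p, alpha, lam)
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(3 / 4) - q(1 / 4))


# Illustrative usage with assumed parameter values and seed.
rng = random.Random(2022)
print(btcpe_sample(5, 2.0, 3.0, rng))
print(macgillivray_skewness(0.25, 2.0, 3.0), moors_kurtosis(2.0, 3.0))
```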

3.3. Moments and Moments Generating Function

The $r$th moments, incomplete moments and moment generating function of the BTCPE distribution are presented in this subsection.
Proposition 2. 
If $Y$ is a BTCPE random variable, i.e., a random variable with the BTCPE distribution, then its $r$th non-central moment is given by
$$\mu'_r = \frac{4\alpha}{\pi} \sum_{i=0}^{\infty} (-1)^i B\left(\frac{r}{\lambda} + 1, \alpha(1 + 2i)\right), \quad r = 1, 2, \ldots,$$
where $B(a, b) = \int_0^1 z^{a-1} (1 - z)^{b-1} \, dz$ is the beta integral function.
Proof. 
The $r$th non-central moment of the BTCPE random variable is defined as $\mu'_r = E(Y^r) = \int_0^1 y^r f_Y(y; \alpha, \lambda) \, dy$. Thus, substituting the expanded form of the PDF given in Equation (5) yields
$$\mu'_r = \frac{4\alpha\lambda}{\pi} \sum_{i=0}^{\infty} (-1)^i \int_0^1 y^{r + \lambda - 1} \left(1 - y^{\lambda}\right)^{\alpha(1 + 2i) - 1} \, dy.$$
Letting $z = y^{\lambda}$, so that $z \to 0$ as $y \to 0$, $z \to 1$ as $y \to 1$, and $dz = \lambda y^{\lambda - 1} \, dy$, we get
$$\mu'_r = \frac{4\alpha}{\pi} \sum_{i=0}^{\infty} (-1)^i \int_0^1 z^{r/\lambda} (1 - z)^{\alpha(1 + 2i) - 1} \, dz.$$
Hence, recognizing the integral as a beta function yields
$$\mu'_r = \frac{4\alpha}{\pi} \sum_{i=0}^{\infty} (-1)^i B\left(\frac{r}{\lambda} + 1, \alpha(1 + 2i)\right).$$
This completes the proof. ☐
The non-central moments can be used to derive other important characteristics of the BTCPE distribution, such as the variance and the coefficients of skewness and kurtosis.
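The series in Proposition 2 can be checked numerically. The sketch below truncates the alternating series and compares it with a midpoint-rule approximation of the defining integral. The truncation length, the averaging of the last two partial sums, the number of integration steps and the parameter values are all assumptions made for this illustration.

```python
import math


def beta_fn(a, b):
    """Beta function B(a, b), computed via log-gamma for numerical stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def btcpe_raw_moment(r, alpha, lam, terms=2000):
    """Truncated alternating series for the r-th non-central moment."""
    partial, previous = 0.0, 0.0
    for i in range(terms):
        previous = partial
        sign = 1.0 if i % 2 == 0 else -1.0
        partial += sign * beta_fn(r / lam + 1.0, alpha * (1 + 2 * i))
    # Averaging the last two partial sums damps the alternating truncation error.
    return (4.0 * alpha / math.pi) * 0.5 * (partial + previous)


def btcpe_pdf(y, alpha, lam):
    """PDF of the BTCPE distribution."""
    s = 1.0 - y**lam
    num = 4.0 * alpha * lam * y ** (lam - 1.0) * s ** (alpha - 1.0)
    return num / (math.pi * (1.0 + s ** (2.0 * alpha)))


def raw_moment_numeric(r, alpha, lam, steps=20000):
    """Midpoint-rule approximation of the integral of y^r f_Y(y) over (0, 1)."""
    width = 1.0 / steps
    mids = ((i + 0.5) * width for i in range(steps))
    return sum(y**r * btcpe_pdf(y, alpha, lam) for y in mids) * width


# Illustrative comparison for assumed parameter values.
for r in (1, 2, 3):
    print(r, btcpe_raw_moment(r, 2.0, 3.0), raw_moment_numeric(r, 2.0, 3.0))
```

The series converges slowly because its terms decay only polynomially in the summation index, which is why a fairly long truncation is used here.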
Proposition 3. 
The $r$th incomplete moment of the BTCPE random variable is given by
$$\varphi_r = \frac{4\alpha}{\pi} \sum_{i=0}^{\infty} (-1)^i B\left(y^{\lambda}; \frac{r}{\lambda} + 1, \alpha(1 + 2i)\right), \quad r = 1, 2, \ldots,$$
where $B(q; a, b) = \int_0^q z^{a-1} (1 - z)^{b-1} \, dz$ is the incomplete beta integral function.
Proof. 
By definition, the $r$th incomplete moment is given by
$$\varphi_r = E\left(Y^r \mathbf{1}_{\{Y < y\}}\right) = \int_0^y x^r f_Y(x; \alpha, \lambda) \, dx.$$
Hence, substituting the expanded form of the PDF into the definition yields
$$\varphi_r = \frac{4\alpha\lambda}{\pi} \sum_{i=0}^{\infty} (-1)^i \int_0^y x^{r + \lambda - 1} \left(1 - x^{\lambda}\right)^{\alpha(1 + 2i) - 1} \, dx.$$
Letting $z = x^{\lambda}$, so that $z \to 0$ as $x \to 0$, $z \to y^{\lambda}$ as $x \to y$, and $dz = \lambda x^{\lambda - 1} \, dx$, and proceeding as in the proof of Proposition 2, we obtain
$$\varphi_r = \frac{4\alpha}{\pi} \sum_{i=0}^{\infty} (-1)^i B\left(y^{\lambda}; \frac{r}{\lambda} + 1, \alpha(1 + 2i)\right).$$
This completes the proof. ☐
The moment generating function is useful for deriving the moments of a random variable, provided the moments exist.
Proposition 4. 
The moment generating function of the BTCPE random variable is given by
$$M_Y(t) = \frac{4\alpha}{\pi} \sum_{r=0}^{\infty} \sum_{i=0}^{\infty} (-1)^i \frac{t^r}{r!} B\left(\frac{r}{\lambda} + 1, \alpha(1 + 2i)\right).$$
Proof. 
By definition and a standard exponential expansion, we have $M_Y(t) = E\left(e^{tY}\right) = \sum_{r=0}^{\infty} \frac{t^r}{r!} \mu'_r$. Hence, substituting the $r$th non-central moment of the BTCPE distribution into the definition completes the proof. ☐
Table 1 shows the first six moments of the BTCPE distribution and other useful measures, such as the standard deviation (SD), coefficients of variation (CV), skewness (CS) and kurtosis (CK). The SD, CV, CS and CK are, respectively, given by
$$SD = \sqrt{\mu'_2 - \mu^2},$$
$$CV = \frac{\sigma}{\mu} = \sqrt{\frac{\mu'_2}{\mu^2} - 1},$$
$$CS = \frac{\mu'_3 - 3\mu\mu'_2 + 2\mu^3}{\left(\mu'_2 - \mu^2\right)^{3/2}}$$
and
$$CK = \frac{\mu'_4 - 4\mu\mu'_3 + 6\mu^2\mu'_2 - 3\mu^4}{\left(\mu'_2 - \mu^2\right)^2},$$
where $\mu = \mu'_1$ denotes the mean and $\sigma$ the SD.
From Table 1, the CS is negative for some parameter values and positive for others. It can be seen that the BTCPE distribution can be platykurtic or leptokurtic depending on the parameter values, since the CK can be lower than 3 or greater than 3, respectively. The coefficient of skewness also reveals that the BTCPE distribution can model both left and right-skewed data.
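The four summary measures depend only on the first four non-central moments, so they can be computed by a small helper. The sketch below is illustrative: the function name is an assumption, and the usage example feeds in the raw moments of the standard uniform distribution (1/2, 1/3, 1/4, 1/5) purely as a familiar sanity check, not as a BTCPE result.

```python
import math


def summary_from_raw_moments(m1, m2, m3, m4):
    """Return (SD, CV, CS, CK) from the first four non-central moments."""
    var = m2 - m1**2
    sd = math.sqrt(var)
    cv = sd / m1
    cs = (m3 - 3.0 * m1 * m2 + 2.0 * m1**3) / var**1.5
    ck = (m4 - 4.0 * m1 * m3 + 6.0 * m1**2 * m2 - 3.0 * m1**4) / var**2
    return sd, cv, cs, ck


# Sanity check with the raw moments of the Uniform(0, 1) distribution,
# whose skewness is 0 and kurtosis is 1.8.
sd, cv, cs, ck = summary_from_raw_moments(1 / 2, 1 / 3, 1 / 4, 1 / 5)
print(sd, cv, cs, ck)
```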

3.4. Order Statistics

Order statistics play an imperative role in both statistics and industrial reliability analysis. They can be used to estimate the minimum, maximum, and range of observations. They are used in developing control charts that are useful in industrial quality control analyses. Let $Y_{1:n} \le Y_{2:n} \le \cdots \le Y_{n:n}$ be $n$ order statistics from $n$ BTCPE random variables. Then, the PDF of $Y_{k:n}$ is given by
$$f_{k:n}(y; \alpha, \lambda) = \Omega_{k:n} \left[F_Y(y; \alpha, \lambda)\right]^{k-1} \left[1 - F_Y(y; \alpha, \lambda)\right]^{n-k} f_Y(y; \alpha, \lambda),$$
where
$$\Omega_{k:n} = \frac{n!}{(k-1)! \, (n-k)!}.$$
Using the binomial expansion $(1 - y)^{\lambda - 1} = \sum_{i=0}^{\infty} (-1)^i \binom{\lambda - 1}{i} y^i$, $|y| < 1$, we can write
$$f_{k:n}(y; \alpha, \lambda) = \Omega_{k:n} \sum_{i=0}^{k-1} (-1)^i \binom{k-1}{i} \left[1 - F_Y(y; \alpha, \lambda)\right]^{n-k+i} f_Y(y; \alpha, \lambda).$$
Thus, we have
$$f_{k:n}(y; \alpha, \lambda) = \Omega_{k:n} \frac{4 \alpha \lambda y^{\lambda - 1} \left(1 - y^{\lambda}\right)^{\alpha - 1}}{\pi \left[1 + \left(1 - y^{\lambda}\right)^{2\alpha}\right]} \sum_{i=0}^{k-1} (-1)^i \binom{k-1}{i} \left[\frac{4}{\pi} \arctan\left[\left(1 - y^{\lambda}\right)^{\alpha}\right]\right]^{n-k+i}.$$
On the other side, the CDF of $Y_{1:n}$ is simply given by
$$F_{1:n}(y; \alpha, \lambda) = 1 - \left[1 - F_Y(y; \alpha, \lambda)\right]^n = 1 - \left[\frac{4}{\pi} \arctan\left[\left(1 - y^{\lambda}\right)^{\alpha}\right]\right]^n,$$
and the CDF of $Y_{n:n}$ is derived as
$$F_{n:n}(y; \alpha, \lambda) = \left[F_Y(y; \alpha, \lambda)\right]^n = \left[1 - \frac{4}{\pi} \arctan\left[\left(1 - y^{\lambda}\right)^{\alpha}\right]\right]^n.$$
The distribution of the smallest order statistic represents the lifetime of a system connected in series, and that of the maximum order statistic denotes the lifetime of a system connected in parallel. Hence, they are vital in studying the minimum and maximum time to failure of components in engineering reliability. The minimum and maximum (min-max) plots of the order statistics can be used to investigate the distributional behavior of observations. The min–max plot captures not only the information in the tails but all the information about the whole distribution. The min–max plots shown in Figure 4 for some parameter values depend on E ( Y 1 : n ) and E ( Y n : n ) . From the min–max plots, the distribution can exhibit symmetrical, left-skewed, and right-skewed shapes.
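The order-statistic formulas of this subsection translate directly into code. The following sketch, with illustrative function names and assumed parameter values, evaluates the PDF of the $k$th order statistic in its unexpanded form together with the CDFs of the minimum and the maximum.

```python
import math


def btcpe_cdf(y, alpha, lam):
    """CDF of the BTCPE distribution."""
    return 1.0 - (4.0 / math.pi) * math.atan((1.0 - y**lam) ** alpha)


def btcpe_pdf(y, alpha, lam):
    """PDF of the BTCPE distribution."""
    s = 1.0 - y**lam
    num = 4.0 * alpha * lam * y ** (lam - 1.0) * s ** (alpha - 1.0)
    return num / (math.pi * (1.0 + s ** (2.0 * alpha)))


def order_stat_pdf(y, k, n, alpha, lam):
    """PDF of the k-th order statistic out of n (unexpanded form)."""
    omega = math.factorial(n) / (math.factorial(k - 1) * math.factorial(n - k))
    big_f = btcpe_cdf(y, alpha, lam)
    return omega * big_f ** (k - 1) * (1.0 - big_f) ** (n - k) * btcpe_pdf(y, alpha, lam)


def min_cdf(y, n, alpha, lam):
    """CDF of the smallest order statistic (series system lifetime)."""
    return 1.0 - (1.0 - btcpe_cdf(y, alpha, lam)) ** n


def max_cdf(y, n, alpha, lam):
    """CDF of the largest order statistic (parallel system lifetime)."""
    return btcpe_cdf(y, alpha, lam) ** n


# Illustrative usage with assumed values n = 5, alpha = 2, lambda = 3.
print(min_cdf(0.5, 5, 2.0, 3.0), max_cdf(0.5, 5, 2.0, 3.0))
print(order_stat_pdf(0.5, 2, 5, 2.0, 3.0))
```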

4. Bivariate Extension

Researchers may be interested in modeling the dependence between two (quantitative) measures in a dataset. For instance, one may be interested in modeling the relationship between age and the body mass index of individuals. Bivariate distributions have been used in reliability analysis, queuing theory, finance, and insurance risk analysis, among others, to study interdependency (see [26]). In this section, the bivariate extension of the BTCPE (BEBTCPE) distribution is proposed following the strategy developed in [26,27]. Given a bivariate continuous random vector $(X, Y)$, the CDF of the BEBTCPE distribution with parameters $\alpha, \lambda, \delta_1, \delta_2, \delta_3$, where $\alpha > 0$, $\lambda > 0$, $-1 < \delta_1 + \delta_3 < 1$, $-1 < \delta_2 + \delta_3 < 1$, $0 < x < 1$ and $0 < y < 1$, is given by
$$F_{XY}(x, y; \eta) = \left(1 - \frac{4}{\pi} \arctan\left[\left(1 - x^{\lambda}\right)^{\alpha}\right]\right) \left(1 - \frac{4}{\pi} \arctan\left[\left(1 - y^{\lambda}\right)^{\alpha}\right]\right) \left\{1 + (\delta_1 + \delta_3) \frac{4}{\pi} \arctan\left[\left(1 - x^{\lambda}\right)^{\alpha}\right] + (\delta_2 + \delta_3) \frac{4}{\pi} \arctan\left[\left(1 - y^{\lambda}\right)^{\alpha}\right]\right\}^{-1},$$
where $\eta = (\alpha, \lambda, \delta_1, \delta_2, \delta_3)^T$. The parameters $\delta_1$, $\delta_2$ and $\delta_3$ quantify the dependence between the two variables of a BEBTCPE random vector. The plots of the CDF for the following parameter values are shown in Figure 5:
(a) $\alpha = 3.5$, $\lambda = 8.2$, $\delta_1 = 0.3$, $\delta_2 = 0.1$, $\delta_3 = 0.3$;
(b) $\alpha = 2.5$, $\lambda = 0.8$, $\delta_1 = 0.5$, $\delta_2 = 0.4$, $\delta_3 = 0.2$ and
(c) $\alpha = 0.5$, $\lambda = 4.8$, $\delta_1 = 0.3$, $\delta_2 = 0.7$, $\delta_3 = 0.1$.
We notice various concave and convex shapes from these plots.
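The corresponding bivariate PDF is given by
$$f_{XY}(x, y; \eta) = \left(\frac{4\alpha\lambda}{\pi}\right)^2 (xy)^{\lambda - 1} \left(1 - x^{\lambda} - y^{\lambda} + (xy)^{\lambda}\right)^{\alpha - 1} \left[1 + \left(1 - x^{\lambda}\right)^{2\alpha}\right]^{-1} \left[1 + \left(1 - y^{\lambda}\right)^{2\alpha}\right]^{-1} \left\{1 + (\delta_1 + \delta_3) \frac{8}{\pi} \arctan\left[\left(1 - x^{\lambda}\right)^{\alpha}\right] + (\delta_2 + \delta_3) \frac{8}{\pi} \arctan\left[\left(1 - y^{\lambda}\right)^{\alpha}\right]\right\}^{-1}.$$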
Figure 6 shows the bivariate PDF plots of the BEBTCPE distribution for the following parameter values:
(a) $\alpha = 3.5$, $\lambda = 8.2$, $\delta_1 = 0.3$, $\delta_2 = 0.1$, $\delta_3 = 0.3$;
(b) $\alpha = 2.5$, $\lambda = 0.8$, $\delta_1 = 0.5$, $\delta_2 = 0.4$, $\delta_3 = 0.2$ and
(c) $\alpha = 0.5$, $\lambda = 4.8$, $\delta_1 = 0.3$, $\delta_2 = 0.7$, $\delta_3 = 0.1$.
The first graph displays a non-monotonic shape whereas the other two exhibit monotonic shapes, illustrating the versatility in the bivariate modeling sense.

5. Parameter Estimation Methods

This section presents nine estimation methods for estimating the parameters of the BTCPE distribution. These include the maximum likelihood (ML) estimation (MLE), ordinary least squares (OLS), weighted least squares (WLS), Cramér–von Mises (CVM), percentile (PC) estimation, Anderson–Darling (AD) methods, and maximum and minimum product spacing methods.

5.1. Maximum Likelihood Estimation

One of the most common methods used for estimating the parameters of a developed model is the MLE method. Suppose that $Y$ follows the BTCPE distribution, with $\vartheta = (\alpha, \lambda)^T$ as the parameter vector. For a single observation $y$ of $Y$, the log-likelihood function $\ell = \ell(\vartheta)$ is
$$\ell = \log\left(\frac{4\alpha\lambda}{\pi}\right) + (\lambda - 1) \log(y) + (\alpha - 1) \log\left(1 - y^{\lambda}\right) - \log\left(1 + \left(1 - y^{\lambda}\right)^{2\alpha}\right). \tag{14}$$
To obtain the estimates of the parameters for the single observation, the first partial derivatives of Equation (14) with respect to the parameters need to be derived. Here, we obtain
$$\frac{\partial \ell}{\partial \alpha} = \frac{1}{\alpha} + \log\left(1 - y^{\lambda}\right) - \frac{2 \left(1 - y^{\lambda}\right)^{2\alpha} \log\left(1 - y^{\lambda}\right)}{1 + \left(1 - y^{\lambda}\right)^{2\alpha}},$$
and
$$\frac{\partial \ell}{\partial \lambda} = \frac{1}{\lambda} + \log(y) - \frac{y^{\lambda} (\alpha - 1) \log(y)}{1 - y^{\lambda}} + \frac{2 \alpha y^{\lambda} \left(1 - y^{\lambda}\right)^{2\alpha - 1} \log(y)}{1 + \left(1 - y^{\lambda}\right)^{2\alpha}}.$$
Given that $y_1, y_2, \ldots, y_n$ are independent and identically distributed observations from $n$ BTCPE random variables, the total log-likelihood function is given by $\ell_n^* = \sum_{i=1}^{n} \ell_i(\vartheta)$, where $\ell_i(\vartheta)$, $i = 1, 2, \ldots, n$, is defined in Equation (14) with $y = y_i$. The estimates of the parameters can be obtained by maximizing the total log-likelihood function directly using software such as MATLAB, Mathematica or R. In this study, the R software is used [28]. Alternatively, the estimates of the parameters can be obtained by equating the first partial derivatives with respect to the parameters to zero and solving the resulting system of equations simultaneously. However, since the resulting system of equations does not have a closed-form solution, the nonlinear system of equations $\left(\frac{\partial \ell_n^*}{\partial \alpha}, \frac{\partial \ell_n^*}{\partial \lambda}\right)^T = (0, 0)^T$ is solved numerically to obtain the estimates of the parameters.
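As a hedged illustration of the quantities involved, the sketch below codes the total log-likelihood and its two partial derivatives (the score) and compares the score with central finite differences. The function names, the step size and the trial parameter values are assumptions; the five observations are simply the first five UK mortality rates listed in Section 7. The study itself maximizes the likelihood in R, so this Python code is only a stand-in for checking the formulas.

```python
import math


def loglik_one(y, alpha, lam):
    """Log-likelihood contribution of a single observation y in (0, 1)."""
    s = 1.0 - y**lam
    return (math.log(4.0 * alpha * lam / math.pi) + (lam - 1.0) * math.log(y)
            + (alpha - 1.0) * math.log(s) - math.log(1.0 + s ** (2.0 * alpha)))


def score_one(y, alpha, lam):
    """Partial derivatives of loglik_one with respect to alpha and lambda."""
    s = 1.0 - y**lam
    denom = 1.0 + s ** (2.0 * alpha)
    d_alpha = 1.0 / alpha + math.log(s) - 2.0 * s ** (2.0 * alpha) * math.log(s) / denom
    d_lam = (1.0 / lam + math.log(y)
             - (alpha - 1.0) * y**lam * math.log(y) / s
             + 2.0 * alpha * y**lam * s ** (2.0 * alpha - 1.0) * math.log(y) / denom)
    return d_alpha, d_lam


def total_loglik(data, alpha, lam):
    """Sum of the individual log-likelihood contributions."""
    return sum(loglik_one(y, alpha, lam) for y in data)


def total_score(data, alpha, lam):
    """Sum of the individual score vectors."""
    parts = [score_one(y, alpha, lam) for y in data]
    return sum(p[0] for p in parts), sum(p[1] for p in parts)


# First five UK mortality rates from Section 7; trial values are assumed.
data = [0.1292, 0.3805, 0.4049, 0.2564, 0.3091]
print(total_loglik(data, 2.0, 3.0), total_score(data, 2.0, 3.0))
```

In practice the total log-likelihood would be passed to a numerical optimizer, or the score equations to a root finder, as described above.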

5.2. Ordinary and Weighted Least Squares Estimation

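Suppose that $y_{(1)}, y_{(2)}, \ldots, y_{(n)}$ are ordered observations from $n$ BTCPE random variables. The OLS estimates of the parameters, $\hat{\alpha}_{LSE}$ and $\hat{\lambda}_{LSE}$, are obtained by minimizing the following function:
$$LSE(\alpha, \lambda) = \sum_{i=1}^{n} \left[\left(1 - \frac{4}{\pi} \arctan\left[\left(1 - y_{(i)}^{\lambda}\right)^{\alpha}\right]\right) - \frac{i}{n+1}\right]^2,$$
with respect to the parameters $\alpha$ and $\lambda$. Alternatively, the OLS estimates can be obtained by numerically solving the following nonlinear equations:
$$\sum_{i=1}^{n} \left[\left(1 - \frac{4}{\pi} \arctan\left[\left(1 - y_{(i)}^{\lambda}\right)^{\alpha}\right]\right) - \frac{i}{n+1}\right] \Delta_s\left(y_{(i)} \mid \alpha, \lambda\right) = 0, \quad s = 1, 2,$$
where
$$\Delta_1\left(y_{(i)} \mid \alpha, \lambda\right) = -\frac{8 \left(1 - y_{(i)}^{\lambda}\right)^{\alpha} \log\left(1 - y_{(i)}^{\lambda}\right)}{\pi \left[1 + \left(1 - y_{(i)}^{\lambda}\right)^{2\alpha}\right]}$$
and
$$\Delta_2\left(y_{(i)} \mid \alpha, \lambda\right) = \frac{8 \alpha y_{(i)}^{\lambda} \left(1 - y_{(i)}^{\lambda}\right)^{\alpha - 1} \log\left(y_{(i)}\right)}{\pi \left[1 + \left(1 - y_{(i)}^{\lambda}\right)^{2\alpha}\right]}.$$
The WLS estimates $\hat{\alpha}_{WLS}$ and $\hat{\lambda}_{WLS}$ are obtained by minimizing the following function:
$$WLS(\alpha, \lambda) = \sum_{i=1}^{n} \frac{(n+1)^2 (n+2)}{i (n - i + 1)} \left[\left(1 - \frac{4}{\pi} \arctan\left[\left(1 - y_{(i)}^{\lambda}\right)^{\alpha}\right]\right) - \frac{i}{n+1}\right]^2,$$
with respect to the parameters $\alpha$ and $\lambda$. Alternatively, the WLS estimates can be obtained by numerically solving the following nonlinear equations:
$$\sum_{i=1}^{n} \frac{(n+1)^2 (n+2)}{i (n - i + 1)} \left[\left(1 - \frac{4}{\pi} \arctan\left[\left(1 - y_{(i)}^{\lambda}\right)^{\alpha}\right]\right) - \frac{i}{n+1}\right] \Delta_s\left(y_{(i)} \mid \alpha, \lambda\right) = 0, \quad s = 1, 2,$$
where $\Delta_s\left(y_{(i)} \mid \alpha, \lambda\right)$, $s = 1, 2$, are defined above.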
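The two least squares criteria of this subsection can be sketched as plain objective functions. In the code below, the function names and the test data are assumptions: the artificial sample is made of exact model quantiles at the plotting positions $i/(n+1)$, so both objectives are essentially zero at the generating parameter values. Any general-purpose numerical minimizer could then be applied to these functions.

```python
import math


def btcpe_cdf(y, alpha, lam):
    """CDF of the BTCPE distribution."""
    return 1.0 - (4.0 / math.pi) * math.atan((1.0 - y**lam) ** alpha)


def btcpe_quantile(u, alpha, lam):
    """Quantile function of the BTCPE distribution."""
    t = math.tan(math.pi * (1.0 - u) / 4.0)
    return (1.0 - t ** (1.0 / alpha)) ** (1.0 / lam)


def ols_objective(data, alpha, lam):
    """Sum of squared gaps between the fitted CDF and i / (n + 1)."""
    y = sorted(data)
    n = len(y)
    return sum((btcpe_cdf(y[i - 1], alpha, lam) - i / (n + 1)) ** 2
               for i in range(1, n + 1))


def wls_objective(data, alpha, lam):
    """Same gaps, weighted by (n + 1)^2 (n + 2) / (i (n - i + 1))."""
    y = sorted(data)
    n = len(y)
    total = 0.0
    for i in range(1, n + 1):
        weight = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
        total += weight * (btcpe_cdf(y[i - 1], alpha, lam) - i / (n + 1)) ** 2
    return total


# Artificial sample: exact quantiles for assumed values alpha = 2, lambda = 3.
n = 20
data = [btcpe_quantile(i / (n + 1), 2.0, 3.0) for i in range(1, n + 1)]
print(ols_objective(data, 2.0, 3.0), ols_objective(data, 1.0, 1.0))
print(wls_objective(data, 2.0, 3.0), wls_objective(data, 1.0, 1.0))
```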

5.3. Cramér–Von Mises Estimation

Let $y_{(1)}, y_{(2)}, \ldots, y_{(n)}$ be ordered observations from $n$ BTCPE random variables. The CVM estimates of the parameters, $\hat{\alpha}_{CVM}$ and $\hat{\lambda}_{CVM}$, are obtained by minimizing the following function:
$$CVM(\alpha, \lambda) = \frac{1}{12n} + \sum_{i=1}^{n} \left[\left(1 - \frac{4}{\pi} \arctan\left[\left(1 - y_{(i)}^{\lambda}\right)^{\alpha}\right]\right) - \frac{2i - 1}{2n}\right]^2,$$
with respect to the parameters $\alpha$ and $\lambda$. The estimates of the parameters can also be obtained by numerically solving the following equations:
$$\sum_{i=1}^{n} \left[\left(1 - \frac{4}{\pi} \arctan\left[\left(1 - y_{(i)}^{\lambda}\right)^{\alpha}\right]\right) - \frac{2i - 1}{2n}\right] \Delta_s\left(y_{(i)} \mid \alpha, \lambda\right) = 0, \quad s = 1, 2,$$
where $\Delta_s\left(y_{(i)} \mid \alpha, \lambda\right)$, $s = 1, 2$, are given above.

5.4. Anderson–Darling Estimation

Another minimum distance estimation method is the AD estimation technique. Let $y_{(1)}, y_{(2)}, \ldots, y_{(n)}$ be ordered observations from $n$ BTCPE random variables. The AD estimates for the parameters of the BTCPE distribution are obtained by minimizing the following function:
$$AD(\alpha, \lambda) = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \left[\log\left(1 - \frac{4}{\pi} \arctan\left[\left(1 - y_{(i)}^{\lambda}\right)^{\alpha}\right]\right) + \log\left(\frac{4}{\pi} \arctan\left[\left(1 - y_{(n-i+1)}^{\lambda}\right)^{\alpha}\right]\right)\right],$$
with respect to the parameters $\alpha$ and $\lambda$.
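The two minimum distance criteria (CVM and AD) can be sketched as objective functions in the same way. The code below assumes the standard Anderson–Darling form, in which the survival term is evaluated at the reversed order statistic $y_{(n-i+1)}$; the function names and the artificial sample (exact model quantiles at the levels $(2i-1)/(2n)$) are illustrative assumptions.

```python
import math


def btcpe_cdf(y, alpha, lam):
    """CDF of the BTCPE distribution."""
    return 1.0 - (4.0 / math.pi) * math.atan((1.0 - y**lam) ** alpha)


def btcpe_quantile(u, alpha, lam):
    """Quantile function of the BTCPE distribution."""
    t = math.tan(math.pi * (1.0 - u) / 4.0)
    return (1.0 - t ** (1.0 / alpha)) ** (1.0 / lam)


def cvm_objective(data, alpha, lam):
    """Cramer-von Mises criterion for the fitted CDF."""
    y = sorted(data)
    n = len(y)
    return 1.0 / (12 * n) + sum(
        (btcpe_cdf(y[i - 1], alpha, lam) - (2 * i - 1) / (2 * n)) ** 2
        for i in range(1, n + 1))


def ad_objective(data, alpha, lam):
    """Anderson-Darling criterion (standard form, assumed here)."""
    y = sorted(data)
    n = len(y)
    total = 0.0
    for i in range(1, n + 1):
        cdf_i = btcpe_cdf(y[i - 1], alpha, lam)
        surv_rev = 1.0 - btcpe_cdf(y[n - i], alpha, lam)  # y_(n-i+1), 0-based index
        total += (2 * i - 1) * (math.log(cdf_i) + math.log(surv_rev))
    return -n - total / n


# Artificial sample: exact quantiles for assumed values alpha = 2, lambda = 3.
n = 20
data = [btcpe_quantile((2 * i - 1) / (2 * n), 2.0, 3.0) for i in range(1, n + 1)]
print(cvm_objective(data, 2.0, 3.0), cvm_objective(data, 1.0, 1.0))
print(ad_objective(data, 2.0, 3.0), ad_objective(data, 1.0, 1.0))
```

Both criteria are smaller at the generating parameter values than at the deliberately poor pair (1, 1), which is the behavior a minimizer exploits.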

5.5. Percentile Estimation

The PC estimation approach is another method of estimating the parameters of a given model. Let $y_{(1)}, y_{(2)}, \ldots, y_{(n)}$ be ordered observations from $n$ BTCPE random variables and $u_i = i/(n+1)$ be an unbiased estimate of $F_Y\left(y_{(i)}; \alpha, \lambda\right)$. The PC estimates of the parameters of the BTCPE distribution are obtained by minimizing the following function:
$$PC(\alpha, \lambda) = \sum_{i=1}^{n} \left[y_{(i)} - \left\{1 - \left(\tan\left[\frac{\pi}{4}(1 - u_i)\right]\right)^{1/\alpha}\right\}^{1/\lambda}\right]^2,$$
with respect to the parameters $\alpha$ and $\lambda$.
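A minimal sketch of the percentile criterion follows. To keep the example free of external dependencies, the minimization is done by a crude grid search, which is only a stand-in for the numerical optimizers normally used; the grid, the function names and the artificial sample of exact quantiles are assumptions made for this illustration.

```python
import math


def btcpe_quantile(u, alpha, lam):
    """Quantile function of the BTCPE distribution."""
    t = math.tan(math.pi * (1.0 - u) / 4.0)
    return (1.0 - t ** (1.0 / alpha)) ** (1.0 / lam)


def pc_objective(data, alpha, lam):
    """Squared gaps between ordered data and model quantiles at i / (n + 1)."""
    y = sorted(data)
    n = len(y)
    return sum((y[i - 1] - btcpe_quantile(i / (n + 1), alpha, lam)) ** 2
               for i in range(1, n + 1))


def grid_search(objective, data, alphas, lams):
    """Return the (alpha, lambda) pair on the grid with the smallest objective."""
    best = min((objective(data, a, l), a, l) for a in alphas for l in lams)
    return best[1], best[2]


# Artificial sample: exact quantiles for assumed values alpha = 2, lambda = 3.
n = 20
data = [btcpe_quantile(i / (n + 1), 2.0, 3.0) for i in range(1, n + 1)]
alphas = [1.0, 1.5, 2.0, 2.5, 3.0]
lams = [2.0, 2.5, 3.0, 3.5, 4.0]
print(grid_search(pc_objective, data, alphas, lams))
```

Because the sample consists of exact quantiles, the grid search returns the generating pair; with real data the minimum would generally fall between grid points, which is why a continuous optimizer is preferred.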

5.6. Maximum and Minimum Product Spacing Estimation

An alternative parameter estimation technique, which is based on the Kullback–Leibler information measure, is the maximum product spacing (MPS) method. Let $y_{(1)}, y_{(2)}, \ldots, y_{(n)}$ be ordered observations from $n$ BTCPE random variables. Consider the uniform spacings
$$D_i(\alpha, \lambda) = F_Y\left(y_{(i)}; \alpha, \lambda\right) - F_Y\left(y_{(i-1)}; \alpha, \lambda\right) = \frac{4}{\pi} \arctan\left[\left(1 - y_{(i-1)}^{\lambda}\right)^{\alpha}\right] - \frac{4}{\pi} \arctan\left[\left(1 - y_{(i)}^{\lambda}\right)^{\alpha}\right], \quad i = 1, 2, \ldots, n+1,$$
where $F_Y\left(y_{(0)}; \alpha, \lambda\right) = 0$, $F_Y\left(y_{(n+1)}; \alpha, \lambda\right) = 1$ and $D_1(\alpha, \lambda) + D_2(\alpha, \lambda) + \cdots + D_{n+1}(\alpha, \lambda) = 1$. The estimates of the parameters are obtained via the MPS approach by maximizing the logarithm of the geometric mean of the spacings, defined by
$$MPS(\alpha, \lambda) = \frac{1}{n+1} \sum_{i=1}^{n+1} \log D_i(\alpha, \lambda),$$
with respect to the parameters $\alpha$ and $\lambda$.
Additionally, the minimum spacing distance (MSD) estimates for the parameters $\alpha$ and $\lambda$ are obtained by minimizing the following function:
$$MSD(\alpha, \lambda) = \sum_{i=1}^{n+1} \vartheta\left(D_i(\alpha, \lambda), \frac{1}{n+1}\right),$$
with respect to the parameters $\alpha$ and $\lambda$, where $\vartheta(x, y)$ is an appropriate distance. Although different choices of $\vartheta(x, y)$ exist, in this study the absolute distance $|x - y|$ and the absolute-log distance $|\log x - \log y|$ are utilized. Thus, the minimum spacing absolute distance (MSAD) and minimum spacing absolute-log distance (MSALD) estimates are, respectively, obtained by minimizing the following functions:
$$MSAD(\alpha, \lambda) = \sum_{i=1}^{n+1} \left|D_i(\alpha, \lambda) - \frac{1}{n+1}\right|$$
and
$$MSALD(\alpha, \lambda) = \sum_{i=1}^{n+1} \left|\log D_i(\alpha, \lambda) - \log \frac{1}{n+1}\right|,$$
where $D_i(\alpha, \lambda) \ne \frac{1}{n+1}$ and $\log D_i(\alpha, \lambda) \ne \log \frac{1}{n+1}$.
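The spacing-based criteria can be sketched as follows. The function names and the artificial sample are assumptions; the sample consists of exact model quantiles at the levels $i/(n+1)$, so that every spacing equals $1/(n+1)$ at the generating parameter values. The sketch assumes distinct observations, since tied values would give a zero spacing and an undefined logarithm.

```python
import math


def btcpe_cdf(y, alpha, lam):
    """CDF of the BTCPE distribution."""
    return 1.0 - (4.0 / math.pi) * math.atan((1.0 - y**lam) ** alpha)


def btcpe_quantile(u, alpha, lam):
    """Quantile function of the BTCPE distribution."""
    t = math.tan(math.pi * (1.0 - u) / 4.0)
    return (1.0 - t ** (1.0 / alpha)) ** (1.0 / lam)


def spacings(data, alpha, lam):
    """Uniform spacings D_1, ..., D_{n+1}, with F(y_(0)) = 0 and F(y_(n+1)) = 1."""
    cdf_vals = [0.0] + [btcpe_cdf(v, alpha, lam) for v in sorted(data)] + [1.0]
    return [cdf_vals[i] - cdf_vals[i - 1] for i in range(1, len(cdf_vals))]


def mps_objective(data, alpha, lam):
    """Mean log-spacing, to be maximized."""
    d = spacings(data, alpha, lam)
    return sum(math.log(v) for v in d) / len(d)


def msad_objective(data, alpha, lam):
    """Sum of absolute deviations of the spacings from 1 / (n + 1)."""
    d = spacings(data, alpha, lam)
    return sum(abs(v - 1.0 / len(d)) for v in d)


def msald_objective(data, alpha, lam):
    """Sum of absolute deviations of the log-spacings from log(1 / (n + 1))."""
    d = spacings(data, alpha, lam)
    return sum(abs(math.log(v) - math.log(1.0 / len(d))) for v in d)


# Artificial sample: exact quantiles for assumed values alpha = 2, lambda = 3.
n = 20
data = [btcpe_quantile(i / (n + 1), 2.0, 3.0) for i in range(1, n + 1)]
print(mps_objective(data, 2.0, 3.0), mps_objective(data, 1.0, 1.0))
print(msad_objective(data, 2.0, 3.0), msald_objective(data, 2.0, 3.0))
```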

6. Simulation

In this section, simulation experiments are carried out to assess how well the proposed estimation methods recover the parameters of the BTCPE distribution. The experiments are carried out with the following two parameter combinations: $\alpha = 4.1$, $\lambda = 2.5$ and $\alpha = 3.1$, $\lambda = 8.5$. The experiments are replicated 5000 times with the following sample sizes: $n = 25, 75, 125, 175$ and 225. The average bias (AB) and root mean square error (RMSE) of the estimates are then computed and compared.
The AB and RMSE are, respectively, computed using
$$AB = \frac{1}{R} \sum_{i=1}^{R} \left(\hat{\vartheta}_i - \vartheta\right)$$
and
$$RMSE = \sqrt{\frac{1}{R} \sum_{i=1}^{R} \left(\hat{\vartheta}_i - \vartheta\right)^2},$$
where $\hat{\vartheta}$ is either $\hat{\alpha}$ or $\hat{\lambda}$ and $R = 5000$ is used in this study.
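The two performance measures are simple averages over the replications. A minimal sketch, with illustrative function names and a made-up list of estimates standing in for the output of a simulation run, is given below.

```python
import math


def average_bias(estimates, true_value):
    """Mean of (estimate - true value) over the R replications."""
    return sum(e - true_value for e in estimates) / len(estimates)


def rmse(estimates, true_value):
    """Square root of the mean squared deviation from the true value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))


# Hypothetical estimates of alpha from four replications (not simulation output).
estimates = [4.3, 3.9, 4.2, 4.0]
print(average_bias(estimates, 4.1), rmse(estimates, 4.1))
```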
From Table 2 and Table 3, most of the estimates have their ABs and RMSEs decreasing as the sample size increases. This is an indication that most of the estimates exhibit the consistency property. From Table 2, it can be observed that for sample sizes 25, 75 and 125 the PC estimate is the best for α and, for the sample sizes 175 and 225, the MLE is the best for α . For the parameter λ , the PC estimate is the best for the sample size 25 and the MLE is the best for 75, 125, 175 and 225. In Table 3, for sample sizes 25 and 75 the AD estimate is the best for the parameter α and the MLE is the best for 125, 175 and 225. For the parameter λ , the MLE is the best for sample sizes 25, 125, 175 and 225. The AD estimate is best for λ when the sample size is 75.

7. Applications

Three applications of the BTCPE distribution are illustrated in this section, and its performance is compared to other competitive distributions defined in the unit interval. The performance of the BTCPE distribution was compared with that of the beta, unit Burr-III (UBIII) [29], bounded M-O extended exponential (BMOEE) [30], unit Gompertz (UG) [18], unit Lindley (UL) [17], unit Weibull (UW) [20] and unit-improved second-degree Lindley (UISDL) [31] distributions. The Akaike information criterion (AIC), Bayesian information criterion (BIC), Anderson–Darling (AD) test, and Cramér–von Mises (CVM) are the model selection techniques employed in arriving at the best model. For these selection techniques, the best model is the one with the smallest test statistic. The datasets represent the mortality rate of COVID-19 patients in Canada and the United Kingdom (UK), and the recovery rate of COVID-19 patients in Spain. The first two datasets were recently reported by [8].
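For reference, the two information criteria follow their standard definitions, $AIC = 2k - 2\ell$ and $BIC = k \log(n) - 2\ell$, where $\ell$ is the maximized log-likelihood, $k$ the number of estimated parameters and $n$ the sample size. The sketch below uses hypothetical input values, not results from the fitted models.

```python
import math


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion: k * log(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2.0 * loglik


# Hypothetical values: log-likelihood 50.0, two parameters, 60 observations.
print(aic(50.0, 2), bic(50.0, 2, 60))
```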
The first dataset is the mortality rate for UK from 1 December 2020 to 29 January 2021. The data are: 0.1292, 0.3805, 0.4049, 0.2564, 0.3091, 0.2413, 0.1390, 0.1127, 0.3547, 0.3126, 0.2991, 0.2428, 0.2942, 0.0807, 0.1285, 0.2775, 0.3311, 0.2825, 0.2559, 0.2756, 0.1652, 0.1072, 0.3383, 0.3575, 0.2708, 0.2649, 0.0961, 0.1565, 0.1580, 0.1981, 0.4154, 0.3990, 0.2483, 0.1762, 0.1760, 0.1543, 0.3238, 0.3771, 0.4132, 0.4602, 0.352, 0.1882, 0.1742, 0.4033, 0.4999, 0.3930, 0.3963, 0.3960, 0.2029, 0.1791, 0.4768, 0.5331, 0.3739, 0.4015, 0.3828, 0.1718, 0.1657, 0.4542, 0.4772, 0.3402.
The second dataset denotes the mortality rate for Canada from 1 November to 26 December 2020. The data are: 0.1622, 0.1159, 0.1897, 0.1260, 0.3025, 0.2190, 0.2075, 0.2241, 0.2163, 0.1262, 0.1627, 0.2591, 0.1989, 0.3053, 0.2170, 0.2241, 0.2174, 0.2541, 0.1997, 0.3333, 0.2594, 0.2230, 0.2290, 0.1536, 0.2024, 0.2931, 0.2739, 0.2607, 0.2736, 0.2323, 0.1563, 0.2677, 0.2181, 0.3019, 0.2136, 0.2281, 0.2346, 0.1888, 0.2729, 0.2162, 0.2746, 0.2936, 0.3259, 0.2242, 0.1810, 0.2679, 0.2296, 0.2992, 0.2464, 0.2576, 0.2338, 0.1499, 0.2075, 0.1834, 0.3347, 0.2362.
The third dataset constitutes the recovery rates of COVID-19 patients in Spain from 3 March to 7 May 2020. The dataset can be found in [1]; the data are: 0.6670, 0.5000, 0.5000, 0.4286, 0.7500, 0.6531, 0.5161, 0.7895, 0.7689, 0.6873, 0.5200, 0.7251, 0.6375, 0.6078, 0.6289, 0.5712, 0.5923, 0.6061, 0.5924, 0.5921, 0.5592, 0.5954, 0.6164, 0.6455, 0.6725, 0.6838, 0.6850, 0.6947, 0.7210, 0.7315, 0.7412, 0.7508, 0.7519, 0.7547, 0.7645, 0.7715, 0.7759, 0.7807, 0.7838, 0.7847, 0.7871, 0.7902, 0.7934, 0.7913, 0.7962, 0.7971, 0.7977, 0.8007, 0.8038, 0.8289, 0.8322, 0.8354, 0.8371, 0.8387, 0.8456, 0.8490, 0.8535, 0.8547, 0.8564, 0.8580, 0.8604, 0.8628, 0.6586, 0.7070, 0.7963, 0.8516.
The ML estimates of the parameters are computed using the bbmle package in R [32]. The initial values of the parameters of the fitted distributions used for the optimization are obtained using the GenSA package in R [33]. Table 4 displays the descriptive statistics for COVID-19 mortality for the UK and Canada, as well as the recovery rate for Spain. The datasets are platykurtic due to the negative excess kurtosis. The UK mortality is right-skewed and that of Canada is left-skewed. The recovery rate for Spain is also left-skewed. This is affirmed by the boxplot of the datasets shown in Figure 7.

7.1. UK COVID-19 Mortality

Table 5 presents ML estimates of the parameters and their corresponding standard errors in brackets, the log-likelihood ($\ell$), AIC, BIC, AD, and CVM for the fitted distributions. Given that it has the lowest values for the AIC, BIC, AD, and CVM and the maximum log-likelihood, the BTCPE distribution offers the best fit to the UK mortality dataset.
Figure 8 displays the empirical and fitted PDFs and CDFs of the various distributions used to model the UK mortality dataset. The figure gives an indication that the BTCPE distribution provides a good fit to the dataset compared to the other models.
Figure 9 shows the probability–probability (P-P) plots of the fitted distributions. Figure 9 once more shows that the BTCPE distribution fits the UK COVID-19 mortality data well because the points cluster along the diagonal.
The profile log-likelihood plots for the estimated parameter values of the BTCPE distribution for the UK mortality data are shown in Figure 10. From the plots, it can be observed that the estimated values are the true maxima.

7.2. Canada COVID-19 Mortality

Table 6 presents ML estimates of the parameters and their corresponding standard errors in brackets and model selection criteria for the fitted distributions. The BTCPE distribution again provides the best fit to the Canada mortality dataset since it has the highest log-likelihood and the lowest values of the AIC, BIC, AD, and CVM.
Figure 11 shows the empirical and fitted PDFs and CDFs of the various distributions used to model the Canada mortality dataset. The figure indicates that the BTCPE distribution provides a better fit to the Canada mortality data than the other models, as it mimics the empirical PDF and CDF of the dataset more closely.
Figure 12 shows the P-P plots of the fitted models. Figure 12 gives an indication that the BTCPE distribution provides a good fit to the Canada mortality as the points cluster along the diagonal.
Figure 13 displays the profile log-likelihood plots for the estimated parameter values of the BTCPE distribution for the Canada mortality data. It can be observed from the plots that the estimates are unique and represent the true maxima.

7.3. Spain COVID-19 Recovery Rate

The ML estimates of the parameters and their corresponding standard errors in brackets and model selection criteria for the fitted distributions are shown in Table 7. Because it has the lowest values for the AIC, BIC, AD, and CVM and the maximum log-likelihood, the BTCPE distribution again offers the best fit to the Spain recovery rate dataset.
The empirical and fitted PDFs and CDFs of the various distributions used to model the Spain recovery rate dataset are shown in Figure 14. It can be seen that the BTCPE distribution provides a better fit to the recovery rate data than the other models.
The P-P plots of the fitted models for the recovery rate data are displayed in Figure 15. The plots indicate that the BTCPE distribution provides a good fit to the recovery rate data as the points cluster along the diagonal.
The profile log-likelihood plots for the estimated parameter values of the BTCPE distribution for the recovery rate data are shown in Figure 16. The plots suggest that the estimates are unique and represent the true maxima.

8. Quantile Regression

When the response variable defined on the unit interval is skewed or contaminated with outliers, the beta regression model, which models the conditional mean of the response variable, is no longer reliable. A robust regression model is needed to model the effects of the covariates on the response variable. In this study, a quantile regression model is proposed for modeling the conditional quantile of the response variable. Given the quantile function of the BTCPE distribution, the PDF of the BTCPE distribution can be re-parameterized in terms of its $u$th quantile as $\rho = Q(u; \alpha, \lambda)$, $\rho \in (0, 1)$. If $\lambda = \log\left(1 - (\tan[\pi(1-u)/4])^{1/\alpha}\right)/\log(\rho)$, then the re-parameterized PDF is
$$f_Y(y; \alpha, \lambda) = \frac{4\alpha \left(\dfrac{\log\left(1 - (\tan[\pi(1-u)/4])^{1/\alpha}\right)}{\log(\rho)}\right) y^{\left(\log\left(1 - (\tan[\pi(1-u)/4])^{1/\alpha}\right)/\log(\rho)\right) - 1} \left(1 - y^{\log\left(1 - (\tan[\pi(1-u)/4])^{1/\alpha}\right)/\log(\rho)}\right)^{\alpha - 1}}{\pi\left[1 + \left(1 - y^{\log\left(1 - (\tan[\pi(1-u)/4])^{1/\alpha}\right)/\log(\rho)}\right)^{2\alpha}\right]}.$$
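The re-parameterization can be checked numerically with a short Python sketch. The function names are illustrative, the parameter values are arbitrary, and `btcpe_cdf` is the antiderivative of the density above (derived here, not quoted from the text). By construction, evaluating the CDF at $\rho$ should return the quantile level $u$.

```python
import math


def lambda_from_quantile(rho, u, alpha):
    """Shape parameter lambda implied by the u-th quantile rho."""
    numerator = math.log(1.0 - math.tan(math.pi * (1.0 - u) / 4.0) ** (1.0 / alpha))
    return numerator / math.log(rho)


def btcpe_pdf(y, alpha, lam):
    """BTCPE density in the (alpha, lambda) parameterization."""
    w = 1.0 - y ** lam
    return (4.0 * alpha * lam * y ** (lam - 1.0) * w ** (alpha - 1.0)
            / (math.pi * (1.0 + w ** (2.0 * alpha))))


def btcpe_cdf(y, alpha, lam):
    """Antiderivative of btcpe_pdf that is 0 at y = 0 and 1 at y = 1."""
    return 1.0 - (4.0 / math.pi) * math.atan((1.0 - y ** lam) ** alpha)


alpha, u, rho = 0.7, 0.5, 0.4  # illustrative values only
lam = lambda_from_quantile(rho, u, alpha)
print(lam, btcpe_cdf(rho, alpha, lam))  # the second number should be close to u
```

Because $\lambda$ is a function of $\rho$, $u$ and $\alpha$, the density can be evaluated with `btcpe_pdf(y, alpha, lambda_from_quantile(rho, u, alpha))` without writing out the long expression.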
The parameter ρ is the quantile parameter. The BTCPE quantile regression is defined as
$$g(\rho_i) = \mathbf{z}_i \boldsymbol{\theta},$$
where $\boldsymbol{\theta} = (\theta_0, \theta_1, \ldots, \theta_p)^{\top}$ is the vector of unknown parameters, $\rho_i$ is the $i$th quantile parameter and $\mathbf{z}_i = (1, z_{i1}, z_{i2}, \ldots, z_{ip})$ is the known $i$th vector of covariates. The link function $g(\cdot)$ links the covariates to the conditional quantile of the dependent variable $Y$. Since $y \in (0, 1)$, the logit link function is used. Hence, we have
$$g(\rho_i) = \operatorname{logit}(\rho_i) = \log\left(\frac{\rho_i}{1 - \rho_i}\right).$$
Further, we can write
$$\rho_i = \frac{\exp(\mathbf{z}_i \boldsymbol{\theta})}{1 + \exp(\mathbf{z}_i \boldsymbol{\theta})}.$$
Substituting ρ i into the re-parameterized PDF, the log-likelihood for estimating the parameters of the BTCPE quantile regression is given by
$$\ell = \sum_{i=1}^{n} \log\left(\frac{4\alpha}{\pi} \cdot \frac{\log\left(1 - (\tan[\pi(1-u)/4])^{1/\alpha}\right)}{\log(\rho_i)}\right) - \sum_{i=1}^{n} \log\left(1 + (1 - z_i)^{2\alpha}\right) + \sum_{i=1}^{n} \left[\frac{\log\left(1 - (\tan[\pi(1-u)/4])^{1/\alpha}\right)}{\log(\rho_i)} - 1\right] \log(y_i) + (\alpha - 1) \sum_{i=1}^{n} \log(1 - z_i),$$
where $z_i = y_i^{\log\left(1 - (\tan[\pi(1-u)/4])^{1/\alpha}\right)/\log(\rho_i)}$. The estimates of the parameters of the regression equation are obtained by directly maximizing the log-likelihood function. The resulting estimates of $\alpha$ and $\boldsymbol{\theta}$ are denoted by $\hat{\alpha}$ and $\hat{\boldsymbol{\theta}} = (\hat{\theta}_0, \ldots, \hat{\theta}_p)^{\top}$, respectively.
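The log-likelihood above can be written down term by term, as in the minimal Python sketch below. All names are illustrative and the data are hypothetical. The quantity called $z_i$ in the text is named `w_i` in the code to avoid a clash with the covariates. The sketch only evaluates the function; a numerical optimizer would still be needed to maximize it.

```python
import math


def btcpe_qr_loglik(alpha, theta, y, covariates, u):
    """Log-likelihood of the BTCPE quantile regression at quantile level u."""
    # Numerator of lambda_i; it depends only on u and alpha.
    c = math.log(1.0 - math.tan(math.pi * (1.0 - u) / 4.0) ** (1.0 / alpha))
    s1 = s2 = s3 = s4 = 0.0
    for y_i, z_row in zip(y, covariates):
        eta = sum(t * z for t, z in zip(theta, z_row))  # linear predictor
        rho_i = 1.0 / (1.0 + math.exp(-eta))            # inverse logit link
        lam_i = c / math.log(rho_i)
        w_i = y_i ** lam_i                              # z_i in the text
        s1 += math.log(4.0 * alpha * lam_i / math.pi)
        s2 += math.log(1.0 + (1.0 - w_i) ** (2.0 * alpha))
        s3 += (lam_i - 1.0) * math.log(y_i)
        s4 += math.log(1.0 - w_i)
    return s1 - s2 + s3 + (alpha - 1.0) * s4


# Hypothetical data: three responses, an intercept and one covariate.
y = [0.20, 0.35, 0.50]
covariates = [[1.0, 0.1], [1.0, 0.5], [1.0, 0.9]]
print(btcpe_qr_loglik(0.7, [-0.5, 0.8], y, covariates, u=0.5))
```

Keeping the four sums separate mirrors the displayed equation, which makes it easier to compare the code against the formula line by line.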

8.1. Residual Analysis

Model diagnostics are essential when fitting a model to a dataset. Often, the behavior of the model residuals is examined to see whether the model really provides a good fit to the data. In this study, the randomized quantile residuals are used to assess the adequacy of the regression model. The randomized quantile residuals are defined as
$$r_i = \Phi^{-1}\left(F_Y(y_i; \hat{\alpha}, \hat{\boldsymbol{\theta}})\right), \quad i = 1, 2, \ldots, n,$$
where $\Phi^{-1}(\cdot)$ is the quantile function of the standard normal distribution. The randomized quantile residuals are expected to follow the standard normal distribution if the model provides a good fit to the data.
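The residual definition can be sketched in Python with the standard library's `statistics.NormalDist`. The fitted CDF is the one implied by the re-parameterized density, the function and argument names are illustrative, and the inputs below are hypothetical rather than fitted values from the paper.

```python
import math
from statistics import NormalDist


def quantile_residuals(y, covariates, alpha_hat, theta_hat, u):
    """Quantile residuals r_i = Phi^{-1}(F_Y(y_i)) for a fitted model."""
    c = math.log(1.0 - math.tan(math.pi * (1.0 - u) / 4.0) ** (1.0 / alpha_hat))
    standard_normal = NormalDist()
    residuals = []
    for y_i, z_row in zip(y, covariates):
        eta = sum(t * z for t, z in zip(theta_hat, z_row))
        rho_i = 1.0 / (1.0 + math.exp(-eta))   # fitted u-th quantile
        lam_i = c / math.log(rho_i)
        # Fitted CDF at y_i (implied by the re-parameterized density).
        f_i = 1.0 - (4.0 / math.pi) * math.atan((1.0 - y_i ** lam_i) ** alpha_hat)
        residuals.append(standard_normal.inv_cdf(f_i))
    return residuals


# Hypothetical responses and covariates with assumed estimates.
y = [0.20, 0.35, 0.50]
covariates = [[1.0, 0.1], [1.0, 0.5], [1.0, 0.9]]
print(quantile_residuals(y, covariates, 0.7, [-0.5, 0.8], u=0.5))
```

An observation that sits exactly at its fitted median gives a residual of zero when $u = 0.5$, and observations above it give positive residuals.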

8.2. Monte Carlo Simulation for Quantile Regression

Monte Carlo simulations are carried out in this section to examine the performance of the ML estimates of the parameters of the BTCPE regression model. The exercise is performed with two covariates. The following regression structure is adopted for the simulation:
$$\rho_i = \frac{\exp(\theta_0 + \theta_1 z_{i1} + \theta_2 z_{i2})}{1 + \exp(\theta_0 + \theta_1 z_{i1} + \theta_2 z_{i2})}.$$
The observations for the response variable are generated from the BTCPE distribution using sample sizes $n = 50, 100, 250, 350, 500, 600$ and $700$. The experiments were repeated 5000 times for each sample size. The performance of the ML estimates is examined using the AB and RMSE. The simulations were carried out at the median, $u = 0.5$. The following parameter combinations were used in the simulation: I: $(\alpha, \theta_0, \theta_1, \theta_2) = (0.7, 0.2, 0.8, 0.3)$, II: $(\alpha, \theta_0, \theta_1, \theta_2) = (0.6, 0.5, 0.4, 1.8)$ and III: $(\alpha, \theta_0, \theta_1, \theta_2) = (0.8, 0.4, 0.9, 0.6)$. From the simulation results shown in Table 8, the ABs and RMSEs of the estimates decrease as the sample size increases. Hence, the ML estimates of the BTCPE regression parameters are consistent.
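The data-generating step of such a simulation can be sketched by inverse-transform sampling. The sketch assumes standard uniform covariates, which the text does not specify, and uses the parameter values of combination I as printed. It covers only the generation of responses, not the ML fitting or the AB and RMSE summaries.

```python
import math
import random


def btcpe_quantile(p, alpha, lam):
    """Quantile function of the BTCPE distribution."""
    return (1.0 - math.tan(math.pi * (1.0 - p) / 4.0) ** (1.0 / alpha)) ** (1.0 / lam)


def simulate_responses(n, alpha, theta, u, rng):
    """Generate (y, z1, z2, rho) tuples under the two-covariate structure."""
    c = math.log(1.0 - math.tan(math.pi * (1.0 - u) / 4.0) ** (1.0 / alpha))
    data = []
    for _ in range(n):
        # Assumption: covariates drawn from the standard uniform distribution.
        z1, z2 = rng.random(), rng.random()
        eta = theta[0] + theta[1] * z1 + theta[2] * z2
        rho = 1.0 / (1.0 + math.exp(-eta))   # conditional u-th quantile
        lam = c / math.log(rho)
        y = btcpe_quantile(rng.random(), alpha, lam)
        data.append((y, z1, z2, rho))
    return data


rng = random.Random(2022)  # arbitrary seed for reproducibility
sample = simulate_responses(4000, 0.7, (0.2, 0.8, 0.3), 0.5, rng)
share_below = sum(1 for y, _, _, rho in sample if y <= rho) / len(sample)
print(share_below)  # should be near u = 0.5
```

Checking that roughly a fraction $u$ of the simulated responses fall below their own $\rho_i$ is a quick sanity test that the re-parameterization has been coded correctly before any estimation is attempted.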

8.3. Application

The application of the quantile regression model is demonstrated in this section using a real dataset. The data are taken from [34] and are also available at http://www.leg.ufpr.br/doku.php/publications:papercompanions:multquasibeta (accessed on 30 August 2022). The data consist of body fat percentage (response variable) measured in five regions: android, arms, gynoid, legs and trunk. The data comprise 298 observations, and the independent variables are age (in years), body mass index (in kg/m$^2$), sex (female or male) and IPAQ (sedentary (S), insufficiently active (I), or active (A)). In this study, the response variable, body fat percentage at arms, is regressed on age ($z_{i1}$), body mass index ($z_{i2}$) and sex ($z_{i3}$, 0 for female and 1 for male) using the relationship $\operatorname{logit}(\rho_i) = \theta_0 + \theta_1 z_{i1} + \theta_2 z_{i2} + \theta_3 z_{i3}$, $i = 1, 2, \ldots, 298$. Table 9 presents the ML estimates, standard errors, and p-values for the parameters of the fitted models at the different quantile levels. The estimates are all significant at the 5% level of significance.
Table 10 presents the model selection criteria for the different quantiles. The model for the 0.90th quantile provides the best fit to the data, as it has the lowest values of the model selection criteria.
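The criteria in Table 10 can be reproduced from the reported $-2\ell$ values with the usual definitions AIC $= -2\ell + 2k$ and BIC $= -2\ell + k\log(n)$. The sketch below assumes $k = 5$ estimated parameters ($\theta_0, \ldots, \theta_3$ and $\alpha$) and uses $n = 298$ observations as stated in the text.

```python
import math


def aic(minus_two_loglik, k):
    """Akaike information criterion from -2 * log-likelihood."""
    return minus_two_loglik + 2 * k


def bic(minus_two_loglik, k, n):
    """Bayesian information criterion from -2 * log-likelihood."""
    return minus_two_loglik + k * math.log(n)


# -2 * log-likelihood values reported in Table 10, keyed by quantile level u.
minus_two_loglik = {
    0.10: -885.3517,
    0.25: -887.4067,
    0.50: -889.1990,
    0.75: -889.8634,
    0.90: -890.8307,
}
for u, value in minus_two_loglik.items():
    print(u, round(aic(value, 5), 4), round(bic(value, 5, 298), 4))
```

Under these assumptions the computed values agree with the tabulated AIC and BIC to about three decimal places, the small differences being due to rounding of the reported $-2\ell$.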
Figure 17 shows the rate of change of the regression coefficients for the different quantile levels and the corresponding 95% confidence interval (CI). It can be observed that all the coefficients approach zero as the quantile level increases, suggesting that they are more important in explaining smaller quantiles.
Figure 18 and Figure 19 show the P-P plots and half-normal plots with simulated envelopes, respectively, for the randomized quantile residuals. These figures indicate that the BTCPE quantile regression model fits the $u$th quantile of the percentage of body fat in arms well for $u \in \{0.10, 0.25, 0.50, 0.75, 0.90\}$.

9. Conclusions

In this study, the BTCPE distribution is proposed for modeling datasets that are defined on the unit interval. The PDF of this distribution exhibits left-skewed, right-skewed, reversed J, and approximately symmetric shapes. The HRF displays increasing and bathtub shapes. This makes the distribution a suitable candidate for modeling datasets that exhibit these traits. Nine estimation methods were considered for estimating the parameters of the distribution, and the simulation results revealed that most of the resulting estimators were consistent. The applications of the BTCPE distribution were illustrated using datasets on the mortality and recovery rates of COVID-19. For all three datasets, the BTCPE model provided a better fit than the competing models. A quantile regression model for studying the relationship between the conditional quantiles of a bounded response variable and a set of covariates was also proposed, and its application was illustrated using real data. For the bivariate extension, the study only defined the cumulative distribution and probability density functions. Our future research will study the detailed properties of the bivariate distribution, estimate its parameters, and illustrate its applications.

Author Contributions

Conceptualization, S.N., A.G.A., and C.C.; Data curation, S.N., A.G.A., and C.C.; Methodology, S.N., A.G.A., and C.C.; Supervision, S.N., and C.C.; Validation, S.N., and C.C.; Visualization, S.N., and A.G.A.; Writing, S.N., and A.G.A.; Review & editing, S.N., and C.C. All authors read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Afify, A.Z.; Nassar, M.; Kumar, D.; Cordeiro, G.M. A new unit distribution: Properties and applications. Electron. J. Appl. Stat. Anal. 2022, 15, 460–484.
  2. Almazah, M.M.A.; Ullah, K.; Hussam, E.; Hossain, M.; Aldallal, R.; Riad, F.H. New Statistical Approaches for Modeling the COVID-19 Data Set: A Case Study in the Medical Sector. Complexity 2022, 2022, 1325825.
  3. Alahmadi, A.A.; Alqawba, M.; Almutiry, W.; Shawki, A.W.; Alrajhi, S.; Al-Marzouki, S.; Elgarhy, M. A New Version of Weighted Weibull Distribution: Modelling to COVID-19 Data. Discret. Dyn. Nat. Soc. 2022, 2022, 3994361.
  4. Algarni, A.; Almarashi, A.M.; Elbatal, I.; Hassan, A.S.; Almetwally, E.M.; Daghistani, A.M.; Elgarhy, M. Type I Half Logistic Burr X-G Family: Properties, Bayesian, and Non-Bayesian Estimation under Censored Samples and Applications to COVID-19 Data. Math. Probl. Eng. 2021, 2021, 5461130.
  5. Bantan, R.A.R.; Shafiq, S.; Tahir, M.H.; Elhassanein, A.; Jamal, F.; Almutiry, W.; Elgarhy, M. Statistical Analysis of COVID-19 Data: Using a New Univariate and Bivariate Statistical Model. J. Funct. Spaces 2022, 2022, 2851352.
  6. Arif, M.; Khan, D.M.; Aamir, M.; Khalil, U.; Bantan, R.A.R.; Elgarhy, M. Modeling COVID-19 Data with a Novel Extended Exponentiated Class of Distributions. J. Math. 2022, 2022, 1908161.
  7. Nagy, M.; Almetwally, E.M.; Gemeay, A.M.; Mohammed, H.S.; Jawa, T.M.; Sayed-Ahmed, N.; Muse, A.H. The New Novel Discrete Distribution with Application on COVID-19 Mortality Numbers in Kingdom of Saudi Arabia and Latvia. Complexity 2021, 2021, 7192833.
  8. Almetwally, E.M. The Odd Weibull Inverse Topp–Leone Distribution with Applications to COVID-19 Data. Ann. Data Sci. 2021, 9, 121–140.
  9. Muse, A.H.; Tolba, A.H.; Fayad, E.; Abu Ali, O.A.; Nagy, M.; Yusuf, M. Modelling the COVID-19 Mortality Rate with a New Versatile Modification of the Log-Logistic Distribution. Comput. Intell. Neurosci. 2021, 2021, 8640794.
  10. Haq, M.A.U.; Babar, A.; Hashmi, S.; Alghamdi, A.S.; Afify, A.Z. The Discrete Type-II Half-Logistic Exponential Distribution with Applications to COVID-19 Data. Pak. J. Stat. Oper. Res. 2021, 17, 921–932.
  11. Gündüz, S.; Korkmaz, M. A New Unit Distribution Based on the Unbounded Johnson Distribution Rule: The Unit Johnson SU Distribution. Pak. J. Stat. Oper. Res. 2020, 16, 471–490.
  12. Bantan, R.; Jamal, F.; Chesneau, C.; Elgarhy, M. Theory and Applications of the Unit Gamma/Gompertz Distribution. Mathematics 2021, 9, 1850.
  13. Nasiru, S.; Abubakari, A.G.; Angbing, I.D. Bounded Odd Inverse Pareto Exponential Distribution: Properties, Estimation, and Regression. Int. J. Math. Math. Sci. 2021, 2021, 9955657.
  14. Jodrá, P. A bounded distribution derived from the shifted Gompertz law. J. King Saud Univ.-Sci. 2020, 32, 523–536.
  15. Haq, M.A.U.; Hashmi, S.; Aidi, K.; Ramos, P.L.; Louzada, F. Unit Modified Burr-III Distribution: Estimation, Characterizations and Validation Test. Ann. Data Sci. 2020, 99, 1–26.
  16. Korkmaz, M.Ç. The unit generalized half normal distribution: A new bounded distribution with inference and application. U. P. B. Sci. Bull. Ser. A 2020, 82, 133–140.
  17. Mazucheli, J.; Menezes, A.F.B.; Chakraborty, S. On the one parameter unit-Lindley distribution and its associated regression model for proportion data. J. Appl. Stat. 2019, 46, 700–714.
  18. Mazucheli, J.; Menezes, A.F.; Dey, S. Unit-Gompertz distribution with applications. Statistica 2019, 79, 25–43.
  19. Korkmaz, M. A new heavy-tailed distribution defined on the bounded interval. J. Appl. Stat. 2019, 47, 2097–2119.
  20. Mazucheli, J.; Menezes, A.F.; Ghitany, M.E. The unit Weibull distribution and associated inference. J. Appl. Probab. Stat. 2018, 13, 1–22.
  21. Ghitany, M.E.; Mazucheli, J.; Menezes, A.F.B.; Alqallaf, F. The unit-inverse Gaussian distribution: A new alternative to two-parameter distributions on the unit interval. Commun. Stat.-Theory Methods 2018, 48, 3423–3438.
  22. Aldahlan, M.A.; Jamal, F.; Chesneau, C.; Elgarhy, M.; Elbatal, I. The Truncated Cauchy Power Family of Distributions with Inference and Applications. Entropy 2020, 22, 346.
  23. Shaked, M.; Shanthikumar, J.G. Stochastic Orders; Wiley: New York, NY, USA, 2007.
  24. MacGillivray, H.L. Skewness and Asymmetry: Measures and Orderings. Ann. Stat. 1986, 14, 994–1011.
  25. Moors, J.J. A quantile alternative for kurtosis. J. R. Stat. Soc. Ser. D 1988, 37, 25–32.
  26. Elhassanein, A. On Statistical Properties of a New Bivariate Modified Lindley Distribution with an Application to Financial Data. Complexity 2022, 2022, 2328831.
  27. Ganji, M.; Bevrani, H.; Hami, N. A New Method For Generating Continuous Bivariate Families. J. Iran. Stat. Soc. 2018, 17, 109–129.
  28. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2019; Available online: https://www.R-project.org/ (accessed on 30 October 2022).
  29. Modi, K.; Gill, V. Unit Burr-III distribution with application. J. Stat. Manag. Syst. 2019, 23, 579–592.
  30. Ghosh, I.; Dey, S.; Kumar, D. Bounded M-O Extended Exponential Distribution with Applications. Stoch. Qual. Control 2019, 34, 35–51.
  31. Altun, E.; Cordeiro, G.M. The unit-improved second-degree Lindley distribution: Inference and regression modeling. Comput. Stat. 2019, 35, 259–279.
  32. Bolker, B. Tools for General Maximum Likelihood Estimation; R Development Core Team: Vienna, Austria, 2014. Available online: https://github.com/bbolker/bbmle (accessed on 30 October 2022).
  33. Xiang, Y.; Gubian, S.; Suomela, B.; Hoeng, J. Generalized simulated annealing: GenSA package. R J. 2013, 5, 13–29.
  34. Petterle, R.R.; Bonat, W.H.; Scarpin, C.T.; Jonasson, T.; Borba, V.Z.C. Multivariate quasi-beta regression models for continuous bounded data. Int. J. Biostat. 2020, 17, 39–53.
Figure 1. PDF (left) and HRF (right) of the BTCPE distribution.
Figure 2. Plots of the MacGillivray skewness.
Figure 3. Plots of Moors kurtosis.
Figure 4. Min–max plots for some parameter values.
Figure 5. CDF plots of the BEBTCPE distribution.
Figure 6. PDF plots of the BEBTCPE distribution.
Figure 7. Boxplots of COVID-19 datasets.
Figure 8. Empirical and fitted PDFs (left) and CDFs (right) of UK dataset.
Figure 9. P-P plots for UK mortality.
Figure 10. Profile log-likelihood plots for estimated parameters of BTCPE for UK.
Figure 11. Empirical and fitted PDFs (left) and CDFs (right) of Canada dataset.
Figure 12. P-P plots for Canada mortality.
Figure 13. Profile log-likelihood plots for estimated parameters of BTCPE for Canada.
Figure 14. Empirical and fitted PDFs (left) and CDFs (right) of Spain dataset.
Figure 15. P-P plots for Spain recovery data.
Figure 16. Profile log-likelihood plots for estimated parameters of BTCPE for Spain.
Figure 17. Rate of change of regression coefficients for different quantiles.
Figure 18. P-P plots for randomized quantile residuals.
Figure 19. Half-normal plots with simulated envelopes for randomized quantile residuals.
Table 1. Values of moment measures, including the SD, CV, CS and CK.

| $\mu_r$ | $\alpha = 0.4$, $\lambda = 2.5$ | $\alpha = 4.5$, $\lambda = 3.1$ | $\alpha = 20.0$, $\lambda = 1.5$ |
|---|---|---|---|
| $\mu_1$ | 0.8799 | 0.5602 | 0.1339 |
| $\mu_2$ | 0.8021 | 0.3401 | 0.0242 |
| $\mu_3$ | 0.7457 | 0.2185 | 0.0053 |
| $\mu_4$ | 0.7020 | 0.1465 | 0.0013 |
| $\mu_5$ | 0.6667 | 0.1017 | 0.0004 |
| $\mu_6$ | 0.6373 | 0.0726 | 0.0001 |
| SD | 0.1668 | 0.1619 | 0.0794 |
| CV | 0.1896 | 0.2890 | 0.5931 |
| CS | −1.9527 | −0.3403 | 0.7713 |
| CK | 6.6850 | 2.7084 | 3.5390 |
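Table 2. AB and RMSE for $\alpha = 4.1$ and $\lambda = 2.5$.

AB

| Parameter | $n$ | MLE | MPS | MADS | MALDS | OLS | WLS | CVM | AD | PC |
|---|---|---|---|---|---|---|---|---|---|---|
| $\alpha$ | 25 | 0.7980 | 2.2327 | −2.3189 | 0.531 | 0.3530 | −2.6423 | 1.1477 | 0.4713 | −0.3155 |
| $\alpha$ | 75 | 0.2140 | 0.7443 | −1.8442 | 0.0634 | 0.1365 | −3.1632 | 0.3415 | 0.1180 | −0.1694 |
| $\alpha$ | 125 | 0.1342 | 0.4372 | −1.3088 | −0.0031 | 0.0472 | −3.3268 | 0.1337 | 0.0713 | −0.0783 |
| $\alpha$ | 175 | 0.0914 | 0.2987 | −0.8738 | −0.0272 | 0.0460 | −2.1721 | 0.1164 | 0.0657 | −0.0484 |
| $\alpha$ | 225 | 0.0677 | 0.2509 | −0.6976 | 0.0062 | 0.0301 | 3.0791 | 0.1096 | 0.0365 | −0.0623 |
| $\lambda$ | 25 | 0.1871 | 0.5436 | −1.1017 | 0.0300 | −0.0075 | −1.3344 | 0.2060 | 0.0670 | −0.1737 |
| $\lambda$ | 75 | 0.0478 | 0.2197 | −0.8939 | −0.0079 | 0.0078 | −2.0003 | 0.0802 | 0.1185 | −0.0672 |
| $\lambda$ | 125 | 0.0378 | 0.1293 | −0.6271 | −0.0160 | −0.0055 | −2.1461 | 0.0305 | 0.0146 | −0.0448 |
| $\lambda$ | 175 | 0.0233 | 0.0866 | −0.3986 | −0.0139 | 0.0034 | −1.6394 | 0.0280 | 0.0114 | −0.0267 |
| $\lambda$ | 225 | 0.0208 | 0.0820 | −0.3101 | 0.0021 | 0.0003 | 0.1937 | 0.0230 | 0.0007 | −0.0225 |

RMSE

| Parameter | $n$ | MLE | MPS | MADS | MALDS | OLS | WLS | CVM | AD | PC |
|---|---|---|---|---|---|---|---|---|---|---|
| $\alpha$ | 25 | 2.2233 | 3.5728 | 2.9322 | 3.3391 | 2.4457 | 2.6729 | 3.5377 | 1.9964 | 1.4972 |
| $\alpha$ | 75 | 0.9157 | 1.3139 | 2.4241 | 1.1330 | 1.1489 | 3.1657 | 1.2820 | 0.9535 | 0.8506 |
| $\alpha$ | 125 | 0.6843 | 0.8313 | 1.9860 | 0.7795 | 0.8149 | 3.3279 | 0.8159 | 0.7025 | 0.6641 |
| $\alpha$ | 175 | 0.5365 | 0.6791 | 1.5544 | 0.6323 | 0.6845 | 2.1990 | 0.6955 | 0.5941 | 0.5393 |
| $\alpha$ | 225 | 0.4841 | 0.5505 | 1.3266 | 0.5926 | 0.5906 | 3.3377 | 0.6147 | 0.5240 | 0.4860 |
| $\lambda$ | 25 | 0.6038 | 0.8201 | 1.3862 | 0.7382 | 0.6538 | 1.3687 | 0.7340 | 0.5749 | 0.5401 |
| $\lambda$ | 75 | 0.3089 | 0.4026 | 1.2090 | 0.3996 | 0.3692 | 2.0029 | 0.3886 | 0.3379 | 0.3175 |
| $\lambda$ | 125 | 0.2407 | 0.2740 | 0.9681 | 0.2987 | 0.2752 | 2.1472 | 0.2798 | 0.2560 | 0.2466 |
| $\lambda$ | 175 | 0.1959 | 0.2314 | 0.7264 | 0.2315 | 0.2372 | 1.6455 | 0.2383 | 0.2134 | 0.2008 |
| $\lambda$ | 225 | 0.1810 | 0.1939 | 0.6057 | 0.2196 | 0.2079 | 0.3066 | 0.2124 | 0.1874 | 0.1835 |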
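Table 3. AB and RMSE for $\alpha = 3.1$ and $\lambda = 8.5$.

AB

| Parameter | $n$ | MLE | MPS | MADS | MALDS | OLS | WLS | CVM | AD | PC |
|---|---|---|---|---|---|---|---|---|---|---|
| $\alpha$ | 25 | 0.5120 | 1.5281 | −1.3158 | 0.3712 | 0.2359 | −1.9121 | 0.7701 | 0.2725 | −0.6748 |
| $\alpha$ | 75 | 0.2081 | 0.5190 | −0.9456 | 0.0477 | 0.0264 | −2.2924 | 0.1973 | 0.0989 | −0.3302 |
| $\alpha$ | 125 | 0.0994 | 0.3218 | −0.6895 | 0.0570 | 0.0153 | −2.4255 | 0.1020 | 0.0757 | −0.2704 |
| $\alpha$ | 175 | 0.0867 | 0.2259 | −0.5478 | 0.0107 | 0.0240 | −1.3857 | 0.0882 | 0.0554 | −0.2187 |
| $\alpha$ | 225 | 0.0461 | 0.1719 | −0.4192 | 0.0007 | 0.0195 | 2.2166 | 0.0555 | 0.0200 | −0.1735 |
| $\lambda$ | 25 | 0.5725 | 1.9602 | −3.5282 | 0.1675 | 0.1107 | −4.6443 | 0.7293 | 0.2368 | −1.6432 |
| $\lambda$ | 75 | 0.2957 | 0.8211 | −2.4563 | −0.0292 | −0.0416 | −6.9243 | 0.2449 | 0.0825 | −0.7220 |
| $\lambda$ | 125 | 0.1098 | 0.4964 | −1.6975 | 0.0128 | −0.0435 | −7.3960 | 0.1015 | 0.0784 | −0.5259 |
| $\lambda$ | 175 | 0.1182 | 0.3504 | −1.2348 | −0.0232 | 0.0022 | −5.5678 | 0.1061 | 0.0622 | −0.4011 |
| $\lambda$ | 225 | 0.0631 | 0.2843 | −0.9196 | −0.0034 | 0.0015 | 0.6715 | 0.043 | 0.0249 | −0.3171 |

RMSE

| Parameter | $n$ | MLE | MPS | MADS | MALDS | OLS | WLS | CVM | AD | PC |
|---|---|---|---|---|---|---|---|---|---|---|
| $\alpha$ | 25 | 1.4793 | 2.5597 | 2.0817 | 1.9689 | 1.6086 | 1.9366 | 2.4325 | 1.3231 | 1.4204 |
| $\alpha$ | 75 | 0.6848 | 0.8671 | 1.5913 | 0.7865 | 0.7088 | 2.2949 | 0.8122 | 0.6294 | 0.8145 |
| $\alpha$ | 125 | 0.4778 | 0.6228 | 1.2854 | 0.5657 | 0.5532 | 2.4266 | 0.5644 | 0.5109 | 0.6262 |
| $\alpha$ | 175 | 0.4077 | 0.4554 | 1.0538 | 0.4821 | 0.4813 | 1.4810 | 0.4940 | 0.4163 | 0.5242 |
| $\alpha$ | 225 | 0.3331 | 0.3880 | 0.8460 | 0.4021 | 0.4133 | 2.3785 | 0.4184 | 0.3477 | 0.4768 |
| $\lambda$ | 25 | 2.0951 | 3.1555 | 4.8358 | 2.6753 | 2.3883 | 4.7703 | 2.6517 | 2.1203 | 2.6346 |
| $\lambda$ | 75 | 1.1627 | 1.4130 | 3.9050 | 1.3676 | 1.2655 | 6.9330 | 1.3246 | 1.1047 | 1.4877 |
| $\lambda$ | 125 | 0.8837 | 1.0321 | 3.0631 | 1.0694 | 0.9899 | 7.3994 | 0.9670 | 0.9327 | 1.1482 |
| $\lambda$ | 175 | 0.7361 | 0.7786 | 2.4368 | 0.8864 | 0.8608 | 5.5937 | 0.8687 | 0.7734 | 0.9365 |
| $\lambda$ | 225 | 0.6223 | 0.7091 | 1.9241 | 0.7604 | 0.7452 | 1.0898 | 0.7515 | 0.6590 | 0.8305 |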
Table 4. Descriptive statistics for datasets.

| Country | Minimum | Maximum | Mean | Skewness | Kurtosis |
|---|---|---|---|---|---|
| UK | 0.0807 | 0.5331 | 0.2888 | 0.0476 | −1.1034 |
| Canada | 0.1159 | 0.3347 | 0.2305 | −0.0850 | −0.4402 |
| Spain | 0.4286 | 0.8628 | 0.7240 | −0.6890 | −0.4761 |
Table 5. Parameter estimates and model selection criteria for UK.

| Model | Parameter estimates (standard errors) | $\ell$ | AIC | BIC | AD | CVM |
|---|---|---|---|---|---|---|
| BTCPE | $\alpha$ = 16.6904 (5.2798); $\lambda$ = 2.3884 (0.2865) | 45.4400 | −86.8726 | −82.6840 | 0.6494 | 0.1049 |
| Beta | $\alpha$ = 4.0502 (0.7128); $\beta$ = 10.0132 (1.8287) | 45.4000 | −86.7958 | −82.6071 | 0.7356 | 0.1280 |
| UBIII | $\alpha$ = 0.0757 (0.0383); $\beta$ = 13.3804 (6.5631) | 38.9000 | −73.8075 | −69.6188 | 2.8948 | 0.5248 |
| BMOEE | $\alpha$ = 105.2655 (59.9004); $\beta$ = 3.5949 (0.4092) | 40.7200 | −77.4396 | −73.2509 | 1.1465 | 0.1698 |
| UW | $\alpha$ = 0.2834 (0.0602); $\beta$ = 3.1228 (0.3047) | 42.5600 | −81.1208 | −76.9322 | 1.0656 | 0.1820 |
| UG | $\alpha$ = 686.3600 (2.2295 × 10 10); $\beta$ = 0.0011 (1.4051 × 10 4) | 2.8400 | −1.6760 | 2.5127 | 12.2290 | 2.4707 |
| UL | $\alpha$ = 2.8293 (0.3029) | 32.3800 | −62.7533 | −60.6590 | 4.4878 | 0.7574 |
| UISDL | $\alpha$ = 3.4259 (0.3151) | 33.6100 | −65.2142 | −63.1198 | 3.9972 | 0.6545 |
Table 6. Parameter estimates and model selection criteria for Canada.

| Model | Parameter estimates (standard errors) | $\ell$ | AIC | BIC | AD | CVM |
|---|---|---|---|---|---|---|
| BTCPE | $\alpha$ = 622.2064 (399.8188); $\lambda$ = 4.5085 (0.4837) | 86.4400 | −168.8806 | −164.8299 | 0.3767 | 0.0689 |
| Beta | $\alpha$ = 14.5128 (2.7128); $\beta$ = 48.4900 (9.1745) | 85.9400 | −167.8800 | −163.8293 | 0.4398 | 0.0692 |
| UBIII | $\alpha$ = 0.0080 (0.0011); $\beta$ = 101.7700 (8.4127 × 10 8) | 30.8900 | −57.7749 | −53.7242 | 14.8770 | 3.1113 |
| BMOEE | $\alpha$ = 2822.9776 (3.3087 × 10 5); $\beta$ = 5.4444 (0.1439) | 80.6700 | −157.3394 | −153.2887 | 1.5514 | 0.2327 |
| UW | $\alpha$ = 0.0552 (0.0193); $\beta$ = 6.1602 (0.5868) | 79.9500 | −155.9080 | −151.8573 | 1.4890 | 0.2389 |
| UG | $\alpha$ = 628.3885 (2.4072 × 10 10); $\beta$ = 0.0011 (1.4212 × 10 4) | 5.2500 | −6.4901 | −2.4393 | 18.5180 | 3.9712 |
| UL | $\alpha$ = 3.9381 (0.4506) | 41.1400 | −80.2707 | −78.2453 | 12.7090 | 2.5936 |
| UISDL | $\alpha$ = 3.4259 (0.3151) | 42.2000 | −82.3913 | −80.3660 | 12.3010 | 2.4925 |
Table 7. Parameter estimates and model selection criteria for Spain.

| Model | Parameter estimates (standard errors) | $\ell$ | AIC | BIC | AD | CVM |
|---|---|---|---|---|---|---|
| BTCPE | $\alpha$ = 7.1385 (1.7764); $\lambda$ = 7.1961 (0.9033) | 58.7500 | −113.4953 | −109.1160 | 0.8770 | 0.1363 |
| Beta | $\alpha$ = 12.7943 (2.2291); $\beta$ = 4.8994 (0.8270) | 57.5700 | −111.1489 | −106.7692 | 1.0520 | 0.1783 |
| UBIII | $\alpha$ = 5.4398 (0.7948); $\beta$ = 2.0613 (0.1723) | 53.8000 | −103.5927 | −99.2134 | 1.3725 | 0.2209 |
| BMOEE | $\alpha$ = 22.1286 (9.9041); $\beta$ = 10.0043 (1.2381) | 51.4600 | −98.9276 | −94.5483 | 1.4958 | 0.2100 |
| UW | $\alpha$ = 8.6445 (1.6973); $\beta$ = 2.2320 (0.2036) | 53.9700 | −103.9316 | −99.5523 | 1.3830 | 0.2238 |
| UG | $\alpha$ = 0.2792 (0.1059); $\beta$ = 3.8482 (0.6025) | 46.0300 | −88.0569 | −83.6776 | 2.4709 | 0.3691 |
| UL | $\alpha$ = 0.5200 (0.0466) | 46.1100 | −90.2298 | −88.0402 | 4.2480 | 0.6736 |
| UISDL | $\alpha$ = 0.7403 (0.0539) | 52.0400 | −102.0717 | −99.8820 | 2.3450 | 0.3194 |
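Table 8. Simulation results for the quantile regression.

| Parameter | $n$ | I: AB | I: RMSE | II: AB | II: RMSE | III: AB | III: RMSE |
|---|---|---|---|---|---|---|---|
| $\theta_0$ | 50 | 0.1949 | 0.2235 | 0.3599 | 0.3753 | 0.2609 | 0.2969 |
| $\theta_0$ | 100 | 0.1946 | 0.1961 | 0.3551 | 0.3726 | 0.2178 | 0.2579 |
| $\theta_0$ | 250 | 0.1919 | 0.1941 | 0.3465 | 0.3673 | 0.1525 | 0.1926 |
| $\theta_0$ | 350 | 0.1898 | 0.1928 | 0.3271 | 0.3544 | 0.1320 | 0.1700 |
| $\theta_0$ | 500 | 0.1838 | 0.1927 | 0.3109 | 0.3482 | 0.1101 | 0.1431 |
| $\theta_0$ | 600 | 0.1779 | 0.1886 | 0.3051 | 0.3434 | 0.0998 | 0.1318 |
| $\theta_0$ | 700 | 0.1761 | 0.1850 | 0.2908 | 0.3333 | 0.0908 | 0.1196 |
| $\theta_1$ | 50 | 0.2826 | 0.3067 | 0.3485 | 0.3807 | 0.8194 | 0.8276 |
| $\theta_1$ | 100 | 0.2605 | 0.2904 | 0.3181 | 0.3486 | 0.8142 | 0.8238 |
| $\theta_1$ | 250 | 0.2290 | 0.2651 | 0.3171 | 0.3363 | 0.8013 | 0.8134 |
| $\theta_1$ | 350 | 0.2176 | 0.2539 | 0.3138 | 0.3342 | 0.7872 | 0.8041 |
| $\theta_1$ | 500 | 0.2097 | 0.2454 | 0.3083 | 0.3305 | 0.7727 | 0.7945 |
| $\theta_1$ | 600 | 0.2079 | 0.2433 | 0.3020 | 0.3272 | 0.7188 | 0.7610 |
| $\theta_1$ | 700 | 0.2053 | 0.2389 | 0.2978 | 0.3253 | 0.6862 | 0.7447 |
| $\theta_2$ | 50 | 1.5889 | 1.5959 | 1.7104 | 1.7153 | 0.5212 | 0.5338 |
| $\theta_2$ | 100 | 1.5835 | 1.5913 | 1.7046 | 1.7102 | 0.5140 | 0.5291 |
| $\theta_2$ | 250 | 1.5818 | 1.5910 | 1.6938 | 1.7006 | 0.5073 | 0.5250 |
| $\theta_2$ | 350 | 1.5698 | 1.5815 | 1.6751 | 1.6845 | 0.4893 | 0.5130 |
| $\theta_2$ | 500 | 1.5566 | 1.5723 | 1.6432 | 1.6578 | 0.4753 | 0.5046 |
| $\theta_2$ | 600 | 1.4749 | 1.5132 | 1.5559 | 1.5917 | 0.4601 | 0.4999 |
| $\theta_2$ | 700 | 1.3803 | 1.4520 | 1.4593 | 1.5264 | 0.4535 | 0.4921 |
| $\alpha$ | 50 | 0.0792 | 0.0998 | 0.0842 | 0.1110 | 0.1091 | 0.1520 |
| $\alpha$ | 100 | 0.0577 | 0.0745 | 0.0570 | 0.0747 | 0.0872 | 0.1382 |
| $\alpha$ | 250 | 0.0352 | 0.0463 | 0.0339 | 0.0437 | 0.0523 | 0.0859 |
| $\alpha$ | 350 | 0.0295 | 0.0378 | 0.0287 | 0.0366 | 0.0427 | 0.0650 |
| $\alpha$ | 500 | 0.0246 | 0.0316 | 0.0239 | 0.0317 | 0.0340 | 0.0467 |
| $\alpha$ | 600 | 0.0227 | 0.0287 | 0.0217 | 0.0290 | 0.0317 | 0.0449 |
| $\alpha$ | 700 | 0.0210 | 0.0267 | 0.0201 | 0.0259 | 0.0287 | 0.0375 |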
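Table 9. ML estimates for quantile regression.

| $u$ | Quantity | $\hat{\theta}_0$ | $\hat{\theta}_1$ | $\hat{\theta}_2$ | $\hat{\theta}_3$ | $\hat{\alpha}$ |
|---|---|---|---|---|---|---|
| 0.10 | Estimates | −3.6699 | 0.0076 | 0.0905 | −1.004 | 308.7724 |
| 0.10 | Standard error | 0.1681 | $1.1670 \times 10^{-3}$ | $7.5355 \times 10^{-3}$ | $4.3797 \times 10^{-2}$ | $9.3305 \times 10^{-5}$ |
| 0.10 | p-value | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 |
| 0.25 | Estimates | −3.2544 | 0.0071 | 0.0845 | −0.9326 | 325.4705 |
| 0.25 | Standard error | 0.1545 | $1.0687 \times 10^{-3}$ | $6.9379 \times 10^{-3}$ | $4.0103 \times 10^{-2}$ | $4.6137 \times 10^{-5}$ |
| 0.25 | p-value | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 |
| 0.50 | Estimates | −2.8977 | 0.0067 | 0.0792 | −0.8732 | 340.4285 |
| 0.50 | Standard error | 0.1436 | $9.9065 \times 10^{-4}$ | $6.4570 \times 10^{-3}$ | $3.7166 \times 10^{-2}$ | $1.3990 \times 10^{-5}$ |
| 0.50 | p-value | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 |
| 0.75 | Estimates | −2.6424 | 0.0064 | 0.0766 | −0.8384 | 281.1611 |
| 0.75 | Standard error | 0.1405 | $9.7128 \times 10^{-4}$ | $6.3363 \times 10^{-3}$ | $3.6303 \times 10^{-2}$ | $6.4012 \times 10^{-6}$ |
| 0.75 | p-value | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 |
| 0.90 | Estimates | −2.4030 | 0.0061 | 0.0731 | −0.7987 | 273.9968 |
| 0.90 | Standard error | 0.1353 | $9.3470 \times 10^{-4}$ | $6.1047 \times 10^{-3}$ | $3.4900 \times 10^{-2}$ | $2.9792 \times 10^{-5}$ |
| 0.90 | p-value | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 |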
Table 10. Model selection criteria for quantile regression.

| $u$ | $-2\ell$ | AIC | BIC |
|---|---|---|---|
| 0.10 | −885.3517 | −875.3517 | −856.8663 |
| 0.25 | −887.4067 | −877.4067 | −858.9212 |
| 0.50 | −889.1990 | −879.1990 | −860.7136 |
| 0.75 | −889.8634 | −879.8634 | −861.3779 |
| 0.90 | −890.8307 | −880.8307 | −862.3453 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Nasiru, S.; Abubakari, A.G.; Chesneau, C. New Lifetime Distribution for Modeling Data on the Unit Interval: Properties, Applications and Quantile Regression. Math. Comput. Appl. 2022, 27, 105. https://doi.org/10.3390/mca27060105
