Bounded Odd Inverse Pareto Exponential Distribution: Properties, Estimation, and Regression

In this paper, we introduce a new three-parameter distribution defined on the unit interval. The density function of the distribution exhibits different kinds of shapes such as decreasing, increasing, left skewed, right skewed, and approximately symmetric. The failure rate function shows increasing, bathtub, and modified upside-down bathtub shapes. Six different frequentist estimation procedures are proposed for estimating the parameters of the distribution, and their performance is assessed via Monte Carlo simulations. Applications of the distribution are illustrated by analyzing two datasets, and its fit is compared with that of other distributions defined on the unit interval. Finally, we develop a regression model for a response variable that follows the new distribution.


Introduction
The development of distributions defined on the unit interval is increasingly gaining ground in the literature due to their usefulness in areas such as psychology, economics, biology, and engineering. These distributions are useful for modeling data that are defined on the unit interval, such as proportions, percentages, or rates. In psychology, for instance, proportions and percentages play a critical role in assessing the probability of judgments, the proportion of the brain's volume occupied by a specific part of the brain, and the proportion of a period of time spent on an activity [1]. In economics, there are many instances where data are bounded on the unit interval, for example, the proportion of income spent on nondurable consumption, pension plan participation rates, market shares, fractional repayment on debts, and capital structures [1][2][3].
Distributions defined on the unit interval are known to have desirable failure (hazard) rate characteristics such as increasing, decreasing, and bathtub shapes. These failure rate characteristics are vital when modeling datasets. For instance, Rajarshi and Rajarshi [4] and Lawless [5] indicated different scenarios where distributions with bathtub hazard rates are needed to model the lifetimes of electronic, electrochemical, and mechanical products. Lai [6] reported on the optimum number of minimal repairs for systems that have increasing failure rates. Also, Woosley and Cossman [7] revealed that drugs have an increasing failure rate during clinical development.
Although the two-parameter beta distribution [8] is one of the oldest distributions for modeling datasets on the unit interval, its cumulative distribution and quantile functions are not tractable. This makes the generation of random observations from the beta distribution for simulation somewhat complex. Hence, many researchers aim at developing bounded distributions with tractable cumulative distribution and quantile functions. Some of the existing bounded distributions in the literature include the bounded M-O extended exponential distribution [2], unit Gompertz distribution [9], unit gamma distribution [10], Kumaraswamy distribution [11], Topp-Leone distribution [12], unit Burr III distribution [13], unit Weibull distribution [14], unit Lindley distribution [15], log-extended exponential-geometric distribution [16], logit slash distribution [17], unit Burr XII distribution [18], arcsecant hyperbolic normal distribution [19], unit Johnson $S_U$ distribution [20], and unit inverse Gaussian distribution [21].
Despite the existence of some bounded distributions in the literature, no single distribution can be considered the best for modeling all kinds of datasets. We are therefore motivated to develop a new bounded distribution with tractable cumulative distribution and quantile functions, called the bounded odd inverse Pareto exponential (BOIPE) distribution, for modeling datasets on the unit interval. The BOIPE distribution is developed using the transformation $Y = e^{-X}$, where $X$ follows the odd inverse Pareto exponential (OIPE) distribution [22]. The new distribution contains other existing distributions, such as the Kumaraswamy, bounded M-O extended exponential, and power function distributions, as submodels.
The remainder of the article is organized as follows: Sections 2 and 3 present the BOIPE distribution and its statistical properties, respectively. In Section 4, different frequentist estimation techniques are discussed. In Section 5, Monte Carlo simulations are carried out to examine the performance of the estimators. In Section 6, the applications of the BOIPE distribution are demonstrated. In Section 7, a regression model is proposed. Finally, the conclusion of the study is presented in Section 8.

BOIPE Distribution
Let the random variable $X$ follow the OIPE distribution with probability density function (PDF) given by
Then, the distribution of the random variable $Y = e^{-X}$ is the BOIPE distribution. The PDF, cumulative distribution function (CDF), and hazard rate of the BOIPE distribution are respectively given by
The BOIPE distribution generalizes some existing distributions defined on the (0, 1) support. These are the Kumaraswamy distribution for $\beta = 1$, the bounded M-O extended exponential (BMOEE) distribution for $\alpha = 1$, and the power function (PF) distribution for $\alpha = \beta = 1$. Figure 1 shows the relationship between the BOIPE distribution and its submodels. The PDF and hazard rate function of the BOIPE distribution exhibit different kinds of shapes, as shown in Figure 2. The PDF exhibits left skewed, right skewed, symmetric, J, and reversed J shapes for the given parameter values. The hazard rate function displays increasing, bathtub, and modified upside-down bathtub shapes. The R code for the PDF and hazard rate function can be found in the Appendix section.
The limiting behaviors of the density and hazard rate functions as $y \to 0$ and $y \to 1$ are respectively given by
Sometimes, to derive the statistical properties of a developed distribution, the expansion of the density function is required. Using the generalized binomial expansion, the density function can be written as

Quantile.
The quantile function is useful when generating random observations from a distribution. It can also be utilized in estimating measures of shape (skewness and kurtosis) when the moments of the random variable do not exist. The quantile function of the BOIPE distribution is
The first quartile, the median, and the upper quartile are obtained by substituting $u = 0.25$, $0.5$, and $0.75$, respectively, into equation (10). The quantile function can be used to generate random observations from the BOIPE distribution. The algorithm for generating random observations from the BOIPE distribution is as follows:
(i) Generate $u$ from the uniform distribution on (0, 1).
(ii) Compute $y = Q_Y(u)$ using equation (10).
The following R code can be used to generate random observations from the distribution.
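The R listing mentioned above is not reproduced in this text. As a rough illustration of the same inverse-transform idea, a minimal Python sketch follows. It assumes the closed-form CDF $F(y) = 1 - [(1 - y^{\lambda})/(1 - (1-\beta)y^{\lambda})]^{\alpha}$, chosen only because it reduces to the Kumaraswamy CDF at $\beta = 1$ and to the power function CDF at $\alpha = \beta = 1$, as the stated submodel relations require. That form and the names `boipe_cdf`, `boipe_quantile`, and `boipe_sample` are assumptions, not a transcription of equation (10).

```python
import random


def boipe_cdf(y, alpha, beta, lam):
    """ASSUMED CDF on (0, 1); consistent with the stated submodels only."""
    t = y ** lam
    return 1.0 - ((1.0 - t) / (1.0 - (1.0 - beta) * t)) ** alpha


def boipe_quantile(u, alpha, beta, lam):
    """Inverse of the assumed CDF, found by solving boipe_cdf(y) = u for y."""
    w = (1.0 - u) ** (1.0 / alpha)
    t = (1.0 - w) / (1.0 - (1.0 - beta) * w)
    return t ** (1.0 / lam)


def boipe_sample(n, alpha, beta, lam, rng=None):
    """Draw n observations: generate u ~ U(0, 1), then return Q(u)."""
    rng = rng or random.Random()
    return [boipe_quantile(rng.random(), alpha, beta, lam) for _ in range(n)]


# Parameter values taken from the second simulation setting in the text.
sample = boipe_sample(5, alpha=0.9, beta=3.8, lam=0.8, rng=random.Random(1))
```

Passing an explicit `random.Random` instance keeps a simulation reproducible without touching the global generator.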

Moments and Incomplete Moments.
The moments of a random variable, if they exist, are useful for estimating measures of central tendency, dispersion, and shape. For the BOIPE random variable, the $n$th noncentral moment is given by $\mu_n' = E(Y^n) = \int_0^1 y^n f(y)\,dy$. Thus, using the expanded form of the density function yields
If we let $u = y^{\lambda}$, then as $y \to 0$, $u \to 0$ and as $y \to 1$, $u \to 1$. Further, $dy = du/(\lambda y^{\lambda-1})$. Hence, after some algebraic manipulations, we have
where $B(\cdot,\cdot)$ is the beta function and $n = 1, 2, \ldots$. The central moments ($\mu_s$) and the cumulants ($\kappa_s$) can be obtained from the noncentral moments as $\mu_s = \sum_{i=0}^{s} \binom{s}{i} (-1)^i (\mu_1')^i \mu_{s-i}'$ and $\kappa_s = \mu_s' - \sum_{i=1}^{s-1} \binom{s-1}{i-1} \kappa_i \mu_{s-i}'$, respectively, where $\kappa_1 = \mu_1'$. The skewness and kurtosis are respectively calculated from the third and fourth standardized cumulants as $\gamma_1 = \kappa_3/\kappa_2^{3/2}$ and $\gamma_2 = \kappa_4/\kappa_2^{2}$. Table 1 displays the first six moments, standard deviation (SD), coefficient of variation (CV), coefficient of skewness (CS), and coefficient of kurtosis (CK). The values for SD, CV, CS, and CK are computed from the noncentral moments.
International Journal of Mathematics and Mathematical Sciences
The incomplete moments are important when estimating measures of inequality such as the Lorenz and Bonferroni curves and measures of deviation such as the mean and median deviations. The incomplete $n$th moment for the BOIPE distribution is defined as $m_n(t) = \int_0^t y^n f(y)\,dy$. Substituting the expanded form of the density function into the definition of the incomplete moments and simplifying yields
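The route from noncentral moments to cumulants and then to skewness and kurtosis, described earlier in this subsection, can be sketched in Python as follows. The function names are hypothetical, and the uniform distribution on (0, 1) is used as the worked case only because its moments, $1/(n+1)$, are known exactly; it is not one of the parameter settings in Table 1.

```python
def cumulants_from_raw_moments(m1, m2, m3, m4):
    """First four cumulants from the first four noncentral moments."""
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3.0 * m2 * m1 + 2.0 * m1 ** 3
    k4 = m4 - 4.0 * m3 * m1 - 3.0 * m2 ** 2 + 12.0 * m2 * m1 ** 2 - 6.0 * m1 ** 4
    return k1, k2, k3, k4


def shape_measures(m1, m2, m3, m4):
    """SD, CV, and the standardized third and fourth cumulants."""
    k1, k2, k3, k4 = cumulants_from_raw_moments(m1, m2, m3, m4)
    sd = k2 ** 0.5
    return {
        "sd": sd,
        "cv": sd / k1,
        "skewness": k3 / k2 ** 1.5,
        # kappa_4 / kappa_2^2 equals mu_4 / mu_2^2 - 3 (excess kurtosis).
        "kurtosis_from_cumulants": k4 / k2 ** 2,
    }


# Uniform(0, 1): noncentral moments are 1/2, 1/3, 1/4, 1/5.
uniform_shape = shape_measures(1 / 2, 1 / 3, 1 / 4, 1 / 5)
```

Note that the fourth standardized cumulant is the excess kurtosis, so it differs by 3 from a kurtosis coefficient defined as $\mu_4/\mu_2^2$.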

Generating Functions.
The moment generating, characteristic, and cumulant generating functions are derived in this section. The moment generating function, if it exists, is given by $M_Y(t) = E(e^{tY})$. Hence, employing the Taylor series expansion, the moment generating function of the BOIPE random variable $Y$ is given by $M_Y(t) = \sum_{n=0}^{\infty} \frac{t^n}{n!}\mu_n'$, where $\mu_n'$ is the $n$th noncentral moment. Thus, the characteristic function is given by $\phi_Y(t) = \sum_{n=0}^{\infty} \frac{(it)^n}{n!}\mu_n'$, where $i = \sqrt{-1}$.

Entropies.
Entropies are useful measures of variation of a random variable. They have been extensively used in the areas of physics, molecular imaging of tumours, and sparse kernel density estimation. In this subsection, the Rényi [23] and $\delta$ entropies are discussed. The Rényi entropy of a random variable $Y$ having the BOIPE distribution is defined as $I_R(\delta) = \frac{1}{1-\delta}\log \int_0^1 f(y)^{\delta}\,dy$, where $\delta > 0$ and $\delta \neq 1$. Using the generalized binomial expansion, we have
Letting $u = y^{\lambda}$, as $y \to 0$, $u \to 0$ and as $y \to 1$, $u \to 1$. Also, $dy = du/(\lambda y^{\lambda-1})$. Thus, after some algebraic manipulations, the Rényi entropy is obtained as
The $\delta$-entropy is defined as
and then it follows from equation (22).
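When a closed form is inconvenient, the Rényi entropy can also be approximated numerically from its definition, $\frac{1}{1-\delta}\log\int_0^1 f(y)^{\delta}\,dy$. The sketch below is an illustration under stated assumptions: `renyi_entropy` is a hypothetical helper that uses a plain midpoint rule, and the test density $f(y) = 2y$ (the power function density with $\lambda = 2$) is picked because the integral is known exactly; the paper's series expression is not used.

```python
import math


def renyi_entropy(pdf, delta, n_grid=20000):
    """Midpoint-rule approximation of the Renyi entropy on (0, 1)."""
    if delta <= 0 or delta == 1:
        raise ValueError("delta must be positive and different from 1")
    h = 1.0 / n_grid
    # Midpoints avoid evaluating the density at the endpoints 0 and 1.
    integral = sum(pdf((j + 0.5) * h) ** delta for j in range(n_grid)) * h
    return math.log(integral) / (1.0 - delta)


def power_function_pdf(y):
    """Power function density with lambda = 2, i.e. f(y) = 2y."""
    return 2.0 * y


# For f(y) = 2y and delta = 2 the integral is 4/3, so the entropy is -log(4/3).
value = renyi_entropy(power_function_pdf, delta=2.0)
```

A midpoint rule converges slowly for densities that are unbounded near 0 or 1, so a finer grid or an adaptive integrator would be needed in those cases.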

Stochastic Ordering.
Stochastic ordering is used to examine comparative behavior in reliability theory and other fields. Suppose $Y_1$ and $Y_2$ are two continuous random variables with PDFs $f(y)$ and $g(y)$, respectively. If $f(y)/g(y)$ is nondecreasing in $y$, then the random variable $Y_2$ is smaller than $Y_1$ in likelihood ratio order, denoted as $Y_2 \leq_{lr} Y_1$. The likelihood ratio order is stronger than the hazard rate order and the usual stochastic order, which are defined as follows:
(i) $Y_2$ is said to be smaller than $Y_1$ in the usual stochastic order, denoted by $Y_2 \leq_{st} Y_1$, if $F(y) \leq G(y)$ for all $y$, where $F(y)$ and $G(y)$ are the CDFs of $Y_1$ and $Y_2$, respectively.
(ii) $Y_2$ is said to be smaller than $Y_1$ in hazard rate order, denoted by $Y_2 \leq_{hr} Y_1$, if $\tau_2(y) \geq \tau_1(y)$ for all $y$, where $\tau_1(y)$ and $\tau_2(y)$ are the hazard rate functions of $Y_1$ and $Y_2$, respectively.

Proposition 2.
Let $Y_1$ and $Y_2$ be two random variables having the BOIPE distribution with parameters $(\alpha_1, \beta, \lambda)$ and $(\alpha_2, \beta, \lambda)$, respectively. If $\alpha_2 > \alpha_1$, then $Y_2 \leq_{lr} Y_1$.
Proof. The ratio of the densities of the random variables is
Next, the derivative of the logarithm of this ratio with respect to $y$ is
Hence, if $\alpha_2 > \alpha_1$, then $\frac{d}{dy}\log\left(f(y)/g(y)\right) > 0$ for all $y$. This implies that $f(y)/g(y)$ is nondecreasing in $y$ and thus $Y_2 \leq_{lr} Y_1$. □

Order Statistics.
Order statistics are important for estimating summary statistics such as the minimum, maximum, and range of a dataset. They are also used in quality control testing and reliability to forecast the failure of future items based on the times of a few early failures. Given that $Y_{(1:n)} \leq \cdots \leq Y_{(n:n)}$ are the order statistics of a random sample $Y_1, \ldots, Y_n$ from the BOIPE distribution, the PDF of the $k$th order statistic, $Y_{(k:n)}$, is given by $f_{k:n}(y) = \frac{n!}{(k-1)!\,(n-k)!}\, f(y)\,[F(y)]^{k-1}\,[1 - F(y)]^{n-k}$. Using the binomial expansion, the PDF of the $k$th order statistic can be written as $f_{k:n}(y) = \frac{n!}{(k-1)!\,(n-k)!} \sum_{j=0}^{n-k} (-1)^j \binom{n-k}{j} f(y)\,[F(y)]^{k+j-1}$. Substituting the PDF and CDF of the BOIPE distribution defined in equations (2) and (3), respectively, we have

Parameter Estimation
The methods for estimating the parameters of the BOIPE distribution are presented in this section. These include the maximum likelihood, ordinary least-squares (OLS), maximum product spacing (MPS), Cramér-von Mises (CVM), Anderson-Darling (AD), and percentile (PC) methods.

Maximum Likelihood Method.
If $y_1, \ldots, y_n$ are $n$ independent and identically distributed observations from the BOIPE distribution and $\theta = (\alpha, \beta, \lambda)^T$, then the total log-likelihood function, $\ell = \ell(\theta)$, is given by
The maximum likelihood estimates (MLEs) of the parameters can be obtained by directly maximizing equation (29) using the R software or by setting the following score equations to zero and solving them simultaneously using numerical methods:
When the regularity conditions are satisfied, $\hat{\theta} - \theta$ is approximately distributed as multivariate normal $N_3(0, J(\hat{\theta})^{-1})$, where $J(\hat{\theta})$ is the observed information matrix evaluated at $\hat{\theta}$. This approximation can be utilized to construct approximate confidence intervals for the BOIPE distribution parameters.
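As a small illustration of how the normal approximation above is typically turned into an interval, the sketch below forms a Wald-type interval from a point estimate and its variance, where the variance would be the corresponding diagonal entry of the inverse observed information matrix. The helper name `wald_interval` and the numbers in the usage line are hypothetical; they are not estimates reported in the paper.

```python
from statistics import NormalDist


def wald_interval(estimate, variance, level=0.95):
    """Approximate confidence interval: estimate +/- z * sqrt(variance)."""
    if variance < 0 or not 0.0 < level < 1.0:
        raise ValueError("need variance >= 0 and 0 < level < 1")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * variance ** 0.5
    return estimate - half_width, estimate + half_width


# Hypothetical values: an estimate of 0.8 with variance 0.01.
lower, upper = wald_interval(0.8, 0.01)
```

Because the three parameters are positive, an interval built this way can extend below zero in small samples; working on the log scale and transforming back is a common alternative.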

Ordinary Least Squares.
The OLS technique is an estimation procedure introduced by Swain et al. [24] for estimating the parameters of a model. Suppose $y_{(1)}, y_{(2)}, \ldots, y_{(n)}$ are ordered observations from the BOIPE distribution with CDF $F(y \mid \alpha, \beta, \lambda)$. The OLS estimates are obtained by minimizing $\sum_{i=1}^{n}\left[F(y_{(i)} \mid \alpha, \beta, \lambda) - \frac{i}{n+1}\right]^2$ with respect to the parameters $\alpha$, $\beta$, and $\lambda$.
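The MPS estimates are obtained by maximizing the logarithm of the geometric mean of the spacings, $\frac{1}{n+1}\sum_{i=1}^{n+1}\log D_i(\alpha, \beta, \lambda)$, with respect to $\alpha$, $\beta$, and $\lambda$, where $D_i(\alpha, \beta, \lambda) = F(y_{(i)} \mid \alpha, \beta, \lambda) - F(y_{(i-1)} \mid \alpha, \beta, \lambda)$, with $F(y_{(0)} \mid \alpha, \beta, \lambda) = 0$ and $F(y_{(n+1)} \mid \alpha, \beta, \lambda) = 1$.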
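The maximum product spacing criterion is easy to state in code. The following Python sketch evaluates the mean log-spacing for an arbitrary CDF supplied as a callable; `mps_objective` is a hypothetical name, and the uniform CDF in the usage lines stands in for the BOIPE CDF purely so that the values can be checked by hand.

```python
import math


def mps_objective(sample, cdf):
    """Mean of the log-spacings; larger is better under the MPS criterion."""
    ys = sorted(sample)
    # Pad with F = 0 below the smallest and F = 1 above the largest value.
    probs = [0.0] + [cdf(y) for y in ys] + [1.0]
    spacings = [b - a for a, b in zip(probs, probs[1:])]
    # Tied observations give a zero spacing, for which log() raises an error.
    return sum(math.log(d) for d in spacings) / len(spacings)


def uniform_cdf(y):
    """CDF of the uniform distribution on (0, 1), used as a stand-in."""
    return y


even = mps_objective([0.25, 0.5, 0.75], uniform_cdf)
uneven = mps_objective([0.1, 0.5, 0.75], uniform_cdf)
```

In practice the callable would be the fitted CDF at trial parameter values, and the negative of this objective would be passed to a numerical minimizer.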

Percentile Method.
The PC method is another approach for estimating the parameters of a model [30,31]. Suppose $u_i = i/(n+1)$ is an unbiased estimator of $F(y_{(i)} \mid \alpha, \beta, \lambda)$. The PC estimates of the BOIPE distribution parameters are obtained by minimizing $\sum_{i=1}^{n}\left[y_{(i)} - Q_Y(u_i)\right]^2$ with respect to the parameters, where $Q_Y(u_i)$ is given by equation (10).

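Cramér-von Mises Method.
The CVM estimation method is considered to have less bias than other minimum distance estimators [32]. The CVM estimates for the BOIPE distribution are obtained by minimizing $\frac{1}{12n} + \sum_{i=1}^{n}\left[F(y_{(i)} \mid \alpha, \beta, \lambda) - \frac{2i-1}{2n}\right]^2$ with respect to the parameters $\alpha$, $\beta$, and $\lambda$.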

Anderson-Darling Method.
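The AD estimator is another type of minimum distance estimator. The AD estimates of the BOIPE distribution are obtained by minimizing $-n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\log F(y_{(i)} \mid \alpha, \beta, \lambda) + \log\left(1 - F(y_{(n+1-i)} \mid \alpha, \beta, \lambda)\right)\right]$ with respect to the parameters $\alpha$, $\beta$, and $\lambda$.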
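The OLS, CVM, and AD criteria discussed in this section all compare the fitted CDF at the ordered observations with fixed plotting positions. A minimal Python sketch of the three objectives, in their standard textbook forms, is given below. The function names are hypothetical, the CDF is passed in as a callable, and the uniform CDF is used in the examples only because the results can be verified by hand.

```python
import math


def ols_distance(sample, cdf):
    """Sum of squared gaps between F(y_(i)) and i / (n + 1)."""
    ys = sorted(sample)
    n = len(ys)
    return sum((cdf(y) - i / (n + 1)) ** 2 for i, y in enumerate(ys, start=1))


def cvm_distance(sample, cdf):
    """Cramer-von Mises criterion with positions (2i - 1) / (2n)."""
    ys = sorted(sample)
    n = len(ys)
    gaps = sum((cdf(y) - (2 * i - 1) / (2 * n)) ** 2
               for i, y in enumerate(ys, start=1))
    return 1.0 / (12 * n) + gaps


def ad_distance(sample, cdf):
    """Anderson-Darling criterion; weights the tails more heavily."""
    ys = sorted(sample)
    n = len(ys)
    total = sum((2 * i - 1) * (math.log(cdf(ys[i - 1]))
                               + math.log(1.0 - cdf(ys[n - i])))
                for i in range(1, n + 1))
    return -n - total / n


def uniform_cdf(y):
    """Stand-in CDF: uniform distribution on (0, 1)."""
    return y


ols_value = ols_distance([0.25, 0.5, 0.75], uniform_cdf)
cvm_value = cvm_distance([1 / 6, 1 / 2, 5 / 6], uniform_cdf)
ad_value = ad_distance([0.5], uniform_cdf)
```

Each function returns a scalar, so any of them can be handed to a general-purpose numerical minimizer once the callable is parameterized by $(\alpha, \beta, \lambda)$.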

Monte Carlo Simulation
In this section, the performance of the estimators for the parameters of the BOIPE distribution is examined via Monte Carlo simulations. The simulation exercise was carried out using two sets of parameter values, that is, $(\alpha, \beta, \lambda) = (0.8, 0.5, 0.2)$ and $(0.9, 3.8, 0.8)$. The sample sizes $n = 25, 75, 150, 200, 300$, and $500$ were used to generate random observations from the BOIPE distribution using its quantile function. For each sample size, the experiment was replicated $N = 5{,}000$ times, and the average estimate (AE), absolute bias (AB), and mean square error (MSE) were computed. The results, as shown in Tables 2 and 3, indicate that all the estimators are consistent. For the first case (Table 2), the maximum likelihood estimators tend to have the smallest MSEs compared to the other estimators. For the second case (Table 3), when the sample size was 25, all the estimators tend to overestimate the parameter $\beta$. However, as the sample size increases, the estimates tend to converge to the actual parameter value. Again, the maximum likelihood estimators had the smallest MSEs as the sample size increased.
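The structure of such a simulation study can be sketched compactly. To keep the example free of numerical optimization, the Python code below uses the power function submodel ($\alpha = \beta = 1$), assuming its standard form $F(y) = y^{\lambda}$, for which the maximum likelihood estimate has the closed form $\hat{\lambda} = -n/\sum_i \log y_i$. The function name, the seed, the replication count, and the choice to compute the absolute bias as $|\mathrm{AE} - \lambda|$ are assumptions for illustration; the paper's tables come from the full three-parameter model.

```python
import math
import random


def simulate_mle(lam, n, reps, seed=0):
    """Return (AE, AB, MSE) of the closed-form MLE over `reps` replications."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        # y = u^(1/lam) is the PF quantile, so log(y) = log(u) / lam.
        # 1 - rng.random() lies in (0, 1], which avoids log(0).
        log_ys = [math.log(1.0 - rng.random()) / lam for _ in range(n)]
        estimates.append(-n / sum(log_ys))
    ae = sum(estimates) / reps
    ab = abs(ae - lam)  # assumed definition of absolute bias
    mse = sum((est - lam) ** 2 for est in estimates) / reps
    return ae, ab, mse


# Smaller replication count than in the text, to keep the sketch fast.
results = {n: simulate_mle(0.8, n, reps=1000, seed=1) for n in (25, 150, 500)}
```

A shrinking MSE across the three sample sizes is the pattern that is read as evidence of consistency.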

Applications
The empirical applications of the BOIPE distribution are illustrated in this section using two real datasets. The first dataset (data I) can be found in the study by Yousof et al. [33] and consists of the transformed total milk production in the first birth of 107 cows from the SINDI race. The data are 0. ... [35] distributions using goodness-of-fit statistics such as the Akaike information criterion (AIC), corrected Akaike information criterion (AICc), $-2\ell$, Anderson-Darling (AD) statistic, and Cramér-von Mises (CVM) statistic. The p values of the AD and CVM statistics are given in parentheses. The distribution with the smallest values of the goodness-of-fit statistics is considered the best for a given dataset. The R code for the empirical illustration can be found in the Appendix section. Table 4 presents the maximum likelihood estimates of the parameters of the fitted distributions, with their corresponding standard errors in parentheses, for data I. The goodness-of-fit statistics for the fitted distributions for the first dataset are shown in Table 5. It can be seen that the BOIPE distribution provides the best fit to the dataset since it has the smallest values for all the goodness-of-fit statistics.
Figure 4 displays the PDF and CDF plots of the fitted distributions for data I. The graph shows that the BOIPE distribution provides a good fit to the dataset. The probability-probability (P-P) plots of the fitted distributions for data I are shown in Figure 5. The plots again reveal that the BOIPE distribution fits the data well. The maximum likelihood estimates for the parameters of the fitted distributions for data II are given in Table 6. The goodness-of-fit statistics for the fitted distributions for the second dataset are given in Table 7. The results reveal that the BOIPE distribution again provides the best fit to the second dataset compared to the other competing distributions. The PDF and CDF plots of the fitted distributions, shown in Figure 6, give a pictorial representation of how well the distributions fit data II. It can be seen that the BOIPE distribution mimics the empirical distribution of the dataset. The P-P plots shown in Figure 7 also reveal that the BOIPE distribution provides a good fit to data II compared to the other fitted distributions.
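The information criteria used in these comparisons follow directly from the maximized log-likelihood $\ell$, the number of parameters $k$, and the sample size $n$: $\mathrm{AIC} = 2k - 2\ell$ and $\mathrm{AICc} = \mathrm{AIC} + 2k(k+1)/(n-k-1)$. A minimal Python sketch is shown below; the function names are hypothetical, and the log-likelihood value in the usage line is a made-up number, not a result from Tables 5 or 7.

```python
def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2.0 * k - 2.0 * loglik


def aicc(loglik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    if n <= k + 1:
        raise ValueError("AICc needs n > k + 1")
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)


# Hypothetical log-likelihood; k = 3 parameters and n = 107 as in data I.
example_aic = aic(-10.0, 3)
example_aicc = aicc(-10.0, 3, 107)
```

The correction term vanishes as $n$ grows, so the two criteria rank models almost identically for large samples.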

BOIPE Regression Model
Sometimes, one may be interested in investigating the effects of some exogenous variables on an endogenous variable and a regression model may be required to accomplish this task.
Thus, we propose a new parametric regression model under the assumption that the response variable follows the BOIPE distribution. In order to establish the regression model, we relate the parameters $\alpha$ and $\lambda$ to exogenous variables by the logarithm link functions $\alpha_i = \exp(x_i^T \alpha_1)$ and $\lambda_i = \exp(x_i^T \lambda_1)$, $i = 1, \ldots, n$, respectively, where $\alpha_1 = (\alpha_{10}, \ldots, \alpha_{1p})^T$ and $\lambda_1 = (\lambda_{10}, \ldots, \lambda_{1p})^T$ constitute the vectors of the regression coefficients and $x_i^T = (x_{i1}, \ldots, x_{ip})$. The survival function of $Y \mid X$ from equation (3) follows as

To estimate the parameters of the regression model, the maximum likelihood technique was employed. The total log-likelihood function that needs to be maximized in order to obtain the estimates of the regression parameters is given by
We demonstrated the application of the BOIPE regression model by modeling the relationship between the long-term interest (LTI) rates of the Organisation for Economic Co-operation and Development (OECD) countries and foreign direct investment (FDI). The data can be found in the study by Altun and Cordeiro [36]. The performance of the BOIPE regression model was compared to that of the beta and simplex regression models. The beta and simplex regression models were fitted using the betareg and simplexreg packages of the R software, respectively. The estimated parameters of the BOIPE regression model were obtained using the mle2 function of the bbmle package of the R software. The R code can be found in the Appendix section. Table 8 presents the estimated parameters (standard errors) of the fitted regression models and their goodness-of-fit statistics. For all the fitted models, the coefficient of the FDI is significant. The coefficient of the FDI in the BOIPE regression model is positive, indicating that an increase in the FDI is associated with an increase in the LTI rate. However, the coefficient of the FDI in the beta and simplex regression models is negative, implying that the FDI decreases the LTI rate.
In order to examine the adequacy of the BOIPE regression model, we estimated the Cox-Snell residuals [37]. The Cox-Snell residual is defined as $r_i = -\log \hat{S}(y_i \mid x_i)$, $i = 1, \ldots, n$, where $\hat{S}(\cdot)$ is the estimated survival function. If the model fits the data well, the Cox-Snell residuals are expected to behave like a sample from the standard exponential distribution [5].
Also, the plot of the Cox-Snell residuals versus $-\log \hat{S}(r_i)$, where $\hat{S}(r_i)$ is the Kaplan-Meier estimate of the survival function of the Cox-Snell residuals, is expected to be a straight line with zero intercept and unit slope. Figure 8 shows the P-P plot of the Cox-Snell residuals and the plot of the Cox-Snell residuals versus the negative logarithm of the Kaplan-Meier estimate. It can be seen from the P-P plot that the plotted points are close to the diagonal line, indicating that the model provides an adequate fit to the data. Also, the plot of the Cox-Snell residuals versus the negative logarithm of the Kaplan-Meier estimate of the residuals is a straight line with zero intercept, as shown in Figure 8.
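The residual computation described above can be sketched in a few lines of Python. The sketch assumes a survival function of the form $S(y) = [(1 - y^{\lambda})/(1 - (1-\beta)y^{\lambda})]^{\alpha}$, which is consistent with the Kumaraswamy and power function submodels named earlier but is not copied from the paper's equations. The log links follow the text, while the function names, covariate rows, and coefficient values are hypothetical.

```python
import math


def log_link(x, coefs):
    """exp(x^T coefs): maps a linear predictor to a positive parameter."""
    return math.exp(sum(xj * cj for xj, cj in zip(x, coefs)))


def boipe_survival(y, alpha, beta, lam):
    """ASSUMED survival function on (0, 1); see the note above."""
    t = y ** lam
    return ((1.0 - t) / (1.0 - (1.0 - beta) * t)) ** alpha


def cox_snell_residuals(ys, xs, alpha_coefs, lam_coefs, beta):
    """r_i = -log S(y_i | x_i) with alpha_i and lambda_i from the log links."""
    residuals = []
    for y, x in zip(ys, xs):
        alpha_i = log_link(x, alpha_coefs)
        lam_i = log_link(x, lam_coefs)
        residuals.append(-math.log(boipe_survival(y, alpha_i, beta, lam_i)))
    return residuals


# Hypothetical data: each row of xs is (intercept, covariate).
ys = [0.2, 0.5]
xs = [[1.0, 0.3], [1.0, -0.1]]
baseline = cox_snell_residuals(ys, xs, [0.0, 0.0], [0.0, 0.0], beta=1.0)
fitted = cox_snell_residuals(ys, xs, [0.1, 0.4], [-0.2, 0.3], beta=1.5)
```

With all coefficients at zero and $\beta = 1$ the assumed model collapses to the uniform distribution, so the residuals reduce to $-\log(1 - y_i)$, which gives a quick sanity check before plotting them against standard exponential quantiles.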

Conclusion
In this study, a three-parameter distribution called the bounded odd inverse Pareto exponential distribution was proposed. The hazard rate function of the proposed distribution exhibits different kinds of shapes, making it suitable for modeling datasets with both monotonic and nonmonotonic failure rates defined on the (0, 1) interval. Different estimation techniques were proposed for estimating the parameters of the model. The Monte Carlo simulation results revealed that the maximum likelihood procedure generally estimates the parameters better than the other estimation procedures. The empirical applications of the model using real datasets indicated that the new distribution provides a good fit to the given datasets compared to other existing distributions. Finally, we proposed the BOIPE regression model and compared its performance with the beta and simplex regression models using a real dataset. The goodness-of-fit statistics revealed that the BOIPE regression model fitted the given data better than the beta and simplex regression models.

Data Availability
The study is on methodological improvement, and the data used can be found within the paper with the appropriate source duly cited.

Conflicts of Interest
The authors declare that they have no conflicts of interest.