A novel weighted family of probability distributions with applications to world natural gas, oil, and gold reserves

Abstract: Recent innovations have focused on the creation of new families that extend well-known distributions while providing considerable practical flexibility for data modeling. Weighted distributions offer an effective approach for addressing model building and data interpretation problems. The main objective of this work is to provide a novel family based on a weighted generator, called the length-biased truncated Lomax-generated (LBTLo-G) family. The characteristics of the LBTLo-G family are discussed, including expressions for the probability density function, moments, and incomplete moments. In addition, different measures of uncertainty are determined. We provide four new sub-distributions and investigate their functionalities. Subsequently, a statistical analysis is given. The parameters of the LBTLo-G family are estimated using the maximum likelihood technique on the basis of full and censored samples. A simulation study is conducted to assess the parameter estimates of the LBTLo Weibull (LBTLoW) distribution. Four real data sets are considered to illustrate the fitting behavior of the LBTLoW distribution. In each case, the application outcomes demonstrate that the LBTLoW distribution can fit the data more accurately than other rival distributions.


Introduction
Weighted distributions (WDs) provide an approach to deal with model specification and data interpretation problems. They adjust the probabilities of the actual occurrence of events to arrive at a specification of the probabilities when those events are recorded. Reference [1] extended the basic ideas of the methods of ascertainment upon the estimation of frequencies in [2]. The author defined a unifying concept of the WDs and described several sampling conditions that the WDs can model. The usefulness and applications of the WDs in various areas, including medicine, ecology, reliability, and branching processes, can also be seen in [3][4][5]. Important findings on the WDs have been reported by several researchers. For example, reference [6] suggested a weighted x-gamma distribution, reference [7] derived a new generalized weighted Weibull distribution, reference [8] introduced the weighted exponential-Gompertz distribution, reference [9] studied the new weighted inverse Rayleigh distribution, reference [10] introduced a weighted version of the generalized inverse Weibull distribution, reference [11] proposed a bounded weighted exponential distribution, reference [12] derived a new weighted exponential distribution, reference [13] proposed a weighted power Lomax distribution, reference [14] derived a new generalized weighted exponential distribution, reference [15] introduced a new version of the weighted Weibull distribution, reference [16] proposed the modified weighted exponential distribution, reference [17] proposed a weighted Nwikpe distribution, reference [18] introduced a new version of the double weighted quasi Lindley distribution, and reference [19] proposed the modified length-biased weighted Lomax distribution.
More generally, statistical models have the capacity to depict and predict real-world phenomena. Over the past few decades, numerous extended distributions have been extensively utilized in data modeling. Recent progress has centered on the development of novel distribution families that not only enhance existing distributions but also offer significant versatility in practical data modeling. Engineering, economics, biology, and environmental science are particular areas of application. In this regard, a number of authors have proposed generated families of continuous distributions (see, for example, [20][21][22]). Our interest here is in the same scheme used for the beta-G (B-G) family prepared in [23]. The following is the cumulative distribution function (cdf) for the B-G family:

$$F(x) = \int_{0}^{G(x)} r(t)\,dt, \tag{1.1}$$

where G(x) is a cdf of a continuous distribution and r(t) is the probability density function (pdf) of the beta distribution. Naturally, any new family can be created by taking another pdf for r(t) with support [0, 1] (see reference [23]).
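The generator scheme in (1.1) can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes a Beta(2, 3) density for r(t) and a unit-rate exponential baseline G(x), both chosen only for illustration, and the helper names (`simpson`, `generated_cdf`) are hypothetical.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def generated_cdf(x, baseline_cdf, r):
    """F(x) = integral of r(t) over [0, G(x)], the scheme in (1.1)."""
    return simpson(r, 0.0, baseline_cdf(x))

# Illustrative choices (assumptions, not taken from the paper).
def beta23_pdf(t):
    # Beta(2, 3) density on [0, 1]: 12 t (1 - t)^2
    return 12.0 * t * (1.0 - t) ** 2

def exp_cdf(x):
    # Unit-rate exponential baseline G(x) = 1 - exp(-x) for x > 0
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def beta23_cdf(t):
    # Closed-form Beta(2, 3) cdf, used only to check the numerical integral
    return 6 * t**2 - 8 * t**3 + 3 * t**4

for x in (0.3, 1.0, 2.5):
    print(x, generated_cdf(x, exp_cdf, beta23_pdf), beta23_cdf(exp_cdf(x)))
```

Replacing `beta23_pdf` with any other density supported on [0, 1] gives a different generated family, which is the route taken for the LBTLo-G family.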
As a matter of fact, few works about the weighted-G family have been proposed in the literature. For example, reference [24] studied the weighted exponential-G family, reference [25] introduced the weighted exponentiated family, reference [26] proposed a weighted general family, and reference [27] developed a weighted Topp-Leone-G family.
The primary purpose of this study is to introduce the length-biased truncated Lomax-G (LBTLo-G) family. The following arguments give enough motivation to study it:
1) The LBTLo-G family is very flexible and simple.
2) The LBTLo-G family contains some new distributions.
3) The shapes of the pdfs of the generated distributions can be unimodal, decreasing, bathtub, right-skewed, and symmetric. Also, the hazard rate function (hrf) shapes for these distributions can be increasing, decreasing, U-shaped, upside-down-shaped, or J-shaped.
After emphasizing these important aspects, some statistical and mathematical properties of the newly suggested family are discussed. The maximum likelihood (ML) method of estimation is used to estimate the LBTLo Weibull (LBTLoW) model parameters based on complete samples and type II censoring (T2C). The flexibility of the LBTLoW distribution is demonstrated through four real data sets. The first data set describes relief times (in minutes) of patients receiving an analgesic. The second data set shows the percentage of natural gas reserves in 44 countries in 2020. The third data set lists the top 20 countries by oil reserves. Proven reserves refer to the quantities of petroleum that can be predicted as commercially recoverable from known reservoirs, based on the analysis of geological and engineering data. These estimates are made considering existing economic conditions and are projected from a specific period onwards. The fourth data set displays the top 100 central banks in terms of gold reserves. This gold reserve data, collected from IMF IFS figures, tracks central banks' reported gold purchases and sales as a percentage of their international reserves. The application results show that the LBTLoW distribution can fit the data better than other competing distributions.
This article is structured as follows. Section 2 defines the crucial functions of the LBTLo-G family and provides four special distributions of the family. In Section 3, some statistical properties of the LBTLo-G family are provided. Section 4 deals with the ML estimates (MLEs) of the unknown parameters. A simulation study to examine the performance of the MLEs for the LBTLoW distribution is presented in Section 5. Section 6 presents the applicability and goodness of fit of the proposed models using four real data sets. The paper ends with a few concluding observations in Section 7.

Construction of the LBTLo-G family
Here, we suggest a new weighted family based on the weighted version of the truncated Lomax distribution, which is called the LBTLo distribution [28]. The cdf and pdf of the LBTLo distribution, referred to below as (2.1) and (2.2), respectively, are given in [28] for 0 < t < 1. For t ≤ 0 and t ≥ 1, these functions take the standard complementary values.
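As a worked derivation, suppose the LBTLo distribution is obtained by applying the length-biased weight w(t) = t to a unit-scale Lomax density truncated to (0, 1). This construction is an assumption made here for illustration; the definitive expressions are those of [28]. Under it, the normalizing constant, pdf, and cdf follow by integration by parts. The symbols A(α), r(t; α), and R(t; α) are introduced only for this sketch and need not match the notation of [28].

```latex
% Assumed starting point: truncated unit-scale Lomax density on (0,1),
%   \alpha (1+t)^{-(\alpha+1)} / (1 - 2^{-\alpha}), with \alpha > 0, \alpha \neq 1.
\begin{aligned}
\int_0^1 t\,\alpha(1+t)^{-(\alpha+1)}\,dt
  &= \frac{1 - 2^{-\alpha}(1+\alpha)}{\alpha-1},
  \qquad
  A(\alpha) = \left[1 - 2^{-\alpha}(1+\alpha)\right]^{-1}, \\[4pt]
r(t;\alpha) &= A(\alpha)\,\alpha(\alpha-1)\,t\,(1+t)^{-(\alpha+1)},
  \qquad 0 < t < 1, \\[4pt]
R(t;\alpha) &= A(\alpha)\left[1 - (1+\alpha t)(1+t)^{-\alpha}\right],
  \qquad 0 < t < 1 .
\end{aligned}
```

A quick consistency check: R(1; α) = 1, and differentiating R(t; α) with respect to t returns r(t; α). The truncation constant 1 − 2^{−α} cancels between the numerator and the mean, which is why it does not appear in the final expressions.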
As mentioned in [28], the following advantages of the LBTLo distribution are outlined: (i) It depends on only one parameter; (ii) the pdf has only one maximum point with a relatively sharp peak and a heavy tail; (iii) the hrf has increasing behavior or is N-shaped; and (iv) it outperforms some other competing models in real-world applications to medical data and the percentage of household spending on education out of total household expenditure from the Household Income, Expenditure, and Consumption Survey data for North Sinai Governorate.
In light of these merits, the LBTLo distribution is a good choice for use in various fields. As a consequence, we present a novel generated family that is based on the LBTLo distribution. In order to define the LBTLo-G family, let G(x; ζ) and g(x; ζ) be the baseline cdf and pdf, respectively, of a continuous distribution, where ζ is a vector of parameters. The generalized B-G generator specified in (1.1) and the LBTLo pdf (2.2) are combined to generate the cdf (2.3) of the LBTLo-G family, where α is a shape parameter. The pdf (2.4) of the LBTLo-G family is obtained by differentiating (2.3). A random variable X with the pdf (2.4) is designated as X ∼ LBTLo-G from here on. The complementary cdf (ccdf) and the hrf follow in the usual way as 1 − F(x; α, ζ) and f(x; α, ζ)/[1 − F(x; α, ζ)], respectively. In the subsections that follow, we create four new LBTLo-G family sub-distributions: the LBTLo-inverse exponential (LBTLoIE), LBTLo-uniform (LBTLoU), LBTLo-Weibull (LBTLoW), and LBTLo-Kumaraswamy (LBTLoKw) distributions.
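The construction of the family can be sketched in code. The block below is a hedged illustration, not the authors' implementation: it assumes the LBTLo cdf and pdf take the length-biased, unit-scale, truncated-Lomax form on (0, 1) with α > 0 and α ≠ 1 (an assumption about (2.1) and (2.2)), and it uses the Weibull baseline G(x) = 1 − exp(−βx^γ) that appears later in the text. All function names are hypothetical.

```python
import math

def lbtlo_cdf(t, alpha):
    """Assumed LBTLo cdf on (0, 1); requires alpha > 0 and alpha != 1."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    norm = 1.0 - 2.0 ** (-alpha) * (1.0 + alpha)
    return (1.0 - (1.0 + alpha * t) * (1.0 + t) ** (-alpha)) / norm

def lbtlo_pdf(t, alpha):
    """Assumed LBTLo pdf on (0, 1), the derivative of lbtlo_cdf."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    norm = 1.0 - 2.0 ** (-alpha) * (1.0 + alpha)
    return alpha * (alpha - 1.0) * t * (1.0 + t) ** (-(alpha + 1.0)) / norm

def weibull_cdf(x, beta, gamma):
    # Baseline G(x) = 1 - exp(-beta * x^gamma) for x > 0
    return 1.0 - math.exp(-beta * x ** gamma) if x > 0 else 0.0

def weibull_pdf(x, beta, gamma):
    if x <= 0:
        return 0.0
    return beta * gamma * x ** (gamma - 1.0) * math.exp(-beta * x ** gamma)

def lbtlow_cdf(x, alpha, beta, gamma):
    # Family cdf: the LBTLo cdf evaluated at G(x)
    return lbtlo_cdf(weibull_cdf(x, beta, gamma), alpha)

def lbtlow_pdf(x, alpha, beta, gamma):
    # Family pdf by the chain rule: g(x) * r(G(x))
    big_g = weibull_cdf(x, beta, gamma)
    return weibull_pdf(x, beta, gamma) * lbtlo_pdf(big_g, alpha)

def lbtlow_hrf(x, alpha, beta, gamma):
    # hrf = pdf / ccdf; infinite once the ccdf underflows to zero
    survival = 1.0 - lbtlow_cdf(x, alpha, beta, gamma)
    if survival <= 0.0:
        return float("inf")
    return lbtlow_pdf(x, alpha, beta, gamma) / survival

print(lbtlow_cdf(1.0, 2.5, 1.8, 1.5), lbtlow_hrf(1.0, 2.5, 1.8, 1.5))
```

Swapping `weibull_cdf` and `weibull_pdf` for another baseline pair gives the other sub-distributions in the same way.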

LBTLo-Kumaraswamy distribution
The cdf and pdf of the LBTLo-Kumaraswamy (LBTLoKw) distribution are obtained from (2.3) and (2.4) by taking G(x; ζ) and g(x; ζ) to be the cdf and pdf of the Kumaraswamy distribution. The hrf then follows as the ratio of the pdf to the ccdf. The plots of the pdf and hrf for the LBTLoIE, LBTLoU, LBTLoW, and LBTLoKw distributions are given in Figures 1 and 2, respectively. The pdfs of the investigated distributions can have a variety of forms, including right- and left-skewed, bathtub, unimodal, decreasing, and symmetric shapes, as shown in Figure 1. The corresponding hrfs can take a variety of forms, including U-shaped, J-shaped, reverse J-shaped, increasing, or decreasing, as seen in Figure 2.

Some statistical properties of LBTLo-G family
In this part, we give some statistical properties of the LBTLo-G family.

Important expansion
Series representations of the pdf and cdf of the LBTLo-G family are given here. The starting point is the generalized binomial theorem (3.1), which holds for |z| < 1. By using (3.1) in (2.4), the pdf of the LBTLo-G family can be written as the series (3.2). For example, the expansion (3.3) of the pdf of the LBTLoW distribution is derived from (3.2). In the special case where b is a positive integer, the standard binomial theorem (3.4) applies, and using the binomial expansion (3.4) in (3.3) gives (3.5).

In what follows, an expansion (3.6) for $F(x;\alpha,\zeta)^{h}$ is derived for integer h; again, the exponential and the binomial expansions are worked out. Using the binomial expansion (3.4) in (3.6) and then the binomial expansion (3.1), we obtain (3.8). By using (3.4) in (3.8), we obtain (3.9), where

$$\varpi_{d,j,m} = \sum_{j=0}^{h}\sum_{m=0}^{j} (-1)^{d+m+h-j}\,\alpha^{m}\binom{h}{j}\binom{j}{m}.$$

For example, the expansion of the cdf of the LBTLoW distribution is derived from (3.9), where $G(x;\zeta) = 1 - e^{-\beta x^{\gamma}}$; using the binomial expansion (3.4) in the last term of the resulting expression gives (3.10). These representations make it possible to express various important moment measures as series. By truncating the index of summation, a precise approximation can be obtained at a reasonable computational cost.
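The effect of truncating such series can be checked numerically. The sketch below uses one common form of the generalized binomial theorem, (1 + z)^(−b) = Σ_i C(−b, i) z^i for |z| < 1; whether this is exactly the form written in (3.1) is an assumption. The function name is hypothetical.

```python
def neg_binomial_series(z, b, terms=60):
    """Partial sum of (1 + z)^(-b) = sum_i C(-b, i) z^i, valid for |z| < 1."""
    if abs(z) >= 1.0:
        raise ValueError("the series requires |z| < 1")
    total = 0.0
    term = 1.0  # i = 0 term: C(-b, 0) z^0
    for i in range(terms):
        total += term
        # C(-b, i + 1) = C(-b, i) * (-b - i) / (i + 1)
        term *= -(b + i) / (i + 1) * z
    return total

z, b = 0.4, 2.5  # z plays the role of a baseline cdf value G(x) in (0, 1)
exact = (1.0 + z) ** (-b)
for terms in (5, 10, 30, 60):
    print(terms, abs(neg_binomial_series(z, b, terms) - exact))
```

The printed errors shrink as more terms are kept, which is the practical meaning of truncating the index of summation.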

The Probability Weighted Moments
As a special class of moments, the probability weighted moments (PWMs) were proposed in [29]. This class is used to derive estimates of the parameters and quantiles of distributions expressible in inverse form. Let X be a random variable with pdf f(x) and cdf F(x), and let r and q be non-negative integers. Then, the (r, q)th PWM of X, denoted by $\pi_{r,q}$, can be calculated through the following relation:

$$\pi_{r,q} = E\left[X^{r}F(X)^{q}\right] = \int_{-\infty}^{\infty} x^{r} f(x)\,F(x)^{q}\,dx. \tag{3.11}$$

On this basis, the (r, q)th PWM of X with the pdf and cdf of the LBTLo-G family is obtained by substituting (3.2) and (3.9) into (3.11). Provided that the interchange of the integral and sum is valid, which depends on the definitions of g(x; ζ) and G(x; ζ), the PWM can be written as a series. For example, the (r, q)th PWM of a random variable X that follows the LBTLoW distribution can be obtained by substituting (3.5) and (3.10) into (3.11) and replacing h with q; the resulting expression involves the gamma function Γ(·).
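As a numerical cross-check for series expressions of this kind, a PWM can also be evaluated by direct integration of x^r f(x) F(x)^q. The following is a minimal sketch using a unit-rate exponential distribution as a stand-in (an illustrative choice, not a model from the paper), for which the values are known in closed form. The helper names are hypothetical.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def pwm(pdf, cdf, r, q, lower, upper):
    """(r, q)th PWM: integral of x^r f(x) F(x)^q over [lower, upper]."""
    return simpson(lambda x: x ** r * pdf(x) * cdf(x) ** q, lower, upper)

# Stand-in distribution (assumption for illustration): unit-rate exponential.
exp_pdf = lambda x: math.exp(-x)
exp_cdf = lambda x: 1.0 - math.exp(-x)

# The upper limit 60 truncates a negligible tail of the exponential.
print(pwm(exp_pdf, exp_cdf, 0, 2, 0.0, 60.0))  # theory: 1 / (q + 1) = 1/3
print(pwm(exp_pdf, exp_cdf, 1, 1, 0.0, 60.0))  # theory: 1 - 1/4 = 3/4
```

Passing the pdf and cdf of any fitted model in place of the exponential pair gives the corresponding PWMs.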

Moments and incomplete moments
In this part, for any non-negative integer r, the r th moment associated with the LBTLo-G family is derived.
Let X be a random variable having the pdf of the LBTLo-G family. Then, the rth moment of X is obtained as a series in $\upsilon_{r,i+1}$, where $\upsilon_{r,i+1}$ is the (r, i + 1)th PWM of the baseline distribution. For example, after some developments, the rth moment associated with the LBTLoW distribution is given by a series expression. Tables 1-3 show the numerical values of the first four moments, as well as the numerical values of the variance ($\sigma^2$), coefficient of skewness (CS), coefficient of kurtosis (CK), and coefficient of variation (CV), associated with the LBTLoW and LBTLoIE distributions.

Table 1. Results of some moments, $\sigma^2$, CS, CK, and CV associated with the LBTLoW distribution at β = 1.8.

It can be seen from Tables 1-3 that, when the values of α and γ increase for a fixed value of β, the first four moments and $\sigma^2$ decrease, while the CS, CK, and CV measures increase. When the value of β increases for fixed values of α and γ, the first four moments and $\sigma^2$ decrease and then increase, while the CS, CK, and CV measures increase. The LBTLoW distribution is right-skewed and leptokurtic.

Furthermore, if X is a random variable having the pdf of the LBTLo-G family, then the rth incomplete moment of X is obtained from the same series expansion, with the integral truncated at an upper limit. For example, after some developments, the rth incomplete moment associated with the LBTLoW distribution is expressed in terms of Γ(·, x), the lower incomplete gamma function.
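The summary measures reported alongside the moments can be computed from the first four raw moments. The sketch below assumes the usual definitions (CS = μ3/σ^3, CK = μ4/σ^4, CV = σ/μ, with μ3 and μ4 the third and fourth central moments); the paper is assumed, not confirmed, to use these conventions. The function name is hypothetical.

```python
import math

def shape_measures(m1, m2, m3, m4):
    """Variance, skewness, kurtosis, and CV from the first four raw moments."""
    variance = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3               # third central moment
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4  # fourth central moment
    sigma = math.sqrt(variance)
    return {
        "variance": variance,
        "CS": mu3 / sigma ** 3,
        "CK": mu4 / sigma ** 4,
        "CV": sigma / m1,
    }

# Check on a unit-rate exponential, whose raw moments are r! (1, 2, 6, 24).
print(shape_measures(1.0, 2.0, 6.0, 24.0))
```

A CK above 3 corresponds to the leptokurtic behavior described in the text.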

Some information measures
Here, some uncertainty measures of the LBTLo-G family are derived. Then, these measures are specialized to the LBTLoW distribution. To begin, the Rényi entropy (RE), presented in [30], is defined for a distribution with pdf f(x). A numerical study by direct integration is possible; here, we focus on a series expansion. In what follows, an expansion for $f(x;\alpha,\zeta)^{\varepsilon}$ is derived for non-integer ε (again, the generalized binomial expansion is worked out). Integrating this expansion gives the RE associated with the LBTLo-G family and, as an example, the RE associated with the LBTLoW distribution.

The Havrda and Charvát entropy (HaCE) (see [31]), the Arimoto entropy (ArE) (see [32]), and the Tsallis entropy (TsE) (see [33]) associated with a distribution with pdf f(x) are treated in the same way. Each is defined through an integral of a power of f(x), so the same expansion gives the HaCE, ArE, and TsE of the LBTLo-G family and, as examples, those of the LBTLoW distribution. Some numerical values of the proposed entropy measures are obtained for the LBTLoW and LBTLoIE distributions in Tables 4 and 5.

We can see from these tables that, as the value of ε rises, all entropy values decrease, providing more information. For a fixed value of β, as the values of α and γ rise, all entropy measures decrease, indicating that there is less fluctuation. Additionally, all entropies show less variability as the values of α, γ, and β increase. Compared with the other measures, the TsE typically takes the smallest values.
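The numerical route mentioned above (direct integration rather than series) can be sketched as follows. The block uses the standard Rényi form, log(∫ f^ε dx)/(1 − ε), and the standard Tsallis form, (1 − ∫ f^ε dx)/(ε − 1); the normalizations used in the paper may differ, and the HaCE and ArE are left out because several variants of their definitions are in use. A unit-rate exponential pdf serves as a stand-in, and the helper names are hypothetical.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def power_integral(pdf, eps, lower, upper):
    """Integral of f(x)^eps over [lower, upper]."""
    return simpson(lambda x: pdf(x) ** eps, lower, upper)

def renyi_entropy(pdf, eps, lower, upper):
    # Standard Renyi form; eps > 0 and eps != 1
    return math.log(power_integral(pdf, eps, lower, upper)) / (1.0 - eps)

def tsallis_entropy(pdf, eps, lower, upper):
    # Standard Tsallis form; eps > 0 and eps != 1
    return (1.0 - power_integral(pdf, eps, lower, upper)) / (eps - 1.0)

# Stand-in pdf (assumption for illustration): unit-rate exponential,
# for which the integral of f^eps equals 1 / eps.
exp_pdf = lambda x: math.exp(-x)
for eps in (0.5, 2.0):
    print(eps, renyi_entropy(exp_pdf, eps, 0.0, 60.0), tsallis_entropy(exp_pdf, eps, 0.0, 60.0))
```

For this stand-in, both measures decrease as ε goes from 0.5 to 2, the same direction of change the tables report for the LBTLoW distribution.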

Maximum likelihood estimates via T2C
Let $x_{(1)} \le x_{(2)} \le \dots \le x_{(r)}$ be a T2C sample of size r resulting from a life test on n items whose lifetimes are described by the LBTLo-G family with a given set of parameters α and ζ (see [34][35][36][37]). The log-likelihood function for r failures and (n − r) censored values combines the log-pdf of each of the r observed failures with (n − r) times the log-ccdf evaluated at the largest observed failure. For simplicity, we write $x_{(i)} = x_i$.

By maximizing this log-likelihood function, the MLEs of the unknown parameters are determined. To achieve this, we first compute the score functions $U_{\alpha}$ and $U_{\zeta_k}$, that is, the first partial derivatives of the log-likelihood with respect to α and $\zeta_k$. By setting $U_{\alpha}$ and $U_{\zeta_k}$ equal to zero and solving these equations simultaneously, the MLEs of the LBTLo-G family parameters are found. These equations do not admit an analytical solution; however, they can be solved numerically by iterative techniques using statistical software.

The confidence interval (CI) of the vector of unknown parameters ξ = (α, ζ) can be obtained from the asymptotic distribution of the MLEs, $\hat{\xi}_{MLE} - \xi \to N_2\left(0, I^{-1}(\hat{\xi}_{MLE})\right)$, where I(ξ) is the Fisher information matrix. Under particular regularity conditions, the two-sided 100(1 − v)%, 0 < v < 1, asymptotic CI for the vector of unknown parameters ξ is $\hat{\xi}_{MLE} \pm z_{v/2}\sqrt{\operatorname{var}(\hat{\xi})}$, where $\operatorname{var}(\hat{\xi})$ is the corresponding element of the main diagonal of the asymptotic variance-covariance matrix $I^{-1}(\hat{\xi}_{MLE})$ and $z_{v/2}$ is the upper (v/2)th percentile of the standard normal distribution.
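The estimation procedure can be sketched on a simpler model. The block below applies the T2C log-likelihood (up to an additive constant), a numerical maximizer, and a Wald-type CI to a one-parameter exponential model, which is a stand-in chosen because its T2C MLE has a closed form to check against. It is not the LBTLoW fit from the paper, and all names, the seed, and the sample settings are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

def t2c_loglik(rate, failures, n):
    """T2C log-likelihood (constant dropped) for an exponential model."""
    r = len(failures)
    log_pdf_sum = sum(math.log(rate) - rate * x for x in failures)
    log_ccdf_last = -rate * failures[-1]  # log of 1 - F at the largest failure
    return log_pdf_sum + (n - r) * log_ccdf_last

def golden_max(func, lo, hi, tol=1e-9):
    """Golden-section search for the maximizer of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return 0.5 * (a + b)

# Simulated life test (illustrative settings): n items, first r failures observed.
random.seed(1)
n, r, true_rate = 100, 70, 2.0
lifetimes = sorted(random.expovariate(true_rate) for _ in range(n))
failures = lifetimes[:r]

objective = lambda rate: t2c_loglik(rate, failures, n)
rate_hat = golden_max(objective, 1e-6, 50.0)
rate_closed = r / (sum(failures) + (n - r) * failures[-1])  # known exponential MLE

# Observed information from a central second difference of the log-likelihood.
step = 1e-4 * rate_hat
info = -(objective(rate_hat + step) - 2.0 * objective(rate_hat)
         + objective(rate_hat - step)) / step ** 2
z = NormalDist().inv_cdf(0.975)  # upper 2.5% point for a 95% CI
half_width = z * math.sqrt(1.0 / info)
print(rate_hat, rate_closed, (rate_hat - half_width, rate_hat + half_width))
```

For a multi-parameter model such as the LBTLoW distribution, the scalar search would be replaced by a multivariate optimizer and the scalar information by the inverse of the observed information matrix.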

Numerical results
This section includes a simulation study to evaluate the performance of the MLEs of the LBTLoW model parameters (α, β, γ) for complete and T2C samples. The Mathematica 9 package is used to obtain the mean squared error (MSE), the lower bound (LB) and upper bound (UB) of the CI, the average length (AL) of the 95% CI, and the coverage probability (CP) of the 95% CI for the estimates of α, β, and γ. The algorithm includes the following steps:
3) Three levels of censoring are chosen: r = 70% and 80% (T2C), and 100% (complete sample).
4) The MLEs, biases, and MSEs for all sample sizes and for all selected sets of parameters are computed. Furthermore, the LB, UB, AL, and CP with a confidence level of 0.95 for all sample sizes and for all selected sets of parameters are calculated.
5) Numerical outcomes are reported in Table 6.
Based on complete and T2C samples, we observe the following about the performance of the estimated parameters.
A. For almost all the true values, the MSE of all the estimates decreases as the sample sizes and the censoring level r increase, demonstrating that the various estimates are consistent (see Table 6 and Figure 3).
B. For all true parameter values, the ALs of all the estimates decrease as the sample sizes and the censoring level r increase (see Table 6 and Figure 4).
C. For all true parameter values, the CP of all the estimates increases as the sample sizes and the censoring level r increase (see Table 6).
D. The MSE of the estimate of α at the true value of Set1 yields the lowest values in comparison to the other true parameter values for all sample sizes (see Table 6 and Figure 5).
E. At all true values, the MSE of the estimate of β produces the largest results for all sample sizes (see Table 6 and Figure 6). Also, except for n = 50 and 200, the MSE of the β estimates takes the smallest values for the true value of Set1 compared to the other sets at the censoring level 70%. At the censoring level 80%, the MSE of the β estimates takes the smallest values at all sets of parameters except at n = 50.
F. The MSE of the estimate of γ at the true value of Set2 takes the smallest values in comparison to the other true parameter values for all sample sizes (see Table 6 and Figure 7).
G. The MSEs, biases, and ALs of γ are smaller than those of the estimates of α and β in almost all of the cases.
H. As n rises, the CI lengths get shorter.
I. As n increases, the parameter estimates grow increasingly accurate, suggesting that they are asymptotically unbiased.
J. For the parameter values examined, the overall performance of the CIs is fairly strong.

The best-fitting model is the one associated with the lowest values of $A_{IC}$, $C_{AIC}$, $B_{IC}$, $H_{QIC}$, and $K_S$, and the highest value of $P_V$. The maximum likelihood estimators (MLEs) of the competitive models, along with their standard errors (SEs) and values of $A_{IC}$, $C_{AIC}$, $B_{IC}$, $H_{QIC}$, $P_V$, and $K_S$ for the suggested data sets, are displayed in Tables 11-18. The LBTLoW distribution, characterized by three parameters, exhibits superior goodness of fit compared to the alternative models: it has the lowest values of $A_{IC}$, $C_{AIC}$, $B_{IC}$, $H_{QIC}$, and $K_S$, and the highest value of $P_V$ among the distributions under consideration in this analysis. Furthermore, Figures 8-15 show the estimated pdf, cdf, ccdf, and probability-probability (PP) plots for the competitive models applied to the given data sets.
From the previous figures, we conclude that the LBTLoW model gives the best overall fit among the models considered and so may be selected as the most appropriate model for these data.
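The model-selection measures used in this comparison can be sketched as follows. The block assumes the usual definitions of the AIC, BIC, and HQIC, assumes that $C_{AIC}$ denotes the small-sample corrected AIC (other definitions of CAIC exist), and computes the Kolmogorov-Smirnov statistic from an arbitrary fitted cdf. The log-likelihood value and the small data vector are hypothetical placeholders, not results from the paper.

```python
import math

def information_criteria(loglik, k, n):
    """AIC, corrected AIC, BIC, and HQIC for k parameters and n observations."""
    aic = 2.0 * k - 2.0 * loglik
    return {
        "AIC": aic,
        "CAIC": aic + 2.0 * k * (k + 1) / (n - k - 1),  # assumed: corrected AIC
        "BIC": k * math.log(n) - 2.0 * loglik,
        "HQIC": 2.0 * k * math.log(math.log(n)) - 2.0 * loglik,
    }

def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        fitted = cdf(x)
        # Compare the fitted cdf with the empirical cdf just after and just before x.
        distance = max(distance, (i + 1) / n - fitted, fitted - i / n)
    return distance

# Hypothetical inputs: a three-parameter model fitted to 44 observations.
print(information_criteria(loglik=-50.0, k=3, n=44))
print(ks_statistic([0.1, 0.4, 0.7], lambda x: x))  # uniform(0, 1) cdf as a stand-in
```

Lower values of all five measures indicate a better fit, which is the ranking rule applied in Tables 11-18.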

Conclusions
The LBTLo-G family of distributions is explored in this article. The family has a number of desirable characteristics: it is very flexible and simple; it contains a number of new distributions; the pdfs of the generated distributions can be unimodal, decreasing, bathtub-shaped, right-skewed, or symmetric; and their hrfs can be increasing, decreasing, U-shaped, upside-down-shaped, or J-shaped. Several properties of the LBTLo-G family are discussed, including expansions for the density function, moments, incomplete moments, and certain entropy measures. The model parameters are estimated using the ML technique. A simulation study demonstrated that the estimates of the model parameters are not far from their true values. Also, the biases and mean squared errors of estimates based on censored samples are larger than those based on complete samples. As the censoring levels and sample sizes increase, the coverage probability of the estimates increases in most cases.
As one member of the LBTLo-G family, the LBTLoW distribution was fitted to real data sets, including global reserves of oil, gold, and natural gas. The first data set consists of lifetime data on relief times (in minutes) of patients receiving an analgesic. The second data set provides the percentage of global natural gas reserves for 44 countries. The third data set covers the 20 countries with the largest oil reserves. The fourth data set covers the 100 central banks holding the largest gold reserves. This gold reserve data, compiled from the International Monetary Fund (IMF) International Financial Statistics (IFS), tracks central banks' reported purchases and sales of gold as a percentage of their international reserves. As shown by these real-data applications, the LBTLoW model typically provides better fits than the alternative models considered.

Use of AI tools declaration
The authors declare that they have not used artificial intelligence tools in the creation of this article.

Figure 3. MSE of the estimates at the true value of Set1.

Figure 4. AL of the estimates at the true value of Set2.

Figure 5. MSE of α for all sets.

Figure 6. MSE of β for all sets.

Figure 7. MSE of γ for all sets.

Figure 8. Estimated pdf, cdf and ccdf plots of the competitive models for the first data set.

Figure 9. The PP plots of the fitted models for the first data set.

Figure 10. Estimated pdf, cdf and ccdf plots of the competitive models for global reserves natural gas data set.

Figure 11. The PP plots of the fitted models for the global reserves natural gas data set.

Figure 12. Estimated pdf, cdf and ccdf plots of the competitive models for global oil reserves data set.

Figure 13. The PP plots of the fitted models for the global oil reserves data set.

Figure 14. Estimated pdf, cdf and ccdf plots of the competitive models for global gold reserves data set.

Figure 15. The PP plots of the fitted models for global gold reserves data set.

Table 3. Results of some moments, $\sigma^2$, CS, CK, and CV associated with the LBTLoIE distribution.

Table 4. Numerical values of entropy measures of the LBTLoW distribution.

Table 5. Numerical values of entropy measures of the LBTLoIE distribution.

Table 6. Accuracy measures of the LBTLoW estimates under T2C and complete samples.

Table 7. The percentage of global natural gas reserves by country (2020).

Table 8. Top 20 countries with the largest oil reserves (in thousand million barrels).

Table 9. Top 100 central banks holding the largest gold reserves (in thousand tons).

Table 10. Some descriptive analysis of all data sets.

Table 11. MLEs and SEs for the first data set.

Table 12. Measures of fitting for the first data set.

Table 13. MLEs and SEs for the global reserves natural gas data set.

Table 14. Measures of fitting for the global reserves natural gas data set.

Table 15. MLEs and SEs for the global oil reserves data set.

Table 16. Measures of fitting for the global oil reserves data set.

Table 18. Measures of fitting for the global gold reserves data set.