Estimation methods based on ranked set sampling for the power logarithmic distribution

The sampling strategy employed in statistical parameter estimation problems has a major impact on the accuracy of the parameter estimates. Ranked set sampling (RSS) is a useful technique for gathering data when quantifying the units in a population is difficult or costly. A bounded power logarithmic distribution (PLD) has been proposed recently, and it may be used to describe many real-world bounded data sets. In the current work, the three parameters of the PLD are estimated using the RSS technique. Ten conventional estimators are investigated: maximum likelihood, minimum spacing absolute log-distance, minimum spacing square distance, Anderson-Darling, minimum spacing absolute distance, maximum product of spacings, least squares, Cramer-von-Mises, minimum spacing square log distance, and minimum spacing Linex distance. The RSS estimates are compared with their simple random sampling (SRS) counterparts. Based on our simulation results, the maximum product of spacings estimate appears to be the best option for both the SRS and RSS data sets. Estimates generated from SRS data sets are less efficient than those derived from RSS data sets. The usefulness of the RSS estimators is also investigated by means of a real data example.

where ω ≡ (a, b, c) is the set of parameters. The cumulative distribution function (CDF) of the PLD is given in Eq. (2). According to PDF (1), the following distributions are submodels of the PLD:
• For c = 0, the PDF (1) provides the power function distribution with parameter a.
• For a = 0, the PDF (1) provides the logarithmic distribution with parameters b and c.
• For a + 1 = ϑ and c = 1, the PDF (1) provides the Log-Lindley distribution with parameters ϑ and b.
The PDF and hazard rate function (HF) plots of the PLD are represented in Fig. 1. From Fig. 1, we can see that the PDF takes various shapes: increasing, decreasing, constant, skewed to the right or left, and upside-down bathtub-shaped. The HF can be increasing, U-shaped, bathtub-shaped, or J-shaped.
The study of economical sampling techniques is one of the major and fascinating areas of statistics. The field's motivation stems from its ability to streamline data collection, particularly when gathering relevant data is costly or time-consuming. To obtain accurate and cost-effective findings, researchers have developed a variety of sampling techniques over the past few decades. Ranked set sampling (RSS) is a useful technique for attaining observational economy in terms of the precision attained per sample unit. McIntyre 13 first presented the idea of RSS as a method for improving the accuracy of the sample mean as an estimate of the population mean. Ranking can be done without actually quantifying the observations, using expert opinion, visual examination, or any other method. Takahasi and Wakimoto 14 provided the mathematical framework for RSS. Dell and Clutter 15 demonstrated that, even in the presence of ranking errors, RSS outperforms simple random sampling (SRS). RSS is extensively used in environmental monitoring 16, entomology 17, engineering applications 18, forestry 19, and information theory 20.
The RSS design is as follows. Initially, $s^2$ randomly selected units are taken from the population and divided into s sets of s units each. Without taking any measurements, the s units in each set are ranked. The unit ranked lowest in the first set is selected for actual quantification. The unit ranked second lowest in the second set is measured. The procedure continues until the unit ranked largest in the s-th set is measured. Hence, $\{X_{(h)h},\ h = 1, \dots, s\} = \{X_{(1)1}, X_{(2)2}, \dots, X_{(s)s}\}$ represents the one-cycle RSS. If a larger sample is needed, the process can be repeated l times to produce a sample of size $s^{\bullet} = sl$. The l-cycle RSS is represented as $\{X_{(h)hv}\} = \{X_{(1)11}, X_{(2)22}, \dots, X_{(s)sl}\}$, $h = 1, \dots, s$ and $v = 1, \dots, l$. In the present work, we write $X_{hv}$ instead of $X_{(h)hv}$. Wolfe 21 mentioned that set sizes (s) larger
The issue of RSS-based estimation for a variety of parametric models has been the subject of several recent studies. Stokes 22 examined parameter estimators for location-scale family distributions. Bhoj 23 investigated the scale and location parameter estimates for the extreme value distribution. Abu-Dayyeh et al. 24 used SRS, RSS, and a modification of RSS to investigate various estimators for the location and scale parameters of the logistic distribution. Under RSS, median RSS (MRSS), and multistage MRSS with imperfect ranking, Lesitha and Yageen 25 investigated the scale parameter of a log-logistic distribution. Inference for the log-logistic distribution parameters based on moving extremes RSS was discussed by He et al. 26. Using RSS and SRS, Yousef and Al-Subh 27 obtained the maximum likelihood estimators (MLEs), moment estimators, and regression estimators of the Gumbel distribution parameters. Under SRS, RSS, MRSS, and extreme RSS (ERSS), Qian et al. 28 derived a number of estimators for the Pareto distribution parameters, both when one parameter is known and when both are unknown. The MLEs for the generalized Rayleigh distribution parameters were derived by Esemen and Gurler 29 using SRS, RSS, MRSS, and ERSS. In the framework of SRS, RSS, MRSS, and ERSS, Samuh et al. 30 presented the MLEs of the parameters of the new Weibull-Pareto distribution. Yang et al. 31 explored the Fisher information matrix of the log-extended exponential-geometric distribution parameters based on SRS, RSS, MRSS, and ERSS. Al-Omari et al. 32 investigated the generalized quasi-Lindley distribution parameters based on RSS using the following estimators: MLEs, maximum product of spacings (MXPS) estimators, weighted least squares estimators, least squares estimators (LSEs), Cramer-von-Mises (CRM) estimators, and Anderson-Darling (AD) estimators. Further, Al-Omari et al. 33 applied the procedures of Al-Omari et al. 32 to examine estimators of the x-gamma distribution. Under stratified RSS, Bhushan and Kumar 34 examined the effectiveness of combined and separate log type class population mean estimators. The mean square error and bias expressions of the suggested estimators were determined. The efficiency criteria were provided, and a theoretical comparison between the proposed and existing estimators was conducted. For more recent studies, see 35-42.
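The RSS design described earlier in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: ranking is taken to be perfect (units are ordered by their true values), the function names `ranked_set_sample` and `uniform_draw` are hypothetical, and a uniform population on (0, 1) stands in for the PLD.

```python
import numpy as np


def ranked_set_sample(draw, s, l, rng):
    """Return an (l, s) array whose row v holds one RSS cycle.

    `draw(size, rng)` is a hypothetical sampler returning `size` i.i.d. units.
    Perfect ranking is assumed: units are ranked by their true values.
    """
    sample = np.empty((l, s))
    for v in range(l):          # cycles
        for h in range(s):      # a fresh set of s units for each rank h
            ranked_set = np.sort(draw(s, rng))
            sample[v, h] = ranked_set[h]  # quantify only the (h+1)-th smallest unit
    return sample


def uniform_draw(size, rng):
    """Stand-in population on (0, 1); replace with a PLD sampler in practice."""
    return rng.uniform(size=size)


rng = np.random.default_rng(2024)
rss = ranked_set_sample(uniform_draw, s=5, l=6, rng=rng)  # 30 measured units
```

Each cycle draws $s^2$ units but measures only s of them, which is where the observational economy of RSS comes from.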
The statistical literature proposes many estimation techniques because parameter estimation is important in real-world applications. Parameter estimation frequently relies on conventional techniques such as the LSE and MLE approaches. Both have advantages and disadvantages, but ML is the most widely used. In addition to the MLE and LSE, the parameters of the PLD may be estimated using eight other methods: AD, minimum spacing absolute distance (SPAD), MXPS, minimum spacing absolute log distance (SPALoD), minimum spacing square distance (MSSD), CRM, minimum spacing square log distance (MSSLD), and minimum spacing Linex distance (MSLND). Because the theoretical performance of these techniques is difficult to compare, extensive simulation studies are carried out under various sample sizes and parameter values. Using a simulation scheme, the PLD estimators based on the RSS design are then contrasted with those based on SRS. Six evaluation criteria are employed to assess the effectiveness of the estimation techniques. As far as the authors are aware, no attempt has been made to compare all of these estimators under RSS for the PLD, which is the novelty and motivation of this study.
The article is organized as follows. The various estimation methods for the PLD under RSS are provided in "Estimation methods based on RSS". Several PLD estimators under SRS are given in "Estimation methods based on SRS". The Monte Carlo simulation analysis that compares the effectiveness of the RSS-based estimators is presented in "Numerical simulation". In "Real data analysis", data on milk production are analyzed to demonstrate the practical applicability of the recommended estimation techniques. Some closing thoughts are included in "Concluding remarks".

Maximum likelihood method
In the following, the MLEs $\hat{a}_{ML}$ of a, $\hat{b}_{ML}$ of b, and $\hat{c}_{ML}$ of c for the PLD are obtained based on RSS. To obtain these estimators, let $\{X_{hv},\ h = 1, \dots, s,\ v = 1, \dots, l\}$ be an RSS of size $s^{\bullet} = sl$ with PDF (1) and CDF (2), where l is the number of cycles and s is the set size. The likelihood function (LF) of the PLD is obtained by inserting Eqs. (1) and (2) into Eq. (3), where $\Psi(x_{hv}, \omega) = (a + 1)(b - c \ln(x_{hv}))$. The log-LF of the PLD is denoted by $L_{RSS}$.
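As a hedged illustration of the RSS log-likelihood, the sketch below treats the unit measured at rank h as the h-th order statistic of a set of size s, which is the standard form under perfect ranking. The density used here, $f(x) = (a+1)^2 x^a (b - c \ln x) / (b(a+1) + c)$ on (0, 1), is an assumption inferred from the submodels listed in the introduction and from the term $(a + 1)(b - c \ln x)$; it is not quoted from Eq. (1). The names `pld_pdf`, `pld_cdf`, and `rss_loglik` are hypothetical.

```python
import numpy as np
from scipy.special import gammaln


def pld_pdf(x, a, b, c):
    """Assumed PLD density on (0, 1); inferred from the submodels, not quoted."""
    return (a + 1) ** 2 * x ** a * (b - c * np.log(x)) / (b * (a + 1) + c)


def pld_cdf(x, a, b, c):
    """CDF obtained by integrating the assumed density above."""
    psi = (a + 1) * (b - c * np.log(x))
    return x ** (a + 1) * (psi + c) / (b * (a + 1) + c)


def rss_loglik(params, x):
    """Log-likelihood of an (l, s) RSS array x under perfect ranking.

    Column h (1-based) is treated as the h-th order statistic of a set of size s.
    """
    a, b, c = params
    x = np.asarray(x, dtype=float)
    s = x.shape[1]
    h = np.arange(1, s + 1)  # ranks, broadcast over the cycles
    log_const = gammaln(s + 1) - gammaln(h) - gammaln(s - h + 1)
    F = pld_cdf(x, a, b, c)
    terms = (log_const + (h - 1) * np.log(F)
             + (s - h) * np.log1p(-F) + np.log(pld_pdf(x, a, b, c)))
    return float(np.sum(terms))


x_demo = np.array([[0.25, 0.50]])            # one cycle, set size s = 2
value = rss_loglik((0.0, 1.0, 0.0), x_demo)  # a = 0, c = 0: uniform special case
```

Maximizing `rss_loglik` over (a, b, c) with a numerical optimizer would play the role of solving the likelihood equations; with s = 1 the function reduces to the ordinary SRS log-likelihood.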

Anderson-Darling method
The AD method belongs to the class of minimum distance methods. In this subsection, the ADEs $\hat{a}_{AD}$, $\hat{b}_{AD}$, and $\hat{c}_{AD}$ of a, b, and c for the PLD are obtained using RSS. Suppose that $X_{(1:s^{\bullet})}, X_{(2:s^{\bullet})}, \dots, X_{(s^{\bullet}:s^{\bullet})}$ are ordered RSS items taken from the PLD with sample size $s^{\bullet} = sl$, where s is the set size and l is the number of cycles. The ADEs are derived by minimizing the AD function in Eq. (8), where $\bar{F}(\cdot \mid \omega)$ is the survival function. Alternatively, the ADEs can be obtained by solving the corresponding non-linear equations in place of Eq. (8), in which $\zeta_k(x_{(s^{\bullet}-i+1:s^{\bullet})} \mid \omega)$, $k = 1, 2, 3$, have the same expressions as (9), (10), and (11), with the ordered sample $x_{(i:s^{\bullet})}$ replaced by $x_{(s^{\bullet}-i+1:s^{\bullet})}$.
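The AD criterion can be sketched as follows, using the standard form $-n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\,[\ln F(x_{(i)}) + \ln \bar{F}(x_{(n-i+1)})]$ with $n = s^{\bullet}$. This is an illustration under assumptions: the standard AD form is assumed to match Eq. (8), the PLD CDF is inferred from the submodels rather than quoted from Eq. (2), and the function names are hypothetical.

```python
import numpy as np


def pld_cdf(x, a, b, c):
    """Assumed PLD CDF on (0, 1); inferred from the submodels, not quoted."""
    psi = (a + 1) * (b - c * np.log(x))
    return x ** (a + 1) * (psi + c) / (b * (a + 1) + c)


def anderson_darling_objective(params, sample):
    """Standard Anderson-Darling distance between the sample and the model."""
    a, b, c = params
    x = np.sort(np.asarray(sample, dtype=float))  # pooled, ordered sample
    n = x.size
    i = np.arange(1, n + 1)
    F = pld_cdf(x, a, b, c)
    survival_reversed = 1.0 - F[::-1]  # survival function at x_(n-i+1)
    total = np.sum((2 * i - 1) * (np.log(F) + np.log(survival_reversed)))
    return float(-n - total / n)


# a = 0, c = 0 is the uniform special case of the assumed model
ad_value = anderson_darling_objective((0.0, 1.0, 0.0), [0.75, 0.25])
```

The ADEs would be the parameter values that minimize this function over the pooled RSS (or SRS) observations.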

Cramer-von-Mises method
The CRM method is a member of the minimum distance method class. This subsection provides the CRMEs $\hat{a}_{CR}$, $\hat{b}_{CR}$, and $\hat{c}_{CR}$ of the parameters a, b, and c of the PLD using RSS. Let $X_{(1:s^{\bullet})}, X_{(2:s^{\bullet})}, \dots, X_{(s^{\bullet}:s^{\bullet})}$ be ordered RSS items taken from the PLD with sample size $s^{\bullet} = sl$, where s is the set size and l is the number of cycles. The CRMEs are derived by minimizing the CRM function in Eq. (12). Rather than using Eq. (12), these estimators can be obtained by solving the corresponding non-linear equations, where $\zeta_1(x_{(i:s^{\bullet})} \mid \omega)$, $\zeta_2(x_{(i:s^{\bullet})} \mid \omega)$, and $\zeta_3(x_{(i:s^{\bullet})} \mid \omega)$ are given in Eqs. (9), (10), and (11).
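For comparison with the AD criterion, a minimal sketch of the Cramer-von-Mises distance in its standard form, $\frac{1}{12n} + \sum_{i=1}^{n}\left(F(x_{(i)}) - \frac{2i-1}{2n}\right)^2$, is given below. It is assumed, not confirmed, that this matches Eq. (12); the PLD CDF is again inferred from the submodels, and the function names are hypothetical.

```python
import numpy as np


def pld_cdf(x, a, b, c):
    """Assumed PLD CDF on (0, 1); inferred from the submodels, not quoted."""
    psi = (a + 1) * (b - c * np.log(x))
    return x ** (a + 1) * (psi + c) / (b * (a + 1) + c)


def cramer_von_mises_objective(params, sample):
    """Standard Cramer-von-Mises distance between the sample and the model."""
    a, b, c = params
    x = np.sort(np.asarray(sample, dtype=float))  # pooled, ordered sample
    n = x.size
    i = np.arange(1, n + 1)
    F = pld_cdf(x, a, b, c)
    return float(1.0 / (12 * n) + np.sum((F - (2 * i - 1) / (2 * n)) ** 2))


# With a = 0 and c = 0 the assumed model is uniform, so the points 0.25 and 0.75
# sit exactly on the plotting positions and only the constant 1/(12n) remains.
cvm_value = cramer_von_mises_objective((0.0, 1.0, 0.0), [0.75, 0.25])
```

Because the criterion involves no logarithms, it stays finite even when a trial parameter value pushes the fitted CDF to 0 or 1 at some observation.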

Maximum product of spacings method
This subsection provides the MXPSEs $\hat{a}_{MP}$, $\hat{b}_{MP}$, and $\hat{c}_{MP}$ of the parameters a, b, and c of the PLD using RSS. Let $X_{(1:s^{\bullet})}, X_{(2:s^{\bullet})}, \dots, X_{(s^{\bullet}:s^{\bullet})}$ be ordered RSS items from the PLD with sample size $s^{\bullet} = sl$, where s is the set size and l is the number of cycles. The uniform spacings are defined as the differences between the CDF values at consecutive ordered observations, with a boundary term at $x_{(0:s^{\bullet})}$. The MXPSEs $\hat{a}_{MP}$, $\hat{b}_{MP}$, and $\hat{c}_{MP}$ of a, b, and c are found by maximizing the geometric mean of the spacings. They can also be obtained by numerically solving the corresponding non-linear equations, in which $\zeta_1(\cdot \mid \omega)$, $\zeta_2(\cdot \mid \omega)$, and $\zeta_3(\cdot \mid \omega)$ are given in Eqs. (9), (10), and (11).
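The spacing construction can be illustrated with a short sketch. It maximizes the mean log-spacing, which is equivalent to maximizing the geometric mean of the spacings. The boundary convention $F(x_{(0)}) = 0$ and $F(x_{(n+1)}) = 1$ is the usual one and is assumed here; the PLD CDF is inferred from the submodels rather than quoted, and the function names are hypothetical.

```python
import numpy as np


def pld_cdf(x, a, b, c):
    """Assumed PLD CDF on (0, 1); inferred from the submodels, not quoted."""
    psi = (a + 1) * (b - c * np.log(x))
    return x ** (a + 1) * (psi + c) / (b * (a + 1) + c)


def mps_objective(params, sample):
    """Mean log-spacing; larger values indicate a better fit."""
    a, b, c = params
    x = np.sort(np.asarray(sample, dtype=float))  # pooled, ordered sample
    F = pld_cdf(x, a, b, c)
    # Boundary convention (assumed): F(x_(0)) = 0 and F(x_(n+1)) = 1.
    spacings = np.diff(np.concatenate(([0.0], F, [1.0])))
    return float(np.mean(np.log(spacings)))


# Uniform special case (a = 0, c = 0): equally spaced points give equal spacings.
mps_value = mps_objective((0.0, 1.0, 0.0), [0.5, 0.25, 0.75])
```

Tied observations produce zero spacings and an infinite log, so a practical implementation would need a rule for ties; none is specified here.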

Least squares method
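Here, the LSEs $\hat{a}_{LS}$, $\hat{b}_{LS}$, and $\hat{c}_{LS}$ of the parameters a, b, and c of the PLD are obtained using RSS. Suppose that $X_{(1:s^{\bullet})}, X_{(2:s^{\bullet})}, \dots, X_{(s^{\bullet}:s^{\bullet})}$ are ordered RSS items from the PLD with sample size $s^{\bullet} = sl$. The LSEs are obtained by minimizing the least squares function with respect to a, b, and c, or by solving the corresponding non-linear equations, in which $\zeta_1(\cdot \mid \omega)$, $\zeta_2(\cdot \mid \omega)$, and $\zeta_3(\cdot \mid \omega)$ are given in Eqs. (9), (10), and (11).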

Minimum spacing absolute distance
In the following, we obtain the SPADEs $\hat{a}_{SPA}$, $\hat{b}_{SPA}$, and $\hat{c}_{SPA}$ of the parameters a, b, and c of the PLD using RSS.

Minimum spacing square distance
In this subsection, we are concerned with the MSSDEs $\hat{a}_{SD}$, $\hat{b}_{SD}$, and $\hat{c}_{SD}$ of the parameters a, b, and c of the PLD based on RSS.
Let $X_{(1:s^{\bullet})}, X_{(2:s^{\bullet})}, \dots, X_{(s^{\bullet}:s^{\bullet})}$ be ordered RSS items from the PLD with sample size $s^{\bullet} = sl$, where s is the set size and l is the number of cycles. The MSSDEs $\hat{a}_{SD}$, $\hat{b}_{SD}$, and $\hat{c}_{SD}$ are obtained by minimizing the function in Eq. (16) with respect to a, b, and c. Instead of using Eq. (16), the corresponding nonlinear equations can be solved numerically to obtain $\hat{a}_{SD}$, $\hat{b}_{SD}$, and $\hat{c}_{SD}$, where $\zeta_1(\cdot \mid \omega)$, $\zeta_2(\cdot \mid \omega)$, and $\zeta_3(\cdot \mid \omega)$ are given in Eqs. (9), (10), and (11).

Minimum spacing square-log distance
Here, the MSSLDEs $\hat{a}_{SLo}$, $\hat{b}_{SLo}$, and $\hat{c}_{SLo}$ of the parameters a, b, and c of the PLD are determined based on RSS.
Let $X_{(1:s^{\bullet})}, X_{(2:s^{\bullet})}, \dots, X_{(s^{\bullet}:s^{\bullet})}$ be ordered RSS items from the PLD with sample size $s^{\bullet} = sl$, where s is the set size and l is the number of cycles. The MSSLDEs $\hat{a}_{SLo}$, $\hat{b}_{SLo}$, and $\hat{c}_{SLo}$ are obtained by minimizing the function in Eq. (17) with respect to a, b, and c. Instead of using Eq. (17), the corresponding nonlinear equations can be solved numerically to obtain $\hat{a}_{SLo}$, $\hat{b}_{SLo}$, and $\hat{c}_{SLo}$.

Maximum likelihood estimators
Here, the MLEs $\tilde{a}_{ML}$ of a, $\tilde{b}_{ML}$ of b, and $\tilde{c}_{ML}$ of c for the PLD are obtained based on SRS. To obtain these estimators, suppose that $x_1, x_2, \dots, x_{s^{\bullet}}$ is an observed SRS of size $s^{\bullet}$ from the PLD with PDF (1). The log-LF of a, b, and c is denoted by $\ell_{SRS}$. Differentiating $\ell_{SRS}$ with respect to a, b, and c yields Eqs. (19)-(21). After setting them equal to zero, the nonlinear equations (19)-(21) may be solved numerically, for example with the software Mathematica, to obtain the MLEs $\tilde{a}_{ML}$, $\tilde{b}_{ML}$, and $\tilde{c}_{ML}$ of a, b, and c, respectively.

Anderson-Darling estimators
In this subsection, the ADEs $\tilde{a}_{AD}$, $\tilde{b}_{AD}$, and $\tilde{c}_{AD}$ of the parameters a, b, and c of the PLD are obtained using SRS. Let $X_{(1)}, X_{(2)}, \dots, X_{(s^{\bullet})}$ be ordered items from an SRS following the PLD with sample size $s^{\bullet}$. The ADEs $\tilde{a}_{AD}$, $\tilde{b}_{AD}$, and $\tilde{c}_{AD}$ of a, b, and c are derived by minimizing the function in Eq. (22). Alternatively, the ADEs of the PLD can be obtained by solving the corresponding non-linear equations in place of Eq. (22), where $\zeta_1(\cdot \mid \omega)$, $\zeta_2(\cdot \mid \omega)$, and $\zeta_3(\cdot \mid \omega)$ are given in Eqs. (9), (10), and (11) with ordered samples $x_{(j)}$ and $x_{(s^{\bullet}-j+1)}$.

Cramer-von-Mises estimators
Here, the CRMEs $\tilde{a}_{CR}$, $\tilde{b}_{CR}$, and $\tilde{c}_{CR}$ of the parameters a, b, and c of the PLD are obtained using the SRS method.

Maximum product of spacings estimators
This subsection presents the MXPSEs $\tilde{a}_{MP}$, $\tilde{b}_{MP}$, and $\tilde{c}_{MP}$ of the parameters a, b, and c of the PLD using SRS.
Let $X_{(1)}, X_{(2)}, \dots, X_{(s^{\bullet})}$ be ordered SRS items taken from the PLD with sample size $s^{\bullet}$. The MXPSEs $\tilde{a}_{MP}$, $\tilde{b}_{MP}$, and $\tilde{c}_{MP}$ of a, b, and c are found by maximizing the geometric mean of the spacings. In the associated estimating equations, $\zeta_1(\cdot \mid \omega)$, $\zeta_2(\cdot \mid \omega)$, and $\zeta_3(\cdot \mid \omega)$ are given in Eqs. (9), (10), and (11).

Least squares estimators
Here, we use the SRS method to produce the LSEs $\tilde{a}_{LS}$, $\tilde{b}_{LS}$, and $\tilde{c}_{LS}$ of the parameters a, b, and c of the PLD. Let $X_{(1)}, X_{(2)}, \dots, X_{(s^{\bullet})}$ be ordered SRS items taken from the PLD with sample size $s^{\bullet}$. The LSEs $\tilde{a}_{LS}$, $\tilde{b}_{LS}$, and $\tilde{c}_{LS}$ are obtained by minimizing the least squares function with respect to the unknown parameters a, b, and c. Alternatively, the LSEs are obtained by solving the corresponding non-linear equations, where $\zeta_1(\cdot \mid \omega)$, $\zeta_2(\cdot \mid \omega)$, and $\zeta_3(\cdot \mid \omega)$ are given in Eqs. (9), (10), and (11).

Minimum spacing absolute distance estimators
This subsection provides the SPADEs $\tilde{a}_{SPA}$, $\tilde{b}_{SPA}$, and $\tilde{c}_{SPA}$ of the unknown parameters a, b, and c of the PLD based on the SRS method.
Let $X_{(1)}, X_{(2)}, \dots, X_{(s^{\bullet})}$ be ordered SRS items taken from the PLD with sample size $s^{\bullet}$. The SPADEs $\tilde{a}_{SPA}$, $\tilde{b}_{SPA}$, and $\tilde{c}_{SPA}$ are obtained by minimizing the SPAD function with respect to a, b, and c. Alternatively, the corresponding nonlinear equations can be solved numerically to yield the SPADEs, where $\zeta_1(\cdot \mid \omega)$, $\zeta_2(\cdot \mid \omega)$, and $\zeta_3(\cdot \mid \omega)$ are given in Eqs. (9), (10), and (11).

Minimum spacing absolute-log distance estimators
Here, we determine the SPALoDEs $\tilde{a}_{LD}$, $\tilde{b}_{LD}$, and $\tilde{c}_{LD}$ of the unknown parameters a, b, and c of the PLD based on the SRS technique. Let $X_{(1)}, X_{(2)}, \dots, X_{(s^{\bullet})}$ be ordered SRS items taken from the PLD with sample size $s^{\bullet}$. The SPALoDEs $\tilde{a}_{LD}$, $\tilde{b}_{LD}$, and $\tilde{c}_{LD}$ are obtained by minimizing the function in Eq. (24) with respect to a, b, and c. Instead of using Eq. (24), the corresponding nonlinear equations can be solved numerically to obtain $\tilde{a}_{LD}$, $\tilde{b}_{LD}$, and $\tilde{c}_{LD}$, where $\zeta_1(\cdot \mid \omega)$, $\zeta_2(\cdot \mid \omega)$, and $\zeta_3(\cdot \mid \omega)$ are given in Eqs. (9), (10), and (11).

Minimum spacing square distance estimators
Here, we determine the MSSDEs $\tilde{a}_{SD}$, $\tilde{b}_{SD}$, and $\tilde{c}_{SD}$ of the unknown parameters a, b, and c of the PLD based on SRS.
Let $X_{(1)}, X_{(2)}, \dots, X_{(s^{\bullet})}$ be ordered SRS items taken from the PLD with sample size $s^{\bullet}$. The MSSDEs $\tilde{a}_{SD}$, $\tilde{b}_{SD}$, and $\tilde{c}_{SD}$ are obtained by minimizing the function in Eq. (25) with respect to a, b, and c. Instead of using Eq. (25), the corresponding nonlinear equations can be solved numerically to obtain $\tilde{a}_{SD}$, $\tilde{b}_{SD}$, and $\tilde{c}_{SD}$, where $\zeta_1(\cdot \mid \omega)$, $\zeta_2(\cdot \mid \omega)$, and $\zeta_3(\cdot \mid \omega)$ are given in Eqs. (9), (10), and (11).

Minimum spacing square-log distance estimators
Here, the MSSLDEs $\tilde{a}_{SLo}$, $\tilde{b}_{SLo}$, and $\tilde{c}_{SLo}$ of the parameters a, b, and c of the PLD are determined based on SRS.

Minimum spacing Linex distance estimators
This subsection provides the MSLNDEs $\tilde{a}_{SLx}$, $\tilde{b}_{SLx}$, and $\tilde{c}_{SLx}$ of the parameters a, b, and c of the PLD based on SRS. Let $X_{(1)}, X_{(2)}, \dots, X_{(s^{\bullet})}$ be ordered SRS items taken from the PLD with sample size $s^{\bullet}$. The MSLNDEs $\tilde{a}_{SLx}$, $\tilde{b}_{SLx}$, and $\tilde{c}_{SLx}$ are obtained by minimizing the function in Eq. (27) with respect to a, b, and c. Instead of using Eq. (27), the corresponding nonlinear equations can be solved numerically to obtain $\tilde{a}_{SLx}$, $\tilde{b}_{SLx}$, and $\tilde{c}_{SLx}$, where $\zeta_1(\cdot \mid \omega)$, $\zeta_2(\cdot \mid \omega)$, and $\zeta_3(\cdot \mid \omega)$ are given in Eqs. (9), (10), and (11).
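The five minimum spacing distance criteria share one structure: each compares the spacings of the fitted CDF with their common target $1/(n+1)$ through a distance function and sums the result. The sketch below uses the distance functions commonly associated with these names (absolute, absolute-log, square, square-log, and Linex); whether they coincide term by term with the equations of this paper is an assumption, and the names `DISTANCES` and `spacing_distance_objective` are hypothetical.

```python
import numpy as np

# Distance between a spacing d and its target u = 1 / (n + 1); assumed forms.
DISTANCES = {
    "absolute": lambda d, u: np.abs(d - u),
    "absolute_log": lambda d, u: np.abs(np.log(d) - np.log(u)),
    "square": lambda d, u: (d - u) ** 2,
    "square_log": lambda d, u: (np.log(d) - np.log(u)) ** 2,
    "linex": lambda d, u: np.exp(d - u) - (d - u) - 1.0,
}


def spacing_distance_objective(cdf_values, kind):
    """Sum of distances between the spacings of fitted CDF values and 1/(n+1)."""
    F = np.sort(np.asarray(cdf_values, dtype=float))
    # Boundary convention (assumed): F(x_(0)) = 0 and F(x_(n+1)) = 1.
    spacings = np.diff(np.concatenate(([0.0], F, [1.0])))
    target = 1.0 / (F.size + 1)
    return float(np.sum(DISTANCES[kind](spacings, target)))


perfect = {k: spacing_distance_objective([0.25, 0.5, 0.75], k) for k in DISTANCES}
uneven = {k: spacing_distance_objective([0.1, 0.5, 0.9], k) for k in DISTANCES}
```

The function takes fitted CDF values rather than raw data, so the same code applies to SRS and RSS samples once the model CDF has been evaluated at the ordered observations.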

Numerical simulation
The estimation techniques described in this study are examined in this section. Random data sets are generated from the proposed model, ranked, and used to evaluate how well each method recovers the model parameters. The simulation assumes perfect ranking and proceeds as follows:
• We generate an RSS from the suggested model with a fixed set size s = 5 and varying cycle numbers l = 6, 15, 30, 50, 80, which gives the sample sizes $s^{\bullet} = sl$ = 30, 75, 150, 250, 400.
• We generate SRS from the suggested model using the specified sample sizes $s^{\bullet}$ = 15, 50, 120, 200, 300, 450.
• Using the actual parameter values (a, b, c), we derive a set of estimates for each sample size.
• To assess the efficacy of the estimation methods, six metrics are utilized, comprising:
  • The average of absolute bias (BIAS).
  • The mean squared error (MSE).
  • The mean absolute relative error (MRE).
  • A criterion based on the fitted CDF, where F(x; ω) = F(x) and $x_{ij}$ denotes the j-th component of the sample obtained at the i-th iteration.
  • The maximum absolute difference, represented by $D_{max}$.
  • The average squared absolute error (ASAE), where $x_{(i)}$ denotes the ascending ordered observations and ω = (a, b, c).
• These metrics serve as impartial standards for appraising the precision and dependability of the estimated parameters, and together they allow a thorough evaluation of the estimation methods for the model in question.
• The procedure is repeated many times to provide a solid and trustworthy assessment of the estimation methods.
• The assessment metrics for RSS and SRS are shown in Suppl Tables 1-10 (see Suppl Appendix). The numbers in these tables represent the relative effectiveness of each estimation technique. Smaller values indicate better performance.
• The MSE ratio of SRS to RSS is shown in Suppl Table 11, which facilitates the comparison of the two sampling techniques in terms of efficiency.
• Suppl Tables 12 and 13 (see Suppl Appendix) give comprehensive rankings, including partial and total ranks, for SRS and RSS, respectively. These ranking tables summarize the relative performance of each estimation technique.
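The first three criteria and the SRS-to-RSS efficiency ratio can be sketched as follows for a single parameter. The definitions used are the usual ones implied by the names (mean absolute error for BIAS, mean squared error, and mean absolute error relative to the true value); since the displayed formulas are not reproduced here, treating them as the paper's exact definitions is an assumption, and the function names are hypothetical.

```python
import numpy as np


def accuracy_metrics(estimates, true_value):
    """BIAS, MSE, and MRE of replicated estimates of one parameter."""
    err = np.asarray(estimates, dtype=float) - true_value
    return {
        "BIAS": float(np.mean(np.abs(err))),
        "MSE": float(np.mean(err ** 2)),
        "MRE": float(np.mean(np.abs(err)) / abs(true_value)),
    }


def mse_ratio(mse_srs, mse_rss):
    """MSE of SRS divided by MSE of RSS; values above 1 favor RSS."""
    return mse_srs / mse_rss


# Three hypothetical replicated estimates of a parameter whose true value is 2.
metrics = accuracy_metrics([1.5, 2.5, 3.0], true_value=2.0)
ratio = mse_ratio(0.5, 0.25)
```

In a full study, `accuracy_metrics` would be applied to the estimates of a, b, and c from every replication, separately for each estimation method, sample size, and sampling design.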
After examining the simulation outcomes and the rankings in the tables, several conclusions emerge:
• Our model estimates are consistent for both SRS and RSS data sets: the estimates progressively approach the true parameter values as the sample size grows.
• Every metric shows a similar trend, declining with increasing sample size. Larger samples therefore produce more accurate and precise parameter estimates.
• Our simulation findings for the SRS and RSS data sets suggest that MXPSE is the best technique in terms of the precision of the estimates.
• Estimates from RSS data sets are more efficient than estimates from SRS data sets, as seen in Suppl Table 11. This result suggests that RSS is the more effective sampling technique, producing estimates with a lower MSE.

Real data analysis
This section illustrates the usefulness of the suggested estimation techniques on a real data set, showing how they may be applied in practical research and decision-making. The data set under consideration contains the total milk production during the first birth of 107 cows from the SINDI. This data set was investigated by Abd El-Bar et al. 12, and its values are as follows: 0.
Results of the descriptive analysis can be found in Table 1. A variety of graphical representations is shown in Fig. 2, including histograms, quantile-quantile (Q-Q) plots, violin plots, box plots, total time on test (TTT) plots, and kernel density plots. The probability-probability (P-P) plot, the estimated CDF, the estimated survival function, and a histogram with the estimated PDF are given in Fig. 3. The SRS and RSS estimates obtained from the PLD are shown in Tables 2 and 3, respectively. Several goodness-of-fit statistics, namely those of the Anderson-Darling test (AT), the Cramer-von-Mises test (WT), and the Kolmogorov-Smirnov test (KST), are used to assess the models; see Table 4. These values (the smaller, the better) demonstrate that RSS is better than SRS for various estimation techniques. Moreover, Figs. 4 and 5 show how well the models fit the data.

Concluding remarks
The accuracy of parameter estimators is considerably influenced by the sampling technique used in statistical parameter estimation problems. In the current work, the parameter estimates of the PLD are examined using both SRS and RSS approaches. The various estimates obtained by RSS were contrasted with those obtained through SRS. Six metrics were used to evaluate the effectiveness of the estimation methods. Based on our simulation findings for the SRS and RSS data sets, the MXPS method seems to be the best choice in terms of accuracy of the estimates. For both the SRS and RSS data sets, our model estimates show consistency: as the sample size increases, the estimates gradually become closer to the actual parameter values. Compared to estimates obtained from RSS data sets, those obtained from SRS data sets are less efficient.

Figure 2. Some plots for the real data set.

Figure 4. Plots of the estimated PDFs of the PLD with histogram for the two sampling methods when $s^{\bullet} = 60$.
Minimum spacing absolute-log distance
This subsection provides the SPALoDEs $\hat{a}_{LD}$, $\hat{b}_{LD}$, and $\hat{c}_{LD}$ of the unknown parameters a, b, and c of the PLD based on the RSS method. Let $X_{(1:s^{\bullet})}, X_{(2:s^{\bullet})}, \dots, X_{(s^{\bullet}:s^{\bullet})}$ be ordered RSS items from the PLD with sample size $s^{\bullet} = sl$, where s is the set size and l is the number of cycles. The SPALoDEs $\hat{a}_{LD}$, $\hat{b}_{LD}$, and $\hat{c}_{LD}$ are obtained by minimizing the SPALoD function with respect to a, b, and c.
Scientific Reports | (2024) 14:17652 | https://doi.org/10.1038/s41598-024-67693-4

Table 4. Parameter estimates and goodness-of-fit measures for the SRS and RSS designs with $s^{\bullet} = 60$.