Characterization and Goodness-of-Fit Test of Pareto and Some Related Distributions Based on Near-Order Statistics

In this paper, a new definition of the number of observations near the kth order statistic is developed. Then some characterization results for the Pareto and some related distributions are established in terms of the probability mass function and the first moment of these new counting random variables, using the completeness property of the sequence of functions $\{x^n,\ 0 < x < 1,\ n \ge 1\}$. Finally, new goodness-of-fit tests for the Pareto distribution based on these characterizations are presented, and their power is compared with that of well-known tests such as the Kolmogorov–Smirnov and Cramér–von Mises tests by Monte Carlo simulations.


Introduction
Let $X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}$ denote the order statistics of independent and identically distributed (iid) random variables $X_1, X_2, \ldots, X_n$ with cumulative distribution function (cdf) $F_X(\cdot)$. Order statistics play important roles in many areas of statistics and probability, and some of them are especially useful in practice. For example, in actuarial science, the distribution of the minimum of the two lifespans of a couple is important for insurance policy decisions. In industry, specifically in reliability and survival analysis, order statistics are used to solve problems. Meteorology and hydrology are other fields of application. Interested readers can find the theory and applications of order statistics in, for instance, Arnold et al. [1].

Let $F_X(\cdot)$ be a discrete cdf. Eisenberg et al. [2] were the first to define the quantity $K_n = \sum_{i=1}^{n} I(X_i = X_{n:n})$ as the number of winners in a golf competition, and they studied sufficient conditions under which $K_n$ converges to 1. Pakes and Steutel [3] then considered a similar notion for a continuous cdf:

$$K_n(a) = \sum_{j=1}^{n} I_{(X_{n:n}-a,\ X_{n:n})}(X_j), \tag{1}$$

where $a > 0$ is a constant. In fact, $K_n(a)$ counts the number of observations in the left-hand neighbourhood of the sample maximum with fixed width $a$. Later, $K_n(a)$ was extended to the number of observations near the kth order statistic:

$$K_{-}(n,k,a) = \#\{\, j \in \{1, 2, \ldots, n\} : X_j \in (X_{k:n} - a,\ X_{k:n}) \,\}, \tag{2}$$

where the support of $K_{-}(n,k,a)$ is $\{0, 1, \ldots, k-1\}$. Similarly, the number of observations in the right-hand neighbourhood of the kth order statistic was defined as

$$K_{+}(n,k,a) = \#\{\, j \in \{1, 2, \ldots, n\} : X_j \in (X_{k:n},\ X_{k:n} + a) \,\}. \tag{3}$$

The support of $K_{+}(n,k,a)$ is $\{0, 1, \ldots, n-k\}$. The probability mass function (pmf) of $K_{-}(n,k,a)$ is

$$P(K_{-}(n,k,a) = j) = \binom{k-1}{j} \int [1 - \eta_L(x,a)]^{j}\, [\eta_L(x,a)]^{k-1-j}\, dF_{k:n}(x), \quad j = 0, 1, \ldots, k-1, \tag{4}$$

where $\eta_L(x,a) = F_X(x-a)/F_X(x)$ and $F_{k:n}(\cdot)$ is the cdf of $X_{k:n}$. Also, the pmf of $K_{+}(n,k,a)$ is given by

$$P(K_{+}(n,k,a) = j) = \binom{n-k}{j} \int [1 - \eta_U(x,a)]^{j}\, [\eta_U(x,a)]^{n-k-j}\, dF_{k:n}(x), \quad j = 0, 1, \ldots, n-k, \tag{5}$$

where $\eta_U(x,a) = \bar F_X(x+a)/\bar F_X(x)$ and $\bar F_X = 1 - F_X$ is the survival function. For more details, one can refer to Dembińska et al. [4].
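The counting variables in (2) and (3) can be made concrete with a small sketch. The function name `near_order_counts` is hypothetical, and the neighbourhoods are assumed to be open intervals, which is immaterial for continuous data because ties occur with probability zero.

```python
def near_order_counts(sample, k, a):
    """Return (K_minus, K_plus) for the k-th order statistic (k is 1-based).

    Assumption: open intervals (x_k - a, x_k) and (x_k, x_k + a).
    """
    xk = sorted(sample)[k - 1]  # k-th order statistic X_{k:n}
    k_minus = sum(1 for x in sample if xk - a < x < xk)
    k_plus = sum(1 for x in sample if xk < x < xk + a)
    return k_minus, k_plus


sample = [2.0, 3.5, 3.9, 4.0, 4.2, 6.0]
# X_{4:6} = 4.0; 3.5 and 3.9 lie in (3.4, 4.0), and 4.2 lies in (4.0, 4.6).
print(near_order_counts(sample, k=4, a=0.6))  # prints (2, 1)
```

By construction the first count never exceeds k − 1 and the second never exceeds n − k, matching the supports stated above.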
Further results on their distributions, asymptotic properties, and generalizations have been investigated; see, e.g., Pakes and Li [5], Li [6], Pakes [7, 8], Balakrishnan and Stepanov [9, 10], Dembińska et al. [4], and Dembińska [11–13]. So far, few researchers have addressed statistical inference based on near-order statistics, e.g., Müller [14], Hashorva and Hüsler [15], Akbari et al. [16], and Akbari and Akbari [17]. In the present paper, a new version of near-order statistics is first defined. Then some characterization results are obtained as a statistical tool for goodness-of-fit (GOF) tests for some continuous distributions. The paper is organized as follows. Section 2 contains preliminary results. The characterization results are given in Section 3. Finally, two goodness-of-fit tests for the Pareto distribution are introduced in Section 4. The critical values of the proposed test statistics are computed by Monte Carlo simulations, and their power is compared by simulation with that of well-known tests such as the Kolmogorov–Smirnov and Cramér–von Mises tests. All simulations are carried out in R 3.6.3 with 10000 replications.

Preliminary Results
Let $X_1, \ldots, X_n$ be iid random variables from a continuous cdf $F_X(\cdot)$ with support $S_X$, where $F_X$ is one of the Pareto, uniform, or power function distributions. A common way to construct characterizations for such $F_X$ is through the pmfs or moments of suitable functions of the random variables. This is not possible, however, with the counting random variables $K_{\pm}$, which count the observations in location-type neighbourhoods of a given order statistic and whose pmfs are given in equations (4) and (5), because these pmfs do not have closed-form expressions.

Therefore, new counts of the observations falling in the left-hand and right-hand neighbourhoods of a specific order statistic, extended to scale-type neighbourhoods, are introduced, respectively, as follows:

$$K'_{-}(n,k,a) = \#\{\, j \in \{1, 2, \ldots, n\} : X_j \in (aX_{k:n},\ X_{k:n}) \,\}, \tag{6}$$

where $0 < a < 1$, and

$$K'_{+}(n,k,b) = \#\{\, j \in \{1, 2, \ldots, n\} : X_j \in (X_{k:n},\ bX_{k:n}) \,\}, \tag{7}$$

where $b > 1$.
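A minimal sketch of the scale-type counts, assuming positive observations and open neighbourhoods $(aX_{k:n}, X_{k:n})$ and $(X_{k:n}, bX_{k:n})$; the function name is hypothetical. It also illustrates why such counts are unaffected by a change of scale.

```python
def scale_near_order_counts(sample, k, a, b):
    """Return (K'_minus, K'_plus) around the k-th order statistic (k is 1-based).

    Assumptions: positive data, 0 < a < 1 < b, open intervals.
    """
    xk = sorted(sample)[k - 1]
    k_minus = sum(1 for x in sample if a * xk < x < xk)
    k_plus = sum(1 for x in sample if xk < x < b * xk)
    return k_minus, k_plus


sample = [1.0, 2.0, 3.0, 4.0, 5.0, 9.0]
# X_{4:6} = 4.0; only 3.0 lies in (2.0, 4.0) and only 5.0 lies in (4.0, 8.0).
print(scale_near_order_counts(sample, k=4, a=0.5, b=2.0))  # prints (1, 1)

# Multiplying every observation by a positive constant leaves the counts unchanged.
rescaled = [10.0 * x for x in sample]
print(scale_near_order_counts(rescaled, k=4, a=0.5, b=2.0))  # prints (1, 1)
```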

Proposition 1.
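Using the same arguments as in Dembińska et al. [4], the pmfs of the new counting random variables $K'_{-}(n,k,a)$ and $K'_{+}(n,k,b)$ have the same form as the pmfs of $K_{-}(n,k,a)$ and $K_{+}(n,k,b)$, respectively, with $\eta_L(x,a) = F_X(ax)/F_X(x)$ and $\eta_U(x,b) = \bar F_X(bx)/\bar F_X(x)$, where $\bar F_X = 1 - F_X$; that is,

$$P(K'_{-}(n,k,a) = j) = \binom{k-1}{j} \int_{S_X} \left[1 - \frac{F_X(ax)}{F_X(x)}\right]^{j} \left[\frac{F_X(ax)}{F_X(x)}\right]^{k-1-j} dF_{k:n}(x), \quad j = 0, 1, \ldots, k-1, \tag{8}$$

$$P(K'_{+}(n,k,b) = j) = \binom{n-k}{j} \int_{S_X} \left[1 - \frac{\bar F_X(bx)}{\bar F_X(x)}\right]^{j} \left[\frac{\bar F_X(bx)}{\bar F_X(x)}\right]^{n-k-j} dF_{k:n}(x), \quad j = 0, 1, \ldots, n-k. \tag{9}$$

On the other hand, by simple algebra, the first moments of $K'_{-}(n,k,a)$ and $K'_{+}(n,k,b)$ can be derived, respectively, as

$$E[K'_{-}(n,k,a)] = (k-1)\left[1 - \int_{S_X} \frac{F_X(ax)}{F_X(x)}\, dF_{k:n}(x)\right], \tag{10}$$

$$E[K'_{+}(n,k,b)] = (n-k)\left[1 - \int_{S_X} \frac{\bar F_X(bx)}{\bar F_X(x)}\, dF_{k:n}(x)\right]. \tag{11}$$

Some special cases for the Pareto and power function distributions are reported here as examples; they will be useful for obtaining further results in the next section.

Example 1. Let $X_1, X_2, \ldots, X_n$ be iid random variables from Pareto $(\alpha, \beta)$, so that their survival function is given by

$$\bar F_X(x) = \left(\frac{\beta}{x}\right)^{\alpha}, \quad x \ge \beta,\ \alpha > 0,\ \beta > 0. \tag{12}$$

Then, from equation (9), the pmf of $K'_{+}(n,k,b)$ for this sequence is

$$P(K'_{+}(n,k,b) = j) = \binom{n-k}{j} \left[1 - (1/b)^{\alpha}\right]^{j} \left[(1/b)^{\alpha}\right]^{n-k-j}, \quad j = 0, 1, \ldots, n-k. \tag{13}$$

Equation (13) shows that $K'_{+}(n,k,b)$ has a binomial pmf with parameters $n-k$ and $1 - (1/b)^{\alpha}$. Thus,

$$E[K'_{+}(n,k,b)] = (n-k)\left[1 - (1/b)^{\alpha}\right], \tag{14}$$

which does not depend on the scale parameter $\beta$.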
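The claim of Example 1 can be checked numerically. The following is a rough Monte Carlo sketch, not the simulation code of the paper: the helper names, the seed, the replication count, and the parameter values are illustrative assumptions, and Pareto variates are drawn by inverse transform from the survival function $(\beta/x)^{\alpha}$.

```python
import random


def pareto_sample(n, alpha, beta, rng):
    # Inverse transform: if U is uniform on (0, 1], beta * U**(-1/alpha) is Pareto(alpha, beta).
    return [beta * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def k_plus_scale(sample, k, b):
    # Number of observations strictly between X_{k:n} and b * X_{k:n}.
    xk = sorted(sample)[k - 1]
    return sum(1 for x in sample if xk < x < b * xk)


def mean_k_plus(n, k, b, alpha, beta, reps=5000, seed=1):
    # Monte Carlo estimate of E[K'_+(n, k, b)] under Pareto(alpha, beta).
    rng = random.Random(seed)
    total = sum(k_plus_scale(pareto_sample(n, alpha, beta, rng), k, b)
                for _ in range(reps))
    return total / reps


n, k, b, alpha = 10, 4, 1.5, 2.0
theory = (n - k) * (1.0 - (1.0 / b) ** alpha)  # binomial mean from Example 1
for beta in (1.0, 7.0):
    print(beta, round(mean_k_plus(n, k, b, alpha, beta), 3), round(theory, 3))
```

For both values of the scale parameter the simulated mean should be close to the binomial mean, which illustrates that the first moment is free of $\beta$.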
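Example 2. It is well known that if $X \sim$ Pareto $(\alpha, \beta)$, then the random variable $Y = 1/X$ has the power function distribution with the following cdf:

$$F_Y(y) = (\beta y)^{\alpha}, \quad 0 \le y \le 1/\beta. \tag{15}$$

The notation power $(\alpha, \beta)$ is used for the power function distribution with parameters $\alpha$ and $\beta$. It is also called the generalized uniform distribution because it reduces to the standard uniform cdf at $\beta = 1$ and $\alpha = 1$. From (8), the pmf of $K'_{-}(n,k,a)$ for random variables $X_1, \ldots, X_n$ distributed as power $(\alpha, \beta)$ is

$$P(K'_{-}(n,k,a) = j) = \binom{k-1}{j} (1 - a^{\alpha})^{j} (a^{\alpha})^{k-1-j}, \quad j = 0, 1, \ldots, k-1. \tag{16}$$

According to (16), $K'_{-}(n,k,a)$ has a binomial pmf with parameters $k-1$ and success probability $1 - a^{\alpha}$. So

$$E[K'_{-}(n,k,a)] = (k-1)(1 - a^{\alpha}). \tag{17}$$

The uniform distribution on the interval $(0, 1)$ is the special case power $(1, 1)$, with cdf

$$F_X(x) = x, \quad 0 < x < 1. \tag{18}$$

Therefore, from (16), the pmf of $K'_{-}(n,k,a)$ in the uniform case is

$$P(K'_{-}(n,k,a) = j) = \binom{k-1}{j} (1 - a)^{j} a^{k-1-j}, \quad j = 0, 1, \ldots, k-1, \tag{19}$$

and its expectation is

$$E[K'_{-}(n,k,a)] = (k-1)(1 - a). \tag{20}$$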
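The binomial form in Example 2 rests on one line of algebra, worked out here as a sketch under the assumption that the power function cdf is $F_X(x) = (\beta x)^{\alpha}$ on $[0, 1/\beta]$, the form implied by $Y = 1/X$ with $X \sim$ Pareto $(\alpha, \beta)$:

```latex
\eta_L(x,a) \;=\; \frac{F_X(ax)}{F_X(x)}
            \;=\; \frac{(\beta a x)^{\alpha}}{(\beta x)^{\alpha}}
            \;=\; a^{\alpha},
\qquad 0 < a < 1,\quad 0 < x \le 1/\beta .
```

Because the ratio does not depend on $x$, conditioning on $X_{k:n} = x$ plays no role: each of the $k-1$ observations below $X_{k:n}$ falls in $(aX_{k:n}, X_{k:n})$ with probability $1 - a^{\alpha}$, which gives the binomial pmf with parameters $k-1$ and $1 - a^{\alpha}$. Setting $\alpha = \beta = 1$ gives the uniform case with success probability $1 - a$.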

Characterization Results
In this section, some characterization results based on distributional properties of the near-order statistics $K'_{-}(n,k,a)$ and $K'_{+}(n,k,b)$ are established for some continuous distributions by means of complete sequences of functions. To this end, some notions and theorems from this theory are recalled first.

Definition 1.
A sequence $\{\phi_n\}_{n \ge 1}$ of elements of a Hilbert space $H$ is called complete if the only element orthogonal to every $\phi_n$ is the null element, that is,

$$\langle f, \phi_n \rangle = 0 \quad \text{for all } n \ge 1 \tag{21}$$

implies that $f$ is null. The notation $\langle \cdot, \cdot \rangle$ denotes the inner product of $H$. In the present paper, the Hilbert space $L^2[0, 1]$ is considered with the inner product

$$\langle f, g \rangle = \int_0^1 f(x)\, g(x)\, dx, \tag{22}$$

where $f$ and $g$ are real-valued square integrable functions on $[0, 1]$. One of the complete sequences of functions in $L^2[0, 1]$ is $\{x^n,\ n \ge 1\}$, which is used in this paper. The following theorem, known as the Müntz theorem, states the necessary and sufficient condition for completeness of the subsequence $\{x^{n_j},\ j \ge 1\}$, $n_1 < n_2 < \cdots$.

Theorem 1 (Higgins [18], page 95). The sequence $x^{n_1}, x^{n_2}, \ldots$, with $1 \le n_1 < n_2 < \cdots$, is complete in $L^2[0, 1]$ if and only if

$$\sum_{j=1}^{\infty} \frac{1}{n_j} = \infty. \tag{23}$$

For more details about Hilbert spaces and complete sequences, refer to Higgins [18].

The Pareto distribution has many applications in economics and actuarial science. Many of its properties and characterizations based on order statistics or their functions have been obtained; see, for example, Lee and Chang [19], Afify [20], Ahsanullah and Shakil [21], Ahsanullah et al. [22], and Nofal and El Gebaly [23]. In the following theorem, some characterizations of the Pareto law in terms of $K'_{+}(n,k,b)$ are established.

Theorem 2. Suppose that $X_1, X_2, \ldots, X_n$ are iid continuous random variables from cdf $F_X(\cdot)$ with support $[\beta, \infty)$. Then the $X_i$'s have the Pareto $(\alpha, \beta)$ distribution with survival function (12) if and only if one of the following statements holds.

(a) There exists $j_0 \in \{0, 1, \ldots\}$ such that, for all $n \ge j_0 + 1$, $b > 1$, and $k = n - j_0$, equation (24) holds.

Journal of Probability and Statistics
(b) For a fixed $k \ge 1$ and for all $n \ge k$ and $b > 1$, we have

$$E[K'_{+}(n,k,b)] = (n-k)\left[1 - (1/b)^{\alpha}\right]. \tag{25}$$

Proof. If the $X$-sequence has the Pareto $(\alpha, \beta)$ cdf, parts (a) and (b) follow easily from equations (13) and (14). Conversely, let part (a) hold. Using the pmf of $K'_{+}(n,k,b)$ and the assumptions of (a), the equality in (a) can be rewritten as equation (26). Rewriting the right-hand side of (26) and replacing $dF_{k:n}(x)$ by its explicit form leads to equation (28). Taking the change of variable $u = 1 - F_X(x)$ on the left-hand side of (28), using the assumption $k = n - j_0$, and simplifying gives equation (30). If (30) holds for all $n \ge j_0 + 1$, then, by the completeness property of the sequence $\{(1-u)^{n-j_0-1},\ n \ge j_0 + 1\}$, an identity in $u$ follows which, through (33), can be rewritten as

$$\bar F_X(bt) = (1/b)^{\alpha}\, \bar F_X(t). \tag{34}$$

If (34) holds for all $b > 1$ and $t \ge \beta$, then, by the method of solution given in Aczél [24], the function $\bar F_X(t) = c\,t^{-\alpha}$ is the general solution of (34). Because $t \ge \beta$ and $\bar F_X(\cdot)$ is a survival function, the constant $c$ must be $\beta^{\alpha}$. This completes the proof of part (a).
Suppose that part (b) holds. Then, from equation (11), an equality is deduced which, after some simplifications, can be treated in the same way as in part (a); the rest of the proof is similar to the proof of part (a). □

Several characterizations of the power function distribution are already available. For example, it was characterized by Ahsanullah et al. [25] through lower records. Khan and Khan [26] and Lim and Lee [27] characterized it based on a dependency property of lower records, and Tavangar [28] presented a characterization using dual generalized order statistics. In the next theorem, new characterization results for the power function distribution are proved.

Theorem 3. Suppose that $X_1, X_2, \ldots, X_n$ are iid continuous random variables from cdf $F_X(\cdot)$ with support $[0, 1/\beta]$. Then the $X_i$'s have the power function distribution with cdf (15) if and only if one of the following statements holds.
(a) For all $n \ge k$ and $0 < a < 1$, there exists $j_0 \in \{0, 1, \ldots\}$ such that, for $k = j_0 + 1$ and some $\alpha > 0$, equation (37) holds.

(b) For a fixed $k \ge 1$ and for all $n \ge k$, $0 < a < 1$, and some $\alpha > 0$, we have

$$E[K'_{-}(n,k,a)] = (k-1)(1 - a^{\alpha}). \tag{38}$$

Proof. If the $X_i$'s have the power function distribution with cdf (15), parts (a) and (b) follow easily from equations (16) and (17). Conversely, let condition (a) be satisfied; then the assumed equality can be written as equation (39). By the completeness property of the sequence $\{(1-u)^{n-k},\ n \ge k\}$, equation (39) is equivalent to the identity (41). Taking the change of variable $t = F_X^{-1}(u)$ in (41) gives

$$F_X(at) = a^{\alpha} F_X(t). \tag{42}$$

The function $F_X(x) = c\,x^{\alpha}$ is the general solution of this functional equation, which completes the proof of (a). In a similar way, if (b) holds, one can easily prove that the parent population has the power function distribution. □

The results of Theorem 3 can also be obtained directly from Theorem 2 by noticing that $X \sim$ power $(\alpha, \beta)$ if and only if $Y = 1/X \sim$ Pareto $(\alpha, \beta)$. Therefore, for $0 < a < 1$,

$$K_{-}^{\prime\,X}(n,k,a) = K_{+}^{\prime\,Y}(n,\ n-k+1,\ 1/a), \tag{43}$$

where $K'_{-}(n,k,a)$ with superscript $X$ denotes the number of observations near the kth order statistic of the $X$-sequence. Using the relationship between the power function and standard uniform distributions mentioned before, the following results follow from Theorem 3 and are stated without proof.

Corollary 1. Suppose that $X_1, X_2, \ldots, X_n$ are iid continuous random variables from cdf $F_X(\cdot)$ with support $[0, 1]$. Then the $X_i$'s have the standard uniform distribution if and only if one of the following statements holds.

(a) For all $n \ge k$ and $0 < a < 1$, there exists $j_0 \in \{0, 1, \ldots\}$ such that, for $k = j_0 + 1$, equation (44) holds.

(b) For a fixed $k \ge 1$ and for all $n \ge k$ and $0 < a < 1$, we have

$$E[K'_{-}(n,k,a)] = (k-1)(1 - a). \tag{45}$$

Remark 1. According to Theorem 1, it is not necessary that the assumptions of Theorem 2 hold for all $n \ge j_0 + 1$ or all $n \ge k$. The same is true for Theorem 3 and Corollary 1. So the results of Theorems 2 and 3 and Corollary 1 hold if their assumptions are satisfied for any increasing subsequence $\{n_i \ge j_0 + 1,\ i \ge 1\}$ or $\{n_i \ge k,\ i \ge 1\}$ for which the equality (23) holds.
Remark 2. Some characterization results for the two-parameter exponential distribution have been obtained by Akbari and Akbari [17] based on the counting random variable $K_{+}(n,k,a)$. Their characterization results in Section 2 can also be derived directly from Theorems 2 and 3. To verify this claim, suppose that $X$ is a random variable having the two-parameter exponential distribution with parameters $(\mu, \alpha)$, denoted by Exp $(\mu, \alpha)$, with cdf

$$F_X(x) = 1 - e^{-\alpha(x - \mu)}, \quad x \ge \mu. \tag{46}$$

Since $X \sim$ Exp $(\mu, \alpha)$ if and only if $Z = e^{X} \sim$ Pareto $(\alpha, e^{\mu})$, the following relationship holds between $K_{+}^{X}$ and $K_{+}^{\prime\,Z}$. For $a > 0$,

$$K_{+}^{X}(n,k,a) = \#\{\, j \in \{1, \ldots, n\} : X_j \in (X_{k:n},\ X_{k:n} + a) \,\} = \#\{\, j \in \{1, \ldots, n\} : Z_j \in (Z_{k:n},\ e^{a} Z_{k:n}) \,\} = K_{+}^{\prime\,Z}(n,k,e^{a}). \tag{47}$$
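Relation (47) is a sample-path identity, so it can be checked directly on simulated data. The sketch below assumes the parametrization $F_X(x) = 1 - e^{-\alpha(x-\mu)}$ for Exp $(\mu, \alpha)$; the helper names, seed, and parameter values are illustrative only.

```python
import math
import random


def k_plus_location(sample, k, a):
    # Observations in the location-type neighbourhood (x_k, x_k + a).
    xk = sorted(sample)[k - 1]
    return sum(1 for x in sample if xk < x < xk + a)


def k_plus_scale(sample, k, b):
    # Observations in the scale-type neighbourhood (x_k, b * x_k).
    xk = sorted(sample)[k - 1]
    return sum(1 for x in sample if xk < x < b * xk)


rng = random.Random(2024)
mu, alpha, n, k, a = 0.5, 2.0, 12, 5, 0.3
mismatches = 0
for _ in range(2000):
    xs = [mu + rng.expovariate(alpha) for _ in range(n)]  # X ~ Exp(mu, alpha)
    zs = [math.exp(x) for x in xs]                        # Z = e^X ~ Pareto(alpha, e^mu)
    if k_plus_location(xs, k, a) != k_plus_scale(zs, k, math.exp(a)):
        mismatches += 1
print("samples where the two counts differ:", mismatches)
```

Apart from floating-point rounding at the interval boundaries, which is extremely unlikely to matter for continuous draws, the two counts agree on every sample, because $x \mapsto e^{x}$ is strictly increasing and maps $(x_k, x_k + a)$ onto $(e^{x_k}, e^{a} e^{x_k})$.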

Goodness-of-Fit Test Results
Many GOF tests for different distributions have been constructed from characterization results. For example, Rizzo [29], Obradović et al. [30], and Volkova [31] obtained GOF tests for the Pareto distribution in terms of different characteristic properties. According to Nikitin [32], tests based on characterizations are usually more efficient than other tests, because a property unique to the distribution is used in constructing the test statistic.
Let $X_1, \ldots, X_n$ be iid random variables from a continuous distribution function $F_X(x)$. For testing the null hypothesis $H_0$ that $F_X$ is a Pareto cdf against the alternative that $F_X$ differs from it for some $x$, two test statistics based on the characterization results of Theorem 2 are presented. According to part (a) of this theorem, the null hypothesis $H_0$ is rejected if there exists $n$ such that, for all $j \le n - 1$ and $k = n - j$, equation (24) is not satisfied, that is, if the discrepancy between the two sides of (24) is large. From (9) and the assumption $k = n - j$, this discrepancy can be rewritten in terms of $F_X$ as expression (50). Replacing $F_X(x)$ by the empirical distribution function $F_n(x)$ gives a point estimator of (50), and the resulting test statistic, $D_P$, rejects $H_0$ for large values. By the same argument, another test statistic, $D_E$, based on part (b) of Theorem 2 is built from the quantities

$$D_E(k,n) = \frac{1}{n} \sum_{i=1}^{n} \left(\frac{i}{n+1}\right)^{k-1} \left(1 - \frac{i}{n+1}\right)^{n-k} \left|\, \bigl(1 - F_n(bX_{i:n})\bigr) - \left(1 - \frac{i}{n+1}\right) (1/b)^{\alpha} \,\right|.$$
It is clear that $D_P$ and $D_E$ are free of the scale parameter of the Pareto distribution and that large values of them lead to rejection of $H_0$.
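The quantity $D_E(k,n)$ is simple to compute once $F_n$ is fixed. The sketch below is an illustration rather than the author's implementation: it assumes the usual empirical cdf $F_n(t) = n^{-1}\,\#\{j : X_j \le t\}$, treats $\alpha$ as known, and uses the hypothetical function name `d_e`.

```python
from bisect import bisect_right


def d_e(sample, k, b, alpha):
    """Compute D_E(k, n) for a sample, with b > 1 and a given shape alpha.

    Assumption: F_n(t) is the proportion of observations <= t.
    """
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        p = i / (n + 1)                               # plug-in value for F_X(X_{i:n})
        surv_hat = 1.0 - bisect_right(xs, b * x) / n  # 1 - F_n(b * X_{i:n})
        weight = p ** (k - 1) * (1.0 - p) ** (n - k)
        total += weight * abs(surv_hat - (1.0 - p) * b ** (-alpha))
    return total / n


data = [1.1, 1.3, 1.8, 2.4, 3.0, 4.7, 9.5]            # made-up positive values
print(d_e(data, k=3, b=1.5, alpha=1.0))
print(d_e([2.0 * x for x in data], k=3, b=1.5, alpha=1.0))  # rescaled data
```

Both calls return the same value, since rescaling the data changes neither the ranks nor the comparisons of each observation with $b X_{i:n}$; this is the scale-freeness noted above.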
In the rest of this section, the power of the two test statistics $D_P$ and $D_E$ is compared with that of two well-known tests, namely, the Kolmogorov–Smirnov and Cramér–von Mises tests, whose statistics are, respectively,

$$D = \max(D^{+}, D^{-}),$$

where $D^{+} = \max_{1 \le i \le n} |(i/n) - F_0(X_{i:n})|$ and $D^{-} = \max_{1 \le i \le n} |F_0(X_{i:n}) - ((i-1)/n)|$, and

$$W^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left[F_0(X_{i:n}) - \frac{2i-1}{2n}\right]^2,$$

with $F_0$ denoting the cdf under the null hypothesis. Since the Pareto distribution has a long tail, alternative distributions with a long right-hand tail are considered for comparing the power of the statistics $D_P$, $D_E$, $D$, and $W^2$.

(ii) The gamma distribution with density $\dfrac{1}{\lambda^{r}\Gamma(r)}\, x^{r-1} e^{-x/\lambda}$, $x > 0$.

Since it is not easy to find the null distributions of $D_P$, $D_E$, $D$, and $W^2$, Monte Carlo simulations with 10000 replications are used to calculate their critical values and power at the 5 percent significance level. Tables 1 and 2 show the results for the null distributions Pareto (1, 2) and Pareto (2, 2), respectively. Because the statistics $D_P$ and $D_E$ involve the parameter $b$, one can choose a value of $b$ that maximizes the corresponding power; these values are calculated and shown in the tables. Unfortunately, there is no exact method for finding these values, which depend on the supports of the null and alternative distributions. The values in parentheses in the tables are the estimated significance levels. According to the two tables, the proposed tests are more powerful than the other tests in all the cases considered, and they perform well even for small sample sizes.
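The Monte Carlo step can be sketched as follows for the two classical statistics. This is a minimal illustration, not the paper's R code: it uses the textbook forms of the Kolmogorov–Smirnov and Cramér–von Mises statistics, treats the null parameters as known, reads Pareto (1, 2) as $\alpha = 1$, $\beta = 2$, and the function names, seed, and replication count are assumptions.

```python
import math
import random


def pareto_cdf(x, alpha, beta):
    return 1.0 - (beta / x) ** alpha if x >= beta else 0.0


def ks_statistic(sample, cdf):
    # D = max(D+, D-) computed from the order statistics.
    xs = sorted(sample)
    n = len(xs)
    d_plus = max(abs(i / n - cdf(x)) for i, x in enumerate(xs, start=1))
    d_minus = max(abs(cdf(x) - (i - 1) / n) for i, x in enumerate(xs, start=1))
    return max(d_plus, d_minus)


def cvm_statistic(sample, cdf):
    # W^2 = 1/(12n) + sum over i of (F0(X_{i:n}) - (2i - 1)/(2n))^2.
    xs = sorted(sample)
    n = len(xs)
    return 1.0 / (12 * n) + sum((cdf(x) - (2 * i - 1) / (2 * n)) ** 2
                                for i, x in enumerate(xs, start=1))


def mc_critical_value(statistic, n, alpha, beta, reps=2000, level=0.95, seed=0):
    # Empirical `level` quantile of the statistic under the Pareto null.
    rng = random.Random(seed)
    cdf = lambda x: pareto_cdf(x, alpha, beta)
    values = []
    for _ in range(reps):
        sample = [beta * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]
        values.append(statistic(sample, cdf))
    values.sort()
    return values[math.ceil(level * reps) - 1]


print(mc_critical_value(ks_statistic, n=20, alpha=1.0, beta=2.0))
print(mc_critical_value(cvm_statistic, n=20, alpha=1.0, beta=2.0))
```

The same loop gives an estimated power if the samples are drawn from an alternative distribution instead and the proportion of statistics exceeding the critical value is recorded.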
In the following example, a real data set is used to illustrate how the proposed tests can be applied.
Example 3 (An application to real data). The following data represent the time to breakdown of a type of electrical insulating material subject to a constant-voltage stress (Nelson [33]). These data were recently used by Tiku and Akkaya [34], who established that the null hypothesis that the data come from an exponential distribution cannot be rejected at the 10 percent significance level. The data appear to come from a distribution with a long right-hand tail, so the Pareto distribution is another candidate for them. For testing $H_0: X \sim$ Pareto versus $H_1: X \nsim$ Pareto, the parameters of the Pareto distribution are first estimated as shape parameter $\alpha = 0.51$ and scale parameter $\beta = 0.35$. Then the values of the proposed statistics (with $b = 1.2$) and of the Kolmogorov–Smirnov and Cramér–von Mises statistics are computed from the data. On this basis, the null hypothesis that the data come from a Pareto distribution cannot be rejected.

Data Availability
No data are included in the study.

Conflicts of Interest
The authors declare that they have no conflicts of interest.