Comparative Analysis of Polynomial Maximization and Maximum Likelihood Estimates for Data with Exponential Power Distribution

This work presents a comparative analysis of the accuracy of parameter estimates for experimental data with the exponential power distribution (EPD), obtained with the classical maximum likelihood estimation (MLE) and the original polynomial maximization method (PMM). In contrast to the parametric approach of MLE, which relies on a description in the form of a probability density function, PMM is based on a partial description in the form of higher-order statistics and on the mathematical apparatus of Kunchenko's stochastic polynomials. An algorithm for finding PMM estimates using 3rd-order stochastic polynomials is presented. Analytical expressions are obtained for the variance of the PMM estimates in the asymptotic case when the EPD parameters are known a priori. It is shown that the relative theoretical accuracy of the different methods depends significantly on the EPD shape parameter and matches only in the particular case of the Gaussian distribution. The effectiveness of the different approaches (including estimates in the form of the mean), both with and without a priori information on the EPD properties, was investigated by repeated statistical tests (the Monte Carlo method). The areas of greatest efficiency of each method, depending on the EPD shape parameter and the sample size, are constructed.


Introduction
Current trends in information and measurement technologies are focused on the development and use of software systems for the statistical processing of experimental results, whatever the form in which the measurements are recorded, and such systems necessarily include modern methods of statistical analysis. These methods can solve a wide range of measurement problems, but they must be applied properly in research, for example when the model involves an additive interaction of the measured parameter and a random error [1].
It is well known that the random errors of measuring instruments seldom obey the normal (Gaussian) law [2]. Moreover, measuring instruments and systems are based on different physical principles, different measurement methods and different transformations of measuring signals. In fact, measurement errors result from many influencing factors, both random and non-random, acting constantly or periodically. Therefore, measurement errors are adequately described by the normal distribution law only if certain theoretical and technical preconditions are met [3].
The exponential power distribution (EPD) has been proposed as an alternative to the Gaussian assumption for the probability distribution of measurement errors [4]. Since the distribution of experimental data is usually symmetric about its center, the EPD can serve as a model of the measurement error distribution law.
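The EPD model is typically described by a density function of the form
$$f(x) = \frac{1}{2 p^{1/p} \sigma_p \Gamma(1 + 1/p)} \exp\left(-\frac{|x - \theta|^p}{p \sigma_p^p}\right), \qquad (1)$$
where $\theta$ is the distribution center, $\sigma_p$ is the scale parameter and $p$ is the shape parameter.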
The use of the EPD is well substantiated: as a model of the probability distribution of measurement errors [5,6], for describing the probabilistic characteristics of random noise in video and audio signal processing [7], in biomedical research statistics [8], in risk management [9], etc. The great advantage of this model is that it can describe random variables with both negative (flat-topped distributions) and positive (sharp-peaked distributions) kurtosis coefficients [6,10]. Such models are in demand in many fields of science and technology. In addition, modifications of the EPD model for asymmetrically distributed random variables have already been developed [11][12][13][14].
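As a rough numerical illustration of how the kurtosis coefficient changes sign with the shape parameter, the sketch below evaluates the excess kurtosis of an EPD. It assumes a density proportional to exp(-|x|^p / c) for some positive scale constant c, for which the ratio of even moments depends only on the shape parameter; the function name `epd_excess_kurtosis` and the symbol `p` are illustrative choices, not taken from the original.

```python
import math


def epd_excess_kurtosis(p: float) -> float:
    """Excess kurtosis mu4 / mu2**2 - 3 for a density proportional to exp(-|x|**p / c)."""
    # For this family E|x|**r is proportional to Gamma((r + 1) / p) / Gamma(1 / p),
    # and the scale constant cancels in the ratio mu4 / mu2**2.
    log_ratio = (math.lgamma(5.0 / p) + math.lgamma(1.0 / p)
                 - 2.0 * math.lgamma(3.0 / p))
    return math.exp(log_ratio) - 3.0


for p in (1.0, 2.0, 4.0, 10.0):
    print(f"p = {p:4.1f}  excess kurtosis = {epd_excess_kurtosis(p):+.3f}")
```

Under this assumption the value is positive for p < 2 (p = 1 is the Laplace distribution), zero for the Gaussian case p = 2, and negative for p > 2, approaching the value -1.2 of the uniform distribution as p grows.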

Purpose of research
A detailed analysis of works [1,2,15,16] shows that simple statistical characteristics are commonly used to estimate the center coordinate of symmetrically distributed experimental data because they are easy to calculate. However, when specific accuracy requirements are imposed, such simple estimates may not be sufficiently accurate. When random variables are described by a probability density function, the most accurate estimation method currently available is the maximum likelihood method. But this approach raises problems associated with the nonlinearity of processing [9,11,12]. The main difficulty lies in the practical application of numerical methods for solving nonlinear equations [13,14,17].
Thus, the researcher often has to choose between a simpler method, which may not provide the required accuracy, and cumbersome calculations that require complex computational procedures.
To resolve this dilemma, this paper proposes an original approach to statistical parameter estimation based on the polynomial maximization method (PMM), which describes random variables by a finite number of moments or cumulants [18].
In [19], an algorithm was developed for finding estimates from the results of multiple measurements against a background of random errors described by the EPD model, and the accuracy of the PMM estimates was compared with that of estimates based on three nonparametric statistics: the median, the mean and the mid-range. The purpose of this study is a comparative analysis of the accuracy of PMM estimates and parametric maximum likelihood estimates (MLE).
The analysis involves both deriving theoretical expressions for the ratio of the variances of the parameter estimates and comparing the empirical values of these variances obtained by statistical modeling with the Monte Carlo method.

Mathematical problem statement
Suppose $\theta$ is an informative parameter whose value must be estimated from the analysis of a set of values $\vec{x} = \{x_1, x_2, \ldots, x_n\}$. This vector contains independent and identically distributed sample values of the measurement model $x = \theta + \xi$, where $\xi$ is a random error that is adequately described by the exponential power distribution of the form (1).
It is necessary to determine the accuracy (the variance of the parameter estimates) of different estimation methods, and to assess how the relative accuracy is affected by the sample size and by the availability of a priori information about the EPD model parameters.

Finding estimates using the polynomial maximization method
As shown in [19][20][21][22], the application of the polynomial maximization method is based on finding the extremum of a functional in the form of a stochastic polynomial of a certain order. When power transformations are used as the basis for such a polynomial, a description in the form of a finite sequence of moments or cumulants up to order $2s$ is required. The polynomial is formed so that it has an extremum (maximum) in the vicinity of the true value of the estimated parameter $\theta$. Thus, finding the desired parameter estimate reduces to finding the root of the function $f(\theta)$, which is formed as the derivative of the polynomial with respect to the parameter and depends on the sample values $\vec{x} = \{x_1, x_2, \ldots, x_n\}$:
$$f(\theta)\Big|_{\theta=\hat{\theta}} = \sum_{i=1}^{s} h_i(\theta)\left[\frac{1}{n}\sum_{v=1}^{n} x_v^i - \alpha_i(\theta)\right]_{\theta=\hat{\theta}} = 0, \qquad (2)$$
where $\alpha_i(\theta) = E\{x^i\}$ are the mathematical expectations, i.e. the initial moments, which depend on the estimated parameter.
In fact, the function $f(\theta)$ is a sum of differences between sample and theoretical moments, weighted by the coefficients $h_i(\theta)$. These coefficients are found as the solution of the system of linear algebraic equations below, which follows from the criterion of minimum variance of the parameter estimates for a polynomial of degree $s$:
$$\sum_{j=1}^{s} h_j(\theta) F_{i,j}(\theta) = \frac{d}{d\theta}\alpha_i(\theta), \quad i = 1, \ldots, s, \qquad (3)$$
where the centered correlants are $F_{i,j}(\theta) = \alpha_{i+j}(\theta) - \alpha_i(\theta)\alpha_j(\theta)$, $i, j = 1, \ldots, s$. In [19] it was shown that, for a symmetric error distribution model, finding the parameter estimates requires a polynomial of degree at least $s = 3$.
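The system (3) can also be solved numerically. The sketch below is a minimal illustration under stated assumptions: the error distribution is symmetric with known central moments of orders 2, 4 and 6, the centered correlants are taken as F_{i,j} = alpha_{i+j} - alpha_i * alpha_j, and plain Gaussian elimination is used as the solver. The helper names `initial_moments`, `solve_linear` and `pmm_coefficients` are hypothetical and do not come from the original work.

```python
from math import comb


def initial_moments(theta, mu2, mu4, mu6):
    """Return [alpha_0, ..., alpha_6], alpha_i = E{(theta + xi)**i}, for a symmetric error xi."""
    central = [1.0, 0.0, mu2, 0.0, mu4, 0.0, mu6]  # E{xi**k}; odd moments vanish by symmetry
    return [sum(comb(i, k) * theta ** (i - k) * central[k] for k in range(i + 1))
            for i in range(7)]


def solve_linear(a, b):
    """Solve the small dense system a * h = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    h = [0.0] * n
    for r in reversed(range(n)):
        h[r] = (m[r][n] - sum(m[r][c] * h[c] for c in range(r + 1, n))) / m[r][r]
    return h


def pmm_coefficients(theta, mu2, mu4, mu6, s=3):
    """Coefficients h_1..h_s of the stochastic polynomial (s <= 3 with moments up to order 6)."""
    alpha = initial_moments(theta, mu2, mu4, mu6)
    # Centered correlants F[i][j] = alpha_{i+j} - alpha_i * alpha_j for i, j = 1..s.
    f = [[alpha[i + j] - alpha[i] * alpha[j] for j in range(1, s + 1)]
         for i in range(1, s + 1)]
    # Right-hand side: d(alpha_i)/d(theta) = i * alpha_{i-1}.
    d_alpha = [i * alpha[i - 1] for i in range(1, s + 1)]
    return solve_linear(f, d_alpha)


# Gaussian moments (mu2 = 1, mu4 = 3, mu6 = 15): only the linear term survives.
print(pmm_coefficients(0.0, 1.0, 3.0, 15.0))
```

With Gaussian moments the quadratic and cubic coefficients vanish, so the polynomial estimate degenerates into the arithmetic mean; for non-Gaussian symmetric moments the cubic coefficient is nonzero.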
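The relations for the first 6 initial moments required to form the system of equations (3) have the form:
$$\alpha_1 = \theta, \quad \alpha_2 = \theta^2 + \mu_2, \quad \alpha_3 = \theta^3 + 3\theta\mu_2, \quad \alpha_4 = \theta^4 + 6\theta^2\mu_2 + \mu_4,$$
$$\alpha_5 = \theta^5 + 10\theta^3\mu_2 + 5\theta\mu_4, \quad \alpha_6 = \theta^6 + 15\theta^4\mu_2 + 15\theta^2\mu_4 + \mu_6, \qquad (4)$$
where $\mu_2, \mu_4, \mu_6$ are the central moments of the distribution (1), which depend only on the scale and shape parameters [17]:
$$\mu_{2k} = \frac{p^{2k/p}\,\sigma_p^{2k}\,\Gamma\big((2k+1)/p\big)}{\Gamma(1/p)}, \quad k = 1, 2, 3. \qquad (5)$$
To find the optimal coefficients that minimize the variance of the PMM estimates, the derivatives of the first 3 moments are needed:
$$\frac{d}{d\theta}\alpha_1 = 1, \quad \frac{d}{d\theta}\alpha_2 = 2\theta, \quad \frac{d}{d\theta}\alpha_3 = 3\theta^2 + 3\mu_2. \qquad (6)$$
Solving the system of equations (3) analytically by the Cramer method (taking into account expressions (4) and (6)) gives the optimal coefficients of the power stochastic polynomial:
$$h_1(\theta) = \frac{\mu_6 - 3\mu_2\mu_4 + 3\theta^2(3\mu_2^2 - \mu_4)}{\mu_2\mu_6 - \mu_4^2}, \quad h_2(\theta) = -\frac{3\theta(3\mu_2^2 - \mu_4)}{\mu_2\mu_6 - \mu_4^2}, \quad h_3(\theta) = \frac{3\mu_2^2 - \mu_4}{\mu_2\mu_6 - \mu_4^2}. \qquad (7)$$
The polynomial maximization equation of degree $s = 3$ for finding the PMM estimate $\hat{\theta}$ in the case of symmetrically distributed experimental data can be represented as:
$$\left\{ h_1(\theta)\left[\frac{1}{n}\sum_{v=1}^{n} x_v - \theta\right] + h_2(\theta)\left[\frac{1}{n}\sum_{v=1}^{n} x_v^2 - \theta^2 - \mu_2\right] + h_3(\theta)\left[\frac{1}{n}\sum_{v=1}^{n} x_v^3 - \theta^3 - 3\theta\mu_2\right] \right\}_{\theta=\hat{\theta}} = 0. \qquad (8)$$
Substituting the optimal coefficients (7) into equation (8), we directly obtain the expression for estimating the desired parameter of the distribution center [19].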
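The estimation step can be sketched in a few lines. The code below is an illustration, not the authors' implementation: it assumes that, after the optimal coefficients are substituted, the degree-3 maximization equation for symmetric data reduces to a * mean((x - theta)^3) + b * (mean(x) - theta) = 0 with a = 3*mu2^2 - mu4 and b = mu6 - 3*mu2*mu4, and it assumes that the central moments mu2, mu4, mu6 are known. Newton's method started from the arithmetic mean is one possible root finder; the function name `pmm3_estimate` is hypothetical.

```python
def pmm3_estimate(sample, mu2, mu4, mu6, tol=1e-12, max_iter=50):
    """Degree-3 PMM estimate of the center of symmetric data with known central moments."""
    n = len(sample)
    mean = sum(sample) / n
    a = 3.0 * mu2 ** 2 - mu4       # weight of the cubic term (zero for Gaussian moments)
    b = mu6 - 3.0 * mu2 * mu4      # weight of the linear term
    theta = mean                   # start from the arithmetic mean
    for _ in range(max_iter):
        d = [x - theta for x in sample]
        f = a * sum(v ** 3 for v in d) / n + b * (mean - theta)
        df = -3.0 * a * sum(v ** 2 for v in d) / n - b
        if df == 0.0:
            break                  # flat derivative: keep the current value
        step = f / df
        theta -= step              # Newton update
        if abs(step) < tol:
            break
    return theta


# Moments of the uniform distribution on [-0.5, 0.5]: mu2 = 1/12, mu4 = 1/80, mu6 = 1/448.
data = [0.1, 0.9, 0.2, 0.8, 0.95]
print(pmm3_estimate(data, 1.0 / 12.0, 1.0 / 80.0, 1.0 / 448.0))
```

The cubic equation may have more than one real root; starting from the arithmetic mean selects a nearby one, which is a design choice of this sketch rather than a statement about the original algorithm.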

Accuracy of polynomial maximization method estimates
It is known that the main accuracy criterion for statistical estimation methods is the variance of the resulting parameter estimates. One way to obtain analytical expressions for the variance of maximum likelihood estimates is to use the Fisher information
$$I_n(\theta) = -E\left\{\frac{\partial^2}{\partial\theta^2}\ln L(\theta, \sigma_p, p)\right\}, \qquad (9)$$
where $\ln L(\theta, \sigma_p, p)$ is the logarithm of the likelihood function, $\theta$ is the desired parameter of the distribution center, $\sigma_p$ is the scale parameter and $p$ is the shape parameter.
When a probabilistic model of the form (1) is used, as shown in [17], the logarithm of the likelihood function is
$$\ln L(\theta, \sigma_p, p) = -n\ln\left[2 p^{1/p}\sigma_p\Gamma(1 + 1/p)\right] - \frac{1}{p\sigma_p^p}\sum_{v=1}^{n}|x_v - \theta|^p. \qquad (10)$$
The variance of the estimates in the asymptotic case ($n \to \infty$) can be found as the inverse of the Fisher information. Thus, (9) and (10) yield the analytical expression
$$\sigma^2_{(\theta)MLE} = \frac{\sigma_p^2\,\Gamma(1/p)}{n\,(p-1)\,p^{1-2/p}\,\Gamma(1 - 1/p)}. \qquad (11)$$
In [18] it was shown that, for the polynomial maximization method, an analogue of the Fisher information is the so-called amount of information obtained about the parameter $\theta$ when it is estimated using a stochastic polynomial of degree $s$, which is generally described by the expression
$$J_{sn}(\theta) = n\sum_{i=1}^{s}\sum_{j=1}^{s} h_i(\theta)h_j(\theta)F_{i,j}(\theta). \qquad (12)$$
In addition, it is proved that in the asymptotic case ($s \to \infty$) the amount of extracted information tends to the Fisher information.
After some elementary transformations, the expression for the amount of information obtained about the parameter $\theta$ can be represented as
$$J_{sn}(\theta) = n\sum_{i=1}^{s} h_i(\theta)\frac{d}{d\theta}\alpha_i(\theta). \qquad (13)$$
This quantity has statistical properties similar to those of the Fisher information. That is, in the asymptotic case ($n \to \infty$) the variance of the PMM estimates is the inverse of the amount of information obtained at polynomial degree $s$:
$$\sigma^2_{(\theta)s} = J_{sn}^{-1}(\theta). \qquad (14)$$
For a comparative analysis of the relative accuracy of the estimates, let us use the concept of the variance coefficient [19][20][21][22]:
$$g_{(\theta)s} = \frac{\sigma^2_{(\theta)s}}{\sigma^2_{(\theta)1}}. \qquad (15)$$
This coefficient is the ratio of the variance of the PMM estimates obtained with a polynomial of degree $s$ to the variance of the PMM estimates obtained with the linear polynomial ($s = 1$).
Using form (13), it is easy to obtain an analytical expression for the amount of information, which depends on the EPD shape parameter through the central moments (5):
$$J_{3n} = \frac{n\,(\mu_6 - 6\mu_2\mu_4 + 9\mu_2^3)}{\mu_2\mu_6 - \mu_4^2}. \qquad (16)$$
Thus, for the asymptotic case ($n \to \infty$), expression (16) gives the theoretical variance of the PMM estimates at polynomial degree $s = 3$:
$$\sigma^2_{(\theta)3} = \frac{\mu_2\mu_6 - \mu_4^2}{n\,(\mu_6 - 6\mu_2\mu_4 + 9\mu_2^3)}. \qquad (17)$$
It is also known [3] that the variance of the estimate in the form of the arithmetic mean depends only on the variance of the random component and the sample size:
$$\sigma^2_{(\theta)mean} = \frac{\mu_2}{n}. \qquad (18)$$
Using expressions (11) and (18), it is not hard to obtain the variance reduction coefficient of the MLE compared to the arithmetic mean estimates:
$$g_{(\theta)MLE}(p) = \frac{\sigma^2_{(\theta)MLE}}{\sigma^2_{(\theta)mean}} = \frac{\Gamma^2(1/p)}{p\,(p-1)\,\Gamma(1 - 1/p)\,\Gamma(3/p)}. \qquad (19)$$
Using expressions (17), (18) and (19), graphs can be plotted (see Fig. 1) showing how the accuracy of the MLE and PMM estimates relative to the arithmetic mean estimates depends on the EPD shape parameter. Using expressions (17) and (19), it is also easy to obtain a function (its graph is presented in Fig. 2) that directly compares the accuracy (the asymptotic variances) of the MLE and PMM estimates:
$$g(p) = \frac{\sigma^2_{(\theta)3}}{\sigma^2_{(\theta)MLE}}. \qquad (20)$$
Obviously, the relative efficiency of the different methods depends solely on the EPD shape parameter $p$. Analyzing the dependencies shown in Fig. 1 and Fig. 2, it should be noted that over a fairly wide range of shape parameter values $p \in (2; 5)$ the theoretical efficiencies of the MLE and PMM estimates practically coincide. For sharp-peaked distributions ($p < 2$) the MLE efficiency can be substantially higher. As $p$ increases beyond 5, i.e. for essentially flat-topped distributions (uniform in the limit), the relative efficiency of the PMM estimates also increases.
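The asymptotic comparison can be reproduced numerically. The sketch below rests on assumptions that are not spelled out in the surviving text: the EPD density is taken proportional to exp(-|x - theta|^p / (p * sigma^p)), the MLE variance is taken as the inverse Fisher information for the center (used here for p >= 1), and the degree-3 PMM variance ratio is written through the cumulant coefficients gamma4 and gamma6 as 1 - gamma4^2 / (6 + 9*gamma4 + gamma6). The function names `g_mle` and `g_pmm3` are hypothetical.

```python
import math


def g_mle(p: float) -> float:
    """Asymptotic var(MLE) / var(mean) for the EPD center; assumes p >= 1."""
    log_g = (2.0 * math.lgamma(1.0 / p) - 2.0 * math.log(p)
             - math.lgamma(2.0 - 1.0 / p) - math.lgamma(3.0 / p))
    return math.exp(log_g)


def g_pmm3(p: float) -> float:
    """Asymptotic var(PMM, s = 3) / var(mean) for the EPD center."""
    lg1, lg3 = math.lgamma(1.0 / p), math.lgamma(3.0 / p)
    k4 = math.exp(math.lgamma(5.0 / p) + lg1 - 2.0 * lg3)        # mu4 / mu2**2
    k6 = math.exp(math.lgamma(7.0 / p) + 2.0 * lg1 - 3.0 * lg3)  # mu6 / mu2**3
    gamma4 = k4 - 3.0                    # 4th-order cumulant coefficient
    gamma6 = k6 - 15.0 * k4 + 30.0       # 6th-order cumulant coefficient
    return 1.0 - gamma4 ** 2 / (6.0 + 9.0 * gamma4 + gamma6)


for p in (1.0, 1.5, 2.0, 3.0, 5.0, 10.0):
    print(f"p = {p:4.1f}  g_MLE = {g_mle(p):.4f}  g_PMM3 = {g_pmm3(p):.4f}")
```

Both ratios equal 1 at p = 2, where all three estimates have the same asymptotic variance, and the scale parameter cancels out of both expressions, so only the shape parameter matters.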
At the end of this analysis it should be emphasized once again that the theoretical expressions for the ratios of the estimate variances (19), (20) and the corresponding graphs in Fig. 1 and Fig. 2 were obtained under the assumption that a priori information about the probability distribution of the measurement errors, i.e. the values of the parameters of the EPD of type (1), is available.

Statistical modeling
To verify experimentally the theoretical results on the effectiveness of the different estimation methods, statistical modeling by the Monte Carlo method was carried out. The R programming language was used for the implementation. This choice was made because R is freely distributed and offers a large number of libraries focused on statistical data analysis. Among them is the normalp package, which contains a set of functions for the generation, maximum likelihood estimation and visualization of random variables with the EPD [17].
Empirical values of the variance coefficients of the parameter estimates were used as quantitative criteria of the efficiency of the different methods. They are formed as ratios of $\hat{\sigma}^2_{(\theta)mean}$, $\hat{\sigma}^2_{(\theta)MLE}$ and $\hat{\sigma}^2_{(\theta)3}$, the empirical variances of the arithmetic mean, MLE and PMM (at $s = 3$) estimates obtained from the experiments.
The experiment was carried out both with and without a priori information about the probability distribution of the measurement errors. In the second case, a posteriori estimates of the parameters of model (1) (required for the maximum likelihood method) and estimates of the central moments of orders 2, 4 and 6 (required for the polynomial maximization method) were used instead of a priori information. Obviously, the reliability of the simulation results is also significantly influenced by the sample size $n$ and the number of experiments $m$.
The results of statistical modeling for different values of the EPD shape parameter ($p = 1 \div 10$) and sample sizes ($n = 20 \div 200$), obtained from $m = 10^4$ repeated experiments, are presented in Tab. 1 (with a priori information) and Tab. 2 (without a priori information).
Analysis of the theoretical and experimental values of the variance coefficients shows that the analytical calculations correlate with the results obtained by statistical modeling. The remaining differences arise (as noted earlier) because the expressions for the variances of the MLE and PMM estimates were obtained for the asymptotic case ($n \to \infty$). The data in Tab. 1 confirm that the difference between theoretical and experimental values decreases as the sample size increases.
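A much reduced version of such an experiment can be sketched in Python with the standard library only. This is an illustration under stated assumptions, not the R code used for the tables: the EPD is sampled through a gamma variate (assuming a density proportional to exp(-|x - theta|^p / (p * sigma^p))), only the case with a priori known central moments is covered, the MLE is omitted, and the degree-3 PMM equation is solved by Newton's method from the arithmetic mean. All function names are hypothetical.

```python
import math
import random


def epd_central_moment(k, sigma_p, p):
    """Even central moment of order k for a density proportional to exp(-|x|**p / (p * sigma_p**p))."""
    return p ** (k / p) * sigma_p ** k * math.exp(math.lgamma((k + 1) / p) - math.lgamma(1 / p))


def epd_sample(n, theta, sigma_p, p, rng):
    """Draw n values; |x - theta|**p / (p * sigma_p**p) follows a Gamma(1/p, 1) law."""
    scale = sigma_p * p ** (1.0 / p)
    return [theta + rng.choice((-1.0, 1.0)) * scale * rng.gammavariate(1.0 / p, 1.0) ** (1.0 / p)
            for _ in range(n)]


def pmm3_estimate(sample, mu2, mu4, mu6, tol=1e-10, max_iter=50):
    """Degree-3 PMM estimate of the center (Newton iterations from the mean)."""
    n = len(sample)
    mean = sum(sample) / n
    a, b = 3.0 * mu2 ** 2 - mu4, mu6 - 3.0 * mu2 * mu4
    theta = mean
    for _ in range(max_iter):
        d = [x - theta for x in sample]
        f = a * sum(v ** 3 for v in d) / n + b * (mean - theta)
        df = -3.0 * a * sum(v ** 2 for v in d) / n - b
        if df == 0.0:
            break
        step = f / df
        theta -= step
        if abs(step) < tol:
            break
    return theta


def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def empirical_variance_ratio(p, n, m, seed=1):
    """Empirical var(PMM, s = 3) / var(mean) over m repeated experiments of size n."""
    rng = random.Random(seed)
    mu2, mu4, mu6 = (epd_central_moment(k, 1.0, p) for k in (2, 4, 6))
    means, pmm = [], []
    for _ in range(m):
        x = epd_sample(n, 0.0, 1.0, p, rng)
        means.append(sum(x) / n)
        pmm.append(pmm3_estimate(x, mu2, mu4, mu6))
    return sample_variance(pmm) / sample_variance(means)


if __name__ == "__main__":
    for shape in (4.0, 8.0):
        print(shape, empirical_variance_ratio(shape, n=50, m=1000))
```

The sketch is restricted to flat-topped shapes (p > 2), where the Newton iteration from the mean behaves well; sharp-peaked shapes and the case without a priori information would need a more careful root search and sample estimates of the moments.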

Tab. 1 Variance coefficients of the estimates in the presence of a priori information
The experimental results also confirm the previously stated thesis that the accuracy of the MLE and PMM estimates is significantly affected by the presence or absence of a priori information on the properties of the error model (the exponential power distribution parameters for MLE and the even central moments up to the 6th order for PMM). The data in Tab. 2 reflect the important fact that, in the absence of a priori information on the EPD properties, for flat-topped distributions ($p > 2$) the efficiency of the PMM estimates is mostly higher than that of the MLE. This can be explained by the fact that under these conditions the uncertainty of the EPD shape parameter value affects the MLE more strongly than the uncertainty of the central moments affects the PMM estimates. The relative difference between the variances of the MLE and PMM estimates depends on both the shape parameter and the sample size, and it is especially significant (up to 30%) for the smallest samples and pronounced flatness.
Such results are especially important from a practical point of view, because in the vast majority of real situations there is no a priori information about the true values of the parameters of the random component model (measurement errors). For this situation, Fig. 3 shows the boundaries delimiting the areas of greatest efficiency (by the minimum variance criterion) of the different estimation methods: the arithmetic mean, MLE and PMM. These areas were also obtained by statistical modeling with the Monte Carlo method (for $m = 10^4$ experiments) at different values of the EPD shape parameter $p$ and sample size $n$.
Based on the totality of the above results, the following conclusions can be drawn:
- for the normal (Gaussian) distribution of errors, arithmetic mean estimates are the most effective (in terms of both accuracy and ease of implementation);
- for sharp-peaked distributions, the maximum likelihood method is generally more effective, but its application requires a large amount of sample data (the boundary separating the efficiency areas depends significantly on the sample size and is parabolic);
- for flat-topped distributions, the polynomial maximization method is more effective (the boundary separating the areas also depends on the sample size, although to a lesser extent).

Conclusion
The results obtained constitute the second part, completing the research initiated in [19] on the validity of applying the polynomial maximization method to the estimation of parameters of experimental data with the exponential power distribution.
The results make it possible to position the proposed approach as a compromise between simple estimates (mean, median and mid-range) and potentially more accurate maximum likelihood estimates, whose use requires additional a priori information. In this context, the advantages of the polynomial maximization method are:
- it provides additional opportunities (compared to simple nonparametric methods) to increase the accuracy of the informative parameter estimates by taking into account the properties of the measurement errors (a posteriori values of the central moments);
- it reduces the impact (compared to the parametric maximum likelihood method) of the lack of a priori information on the accuracy of the informative parameter estimates.
The obtained efficiency areas of the different methods make it possible to recommend a choice among them depending on the properties of the probabilistic error model (the value of the EPD shape parameter) and the size of the experimental data sample.