SEQUENTIAL d-GUARANTEED ESTIMATE OF THE NORMAL MEAN WITH BOUNDED RELATIVE ERROR

In this paper, we continue our research on estimating the mean of the normal distribution under the prior information that this parameter is positive and very small. This information is modeled by a prior exponential distribution with a large intensity parameter. We consider the estimation problem with a guaranteed relative error, a formulation that is particularly relevant when small fractions are estimated. In addition to the restriction on the relative error, the procedure must have a given level of d-risk. We suggest a sequential procedure that stops when the posterior reliability of the estimate first reaches a given level 1 − β. The procedure is adapted to the problem of estimating harmful impurities in food products.


Introduction
As in [1], we consider the problem of estimating the mean θ of the normal distribution with known variance σ² under the assumption that the unknown value θ is a realization of a random variable ϑ with the prior exponential distribution F(θ) = 1 − exp{−λθ}, θ > 0. An estimate θ_ν = θ_ν(X_1, ..., X_ν), based on a random sample X_1, ..., X_ν with stopping moment ν, should satisfy the following restriction on its d-risk:

$$\mathbf{P}\{\, |\vartheta - \theta_\nu| > \Delta\,\theta_\nu \mid \theta_\nu = d \,\} \le \beta \quad \text{for all } d > 0,$$

where ∆ is the given relative accuracy and β bounds the d-risk. The posterior reliability of the estimate for a given relative accuracy is calculated. A Bayesian estimate of θ for a fixed number of observations n is found, and its "d-reliability" is calculated. Assuming that the values σ² and λ are known, a sequential estimation procedure is constructed. This procedure is based on the first crossing of the given level 1 − β by the posterior reliability (see [2,3] for examples of using this procedure).
The constructed procedure is adapted to the problem of estimating harmful impurities in food products. To our knowledge, no guaranteed procedure based on a fixed number of observations is known for the case when the constraints are imposed on the relative estimation error. The d-posterior approach makes it possible to construct a sequential guaranteed procedure (see [4-9] for the classical solutions of sequential guaranteed procedures). The simulation results show that, with high probability, the number of observations required by the sequential procedure is unacceptably large for practical use. It turned out that these large sample sizes correspond to very small values of θ generated by the prior distribution. We therefore considered a more practical situation in which the prior distribution is truncated at a certain threshold close to the standardized value of the parameter. In this case, the number of observations becomes acceptable for practical application.

Bayesian estimate of θ for a fixed number of observations
Formally, within the framework of the general theory of statistical inference, the problem consists in estimating the mean θ of the normal (θ, σ²) distribution with the loss function L(θ, d) = 1 if |θ − d|/d > ∆ and L(θ, d) = 0 otherwise. Here, ∆ is a given restriction on the accuracy of estimation. Since the statistical experiment has the sufficient statistic S = X_1 + ... + X_n, the family of distributions of the random sample is reduced to the family of normal (θ, σ²/n) density functions of the sample mean X̄ = S/n:

$$f_n(x \mid \theta) = \frac{\sqrt{n}}{\sigma\sqrt{2\pi}} \exp\Big\{-\frac{n\,(x-\theta)^2}{2\sigma^2}\Big\}.$$

It is assumed that the prior distribution of the random parameter ϑ has the density function

$$g(\theta) = \lambda \exp\{-\lambda\theta\}, \quad \theta > 0.$$

Now, we turn to the problem of estimating θ with the prior information taken into account. This problem consists in constructing an estimation procedure (ϕ, ν) with a decision function θ_ν satisfying the inequality

$$\mathbf{P}\{\, |\vartheta - \theta_\nu| > \Delta\,\theta_\nu \mid \theta_\nu = d \,\} \le \beta \quad \text{for all } d > 0.$$

The Bayesian estimate of θ, which we use for constructing the sequential procedure, is defined as the point at which the posterior reliability attains its maximum over a. The density function of the posterior distribution is calculated in [1]. It is the density function of the normal (T, σ²/n) distribution truncated to the positive semiaxis:

$$h(\theta \mid \bar{X}) = \frac{\sqrt{n}}{\sigma}\, \varphi\Big(\frac{\sqrt{n}\,(\theta - T)}{\sigma}\Big) \Big/ \Phi\Big(\frac{\sqrt{n}\,T}{\sigma}\Big), \quad \theta > 0,$$

where T = X̄ − λσ²/n, φ(·) is the density function, and Φ(·) is the distribution function of the standard normal law.
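The truncated normal form of the posterior can be checked by completing the square. The short derivation below is a sketch that assumes the exponential prior density λ exp{−λθ} on θ > 0 and writes x̄ for the observed sample mean; the symbol h for the posterior density is our own notation.

```latex
\begin{align*}
h(\theta \mid \bar{x})
  &\propto \exp\Big\{-\frac{n(\bar{x}-\theta)^2}{2\sigma^2}\Big\}\,
           \lambda \exp\{-\lambda\theta\}, \qquad \theta > 0, \\
  &\propto \exp\Big\{-\frac{n}{2\sigma^2}
           \Big[\theta^2 - 2\theta\Big(\bar{x} - \frac{\lambda\sigma^2}{n}\Big)\Big]\Big\} \\
  &\propto \exp\Big\{-\frac{n(\theta - T)^2}{2\sigma^2}\Big\},
  \qquad T = \bar{x} - \frac{\lambda\sigma^2}{n}.
\end{align*}
```

Normalizing over θ > 0 gives the factor 1/Φ(√n T/σ), which is the mass that the untruncated normal (T, σ²/n) law places on the positive semiaxis.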
Note that the inequality |ϑ − a|/a ≤ ∆ is equivalent to 1 − ∆ ≤ ϑ/a ≤ 1 + ∆ and, since the posterior distribution of ϑ is concentrated on the positive semiaxis, the posterior reliability of the decision a (the estimate of θ) is, for 0 < ∆ < 1,

$$\mathbf{P}\{\, |\vartheta - a| \le \Delta a \mid \bar{X} \,\} = \frac{\Phi\big(\sqrt{n}\,(a(1+\Delta) - T)/\sigma\big) - \Phi\big(\sqrt{n}\,(a(1-\Delta) - T)/\sigma\big)}{\Phi\big(\sqrt{n}\,T/\sigma\big)}. \tag{1}$$

In [10], it was shown that the estimates that provide the minimum of d-risk should be d-minimax. Unfortunately, methods for constructing such estimates are still not known, and the minimax estimate of θ has been found only for the case of a normal prior distribution (see [11]). The form of expression (1) for the posterior reliability gives no hope of obtaining an estimate with uniformly minimal d-risk in closed form. So the only thing that we can do is to find the Bayesian estimate, which is needed to construct the first crossing sequential procedure.
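The posterior reliability is straightforward to evaluate numerically. The following is a minimal Python sketch, assuming 0 < ∆ < 1, a > 0, and the truncated normal posterior with location T = x̄ − λσ²/n; the function names and the underflow guard are our own illustrative choices.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal distribution function Phi(x), via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def posterior_reliability(a, xbar, n, sigma, lam, delta):
    """Posterior probability that |theta - a| / a <= delta.

    Assumes 0 < delta < 1, a > 0, and a posterior that is normal(T, sigma^2 / n)
    truncated to theta > 0, with T = xbar - lam * sigma^2 / n.
    """
    s = sigma / math.sqrt(n)           # scale of the untruncated posterior
    t = xbar - lam * sigma ** 2 / n    # location parameter T
    upper = std_normal_cdf((a * (1.0 + delta) - t) / s)
    lower = std_normal_cdf((a * (1.0 - delta) - t) / s)
    norm = std_normal_cdf(t / s)       # mass of the untruncated normal on theta > 0
    if norm == 0.0:                    # numerical underflow for very negative T
        return 0.0
    return min(1.0, (upper - lower) / norm)


if __name__ == "__main__":
    # Parameter values sigma = delta = 0.01 and lam = 25 follow the simulation section.
    for n in (1, 100, 10_000, 1_000_000):
        print(n, posterior_reliability(0.1, 0.1, n, 0.01, 25.0, 0.01))
```

With x̄ = a = 0.1 the reliability is small for a single observation and approaches 1 only for very large n, which already hints at the sample sizes a relative-error constraint demands.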
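The Bayesian estimate θ_G = a(T) is the point at which the function

$$H(a) = \Phi\big(\sqrt{n}\,(a(1+\Delta) - T)/\sigma\big) - \Phi\big(\sqrt{n}\,(a(1-\Delta) - T)/\sigma\big)$$

attains its maximum over a. Setting the derivative of H with respect to a to zero, we obtain the Bayesian estimate

$$\theta_G = a(T) = \frac{T + \sqrt{T^2 + 4c}}{2},$$

where

$$c = \frac{\sigma^2}{2n\Delta}\, \ln\frac{1+\Delta}{1-\Delta}.$$

It is easy to see that when ∆ → 0, c → σ²/n and

$$a(T) \to \frac{T + \sqrt{T^2 + 4\sigma^2/n}}{2}.$$

Thus, the posterior reliability of the Bayesian estimate is

$$\frac{\Phi\big(\sqrt{n}\,(a(T)(1+\Delta) - T)/\sigma\big) - \Phi\big(\sqrt{n}\,(a(T)(1-\Delta) - T)/\sigma\big)}{\Phi\big(\sqrt{n}\,T/\sigma\big)}.$$

Proposition 1. The Bayesian estimate θ_G, based on a fixed number of observations n, has d-risk

$$R(d) = 1 - \frac{\Phi\big(\sqrt{n}\,(c/d + \Delta d)/\sigma\big) - \Phi\big(\sqrt{n}\,(c/d - \Delta d)/\sigma\big)}{\Phi\big(\sqrt{n}\,(d - c/d)/\sigma\big)}.$$

Proof. The d-risk of an estimate is the conditional expectation of the posterior risk with respect to the decision function. Since a(T) is a strictly increasing function of T, the d-risk is obtained by substituting d for a and the root of the equation θ_G = d for T in the a posteriori risk (4), that is, in one minus the posterior reliability. Simple calculations show that this root is T = d − c/d.

It is easy to check that the values of the d-risk of the Bayesian estimate fill the entire interval [0; 1]. Thus, the Bayesian estimate, like the sample mean X̄ in the classical approach, does not solve the problem of estimating the mean of the normal distribution with guaranteed limitations on the accuracy and reliability of the estimate.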
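The maximizer of the posterior reliability and the resulting d-risk can be sketched in Python. The sketch rests on our own derivation from the definitions in the text: setting the derivative of the posterior reliability with respect to a to zero gives a(a − T) = c with c = σ² ln((1+∆)/(1−∆))/(2n∆), so the positive root is a = (T + √(T² + 4c))/2 and the equation a(T) = d has the root T = d − c/d. The function names and the numerically stable branch for negative T are illustrative assumptions, not taken from the paper.

```python
import math


def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def c_const(n, sigma, delta):
    """Constant c = sigma^2 * ln((1 + delta) / (1 - delta)) / (2 * n * delta)."""
    return sigma ** 2 * math.log((1.0 + delta) / (1.0 - delta)) / (2.0 * n * delta)


def bayes_estimate(t, n, sigma, delta):
    """Positive root of a^2 - a*T - c = 0, i.e. the maximizer of the reliability."""
    c = c_const(n, sigma, delta)
    root = math.sqrt(t * t + 4.0 * c)
    # For T < 0 use the algebraically equivalent form to avoid cancellation.
    return 0.5 * (t + root) if t >= 0.0 else 2.0 * c / (root - t)


def reliability(a, t, n, sigma, delta):
    """Posterior reliability of the decision a for a given location T (0 < delta < 1)."""
    s = sigma / math.sqrt(n)
    norm = std_normal_cdf(t / s)
    if norm == 0.0:
        return 0.0
    num = (std_normal_cdf((a * (1.0 + delta) - t) / s)
           - std_normal_cdf((a * (1.0 - delta) - t) / s))
    return min(1.0, num / norm)


def d_risk(d, n, sigma, delta):
    """d-risk of the Bayesian estimate: posterior risk at a = d and T = d - c/d."""
    t = d - c_const(n, sigma, delta) / d
    return 1.0 - reliability(d, t, n, sigma, delta)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, d_risk(0.1, n, sigma=0.01, delta=0.01))
```

For fixed n the d-risk computed this way varies strongly with d, which illustrates why no fixed sample size can bound it uniformly by β.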

Sequential first crossing procedure
The first crossing procedure is defined by the stopping moment when the posterior probability (2) crosses the given reliability level 1 − β for the first time. After the experiment is stopped at some step n, the unknown value θ is defined by the calculation of the Bayesian estimate. In contrast to the analogous problem for estimation with the absolute error, the first crossing procedure (see [1]) does not stop with probability 1.
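A single trajectory of the first crossing rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the quantity compared with 1 − β is taken to be the posterior reliability evaluated at the Bayesian estimate, the value β = 0.05 in the usage example is our choice (the text does not state β), and the cap `max_n` is a practical guard that is not part of the procedure.

```python
import math
import random


def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def first_crossing(theta, sigma, lam, delta, beta, rng, max_n=100_000):
    """Simulate one trajectory for a fixed true mean theta.

    Returns (n, estimate, stopped): the number of observations used, the
    Bayesian estimate at that step, and whether the level 1 - beta was reached.
    """
    c_unit = sigma ** 2 * math.log((1.0 + delta) / (1.0 - delta)) / (2.0 * delta)
    total = 0.0
    estimate = float("nan")
    for n in range(1, max_n + 1):
        total += rng.gauss(theta, sigma)        # next observation X_n
        s = sigma / math.sqrt(n)                # posterior scale
        t = total / n - lam * sigma ** 2 / n    # T = sample mean - lam * sigma^2 / n
        c = c_unit / n
        root = math.sqrt(t * t + 4.0 * c)
        # Maximizer of the posterior reliability (stable form for negative T).
        estimate = 0.5 * (t + root) if t >= 0.0 else 2.0 * c / (root - t)
        norm = std_normal_cdf(t / s)
        if norm > 0.0:
            rel = (std_normal_cdf((estimate * (1.0 + delta) - t) / s)
                   - std_normal_cdf((estimate * (1.0 - delta) - t) / s)) / norm
            if rel >= 1.0 - beta:               # first crossing of the level 1 - beta
                return n, estimate, True
    return max_n, estimate, False


if __name__ == "__main__":
    rng = random.Random(2024)
    print(first_crossing(theta=0.1, sigma=0.01, lam=25.0, delta=0.01, beta=0.05, rng=rng))
```

Running the sketch for much smaller θ requires a far larger `max_n`, because the relative accuracy ∆θ shrinks together with θ while the observation noise σ does not.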
The boundary of the regions of continuation and stopping of observations is shown in Fig. 1 with an illustration of the trajectory of reaching this boundary.
The construction of an empirical analogue of this estimation procedure is described in Section 4 of [1]. Applications to evaluating the arsenic content of food products are also mentioned there, and they carry over without change to the procedure with the relative error.

Investigating the properties of the first crossing procedure by statistical modeling
The distribution of the stopping moment of the presented sequential procedure is investigated by statistical modeling within the same formulation of the problem as in [1]. Recall that the sanitary-epidemiological rules and regulations of the Russian Federation set the upper threshold θ_0 of arsenic content at 0.1-0.2 mg/kg. Assuming that the input quality level is Q = 0.99, we get the value λ = 25 for the threshold 0.2. Using the same arguments as in [1] for selecting the values of the variance and the guaranteed accuracy, we set ∆ = σ = 0.01.
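The choice of λ can be checked with a few lines of arithmetic. The sketch below assumes that the quality level Q is the prior probability that θ does not exceed the threshold, P(ϑ ≤ θ_0) = Q, under the exponential prior 1 − exp{−λθ}; this reading and the function names are our assumptions. Under it, the exact rate for Q = 0.99 and θ_0 = 0.2 is slightly below 25, so λ = 25 appears to be a rounded value that still meets the quality level.

```python
import math


def exponential_rate_for_quality(q, threshold):
    """Rate lam solving 1 - exp(-lam * threshold) = q."""
    return -math.log(1.0 - q) / threshold


def prior_cdf(theta, lam):
    """Exponential prior distribution function 1 - exp(-lam * theta)."""
    return 1.0 - math.exp(-lam * theta)


lam_exact = exponential_rate_for_quality(0.99, 0.2)   # about 23.03
coverage_at_25 = prior_cdf(0.2, 25.0)                 # about 0.9933

if __name__ == "__main__":
    print(lam_exact, coverage_at_25)
```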
We performed 10⁴ replications of the sequential first crossing procedure, where the stopping moment is determined by (3). An estimate of the distribution of ν is presented in Table 1. The obtained data indicate an unacceptably large number of observations compared with what we obtained under restrictions on the absolute error in [1]. The analysis of the values of θ generated by the prior distribution in our modeling shows that large values of ν arise only for very small values of θ (< 10⁻³). It is clear that such values of the arsenic content in a series of products are hardly possible in real production practice. Usually the content of arsenic, as well as of other harmful impurities, is close to half of the standardized value. In this regard, if we assume that the value θ in the experiment comes from a truncated exponential distribution, such as 1 − exp{−λ(θ − 0.1)} for θ ≥ 0.1, then the random number of observations in the experiment becomes acceptable with high probability (see Table 2).
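The dependence of the sample size on θ can be made tangible with a rough approximation of our own, which is not a result from the paper. When the posterior is concentrated well away from zero, the posterior reliability of an estimate close to θ is approximately 2Φ(∆θ√n/σ) − 1, so reaching the level 1 − β requires roughly n ≈ (z σ/(∆θ))² observations, where z is the standard normal quantile of order 1 − β/2. The value β = 0.05 and the function name below are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def approx_sample_size(theta, sigma=0.01, delta=0.01, beta=0.05):
    """Rough number of observations needed for relative accuracy delta at level 1 - beta.

    Uses the approximation reliability ~ 2 * Phi(delta * theta * sqrt(n) / sigma) - 1,
    which ignores the truncation at zero and the effect of the prior.
    """
    z = NormalDist().inv_cdf(1.0 - beta / 2.0)
    return math.ceil((z * sigma / (delta * theta)) ** 2)


if __name__ == "__main__":
    for theta in (0.001, 0.01, 0.1, 0.14):
        print(theta, approx_sample_size(theta))
```

Under this approximation the required n grows like 1/θ²: a few hundred observations suffice near θ = 0.1, while θ = 0.001 calls for several million, in line with the observation that only very small θ produce large stopping moments.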

Conclusions
The paper presents a solution of the statistical problem of estimating the mean θ of the normal distribution under the prior information that the estimated parameter is positive and small, when the number of observations is determined by the given restrictions on the relative accuracy and reliability of the estimate. As noted in the Introduction, we do not know of a guaranteed procedure for estimating θ from a fixed number of observations when the restrictions relate to the relative estimation error. The d-posterior approach to this problem led us to the construction of a sequential guaranteed procedure. Notably, this approach does not require a small probability of the given deviations ∆ of the estimate (a random variable, a statistic) from the fixed (given) value θ, but provides the guaranteed probability (1 − β) that the random parameter ϑ falls into the given neighbourhood of the estimate obtained in the statistical experiment.
The results of simulating the stopping moment distribution show that, with high probability, the number of observations required by the sequential procedure may be unacceptably large for practical use. It was revealed that these large sample sizes arise only in cases when the prior distribution "throws out" an excessively small value of θ. If we assume that extremely small values of θ do not occur in practice, which is quite true for a number of food products containing arsenic, then the proposed sequential procedure requires quite acceptable numbers of observations.