Likelihood Ratio Testing under Measurement Errors

We consider the likelihood ratio test of a simple null hypothesis (with density f_0) against a simple alternative hypothesis (with density g_0) in the situation where the observations X_i are mismeasured due to the presence of measurement errors. Thus, instead of X_i for i = 1, ..., n, we observe Z_i = X_i + √δ V_i with an unobservable parameter δ and an unobservable random variable V_i. When we ignore the presence of measurement errors and perform the original test, the probability of type I error differs from the nominal value, but the test is still the most powerful among all tests at the modified level. Further, we derive the minimax test of some families of misspecified hypotheses and alternatives. The test exploits the concept of pseudo-capacities elaborated by Huber and Strassen (1973) and Buja (1986). A numerical experiment illustrates the principles and performance of the novel test.


Introduction
Measurement technologies are often affected by random errors; if the goal of the experiment is to compare two probability distributions using data, the conclusion can be distorted when the data are affected by measurement errors. Statistical inference performed with mismeasured data is biased, and trends or associations in the data are deformed. This is common across a broad spectrum of applications, e.g., in engineering, physics, biomedicine, molecular genetics, chemometrics, and econometrics. Some observations can even remain undetected, e.g., in measurements of magnetic or luminous flux in analytical chemistry when the flux intensity falls below some flux limit. Indeed, we can hardly imagine real data free of measurement errors; the question is how severe the measurement errors are and what their influence on the data analysis is [1][2][3].
A variety of functional models have been proposed for handling measurement errors in statistical inference. Technicians, geologists, and other specialists are aware of this problem, and try to reduce the effect of measurement errors with various ad hoc procedures. However, this effect cannot be completely eliminated or substantially reduced unless we have some additional knowledge on the behavior of measurement errors.
There exists a rich literature on statistical inference in error-in-variables (EV) models, as evidenced by the monographs of Fuller [4], Carroll et al. [5], and Cheng and van Ness [6], and the references therein. The monographs [4] and [6] deal mostly with the classical Gaussian setup, while [5] discusses numerous inference procedures under a semi-parametric setup. Nonparametric methods in EV models are considered in [7,8] and in references therein, and in [9], among others. The regression quantile theory in the area of EV models was started by He and Liang [10]. Arias [11] used an instrumental variable estimator for quantile regression, considering biases arising from unmeasured ability and measurement errors. The papers dealing with practical aspects of measurement error models include [12][13][14][15][16], among others. Recent developments in treating the effect of measurement errors on econometric models were presented in [17] or [18]. The advantage of rank and signed rank procedures in measurement error models was discovered recently in [19][20][21][22][23][24]. The problem of interest in the present paper is to study how measurement errors can affect the conclusion of the likelihood ratio test.
The distribution function of measurement errors is considered unknown, up to zero expectation and unit variance. When we use the likelihood ratio test while ignoring the possible measurement errors, we can suffer a loss in both errors of the first and second kind. However, we show that under a small variance of measurement errors, the original likelihood ratio test is still most powerful, only at a slightly changed significance level.
On the other hand, we may consider the situation that H_0 or H_1 are classes of distributions of random variables Z + √δ V. Hence, both the hypothesis and the alternative are composite, as families H_0 and H_1; if they are bounded by alternating Choquet capacities of order 2, then we can look for a minimax test based on the ratio of the capacities, and/or on the ratio of the pair of the least favorable distributions of H_0 and H_1, respectively (cf. Huber and Strassen [25]).

Likelihood Ratio Test under Measurement Errors
Our primary goal is to test the null hypothesis H_0 that independent observations X = (X_1, ..., X_n) come from a population with a density f against the alternative H_1 that the true density is g, where f and g are fixed densities of interest. For identifiability, we shall assume that f and g are continuous and symmetric around 0. Although the alternative is the main concern of the experimenter, measurement errors or simply the nature of the problem may mean that the true alternative should be considered composite. Specifically, X_1, ..., X_n can be affected by additive measurement errors, which occurs in numerous fields, as illustrated in Section 1.
Hence the alternative is H_{1,δ}, under which the observations are Z_{i,δ} = X_i + √δ V_i, identically distributed with continuous density g_δ. Here, both under the hypothesis and under the alternative, V_i are independent random variables, unobservable with unknown distribution, independent of X_i; i = 1, ..., n. The parameter δ > 0 is also unknown; we only assume that E V_i = 0 and E V_i^2 = 1, for simplicity. The mismeasured, hence unobservable, X_i are assumed to have the density g under the alternative. Quite analogously, the mismeasured observations lead to a composite hypothesis H_{0,δ} under which the density of the observations is f_δ. If we knew f_δ and g_δ, we would use the Neyman-Pearson critical region

W = { z : ∏_{i=1}^n g_δ(z_i) / f_δ(z_i) > u }   (1)

with u determined so that P_{f_δ}(Z ∈ W) = α, with a significance level α. Evidently Indeed, notice that where the expectations are considered with respect to the conditional distribution; a similar equality holds for f_δ.
Combining the integration transmission in the conditional distribution, we obtain hence the size of the critical region W when used for testing H_0 against H_1 differs from α. We then ask how the critical region W in (1) behaves when it is used as a test of H_0. We shall approach this problem through an expansion of f_δ and g_δ for δ close to zero.
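The change of size can be illustrated numerically. The following sketch assumes a purely illustrative Gaussian setting that is not taken from the text: f = N(0, 1), g = N(0, σ²) with σ > 1, and V standard normal. In that setting the likelihood ratio is increasing in Σ x_i², so the Neyman-Pearson region has the form {Σ x_i² > c}, and the contaminated observations are N(0, 1 + δ) under the null hypothesis. The function names are hypothetical, and n is restricted to even values so that the chi-square tail has a closed form.

```python
import math

def chi2_sf_even(x, n):
    """P(chi-square with n d.f. > x) for even n, via the Erlang closed form."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, n // 2):
        term *= half / j          # (x/2)^j / j!
        total += term
    return math.exp(-half) * total

def critical_value(alpha, n):
    """Bisection for c with P(chi2_n > c) = alpha (error-free data)."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_sf_even(mid, n) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def actual_size(alpha, n, delta):
    """Size of {sum z_i^2 > c} when Z_i ~ N(0, 1 + delta) (assumed Gaussian V)."""
    c = critical_value(alpha, n)
    return chi2_sf_even(c / (1.0 + delta), n)

# Nominal level 0.05, n = 20, increasing error variance delta.
for delta in (0.0, 0.05, 0.1, 0.2):
    print(delta, round(actual_size(0.05, 20, delta), 4))
```

In this toy setting the size equals the nominal level at δ = 0 and grows with δ, because the same threshold c is applied to data with inflated variance.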

Approximations of Densities
Put f = f_0, g = g_0 for the densities of X under the hypothesis and the alternative, respectively. For identifiability, we shall assume that f_0 and g_0 are continuous and symmetric around 0. Denote by f_δ the density of Z_δ = X + √δ V. This means that X is affected by an additive measurement error √δ V, where V is independent of X and E V = 0, E V^2 = 1, E V^4 < ∞. Notice that if the densities of X and V are strongly unimodal, then that of Z is also strongly unimodal (see [26]). Under some additional conditions on f_0 and g_0, we shall derive approximations of f_δ and g_δ for small δ > 0. More precisely, we assume that both f_0 and g_0 have differentiable and integrable derivatives up to order 5. Then we have the following expansion of f_δ and a parallel result for g_δ:
Theorem 1. Assume that f_0 and g_0 are symmetric around 0 and strongly unimodal, with differentiable and integrable derivatives up to order 5. Then, as δ ↓ 0, where ϕ_V denotes the characteristic function of V. Taking the inverse Fourier transform on both sides, we obtain (3), taking the above assumptions on V into account.
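A second-order expansion of this kind can be checked numerically in the Gaussian case. The sketch below assumes the moment form f_δ ≈ f_0 + (δ/2) f_0'' + (δ²/24) E V^4 f_0^(4); this is an assumption about the shape of the expansion, not a quotation of the theorem. It takes f_0 and V standard normal, so that E V^4 = 3 and f_δ is exactly the N(0, 1 + δ) density. All function names are hypothetical.

```python
import math

def phi(x):
    """Standard normal density f_0."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def exact_density(x, delta):
    """Density of X + sqrt(delta) V for X, V independent N(0, 1)."""
    s2 = 1.0 + delta
    return math.exp(-0.5 * x * x / s2) / math.sqrt(2.0 * math.pi * s2)

def expansion(x, delta, m4=3.0):
    """Assumed expansion f_0 + (delta/2) f_0'' + (delta^2/24) m4 f_0''''."""
    d2 = (x * x - 1.0) * phi(x)                   # second derivative of phi
    d4 = (x ** 4 - 6.0 * x * x + 3.0) * phi(x)    # fourth derivative of phi
    return phi(x) + 0.5 * delta * d2 + delta ** 2 / 24.0 * m4 * d4

def max_error(delta):
    """Largest absolute error of the expansion on a grid over [-5, 5]."""
    grid = [k / 20.0 for k in range(-100, 101)]
    return max(abs(exact_density(x, delta) - expansion(x, delta)) for x in grid)

# The error divided by delta^2 should shrink as delta decreases.
for delta in (0.2, 0.1, 0.05, 0.01):
    err = max_error(delta)
    print(delta, err, err / delta ** 2)
```

The ratio of the error to δ² decreases with δ in this example, which is the behavior an o(δ²) remainder would produce.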
Consider the problem of testing the hypothesis H_0 that the observations are distributed according to the density f_0 against the alternative H_1 that they are distributed according to the density g_0. In parallel, we consider the hypothesis H_{0,δ} that the observations are distributed according to f_δ against the alternative H_{1,δ} that the true density is g_δ. Let Φ(x) be the likelihood ratio test with critical region . . , n. We know neither δ nor V, hence the test Φ* is just an application of the critical region W to the contaminated data Z_1, ..., Z_n. Thus, due to our lack of information, we use the test Φ even for testing H_{0,δ} against H_{1,δ}, and the performance of this test is of interest. This is described in the following theorem:
Theorem 2. Assume the conditions of Theorem 1. Then, as δ ↓ 0, the test Φ* is the most powerful even for testing H_{0,δ} against H_{1,δ}, with a modified significance level satisfying
Proof.
If f_0 is symmetric, then the derivative f_0^(k) is symmetric for k even and skew-symmetric for k odd, k = 1, ..., 4. Moreover, because |f_0(x)| and |f

Robust Testing
If the observations are mismeasured or contaminated, we observe Z_δ = Z + √δ V with unknown δ and unobservable V instead of Z. Hence, instead of simple f_0 and g_0, we are led to a composite hypothesis and alternative H and K. Following [25], we can try to find suitable 2-alternating capacities dominating H and K and to construct a pertaining minimax test. As before, we assume that Z and V are independent, E V = 0, E V^2 = 1, and E V^4 < ∞. Moreover, we assume that f_0 and g_0 are symmetric, strongly unimodal and differentiable up to order 5, with integrable derivatives and with increasing distribution functions F_0 and G_0, respectively. The measurement errors V are assumed to satisfy 1 ≤ E V^4 ≤ K with a fixed K, 0 < K < ∞. Hence the distribution of V is restricted to have tails lighter than the t-distribution with 4 degrees of freedom.
We shall construct a pair of 2-alternating capacities around specific subfamilies of f_0 and g_0. Let us determine the capacity around g_0; that for f_0 is analogous. By Theorem 1 we have We shall concentrate on the following family K* of densities (similarly for f_0): with fixed suitable ∆, K > 0. Indeed, under our assumptions, each g*_{δ,κ} ∈ K* is a positive and symmetric density satisfying Let G*_{δ,κ}(B), B ∈ B, be the probability distribution induced by the density g*_{δ,κ} ∈ K*, with B being the Borel σ-algebra. Then the set function is a pseudo-capacity in the sense of Buja [27], i.e., satisfying
Analogously, consider a density f_0, symmetric around 0 and satisfying the assumptions of Theorem 1, as a simple hypothesis. Construct the family H* of densities and the corresponding family of distributions F*_{δ,κ}(·), δ ≤ ∆, κ ≤ K, similarly as above. Then the set function is a pseudo-capacity in the sense of Buja [27]. Buja [27] showed that on any Polish space there exists a (possibly different) topology which generates the same Borel algebra and on which every pseudo-capacity is a 2-alternating capacity in the sense of [25].
Let us now consider the problem of testing the hypothesis H = {F* ∈ H* | F*(·) ≤ v(·)} against the alternative K = {G* ∈ K* | G*(·) ≤ w(·)}, based on an independent random sample Z_1, ..., Z_n. Assume that H* and K* satisfy (5). Then, following [27] and [25], we have the main theorem providing the minimax test of H against K with significance level α ∈ (0, 1): the test

φ(z) = 1 if ∏_{i=1}^n π(z_i) > C,
φ(z) = γ if ∏_{i=1}^n π(z_i) = C,
φ(z) = 0 if ∏_{i=1}^n π(z_i) < C,

where π(·) is a version of dw/dv(·) and C and γ are chosen so that E_v φ(Z) = α, is a minimax test of H against K of level α.
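The decision rule of a randomized test of this type can be sketched as follows. The function returns the rejection probability for a sample z, given a user-supplied ratio pi, a critical value C and a randomization constant gamma. All names are hypothetical, and the ratio in the usage lines is a toy choice (the density ratio of N(0, 4) to N(0, 1)), not the Radon-Nikodym derivative of the capacities.

```python
import math

def rejection_probability(z, pi, C, gamma):
    """Return 1 if prod pi(z_i) > C, gamma if it equals C, and 0 otherwise."""
    stat = math.prod(pi(zi) for zi in z)
    if stat > C:
        return 1.0
    if stat < C:
        return 0.0
    return gamma

def toy_ratio(x):
    """Illustrative ratio g/f for g = N(0, 4), f = N(0, 1); an assumption."""
    return 0.5 * math.exp(0.375 * x * x)

# Toy call: C and gamma are arbitrary here, not calibrated to any level.
print(rejection_probability([0.1, 2.5, -0.3], toy_ratio, C=1.0, gamma=0.5))
```

For continuous data the boundary case has probability zero, so gamma matters mainly when the statistic has atoms; the constants C and gamma would in practice be calibrated to the level α.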

Numerical Illustration
We observe independent Z_{1,δ}, ..., Z_{n,δ}, where Z_{i,δ} = X_i + √δ V_i for i = 1, ..., n, as described in Section 3, and where X_1, ..., X_n are independent and identically distributed (with a distribution function F) but unobserved. Let us further denote by Φ the distribution function of N(0, 1) and by Φ*_σ the distribution function of N(0, σ²). The primary task here is to test H_0: F ≡ Φ against H_1: F ≡ (1 − λ)Φ + λΦ*_σ with a fixed σ > 1 and λ ∈ (0, 1). We perform all the computations using the R software [28].
To describe our approach to computing the test, we need notation for the set of pseudo-distribution functions corresponding to the set of pseudo-densities H*, denoted as where Φ denotes the distribution function of the N(0, 1) distribution. Under the alternative, the set analogous to K* is defined as Our task is to approximate and Here, the functions F*_{δ,κ}(z) and G*_{δ,κ}(z) are evaluated over a grid with step 0.05. Then, the maximization in (8) and (9) is performed for values of z over the grid and over the four boundary values of (δ, κ)^T, which are equal to (0, 0)^T, (0, K)^T, (∆, 0)^T, and (∆, K)^T. Additional computations with 10 randomly selected pairs (δ, κ)^T over δ ∈ [0, ∆] and κ ∈ [0, K] revealed that the optimum is attained at one of the boundary values. Further, the Radon-Nikodym derivatives of V and W are estimated by a finite difference approximation in order to compute the test statistic.
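A grid computation of this kind can be sketched as follows. The sketch rests on several assumptions that are not stated in the text: the pseudo-distribution functions are taken to have the form F_0 + (δ/2) f_0' + (κδ²/24) f_0''', the alternative is taken to be the contamination mixture (1 − λ)Φ + λΦ*_σ, the envelopes V and W are taken as pointwise maxima over the four corner values of (δ, κ), and the ratio is approximated by forward differences. All function names are hypothetical.

```python
import math

def phi(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def Phi(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def pseudo_cdf(z, delta, kappa, sigma=1.0):
    """Assumed form F0 + (delta/2) f0' + (kappa delta^2/24) f0''' for f0 = N(0, sigma^2)."""
    u = z / sigma
    d1 = -u * phi(u) / sigma ** 2                   # f0'(z)
    d3 = (3.0 * u - u ** 3) * phi(u) / sigma ** 4   # f0'''(z)
    return Phi(u) + 0.5 * delta * d1 + kappa * delta ** 2 / 24.0 * d3

def pseudo_cdf_mixture(z, delta, kappa, lam, sigma):
    """Same construction for the assumed mixture (1 - lam) N(0, 1) + lam N(0, sigma^2)."""
    return ((1.0 - lam) * pseudo_cdf(z, delta, kappa)
            + lam * pseudo_cdf(z, delta, kappa, sigma))

def envelope(cdf, grid, Delta, K):
    """Pointwise maximum over the four corner values of (delta, kappa)."""
    corners = [(0.0, 0.0), (0.0, K), (Delta, 0.0), (Delta, K)]
    return [max(cdf(z, d, k) for d, k in corners) for z in grid]

def density_ratio(V, W):
    """Forward-difference approximation of dW/dV between neighboring grid points."""
    return [(W[i + 1] - W[i]) / (V[i + 1] - V[i]) for i in range(len(V) - 1)]

grid = [k / 20.0 for k in range(-100, 101)]         # step 0.05 on [-5, 5]
V = envelope(pseudo_cdf, grid, Delta=0.2, K=1.1)
W = envelope(lambda z, d, k: pseudo_cdf_mixture(z, d, k, lam=0.25, sigma=math.sqrt(3.0)),
             grid, Delta=0.2, K=1.1)
pi_hat = density_ratio(V, W)
print(len(pi_hat), min(pi_hat), max(pi_hat))
```

The parameter values ∆ = 0.2, K = 1.1, λ = 0.25 and σ² = 3 are those of the numerical study; the grid range [-5, 5] is an arbitrary choice.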
The test rejects H_0 if the test statistic ∏_{i=1}^n π(z_i) exceeds a critical value, which (as well as the p-value) can be approximated by a Monte Carlo simulation, i.e., by repeatedly generating random variables X_1, ..., X_n under H_0; we generate them 10,000 times here.
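Monte Carlo calibration of a critical value can be sketched as below. The statistic and the null sampler are passed in as functions; the usage lines plug in the sum of squares and N(0, 1) draws purely for illustration, whereas the test in the text would use the product of π(z_i). Function and parameter names are hypothetical.

```python
import random

def mc_critical_value(statistic, draw_h0, n, alpha=0.05, reps=10_000, seed=1):
    """Approximate the level-alpha critical value of `statistic` under H_0.

    statistic: function mapping a list of n observations to a number.
    draw_h0:   function taking a random.Random and returning one observation.
    """
    rng = random.Random(seed)
    sims = sorted(
        statistic([draw_h0(rng) for _ in range(n)]) for _ in range(reps)
    )
    # Pick the order statistic that leaves about alpha * reps values above it.
    k = reps - int(round(alpha * reps)) - 1
    return sims[max(0, min(reps - 1, k))]

def sum_of_squares(xs):
    return sum(x * x for x in xs)

def draw_standard_normal(rng):
    return rng.gauss(0.0, 1.0)

# Illustrative call: n = 20 observations, 10,000 replications under N(0, 1).
print(mc_critical_value(sum_of_squares, draw_standard_normal, n=20))
```

A Monte Carlo p-value can be obtained from the same simulated statistics as the fraction that exceeds the observed value.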
We perform the following particular numerical study. We compute the critical value of the α-test for n = 20 (or n = 40), λ = 0.25, σ² = 3, ∆ = 0.2, K = 1.1, and α = 0.05. Further, we are interested in evaluating the rejection probability of this test for data generated from with different values of λ and σ². Its values are shown in Table 1 (for n = 20) and Table 2 (for n = 40), and are approximated using (again) 10,000 randomly generated variables from (10). The boldface numbers are equal to the power of the test (under the simple H_1). The proposed test behaves sensibly: its power is higher for n = 40 than for n = 20; in addition, the power increases with increasing λ when σ² is held fixed, and with increasing σ² when λ is held fixed.

Conclusions
The likelihood ratio test of f_0 against g_0 is considered in the situation where the observations X_i are mismeasured due to the presence of measurement errors. Thus, instead of X_i for i = 1, ..., n, we observe Z_i = X_i + √δ V_i with an unobservable parameter δ and an unobservable random variable V_i. When we ignore the presence of measurement errors and perform the original test, the probability of type I error differs from the nominal value, but the test is still the most powerful among all tests at the modified level.
Under some assumptions on f_0 and g_0 and for δ < ∆, E V^4 ≤ K, we further construct a minimax likelihood ratio test of some families of distributions of Z_i = X_i + √δ V_i, based on capacities of the Huber-Strassen type. The test treats the composite null and alternative hypotheses, which cover all possible measurement errors satisfying the assumptions. The advantage of the novel test is that it keeps the probability of type I error below the desired value (α = 0.05) across all possible measurement errors. The test is performed in a straightforward way, while the user must specify particular (not excessively large) values of ∆ and K. We do not consider this a limiting requirement, because parameters corresponding to the severity of measurement errors are commonly chosen in a similar way in numerous measurement error models [5,23] or robust optimization procedures [29]. The critical value of the test can be approximated by simulation. The numerical experiment in Section 4 illustrates the principles and performance of the novel test.