Accessing the Power of Tests Based on Set-Indexed Partial Sums of Multivariate Regression Residuals

The aim of the present paper is to establish an approximation method for the limiting power functions of tests based on Kolmogorov-Smirnov and Cramér-von Mises functionals of set-indexed partial sums of multivariate regression residuals. The limiting powers appear as vectorial boundary crossing probabilities. Their upper and lower bounds are derived by extending existing results for shifted univariate Gaussian processes documented in the literature. The application of the multivariate Cameron-Martin translation formula on the space of high-dimensional set-indexed continuous functions is demonstrated. The rate of decay of the power function to a preassigned value α is also studied. Our consideration is mainly for the trend plus signal model including the multivariate set-indexed Brownian sheet and pillow. The simulation shows that the approach is useful for analyzing the performance of the test.


Introduction
Investigating the partial sums of least squares residuals has been shown to be a reasonable and powerful tool for testing the adequacy of an assumed multivariate regression model; see Somayasa et al. [1][2][3][4]. The development of the technique was motivated by works proposed mainly for detecting changes in parameters as well as for detecting the existence of boundaries in univariate spatial regression; see [5][6][7][8] for references. The rejection region is constructed based on either Kolmogorov-Smirnov (KS) or Cramér-von Mises (CvM) functionals of the processes. It was shown in the literature cited above that the limiting power function of the test appears as a type of boundary crossing probability involving a shifted multidimensional Gaussian process.
To understand the objective of this paper in more detail, we briefly review how such a probability appears. Let $\mathbf{Z}_p := (Z^{(i)})_{i=1}^{p}$ be the $p$-dimensional set-indexed Brownian sheet defined on a probability space $(\Omega, \mathcal{B}(\Omega), \mathbb{P})$, say with sample paths in $\mathcal{C}^p(\mathcal{B}(\mathbf{G})) := \times_{i=1}^{p} \mathcal{C}(\mathcal{B}(\mathbf{G}))$ and the control measure $P_0$, where $P_0$ is a probability measure on $(\mathbf{G}, \mathcal{B}(\mathbf{G}))$, $\mathbf{G} := \prod_{k=1}^{d} [a_k, b_k]$, and $a_k < b_k$, for $k = 1, \ldots, d$. We refer the reader to [9,10] for a well-documented notion of $\mathcal{C}(\mathcal{B}(\mathbf{G}))$. In the literature on Gaussian processes, $\mathbf{Z}_p$ is frequently called the $p$-dimensional Gaussian white noise having $P_0$ as the control measure, cf. [11], p. 13, where for any $f \in L_2(P_0, \mathbf{G})$, $h_f$ is defined as $h_f(A) := \int_A f \, dP_0$. Under mild conditions, [1][2][3] showed, after a suitable localization of the regression function, that the sequence of the partial sums of the least squares residuals obtained from the multivariate regression model converges, when $\mathbf{g} \notin \mathbf{W}^p := \times_{i=1}^{p} \mathbf{W}$, to a $p$-dimensional signal plus noise model defined by where $\Sigma > 0$ means that $\Sigma$ is positive definite, and for $A \in \mathcal{B}(\mathbf{G})$, Journal of Applied Mathematics provided that $\{f_1, \ldots, f_m\}$ builds an ONB of $\mathbf{W}$ in $L_2(P_0, \mathbf{G}) \cap BV_H(\mathbf{G})$. Thereby $h_{\mathbf{g}} := (h_{g_i})_{i=1}^{p}$ and $BV_H(\mathbf{G})$ is the space of functions on $\mathbf{G}$ with bounded variation in the sense of Hardy. It is worth mentioning that the notion of $BV_H(\mathbf{G})$ is a direct extension of the definition of $BV_H([a_1, b_1] \times [a_2, b_2])$ formulated in [12] to higher dimensional space.
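Here the notation $Z^{(i)}(\mathbf{t})$ stands for $Z^{(i)}(\prod_{k=1}^{d} [a_k, t_k])$, cf. [8]. Throughout the paper, $\Sigma^{-1/2}\,\mathrm{pr}_{\mathbf{W}^{\perp}_{\mathcal{H}_{\mathbf{Z}_p}}} h_{\mathbf{g}}$ will be denoted by $\varphi_{\mathbf{g}}$ and $\mathrm{pr}^{*}_{\mathbf{W}^{\perp}_{\mathcal{H}_{\mathbf{Z}_p}}} \mathbf{Z}_p$ by $\mathbf{W}_{\mathbf{f},P_0}$, for the sake of brevity. It was established in [1][2][3] that $\mathbf{W}_{\mathbf{f},P_0}$ is a projection of $\mathbf{Z}_p$ onto the orthogonal complement of $\mathbf{W}_{\mathcal{H}_{\mathbf{Z}_p}}$, which is a finite-dimensional subspace of the so-called reproducing kernel Hilbert space (RKHS) of $\mathbf{Z}_p$, denoted by $\mathcal{H}_{\mathbf{Z}_p}$, given by with $L_2^p(P_0, \mathbf{G}) := \times_{i=1}^{p} L_2(P_0, \mathbf{G})$. In the literature mentioned above, the process $\mathbf{W}_{\mathbf{f},P_0}$ is called the $p$-dimensional set-indexed residual partial sums limit process with the control measure $P_0$. Hence, the process $\mathbf{Z}_p$ itself and the $p$-dimensional set-indexed Brownian pillow $\mathbf{Z}^0_p$ are special cases of $\mathbf{W}_{\mathbf{f},P_0}$, respectively, with $f_0 \equiv 0$ and $f_1 \equiv 1$. The control measure $P_0$ appearing in the process determines the design under which the experiment was constructed; see [4] for details.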
It was shown by using the well-known continuous mapping theorem that the limiting power functions of the size $\alpha$ KS and CvM type tests for testing the hypothesis are given, respectively, by the following complicated formulas: where $\|\cdot\|_{\mathbb{R}^p}$ stands for the Euclidean norm, whereas $t_{\mathbf{W}_{\mathbf{f},P_0}}$ and $q_{\mathbf{W}_{\mathbf{f},P_0}}$ are constants that satisfy Because computing $t_{\mathbf{W}_{\mathbf{f},P_0}}$ and $q_{\mathbf{W}_{\mathbf{f},P_0}}$ as well as the power of the test becomes difficult as the dimension of the experimental region and $p$ grow, the implementation of the test in practice is restricted. Approximation by Monte Carlo simulation has been proposed in [1][2][3]. Some attempts at establishing a concrete computation procedure by generalizing the principal component approach proposed, e.g., by MacNeill [5,6] and Stute [13] for some univariate Gaussian processes on a line have led to incorrect results.
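The Monte Carlo idea mentioned above can be sketched as follows. This is only a minimal illustration under simplifying assumptions that are not taken from the paper: the process is the $p$-dimensional Brownian sheet on the unit square with Lebesgue control measure and identity covariance (not the residual limit process, which would additionally require the projection step), the supremum is taken over rectangles anchored at the origin on an n × n grid, and the function names `sup_norm_brownian_sheet` and `ks_critical_value` are hypothetical.

```python
import math
import random


def sup_norm_brownian_sheet(n, p, rng):
    """Max over grid points t of the Euclidean norm of a p-dim sheet on [0, t]."""
    norm_sq = [[0.0] * n for _ in range(n)]
    for _ in range(p):  # independent components (assumption: Sigma = identity)
        column_sums = [0.0] * n
        for i in range(n):
            row_sum = 0.0
            for j in range(n):
                # Each cell of area 1/n^2 carries an independent N(0, 1/n^2) increment.
                row_sum += rng.gauss(0.0, 1.0) / n
                # column_sums[j] is now the sheet value on the rectangle up to cell (i, j).
                column_sums[j] += row_sum
                norm_sq[i][j] += column_sums[j] ** 2
    return math.sqrt(max(max(row) for row in norm_sq))


def ks_critical_value(alpha, n=20, p=2, runs=500, seed=0):
    """Empirical (1 - alpha)-quantile of the simulated KS functional."""
    rng = random.Random(seed)
    sims = sorted(sup_norm_brownian_sheet(n, p, rng) for _ in range(runs))
    index = max(0, min(runs - 1, math.ceil((1.0 - alpha) * runs) - 1))
    return sims[index]


if __name__ == "__main__":
    print(ks_critical_value(0.05))
```

The grid size and the number of runs are kept small so that the sketch runs quickly; a finer grid reduces the discretization bias of the supremum at the price of longer run time.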
Since analytical computation of $\Psi_{\mathbf{W}_{\mathbf{f},P_0}}(t_{\mathbf{W}_{\mathbf{f},P_0}}, \varphi_{\mathbf{g}})$ and $\Upsilon_{\mathbf{W}_{\mathbf{f},P_0}}(q_{\mathbf{W}_{\mathbf{f},P_0}}, \varphi_{\mathbf{g}})$ is impossible, the purpose of the present paper is to establish an approximation procedure for these functions. As suggested in [14], p. 315, and [15], pp. 423-424, studying the power function is important for evaluating the performance of the test, especially its rate of decay to $\alpha$. Therefore in this paper we investigate the upper and lower bounds for (6) by considering the results for the univariate Brownian sheet and Brownian pillow presented in Janssen [17] and Hashorva [18,19]. Upper and lower bounds for the power function of goodness-of-fit tests involving multiparameter Brownian processes have been studied by Bass [20]. The RKHS of $\mathbf{W}_{\mathbf{f},P_0}$ is crucial for our results. By Theorem 4.1 in [11] (factorization theorem) if there exists a family then the corresponding RKHS is given by Furthermore, the inner product and the corresponding norm on $\mathcal{H}_{\mathbf{W}_{\mathbf{f},P_0}}$ are denoted, respectively, by $\langle \cdot, \cdot \rangle_{\mathcal{H}_{\mathbf{W}_{\mathbf{f},P_0}}}$ and $\|\cdot\|_{\mathcal{H}_{\mathbf{W}_{\mathbf{f},P_0}}}$. For example, the RKHS of $\mathbf{Z}^0_p$ is given by with such that $h_i(A) = \int_A \ell_i(\mathbf{t}) \, P_0(d\mathbf{t})$, $i = 1, 2$.
The rest of the present paper is organized as follows. In Section 2 we derive the upper and lower bounds for the power functions of the KS and CvM tests by applying the Cameron-Martin translation formula for the multivariate process $\mathbf{W}_{\mathbf{f},P_0}$. The rate of decay of the power to $\alpha$ is also discussed. An alternative method of obtaining the bounds of the power function is proposed in Section 3. In Section 4 we propose the Neyman-Pearson test, which is a most powerful test. The rate of decay of its power to $\alpha$ is also compared with those of the KS and CvM tests. The results are justified by simulation in Section 5. The paper closes with a concluding remark in Section 6.

Rate of Decay of the Power of Tests
Our final goal in this section is to obtain an expression for the rate of decay of both $\Psi_{\mathbf{W}_{\mathbf{f},P_0}}(t_{\mathbf{W}_{\mathbf{f},P_0}}, \varphi_{\mathbf{g}})$ and $\Upsilon_{\mathbf{W}_{\mathbf{f},P_0}}(q_{\mathbf{W}_{\mathbf{f},P_0}}, \varphi_{\mathbf{g}})$ to the preassigned number $\alpha \in (0, 1)$ representing the size of the test. First we derive their upper and lower bounds by generalizing the method proposed in [21] concerning bounds for the probability of a shifted event; see also Theorem 7.3 in [11] for comparison. Second, we apply the technique studied in [17] to get the result. As reported in [17] and the references cited therein, the upper and lower bounds for the power of a signal detection test were studied there by applying the Cameron-Martin density formula for a shifted measure. The rate of decay was obtained by means of the mean value theorem.
Throughout this work let $\mathbf{P}$ be the probability distribution of $\mathbf{W}_{\mathbf{f},P_0}$ and let $\mathbf{P}_{\mathbf{h}}$ be a probability measure on $\mathcal{C}^p(\mathcal{B}(\mathbf{G}))$, defined by Then the Cameron-Martin density of $\mathbf{P}_{\mathbf{h}}$ with respect to $\mathbf{P}$ for any $\mathbf{h} \in \mathcal{H}_{\mathbf{W}_{\mathbf{f},P_0}}$ is given by where $L$ is a bilinear form, such that This general formula can be obtained by extending the formula for the univariate model presented in [20], Theorem 5.1 of [11], [17], or [22] to higher dimensional set-indexed Gaussian processes.
The following theorem is already well known in the literature mentioned above; however, the proof is given there only for the case of a Gaussian random vector in Euclidean space with zero mean and identity covariance matrix (canonical Gaussian Euclidean random vector); see [21] and [11], p. 53. In this paper we state the theorem again specifically for the process $\mathbf{W}_{\mathbf{f},P_0}$ on $\mathcal{C}^p(\mathcal{B}(\mathbf{G}))$. Although the result for $\mathbf{W}_{\mathbf{f},P_0}$ follows directly from that of [11,21], we nevertheless present the proof in order to show how the inequality for higher dimensional set-indexed Gaussian processes is derived; see the appendix of this work.
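Theorem 1 (Li and Kuelbs [21] and Lifshits [11]). Let $\mathbf{E}$ be any subset of $\mathcal{C}^p(\mathcal{B}(\mathbf{G}))$ and $\theta(\mathbf{E}) \in \mathbb{R}$ be any constant, such that $\theta(\mathbf{E}) = \Phi^{-1}(\mathbf{P}(\mathbf{E}))$. Then for any $\mathbf{h} \in \mathcal{H}_{\mathbf{W}_{\mathbf{f},P_0}}$, it holds true that

$$\Phi\!\left(\theta(\mathbf{E}) - \frac{L(\mathbf{h}, \mathbf{h})}{\|\mathbf{h}\|_{\mathcal{H}_{\mathbf{W}_{\mathbf{f},P_0}}}}\right) \le \mathbf{P}(\mathbf{E} - \mathbf{h}) \le \Phi\!\left(\theta(\mathbf{E}) + \frac{L(\mathbf{h}, \mathbf{h})}{\|\mathbf{h}\|_{\mathcal{H}_{\mathbf{W}_{\mathbf{f},P_0}}}}\right), \qquad (14)$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution.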
The following corollary, which gives an expression for the rate of decay of $\mathbf{P}(\mathbf{E} - \mathbf{h})$ to $\mathbf{P}(\mathbf{E})$, is an immediate implication of Theorem 1. The rate of decay describes how fast the distance between $\mathbf{P}(\mathbf{E} - \mathbf{h})$ and $\mathbf{P}(\mathbf{E})$ vanishes, cf. [17][18][19].
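Corollary 2. Let $\mathbf{E}$ be an arbitrary subset of $\mathcal{C}^p(\mathcal{B}(\mathbf{G}))$ and $\theta(\mathbf{E}) \in \mathbb{R}$ be any constant, such that $\theta(\mathbf{E}) = \Phi^{-1}(\mathbf{P}(\mathbf{E}))$. Then under the assumption $0 < L(\mathbf{h}, \mathbf{h})$, we have, for any $\mathbf{h} \in \mathcal{H}_{\mathbf{W}_{\mathbf{f},P_0}}$, Proof. We apply the technique used in proving Lemma 5 of [17]. By (14) presented in Theorem 1 and by using the symmetry of $\Phi$, it holds that for some mean value $\xi \in (\theta(\mathbf{E}),$ where $\phi$ is the probability density function of $N(0, 1)$. Since $\max\{\phi(x) : -\infty < x < \infty\} = 1/\sqrt{2\pi}$, then we have Conversely, by the inequality $\Phi(\theta(\mathbf{E}) - L(\mathbf{h}, \mathbf{h})/\|\mathbf{h}\|_{\mathcal{H}_{\mathbf{W}_{\mathbf{f},P_0}}}) \le \mathbf{P}(\mathbf{E} - \mathbf{h})$ of (14), we can derive the following result: for some mean value Since $L(\mathbf{h}, \mathbf{h}) > 0$, by the preceding result, we get which establishes the proof.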
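The mean value argument in the proof can be checked numerically. The sketch below is only an illustration: it writes `c` for the shift $L(\mathbf{h}, \mathbf{h})/\|\mathbf{h}\|$ (a hypothetical shorthand), evaluates the two bounds of the form $\Phi(\Phi^{-1}(\mathbf{P}(\mathbf{E})) \mp c)$, and verifies that their distance to $\mathbf{P}(\mathbf{E})$ never exceeds $c/\sqrt{2\pi}$, which is what the bound $\max \phi = 1/\sqrt{2\pi}$ yields. The function names are assumptions made for this example.

```python
from math import pi, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def shift_bounds(prob_e, c):
    """Lower and upper bounds Phi(theta - c), Phi(theta + c), theta = Phi^{-1}(prob_e)."""
    theta = STD_NORMAL.inv_cdf(prob_e)
    return STD_NORMAL.cdf(theta - c), STD_NORMAL.cdf(theta + c)


def decay_bound(c):
    """Mean value bound: c times the maximum of the standard normal density."""
    return c / sqrt(2.0 * pi)


if __name__ == "__main__":
    prob_e = 0.05
    for c in (0.1, 0.5, 1.0):  # illustrative shift sizes, not taken from the paper
        lower, upper = shift_bounds(prob_e, c)
        gap = max(upper - prob_e, prob_e - lower)
        print(f"c={c}: lower={lower:.5f}, upper={upper:.5f}, "
              f"gap={gap:.5f}, bound={decay_bound(c):.5f}")
```

As the shift `c` tends to zero, both bounds collapse to $\mathbf{P}(\mathbf{E})$, which is the sense in which the rate of decay is controlled by the size of the shift.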
When the model is either Hence, when we are dealing with the $p$-dimensional set-indexed Brownian sheet and the $p$-dimensional set-indexed Brownian pillow, Inequality (14), respectively, becomes The corresponding rates of decay can be obtained, respectively, as follows: In light of the preceding results, we can state the upper and lower bounds as well as the rates of decay for the power Since $\mathbf{P}$ is the distribution of $\mathbf{W}_{\mathbf{f},P_0}$, then Analogously, let Then $\Upsilon_{\mathbf{W}_{\mathbf{f},P_0}}(q_{\mathbf{W}_{\mathbf{f},P_0}}, \varphi_{\mathbf{g}}) = \mathbf{P}(\mathbf{F} - \varphi_{\mathbf{g}})$. Thus, by considering these two representations, we have on the basis of Theorem 1 and Corollary 2 the following summary concerning the bounds for the power of the KS and CvM type tests.

Corollary 3. Suppose that $\varphi_{\mathbf{g}} \in \mathcal{H}_{\mathbf{W}_{\mathbf{f},P_0}}$; then, for $\alpha \in (0, 1)$, it holds that Furthermore, we have simple formulas for the rate of decay of where in the context of model check, the norm of $\varphi_{\mathbf{g}}$ related to the processes $\mathbf{Z}_p$ and $\mathbf{Z}^0_p$ is given by

Corollary 3 says that the rate of decay or convergence of the power function to $\alpha$ in the case of $\mathbf{Z}_p$ as well as $\mathbf{Z}^0_p$ depends on the norm of the trend. A model with a small-norm trend leads to faster decay. Conversely, a model with a large-norm trend results in slower decay. For both models, the norm can be calculated concretely. It is clear that both tests achieve their sizes as the trends vanish. Indeed, the work of Samorodnitsky [23] can be incorporated in the estimation of $\Psi_{\mathbf{W}_{\mathbf{f},P_0}}(t, \varphi_{\mathbf{g}})$, for any large real number $t$. In Section 5 we demonstrate by simulation the behavior of the power functions of the KS and CvM tests as summarized in Corollary 3, to give an empirical study of the rate of decay of the power functions.

Alternative Approaches
In this section, other formulas for the upper and lower bounds of the power of the KS and CvM tests involving the $p$-dimensional set-indexed Brownian sheet and pillow models are derived. Our results are obtained by generalizing the approach of [18,19], which confined the investigation to the one-dimensional Kolmogorov type boundary noncrossing probability involving the so-called univariate ordinary Brownian sheet and pillow.
To simplify the notation, we restrict attention to the case of a two-dimensional experimental region where where and the $i$th component of $\mathrm{pr}_{\mathbf{W}^{p\perp}} \Sigma^{-1/2} \mathbf{g}$ is given by with $\sigma^{*}_{ij}$ denoting the $(i, j)$th element of $\Sigma^{-1/2}$, say, for $i, j = 1, \ldots, p$. Furthermore, for the $\mathbf{Z}^0_p$ model, we have where

Proof. By using the rule for the probability of a complement, we get for the $\mathbf{Z}_p$ model where by using a transformation of variables, it can be further expressed as Next, the Cameron-Martin formula (12) for the $p$-dimensional set-indexed Brownian sheet implies Since $\|\mathbf{y}(A)\|_{\mathbb{R}^p} < t_{\mathbf{Z}_p}$ means $-t_{\mathbf{Z}_p} < y_i(A) < t_{\mathbf{Z}_p}$, then under the indicator $\mathbf{1}\{\|\mathbf{y}(A)\|_{\mathbb{R}^p} < t_{\mathbf{Z}^0_p}, \forall A \in \mathcal{B}(\mathbf{G})\}$ we get, by recalling the integration by parts formula on $\mathbf{G}$, cf. [24,25], and the assumption that each such component is constant on the boundary of $\mathbf{G}$; for the $\mathbf{Z}_p$ model we get Thus, the lower bound in (30) is established. To prove the upper bound, we start with the following inequality: By applying a technique similar to that used in deriving the preceding result and the implication under the indicator $\mathbf{1}\{\|\mathbf{y}(A)\| \le (1/2)\, t_{\mathbf{Z}_p}, \forall A \in \mathcal{B}(\mathbf{G})\}$ we have, by integration by parts, the following inequality: completing the proof for the $\mathbf{Z}_p$ model. To prove the lower and upper bounds (33) for the $\mathbf{Z}^0_p$ model, we start with the equality Next, by integration by parts and the assumption that $\varphi_{\mathbf{g}} \in \mathcal{H}_{\mathbf{Z}^0_p}$ and that the components are constant on the boundary of $\mathbf{G}$, we have under $\mathbf{1}\{\|\mathbf{y}(A)\|_{\mathbb{R}^p} < t_{\mathbf{Z}^0_p}, \forall A \in \mathcal{B}(\mathbf{G})\}$ and the fact establishing the lower bound in (33). An argument similar to that used in the case of the $\mathbf{Z}_p$ model can be applied in deriving the upper bound of $\Psi_{\mathbf{Z}^0_p}(t_{\mathbf{Z}^0_p}, \varphi_{\mathbf{g}})$ as follows: establishing the proof.

Now we can derive other formulas for the rate of decay of $\Psi_{\mathbf{Z}_p}(t_{\mathbf{Z}_p}, \varphi_{\mathbf{g}})$ and $\Psi_{\mathbf{Z}^0_p}(t_{\mathbf{Z}^0_p}, \varphi_{\mathbf{g}})$ to $\alpha$ by applying a method similar to that used in deriving the formula in Corollary 3. However, Theorem 4 leads to computationally more complicated results.

Corollary 5. Under the condition of Theorem 4, it holds true that for some mean values (47)

Proof. From Inequality (30), we have, by applying the mean value theorem, for some mean value lying in the interval Conversely, based on the lower bound formula (30), we get for some mean value within the interval Thus it can be concluded that $\Psi_{\mathbf{Z}_p}(t_{\mathbf{Z}_p}, \varphi_{\mathbf{g}}) - \alpha$ lies in the following closed interval: In particular, if the two mean values are taken to be the same, then establishing the proof.

Corollary 6. Under the condition of Theorem 4, we have for some mean values specified above. If the two mean values are chosen to be the same, then

Comparison to Neyman-Pearson Test
Our aim in this section is to establish a nonrandomized Neyman-Pearson test for the hypothesis defined in the preceding section. It is well known in the literature on test theory that the Neyman-Pearson test constitutes a most powerful (MP) test for simple hypotheses; see, e.g., Theorem 3.2.1 in [15]. If some criterion is satisfied, the test can be extended to a uniformly most powerful (UMP) test for composite hypotheses. In this section the behavior of the power function, including the rate of decay to $\alpha$, will be investigated and compared to those of the KS and CvM type tests studied in the preceding section. Let $\mathbf{V}$ be a linear subspace generated by a set of known and orthonormal regression functions $\{f_1, \ldots, f_m, f_{m+1}, \ldots, f_q\} \subset L_2(P_0, \mathbf{G}) \cap BV_H(\mathbf{G})$ including $\mathbf{W} = [f_1, \ldots, f_m]$. In this section we consider the hypothesis $H_0 : \mathbf{g} \in \mathbf{W}^p$ against $H_1 : \mathbf{g} \in \mathbf{V}^p$ instead of $H_0 : \mathbf{g} \in \mathbf{W}^p$ against $H_1 : \mathbf{g} \notin \mathbf{W}^p$. The former is actually the common framework of model check for multivariate regression, in which one tests whether or not $\mathbf{g} \in \mathbf{W}^p$ while observing $\mathbf{g} \in \mathbf{V}^p$; see [26] for reference. Suppose there exist $\mathbf{g}_1 \in \mathbf{W}^p$ and $\mathbf{g}_2 \in \mathbf{V}^p \cap \mathbf{W}^{p\perp}$, such that $\mathbf{g} = \mathbf{g}_1 \oplus \mathbf{g}_2$. It is enough to consider the simple hypotheses Hence the $p$-dimensional set-indexed partial sums process of the residuals is given by $\mathbf{Y} = \mathbf{W}_{\mathbf{f},P_0}$, when $H_0$ is true; otherwise The following theorem presents an MP test of size $\alpha$ for testing (59). Here we exhibit again the application of the Cameron-Martin density formula of the shifted measure $\mathbf{P}_{\varphi_{\mathbf{f}_0}}$ with respect to $\mathbf{P}$. Reference [4] investigated the asymptotic optimality of a test for the mean vector in multivariate regression by means of the Neyman-Pearson test.
Theorem 7. The Neyman-Pearson test of size $\alpha$ for testing (59) will reject $H_0$ if and only if Furthermore, suppose $\Gamma_{\mathbf{W}_{\mathbf{f},P_0}} : \mathbf{V}^p \to (0, 1)$ is the corresponding power function of the test. Then the value of the power, evaluated at any $\mathbf{f} \in \mathbf{V}^p$, is given by and otherwise, $\Gamma_{\mathbf{W}_{\mathbf{f},P_0}}(\mathbf{f}) = \alpha$.
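Proof. Consider the densities of $\mathbf{P}_{\varphi_{\mathbf{f}_0}}$ with respect to $\mathbf{P}$ under $H_0$ and $H_1$, respectively, evaluated at $\mathbf{Y}$. By Theorem 3.2.1 in [15], an MP test of size $\alpha$ for testing (59) will reject $H_0$ if and only if the ratio of the density under $H_0$ to the density under $H_1$ does not exceed a constant chosen such that ), we get establishing the rejection region of the test. Next, we compute the power function for any $\mathbf{f} \in \mathbf{V}^p \cap \mathbf{W}^{p\perp}$. By the definition of $\Gamma_{\mathbf{W}_{\mathbf{f},P_0}}$ and by the symmetry of $\Phi$, we have The last formula results in $\Gamma_{\mathbf{W}_{\mathbf{f},P_0}}(\mathbf{f}) = \alpha$, when $\mathbf{f}$ vanishes. The proof of the theorem is complete.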
The test presented in Theorem 7 depends on the choice of $\mathbf{f}$ specified under $H_1$. For example, if we consider $H_1$: This means that the test cannot be extended to a uniformly most powerful (UMP) test for the composite alternative $H_1$: It is also not a UMP test for the more specific one-sided alternatives $H_1 : \mathbf{g}_2 > 0$ or $H_1 : \mathbf{g}_2 < 0$.
As discussed in the preceding section, we are also interested in investigating the rate of decay of $\Gamma_{\mathbf{W}_{\mathbf{f},P_0}}(\mathbf{f})$ to $\alpha = \Gamma_{\mathbf{W}_{\mathbf{f},P_0}}(\mathbf{0})$. On this topic, the result of Theorem 7 leads to the following important corollary. The proof is omitted since it can be handled by a technique similar to that in the proof of Corollary 3.

Corollary 8. Let $\mathbf{f}$ be an element of and $L(\varphi_{\mathbf{f}_0}, \varphi_{\mathbf{f}}) > 0$. Then for every preassigned $\alpha \in (0, 1)$, it holds that In the case $L(\varphi_{\mathbf{f}_0}, \varphi_{\mathbf{f}}) < 0$, we have where

Corollary 8 states that how fast the power function $\Gamma_{\mathbf{W}_{\mathbf{f},P_0}}(\mathbf{f})$ decays to $\alpha$ depends on a value determined by $L(\varphi_{\mathbf{f}_0}, \varphi_{\mathbf{f}})$, whose structure is influenced by the type of $\mathbf{W}_{\mathbf{f},P_0}$. For comparison, suppose that the simple hypothesis (59) where for the $p$-dimensional set-indexed Brownian pillow; we have Furthermore, we get, for fixed where the first inequality follows from Hölder's inequality, whereas the second follows from the fact that Σ −

Simulation Study
In this section we investigate the behavior of $\Psi_{\mathbf{Z}^0_p}(t_{\mathbf{Z}^0_p}, \varphi_{\mathbf{f}})$ and $\Upsilon_{\mathbf{Z}^0_p}(q_{\mathbf{Z}^0_p}, \varphi_{\mathbf{f}})$ with respect to their lower and upper bounds derived in Corollary 3. The simulated model is represented by the trend plus noise process where $\mathbf{Z}^0_2$ is the two-dimensional Brownian pillow and Such a model appears as the limit process of the two-dimensional set-indexed partial sums processes of the residuals of a two-variate regression model for testing whether or not a constant model holds true. That is, we test the hypothesis that where $\mathbf{W} = [f_1]$, with $f_1(\cdot, \cdot) =$ so that we have after some computations The simulation results using a sample of size 50 × 50 with 1000 runs are exhibited in Table 1 and Figure 1 for $\alpha = 0.05$. The values presented in Table 1 represent the values of the power functions of the KS and CvM tests together with the associated values of the lower (L) and upper (U) bounds evaluated at each given value of $\alpha$ using the formulas given in Corollary 3, where in this case

$$L = \Phi\left(\Phi^{-1}(\alpha) - 0.43309\right), \qquad U = \Phi\left(\Phi^{-1}(\alpha) + 0.43309\right). \qquad (81)$$

The values of L never exceed the corresponding powers. Likewise, the values of U never fall below those of the corresponding power functions, as suggested by the theory. Figure 1 presents the graphs of L (dotdash line), U (dotted line), $\Psi_{\mathbf{Z}^0_p}(t_{\mathbf{Z}^0_p}, \varphi_{\mathbf{g}})$ (smoothed line), and $\Upsilon_{\mathbf{Z}^0_p}(q_{\mathbf{Z}^0_p}, \varphi_{\mathbf{g}})$ (dashed line) plotted together in one panel. It can be seen that the curves of the power functions lie within the band formed by the pair of curves L and U, as they should.
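The band in (81) is easy to evaluate directly. The following sketch assumes only the constant 0.43309 stated in (81); the helper names `power_band` and `lies_in_band`, and the power value used in the usage lines, are hypothetical and serve only to show how a simulated power could be checked against the band.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)
SHIFT = 0.43309            # constant appearing in (81)


def power_band(alpha, shift=SHIFT):
    """Return (L, U) = (Phi(Phi^{-1}(alpha) - shift), Phi(Phi^{-1}(alpha) + shift))."""
    quantile = STD_NORMAL.inv_cdf(alpha)
    return STD_NORMAL.cdf(quantile - shift), STD_NORMAL.cdf(quantile + shift)


def lies_in_band(power, alpha, shift=SHIFT):
    """Check whether a (simulated) power value lies between L and U."""
    lower, upper = power_band(alpha, shift)
    return lower <= power <= upper


if __name__ == "__main__":
    lower, upper = power_band(0.05)
    print(f"alpha=0.05: L={lower:.5f}, U={upper:.5f}")
    # 0.07 is a made-up power value used only to illustrate the check.
    print(lies_in_band(0.07, 0.05))
```

Evaluating `power_band` over a grid of α values gives the two boundary curves against which simulated power curves can be compared.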

Concluding Remark
We have established upper and lower bounds for the boundary crossing probability involving the multivariate trend plus noise model. Our results contribute not only to statistics but also to other disciplines such as financial mathematics and statistical physics, where such probability models are also frequently encountered. It is important to note that the Cameron-Martin translation formula is valid if the trend function lies in the RKHS of the corresponding Gaussian process. In practice this is not always the case. Therefore, further research is needed to handle the problem that arises in such situations.

In particular, if the mean values are taken to be the same, then $|\Psi_{\mathbf{Z}_p}(t_{\mathbf{Z}_p}, \varphi_{\mathbf{g}}) - \alpha| \le$ (2 Z ) is tested using the KS or CvM type test. Then by virtue of Corollary 2, we have Thus, in contrast to Corollary 8, the rate of decay of the KS and CvM type tests does not depend on $\mathbf{f}_0$ at all. Consequently, compared to the Neyman-Pearson test, the KS and CvM tests cannot detect whether to take $\mathbf{f}$ larger or smaller than $\mathbf{f}_0$ in order to have faster or slower decay. The results presented throughout this section become more visible when we look at the model involving the $p$-dimensional set-indexed Brownian sheet or pillow. For example, suppose we observe the model $\mathbf{Y}$ $L_2(P_0, \mathbf{G})$ is bounded on $\mathbf{V}^p \cap \mathbf{W}^{p\perp}$. Thus there exists a positive constant that is a uniform upper bound for $|\Gamma_{\mathbf{Z}^0_p}(\mathbf{f}) - \alpha|$. It is clear that this constant is also a uniform upper bound for $|\Psi_{\mathbf{Z}^0_p}(t_{\mathbf{Z}^0_p}, \varphi_{\mathbf{f}}) - \alpha|$ as well as $|\Upsilon_{\mathbf{Z}^0_p}(q_{\mathbf{Z}^0_p}, \varphi_{\mathbf{f}}) - \alpha|$.