Maximum likelihood estimation for a bivariate Gaussian process under fixed domain asymptotics

We consider maximum likelihood estimation with data from a bivariate Gaussian process with a separable exponential covariance model under fixed domain asymptotic. We first characterize the equivalence of Gaussian measures under this model. Then consistency and asymptotic distribution for the microergodic parameters are established. A simulation study is presented in order to compare the finite sample behavior of the maximum likelihood estimator with the given asymptotic distribution.


Introduction
Gaussian processes are widely used in statistics to model spatial data.When fitting a Gaussian field, one has to deal with the issue of the estimation of its covariance.In many cases, a model is chosen for the covariance, which turns the problem into a parametric estimation problem.Within this framework, the maximum likelihood estimator (MLE) of the covariance parameters of a Gaussian stochastic process observed in R d , d ≥ 1, has been deeply studied in the last years in the two following asymptotic frameworks.
The fixed domain asymptotic framework, sometimes called infill asymptotics [1,2], corresponds to the case where more and more data are observed in some fixed bounded sampling domain (usually a region of R d ).The increasing domain asymptotic framework corresponds to the case where the sampling domain increases with the number of observed data and the distance between any two sampling locations is bounded away from 0. The asymptotic behavior of the MLE of the covariance parameters can be quite different in these two frameworks [3].
Consider first increasing-domain asymptotics.Then, generally speaking, for all (identifiable) covariance parameters, the MLE is consistent and asymptotically normal under some mild regularity conditions.The asymptotic covariance matrix is equal to the inverse of the (asymptotic) Fisher information matrix.This result was first shown by [4], and then extended in different directions by [5,6,7,8].
The situation is significantly different under fixed domain asymptotics.Indeed, two types of covariance parameters can be distinguished: microergodic and non-microergodic parameters [9,1].A covariance parameter is microergodic if, for two different values of it, the two corresponding Gaussian measures are orthogonal, see [9,1].It is non-microergodic if, even for two different values of it, the two corresponding Gaussian measures are equivalent.Non-microergodic parameters can not be estimated consistently, but misspecifying them asymptotically results in the same statistical inference as specifying them correctly [10,11,12,3].On the other hand, it is at least possible to consistently estimate microergodic covariance parameters, and misspecifying them can have a strong negative impact on inference.
Nevertheless, under fixed domain asymptotics, it has often proven to be challenging to establish the microergodicity or non-microergodicity of covariance parameters, and to provide asymptotic results for estimators of microergodic parameters.Most available results are specific to particular covariance models.When d = 1 and the covariance model is exponential, only a reparameterized quantity obtained from the variance and scale parameters is microergodic.It is shown in [13] that the MLE of this microergodic parameter is consistent and asymptotically normal.When d > 1 and for a separable exponential covariance function, all the covariance parameters are microergodic, and the asymptotic normality of the MLE is proved in [14].Other results in this case are also given in [15,16,17].Consistency of the MLE is shown as well in [18] for the scale covariance parameters of the Gaussian covariance function and in [19] for all the covariance parameters of the separable Matérn 3/2 covariance function.Finally, for the entire isotropic Matérn class of covariance functions, all parameters are microergodic for d > 4 [20], and only reparameterized parameters obtained from the scale and variance are microergodic for d ≤ 3 [21].In [22], the asymptotic distribution of MLEs for these microergodic parameters is provided, generalizing previous results in [23] and [24].
All the results discussed above have been obtained when considering a univariate stochastic process.There are few results on maximum likelihood in the multivariate setting.Under increasing-domain asymptotics [25] extend the results of [4] to the bivariate case and consider the asymptotic distribution of the MLE for a large class of bivariate covariance models in order to test the independence between two Gaussian processes.In [26], asymptotic consistency of the tapered MLE for multivariate processes is established, also under increasing domain asymptotics.In [27], some results are given on the distribution of the MLE of the correlation parameter between the two components of a bivariate stochastic process with a separable structure, when the space covariance is known, regardless of the asymptotic framework.In [28], the fixed domain asymptotic results of [14] are extended to the multivariate case, for d = 3 and when the correlation parameters between the different Gaussian processes are known.Finally, under fixed domain asymptotics, in the bivariate case and when considering an isotropic Matérn model, [29] show which covariance parameters are microergodic.
In this paper, we will extend the results of [13] (when d = 1 and the covariance function is exponential) to the bivariate case.First we will consider the equivalence of Gaussian measures, that is to say we will characterize which covariance parameters are microergodic.In the univariate case, [16] characterize the equivalence of Gaussian measures with exponential covariance function using the entropy distance criteria.We extend their approach to the bivariate case.It turns out, similarly as in the univariate case, that not all covariance parameters are microergodic.Hence not all covariance parameters can be consistently estimated.Then we establish the consistency and the asymptotic normality of the MLE of the microergodic parameters.Some our proof methods are natural extensions of those of [13] in the univariate case, while others are specific to the bivariate case.
The paper falls into the following parts.In Section 2 we characterize the equivalence of Gaussian measures, and describe which covariance parameters are microergodic.In Section 3 we establish the strong consistency of the MLE of the microergodic parameters.Section 4 is devoted to its asymptotic distribution.Some technical lemmas are needed in order to prove these results and, in particular, Lemma 4.1 is essential to prove the asymptotic normality results.The proofs of the technical lemmas are postponed to the appendix.Section 5 provides a simulation study that shows how well the given asymptotic distributions apply to finite sample cases.The final section provides a discussion and open problems for future research.

Equivalence of Gaussian measures
First we present some notations used in the whole paper.If A = (a ij ) 1≤i≤k,1≤j≤n is a k × n matrix and B = (b ij ) 1≤i≤p,1≤j≤q is a p × q matrix, then the Kronecker product of the two matrices, denoted by A ⊗ B, is the kp × nq block matrix In the following, we will consider a stationary zero-mean bivariate Gaussian process observed on fixed compact subset Note that σ 2 1 , σ 2 2 > 0 are marginal variances parameters and θ > 0 is a correlation decay parameter.The quantity ρ with |ρ| < 1 is the so-called colocated correlation parameter [30], that expresses the correlation between Z 1 (s) and Z 2 (s) for each s.For i = 1, 2, the covariance of the marginal process Z i (s) is Cov ψ (Z i (s l ), Z i (s m )) = σ 2 i e −θ|s l −sm| .Such process is known as the Ornstein-Uhlenbeck process and it has been widely used to model physical, biological, social, and many other phenomena.Denote by P ψ the distribution of the bivariate process Z, under covariance parameter ψ.As we consider fixed domain asymptotic, the process Z(s) is observed at an increasing number of points on a compact set T .Without loss of generality we consider T = [0, 1] and denote by 0 ≤ s 1 < . . .< s n ≤ 1 the observation points of the process.Let us notice that the points s 1 , . . ., s n are allowed to be permuted when new points are added and that these points are assumed to be dense in T when n tends towards infinity.The observations can thus be written as and the associated likelihood function is given by The aim of this section is to provide a necessary and sufficient condition to warrant equivalence between two Gaussian measures P ψ 1 and P ψ 2 with Specifically let us define the symmetrized entropy We assume in this section that the observation points are the terms of a growing sequence in the sense that, at each step, new points are added to the sampling scheme but none is deleted.This assumption ensures that I n (P ψ 1 , P ψ 2 ) is an increasing sequence.Hence we may define the limit I(P 1≤m,l≤n , j = 1, 2. By expanding (4) we find that In order to compute tr(R 1 R −1 2 ) and tr(R 2 R −1 1 ), we use some results in [31].The matrix R j can be written as follows, and R −1 j can be written as Then, we can write I n (P ψ 1 , P ψ 2 ) as For j, k = 1, 2, j = k, as is obtained by Taylor expansion, since max i ∆ i tends to 0, we have Since i ∆ i tends to 1, and Then the two Gaussian measures P ψ 1 and P ψ 2 are equivalent on the σ−algebra generated by Z if and only if σ 2 i,1 θ 1 = σ 2 i,2 θ 2 , i = 1, 2, and ρ 1 = ρ 2 . .Note that sufficient conditions for the equivalence of Gaussian measures using a generalization of the covariance model (1) are given in [29].A consequence of the previous lemma is that it is not possible to estimate consistently all the parameters individually if the data are observed on a compact set T .However the microergodic parameters σ 2 1 θ, σ 2 2 θ and ρ are consistently estimable.The following section is devoted to their estimation.

Consistency of the Maximum Likelihood Estimator
Let ψ = ( θ, σ2 1 , σ2 2 , ρ) ⊤ be the MLE obtained by maximizing f n (ψ) with respect to ψ.In the rest of the paper, we will denote by θ 0 , σ 2 i0 , i = 1, 2 and ρ 0 the true but unknown parameters that have to be estimated.We let var = var ψ 0 , cov = cov ψ 0 and E = E ψ 0 denote the variance, covariance and expectation under P ψ 0 .In this section, we establish the strong consistency of ρ, θσ 2  1 and θσ 2 2 , that is the MLE of the microergodic parameters.We first consider an explicit expression for the negative log-likelihood function The explicit expression is given in the following lemma whose proof can be found in the appendix.
Lemma 3.1.The negative log-likelihood function in Equation ( 5) can be written as The following theorem uses Lemma 3.1 in order to establish strong consistency of MLE of the microergodic parameters ρ, θσ 2  1 , θσ 2 2 .
Then, with probability one, ψ exist for n large enough and when n → +∞ Proof.The proof follows the guideline of the consistency of the maximum likelihood estimation given in [13].Hence consistency results given in ( 7), ( 8) and ( 9) hold as long as we can prove that there exist 0 < d < D < ∞ such that for every ǫ > 0, ψ and ψ, with where 2 ) ⊤ ∈ J can be any nonrandom vector such that In order to simplify our notation, let , k = 1, 2, i = 2, ..., n.By the Markovian and Gaussian properties of Z 1 and Z 2 , it follows that for each and from the proof of Theorem 1 in [13], uniformly in Moreover, from Cauchy-Schwarz inequality, and from Lemma 2(ii) in [13] uniformly in θ ≤ R and Combining ( 11), ( 12) and ( 13), we can write, and Let ρ = min { ρ, ρ} and combining ( 14) and ( 15), we can write, .

Asymptotic distribution
Before we state the main result on the MLE asymptotic distribution, we need to introduce some notation that will be used throughout this paper.Because of Theorem 3.2, there exists a compact subset S of (0, +∞) × (0, +∞) × (0, +∞) × (−1, 1) of the form Θ × V × V × R, such that a.s.ψ belongs to S for n large enough.We let O u (1) denote any real function The following lemma is essential when establishing the asymptotic distribution of the microergodic parameters.
Then for all n ∈ N, the (Y i,n ) i=2,...,n are independent with Using the previous lemma, the following theorem establishes the asymptotic distribution of the MLE of the microergodic parameters.Specifically we consider three cases: first when both the colocated correlation and variance parameters are known, second when only the variance parameters are known and third when all the microergodic parameters are unknown.

Theorem 4.2. With the same notation and assumptions as in
where where the derivative of the negative log-likelihood with respect to x = σ 2 1 , σ 2 2 , θ, ρ.From Lemma 4.1 and from Equation (3.11) in [13] we can write, with W k,i,n as in the proof of Theorem 3.2, Then from (20) we have .
Then ψ satisfies s θ ( ψ) = 0 and in view of ( 21), we get Hence (23) implies On the other hand, from the multivariate central limit theorem we get where where Then, computing the previous quadratic form, we get is proved.Now, we first prove ( 18) and ( 19) for ρ 0 ∈ (−1, 1) {0} and discuss the case ρ 0 = 0 at the end of the proof.To show (18), take differentiation with respec to ρ.
From the proof of Theorem 2 given in [13], and from arguments similar to those of the proof of Lemma 4.1, we get Then (25) implies Then ψ satisfies s ρ ( ψ) = 0 and in view of (26), we get Then we can write, from ( 22) and ( 27) Furthermore, By taking the inverse of the 2 × 2 matrix in (30), we get from (29): From (24) we can get where Then, we get that Let us now show (19).Similarly as for (25), we can show Then (32) implies Then ψ satisfies s σ 2 1 ( ψ) = 0 and in view of (33), we get Then we can write, from ( 22), ( 27) and ( 34) If all parameters are uknown, we get after some tedious algebra: Applying LU matrix factorization we get Hence we get Furthermore, we have Hence, from (37) and (36), we obtain Hence from (24) we can get where Then, we get: Then, using the multivariate Delta Method we get where Finally, we get In the case ρ 0 = 0, we can show that ( 31) and (38) are still true with the same proof.The only difference is that we multiply the second line of ( 28) and (35) by ρ.We skip the technical details.

Numerical experiments
The main goal of this section is to compare the finite sample behavior of the MLE of the covariance parameters of model ( 1) with the asymptotic distribution given in Section 4. We consider two possible scenarios for our simulation study: 1.The variances parameters are known and we estimate jointly ρ 0 and θ 0 .2. We estimate jointly all the parameters σ 2 01 , σ 2 02 , ρ 0 and θ 0 .Under the first scenario we simulate, using the Cholesky decomposition, 1000 realizations from a bivariate zero mean stochastic process with covariance model (1) observed on n = 200, 500 points uniformly distributed in [0, 1].We simulate fixing σ 2 01 = σ 2 02 = 1 and increasing values for the colocated correlation parameter and the scale parameter, that is ρ 0 = 0, 0.2, 0.5 and θ 0 = 3/x with x = 0.2, 0.4, 0.6.Note that θ 0 is is parametrized in terms of practical range that is the correlation is lower than 0.05 when the distance between the points is greater than x.For each simulated realization, we compute ρi and θi , i = 1, . . ., 1000, i.e. the MLE of the colocated correlation and scale parameters.Using the asymptotic distribution given in Equation ( 18), Tables 1, 2 compare the empirical quantiles of order 0.05, 0.25, 0.5, 0.75, 0.95 i=1 respectively, with the theoretical quantiles of the standard Gaussian distribution when n = 200, 500.The simulated variances of ρi and θi for i = 1, . . ., 1000 are also reported.
As a general comment, it can be noted that the asymptotic approximation given in Equation (18) improves and the variances of the MLE of ρ 0 and θ 0 decrease when increasing n from 200 to 500.When n = 500 the asymptotic approximation works very well.
Under the second scenario we set σ 2 01 = σ 2 02 = 0.5 and the other parameters as in Scenario 1.In this case we simulate, using Cholesky decomposition, 1000 realizations from a bivariate zero mean stochastic process with covariance model (1) observed on n = 500, 1000 points uniformly distributed in [0, 1].For each simulated realization, we obtain σ2 1i , σ2 2i ρi and θi , i = 1, . . ., 1000 the MLE of the two variances, the colocated correlation and scale parameters.Using the asymptotic distribution given in Equation (19), Tables 3, 4, 5 compare the empirical quantiles of order 0.05, 0.25, 0.5, 0.75, 0.95 i=1 respectively, for n = 500, 1000 with the theoretical quantiles of the standard Gaussian distribution.The simulated variances of σ2 1i θi , σ2 2i θi and ρi and for i = 1, . . ., 1000 are also reported.As in the previous Scenario, the asymptotic approximation given in Equation (19) improves and the variances of the MLE of ρ 0 and σ 2 0i θ 0 , i = 1, 2 reduce when increasing n from 500 to 1000.When n = 1000 the asymptotic approximation is quite satisfactory, with the exception of the case ρ 0 = 0.5 where some problems of convergence on the tails of the distributions can be noted, in particular when θ 0 = 3/0.4,3/0.6.

Concluding remarks
In this paper we considered the fixed domain asymptotic properties of the MLE for a bivariate zero mean Gaussian process with a separable exponential covariance model.We characterized the equivalence of Gaussian measures under this model and we established the consistency and the asymptotic distribution of the MLE of the microergodic parameters.Analogue results under increasing domain asymptotics are obtained by [25].It is interesting to note that the asymptotic distribution of the MLE of the colocated correlation parameter, between the two processes, does not depend on the asymptotic framework.
Our results can be extended in different directions.Let M(h, ν, θ) ν, θ > 0, be the Matérn correlation model.A generalization of the bivariate covariance model ( 1) is then the following model: with θ 12 = θ 21 , σ 1 > 0, σ 2 > 0, where in this case ψ = (σ 2 1 , σ 2 2 , θ 11 , θ 12 , θ 22 , ν, ρ) ⊤ .This is a special case of the bivariate Matérn model proposed in [30] and sufficient conditions in terms of ψ for the validity of this kind of model.Studying the asymptotic properties of the MLE of ψ would then be interesting.The main challenges in this case are the number of parameters involved and the fact that the covariance matrix cannot be factorized as a kronecker product.Moreover for ν = 0.5 the markovian property of the process cannot be exploited.
Another interesting extension is to consider the fixed domain asymptotic properties of the tapered maximum likelihood estimator in bivariate covariance models.This method estimation has been proposed as a possible surrogate for the MLE when working with large data sets, see [32,33].Asymptotic properties of this estimator, under fixed domain asymptotics and in the univariate case, can be found in [34], [24] and [23].Extensions of these results to the bivariate case would be interesting.Both topics are to be investigated in future research.

Appendix
Proof of lemma 3.1.

Lemma 4 . 1 .
With the same notations and assumptions as inTheorem 3.2, let
Proof.Let us introduce∆ i = s i − s i−1 for i = 2, . ..,n and note that n i=2 ∆ i ≤ 1 and lim n→∞ max 2≤i≤n 2 are an i.i.d.sequences of standard Gaussian random variables.Using Lemma 3.1 we write,

Table 2 :
. The authors give necessary For scenario 1: empirical quantiles, and variances of simulated MLE of θ 0 for different values of ρ 0 and θ 0 , when n = 500, 1000.