The influence of training sample size on the expected error rate in spatial classification

In this paper we use the plug-in Bayes discriminant function (PBDF) for the classification of spatial Gaussian data into one of two populations specified by different parametric mean models and a common geometric anisotropic covariance function. The PBDF is constructed using ML estimators of the unknown mean and anisotropy ratio parameters. We focus on the asymptotic approximation of the expected error rate (AER), and our aim is to investigate the effects of two different spatial sampling designs (based on increasing domain and fixed domain asymptotics) on the AER.


Introduction
The Bayes discriminant function (BDF) is known to be an optimal classification rule in the sense of minimum risk when the populations are completely specified and the loss function is known. If the populations are not completely specified, the unknown parameters can be estimated from a training sample and plugged into the BDF. The expected error rate (ER) is the performance measure of the PBDF, but the expressions for the ER are very complicated even for the simplest forms of the PBDF; therefore, asymptotic approximations of the ER are used.
The first investigation of PBDF quality for spatial classification was done by Switzer (1980) [7]. Later, some extensions were made in [1, 8, 9]. However, in all of these publications the correlations between the observation to be classified and the training sample were assumed to be zero. The first extension rejecting the assumption of spatial independence was done by Dučinskas (2009) [4]; there only the trend parameters and the variance are assumed to be unknown. The extension of the latter approximation to the case of complete parametric uncertainty (all mean and covariance function parameters are unknown) was implemented in Dučinskas and Dreižienė (2011) [5]. In all of the above-mentioned publications, the PBDF was constructed using maximum likelihood estimators of the unknown mean and covariance parameters.
The asymptotic behavior of spatial covariance parameter estimators can differ under different asymptotic spatial frameworks. We consider two types of sampling frameworks in spatial statistics. One is the fixed-domain, or infill, asymptotic framework, in which more and more observations are sampled in the same finite domain (Cressie, 1993) [2]. The other is the increasing domain asymptotic framework, in which the minimum distance between sampling points is bounded away from zero and thus the spatial domain of observation is unbounded (Zhang, Zimmerman, 2005) [10]. This is the spatial analogue of the asymptotics observed in time series.
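The geometric contrast between the two frameworks can be made concrete with a small sketch (Python here, purely for illustration; the function names are ours, not from the cited works): an infill design packs more points into a fixed region, so the minimum spacing shrinks, while an increasing domain design keeps unit spacing and grows the region.

```python
import numpy as np

def infill_design(n):
    """n x n lattice packed into the fixed square [0, 1]^2:
    as n grows, the minimum spacing shrinks (infill asymptotics)."""
    g = np.linspace(0.0, 1.0, n)
    return np.array([(x, y) for x in g for y in g])

def increasing_domain_design(n):
    """n x n lattice with unit spacing: as n grows, the domain grows
    while the minimum spacing stays 1 (increasing domain asymptotics)."""
    g = np.arange(n, dtype=float)
    return np.array([(x, y) for x in g for y in g])

def min_spacing(pts):
    """Minimum pairwise distance between distinct locations."""
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    return d[d > 0].min()

# spacing shrinks under infill, but stays fixed under increasing domain
print(min_spacing(infill_design(3)), min_spacing(infill_design(6)))
print(min_spacing(increasing_domain_design(6)))
```

Under Mardia and Marshall's conditions it is the second design that guarantees consistency of the ML covariance parameter estimators; under infill sampling some covariance parameters may not be consistently estimable.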
In this paper we investigate the influence of increasing the training sample size on the AER under these two spatial sampling frameworks. We use the AER expression for the case of unknown mean parameters and unknown anisotropy ratio; this AER expression in closed form was derived in [3].

Spatial classification problem
This paper deals with observations of a Gaussian random field (GRF) {Z(s): s ∈ D ⊂ R^p}, and the main goal is to classify the observation Z(s) into one of two populations Ω_1 and Ω_2. Under population Ω_j the observation model is

Z(s) = x′(s)β_j + ε(s), j = 1, 2, (1)

where x(s) is a q × 1 vector of non-random regressors and β_j is a q × 1 vector of parameters, j = 1, 2. The error term ε(s) has zero mean and covariance function defined by the model cov(ε(s), ε(u)) = C(s − u; θ) for all s, u ∈ D, where θ ∈ Θ is a p × 1 parameter vector, Θ being an open subset of R^p. Denote by T = (Z(s_1), . . . , Z(s_n))′ the training sample and by S_n = {s_i ∈ D; i = 1, . . . , n} the set of locations where the training sample T is taken, and call it the set of training locations (STL).
We shall assume a deterministic spatial sampling design, and all analyses are carried out conditionally on S_n. S_n is partitioned into the union of two disjoint subsets, i.e. S_n = S^(1) ∪ S^(2), where S^(j) is the subset of S_n that contains the n_j locations of feature observations from Ω_j, j = 1, 2.
For a given training sample T, consider the problem of classifying Z_0 = Z(s_0) into one of the two populations. The model of the training sample has the following form

T = Xβ + E, (2)

where X is the n × 2q design matrix, β = (β′_1, β′_2)′ is the 2q × 1 vector of mean parameters and E is the n × 1 vector of random errors. Denote by c_0 the vector of covariances between Z_0 and T, and let t denote the realization of T.
Since Z_0 follows the model specified in (1), the conditional distribution of Z_0 given T = t, Ω_j is Gaussian with mean

μ^0_{jt} = x′_0 β_j + α′_0 (t − Xβ), (3)

and variance

σ²_0 = C(0; θ) − c′_0 C^{-1} c_0, (4)

where x_0 = x(s_0), C is the n × n covariance matrix of T and α_0 = C^{-1} c_0. Under the assumption of complete parametric certainty of populations and for known finite nonnegative losses {L(i, j), i, j = 1, 2}, the BDF has the following form

W(Z_0; Ψ) = (Z_0 − (μ^0_{1t} + μ^0_{2t})/2)(μ^0_{1t} − μ^0_{2t})/σ²_0 + γ, (5)

with γ = ln(π_1 L(1, 2)/(π_2 L(2, 1))), where π_1, π_2 (π_1 + π_2 = 1) are the prior probabilities of the populations Ω_1 and Ω_2, respectively.
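The standard two-population Gaussian discriminant with a common conditional variance can be evaluated directly; below is a minimal numeric sketch with toy values chosen by us (with equal priors and symmetric losses the threshold term γ vanishes).

```python
import numpy as np

def bdf(z0, mu1, mu2, sigma2, pi1=0.5, pi2=0.5, L12=1.0, L21=1.0):
    """Bayes discriminant for z0 given the conditional means mu1, mu2
    (populations 1 and 2) and common conditional variance sigma2.
    Classify to population 1 when the value is nonnegative."""
    gamma = np.log(pi1 * L12 / (pi2 * L21))
    return (z0 - 0.5 * (mu1 + mu2)) * (mu1 - mu2) / sigma2 + gamma

# z0 = 0.8 lies closer to mu1 = 1 than to mu2 = 0, so W >= 0
w = bdf(z0=0.8, mu1=1.0, mu2=0.0, sigma2=1.0)
print("population", 1 if w >= 0 else 2)
```

The sign of the discriminant, not its magnitude, drives the decision; the conditional means and variance are exactly the quantities (3) and (4) that the plug-in rule later replaces with estimates.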
Denote by β̂, θ̂ the estimators of the corresponding parameters. Replacing the parameters in BDF (5) with their estimates, we form the PBDF; here it is convenient to write μ^0_{1t} + μ^0_{2t} = x′_0 Hβ + 2α′_0(t − Xβ) and μ^0_{1t} − μ^0_{2t} = x′_0 Gβ, with H = (I_q, I_q) and G = (I_q, −I_q), where I_q denotes the identity matrix of order q. We use the maximum likelihood (ML) estimators of the parameters based on the training sample. The asymptotic properties of ML estimators established by Mardia and Marshall (1984) [6] under the increasing domain asymptotic framework, subject to some regularity conditions, are essentially exploited. Hence, the ML estimator Ψ̂ is weakly consistent and asymptotically Gaussian [5]. Denote by ∆²_0 = (x′_0 Gβ)²/σ²_0 the squared Mahalanobis distance between the conditional distributions of Z_0 given T = t. Under assumptions (A1) and (A2) in Dučinskas, Dreižienė (2011) [5], the asymptotic approximation of the ER in the case of estimated mean parameters and estimated covariance parameters is R(Ψ) plus a correction term involving J_θ^{-1}, B and (σ²_0)^(1)_θ, where R(Ψ) is the risk of the BDF, the (i, j)-th element of the Fisher information matrix J_θ is tr(C^{-1}C_i C^{-1}C_j)/2 with C_i = ∂C/∂θ_i, B = ∂α_0/∂θ is the n × p matrix of partial derivatives evaluated at θ̂ = θ, φ(·) denotes the standard normal density function, and (σ²_0)^(1)_θ is the vector of first-order partial derivatives of σ̂²_0 in (4) evaluated at θ̂ = θ. More details can be found in [5].
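For fixed covariance parameters θ, the ML estimator of the mean parameters in a Gaussian model coincides with the generalised least squares solution. A sketch (our own illustration, with a toy design matrix encoding two constant-mean populations and, for simplicity only, an identity covariance matrix):

```python
import numpy as np

def ml_beta(X, C, t):
    """ML (= GLS) estimator of beta for a fixed covariance matrix C:
    beta_hat = (X' C^{-1} X)^{-1} X' C^{-1} t."""
    Ci = np.linalg.inv(C)
    return np.linalg.solve(X.T @ Ci @ X, X.T @ Ci @ t)

# toy training sample: two observations from each population, constant means
X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
t = np.array([1.0, 3.0, 2.0, 4.0])
beta_hat = ml_beta(X, np.eye(4), t)
print(beta_hat)  # with C = I this reduces to the group means [2. 3.]
```

In the full procedure C itself depends on the estimated covariance parameters, so β̂ and θ̂ are obtained jointly by maximising the Gaussian likelihood; the GLS step above is the inner mean-parameter update for a given θ.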

Numerical example
A numerical example is considered to investigate the influence of the training sample size on the AER under the two spatial sampling frameworks. Assume that D is a regular 2-dimensional lattice with unit spacing. Consider the case s_0 = (4, 4) and eight fixed STL S_{m,n}, m = 1, 2, where m = 1 denotes the infill asymptotic sampling framework, m = 2 denotes the increasing domain asymptotic sampling framework, and n represents the size of the training sample, n = 8, 16, 32, 98. For example, S_{m,8} contains 8 neighbors of s_0, S_{m,16} contains 16 neighbors of s_0, and so on. S_{m,n} is partitioned into a union of two disjoint subsets, i.e. S_{m,n} = S^(1) ∪ S^(2), where S^(j) is the subset of S_{m,n} that contains the n_j locations of feature observations from Ω_j, j = 1, 2, with n_1 = n_2. Without loss of generality, the case with π_j = 0.5 and L(i, j) = 1 − δ_{ij}, i, j = 1, 2 is considered. Observations are assumed to arise from a stationary GRF with different constant means and a common nugget-less covariance function given by C(h) = σ² r(h), where σ² is the variance (sill) and r(h) = exp(−√(h²_x + λ² h²_y)/α) is the exponential geometric anisotropic correlation function with anisotropy ratio λ and anisotropy angle ϕ = π/2. Here α denotes the range parameter.
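The correlation model of the example translates directly into code; the following sketch (function names are ours) builds the nugget-less covariance matrix over a set of locations.

```python
import numpy as np

def aniso_corr(hx, hy, alpha, lam):
    """Exponential geometric anisotropic correlation with anisotropy
    ratio lam (anisotropy angle fixed at pi/2, as in the example):
    r(h) = exp(-sqrt(hx^2 + lam^2 * hy^2) / alpha)."""
    return np.exp(-np.sqrt(hx**2 + lam**2 * hy**2) / alpha)

def cov_matrix(locs, sigma2, alpha, lam):
    """Nugget-less covariance matrix C(h) = sigma^2 * r(h) over locations."""
    locs = np.asarray(locs, dtype=float)
    hx = locs[:, 0][:, None] - locs[:, 0][None, :]
    hy = locs[:, 1][:, None] - locs[:, 1][None, :]
    return sigma2 * aniso_corr(hx, hy, alpha, lam)

# lam = 1 recovers the isotropic exponential model; the example uses
# alpha = 0.6, sigma^2 = 1 and lam in {1, 2}
C = cov_matrix([(0, 0), (0, 1), (1, 0)], sigma2=1.0, alpha=0.6, lam=1.0)
```

For λ = 2 the correlation decays twice as fast in the y direction as in the x direction, which is what distinguishes the anisotropic case in the experiments.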
We consider the case with unknown mean and anisotropy ratio parameters. Figure 3 shows the values of the AER under the infill and the increasing domain asymptotic sampling frameworks. The AER is calculated assuming the Mahalanobis distance between the marginal distributions ∆ = (µ_1 − µ_2)/σ = 1, α = 0.6, σ² = 1. The results show that lower values of the AER are obtained under the increasing domain sampling framework. The AER values decrease as the training sample size increases, for both sampling frameworks and for both the isotropic and anisotropic cases (λ = 1 and λ = 2). Table 1 shows the ratio of the AER calculated under the increasing domain asymptotic sampling framework (AER_inc) to that under the infill asymptotic sampling framework (AER_inf). The ratio AER_inc/AER_inf increases as α increases, for all training sample sizes. This leads to the conclusion that for larger α values and larger training sample sizes the infill asymptotic sampling framework gives lower values of the AER in comparison with the increasing domain asymptotic sampling framework.