Statistical inference in functional semiparametric spatial autoregressive model

Abstract: Semiparametric spatial autoregressive model has drawn great attention since it allows mutual dependence in spatial form and nonlinear effects of covariates. However, with development of scientific technology, there exist functional covariates with high dimensions and frequencies containing rich information. Based on high-dimensional covariates, we propose an interesting and novel functional semiparametric spatial autoregressive model. We use B-spline basis function to approximate the slope function and nonparametric function and propose generalized method of moments to estimate parameters. Under certain regularity conditions, the asymptotic properties of the proposed estimators are obtained. The estimators are computationally convenient with closed-form expression. For slope function and nonparametric function estimators, we propose the residual-based approach to derive its pointwise confidence interval. Simulation studies show that the proposed method performs well.


Introduction
Spatial autoregressive model (SAR) and its derivatives have been widely used in many areas such as economics, political science, public health and so on. There are lots of literatures concerning about spatial autoregressive models such as Anselin [1], LeSage [19], Anselin and Bera [2], Lee and Yu [20], LeSage and Pace [21], Lee [22], Dai, Li and Tian [9]. In particular, Lee [22] utilised generalized method of moments to make inference about spatial autoregressive model. Xu and Li [25] investigated the instrumental variable (IV) and maximum likelihood estimators for spatial autoregressive model using a nonlinear transformation of dependent variable. However, spatial autoregressive model may not be flexible enough to capture nonlinear impact of some covariates since its parametric structure. In order to enrich model adaptability and flexibility, some semiparametric spatial autoregressive models have been proposed. For example, Su [31] studied a semiparametric SAR, which includes nonparametric covariates. Su and Jin [32] proposed partially linear SAR with both linear covariates and nonparametric explanatory variables. Sun et al. [33] studied a semiparametric spatial dynamic model with a profile likelihood approach. Wei and Sun [36] derived semiparametric generalized method of moments estimator. Hoshino [35] proposed a semiparametric series generalized method of moments estimator and established consistency and asymptotic normality of the proposed estimator.
However, with development of economic and scientific technology, huge amounts of data can be easily collected and stored. In particular, some types of data are observed in high dimensions and frequencies containing rich information. We usually call them functional data. When those types data are included in a model as covariates, it is common to use functional linear model (FLM). There exist vast literatures on estimation and prediction for FLM (See, for example, Reiss et al. [26], Ramsay and Dalzell [27], Delaigle and Hall [11], Aneiros-Pérez and Vieu [3]). Many methods were proposed to estimate the slop function such as Cardot et al. [5], Hall and Horowitz [14], Crambes et al. [8], Shin [28]. In particular, Hall and Horowitz [14] established minimax convergence rates of estimation. Cai and Hall [6] proposed functional principle components method and a reproducing kernel Hilbert space approach was used in Yuan and Cai [7].
In many applications of spatial data, there are often covariates with nonlinear effect and functional explanatory variables. This motivates us to propose an interesting and novel functional semiparametric spatial autoregressive model. The model is relatively flexible because it utilises functional linear model to deal with functional covariate and semiparametric SAR model to allow spatial dependence and nonlinear effect of scalar covariate. Recently, some models consider both functional covariates and spatial dependence. For instance, Pineda-Ríos [24] proposed functional SAR model and used least squares and maximum likelihood methods to estimate parameters. The functional SAR model considers spatial effect for error term instead of spatial effect for response variable. Huang et al. [12] considered spatial functional linear model and they developed an estimation method based on maximum method and functional principle component analysis. Hu et al. [13] developed generalized methods of moments to estimate parameters in spatial functional linear model. In the paper, we proposed a generalized method of moments estimator which is heteroskedasticity robust and takes a closed-form written explicitly.
The rest of the paper is organized as follows. Section 2 introduces the proposed model and the estimation procedure. The asymptotic properties of the proposed estimators are established in Section 3. Section 4 conducts simulation studies to evaluate the empirical performance of the proposed estimators. Section 5 gives some discussions about the model. All technical proofs are provided in appendix.

Model and notations
Consider the following novel functional semiparametric spatial autoregressive model, where Y is a response variable, ρ is an unknown coefficient of the spatial neighboring effect, W is the constant spatial weight matrix with a zero diagonal, Z = (Z 1 , ..., Z p ) is a p-dimensional covariate and θ is its coefficient. {X(t) : t ∈ [0, 1]} is a zero-mean and second-order (i.e. E|X(t)| 2 < ∞, ∀t ∈ [0, 1]) stochastic process defined on (Ω, B, P) with sample paths in L 2 [0, 1], the Hilbert space containing integrable functions with inner product x, y = 1 0 x(t)y(t)dt, ∀x, y ∈ L 2 [0, 1] with norm x = x, x 1/2 . The slope function β(t) is a square integrable function on [0, 1], U is a random variable, g(·) is an unknown function on its support [0, 1] without loss of generality. We assume E[g(·)] = 0 to ensure the identifiability of the nonparametric function. ε is a random error with zero mean and finite variance σ 2 , independent of Z, U and X(t).
Remark 1. The model (2.1) is more flexible to take different models. It generalizes both semiparametric spatial autoregressive model [32] and functional partial linear model [28] which correspond to the cases β(t) = 0 and ρ = 0, respectively. The model can be represented by We assume I − ρW could be inverted to make the presentation valid. Thus Y i is also influenced by its neighbours' covariates X j (t) as j i. Parameter ρ indicates the basic impact of the neighbours. Greater absolute value of ρ means that the response variable is more likely to be affected by its neighbours.
Then the model can be rewritten as Similar to Zhang and Shen [39], profiling out the functional approximation, we obtain Let Q = (WY, Z) and η = (ρ, θ ) . Applying the two stage least squares procedure proposed by Kelejian and Prucha [17], we propose the following estimator Consequently, we useβ(t) = B 1 (t)γ,ĝ(u) = B 2 (u)ζ as the estimator of β(t) and g(u).
For statistical inference based onη, consistent estimators of the asymptotic covariance matrices are needed. Define the following estimator where · is the L 2 norm for a function or the Euclidean norm for a vector. In order to make statistical inference about σ 2 , it need to get the value ω = E[(ε 2 1 −σ 2 ) 2 ]. Therefore, we use the following estimator ω to estimate ωω Similar to Zhang and Shen [39], we use an analogous idea for the construction of instrument variables. In the first step, the following instrumental variables are obtainedH , whereρ,γ andζ are obtained by simply regressing Y on pseudo regressor variables WY, Π. In the second step, instrumental variablesH are used to obtain the estimatorsη,γ andζ, which are necessary to construct the following instrumental variables H = (W(I −ρW) −1 (Z θ + D γ + B 2 (U)ζ), Z). Finally, we use the instrumental variables H to obtain the final estimatorsρ andθ.

Asymptotic theory
In this section, we derive the asymptotic normality and rates of convergence for the estimators defined in previous section. Firstly, we introduce some notations. For convenience and simplicity, c denote a generic positive constant, which may take different values at different places. Let β 0 (·) and g o (·) be the true value of function β(·) and g(·) respectively. K(t, s) = Cov(X(t), X(s)) denotes the covariance function of X(·). a n ∼ b n means that a n /b n is bounded away from zero and infinity as n → ∞. In the paper, we make the following assumptions.
C5 For matrixQ, there exsits a constant λ * c such that λ * c I −QQ is positive semidefinite for all n. C6 X(t) has a finite fourth moment, that is, E X(t) 4 ≤ c.
C7 K(t, s) is positive definite. C8 The nonparametric function g(·) has bounded and continuous derivatives up to order r(≥ 2) and the slope function β(t) ∈ C r [0, 1]. C9 The density of U, f U (u), is bounded away from 0 and ∞ on [0, 1]. Furthermore, we assume that Assumptions C1−C3 are required in the setting of spatial autoregressive model (see, for example, Lee [23], Kelejian and Prucha [18], Zhang and Shen [39]). They concern the restriction of spatial weight matrix and SAR parameter. Assumption C4 (see Du et al. [10]) is used to represent the asymptotic covariance matrix ofη. Moreover, assumption C4 requires implicitly that the generated regressors CZ and Z, deviated from their functional part of projection onto Π, are not asymptotically multicollinear. Assumption C5 is required to ensure the identifiability of parameter η. Assumptions C6−C7 are commonly assumed in functional linear model [14]. Assumption C6 is a mild restriction to prove the convergence of our estimator. Assumption C7 guarantees the identifiability of β(t). Assumption C8 ensures that β(·) and g(·) are sufficiently smoothed and can be approximated by basis functions in the spline space. Assumption C9 requires a bounded condition on the covariates. It is often assumed in asymptotic analysis of nonparametric regression problems (see, for example [15,37]). Assumption C10 is required to achieve the optimal convergence rate ofβ(·) andĝ(·). Let where V = X(t), β 0 (t) . The following theorems state the asymptotic properties of the estimators for parameter and nonparametric function.
The variance Ξ(t) and Λ(u) are involved in basic function and knots. Different basis functions and knots can get different variance estimators. Moreover, the variance expression contains unknown quantities. Replacing them by consistent estimators can lead to approximation errors. What's more, there may exist heteroscedasticity in error term and then the estimatorσ 2 is not consistent. Consequently, we propose the following residual-based method to construct piecewise confidence interval in practice.
It is crucial that spatial structure must be preserved during data resampling in models with spatial dependence [1]. Therefore, we employ the residual-based bootstrap procedure to derive the empirical pointwise standard error ofβ(t) andĝ(·). The procedure can be described as follows: (1) Based on the data sets {Y, Z, X(t), U} and spatial matrix W, one fits the proposed model and obtain the residual vectorε 1 = (ε 11 , ...,ε n1 ) . Then, we derive the centralized residual vectorε.
(3) Based on the new data sets {Y * , Z, X(t), U} and spatial matrix W, we fit the proposed model again to derive the estimatorβ * (t) andĝ * (u). Repeat the process many times. Thus, for given t and u, calculate empirical variance ofβ * (t) andĝ * (u) respectively. Consequently, we use the empirical variance to construct its confidence interval.
Z i1 and Z i2 are independent and follow standard normal distribution, θ 1 = θ 2 = 1, U i ∼ U(0, 1), The spatial weight matrix W = (w i j ) n×n is generated based on mechanism that w i j = 0.3 |i− j| I(i j), 1 ≤ i, j ≤ n with w ii = 0, i = 1, ..., n. A standardized transformation is used to convert the matrix W to have row-sums of unit. We set the following three kinds of error term: (1) In order to compare the different situations for magnitudes of ρ, we set ρ = {0.2, 0.7} with error term N(0, σ 2 ). Simulation results are derived based on 1000 replications.
To achieve good numerical performances, the order l 1 and l 2 of splines and the number of interior knots k 1 and k 2 should be chosen. To reduce the burden of computation, we use the cubic B-spline with four evenly distributed knots (i.e., k 1 = k 2 = 2) for slope function β(·) and nonparametric function g(·) respectively. These choices of k 1 and k 2 are small enough to avoid overfitting in typical problem with sample size not too small and big enough to flexibly approximate many smooth function. We use the square root of average square errors (RASE) to assess the performance of estimatorsβ(·) andĝ(·) respectively .., n 2 } and n 1 = n 2 = 200 are grid points chosen equally spaced in the domain of β(·) and g(·) respectively. Tables 1-3 show simulation results with different kinds of error terms. Table 4 presents different magnitudes of ρ with error term N(0, 1). They show the bias (Bias), standard deviation (SD), standard error (SE) and coverage probability (CP) with nominal level of 95% for estimator and the mean and standard deviation (SD) of RASE j ( j = 1, 2) forβ(·) andĝ(·). The simulation results can be summarized as follows: (1) The estimatorsρ,θ 1 ,θ 2 ,σ 2 are approximately unbiased and the estimated standard errors are close to sample standard deviations in normal error distribution. The empirical coverage probabilities approximate the nominal level of 95% well.
(3) For error term 0.75t (3) and (1+0.5U i )N(0, 1), the estimatorsρ,θ 1 ,θ 2 are approximately unbiased and the estimated standard errors are close to sample standard deviations. In addition, the mean and standard deviation for RASE of estimated coefficient functionβ(·) andĝ(·) are decreasing. It indicates that parametric and non parametric estimators perform well in non-normal error term.
(4) From Table 1 and Table 4, as basic spatial effect ρ increases, the SE and SD ofρ decrease. For the different magnitudes of ρ, the Bias and SD of parametric estimators forθ 1 andθ 2 , and the mean of RASE forβ(·) andĝ(·) remain stable. It means that the magnitudes of ρ do not affect the other parametric and nonparametric estimators. Note: It shows that the bias (Bias), standard deviation (SD), standard error (SE) and coverage probability (CP) with nominal level of 95% for estimator and the mean and standard deviation (SD) of RASE forβ(·) andĝ(·) from 1000 repetitions. Note: It shows that the bias (Bias), standard deviation (SD), standard error (SE) and coverage probability (CP) with nominal level of 95% for estimator and the mean and standard deviation (SD) of RASE forβ(·) andĝ(·) from 1000 repetitions.    Figure 1. It displays the true curve β(t) and g(u) (red solid line), the estimated curveβ(t) and g(u) (green dotted line) and ponitwise 2.5 and 97.5 percentile of the estimated functionβ(·) andĝ(u) (light green line) in 500 replications with sample size n = 300 respectively. In the firgure, the left one shows estimatorβ(t) and the right one shows estimatorĝ(u) with error term N(0,1).

Discussion
In this paper, an interesting and novel functional semiparametric spatial autoregressive model is proposed. The model considers functional covariates based on semiparametric spatial autoregressive model. The slope function and nonparametric function are approximated by B-spline basis function.
Then generalized method of moments is proposed to estimate parameters. Under mild conditions, we establish the asymptotic properties for proposed estimators.
In order to use our model in practical applications, firstly, response variable needs spatial dependence. Secondly, there are covariates with nonlinear effect and functional variables. A problem of practical interest is to extend our model to take into account functional covariates and single index function simultaneously. What's more, making a test about spatial dependence and nonlinear effect of covariates is an important issue. Those topics are left for future work.
Proof of Theorem 3. The proof is similar to Theorem 3 in [10] and we omit here.
Proof of Theorem 4. By the definition of l(γ, ζ) in the proof of Theorem 2, we have   Sinceβ(t) − β * (t) = B 1 (t)(γ − γ 0 ) and for any t ∈ (0, 1), as n → ∞, by the law of large numbers, the slutsky's theorem and the property of multivariate normal distribution, we obtain that where Ξ(t) = lim n→∞