splm : Spatial Panel data models in R

splm is an R package for estimating and testing various spatial panel data speciﬁ-cations. We consider the implementation of both maximum likelihood and generalized moments estimators in the context of ﬁxed as well as random eﬀects spatial panel data models. This paper is a general description of splm and all functionalities are illustrated by application to Munnel (1990) data on 48 US States observed over 17 years (we restrict the time range to observations between 1970 and 1974).


Introduction
Spatial econometrics is an important tool in economics as well as in regional sciences and geography (Baltagi, Kelejian, and Prucha 2007b).Spatial panel data models are probably one of the most promising topics in the spatial econometrics literature.Recently a series of theoretical papers have appeared developing estimation procedures for different models. 1lthough there are libraries in R, Matlab and Stata to estimate cross-sectional spatial models (see e.g.Bivand 2001Bivand , 2002Bivand , 2006;;Bivand and Gebhardt 2000;Bivand and Portnov 2004;Piras 2009), there are few software procedures available to estimate spatial panel data models.
This paper is a description of splm, an R library that implements some of the recently developed estimation procedures for spatial panel data models along with diagnostic testing.Both random effects (RE) and fixed effects (FE) models are considered.FE models are estimated only by maximum likelihood (ML), while for RE models we implement both ML and generalized moments (GM) estimators.We also implement three conditional specification (zero-restriction) Lagrange multiplier tests from Baltagi et al. (2007c).For completeness, we add all the tests proposed in Baltagi et al. (2003).Finally, we include the implementation of the GM procedure to estimate the simultaneous system of spatially interrelated cross sectional equations presented in Kelejian and Prucha (2004).
The remainder of the paper is organized as follows: Section 2 describes the data structure.Section 3 discusses the definition of classes and methods.Section 4 is devoted to the ML estimation of various spatial panel data specifications.In particular, section 4.1 discusses and illustrates spatial FE models, while section 4.2 deals with the estimation of RE models.Section 5 describes the implementation of the generalized moments estimators proposed in Kapoor et al. (2007).The simultaneous equation model is described in section 5.2.Conclusions and future developments close the paper.

Data structures
Panel data relate to a cross section of observations (individuals, groups, countries, regions) repeated over several time periods.Spatial panels refer to observations associated to a particular position in space.Data can be observed either at point locations (e.g.housing data) or aggregated over regular or irregular areas (e.g.countries, regions, counties, states).The structure of the interactions between each pair of spatial units is represented by means of a spatial weights matrix.
The spatial weights matrix W is a N × N positive and symmetric matrix.An observation appears both in rows and columns.Hence the non-zero elements of the matrix express for each observation the locations that belong to its "neighborhood".As a consequence, the element w ij indicates the intensity of the relationship between cross sectional units i and j.By convention, the diagonal elements w ii are all set to zero to exclude self-neighbors.The weights matrix is generally used in row standardized form.
A possible source of confusion when developing ad-hoc routines stems from the different notation that characterizes spatial panel data models compared to traditional panel data models.On one hand, classical panel data models are ordered first by cross-section and then by time period (i.e. with time being the "fast" index).On the other hand, spatial panels are stacked first by time period and then by cross-section.In splm this is treated transparently for the user.The internal ordering of the estimation functions is usually (but not always) the spatial panel data one.Nonetheless, data can be supplied according to the conventions implemented in the plm package for panel data econometrics (Croissant and Millo 2008).Three possibilities are indeed available to input a data set in splm: • a data.framecontaining the individual index (as its first variable) and the time index (second variable).The index argument should be left to its default value of NULL • a data.frameand a character vector composed of the names of the variables to be used as (space and time) indices • an object of the class pdata.frame.
pdata.frames are special objects created to deal with panel data.They are part of a general infrastructure made available with plm and meant to handle lag and difference operations.
The methods available in splm are geared towards static panels; nonetheless, defining data as a pdata.framemight simplify the inclusion of (time) lags of the regressors.
The spatial weights matrix W can be a matrix object (with the estimators performing a minimal check for dimension compatibility) or a listw object from the class defined in spdep.This last class is a structured and efficient format and has the advantage of being well established in the R environment.Functionalities for switching between the two formats are available as functions listw2mat and mat2listw from the spdep package.

Classes and methods for spatial panel models
The two main goals of splm are estimation and diagnostic testing of spatial panel data models.
On one hand, the existing class htest serves the testing purpose well.On the other hand, the characteristics of spatial panel models require a modification of the structure and methods of the panel model classes available from plm.By and large, this is because spatial panel models involve the estimation of extra coefficients (e.g. the coefficient for the spatial lag term in the fixed effects spatial lag model or the error correlation coefficient and the variance components in the random effects specifications).
The new class splm inherits the general structure of lm objects.The splm object is a list of various elements including: the estimated coefficients, the vector of residuals and fitted.values, the most recent call and a model element containing the data employed in the estimation.As it is common for most models estimated by maximum likelihood, splm also comprises a logLik component with the value of the log-likelihood at the parameter optimum.This can be easily extracted and reused for testing or model selection purposes.
Some elements from lm objects have been excluded though.These omissions are partly due to the nature of the estimation process (which does not use, for instance, the "qr" decomposition) and partly to operations possible on simple linear regression models but not on panel data models.On the other hand, specific elements have been added to contain spatial and other covariance parameters.In addition to the usual vcov element giving the coefficients' variance covariance matrix, the element vcov.arcoefcontains the variance of the estimated spatial autoregressive coefficient (if any, else NULL) and the element vcov.errcompcontains the covariance matrix of the estimated error covariance coefficients.
A new class is defined for the summaries of splm objects.Consistently with lm and plm objects, the method provides diagnostic tables for the elements of splm objects.print methods are also available with a minimal description of the model object (including call, coefficients and covariance parameters).
Additionally, extractor methods have been defined for the relevant elements of model objects.Along with the usual coef, residuals and vcov, extractor methods are provided for the covariance matrices of the estimated spatial autoregressive coefficient and covariance components.
The availability of these extractors is consistent with the general modeling framework of the R project and favors the interoperability of splm objects with generic diagnostics based on Wald tests.In particular we refer to the functions waldtest in lmtest (for joint zero-restrictions) and lht in car (for generic linear restrictions).
Finally, an extractor method for fixed effects and a summary method for displaying them are also available.
In the following, all functionalities are illustrated by application to the well-known Munnel (1990)

ML estimators
In this section we discuss the extension of the traditional error component model to a spatial framework.The individual effects will in turn be considered fixed parameters to be estimated and random variables.For the fixed effect specification, following Elhorst (2003) we consider models with only time period effects, with only spatial effects and models with both spatial and time period fixed effects.All these specifications are extended to a spatially lagged dependent variable (fixed effects spatial lag model) or to include spatial error autocorrelation (fixed effects spatial error model).
As for the random effects model, our point of departure will be the general specification presented in Baltagi et al. (2007c).The most general formulation of the model allows for random effects, spatial and serial correlation.In other words, they consider a model with serial correlation on each cross-sectional observation over time as well as spatial dependence between cross-sectional observations at each time period.Through the presence of random effects, the model also permits heterogeneity across spatial units.Restrictions on the model parameters give rise to different nested specifications.Although we do not discuss the estimation theory for all of them, we illustrate some of the implemented specifications.

Fixed effects model
The traditional fixed effects model (e.g., Baltagi, 2008, pp. 14-17, andArellano, 2003, pp. 11-18) can be extended to include a spatially lagged dependent variable among the regressors as well as a spatially autocorrelated error term.The simplest representation of a pooled linear regression model with spatial fixed effects takes the following form: where i = 1, . . ., N refers to individuals, t = 1, . . ., T is the time index, and α i are the individual specific effects for which it holds that i α i = 0 to be separately identifiable from the constant term.For large N consistent estimation of the individual fixed effects is not possible because of the incidental parameter problem.Elhorst (2003) has pointed out that when the interest is primarily in the regression parameters vector β an extension of the fixed effects model to a spatial context may still be appropriate.More in detail, a fixed effect spatial lag model can be written in stacked form as where ρ is the spatial autoregressive coefficient, W a non-stochastic spatial weight matrix, ι T a column vector of ones of dimension T and ε i ∼ N (0, σ2 ). 2 On the other hand, a fixed effects spatial error model can be written as where λ is the spatial autocorrelation coefficient and ε is a well-behaved error term.The general estimation theory for both models is based on maximum likelihood and resembles the cross-sectional case.Next we give a brief overview of the estimation strategy for both models.

FE spatial lag model
The presence of the spatial lag introduces a form of endogeneity that violates the assumption of the standard regression model that the regressors are uncorrelated to the error term.ML accounts for endogeneity.Elhorst (2003) suggests to transform the variables in 2 by subtracting the individual means and use these transformed variables to maximize the likelihood function.More in detail, the demeaned variables are obtained subtracting the average for each cross-section over time.As a consequence, the fixed effects and the constant term (as well as other variables that do not vary over time) are wiped out from the model.Formally, the transformation can be written as where ) is an N T ×N T matrix well-known in the standard panel data literature. 3he log-likelihood function of model 2 is: where e = y − ρ(I T ⊗ W N )y − Xβ and ln |I N − ρW | is the Jacobian determinant. 4ollowing Anselin (1988), Elhorst (2009) suggests a concentrated likelihood approach for maximizing 6.The estimation procedure is substantially analogous to the one employed in the cross-sectional case.First, two auxiliary regressions of the demeaned variables y * and (I N ⊗ W N )y * on the demeaned regressors X * are performed.The corresponding residuals (say e * 0 and e * 1 ) are combined to obtain the concentrated likelihood: with C a constant not depending on ρ.A numerical optimization procedure is needed to obtain the value of ρ that maximizes 7.As a final step, estimators for β and σ 2 are computed from the first order conditions of the likelihood function by replacing ρ with its ML estimator.In analogy with the cross sectional model, the estimator for β can also be seen as the generalized least square estimator of a linear regression model with disturbance variance matrix σ 2 Q N T .5 Finally, statistical inference on the parameters of the model can be based on the following expression for the asymptotic variance covariance matrix (Elhorst 2009;Elhorst and Freret 2007): where T T = T tr( W W + W W ) and W = W (I N − ρW ) −1 .The computational burden involved in the calculation of the asymptotic standard error of the spatial parameter can be very costly for large problem dimensions (mainly because of the trace term).On the other hand, the block of the coefficient covariance matrix relative to the parameter vector β does not present particular computational difficulties.6Fixed effects can be recovered by

FE spatial error model
The estimation strategy for the cross-sectional spatial error model can be easily extended to the panel context.In fact, the log-likelihood function for the spatial error model follows directly as a special case of the general theory for maximum likelihood estimators with non-spherical error covariance described in Magnus (1978).Again a concentrated likelihood approach can be taken up but an iterative procedure is needed to estimate the parameters of the spatial error model.The general idea is to iterate between ML and GLS until a convergence criterion is met.As before, the model is transformed according to 5, to eliminate fixed effects.
More formally, the log-likelihood function for model 4 can be written as: with e = y − Xβ and A N = (I N − λW ).Given λ, estimators for β and σ 2 are derived from the first order conditions as and where the notation indicates the explicit dependence of the residuals on λ.By substituting expressions 10 and 11 back into 9, the concentrated log-likelihood function can be derived as: where C is a constant not depending on λ and A N has the same meaning as before.
The estimation procedure can be summarized as follows.Estimating the demeaned model by OLS, one can plug the estimated residuals into 12 in order to obtain an initial estimate of λ.
The initial estimate for λ can be used to compute a (spatial) FGLS estimator of the regression coefficients, the error variance and a new set of estimated GLS residuals.An iterative procedure may then be employed: The concentrated likelihood and the GLS estimators for, respectively, λ and β are alternately computed until convergence.The asymptotic variance covariance matrix of the parameters takes the following form (Elhorst 2009) where W = W (I N − λW ) −1 .Considerations made for the spatial lag case apply also here.
Finally, spatial fixed effects can be recovered by Illustration spfeml permits the estimation of both fixed effects spatial lag and error models through the argument model.The argument method sets the technique for the calculation of the determinant.The default (eigen) is to express the Jacobian in terms of the eigenvalues of the spatial weights matrix.The argument effects accepts four different options, namely: pooled to include only the (optional) constant term, spfe to include cross-sectional specific effects, tpfe to consider time period specific effects, and, finally, sptpfe to include both spatial and time period fixed effects.As an example, to estimate a model with only cross-sectional fixed effects:7 > res <-spfeml(fm, data = Produc, listw = mat2listw(usaww), effects = "spfe", + method = "eigen") Warning: x may not contain an intercept if fixed effects are specified > summary(res)

Spatial panel fixed effects lag model
Call: spfeml(formula = fm, data = Produc, listw = mat2listw(usaww), effects = "spfe", method = "eigen") The function spfeml generates an object of class splm for which an appropriate summary method is defined.The summary method gives information about the call, a summary of the residuals and the table of estimated coefficients (where rho is the coefficient of the spatially lagged dependent variable).Fixed effects can be extracted using the function effects: The result is an object of class effects.splmfor which print and write methods are defined.The print method displays the type of effects (with significance levels) and the constant term.The write method is used to write the corresponding matrix to disk.The name of the file can be controlled by the argument filename, that by default takes the generic name of "effects.csv".

Random Effects Models
Following Baltagi et al. (2007c), our point of departure is the following panel data regression model where y it is the observation on cross-sectional unit i in time period t, and X it is a k × 1 vector of observations on the non-stochastic exogenous regressors.The disturbance vector is the sum of random regional effects and spatially autocorrelated residuals.In vector form this can be written as The remaining disturbance term follows a first-order serially autocorrelated process u t , ε t , ν t and e t are all N × 1 columns vectors, µ is the random vector of i.i.N (0, σ 2 µ ) region specific effects; λ (|λ| < 1) is the spatial autoregressive coefficient and ρ (|ρ| < 1) is the serial autocorrelation coefficient.As usual, W indicates the N × N matrix of known spatial weights whose diagonal elements are set to zero. 8Finally, e it ∼ N (0, σ 2 e ), v i0 ∼ N (0, σ 2 e /(1 − ρ 2 )) and µ and ε are assumed to be independent.
The disturbance term can also be rewritten, in matrix notation, as where B = I N − λW, ι T is a vector of ones, and I T an identity matrix where T indicates the dimension.The model allows for serial correlation on each spatial unit over time as well as spatial dependence between spatial units at each time period.The presence of random effects accounts for possible heterogeneity across spatial units.
Depending on the restrictions on the parameters one can differently combine error features giving rise to various nested specifications (see Table 1).In particular, when both λ and ρ are zero but σ 2 µ is positive, the model reduces to a classical random effects panel data specification.When λ is zero, the resulting model accounts for random effects with serially autocorrelated residuals.On the other hand, when ρ is zero and λ and σ 2 µ are not, the specification reduces to a random effects model with spatially autocorrelated residuals.Table 1 summarizes all possible specifications implemented in the library.
Table 1: Different model specifications that can be generated as special cases of the general specification.
For a model with spatially autocorrelated error components, OLS is inefficient even when σ 2 µ = 0. Analogously, OLS on a random effects model (even without spatial components) is also inefficient.The model should then be estimated by maximum likelihood.In the present section we discuss the estimation approach to the richest specification, i.e. the one allowing for random effects, serial and spatial correlation.Estimation of the simpler models can always be obtained as special cases of the procedure illustrated below and therefore will not be discussed in detail.To derive the expression for the likelihood, Baltagi et al. (2007c) use a Prais-Winsten transformation of the model with random effects and spatial autocorrelation.Following their simplifying notation, we define .
Scaling the error covariance matrix by the idiosyncratic error variance σ 2 ε and denoting φ = , the expressions for the scaled error covariance matrix Σ and for its inverse Σ −1 and determinant |Σ| can be written respectively as 9 Therefore, one can easily derive the expression of the likelihood: Baltagi et al. (2007c) for details on the derivation of the variance covariance matrix.
We implement an iterative procedure to obtain the maximum likelihood estimates (for details on the implementation see Millo and Piras 2009).Starting from initial values for λ, ρ and φ, we obtain estimates for β and σ 2 e from the first order conditions: The likelihood can be concentrated and maximized with respect to λ, ρ and φ.The estimated values of λ, ρ and φ are then used to update the expression for Σ −1 .These steps are then repeated until convergence.In other words, for a specific Σ the estimation can be operationalized by a two steps iterative procedure that alternates between GLS (for β and σ 2 e ) and concentrated likelihood (for the remaining parameters) until convergence. 10As an example, let us consider the case in which either ρ or λ are zero.To initialize the procedure we should start from initial values to obtain an estimate of the variance covariance matrix to be employed in the initial GLS.The concentrated likelihood (based on GLS residuals) is then maximized with respect to φ and either ρ or λ.The iterations between GLS and concentrated likelihood continue until convergence.From an implementation point of view there are at least a couple of different ways to proceed.As explained in detail in Millo and Piras (2009), we decided to include the GLS step within the objective function to be maximized (i.e. the function to be used as argument of the optimizer).In other words, the GLS is part of the optimization process of the likelihood.11Statistical inference could be based on the expression of the information matrix.However, to lessen the computational burden we obtain standard errors for β from GLS, and we employ a numerical Hessian to perform statistical inference on the error components. 12

Illustration
The main function to deal with ML estimation of spatial random effects models is spreml.The argument errors allows the estimation of the models presented in Table 1.Seven options are available, namely: semsrre, semsr, srre, semre, re, sr and sem.semsrre corresponds to the full model presented in this section.semsr corresponds to a model with serially and spatially correlated disturbances but no random effects (φ = 0); srre is a model with serial correlation and random effects (λ = 0); semre excludes serial correlation; re is a traditional random effects model, sr a panel regression with serially correlated errors and sem a pooled model with spatially autocorrelated residuals.
We provide two options to set the initial values of the parameters managed through the argument initval. 13The first option is to specify a numeric vector of initial values.As an alternative, when initval is set to estimate the initial values are retrieved from the estimation of nested specifications.As an example, when estimating semsrre, the initial value for the serial correlation parameter is taken to be the estimated ρ from a panel regression with serially correlated errors.Analogously, the initial value of λ is the estimated spatial autocorrelation coefficient from the sem model; and, finally, an initial value for φ is obtained estimating a random effect model.
Munnell's data lead to the following results for the most general model: > semsrremod <-spreml(fm, data = Produc, w = usaww, errors = "semsrre") > summary(semsrremod) As customary in the R environment, the summary method prints a short description of the model, the most recent call, a summary of the residuals and the table of estimated coefficients.The splm specific part of the output (between the summary of the residuals and the table of the estimated coefficients) reports on the estimated error components along with standard errors from the numerical hessian.

Spatial random effects model
The recent literature on spatial panel data has identified two spatial autoregressive processes.One specification was introduced in section 4.2 and is generally estimated by ML techniques.Another specification assumes that spatial correlation applies to both the individual and remainder error components (Kapoor et al. 2007).Although the two data generating processes look similar, they do imply diverse spatial spillover mechanisms governed by a different structure of the implied variance covariance matrix.This second specification is estimated by GM methods.
The model can be written as: with intuitive notation.The disturbance term follows a first order spatial autoregressive process of the form: where for each time period ε N is an N × 1 vector of innovations, W is the spatial weights matrix and ρ the corresponding spatial autoregressive parameter.
To further allow for the innovations to be correlated over time, Kapoor et al. (2007) postulate an error component structure for the innovation vector in 21, that is: where µ N is the N × 1 vector of cross-sectional specific effects; ν N for each time period a vector of innovations that vary both over cross-sectional units and time periods; ι T is a vector of ones and I N an N × N identity matrix.14Kapoor et al. (2007) maintain the assumption that the error components ν it are identically and independently distributed with mean zero, variance σ 2 ν and finite fourth moments.The error components µ it are also identically and independently distributed with mean zero, variance σ 2 µ and finite fourth moments.Finally, the two processes are independent to each other.Under some additional assumptions, the variance covariance matrix of ε N can be expressed as: where The estimation procedure is a combination of the traditional panel data literature on error component models and the GM approach to spatial models.Kapoor et al. (2007) suggest a generalization of the generalized moment estimator suggested in Kelejian and Prucha (1999) for estimating the spatial autoregressive parameter (ρ) and the two variance components of the disturbance process (σ 2 1 and σ 2 ν ).Specifically they define three sets of GM estimators based on six moment conditions.
The first set of GM estimators is based only on a subset of the moment conditions (the first three equations) and assigns equal weights to each of them.This first set of estimators should be therefore intended as initial estimators.
The second set of GM estimators uses all moment conditions and an optimal weighting scheme.It is indeed well known from the theory of GM estimators that for asymptotic efficiency the ideal weighting matrix is the inverse of the variance covariance matrix of the sample moments at the true parameter values.Kapoor et al. (2007) derive this matrix under the assumption of normally distributed innovations.They point out that, although the use of such matrix is not strictly optimal in absence of normality, it can be viewed as a reasonable approximation of the true and more complex variance covariance matrix.
The third set of GM estimators is motivated by computational difficulties.The elements of the asymptotic variance covariance matrix of the sample moments involve a computational count of up to O(n 3 ).Although one could take advantage of the particular structure of W , the computation of such matrix can still be difficult in many cases.The third set of GM estimators still uses all moment conditions but a simplified weighting scheme.
Using any of the previously defined estimators for the spatial coefficient and the variance components, a feasible GLS estimator of β can be defined based on a spatial Cochrane-Orcutt type transformation of the original model.However, following the classical error component literature, a convenient way of calculating the GLS estimator is to further transform the (spatially transformed) model premultiplying it by I N T − θQ 1 , where θ = 1 − σ ν /σ 1 .The feasible GLS estimator is then identical to an OLS calculated on the "doubly" transformed model.

Illustration
The function that performs the GM estimation of the model described in this section is spregm.The argument method allows to opt for one of the three sets of GM estimators.The default is to perform the initial estimator.If the argument method is set to fulweigh the second estimator (i.e. the one involving the full expression of the variance covariance matrix of the moments conditions) is performed.Finally, to obtain the third estimator the argument method should be set to weigh.On Munnel's data this would lead to: > GM <-spregm(fm, data = Produc, w = usaww, method = "fulweigh") > summary(GM)

Residuals:
Min The summary method after a short description of the model, prints the most recent call, a summary of the residuals and the table of estimated coefficients.The output also contains a printing of the estimated spatial coefficient, the variance components σ 2 ν and σ 2 1 and θ.One of the main advantages of the GM approach compared to ML is that the former is computationally less intensive than the latter (mostly because it does not involve the computation of Jacobian terms).The function spregm can deal with the estimation of very large datasets.As an example, we estimated a model with N=10,000 cross-sectional observations over T=20 time periods.Considering K=11 explanatory variables, the time to perform the second set of GM estimators was slightly more than 28 seconds on an Intel Core Duo MacBook with 4 GB of Memory and a processor speed of 2.4 GHz.Kelejian and Prucha (2004) generalize results given by Kelejian and Prucha (1998) in a single equation framework to a simultaneous system of spatially interrelated cross sectional equations.The model relates to a single cross-section of observations.Spatial dependence arises from two sources.On one hand, the error terms are assumed to be both spatially autocorrelated and correlated across equations.On the other hand, the value of the dependent variable in each equation depends on its spatial lag (and the lags of all other dependent variables).The system of equations can be represented as:

Spatial simultaneous equations
with Y = (y 1 , . . ., y m ), X = (x 1 , . . ., x m ), U = (u 1 , . . ., u m ), Ȳ = (ȳ 1 , . . ., ȳm ) and ȳj = Wy j , j = 1, . . .m. y j is an N × 1 vector of cross-sectional observations on the dependent variable in equation j, x l is an N × 1 vector of observations on the lth exogenous variable, u j is the N × 1 disturbance vector in equation j.Finally, W is an N × N weights matrix of known constants, and B, C and Λ are corresponding matrices of parameters.
In addition to allowing for general spatial lags in the endogenous variables, the model also allows for spatial autocorrelation in the disturbances.In particular, it is assumed that the disturbances are generated by the following spatial autoregressive process: where . ., ūm ) and ūj = Wu j .ε j and ρ j denotes, respectively, the vector of innovations and the spatial autoregressive parameter in equation j.
In imposing exclusion restrictions on the previous model, Kelejian and Prucha (2004) define β j , γ j and λ j as the vectors of non-zero elements of the jth column of B, C and Λ. Equivalently, Y j , X j and Ȳj are the corresponding matrices of observations on the endogenous, exogenous, and spatially lagged endogenous variables that appear equation j.The system can be rewritten as: where Z j = (Y j , X j , Ȳj ) and δ j = (β j , γ j , λ j ) .The innovation are generated as ε = (Σ * ⊗I)v where Σ * is a non-singular m × m matrix and the random variables in v are identically and independently distributed with mean zero and unitary variance.Kelejian and Prucha (2004) derives both limited and full information instrumental variables estimators.The limited information estimation is a straightforward extension of the GS2SLS procedure in the context of a single equation model (Kelejian and Prucha 1998).In particular, each equation in the system is estimated by GS2SLS.The matrix of instruments used in the estimation procedure will typically be H = [X, WX, W 2 X] ( Kelejian, Prucha, and Yuzefovich 2004;Lee 2003).
The GS2SLS estimator does not account for potential cross equation correlation.On the other hand, the GS3SLS estimator introduced in the paper utilizes the full system information.To introduce the feasible GS3SLS estimator, it proves helpful to stack the equations in the following way: ) and where ρ = (ρ 1 , . . ., ρ m ) and δ = (δ 1 , . . ., δ m ) are vectors of parameters.Since, E(εε ) = Σ ⊗ I, a system instrumental variable estimator can be defined as: where Ẑ * (ρ) = diag m j=1 Ẑ * j (ρ j ), Ẑ * j (ρ j ) = PZ * j (ρ j ), P = H(H H) −1 H , and ρ is the generalized moment estimator of the limited information GS2SLS procedure.Finally, Σ is a matrix whose elements are σjl = n −1 ε j εl , where εj = y * j (ρ) − Z * j (ρ) δj .Kelejian and Prucha (2004) prove that the small sample distribution of the feasible GS3SLS estimator can be approximated as: Small sample inference on the parameters can be based on this approximation.

Illustration
spsegm is the function that implements the estimation method for the spatial simultaneous equation system.The specification for the formula object is quite different than the traditional one.All the responses for all equations should be organized in a matrix and specified on the left hand side of the formula object.In a similar fashion, all the explanatory variables from all equations should be specified on the right hand side of the formula.The number of equations in the system is retrieved from the dimension of the matrix containing the responses (i.e.number of columns).In the actual implementation, equations can only be different in terms of regressors.In other words, all endogenous variables and spatial lags of endogenous variables are included among the explanatory variables of each equation in the system.The crucial argument to define the specification of each equation (i.e. the columns of X to be included) is spec.It should either be default or a list type of object.When spec is set to default all the equations have the same explanatory variables including the intercept as in the following example: 15 > W <-nb2listw(cell2nb(10, 10)) > y <-matrix(rnorm(200), 100, 2) > X <-matrix(runif(200, -10, 10), 100, 2) > ynseq <-paste("y", seq(1, ncol(y))) > colnames(y) <-ynseq > xnseq <-paste("y", seq(1, ncol(X))) > colnames(X) <-xnseq > data <-cbind(y, X) > data <-data.frame(data)> se <-spsegm(y ~X, spec = "default", listw = W) > summary(se) Symultaneous Equation Model: 15 The intercept can be eliminated from all the equations at the same time in the traditional way.(1, 2, 3), c(1, 7), c(4,  5, 6)), listw = W) Internally, a constant is added automatically to the columns of X.The first equation contains three explanatory variables (other than the spatial lags and the endogenous variables).The number one (always) refers to the intercept while two and three to the first and second column of X.Similarly, the second element of the argument spec implies that the second equation only contains an intercept term (1) and the last column of X (7).Finally, the third equations does not include the intercept but the third, forth and fifth columns of the matrix of regressors.
The output is an object of class splm.The summary method recognizes that the object has been produced by spsegm and loops over the number of equations (and the number of explanatory variables in each equation if necessary) to print the results.

LM tests
Since the seminal work of Breusch and Pagan (1980) Lagrange Multiplier (LM) tests have been extensively employed to test for random effects and serial or cross-sectional correlation in panel data models.Requiring only the estimation of the restricted specification, LM tests are particularly appealing in a spatial random effects setting because of the computational difficulties related to the ML approach.
In this section we describe several LM-type tests for spatial panels.We start from Baltagi et al. (2003) that derive joint, marginal and conditional tests for all combinations of random effects and spatial correlation.We then move to joint and conditional tests derived by Baltagi et al. (2007c) for further allowing serial correlation.
Let us rewrite the model as follows where the disturbance vector is assumed to have random spatial effects, spatially autocorrelated residual disturbances and a first-order autoregressive disturbance term.In vector form, it can be written as: where and ν t = ρν t−1 + e t (33) Baltagi et al. (2003) derive joint, marginal and conditional LM tests for a restricted version of the model (assuming ρ = 0); whereas Baltagi et al. (2007c) focus on the unrestricted model.

Tests for random effects and spatial error correlation
Under the assumption that ρ = 0, the error term can be rewritten for simplicity as: with B = I N − λW .The variance covariance matrix of the autocorrelated residuals reads The hypotheses under consideration are: 1. H a 0 : λ = σ 2 µ = 0 under the alternative that at least one component is not zero 2. H b 0 : σ 2 µ = 0 (assuming λ=0), under the one-sided alternative that the variance component is greater than zero 3. H c 0 : λ = 0 assuming no random effects (σ 2 µ = 0), under the two-sided alternative that the spatial autocorrelation coefficients is different from zero 4. H d 0 : λ = 0 assuming the possible existence of random effects (σ 2 µ may or may not be zero), under the two-sided alternative that the spatial autocorrelation coefficient is different from zero 5. H e 0 : σ 2 µ = 0 assuming the possible existence of spatial autocorrelation (λ may or may not be zero)and the one-sided alternative that the variance component is greater than zero The joint LM test for the first hypothesis of no random effects and no spatial autocorrelation (H a 0 ) is given by: and ũ denotes OLS residuals.Equation 36 is the point of departure for the derivation of the marginal LM tests used to verify H b 0 and H c 0 .The marginal LM test of no random effects assuming no spatial correlation is given by This one sided LM test can be poor even in large samples (the asymptotic critical value can be far from the exact critical values).An alternative is to use a standardized version that centers and scales the one-sided LM statistics: Analogously, the marginal LM test of no spatial autocorrelation assuming no random effects is given by where, g = tr ] and e = tr[( B B) 2 ].A one-sided test can be defined by taking the square root of 44 based on ML residuals.The test statistics should be asymptotically distributed N (0, 1).

Illustration
The main function to perform the joint, marginal and conditional tests for random effects and spatial error correlation is bsktest.The argument can either be a formula describing the model to be estimated or an object of class splm.In this last case, the function will check that the residuals are indeed coming from the correct estimation.The type of test to be performed can be controlled through the argument test.In the following example we perform the standardized test in equation 38.The alternative hypothesis is one of no random regional effects.The function bsktest returns an object of class htest.

> test1
Baltagi, Song and Koh SLM1 marginal test data: log(gsp) ~log(pcap) + log(pc) + log(emp) + unemp SLM1 = 0.0052, p-value = 0.4979 alternative hypothesis: Random Regional Effects 6.2.Tests for random effects, spatial and serial error correlation Baltagi et al. (2007c) consider all possible combinations of joint, marginal and conditional tests for the full model: • the joint test for λ = ρ = σ 2 µ = 0, (J ) • the marginal tests for λ, ρ, andσ 2 µ assuming in turn that the other two are zero (M.1-3 ) • the joint tests for any combination of two of the parameters assuming the third one is zero (M.4-6 ) • the marginal tests for λ, ρ, andσ 2 µ assuming in turn that the other two may or may not be zero (C.1-3 ) • the joint tests for any combination of two of the parameters assuming the third one may or may not be zero (C.4-6 )  Baltagi et al. (2007c) 17 show that the test statistic for the joint hypothesis M.4 (λ = ρ = 0) assuming no random effects is simply the sum of the marginal tests M.1 (λ = 0) and M.2 (ρ = 0).Additionally, M.5 is the Baltagi et al. joint test outlined in section 6.1; and M.6 is the joint test for random individual effects and serial correlation derived in Baltagi and Li (1995).
As a result of the previous discussion, we only consider the three-way joint test J and the one-way conditional tests C.1-3.
The corresponding null hypotheses are: 1. H a 0 : λ = ρ = σ 2 µ = 0 under the alternative that at least one component is not zero (J ) 2. H h 0 : λ = 0, assuming ρ = 0, σ 2 µ > 0: test for spatial correlation, allowing for serial correlation and random individual effects (C.1 ) 3. H i 0 : ρ = 0 , assuming λ = 0, σ 2 µ > 0: test for serial correlation, allowing for spatial correlation and random individual effects (C.2 ) 4. H j 0 : σ 2 µ = 0, assuming λ = 0, ρ = 0: test for random individual effects, allowing for spatial and serial correlation (C.3 ) The joint LM test for H a 0 is given by: , G is a matrix with bidiagonal elements equal to one and ũ denotes OLS residuals.Under H a 0 , LM J is distributed as χ 2 3 .The conditional C.1 test for H h o gives rise to the following statistic, asymptotically distributed as χ 2 1 under H h 0 : where is the score vector (evaluated at the null), û a vector of ML residuals obtained from the estimation of the model with individual error components and serial correlation, g = 1 is the score evaluated at the null, J −1 22 is the corresponding element of the information matrix 19 and û is the vector of estimated residuals from the ML estimation of a panel model with spatially and serially correlated errors but no individual error components.The LM µ/λρ test statistic is asymptotically distributed as χ 2 1 under H j 0 :

Illustration
The main function to perform the joint, marginal and conditional tests for random effects, serial and spatial error correlation is bsjktest.Due to the particular nature of the restricted models underlying the test, the main argument can only be a formula.In the first example we illustrate how to perform the joint J test in equation 45 for any of the three effects

Conclusions
The analysis of spatial panel data is a sub-field of econometrics that has lately been experiencing a fast methodological progress.Applied applications though are hindered by the lack of readily available software.The R environment is ideal for its development because of the vast infrastructure already in place for analyzing spatial data.
splm is a new package for estimation and diagnostic testing of various spatial panel models.Supported estimation techniques are Maximum Likelihood as well as Generalized Moments.
Lagrange Multiplier tests are also provided.
The available techniques cover a good part of the recent developments in spatial panel data literature, providing easy access to estimating and tests procedures not yet available in any commercial software.Some of the functionalities in splm are also available as Matlab or Stata code, but this is the first attempt to provide a comprehensive tool within an organized statistical programming environment.
Whenever possible, the package is consistent with the standard conventions of the R environment and in particular it borrows functionalities from spdep and plm.A new class had to be defined for spatial panel model objects, along with methods for providing the standards expected by the average R user.We also achieved interoperability with generic functions, e.g.those available in other packages such as car or lmtest.
The main developments in the foreseeable future should be directed to the inclusion of new methodologies (e.g. Lee and Yu 2009a,b;Pesaran and Tosetti 2007, among others).Furthermore, while the performance of the GM estimators can already be considered satisfactory, fulfilling any practically relevant task in seconds, code optimization for speed is on the agenda for the computationally demanding ML procedures.

D
+ (T − 2)(1 − ρ), and b has been defined above.The conditional C.2 test for H i 0 is based on the following statistic, asymptotically distributed as χ 2 1 under the null:LM ρ/λµ = D(ρ) 2 J −1 33 (47)where J −1 33 is the corresponding element of the information matrix, 18 G JT )⊗ Z + ( JT G JT ) ⊗ Z(B B) −1 Z]û (48) with Z = [T σ 2 µ I N + σ 2 e (B B) −1 ]the score evaluated at the null and û the vector of ML residuals from the estimation of a panel model with individual error components and serial correlation.Both g and b assume the same expression as before.The conditional C.3 test for H j o is based on the following statistic:LM µ/λρ = D(σ 2 µ ) . The second example focus on the conditional C.1 test for spatial error correlation allowing for serial correlation and random effects.The function returns an object of class htest.> bsjktest(fm, data = Produc, w = usaww, test = "J") Baltagi, Song, Jung and Koh joint test (J) data: log(gsp) ~log(pcap) + log(pc) + log(emp) + unemp LM = 415.7477,df = 3, p-value < 2.2e-16 alternative hypothesis: random effects or serial corr.or spatial dependence in error term > bsjktest(fm, data = Produc, w = usaww, test = "C.1")Baltagi, Song, Jung and Koh C.1 conditional test 18 For the expression of the information matrix see Baltagi et al. (2007c) (equation 3.10) 19For the full expression of the information matrix seeBaltagi et al. (2007c) (Section 3.4)   data: log(gsp) ~log(pcap) + log(pc) + log(emp) + unemp LM = 77.1509,df = 1, p-value < 2.2e-16 alternative hypothesis: spatial dependence in error terms, sub RE and serial corr.
data on public capital productivity in 48 US States observed over 17 years, available in the Ecdat package.The spatial weights matrix based on binary contiguity for the US states is included in the library.For simplicity, we restrict the time range to observations between 1970 and 1974.Munnell's model is a Cobb-Douglas production function relating the gross social product (gsp) of a given state to the input of public capital (pcap), private capital (pc) and labor (emp); state unemployment rate (unemp) is meant to capture business cycle effects.The model formula is defined once for all and includes a constant term: > fm <-log(gsp) ~log(pcap) + log(pc) + log(emp) + unemp eff) As before, time period fixed effects and the intercept can be recovered as follows: summary(err) Spatial panel fixed effects error model Call: spfeml(formula = fm, data = Produc, listw = mat2listw(usaww), model = "error", effects = "tpfe") Using the same function, simply changing the argument errors, results for the spatial error model with random effects are obtained: X is a 8, 100 × 6 matrix generated from an uniform distribution.The length of the argument spec (3) is consistent with the number of columns of the matrix of the responses.
Breusch and Pagan (1980)s in R(Anselin 1988)in the context of a pooled model with no serial correlation or individual effects.On the other hand, M.2 (test for ρ = 0) is analogous, for large T , to the well-knownBreusch  and Godfrey (1981)serial correlation test.Finally, M.3 is simply theBreusch and Pagan (1980)random effects test.
Baltagi et al. 2007c)ished testing procedures in the literature (as observed inBaltagi et al. 2007c).M.1 (test for λ = 0) is the LM test for spatial error correlation derived by Anselin splm: