Comparing Implementations of Estimation Methods for Spatial Econometrics

Recent advances in the implementation of spatial econometrics model estimation techniques have made it desirable to compare results, which should correspond between implementations across software applications for the same data. These model estimation techniques are associated with methods for estimating impacts (emanating effects), which are also presented and compared. This review constitutes an up-to-date comparison of generalized method of moments (GMM) and maximum likelihood (ML) implementations now available. The comparison uses the cross-sectional US county data set provided by Drukker, Prucha, and Raciborski (2011c, pp. 6–7). The comparisons will be cast in the context of alternatives using the MATLAB Spatial Econometrics toolbox, Stata, Python with PySAL (GMM) and R packages including spdep, sphet and McSpatial.


Introduction
Researchers applying spatial econometric methods to empirical economic questions now have a wide range of tools, and a growing literature supporting these tools. In the 1970s and 1980s, it was typical for researchers to use tools coded in Fortran or other general programming languages, or to seek to integrate functions into existing statistical and/or matrix language environments (Anselin 2010). The use of spatial econometrics tools was widened by the ease with which methods and examples presented in Anselin (1988b) could be reproduced using SpaceStat, written in Gauss (Aptech 2007), and shipped as a built runtime module. It was rapidly complemented by the Spatial Econometrics toolbox for MATLAB (MATLAB 2011), provided as source code together with extensive documentation (see also LeSage and Pace 2009).1 This toolbox is under active development, and accepts contributed functions, thus broadening its appeal. In addition, Griffith and Layne (1999) gave code listings for model fitting techniques using SAS (SAS Institute 2008) and SPSS (IBM Corporation 2010). A suite of commands for spatial data analysis for use with Stata (StataCorp 2007) was provided by Maurizio Pisati, and distributed using the standard contributed command system (Pisati 2001).
The thrust of SpaceStat has largely been taken over by GeoDa (Anselin, Syabri, and Kho 2006), and more recently by OpenGeoDa.2 The Python (van Rossum 1995) spatial analysis library (PySAL; Rey and Anselin 2010)3 has also been launched. Since the R (R Development Core Team 2012) language and environment became available in the 1990s, collaborative code development has proceeded with varying speed. Initial attempts to implement spatial econometrics techniques in R in the spdep (Bivand 2013) package were checked against SpaceStat, and subsequently against Maurizio Pisati's Stata code and GeoDa by comparing results for the same input data and spatial weights (Bivand and Gebhardt 2000; Bivand 2002). A broad survey of the analysis of spatial data in the R environment is given by Bivand (2006) and Bivand, Pebesma, and Gómez-Rubio (2008).
1 http://www.spatial-econometrics.com/.
In the spirit of Rey (2009), this comparison will attempt to examine some features of the implementation of functions for fitting spatial econometrics models in Stata with those in the Spatial Econometrics toolbox for MATLAB, in R and in Python. We have chosen only to compare implementations for which the estimations can be scripted, and from which the output can be transferred back to R in binary form. A consequence of this restriction is that we have not included OpenGeoDa. Because Millo and Piras (2012) provide recent comparative results for implementations of spatial panel models in R and MATLAB, we restrict our consideration to cross-sectional models, and within that to cross-sectional models implemented in at least two application languages. This results in our putting Bayesian methods for spatial econometric models aside until they become available in other languages than MATLAB. Finally, we have chosen not to consider tests for residual autocorrelation or for model specification,4 or other diagnostic or exploratory techniques or measures, feeling that model estimation is of more immediate importance.
Initially, we describe the framework used for our comparative study, and the data set chosen for use. Next we define the models to be compared, and then move to comparisons for generalised method of moments (GMM) estimators. The GMM presentation is a substantial extension of Piras (2010), as many theoretical results have been published since then, and have been incorporated into the sphet package, as well as made available in Stata and PySAL. Following the comparison of GMM implementations, we examine implementations of maximum likelihood estimators, focussing on the consequences of details in the choices of numerical methods across the alternatives. Before concluding, we compare the provision of functions for calculating emanating effects (Kelejian, Tavlas, and Hondroyiannis 2006), also known as impacts (LeSage and Fischer 2008; LeSage and Pace 2009), simultaneous spatial reaction function/reduced form (Anselin and Lozano-Gracia 2008), equilibrium effects (Ward and Gleditsch 2008), and unwittingly touched upon by Bivand (2002) in trying to create a prediction method for spatial econometrics models.

Comparative study
The comparative study was constructed around unified R scripts. The first script prepared the data from the input data set for export to MATLAB in a text file, to Stata as a dta file and to Python as a dbf file written as part of an ESRI Shapefile. The data to be exported were run through model.frame first to generate the intercept where needed and any dummy variables. The analysis has been carried out on an Intel Core i7 64-bit system with 8GB RAM under Windows 7 Enterprise SP1. The software used was Stata 12.1, MATLAB R2011b with the March 2010 version of the Spatial Econometrics Toolbox,5 R 2.15.2 (R Development Core Team 2012) with packages spdep 0.5-56, sphet 1.4-00, and McSpatial 1.1.1 (McMillen 2012) and their contemporary dependencies, and Python 2.7 (32-bit) with PySAL 1.4. Some of the PySAL model estimation functions may also be accessed from GeoDaSpace, but we have used PySAL directly here.
We can see from the comparison of OLS results for the selected data set shown in Table 2 that the linear algebra output of the applications used is identical, and we can assume that any differences found in comparing spatial econometrics techniques will then not be caused by divergences in linear algebra implementations. There are, however, other underlying technologies that may differ, in particular numerical optimisation. From examining source code where available, it appears that the GM methods in PySAL use the SciPy fmin_l_bfgs_b function in the optimize module, based on L-BFGS-B version 2.1 from 1997,6 a quasi-Newton function for bound-constrained optimization. Nash and Varadhan (2011) provide a helpful guide to available numerical optimisation technologies, and to those available in R. In sphet, use is made of the nlminb function, which is a reverse-communication trust-region quasi-Newton method from the Port library,7 with enhanced scaling features. The same function is used by default for fitting in spdep when more than one parameter is to be optimised; optionally, other optimisers may be chosen, including the R implementation of L-BFGS-B version 2.1. For bounded line search in spdep, use is made of the optimize function, based on Brent (1973). In the R McSpatial package (McMillen 2012), optimize is used first to conduct a line search, and then nlm, a Newton-type algorithm (Schnabel, Koontz, and Weiss 1985), is used to optimise all the parameters in a maximum likelihood model, possibly modifying the outcome of the line search. The GM functions in the Spatial Econometrics toolbox use an included function minz contributed by Michael Cliff; for bounded line search, the MATLAB fminbnd function, also based on Brent (1973), is used, while when more than one parameter is to be optimised, the MATLAB fminsearch function is used; it is an implementation of the Nelder-Mead simplex algorithm. Finally, the Stata implementations of spatial econometric models use optimize mechanisms described in the help page for maximize, referred to in Drukker, Prucha, and Raciborski (2011b); Drukker et al. (2011c). The default numerical optimizer is "nr", a Stata-modified Newton-Raphson algorithm, but other algorithms may be chosen (Gould, Pitblado, and Poi 2010).
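Several of the single-parameter optimisers discussed above (R's optimize, MATLAB's fminbnd) are bounded line searches in the tradition of Brent (1973). As a minimal sketch, golden-section search (a simpler relative of Brent's method) illustrates the idea; the quadratic objective and the bounds are purely illustrative and stand in for a profile likelihood in the autoregressive parameter:

```python
# A bounded line search in the spirit of Brent (1973), as used by R's
# optimize() and MATLAB's fminbnd() for one-parameter spatial models.
# This is plain golden-section search; the objective is illustrative only.
import math

def golden_section(f, lo, hi, tol=1e-8):
    g = (math.sqrt(5) - 1) / 2  # inverse golden ratio
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return (a + b) / 2

# Stand-in "profile objective" with an interior optimum inside (-1, 1)
rho_hat = golden_section(lambda r: (r - 0.3) ** 2, -0.999, 0.999)
print(round(rho_hat, 6))
```

Brent's actual method accelerates this bracketing scheme with parabolic interpolation, but the bounded-search structure is the same.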

Data set
We were fortunate to be able to use the simulated US Driving Under the Influence (DUI) county data set used in Drukker, Peng, Prucha, and Raciborski (2011a); Drukker et al. (2011c,b) for our comparison. The data used is simulated for 3109 counties (omitting Alaska, Hawaii, and US territories), and uses simulations from variables used by Powers and Wilson (2004). The counties are taken from an ESRI Shapefile downloaded from the US Census.8 The dependent variable dui is defined as the alcohol-related arrest rate per 100,000 daily vehicle miles traveled (DVMT). The explanatory variables include police (number of sworn officers per 100,000 DVMT); nondui (non-alcohol-related arrests per 100,000 DVMT); vehicles (number of registered vehicles per 1,000 residents), and dry (a dummy for counties that prohibit alcohol sale within their borders, about 10% of counties). A further dummy variable elect takes values of 1 if a county government faces an election, 0 otherwise, and has 295 non-zero entries. Descriptive statistics for the simulated DUI data set are shown in Table 1. Powers and Wilson (2004, p. 331) found that "there is no significant relationship between prohibition status and the DUI arrest rate when controlling for the proportionate number of sworn officers and the non-DUI arrest rate per officer" when examining data for 75 counties in Arkansas. Their best model has an adjusted R2 of 0.397, while the simulated data has a higher value of 0.850, and the explanatory variables with the exception of nondui are all significant. As can be seen easily, all four implementations, R lm, Stata reg, MATLAB Spatial Econometrics Toolbox ols, and Python PySAL OLS, give identical results, so confirming that all four applications are handling the same data. We were perplexed to find that the results from fitting the ML SARAR model (to be described below) given by Drukker et al. (2011c, p. 8) did not match those we obtained in R or Stata.
We initially adopted row-standardised spatial weights, W, where the county contiguities c_ij, taking values of 1 if contiguous, sharing at least one boundary point, and 0 otherwise, are row-standardised by dividing by row sums:

w_ij = c_ij / Σ_j c_ij.

It turned out (personal communication, Rafal Raciborski) that the spatial weights used in estimation in Drukker et al. (2011c) were in fact minmax-normalised, where the required operation is:

w_ij = c_ij / m,

where m is the minmax value, the minimum of the pair of maxima of row and column sums:

m = min(max_i Σ_j c_ij, max_j Σ_i c_ij).

With symmetric neighbours as seen in this case of contiguities, the maximum row and column sums are equal by definition, here 14, so the weights are all 1/14 if counties are contiguous, and 0 otherwise. Using these weights, we could reproduce the ML SARAR estimation results given by Drukker et al. (2011c, p. 8), and could confirm that the underlying contiguities were the same as those used in the Stata documentation.
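The two normalisations can be sketched in a few lines of numpy; the 3 × 3 contiguity matrix below is a toy example, not the DUI county weights:

```python
import numpy as np

# Toy symmetric contiguity matrix C (c_ij = 1 if counties i and j touch)
C = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)

# Row standardisation: w_ij = c_ij / sum_j c_ij
W_row = C / C.sum(axis=1, keepdims=True)

# Minmax normalisation: w_ij = c_ij / m, with m the smaller of the
# largest row sum and the largest column sum
m = min(C.sum(axis=1).max(), C.sum(axis=0).max())
W_minmax = C / m

print(m, W_row.sum(axis=1))
```

With a symmetric C, as here, the two maxima coincide, so m is simply the largest row sum; only row standardisation yields rows summing to one.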
It is also possible that the spatial dependence in the simulated DUI data set was introduced using minmax-normalised spatial weights, as the standard deviates of Moran's I statistic are 2.374 and 2.434 respectively for the dependent variable and the least squares residuals using minmax-normalisation, and 1.623 and 1.554 respectively using row standardisation; the strength of the spatial dependence is much greater in the former case. We, however, chose to use row standardisation in this comparison, partly because row standardisation is more frequently encountered in applied work, and partly because weak spatial dependence may be more challenging for implementations than stronger dependence.

A general spatial model
The present discussion is almost entirely based on Kelejian and Prucha (2010), Drukker, Egger, and Prucha (2013), Arraiz, Drukker, Kelejian, and Prucha (2010) and Drukker et al. (2011b), which provide some important extensions to Kelejian and Prucha (1998, 1999). The model is quite general and it allows for some of the right hand side variables to be endogenous. Specifically, the point of departure will be the following Cliff-Ord spatial model:

y = Yπ + Xβ + ρ_Lag Wy + u,

where y is an n × 1 vector of observations on the dependent variable, Y is an n × p matrix of observations on p endogenous variables, X is an n × k matrix of observations on k exogenous variables, W is an n × n observed and non-stochastic spatial weighting matrix and, consequently, Wy is an n × 1 variable that is generally referred to as the spatial lag variable; π and β are corresponding parameters; and ρ_Lag is the spatial autoregressive coefficient. Given the presence of Y, the model can be viewed as a representation of a single equation of a system of equations.9 The error vector u follows a spatial autoregressive process of the form:

u = ρ_Err Mu + ε,

where ρ_Err is a scalar parameter generally referred to as the spatial autoregressive parameter, M is an n × n spatial weighting matrix that may or may not be the same as W,10 and Mu is an n × 1 vector of observations on the spatially lagged vector of residuals.
An alternative, more compact way to express the same model is:

y = Zδ + u,    u = ρ_Err Mu + ε,

where Z = [Y, X, Wy] is the set of all (endogenous and exogenous) explanatory variables, and δ = [π′, β′, ρ_Lag]′ is the corresponding vector of parameters. Finally, the assumption on which the maximum likelihood relies is that ε ∼ N(0, σ²). In the GMM approach, the estimation theory is developed both under the assumption that the innovations ε are homoskedastic, and under heteroskedasticity of unknown form.
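The data generation process implied by the model (here without additional endogenous variables Y) can be sketched from its reduced form; the dimensions, parameter values, and random weights matrices below are illustrative assumptions only:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 50, 2
rho_lag, rho_err = 0.4, -0.3

# Random row-standardised weights; we take M = W for simplicity
A = rng.random((n, n))
np.fill_diagonal(A, 0)
W = A / A.sum(axis=1, keepdims=True)
M = W

X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, 2.0])
eps = rng.normal(size=n)

# u = (I - rho_err M)^{-1} eps ;  y = (I - rho_lag W)^{-1} (X beta + u)
I = np.eye(n)
u = np.linalg.solve(I - rho_err * M, eps)
y = np.linalg.solve(I - rho_lag * W, X @ beta + u)
print(y.shape)
```

Row-standardised weights keep the spectral radius of W at one, so autoregressive parameters strictly inside (-1, 1) make both inverses well defined here.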

Notation
Here we adopt the notation ρ_Lag for the spatial autoregressive parameter on the spatially lagged dependent variable y, and ρ_Err for the spatial autoregressive parameter on the spatially lagged residuals. In Ord (1975), ρ is used for both parameters, but subsequently two schools have developed, with Anselin (1988b) and LeSage and Pace (2009) (and many others) using ρ for the spatial autoregressive parameter on the spatially lagged dependent variable y, and λ for the spatial autoregressive parameter on the spatially lagged residuals. Kelejian and Prucha (1998, 1999) (and many others, particularly in the econometrics literature) adopt the opposite notation, using λ for the spatial autoregressive parameter on the spatially lagged dependent variable y, and ρ for the spatial autoregressive parameter on the spatially lagged residuals. Our choice is an attempt to disambiguate in this comparison, not least because the different software implementations use one or other scheme, so inviting confusion.
The names used for models also vary between software implementations, so that the general model given in Equation 4 is also known as a SARAR model, with first order autoregressive processes in the dependent variable and the residuals. The same model is termed a SAC model in LeSage and Pace (2009) and the MATLAB Spatial Econometrics Toolbox. This model and all its derivatives are simultaneous autoregressive (SAR) models in the sense used in spatial statistics (see also Ripley 1981, p. 88), in contrast to conditional autoregressive (CAR) models often used in disease mapping and other application areas. Other issues in the naming of models will be mentioned in the next section.

Restrictions on the general model
The general model (Equation 4) may be restricted by setting π = 0 to remove the endogenous variables. All of the models considered when comparing maximum likelihood implementations, and many GMM implementations, impose this restriction. The spatial lag model is formed as a special case with ρ_Err = 0, and the spatial error model with ρ_Lag = 0. The spatial error model with no endogenous variables is:

y = Xβ + u,    u = ρ_Err Mu + ε.

This model is known as SEM (spatial error model) in LeSage and Pace (2009), and SARE (spatial autoregressive error model) in Drukker et al. (2011c).
The spatial lag model with no endogenous variables is:

y = ρ_Lag Wy + Xβ + ε.

This model is known as SAR (spatial autoregressive model) in both LeSage and Pace (2009) and Drukker et al. (2011c), but may be confused with the spatial statistics abbreviation SAR (simultaneous autoregressive model); all spatial econometric SAR models are spatial statistics SAR models, but the reverse is not true.
In fitting spatial lag models and by extension any model including the spatially lagged dependent variable, it has emerged over time that, unlike the spatial error model, the spatial dependence in the parameter ρ Lag feeds back, obliging analysts to base interpretation not on the fitted parameters β, but rather on correctly formulated impact measures, as discussed in references given on p. 2.
This feedback comes from the data generation process of the spatial lag model (and by extension of the general model). Rewriting:

y = (I − ρ_Lag W)^{-1}Xβ + (I − ρ_Lag W)^{-1}ε,

where I is the n × n identity matrix. This means that the expected impact of a unit change in an exogenous variable r for a single observation i on the dependent variable y_i is no longer equal to β_r, unless ρ_Lag = 0. The awkward n × n matrix term S_r(W) = ((I − ρ_Lag W)^{-1}Iβ_r) is needed to calculate impact measures for the spatial lag model, and similar forms for other models including the general model, when ρ_Lag ≠ 0.11
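The impact calculation implied by S_r(W) can be sketched directly; the circular contiguity matrix and the values of ρ_Lag and β_r below are illustrative assumptions, and the direct/indirect/total summaries follow the averaging conventions of LeSage and Pace (2009):

```python
import numpy as np

n = 5
rho_lag, beta_r = 0.5, 2.0

# Small row-standardised weights matrix (circular contiguity, for illustration)
C = np.zeros((n, n))
for i in range(n):
    C[i, (i - 1) % n] = C[i, (i + 1) % n] = 1
W = C / C.sum(axis=1, keepdims=True)

# S_r(W) = (I - rho_lag W)^{-1} beta_r
S = np.linalg.inv(np.eye(n) - rho_lag * W) * beta_r

direct = S.diagonal().mean()   # average own-region impact
total = S.sum(axis=1).mean()   # average total impact
indirect = total - direct      # average spillover impact
print(direct, indirect, total)
```

With row-standardised W the rows of (I − ρ_Lag W)^{-1} sum to 1/(1 − ρ_Lag), so the average total impact here is exactly β_r/(1 − ρ_Lag) = 4, of which the direct part exceeds β_r because of the feedback loops on the diagonal.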

Comparing GMM implementations
The estimation of the general SARAR(1,1) model consists of various steps. Given the simultaneous presence of the endogenous variables on the right hand side of Equation 4 and the spatially autocorrelated residuals, these steps will alternate IV and GMM estimators. These estimators will be based on a set of linear and quadratic moment conditions of the form:

E(n^{-1} H′ε) = 0,    E(n^{-1} ε′A_s ε) = 0,

where H is an n × p non-stochastic matrix of instruments, and A_s is an n × n weighting matrix whose specification will be introduced later. At this point, it is useful to introduce the spatial Cochrane-Orcutt transformation of the model, that is:

y_* = Z_*δ + ε,

where y_* = y − ρ_Err My and Z_* = Z − ρ_Err MZ.
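The transformation itself is a one-liner in matrix form; cochrane_orcutt is a hypothetical helper and the line-graph weights below are a toy assumption:

```python
import numpy as np

def cochrane_orcutt(y, Z, M, rho_err):
    # Spatial Cochrane-Orcutt transform:
    # y* = y - rho_err M y ,  Z* = Z - rho_err M Z
    return y - rho_err * (M @ y), Z - rho_err * (M @ Z)

# Toy data: 4 units on a line, row-standardised M
C = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
M = C / C.sum(axis=1, keepdims=True)
y = np.arange(4.0)
Z = np.column_stack([np.ones(4), y ** 2])
ys, Zs = cochrane_orcutt(y, Z, M, 0.4)
print(ys)
```

Setting rho_err to zero recovers the untransformed data, which is a convenient check when wiring the transform into an estimation loop.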
As a preview of the estimation steps, an initial IV estimator of δ leads to a set of consistent residuals. This vector of residuals constitutes the base for the derivation of the quadratic moment conditions that provide a first consistent estimate of the autoregressive parameter ρ_Err.12 An estimate of δ is obtained from the transformed model after replacing the true value of ρ_Err with its consistent estimate obtained in the previous step. Finally, in a new GM iteration, it is possible to obtain a consistent and efficient estimate of ρ_Err based on generalized spatial two stage least squares residuals. The asymptotic variance-covariance matrix for the coefficients is then calculated using the estimate of δ, the residuals, and the estimate of the spatial coefficient ρ_Err.

SARAR model
For the case of no additional endogenous variables other than the spatial lag, the "ideal" instruments should be expressed in terms of E(Wy). This is simply because the best instruments for the right hand side variables are the conditional means and, since X and MX are non-stochastic, we can simply focus on the spatial lags Wy and MWy (see Equation 11).
Given the reduced form of the model it follows that the best instruments can be expressed in terms of E(Wy) = W(I − ρ_Lag W)^{-1}Xβ (Lee 2003, 2007; Kelejian, Prucha, and Yuzefovich 2004; Das, Kelejian, and Prucha 2003). Given that the roots of ρ_Lag W are less than one in absolute value, Kelejian and Prucha (1999) suggested generating an approximation to the best instruments (say H) as the subset of the linearly independent columns of

[X, WX, ..., W^q X, MX, MWX, ..., MW^q X],

where q is a pre-selected finite constant and is generally set to 2 in applied studies.13 The inclusion of instruments involving M in the instrument matrix H is only needed for the formulation of instrumental variable estimators applied to the spatially Cochrane-Orcutt transformed model. At a minimum the instruments should include the linearly independent columns of X and MX. In a more general setting where additional endogenous variables are present, since the system determining y and Y is not completely specified, the optimal instruments are not known. In this case it may be reasonable to use a set of instruments as above where X is augmented by other exogenous variables that are expected to determine Y. Unless additional information is available, it is not recommended to take the spatial lag of these additional exogenous variables.
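A sketch of building such an instrument matrix, under the assumption that duplicate columns (e.g., a lagged intercept under row standardisation) are excluded beforehand; build_instruments is a hypothetical helper, not a function from any of the compared packages:

```python
import numpy as np

def build_instruments(X, W, M=None, q=2):
    # Approximation to the best instruments: columns of X, WX, ..., W^q X,
    # plus their M-lags when a distinct M enters the error process.
    blocks, Xl = [X], X
    for _ in range(q):
        Xl = W @ Xl
        blocks.append(Xl)
    if M is not None:
        blocks += [M @ B for B in list(blocks)]
    return np.column_stack(blocks)

# Toy check: n = 10, k = 2 (no intercept column, to avoid duplicates)
rng = np.random.default_rng(1)
A = rng.random((10, 10))
np.fill_diagonal(A, 0)
W = A / A.sum(axis=1, keepdims=True)
X = rng.normal(size=(10, 2))
H = build_instruments(X, W, q=2)
print(H.shape)
```

In practice one would also drop linearly dependent columns before use, e.g., via a rank-revealing decomposition of H.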
The starting point for the estimation of ρ_Err are the two following quadratic moment conditions expressed as functions of the innovations ε:

E(n^{-1} ε′A_s ε) = 0,    s = 1, 2,

where the matrices A_s are such that tr(A_s) = 0. Furthermore, under heteroskedasticity it is also assumed that the diagonal elements of the matrices A_s are zero. The reason for this is that it simplifies the formulae for the variance-covariance matrix (i.e., it then depends only on second moments of the innovations and not on third and fourth moments). Specific suggestions for A_s are given below. In general, such choices will depend on whether or not the model assumes heteroskedasticity.
As we already mentioned, the actual estimation procedure consists of several steps, which we will review next. Our review of the estimation procedure will be intentionally brief, as we refer the interested reader to more specific literature (Kelejian and Prucha 2010; Drukker et al. 2013; Arraiz et al. 2010; Drukker et al. 2011b).
Step 1.1 - 2SLS estimator. The initial set of estimates is obtained from the untransformed model using the matrix of instruments H, yielding:

δ̃ = (Ẑ′Ẑ)^{-1}Ẑ′y,

where Ẑ = PZ = [X, Ŵy], Ŵy = PWy, P = H(H′H)^{-1}H′ and H was specified above.
The estimate δ̃ can then be used to obtain an initial estimate of the regression residuals, say ũ. The GMM estimator of the next step is based on the assumption that these regression residuals are consistent.
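Step 1.1 can be sketched with plain linear algebra; s2sls is a hypothetical helper, and the sanity check uses exogenous regressors with H = Z so that 2SLS collapses to OLS:

```python
import numpy as np

def s2sls(y, Z, H):
    # delta_tilde = (Zhat' Zhat)^{-1} Zhat' y ,
    # with Zhat = P Z and P = H (H'H)^{-1} H'
    P = H @ np.linalg.solve(H.T @ H, H.T)
    Zh = P @ Z
    delta = np.linalg.solve(Zh.T @ Zh, Zh.T @ y)
    resid = y - Z @ delta
    return delta, resid

# Sanity check: with exogenous Z and H = Z, 2SLS reduces to OLS
rng = np.random.default_rng(0)
Z = np.column_stack([np.ones(100), rng.normal(size=100)])
beta = np.array([1.0, 2.0])
y = Z @ beta + 0.01 * rng.normal(size=100)
delta, resid = s2sls(y, Z, Z)
print(np.round(delta, 2))
```

In the actual first step, Z would contain the spatial lag Wy (and any additional endogenous variables) and H the instrument set discussed above.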
Step 1.2 - Initial GMM estimator. Let δ̃ be the initial estimator of δ obtained from Step 1.1, let ũ = y − Zδ̃, and ū = Mũ. Then, we can operationalize the sample moments corresponding to Equation 14, that is:

m(δ̃, ρ_Err) = n^{-1} [(ũ − ρ_Err ū)′A_1(ũ − ρ_Err ū), (ũ − ρ_Err ū)′A_2(ũ − ρ_Err ū)]′.

It is convenient to rewrite m(δ̃, ρ_Err) as an explicit function of the parameters ρ_Err and ρ_Err². Therefore, the initial GMM estimator for ρ_Err is obtained by simply minimizing the following relation:

ρ̃_Err = argmin [m(δ̃, ρ_Err)′m(δ̃, ρ_Err)].

The expression above can be interpreted as a nonlinear least squares system of equations. The initial estimate is thus obtained as a solution of the above system.
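A sketch of this initial step, with a simple grid search standing in for the packages' nonlinear least squares optimisers and the heteroskedasticity-robust choices A_1 = M′M − diag(M′M), A_2 = M; initial_gmm_rho is a hypothetical helper and the simulated disturbances assume ρ_Err = 0.4:

```python
import numpy as np

def initial_gmm_rho(u, M, grid=None):
    # Initial GMM estimate of rho_Err: minimise m(rho)'m(rho) over a grid.
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 397)
    n = u.shape[0]
    MtM = M.T @ M
    A1 = MtM - np.diag(np.diag(MtM))  # zero-diagonal weighting matrix
    A2 = M
    ub = M @ u

    def obj(rho):
        e = u - rho * ub
        m = np.array([e @ A1 @ e, e @ A2 @ e]) / n
        return float(m @ m)

    vals = [obj(r) for r in grid]
    return float(grid[int(np.argmin(vals))]), obj

# Simulate SAR(1) disturbances and recover rho_Err roughly
rng = np.random.default_rng(7)
n = 200
A = rng.random((n, n))
np.fill_diagonal(A, 0)
M = A / A.sum(axis=1, keepdims=True)
eps = rng.normal(size=n)
u = np.linalg.solve(np.eye(n) - 0.4 * M, eps)
rho_hat, obj = initial_gmm_rho(u, M)
print(rho_hat)
```

The grid search makes the objective's nonlinearity in ρ_Err explicit; the compared packages instead solve the equivalent nonlinear least squares problem with the optimisers discussed earlier.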
We finally need an expression for the matrices A_s. Drukker et al. (2013) suggest, for the homoskedastic case, the following expressions:14

A_1 = M′M − n^{-1}tr(M′M)I,

and

A_2 = M.

On the other hand, when heteroskedasticity is assumed, Kelejian and Prucha (2010) recommend the following expressions for A_1 and A_2:

A_1 = M′M − diag(M′M),

and

A_2 = M.

Step 2.1 - GS2SLS estimator. Using the estimate of ρ_Err obtained in Step 1.2, one can take a Cochrane-Orcutt transformation of the model as in Equation 11 and estimate it again by 2SLS. The entire procedure is known in the literature as generalized spatial two stage least squares (GS2SLS):

δ̂ = (Ẑ_*′Ẑ_*)^{-1}Ẑ_*′y_*,

where Ẑ_* = PZ_*, Z_* = Z − ρ̃_Err MZ, y_* = y − ρ̃_Err My and P = H(H′H)^{-1}H′.
Step 2.2 - Consistent and efficient GMM estimator. In the second sub-step of the second step, an efficient GMM estimator of ρ_Err based on the GS2SLS residuals is obtained by minimizing the following expression:

ρ̂_Err = argmin [m(δ̂, ρ_Err)′(Ψ̃_{ρ_Err ρ_Err})^{-1}m(δ̂, ρ_Err)],

where Ψ̃_{ρ_Err ρ_Err} is an estimator for the variance-covariance matrix of the (normalized) sample moment vector based on GS2SLS residuals. This estimator differs for the cases of homoskedastic and heteroskedastic errors.
For the homoskedastic case, the r, s (with r, s = 1, 2) element of Ψ̃_{ρ_Err ρ_Err} is given in Drukker et al. (2013), and involves terms of the form ã_r = Tα̃_r, with T = HP̃. For the heteroskedastic case, the r, s (with r, s = 1, 2) element of Ψ̃_{ρ_Err ρ_Err} is given by:

Ψ̃_{rs} = (2n)^{-1}tr[(A_r + A_r′)Σ̃(A_s + A_s′)Σ̃] + n^{-1}ã_r′Σ̃ã_s,

where Σ̃ is a diagonal matrix whose ith diagonal element is ε̃_i², and ε̃ and ã_r are defined as above.16 Having completed the previous steps, a consistent estimator for the asymptotic VC matrix Ω can be derived; the estimator is given by nΩ̃. Some of the elements of the VC matrix were defined before. Once more, the estimators for Ψ_{δδ}(ρ_Err) and Ψ_{δρ_Err}(ρ_Err) are different for the homoskedastic and the heteroskedastic cases; the corresponding expressions are given in Drukker et al. (2013) and Kelejian and Prucha (2010), respectively.

Homoskedasticity with and without additional endogenous variables
There are various implementations of the GMM general model (under homoskedasticity) that are presented in Table 3.17 Some of them are based on the Kelejian and Prucha (1999) moment conditions, the others on the Drukker et al. (2013) moment conditions. The results for all of them are reported in Table 3. Specifically, the results based on the Kelejian and Prucha (1999) moment conditions correspond to columns one, four, and six. A glance at these columns reveals that, while the estimated coefficients obtained from the function gstsls in R and PySAL GM_Combo are identical (up to the sixth digit), those obtained from the Spatial Econometrics Toolbox function sac_gmm are slightly different. The discrepancies with sac_gmm in MATLAB are related to different sets of instruments. The SE Toolbox uses two different sets of instruments: one for estimating the "original" model, one for estimating the same model after the Cochrane-Orcutt transformation. In the first step of the procedure, the matrix of instruments contains all of the linearly independent columns of X, WX, and W²X and it is the same for all three implementations. However, to estimate the transformed model, the matrix of instruments employed by MATLAB includes, along with the intercept (untransformed), the transformed exogenous variables (say X_*), and the spatial lags of them (say, WX_* and W²X_*).18 The R and PySAL implementations use X (which may or may not include an intercept), and then spatial lags of X. Finally, there are also differences in reported standard errors between the three implementations. These differences pertain to a degrees of freedom correction in the variance-covariance matrix operated in R and MATLAB. However, the same correction is available as an option from PySAL.19 It should also be noted that of the three available implementations, the SE Toolbox is the only one that produces inference for ρ_Err.
Let us turn now to the results shown in columns two, three, and five. From Table 3 it can be observed that, apart from a different decimal for the intercept calculated in Stata, all implementations otherwise match exactly. The only major differentiation among the three implementations is the possibility of setting a different matrix A_1 in PySAL. As noted in Anselin (2013), there may be a problem with one of the sub-matrices (i.e., Ω_{δρ_Err}) of the variance-covariance matrix of the estimated coefficients. The standard result that the variance-covariance matrix must be block-diagonal between the model coefficients and the error parameter may be invalidated by certain choices of A_1 (e.g., the one used by Drukker et al. 2013). The options for choosing an alternative A_1 in PySAL are meant to avoid this problem.
As in Drukker et al. (2011b), the set of explanatory variables includes police. Undoubtedly, the size of the police force may be related with the arrest rates (dui). As a consequence, police is treated as an endogenous variable. Drukker et al. (2011b) also assume that the dummy variable elect (where elect is 1 if a county government faces an election, 0 otherwise) constitutes a valid instrument for police. The estimate of ρ_Lag is positive and significant, and Drukker et al. (2011b) try to find some possible explanation for this. On the one hand, the positive coefficient may be explained in terms of coordination effort among police departments in different counties. On the other hand, it might well be that an enforcement effort in one of the counties leads people that live close to the border to drink in neighboring counties. The estimate of ρ_Err is also significant but negative. As we will see later, those results are also in line with the evidence from the maximum likelihood estimation of the model.
One thing should be noted here. When we discussed the instrument matrix, we anticipated that when additional endogenous variables are present (as is the case here), the optimal instruments are unknown. We also anticipated that, unless additional information was available, we did not recommend including the spatial lag of these additional exogenous variables in the matrix of instruments. However, results reported in Table 4 do consider the spatial lags of elect. This is because, unlike R and PySAL, there is no option in Stata not to include them. For comparison purposes then, the default values of the corresponding arguments in R and PySAL were set to include those spatial lags.

Heteroskedasticity with and without additional endogenous variables
When the errors are assumed to be heteroskedastic of unknown form, the results are those presented in Table 5 (when no additional endogeneity is assumed) and Table 6 (when it is). When the endogeneity of the police variable is taken into account, the implementations in R and PySAL are again identical, and Stata presents the same minor differences (see Table 6).

W and M are different
As we anticipated, W and M need not be the same in all applications. Results for this case are presented in this section and, since PySAL does not include this feature, are limited to the implementations in R and Stata. From an implementation perspective, the main complication in this case is the definition of the instrument matrix. While if W and M are identical the instruments need to be specified only in terms of W, when the two spatial weighting matrices are different the instruments should combine them as in Equation 13.
Table 7 summarizes the estimation results when police is treated as an endogenous variable and W and M are not the same.21 Since the endogeneity of the police variable is accounted for, the default value to compute the lagged "additional" instruments (i.e., lag.instr) was changed in R. The results from the two implementations are almost identical except for a difference in the last decimal digit for the estimate of ρ_Err. Most likely, this is again due to small differences in the optimization routines adopted by the two environments. When the residuals are assumed to be heteroskedastic, similar results are observed, and, therefore, we decided not to report them in the paper.

Spatial lag model
The spatial lag model (i.e., ρ_Err = 0) is estimated by 2SLS, using the matrix of instruments H described above to estimate the model.
When homoskedasticity is assumed, the variance-covariance matrix can be estimated consistently by

σ̃²(Ẑ′Ẑ)^{-1},    (23)

where σ̃² = n^{-1}Σ_{i=1}^{n} ũ_i² and ũ = y − Zδ̃. On the other hand, when heteroskedasticity is assumed, the estimator of the coefficients' variance-covariance matrix assumes the following sandwich form:

(Ẑ′Ẑ)^{-1}Ẑ′Σ̃Ẑ(Ẑ′Ẑ)^{-1},

where Σ̃ is a diagonal matrix whose elements are the squared residuals.
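Both variance estimators can be sketched in a few lines; lag_model_vcov is a hypothetical helper, and the constant-magnitude residuals in the check below make the two estimators coincide by construction:

```python
import numpy as np

def lag_model_vcov(Zh, resid, hetero=False, df_correct=False):
    # Homoskedastic: sigma2 (Zh'Zh)^{-1}, with sigma2 = sum(resid^2)/n
    # (or /(n - k) with the degrees of freedom correction used by some
    # implementations). Heteroskedastic: the White-type sandwich
    # (Zh'Zh)^{-1} Zh' Sigma Zh (Zh'Zh)^{-1}, Sigma = diag(resid^2).
    n, k = Zh.shape
    bread = np.linalg.inv(Zh.T @ Zh)
    if hetero:
        meat = Zh.T @ (resid[:, None] ** 2 * Zh)
        return bread @ meat @ bread
    sigma2 = resid @ resid / (n - k if df_correct else n)
    return sigma2 * bread

# With constant-magnitude residuals the two estimators coincide
rng = np.random.default_rng(3)
Zh = np.column_stack([np.ones(30), rng.normal(size=30)])
resid = np.full(30, 0.5)
V_hom = lag_model_vcov(Zh, resid)
V_het = lag_model_vcov(Zh, resid, hetero=True)
print(V_hom.shape)
```

The df_correct switch mirrors the n versus n − k divisor difference between the implementations discussed below.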

Homoskedasticity with and without additional endogenous variables
From Table 8, given that we are considering the same matrix of instruments, the coefficient values of all implementations agree exactly. This had to be expected given that the OLS results also matched.
There are differences, though, in the reported standard errors. In the two R and SE Toolbox functions, the error variance is calculated with a degrees of freedom correction (i.e., dividing by n − k), while in the other two implementations it is simply divided by n.
Table 9 shows the results for the lag model when police is treated as endogenous. While all the coefficients match, differences in the standard errors are produced by the same degrees of freedom correction.

Heteroskedasticity with and without additional endogenous variables
Apart from the MATLAB SE Toolbox, all other implementations (including the two R functions stsls and spreg) allow the estimation of the lag model under heteroskedastic innovations. Of course, the estimated coefficients are not different from the homoskedastic case, and the only variation is in the standard errors. However, the standard errors under heteroskedasticity are equal across the four implementations, and, therefore, we are not reporting them in the paper.22

Spatial error model
In the spatial error model (i.e., ρ Lag = 0), we can still use most of the formulas above. The first step of the estimation procedure is either OLS (when π = 0) or IV (when π ≠ 0, that is, when there are endogenous variables other than the spatial lag in the model). After estimating ρ Err in the GMM step, we can then apply the spatial Cochrane-Orcutt transformation. The resulting model can then be estimated by two stage least squares using the matrix of instruments H, where H is made up of, at least, the linearly independent columns of X and MX.
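The transformation step can be sketched as follows. The circular weights matrix, the true parameters, and the assumption that ρ Err is already known are all illustrative simplifications: in practice ρ Err comes from the GMM step, and 2SLS with H = [X, MX] replaces plain least squares when endogenous variables are present.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
# Illustrative row-standardised "circular world" weights matrix M
# (an assumption, not the DUI data set's weights)
M = np.zeros((n, n))
for i in range(n):
    M[i, (i - 1) % n] = M[i, (i + 1) % n] = 0.5

X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
rho_err = 0.4   # assumed known here; estimated in the GMM step in practice
u = np.linalg.solve(np.eye(n) - rho_err * M, rng.normal(size=n))
y = X @ beta_true + u

# Spatial Cochrane-Orcutt transform: remove spatial dependence in the errors
y_star = y - rho_err * (M @ y)
X_star = X - rho_err * (M @ X)

# With only exogenous regressors, the transformed model can be fit by
# least squares; 2SLS with H = [X, MX] gives the same point estimates
beta_hat, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
```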

Homoskedasticity with and without additional endogenous variables
Of the six implementations in Table 10, three are based on the Kelejian and Prucha (1999) moment conditions (columns one, four, and six); the others are based on the Drukker et al. (2013) moment conditions (columns two, three, and five). Let us focus first on the results obtained using the Kelejian and Prucha (1999) moment conditions. All three implementations produce identical estimated coefficients. There are differences in terms of the standard errors: while GMerrorsar and sem_gmm produce an estimate for the standard error of the spatial coefficient, the GM_Error function in PySAL does not. The reason is that in their original contribution Kelejian and Prucha (1999) do not derive the joint asymptotic distribution of the parameters of the model. Moving now to the three implementations that are based on the Drukker et al. (2013) moment conditions, we observe that, while Stata and spreg (available from R) present exactly the same results, some distinctions are observed in PySAL. The reasons for these differences were anticipated before. In a spatial error model, the second term in Equation 22 tends to zero (when there are only exogenous variables in the model). The implementations in R and Stata produce an estimate of this term, while PySAL does not. However, as can be appreciated from the figures in Table 10, the differences (with this specific dataset) are very minor.

Table 11 displays the results produced by three implementations when police is instrumented; R, Stata, and PySAL are the only implementations in which this feature is available. A glance at the table reveals differences across implementations. The differences between R and Stata are very minor and can be attributed to differences in optimization routines. The differences with PySAL seem to stem from a different specification of the instrument matrix: most likely, PySAL omits the spatial lags of the exogenous variables (i.e., MX), only including the additional instrument (elect) and its spatial lag.

Heteroskedasticity with and without additional endogenous variables
The same evidence is confirmed when heteroskedasticity is assumed and, therefore, results are not reported in the paper but are available from the authors. One final remark: in our theoretical review of the general spatial model we anticipated the possibility of obtaining a consistent and efficient estimator of ρ Err based on 2SLS rather than GS2SLS.

HAC estimation in a spatial framework
Lastly, we review a slightly different form of the model, based on the assumption that the error term follows u = Rε, where R is an n × n unknown non-stochastic matrix and ε is a vector of innovations. The asymptotic distribution of the corresponding IV estimators involves the VC matrix Ψ = n⁻¹ H'ΩH, where Ω = RR' denotes the VC matrix of the disturbances. Kelejian and Prucha (2007) suggest estimating the individual r, s elements of Ψ as

ψ̂_rs = n⁻¹ Σᵢ Σⱼ h_ir h_js ε̂ᵢ ε̂ⱼ K(d*_ij / d),

where the subscripts refer to the elements of the matrix of instruments H and the vector of estimated residuals ε̂. The kernel function K(·) is defined in terms of the distance measure d*_ij, the distance between observations i and j. The bandwidth d is such that if d*_ij ≥ d, the associated kernel is set to zero (K(d*_ij / d) = 0). In other words, the bandwidth together with the kernel function limits the number of sample covariances entering the estimate.
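A compact sketch of this SHAC estimator follows; the triangular kernel, the synthetic instruments, residuals, and point locations are all illustrative assumptions, not the DUI data:

```python
import numpy as np

def shac_psi(H, eps_hat, D, d):
    """Sketch of a Kelejian-Prucha style SHAC estimate of Psi.
    H: n x p instrument matrix; eps_hat: n estimated residuals;
    D: n x n matrix of distances d*_ij; d: bandwidth."""
    # Triangular kernel, set to zero once d*_ij >= d
    K = np.where(D < d, 1.0 - D / d, 0.0)
    n = H.shape[0]
    # psi_rs = n^-1 * sum_i sum_j h_ir * h_js * eps_i * eps_j * K(d*_ij / d)
    He = H * eps_hat[:, None]
    return He.T @ K @ He / n

rng = np.random.default_rng(1)
n, p = 30, 2
H = rng.normal(size=(n, p))
eps_hat = rng.normal(size=n)
pts = rng.uniform(size=(n, 2))                 # illustrative point locations
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
Psi = shac_psi(H, eps_hat, D, d=0.3)
```

Because the kernel matrix K is symmetric, the resulting Ψ̂ is symmetric by construction, and a small bandwidth zeroes out most of the n² sample covariances.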
Based on Equation 27, the asymptotic variance covariance matrix (Φ̂) of the S2SLS estimator of the parameter vector can then be obtained. Piras (2010) mentioned that the implementation was compared with code used by Anselin and Lozano-Gracia (2008), and that the two codes gave the same results; however, at the time Piras (2010) was published, no other public implementation was available.

Table 13: HAC estimation when police is treated as endogenous. Results for two implementations. DUI data set (standard errors in parentheses).
When there are no additional endogenous variables other than the spatial lag, the results from the two implementations match well and are omitted. Some interesting differences are observed when police is treated as endogenous: in this case, while the default in R is not to take the lags of elect, PySAL will include these lags in the matrix of instruments. This is reflected in the results shown in Table 13. The kernel weights were built as follows: after reading the coordinates for each location, we used the function knearneigh to generate the distances and, after some transformations between objects, the final kernel using the R function read.gwt2dist. In PySAL, the function pysal.kernelW_from_shapefile was used to read the shape file directly and generate the kernel. In both cases, the centroids of the counties were used as point locations, but the distances for the nearest neighbour procedure were calculated as though the coordinates were projected, while they are in fact geographical. There are many available options for the kernel both in R and in PySAL.

Comparing maximum likelihood estimation
Having compared implementations of GMM estimators, we move to cross-section maximum likelihood (ML) estimation, recalling that ML estimation for spatial panel models was compared for MATLAB and R implementations in Millo and Piras (2012). Since Python PySAL has no ML implementations, it will not be considered. None of the ML implementations makes provision for instrumenting endogenous right hand side variables, nor for accommodating heteroskedasticity; these are provided for in Bayesian implementations in the Spatial Econometrics toolbox.
In Section 1.1 (p. 4), we described the numerical optimisers used in the various applications. In many cases, the numerical optimisation functions can return numerical Hessians for use as estimators of the covariance matrix of model coefficients, which may be used instead of analytical, asymptotic covariance matrices. In other cases, the numerical Hessian may be found by examining the form of the function being optimised around the optimum, for example using finite-difference Hessian algorithms. In implementations in the MATLAB Spatial Econometrics Toolbox, use is made of fdhess, taken from the CompEcon Toolbox by Paul Fackler, with the relative step size hard-coded to 1.0 × 10^−8 in sar, sdm and sac, but to 1.0 × 10^−5 in sem, as in the original CompEcon function. In R, use is made of fdHess from the nlme package, with its default relative step size of 6.055 × 10^−6 used without modification. In these comparisons in R, we will usually use analytical, asymptotic covariance matrices, but numerical Hessians are sometimes used for comparison.
An extensive treatment of maximum likelihood estimation for spatial models is given by LeSage and Pace (2009, pp. 45-75).Further discussion of the general model, and issues arising in fitting two parameters when the surface of the function shows little structure may be found in Bivand (2012).Here we compare implementations in MATLAB, Stata and R, looking first at spatial lag models, from which some general conclusions are drawn about the implementations, before going on to the spatial error model and the general two-parameter spatial model.

Spatial lag model
The log-likelihood function for the spatial lag model is

ℓ(ρ Lag, β, σ²) = −(n/2) ln(2πσ²) + ln|I − ρ Lag W| − (1/(2σ²)) (y − ρ Lag Wy − Xβ)'(y − ρ Lag Wy − Xβ),

and by extension the same framework is used for the spatial Durbin model when [X, WX] are grouped together. Since β can be expressed as (X'X)⁻¹X'(I − ρ Lag W)y, all of the cross-product terms can be pre-computed as cross-products of the residuals of two ancillary regressions, y = Xβ₁ + ε₁ and Wy = Xβ₂ + ε₂, and the sum of squares term can be calculated much faster than the log-determinant (Jacobian) term of the n × n sparse matrix I − ρ Lag W; see LeSage and Pace (2009) for details. The legacy method for computing the log-determinant term is to use the eigenvalues of W:

ln|I − ρW| = Σᵢ₌₁ⁿ ln(1 − ρζᵢ),

using ρ to represent either parameter, and where ζᵢ are the eigenvalues of W (Ord 1975, p. 121); other methods are reviewed in Bivand, Hauke, and Kossowski (2013).
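The eigenvalue approach can be sketched in a few lines: the eigenvalues are computed once and then reused for every proposed ρ during the line search, while a direct determinant would have to be recomputed each time. The random row-standardised weights matrix below is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)   # illustrative row-standardised weights

zeta = np.linalg.eigvals(W)            # computed once, reused for every rho
rho = 0.25

# ln|I - rho*W| = sum_i ln(1 - rho*zeta_i); the imaginary parts of the
# complex conjugate eigenvalue pairs cancel in the sum
logdet_eig = np.sum(np.log(1.0 - rho * zeta)).real
logdet_direct = np.linalg.slogdet(np.eye(n) - rho * W)[1]
```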
One discrepancy that we can account for before presenting any further results is that the log-likelihood values at the optimum differ between implementations: −1551.08 in the R McSpatial sarml function and a similar value in the SE toolbox sar function (there is a small difference because the SE toolbox optimiser differs somewhat, as discussed below), and −2628.58 in the other two, R spdep lagsarlm and Stata spreg ml.
The reason appears to be that π in the log-likelihood calculation is not multiplied by 2 in the first two cases, but is in the second two. From Table 14, we see that the coefficient estimates of the R lagsarlm and Stata spreg ml implementations agree exactly for the chosen number of digits shown. The R sarml implementation differs slightly in the coefficient estimates for the intercept and for ρ Lag, but uses a different numerical optimiser. All three optimise the same objective function, and reach the same optimum given the stopping value used by the optimiser. These three also used eigenvalues to compute the log-determinant values; Stata spreg ml and R sarml only use eigenvalues, while R lagsarlm offers a selection of methods (Bivand et al. 2013), but here used eigenvalues. We will return below to differences in standard errors after explaining why the SE toolbox sar function yields different coefficient estimates.
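A quick arithmetic check supports this explanation. If one pair of implementations uses π where the other uses 2π in the Gaussian log-likelihood, the reported optima differ by exactly (n/2) ln 2; taking n = 3109 (the number of US counties, an assumption consistent with the observed gap) reproduces the difference between the two reported values:

```python
import math

n = 3109
shift = (n / 2.0) * math.log(2.0)   # about 1077.5

ll_2pi = -2628.58                   # R spdep lagsarlm and Stata spreg ml
ll_pi = ll_2pi + shift              # the McSpatial / SE toolbox convention
```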
The SE toolbox uses a pre-computed grid of log determinant values, choosing the nearest value of the log determinant from the grid rather than computing it exactly for the current proposed value of ρ Lag at each call to the log-likelihood function. In order to investigate the consequences of this design choice, which has its motivation in picking from a grid of ρ Lag and log-determinant values during Bayesian estimation, additions were edited into the sar function to return internal values so that they could be examined; Figure 1 shows the captured log-determinant values over successive function calls for the gridded LU, spline LU, and eigenvalue methods. The method used by default for info.lflag= 0 in sar_lndet is to call lndetfull, which creates a coarse grid of log-determinant values for ln|I − ρ Lag W| for ρ Lag at 0.01 steps over a given range using the sparse matrix LU method. These log-determinant values are then spline-interpolated to a finer grid at 0.001 steps of ρ Lag in sar_lndet.
In info.lflag= 3, the spline-interpolation was changed to return a fitted spline object, which could then be used in the objective function f_sar to interpolate using ppval for the current proposed value of ρ Lag , so returning a much closer log-determinant value than that available from the look-up table.Finally, info.lflag= 5 was added to provide the standard eigenvalue method, calculating the log-determinant for the current proposed value of ρ Lag .
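The difference between snapping to a grid slot and interpolating at the proposed ρ can be illustrated with a small numpy sketch; the random weights matrix is an assumption, and np.interp (linear) stands in for the toolbox's cubic spline:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)   # illustrative row-standardised weights

# Coarse grid of exact log-determinants at 0.01 steps (an lndetfull analogue)
coarse = np.arange(-0.90, 0.905, 0.01)
ld_coarse = np.array([np.linalg.slogdet(np.eye(n) - r * W)[1] for r in coarse])

# Interpolating at the current proposed rho avoids snapping to the nearest
# grid slot; np.interp is linear, standing in for the toolbox's cubic spline
rho = 0.1234
ld_interp = float(np.interp(rho, coarse, ld_coarse))
ld_exact = np.linalg.slogdet(np.eye(n) - rho * W)[1]
```

Because the log-determinant is smooth in ρ, even linear interpolation over a 0.01 grid is close to the exact value, while choosing the nearest grid slot leaves the objective function inconsistent with the sum-of-squares term, as described next.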
Figure 1 shows the behaviour of the optimiser for info.lflag taking values of 0, the gridded LU log-determinant values (gridded LU in Table 15); 3, values interpolated from a spline object fitted to a coarse grid of LU log-determinant values (spline LU in Table 15); and 5, values calculated using the eigenvalues of W (eigen/asy in Table 15). In the info.lflag= 0 case, behaviour is further inhibited by rounding effects in choosing values from the look-up table, and by the inconsistency caused by the sum-of-squared-error term in the objective function being calculated using the current proposed value of ρ Lag, while the log-determinant value is for the selected grid slot. In this default scenario, the value of the log determinant oscillates between two values of ρ Lag spanning the optimum, and the line search terminates after a series of jumps, with a tighter tolerance passed as info.convg of 1.0 × 10^−8; the default tolerance set in sar is 1.0 × 10^−4, but using the tighter tolerance did not improve performance. When permitted to compute the log-determinant for the current proposed value of ρ Lag in the two alternative cases, the Brent line search function used in sar converges quickly. Table 15 shows that when the MATLAB SE Toolbox sar function uses eigenvalues to calculate the log-determinant, and is forced to return analytical, asymptotic coefficient standard errors (right column), the output agrees exactly with that of R lagsarlm using eigenvalues and returning analytical, asymptotic coefficient standard errors (first column). The coefficient estimates for the interpolated LU log-determinants in column three also agree, as the line search has estimated ρ Lag at the same value. Here, however, the standard errors are calculated using the numerical Hessian estimated at the optimum; sar by default switches from asymptotic to numerical Hessian standard errors at n > 500, but the criterion was changed to 4000 in the final column. The second column contains the results shown
above in Table 14, final column. We have thus accounted for the differences in coefficient estimation results between the MATLAB SE Toolbox sar function and the other implementations, and proceed to differences in standard errors. The standard errors reported by R sarlm are taken from the Hessian returned by the optimization function nlm. Stata spreg ml by default uses a modified Newton-Raphson method, nr, reporting standard errors taken from the Hessian returned by the optimization function rather than the analytical calculations, even for small n. Table 16 compares the analytical, asymptotic standard errors in the third column with those returned using numerical Hessian values. The differences that we observe can be explained by these two different approaches: either analytical standard errors calculated using asymptotic formulae, or standard errors calculated from the numerical Hessian.

Other ML estimators
The log-likelihood function for the spatial error model is

ℓ(ρ Err, β, σ²) = −(n/2) ln(2πσ²) + ln|I − ρ Err W| − (1/(2σ²)) (y − Xβ)'(I − ρ Err W)'(I − ρ Err W)(y − Xβ).

As we can see, the problem is one of balancing the log determinant term ln|I − ρ Err W| against the sum of squares term. When ρ Err approaches the ends of its feasible range, the log determinant term may swamp the sum of squares term.
β may be concentrated out of the sum of squared errors term, for example as

β = [X'(I − ρ Err W)'(I − ρ Err W)X]⁻¹ X'(I − ρ Err W)'(I − ρ Err W)y.

The relationship between the log-determinant term and the sum of squares term in the log-likelihood function of the spatial error model is analogous to that in the spatial lag model, but the sum of squares term involves more computation in the case of the spatial error model. In all cases, a simple line search may be used to find ρ Lag or ρ Err, and the other coefficients may be calculated using an ancillary regression once this has been done.
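The concentrated likelihood and line search can be sketched as follows; the circular weights matrix and the simulated data are illustrative assumptions, and a grid search stands in for the Brent-type line searches used by the actual implementations:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
W = np.zeros((n, n))                   # illustrative circular weights matrix
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, -0.5]) + np.linalg.solve(
    np.eye(n) - 0.3 * W, rng.normal(size=n))

def concentrated_ll(rho):
    A = np.eye(n) - rho * W
    Xs, ys = A @ X, A @ y
    beta = np.linalg.lstsq(Xs, ys, rcond=None)[0]   # beta concentrated out
    e = ys - Xs @ beta
    sigma2 = (e @ e) / n
    return (-(n / 2) * np.log(2 * np.pi * sigma2)
            + np.linalg.slogdet(A)[1]
            - (e @ e) / (2 * sigma2))

# A simple line search over the feasible range recovers rho_Err; the other
# coefficients then follow from the ancillary regression at the optimum
grid = np.linspace(-0.9, 0.9, 181)
rho_hat = grid[int(np.argmax([concentrated_ll(r) for r in grid]))]
```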
The general model is more demanding, and requires that ρ Lag and ρ Err be found by constrained numerical optimization in two dimensions, searching for the maximum on the surface of the log-likelihood function, which is like that of the spatial error model with additional terms in I − ρ Lag W (here assuming that the same spatial weights are used in both processes):

ℓ(ρ Lag, ρ Err, β, σ²) = −(n/2) ln(2πσ²) + ln|I − ρ Lag W| + ln|I − ρ Err W| − (1/(2σ²)) u'u, with u = (I − ρ Err W)((I − ρ Lag W)y − Xβ).

The tuning of the constrained numerical optimization function, including the provision of starting values, reasonable stopping criteria, and the choice of algorithm, may all affect the results achieved. As we can see from Table 17, the computed coefficients agree adequately for the spatial error model implementations in R and Stata. There are minor differences in the standard errors between R and Stata, because of the use of the numerical Hessian to calculate the standard errors in Stata. The SE toolbox estimates differ somewhat because of the use of gridded log determinant values, explained above for the spatial lag model case, and are not presented here.
The R spdep function sacsarlm can be used to estimate the general spatial model; results for sacsarlm and Stata spreg ml are given in the two right-hand columns of Table 17. Again, the coefficient estimates are in good agreement even given the flatness of the objective function seen in Figure 2, and the closeness of ρ Err to zero (optimum of −2628.6 at the values of (ρ Lag, ρ Err) given in Table 17, third column). This estimation success is assisted by the choice of starting values for numerical optimisation, without which outcomes may vary.

Implementing impact measures
In addition to the fitting of spatial econometric models, associated measures are needed to assist in their interpretation, in particular the impact of changes in right hand side variables in models including the spatially lagged dependent variable. In Section 1 above, we reviewed the need to calculate these emanating effects, which we will term impacts, through the cumbersome S_r(W) matrix, where r is the index of the right hand side variable. This matrix term may be computed analytically or approximated using traces of powers of the spatial weights matrix.
The average direct impacts are represented by the sum of the diagonal elements of the matrix divided by n for each exogenous variable; the average total impacts are the sum of all matrix elements divided by n for each exogenous variable; and the average indirect impacts are the differences between these two vectors of impacts.
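These definitions can be sketched directly with a dense S_r(W) for a spatial lag model; the weights matrix and the assumed fitted values of ρ Lag and β_r are illustrative, not from the DUI data set:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 30
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)   # illustrative row-standardised weights

rho_lag, beta_r = 0.4, 2.0             # assumed fitted values for variable r
# For the spatial lag model, S_r(W) = (I - rho_Lag * W)^-1 * beta_r
S_r = np.linalg.inv(np.eye(n) - rho_lag * W) * beta_r

direct = np.trace(S_r) / n     # mean of the diagonal elements
total = S_r.sum() / n          # mean of all elements
indirect = total - direct
```

For a row-standardised W the average total impact reduces to β_r / (1 − ρ Lag), which is why the total impacts in Table 18 exceed the fitted coefficients.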
In Stata, the average total impacts are available by predicting from the estimated model using the original data, assigning the result to a new variable. Choosing variable r, x_r is incremented by one, and a new prediction made, once again assigning the result to a new variable. The mean of the difference between the two predictions is then the required measure (Drukker et al. 2011c, pp. 10-15). For the spatial lag model estimated by maximum likelihood, and the police variable, the value is 0.625310; one may calculate average total impacts in Stata for all models including the spatially lagged dependent variable, irrespective of estimation method.
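This prediction-difference approach can be sketched as follows, again with illustrative weights and assumed fitted values; incrementing x_r by one and averaging the change in the reduced-form prediction recovers the same average total impact as the matrix-based calculation:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 30
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W = W / W.sum(axis=1, keepdims=True)   # illustrative weights, not the DUI set

X = np.column_stack([np.ones(n), rng.normal(size=n)])
rho_lag = 0.4
beta = np.array([1.0, 2.0])            # assumed fitted values
A_inv = np.linalg.inv(np.eye(n) - rho_lag * W)

pred0 = A_inv @ (X @ beta)   # reduced-form prediction at the original data
X1 = X.copy()
X1[:, 1] += 1.0              # increment x_r by one
pred1 = A_inv @ (X1 @ beta)

avg_total_impact = (pred1 - pred0).mean()
```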
             R direct   SE direct   R total    SE total   β
police       0.598350   0.598349    0.625310   0.625310   0.598157
nondui       0.000249   0.000249    0.000261   0.000261   0.000249
vehicles     0.015717   0.015717    0.016425   0.016425   0.015711
dry          0.106165   0.106165    0.110948   0.110948   0.106131

Table 18: Direct and total impacts calculated using W traces for spatial lag models estimated using maximum likelihood in R (lagsarlm) and the MATLAB Spatial Econometrics toolbox (SE; sar with local modifications, eigenvalue log determinants); fitted β coefficient values shown for comparison, DUI data set.
In spdep, impacts methods are available for ML and GM spatial lag and general spatial model objects. The methods can use either dense matrices or truncated series of traces, so the impacts for a single model fit may be examined using dense or sparse procedures, and using different methods for computing the traces. The same methods are available for estimation functions in the sphet package, including the spreg function. Similarly, the MATLAB Spatial Econometrics toolbox model estimation functions report impacts, in their original form as the mean values of simulations, to be presented below. Here the calculated impact values for the fitted values of the β coefficients are returned in addition. Table 18 shows the correspondence of the output values for the four variables in the DUI model, together with the fitted β coefficient values; in all cases, the total impacts are somewhat larger than these coefficients.
Estimated spatial models provide ways of making inferences about the significance of the right hand side variables. When the spatially lagged dependent variable is present, the fitted β coefficient values and their standard errors do not provide a satisfactory basis for inference if ρ Lag ≠ 0. This problem may be resolved by drawing samples from the estimated model, using a multivariate Normal distribution centred on the fitted values of [ρ Lag, β] with their covariance matrix, and then calculating the impacts for the sampled coefficients. As mentioned, the MATLAB Spatial Econometrics toolbox model estimation functions perform simulation by default, and report the impacts as the means of the simulated coefficient impact distributions. The simulations themselves may also be returned, and their distribution by variable analysed to permit inference. The R impacts methods for fitted model objects also permit simulation, returning the simulated impacts as mcmc objects defined in the coda package (they are Monte Carlo simulations, but because the Spatial Econometrics toolbox originally used MCMC simulations, the initial implementation followed that lead and used an mcmc object for output).
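The simulation step can be sketched as follows. The fitted values and their covariance matrix are assumed numbers, and the shortcut total impact β_r / (1 − ρ Lag), exact for a row-standardised W, replaces the trace-based computation performed by the actual implementations:

```python
import numpy as np

rng = np.random.default_rng(7)
# Assumed fitted values of [rho_Lag, beta_r] and an assumed covariance matrix
theta_hat = np.array([0.4, 2.0])
V = np.array([[0.0025, 0.0],
              [0.0,    0.01]])

draws = rng.multivariate_normal(theta_hat, V, size=2000)
# For a row-standardised W, the average total impact of variable r is exactly
# beta_r / (1 - rho_Lag), so each draw maps directly to a simulated impact
totals = draws[:, 1] / (1.0 - draws[:, 0])

lo, hi = np.percentile(totals, [2.5, 97.5])
```

The empirical distribution of the simulated impacts (here summarised by a 95% interval) is what supports inference when the fitted β values alone do not.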
Figure 3 shows that the distributions of direct and total impacts in both MATLAB and R for the variables nondui and dry are very similar, and that zero falls in the middle of the simulated distributions for nondui (the same conclusion as that seen in Table 14). For police and vehicles, the two implementations give very similar distributions, but those of the direct impacts overlap the total impacts distributions only in the lower half. The simulation results permit us to draw inferences from estimated spatial models including the spatially lagged dependent variable.

Concluding remarks
In conclusion, we can report that although there are some differences between the results yielded by the available software implementations of spatial econometrics estimation methods on the same data set, it has been possible to establish why these differences arise. Some differences relate to differing interpretations of the underlying literature, others to choices of techniques used in the implementations. Most of the methods proposed in the literature and considered here can be used in most applications, and in most cases will give the same or very similar results.
Fortunately, comparing functions in the MATLAB Spatial Econometrics toolbox, Python PySAL functions, and the R spdep and sphet packages is eased by the fact that the code is open source, and so open to scrutiny. We have also benefited from answers to questions given by developers of these implementations, and by developers of Stata spatial econometrics functions.
What remains is to encourage researchers who use these and other software applications to take an active part in discussion lists, where more experienced users can offer advice to those starting to discover the attractions of using spatial econometrics tools to tackle empirical economic questions. Once more real-world examples of the application of, for instance, impact measures have been published, the usefulness of such advances will become more evident.
Having multiple implementations in different application languages provides users with more choice and, as we have seen, constitutes a "reality check" that gives insight into the ways that formulae can be rendered into code.

Figure 1: Spatial Econometrics Toolbox log determinant artefacts during line search; comparison of the gridded method with two alternatives.

Figure 2: Contour plot values of the general spatial model log likelihood function for a 40 × 40 grid of values of (ρ Lag, ρ Err) showing values > −4000, DUI data set, calculated with the R spdep function sacsarlm.

Table 1: Descriptive statistics for the dependent variable and the non-dummy explanatory variables, simulated DUI data set.

Table 2 shows the stylised results from estimating the relationship between the simulated dependent variable and four explanatory variables using least squares (Powers and Wilson); the implementations compared are R lm, Stata reg, MATLAB SE ols, and Python PySAL OLS.

Table 2: OLS estimation results for four implementations, simulated DUI data set (standard errors in parentheses).

Three of the implementations are based on the Kelejian and Prucha (1999) moment conditions (i.e., the R function gstsls available from spdep, the Spatial Econometrics Toolbox function sac_gmm, and PySAL GM_Combo). The remaining implementations are based on the Drukker et al. (2013) moment conditions (i.e., the function spreg available from sphet, the Stata function spreg gs2sls, and PySAL GM_Combo_Hom).

Table 3: SARAR estimation results for six implementations, DUI data set (standard errors in parentheses).

Table 4 presents the results when this additional endogeneity is taken into account. Results from three different implementations are presented, namely the function spreg available from sphet under R, the function spreg available from Stata (setting the option to het), and the function GM_Combo_Het available from PySAL. A glance at Table 4 shows that all implementations give the same results. As pointed out in LeSage and Pace (2009), the estimated vector of parameters in a spatial model does not have the same interpretation as in a simple linear regression model. The estimate of ρ Lag is positive and significant, thus indicating spatial autocorrelation in the dependent variable. Drukker et al.

Table 4: SARAR estimation when police is treated as endogenous. Results for three implementations, DUI data set (standard errors in parentheses).

Results under heteroskedasticity are reported in Table 5 and in Table 6 (when police is treated as endogenous). From the observation of Table 5 it can be noticed that the implementations in R and PySAL are identical (up to the sixth decimal), and that Stata presents only very minor differences. Those differences relate to the value of the estimated ρ Err coefficient (obtained through the non-linear least squares algorithm), and to the standard error of the intercept.

Table 5: SARAR estimation with heteroskedasticity. Results for three implementations, DUI data set (standard errors in parentheses).

Table 6: SARAR estimation with heteroskedasticity when police is treated as endogenous. Results for three implementations, DUI data set (standard errors in parentheses).

Table 7: SARAR estimation when police is treated as endogenous and W and M are different. Results for two implementations, DUI data set (standard errors in parentheses).

Table 8 shows that there are five implementations of the spatial lag model available when considering homoskedastic innovations and no additional endogenous variables. The first two columns of the table report results obtained from R; multiple functions allowing the estimation of the spatial lag model are available from R in the spdep and sphet packages. stsls was the first function made available, as part of the package spdep; with the development of sphet, the wrapper function spreg also allows the estimation of the spatial lag model. The other three columns of Table 8 report the results obtained from Stata, PySAL, and the SE Toolbox, respectively.

Table 8: Lag model estimation results for five implementations. DUI data set (standard errors in parentheses).

Table 9: Lag model estimation when police is treated as endogenous. Results for three implementations. DUI data set (standard errors in parentheses).

Table 10: Error model estimation results for six implementations. DUI data set (standard errors in parentheses).

Table 10 shows the spatial error model estimation results for six different implementations. These implementations include two R functions (GMerrorsar available from spdep, and spreg available from sphet), one Stata function (spreg gs2sls), two different implementations based on PySAL (GM_Error and GM_Error_Hom), and, finally, the SE Toolbox for MATLAB (sem_gmm).

Table 11: Error model estimation when police is treated as endogenous. Results for three implementations. DUI data set (standard errors in parentheses).

Table 12 reports the results when considering this additional step (i.e., step 1.c). Stata does not present this feature. Results from R and PySAL are very similar, and again the differences must be due to the estimation of α.

Table 12: Error estimation results when police is treated as endogenous. Results for two implementations (step 1.c). DUI data set (standard errors in parentheses).

Here we compare standard error estimates using a triangular kernel with a variable bandwidth of the six nearest neighbours.

If we convert the R McSpatial value by subtracting (n/2) log(π) (line 65 in file McSpatial/R/sarml.R) and adding (n/2) log(2π), we get −2628.58. Similarly, correcting the SE toolbox value (line 453 of file spatial/sar models/sar.m), we get a value close to that of Stata and R spdep. The same kind of difference appears in other reported SE toolbox log-likelihood values.

Table 14: Maximum likelihood spatial lag model estimation results for four implementations, DUI data set (standard errors in parentheses).

Table 15: Maximum likelihood spatial lag model estimation results for alternative Spatial Econometrics Toolbox log-determinant implementations, DUI data set (standard errors in parentheses).

Table 16: Maximum likelihood spatial lag model standard error estimation results for alternative implementations, DUI data set.

The Stata implementation (Drukker et al. 2011c) uses a grid search for initial values of (ρ Lag, ρ Err); the Spatial Econometrics toolbox uses the generalized spatial two-stage least squares estimates, with the option of a user providing initial values; and the spdep implementation for row-standardised spatial weights matrices uses either four candidate pairs of initial values at 0.8(L, U), (0, 0), 0.8(U, U), and 0.8(U, L), where L and U are two-element vectors of bounds on (ρ Lag, ρ Err), a full grid of nine points at the same settings, or user-provided initial values; optimizers may be chosen by the user.

Table 17: Maximum likelihood spatial error and general spatial model estimation results for two implementations, DUI data set (standard errors in parentheses).
Figure 3: Monte Carlo tests for impacts from spatial lag models (direct shown in blue, total in black); implementations in the Spatial Econometrics toolbox and R spdep, DUI data set.