Some useful details about the Moran coefficient, the Geary ratio, and the join count indices of spatial autocorrelation

  • Original Paper
Journal of Spatial Econometrics

Abstract

Popular spatial autocorrelation (SA) indices employed in spatial econometrics include the Moran Coefficient (MC), the Geary Ratio (GR), and the join count statistics (JCS). Properties of the first two quantities depend on the spatial weights matrix definition [e.g., binary 0–1 (rook or queen adjacencies), nearest neighbors, inverse inter-point distance, row standardized], which may cause confusion about output from different software packages; to date, JCS calculations have used only binary 0–1 definitions. The MC and GR expected values for linear regression residuals also merit closer examination; although the mean and other details of the sampling distribution for the MC are well known, their GR counterparts, at least in their details, are not. The (MC + GR) sum furnishes a potential diagnostic for georeferenced data normality, one that warrants much further explication and scrutiny. The Moran scatterplot is a widely used graphic tool for visualizing SA; this paper formally introduces its Geary scatterplot counterpart (first appearing informally in 2019), together with some comparisons of the two. Meanwhile, established relationships between the JCS and both the MC and the GR need additional inspection, too, especially in terms of their sampling variances. Preliminary analyses summarized in this paper also address derived asymptotic properties as well as links with the single spatial autoregressive parameter of the simultaneous autoregressive (SAR; spatial error) and autoregressive response (AR; spatial lag) model specifications. This paper describes selected little-known features of these standard SA indices, furthering a better understanding of, and a more complete set of details about, them.
Results from a myriad of empirical spatial economics landscapes [e.g., Puerto Rico, Jiangsu Province, Texas, Houston (Harris County), and the Dallas-Fort Worth (DFW) metroplex] and a variety of planar surface partitionings (including the square and hexagonal tessellations, and randomly generated graphs) illustrate highlighted theoretical and conceptual traits. These include a corroboration of the contention in the literature that the MC more closely aligns with spatial autoregression, and the GR more closely aligns with geostatistics.

Data availability

Data are publicly available via the US Census or other popular data source web pages.

Code availability

Computations were made using standard SAS and R routines.

Notes

  1. Using matrix notation, the MC for vector Y is \(\frac{n}{\mathbf{1}^{\mathrm{T}}\mathbf{C}\mathbf{1}}\,\frac{\mathbf{Y}^{\mathrm{T}}\left(\mathbf{I} - \mathbf{1}\mathbf{1}^{\mathrm{T}}/n\right)\mathbf{C}\left(\mathbf{I} - \mathbf{1}\mathbf{1}^{\mathrm{T}}/n\right)\mathbf{Y}}{\mathbf{Y}^{\mathrm{T}}\left(\mathbf{I} - \mathbf{1}\mathbf{1}^{\mathrm{T}}/n\right)\mathbf{Y}}\), where superscript T denotes the matrix transpose operator. Its GR parallel is \(\frac{n - 1}{\mathbf{1}^{\mathrm{T}}\mathbf{C}\mathbf{1}}\,\frac{\mathbf{Y}^{\mathrm{T}}\left(\mathbf{I} - \mathbf{1}\mathbf{1}^{\mathrm{T}}/n\right)\left(\left\langle\mathbf{C}\mathbf{1}\right\rangle_{\mathbf{diag}} - \mathbf{C}\right)\left(\mathbf{I} - \mathbf{1}\mathbf{1}^{\mathrm{T}}/n\right)\mathbf{Y}}{\mathbf{Y}^{\mathrm{T}}\left(\mathbf{I} - \mathbf{1}\mathbf{1}^{\mathrm{T}}/n\right)\mathbf{Y}}\), where \(\left\langle \cdot \right\rangle_{\mathbf{diag}}\) denotes a diagonal matrix (here, the one whose diagonal entries are the elements of C1, the row sums of C).

  2. This database is informative partly because it underlines the role geographic resolution plays in SA assessment, illustrating that the county resolution tends to yield appealing results. Its numbers of areal units also span the primary range for most empirical examples. It includes the often encountered striping/banding affiliated with lower bound zero attribute values. One of its most attractive traits here is its ability to exemplify a case for which the Geary scatterplot is preferable to its Moran scatterplot alternative.

  3. H0: ρSA = 0, and HA: ρSA ≠ 0, where ρSA denotes the population SA.

  4. Griffith (2003) presents the foundational theory of Moran eigenvector spatial filtering (MESF), the methodology used to construct eigenvector spatial filters (ESFs), with Griffith et al. (2019) furnishing its implementation treatment. In brief, MESF extracts eigenvectors from the doubly centered spatial weights matrix appearing in the MC numerator (these are map patterns representing a spectrum of distinct SA natures and degrees), and then uses these vectors as covariates in standard linear and generalized linear regression techniques to filter SA from residuals and transfer it to what becomes a nonconstant intercept term. This specification directly relates to auto-normal model specifications, and renders regression residuals that mimic being independent. It extends standard linear and nonlinear regression theory to geospatial data analysis, while preserving the desired statistical parameter estimator properties of unbiasedness, efficiency, consistency, and sufficiency. The eigenvectors involved are mutually orthogonal and uncorrelated, which facilitates using automated stepwise vector selection for the construction of a given ESF, a problem plagued by over- and under-fitting obstacles (addressable with cross-validation) as well as computational complexity arising from the screening of nearly n eigenvectors. The user-friendly freeware SAAR (https://github.com/hyeongmokoo/SAAR; Koo et al. 2018) and ESF_Tool (https://github.com/esftool/esftool) implement this methodology, as does the spmoran R package.

  5. Kelejian and Prucha (2001) extend this MC diagnostic testing opportunity to limited dependent variable model specifications (e.g., the Tobit and bi/multinomial regression), as well as for the relatively popular spatial AR-SAR specification. Their work offers a blueprint for extending the GR testing option promoted in this paper to this same variety of residuals, particularly given results reported in Griffith (2010).

  6. Pre- and post-multiplying a spatial weights matrix by the projection matrix \(\left(\mathbf{I} - \mathbf{1}\mathbf{1}^{\mathrm{T}}/n\right)\) creates double centering; see Borg and Groenen (2005, p. 262).

  7. The diagonal of zeros is replaced by a diagonal of negative row sums; see Bapat (2010, Chapter 4).

  8. For example, if all areal units have the same number of neighbors [i.e., \(\sum_{j = 1}^{n} c_{ij} = k \;\; \forall\, i\)], then this fraction reduces to (n – 1)/n, which asymptotically converges on 1; for n = 100, a reasonable sample size, it equals 0.99.

  9. The ability to indicate the correct behavior of a given dataset more often than would be possible by pure chance, before performing a battery of rigorous diagnostics.
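
As a concrete companion to Notes 1, 4, 6, and 7, the following minimal Python sketch evaluates the matrix-form MC and GR and extracts the eigenvectors of the doubly centered matrix used by MESF. It is an illustration only, not the authors' SAS or R code: the function names (rook_adjacency, centering_matrix, moran_coefficient, geary_ratio, mesf_eigenvectors), the use of NumPy, and the small square-lattice example are all assumptions introduced here, and C is assumed to be a symmetric binary 0–1 matrix.

```python
import numpy as np


def rook_adjacency(n_rows, n_cols):
    """Binary 0-1 rook adjacency matrix C for an n_rows x n_cols square lattice."""
    n = n_rows * n_cols
    C = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:  # neighbor to the east
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < n_rows:  # neighbor to the south
                C[i, i + n_cols] = C[i + n_cols, i] = 1.0
    return C


def centering_matrix(n):
    """Projection matrix I - 11^T/n (Note 6)."""
    return np.eye(n) - np.ones((n, n)) / n


def moran_coefficient(y, C):
    """MC = [n / 1^T C 1] * [Y^T M C M Y] / [Y^T M Y], with M = I - 11^T/n."""
    n = len(y)
    z = centering_matrix(n) @ y  # M Y; M is idempotent, so Y^T M Y = z^T z
    return (n / C.sum()) * (z @ C @ z) / (z @ z)


def geary_ratio(y, C):
    """GR = [(n - 1) / 1^T C 1] * [Y^T M (<C1>_diag - C) M Y] / [Y^T M Y]."""
    n = len(y)
    z = centering_matrix(n) @ y
    laplacian = np.diag(C.sum(axis=1)) - C  # row sums on the diagonal (Note 7)
    return ((n - 1) / C.sum()) * (z @ laplacian @ z) / (z @ z)


def mesf_eigenvectors(C):
    """Eigenvalues/eigenvectors of M C M, sorted from largest to smallest eigenvalue."""
    n = C.shape[0]
    M = centering_matrix(n)
    mcm = M @ C @ M
    mcm = (mcm + mcm.T) / 2.0  # guard against floating-point asymmetry
    eigvals, eigvecs = np.linalg.eigh(mcm)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


if __name__ == "__main__":
    C = rook_adjacency(4, 4)
    y = np.arange(16, dtype=float)  # a simple trend surface, for illustration
    print("MC:", moran_coefficient(y, C))
    print("GR:", geary_ratio(y, C))
    eigvals, eigvecs = mesf_eigenvectors(C)
    # The MC of an eigenvector with a nonzero eigenvalue is (n / 1^T C 1) times that eigenvalue.
    print("MC of first eigenvector:", moran_coefficient(eigvecs[:, 0], C))
```

In an MESF regression, a subset of the returned eigenvectors would then enter as covariates; the selection step (e.g., stepwise with cross-validation) is deliberately left out of this sketch.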

References

  • Anselin L (1988) Spatial econometrics: methods and models. Kluwer, Dordrecht

  • Anselin L (1996) The Moran Scatterplot as an ESDA tool to assess local instability in spatial association. In: Fischer M, Scholten H, Unwin D (eds) Spatial analytical perspectives on GIS in environmental and socio-economic sciences. Taylor & Francis, London, pp 111–125

  • Anselin L (2019) A local indicator of multivariate spatial association: extending Geary’s c. Geogr Anal 51:133–150

  • Arbia G (2006) Spatial econometrics. Springer, Berlin

  • Bapat R (2010) Laplacian matrices. In: Graphs and matrices, chap 4. Springer, London, pp 45–55

  • Boots B, Royle G (1991) A conjecture on the maximum value of the principal eigenvalue of a planar graph. Geogr Anal 23:276–282

  • Borg I, Groenen P (2005) Modern multidimensional scaling: theory and applications, 2nd edn. Springer, New York

  • Burridge P (1980) On the Cliff-Ord test for spatial correlation. J Roy Stat Soc B 42:107–108

  • Chun Y, Griffith D (2013) Spatial statistics and geostatistics. SAGE, Thousand Oaks, CA

  • Cliff A, Ord J (1973) Spatial autocorrelation. Pion, London

  • Cliff A, Ord J (1981) Spatial processes. Pion, London

  • Comber A, Brunsdon C, Radburn R (2011) A spatial analysis of variations in health access: linking geography, socio-economic status and access perceptions. Int J Health Geogr 10(1):1–11

  • Geary R (1954) The contiguity ratio and statistical mapping. Inc Stat 5(3):115–146

  • Griffith D (2003) Spatial autocorrelation and spatial filtering: gaining understanding through theory and scientific visualization. Springer-Verlag, Berlin

  • Griffith D (2009) Spatial autocorrelation. In: Kitchin R, Thrift N (eds) International encyclopedia of human geography. Elsevier, Oxford, pp 308–316

  • Griffith D (2010) The Moran coefficient for non-normal data. J Stat Plan Inference 140(11):2980–2990

  • Griffith D (2017) Some robustness assessments of Moran eigenvector spatial filtering. Spat Stat 22:155–179

  • Griffith D (2018) Generating random connected planar graphs. GeoInformatica 22:767–782

  • Griffith D (2019) Negative spatial autocorrelation: one of the most neglected concepts in spatial statistics. Stats 2:388–415

  • Griffith D, Chun Y, Li B (2019) Spatial regression analysis using eigenvector spatial filtering. Elsevier, Cambridge, MA

  • Griffith D, Layne L (1999) A casebook for spatial statistical data analysis. Oxford, NY

  • Griffith D, Li B (2017) A geocomputation and geovisualization comparison of Moran and Geary eigenvector spatial filtering. In: CPGIS Publication Committee, Proceedings of the 25th international conference on geoinformatics, Geoinformatics 2017. SUNY/Buffalo, Buffalo, NY, August 2–4, p 4

  • Griffith D, Paelinck JH (2007) An equation by any other name is still the same: on spatial econometrics and spatial statistics. Ann Reg Sci 41(1):209–227

  • Griffith D, Paelinck JH (2011) Non-standard spatial statistics and spatial econometrics. Springer-Verlag, Berlin

  • Griffith D, Agarwal K, Chen M, Lee C, Panetti E, Rhyu K, Venigalla L, Yu X (2022) Geospatial socio-economic/demographic data: the existence of spatial autocorrelation mixtures in georeferenced data—Part I & Part II. Trans GIS 26(1):72–87

  • Hepple L (1998) Exact testing for spatial autocorrelation among regression residuals. Environ Plan A 30(1):85–108

  • Kelejian H, Piras G (2017) Spatial econometrics. Academic Press, London

  • Kelejian H, Prucha I (2001) On the asymptotic distribution of the Moran I test statistic with applications. J Econ 104(2):219–257

  • Koo H, Chun Y, Griffith D (2018) Integrating spatial data analysis functionalities in a GIS environment: spatial analysis using ArcGIS Engine and R (SAAR). Trans GIS 22:721–736

  • LeSage J, Pace R (2009) Introduction to spatial econometrics. CRC/Chapman & Hall, Boca Raton, FL

  • Leung Y, Mei C-L, Zhang W-X (2000) Testing for spatial autocorrelation among the residuals of the geographically weighted regression. Environ Plan A 32:871–890

  • Li H, Calder C, Cressie N (2007) Beyond Moran’s I: testing for spatial dependence based on the spatial autoregressive model. Geogr Anal 39:357–375. https://doi.org/10.1111/j.1538-4632.2007.00708.x

  • Luo Q, Griffith D, Wu H (2017) The Moran coefficient and the Geary ratio: some mathematical and numerical comparisons. In: Griffith D, Chun Y, Dean D (eds) Advances in geocomputation: geocomputation 2015, the 13th international conference. Springer, Berlin, pp 253–269

  • Luo Q, Griffith D, Wu H (2019) Spatial autocorrelation for massive spatial data: verification of efficiency and statistical power asymptotics. J Geogr Syst 21:237–269

  • Mays G, Smith S (2009) Geographic variation in public health spending: correlates and consequences. Health Serv Res 44(5p2):1796–1817

  • Paelinck J, Klaassen L (1979) Spatial econometrics. Saxon House, Farnborough

  • Potter K, Koch F, Oswalt C, Iannone B III (2016) Data, data everywhere: detecting spatial patterns in fine-scale ecological information collected across a continent. Landscape Ecol 31:67–84

  • Sauer J, Stewart K, Dezman Z (2021) A spatio-temporal Bayesian model to estimate risk and evaluate factors related to drug-involved emergency department visits in the greater Baltimore metropolitan area. J Subst Abuse Treat 131:108534

  • Sokal R, Oden N, Thomson B (1998) Local spatial autocorrelation in a biological model. Geogr Anal 30:331–354

  • Tait M, Tobin J (2017) Three conjectures in extremal spectral graph theory. J Comb Theory Ser B 126:137–163

  • Tiefelsdorf M, Griffith D, Boots B (1999) A variance stabilizing coding scheme for spatial link matrices. Environ Plan A 31:165–180

  • Wang F (2020) Why public health needs GIS: a methodological overview. Ann GIS 26(1):1–12

  • Wennberg J, Cooper M (1998) The Dartmouth atlas of health care in Pennsylvania. American Hospital Association, Chicago

  • Wiedermann W, Hagmann M (2016) Asymmetric properties of the Pearson correlation coefficient: correlation as the negative association between linear regression residuals. Commun Stat Theory Methods 45:6263–6283

  • Zhang Y, Baicker K, Newhouse J (2010) Geographic variation in the quality of prescribing. N Engl J Med 363(21):1985


Acknowledgements

Compilation of the publicly available data obtained via The Dartmouth Atlas DATA website https://data.dartmouthatlas.org/mortality was funded by the Robert Wood Johnson Foundation, as well as The Dartmouth Clinical and Translational Science Institute under the auspices of award number UL1TR001086 from the National Center for Advancing Translational Sciences (NCATS) of the National Institutes of Health (NIH), and, in part, by the National Institute of Aging under the auspices of award number U01 AG046830.

Funding

No funding source.

Author information

Contributions

Griffith did all SAS computations, and Chun did all R computations. Griffith did the initial paper draft, and Chun and Griffith repeatedly revised through to its current version. Chun spearheaded the HSA empirical analysis.

Corresponding author

Correspondence to Yongwan Chun.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Selected MC + GR simulation experiment results

Table 6 Simulated MC + GR intervals for identically distributed random variables, rook adjacency definition, n ≈ 72, 10,000 replications

Table 7 Simulated MC + GR intervals for mixtures of random variables, rook adjacency definition, n ≈ 72, 10,000 replications

Tables 6 and 7 furnish numerical evidence gleaned from simulation experiments espousing a reliability interval for the sum MC + GR offering heuristic guidance about georeferenced data containing various natures and degrees of SA. The distilled general interval reported in this paper is 0.95 ≤ MC + GR ≤ 1.05 (i.e., 1 ± 0.05). A useful next research step would be to refine this proposition, expressing both its upper and lower bounds as functions of n, ρ, and λ1(C). These are some of the data facets appearing to introduce variation across the columns of these two tables.
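
A simulation experiment of this general kind can be sketched in a few lines of Python. The setup below is an assumption made for illustration and does not reproduce the authors' experiments or the tabulated values: it uses an 8-by-9 square lattice (n = 72) with rook adjacency, independent standard normal draws (i.e., no SA and no mixtures), 2,000 rather than 10,000 replications, and hypothetical function names (rook_adjacency, mc_plus_gr, simulate_sums).

```python
import numpy as np


def rook_adjacency(n_rows, n_cols):
    """Binary 0-1 rook adjacency matrix C for an n_rows x n_cols square lattice."""
    n = n_rows * n_cols
    C = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:  # neighbor to the east
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < n_rows:  # neighbor to the south
                C[i, i + n_cols] = C[i + n_cols, i] = 1.0
    return C


def mc_plus_gr(y, C):
    """Sum of the Moran coefficient and the Geary ratio for one attribute vector."""
    n = len(y)
    z = y - y.mean()                        # centered values
    s0 = C.sum()                            # 1^T C 1
    ss = z @ z                              # sum of squared deviations
    laplacian = np.diag(C.sum(axis=1)) - C  # <C1>_diag - C
    mc = (n / s0) * (z @ C @ z) / ss
    gr = ((n - 1) / s0) * (z @ laplacian @ z) / ss
    return mc + gr


def simulate_sums(n_rows=8, n_cols=9, reps=2000, seed=0):
    """MC + GR for `reps` maps of independent standard normal values (assumed setup)."""
    rng = np.random.default_rng(seed)
    C = rook_adjacency(n_rows, n_cols)
    n = n_rows * n_cols
    return np.array([mc_plus_gr(rng.standard_normal(n), C) for _ in range(reps)])


if __name__ == "__main__":
    sums = simulate_sums()
    inside = np.mean((sums >= 0.95) & (sums <= 1.05))
    print("mean of MC + GR:", sums.mean())
    print("share of replications inside [0.95, 1.05]:", inside)
```

Replacing the normal draws with other distributions, mixtures, or spatially autocorrelated values, and varying the lattice size, would be the natural way to probe how the bounds depend on n, ρ, and λ1(C).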

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Griffith, D.A., Chun, Y. Some useful details about the Moran coefficient, the Geary ratio, and the join count indices of spatial autocorrelation. J Spat Econometrics 3, 12 (2022). https://doi.org/10.1007/s43071-022-00031-w

  • DOI: https://doi.org/10.1007/s43071-022-00031-w
