Bayesian Multivariate Spatial Models for Lattice Data with INLA

The INLAMSM package for the R programming language provides a collection of multivariate spatial models for lattice data that can be used with the INLA package for Bayesian inference. The multivariate spatial models implemented include diﬀerent structures to model the spatial variation of the variables and the between-variables variability. In this way, ﬁtting multivariate spatial models becomes faster and easier. The use of the diﬀerent models included in the package is illustrated using two diﬀerent datasets: the well-known North Carolina SIDS data and mortality by three causes of death in Comu-nidad Valenciana (Spain).


Introduction
The integrated nested Laplace approximation (INLA, Rue, Martino, and Chopin 2009) provides an alternative to traditional Markov chain Monte Carlo (MCMC, Gilks, Gilks, Richardson, and Spiegelhalter 1996) for Bayesian inference. The INLA methodology focuses on estimating the posterior marginals of the model parameters instead of the joint posterior distribution. INLA is implemented in the INLA package for the R programming languange, that provides a simple way to fit models via the inla() function, which works in a similar way as other functions to fit regression models, such as glm() or gam().
The INLA package implements several likelihoods, priors and latent effects that can be used to build models. It is also capable of fitting models with several likelihoods, which can be useful for multivariate modelling. However, multivariate spatial models are not included. The INLAMSM package adds a number of multivariate latent effects that implement well-known multivariate spatial models for areal data. By fitting these models with INLA instead of MCMC computing times should be reduced.
Lattice data are made of I areas, usually related to some administrative boundaries, where the data are collected. We will assume that values of K variables are obtanied from each area. In addition, an adjacency structure is defined by considering that areas with a shared boundary are neighbors. The analysis of this type of datasets is often made by resorting to multivariate regression models. This paper is organized as follows. In the remainder of this introduction we review current software for multivariate spatial modelling. Section 2 gives a description of the different multivariate spatial models are implemented in the INLAMSM package. Two detailed applications of the INLAMSM package and the corresponding results can be founded in Section 3. Finally, a summary and discussion are provided in Section 4.
The R programming language provides a number of standalone packages for multivariate spatial analysis. In the specific case of analysing spatial point patterns, some R packages are available, such as spatstat (Baddeley, Turner et al. 2005) and spatialkernel (GÃşmez-Rubio, Zheng, Diggle, Sterratt, Peng, Murdoch, and Rowlingson 2017). Package spatstat is able to model multitype point pattern as well as handle a good deal of the models and methods for the analysis of spatial and spatio-temporal point patterns, such as estimators of the space-time inhomogeneous K-function and pair correlation function. Package spatialkernel performs edge-corrected kernel density estimation and binary kernel regression estimation for multivariate spatial point patterns.
Regarding geostatistical data, gstat (Pebesma and Wesseling 1998;Pebesma 2004) provides functions to fit both univariate and multivariate models using different types of kriging. Package spBayes (Finley, Banerjee, and Carlin 2007) is able to fit a wide variety of Gaussian spatial process models for univariate as well as multivariate point-referenced data using efficient MCMC algorithms.
Finally, in the case of areal data, the R package CARBayes (Lee 2013) offers the possibility of fitting a wide class of CAR models using MCMC methods. This package provides a number of univariate and multivariate likelihoods, and it also includes a multivariate CAR model with an inverse-Wishart and a CAR prior proposed by Leroux, Lei, and Breslow (2000) to estimate the variability between the different variables and the spatial variation, respectively.
In addition to these packages, the BUGS language (Spiegelhalter, Thomas, Best, Gilks, and Lunn 1996;Lunn, Thomas, Best, and Spiegelhalter 2000) is a flexible framework for the implementation of multivariate spatial models for lattice data. Package R2WinBUGS (Sturtz, Ligges, and Gelman 2005) provides a set of functions to call the WinBUGS software from R. Recentely, Stan (Carpenter, Gelman, Hoffman, Lee, Goodrich, Betancourt, Brubaker, Guo, Li, and Riddell 2017) develops a flexible language for Bayesian inference that can be also used to fit multivariate spatial models. The Rstan package (Stan Development Team 2019) is a convenient interface between R and Stan to fit models with ease. Both BUGS and Stan rely on MCMC algorithms for model fitting and inference.
A computationally efficient alternative to software based on MCMC algorithms is described in Rue et al. (2009). This has been implemented in an R package called INLA, which is often referred to as R-INLA to distinguish it from the main INLA methodology. INLA advantages summarize in the easy way of implementing hierarchical models and the low computing time that is needed to fit spatial or spatio-temporal models (Lindgren, Rue et al. 2015;Bakka, Rue, Fuglstad, Riebler, Bolin, Illian, Krainski, Simpson, and Lindgren 2018). This software is designed for Bayesian inference on latent Gaussian markov random field models which include (generalized) linear mixed and spatial and spatio-temporal models. INLA can deal with lattice data as well as geostistical data by means of an approximation to continuous spatial processes based on stochastic partial differential equations (SPDE Lindgren, Rue, and Lindström 2011). This approach can be used to fit log-Gaussian Cox processes (Simpson, Illian, SÃÿrbye, and Rue 2016) for spatial and spatio-temporal point patterns as well (Lindgren et al. 2011). Multivariate models can be fit by considering different likelihoods with shared terms. See, for example, Lindgren et al. (2011), Krainski, Gómez-Rubio, Bakka, Lenzi, Castro-Camilo, Simpson, Lindgren, and and Gómez-Rubio (2019).
Recently, some R packages have been developed using R-INLA as a base, for instance, inlabru (Bachl, Lindgren, Borchers, and Illian 2019) uses INLA for computing the bayesian inference for fitting spatial density surfaces and estimating abundance in a spatial point process, gridded and georeferenced context.
As our review has shown, there are only few packages that can handle multivariate spatial models in a Bayesian context. Furthermore, in the case of analysing multivariate areal data using Bayesian inference, there are options based on MCMC and INLA. R-INLA is a wide used alternative to MCMC and in addition provides smaller computation times when fitting spatial models. However, there are not a set of functions to handle a multivariate areal models using INLA. Developing these latent effects is the main goal of this paper, which will enable the user to include multivariate spatial effects in their models.

Models
INLAMSM has different functions implemented which correspond to different multivariate spatial latent effects. These differ in structure, complexity and number of hyperparameters. However, all of them are defined in a multivariate spatial context in which i = 1, . . . , I represent the spatial areas and k = 1, . . . , K is used to index the variables measured in region i.
Random effects can be represented using a matrix Θ with entries θ ik , i = 1, . . . , I, k = 1, . . . , K. Hence, the columns of Θ represent the spatial random effects associated to variable k.
A particular application of these multivariate spatial models is disease mapping, as described in the examples in Section 3. In this context, the number of the observed cases of the k disease at the i spatial areas, Y ik , is modelled as Poisson where E ik and R ik represent the expected cases and the relative risk of the i spatial area and the k disease, respectively. Then, the logarithm of the relative risk is the sum of two terms where a k is the intercept of the k disease and θ ik is the term that models the spatial variability.
Following Martinez-Beneito (2013), the variability of a multivariate spatial latent effect can be divided in two: variability between variables and variability whithin each variable, (i.e., the spatial variability corresponding to the values of different areas for a particular variable). Both variabilities are modelled with their respective variance matrices, that have different number of hyperparameters.
Finally, before explaining any model, some useful specifications are defined. Let θ i· (i = 1, ..., I) denote the i-th row of the matrix Θ and Θ ·j (j = 1, ..., J) denote the jth column of the matrix Θ. In addition, on a matrix A = [A ·1 : · · · : A ·J ], the operator vec(·) is defined by Note that all latent effects will be defined using the rgeneric latent effect, which defines it using the representation of the Gaussian Markov random fied. This is described in the INLA documentation, that can be accessed with inla.doc("rgeneric"), and Gómez-Rubio (2019). This representation is essentially a multivariate Gaussian vector with a sparse precision matrix. Hence, vec(Θ) will be defined using a Normal distribution with zero mean and a highly structured precision matrix Σ, i.e., Note that INLA works with the precision and that this is why Σ −1 will be required instead of the variance matrix Σ.

Simple Improper MCAR
One of the simplest options to model the variability within-and between-variables is using an improper CAR distribution for the spatial structure and a diagonal matrix for the covariance matrix between variables, respectively. Specifically, Θ is modelled as where D = diag(n 1 , ..., n I ) is a diagonal matrix with values n i (the number of neighbours of region i) and W is an adjacency matrix with entries W ij . Each entry W ij is equal to 1 if units i and j are neighbours and 0 otherwise.
Let τ k be the marginal precision of the k-th variable. When variables are independent of each other, the between-variables precision matrix Λ is a k × k diagonal matrix defined as Thus, this model has K hyperparameters, {τ k } K k=1 , equal to the total number of variables. This model is considered (computationally) simple because the number of hyperparameters is the lowest among the multivariate spatial models, as it will be seen below. Nevertheless, the implementation of this model in the INLAMSM package allows to easily compare a naïve model with simple independent spatial patterns with other more complex alternatives that we may want to fit.
Hyperparameters are internally represented in INLA using the log-scale so that they are not bounded, and the vector of hyperparameters is (log(τ 1 ), . . . , log(τ K )). Prior distributions are assigned to the standard deviations instead of the precisions (Gelman 2006). In particular, a uniform improper distribution between 0 and +∞ is assigned to each standard deviation: Note that INLA will report the results in the internal scale but we have included several functions in the package to transform the log-precisions into some other more meaningful scale. See Section 3 for details.

Simple Proper MCAR
A proper CAR distribution can be used to model the within-variability instead of the intrinsic version. Here, a common spatial autocorrelation parameter α for all the variables is introduced. Thus Θ is modelled as where matrices D and W are defined similarly as above.
The matrix used to model the between-variables variability keeps the same structure, thus Λ is defined as a diagonal matrix with entries (τ 1 , . . . , τ K ). Now, this model has K + 1 hyperparameters which are spatial autocorrelation parameter, α, and the K precisions, {τ k } K k=1 . Internally, the vector of hyperparameters is (α * , log(τ 1 , . . . , log(τ K )). Hyperparameter α * is defined by transforming α as follows: where α min and α max define the bounds of the domain of α.
As in the previous model, uniform prior distributions are set on the standard deviations, and a uniform prior on α is considered: Once again, the simple Proper MCAR distribution is a naïve implementation of independent spatial patterns as the simple improper MCAR. Nevertheless, this model could be used as benchmark for comparing multivariate proper CAR models assuming dependence between variables.

Improper MCAR model
A diagonal precision matrix Λ −1 is a naïve way to model the between-variables variability which leads to a low computation time. However, setting the off-diagonal elements to zero assumes that variables are independent of each other. The off-diagonal elements model the relationship between every pair of variables, which could be useful when looking for similar spatial patterns of different variables.
Therefore, a dense precision matrix and an intrinsic conditional autoregressive distribution can be chosen to model the between-and whithin-variables variability, respectively. This can be regarded as a generalization of the univariate intrinsic conditional autoregressive model. In particular, Θ is modelled similarly as the simple IMCAR: Now Λ is a dense symmetric matrix. The parameterization of Λ follows that of other latent effects implemented in INLA and the hyperparameters are the precisions of the variables and the correlations between any pair of variables. Hence, instead of dealing with the structure of Λ, Λ −1 is defined as follows Here, ρ jk is the correlation coefficient of the diseases j and k and τ k is the marginal precision of the k disease. Therefore, the set of hyperpameters contains K marginal precisions and K * (K − 1)/2 correlation parameters. In this case, a joint prior distribution is considered for Λ −1 instead of setting a prior distribution for each hyperparameter. Thus, Λ −1 follows a Wishart distribution as where r is the number of degrees of freedom and R −1 is a fixed symmetric positive definite matrix of size k × k. In our implementation, r is equal to 2k + 1 and R is the k × k identity matrix.
The vector of hyperparameters is now the precisions plus the correlations in the lower triangular matrix of Λ −1 (columnwise). However, these parameters are re-scaled so that the log-precisions are used and the correlation parameters are transformed using ρ * = logit((ρ + 1)/2), for any correlation ρ. Hence, the vector of hyperparameters in the internal scale is

Proper MCAR model
A proper CAR distribution can be used instead of the intrinsic CAR in order to model the within-variable variability. This alternative corresponds to the generalization of the univariate proper conditional autoregressive model. As in the above case, a dense Λ matrix is used to model the between-variables variability. Specifically, θ is modelled as where α is a common spatial autocorrelation parameter and Λ is a dense symmetric matrix with Λ −1 defined as in the previous model.
In this case, the set of hyperparameters comprehends one common spatial autocorrelation parameter α and K marginal precisions, {τ k } K k=1 , K * (K − 1)/2 correlation parameters, {ρ jk } K j,k=1 . As the above section, a Wishart distribution is considered as a joint prior distribution for Λ −1 matrix, while a uniform prior is consider for α, i.e., In this case, the vector of hyperparameters is made of the spatial autocorrelation α, the precisions and the correlations in the lower triangular elements of Λ −1 (columnwise). As in previous models, all hyperparameters are transformed to take values in the (−∞, +∞) interval. Hence, the vector of hyperparameters is (α * , log(τ 1 ), . . . , log(τ K ), ρ * 21 , ρ * 31 , . . .).

M-model
Botella-Rocamora, Martinez-Beneito, and Banerjee (2015) describe a unifying modeling framework for disease mapping when the number of diseases is potentially large. Here, spatial effects are linear combinations of proper CAR spatial effects. In particular, we will consider K proper CAR spatial effects defined by Here, φ k is a vector of length I.
The value of the spatial random effect θ j = (θ 1j , . . . , θ Ij ) for variable j is taken as Hence, matrix M with entries m ij defines the loadings of the different CAR spatial effects for each disease.
The distribution of these random effects is given by Here, matrices (Σ w ) j are the variance matrices of the K proper CAR spatial effects, i.e., In addition, Botella-Rocamora et al. (2015) shows that the between-diseases variance matrix is M M. The prior of this model is on M M, and it is a Wishart with parameters K and τ I. Parameter τ is a fixed precision which is set to 0.001, but it can also be considered as another parameter to be estimates in the model (Corpas-Burgos, Botella-Rocamora, and Martinez-Beneito 2019).
The vector of hyperparameters in this model is made of the autocorrelation parameters (convenientely transformed) followed by the columns of matrix M, for which no transformation is required.

Summary of the models
To summarize the different models implemented, and their associated functions, Table 1 displays some basic information about the different models. Note how all function names follow a similar pattern for consistency, and that the latent models differe in the number of hyperparameters. These depend on the number of variables k.

Examples
In this section, two examples are developed with the INLAMSM package. The first one is on the well-known North Carolina SIDS data (Cressie and Read 1985), which is used to show how the different models implemented in the INLAMSM package are fit. The second example is based on the study of three causes of death in Comunidad Valenciana (Spain), a region that comprises 540 municipalities and provides a more challenging dataset to fit spatial models than the North Carolina SIDS data (that only has 100 areas). Here, we focus on investigating the dependence among the different diseases.

North Carolina SIDS data
In the first example the North Carolina SIDS data (Cressie and Read 1985) have been considered to test the methods implemented in the INLAMSM package. This dataset includes counts of number of live births and number of sudden infant deaths for the 100 counties of North Carolina for two time periods: from July 1st, 1974 to Jun 30th, 1978 and from July 1st, 1979 to June 30th, 1984. This dataset is available in a shapefile in R package spData (Bivand, Nowosad, and Lovelace 2019b), which will be loaded using package rgdal (Bivand, Keitt, and Rowlingson 2019a).

R> library(spdep)
R> #Compute adjacency matrix, as nb object adj and sparse matrix W R> adj <-poly2nb(nc.sids) R> W <-as(nb2mat(adj, style = "B"), "Matrix") The model that will be fit in this case is Here, O ik is the number os SIDS cases in county i and period k, E ik the expected counts, α k a period-specific intercept and θ ik the multivariate spatial effect, which is defined using one of the models implemented in the INLAMSM package. Note that other covariates and effects could be included in the linear predictor as well.
The expected number of cases E ik are computed by multiplying the period-specific mortality rate r k by the number of births N ik : In the next lines of R code the expected counts for both time periods are computed, as well as the standardized mortality ratio (SMR), O ik /E ik and the proportion of non-white births, which could be used as a covariate for both time periods.

Define the multivariate spatial latent effect using one of the functions in the INLAMSM
package. This is done via the inla.rgeneric.define function, that will take the function of the multivariate spatial model to be included plus any other required arguments, such as the adjacency matrix, the number of time periods, etc.
2. Fit the model with INLA using a formula that includes the newly defined latent effect.
For example, for the simple IMCAR model this will be defined as follows: This model can be summarized as follows:

R> summary(SIMCAR)
Call: c("inla(formula = OBS~-1 + PERIOD + f(idx, model = model), family = \"poisson\", ", " data = d, E = EXP, control.predictor = list(compute = TRUE))" ) Note that hyperparameters Theta1 and Theta2 correspond to the log-precisions of the two latent effects. In this case, the two effects are independent of each other and this model is equivalent to fitting two models using an instrinsic CAR model (implemented as the besag latent effect in INLA).
Next, the fitted values of the SIMCAR model can be compared to the SMR for both periods as seen in Figure 1.

Mortality in Comunidad Valenciana (Spain)
The next example is based on simulated data of the mortality by cirrhosis, lung and oral cancer in Comunidad Valenciana (Spain). This dataset has been obtained from Martinez-Beneito and Botella-Rocamora (2019) and it has been generated to mimick the spatial pattern of the real data, that cannot be distributed due to confidentiality constraints. The original files are available at http://github.com/MigueBeneito/DisMapBook.
Here, the number of deaths by these three causes are available at the municipality level in Comunidad Valencia (Spain), as well as the expected number of cases that have been computed using internal standardization. Hence, the aim now is to estimate the spatial pattern of the different diseases as well as their possibles relationships.
Given that in the previous example we have already described how to fit all the new models in the R package, we will focus now on the models that include a term to model the covariance for several diseases. That is, only the IMCAR, PMCAR and M-model will be fit.
Data are available in several RData files that can be loaded as follows: R> #Load Observed data R> load("ObsTrivariate-mod.Rdata") R> #Load Excpected data R> load("ExpTrivariate.Rdata") R> #Load boundaries R> load("CVAL.Rdata") The adjacency matrix is computed using: R> #Compute adjacency matrix, as nb object adj and sparse matrix W R> adj <-poly2nb(CVAL) R> W <-as(nb2mat(adj, style = "B"), "Matrix") Next, observed and expected data are put together in a data.frame, together with an index variable to be passed in the definition of the latent effects: R> #Data R> d <-data.frame (OBS = as.vector(unlist(apply(Obs.mv3,2,list))), + EXP = as.vector(unlist(apply(Exp.mv3, 2, list))) + ) R> # Index for latent effect R> d$idx <-1:length(d$OBS) In order to fit the different models, the following parameters are defined: R> # Model parameters R> #Number of diseases R> k <-3 R> # Range of autoc. parameter R> alpha.min <-0 R> alpha.max <-1 Then, similarly as in the previous example, the latent effects are defined and the models are fit:  Table 2 shows the computing times for the models fit to both examples. All models have been run on a Mac OS X computer with an Intel Core i5 processor (2,7 GHz), 4 cores and 16BG of RAM. Now the models take longer to run than in the previous example because there are three diseases and about 5 times more areas (as Comunidad Valenciana has 540 municipalities in the cartography that we have used). Botella-Rocamora et al. (2015) report times of about 16 minutes to fit the multivariate spatial models with WinBUGS on problems of similar size. Hence, INLA can fit the same models in a fraction of the time.  The maps in Figure 3 show the different estimates of the spatial distribution for the different causes of death. In general, the three models produce similar point estimates of the relative risks. These will provide a transformation of the model parameters so that they are not in the internal scale anymore and make inference easier. Spatial autocorrelation parameters are transformed to be in the range between α min and α max , and log-precisions are transformed to be variances. Correlation hyperparameters are transformed to be between −1 and 1.

Model
Hence, summary statistics for the models are: Note that the first hyperparameters in the PMCAR model is the spatial autocorrelation, which is very close to one. All the other parameters are the variances and correlation parameters.
In order to recover the variance matrix, the off-diagonal entries need to be computed. Note that these depend on three parameters and that this involves multivariate inference, so that the marginals alone are not enough.
For approximate multivariate posterior inference, INLA can draw samples from the (approximate) joint posterior of the hyperparameters using function inla.posterior.sample, but this is based on the internal representation of the model, which is what we have used in order to get the posterior means of the variance matrices. Hence, our approach saves computational time.
The internal representation of the model stores different ensembles of values of the hyperparameters {γ g } G g=1 and associated values of the log-posterior density. Instead of sampling with inla.posterior.sample we will re-scale the log-posterior densities to obtain weights associated to {γ g } G g=1 , so that each value of γ g is transformed accordingly and the posterior mean of the transformed hyperparameters are computed using the weights.
For the IMCAR and PMCAR models, the posterior means of the entries of the between diseases variance matrix are: and disease 3 (oral cancer) than disease 1 (cirrhosis) with any of the two other diseases. This makes sense as lung and oral cancer are known to be highly correlated (Botella-Rocamora et al. 2015).

Discussion
The INLAMSM package builds on top of the INLA package and implements a number of multivariate spatial latent effects. Hence, this package allows an easy and simple definition of these multivariate effects to be used within a formula term to fit multivariate spatial models to lattice data. Implementation of new latent effects for multivariate data is straightforward with the rgeneric latent effect include in the INLA package. This only requires the specification of the latent effect as a GMRF, which means that the mean, precision matrix and priors for the hyperparameters need to be provided. Once the model is implemented, it is easy to include it in the model formula to fit multivariate models with INLA.
In addition, the latent models implemented in the package can be used as templates to implement new multivariate spatial models for lattice data. In the future, we hope to increase the number of multivariate spatial models in the package.
In the examples provided to illustrate the use of the package we have considered a small dataset to fit all possible models. Times required to fit the models are small. The second example deals with a region with a larger number of areas which shows that our package can be used together with INLA to fit multivariate models.
Despite our focus on multivariate spatial models for disease mapping, it is worth mentioning that the multivariate models implemented in the INLAMSM package be used in other contexts. Furthermore, these multivariate spatial models can also be used to build temporal and spatio-temporal models by using a temporal adjacency matrix.