A Bayesian Hierarchical Model for the Spatial Analysis of Carbon Monoxide Pollution Extremes in Mexico City

Air pollution by carbon monoxide is a serious problem that affects many cities around the world, and the theory of extreme values has played a crucial role in the study of this issue. In this paper, we proposed a Bayesian hierarchical spatial model of extreme values to evaluate the risk of extreme events of air pollution due to carbon monoxide in the metropolitan area of Mexico City. Spatial trends are modeled through of a Gaussian process for the generalized extreme value (GEV) distribution parameters, and prediction maps are produced for each of these. .e results show a marginal spatial behavior for the location, scale, and shape parameters of GEV distribution, which indicate the existence of local variations that would not be possible to model using only stationary models. A return map of the maximum concentrations with a return period of one year is obtained. We found that the return levels for a one-year return period of CO concentration above 8 ppm in the Metropolitan Area of the Valley of Mexico are concentrated in the central part of this region, and the areas with the lowest estimates are distributed in the periphery. In addition, a quantile-quantile (QQ) plot between the theoretical and empirical quantiles was provided, which showed a very good fit of data to the proposed model.


Introduction
Carbon monoxide (CO) is a highly toxic odorless and colorless gas that is mainly emitted during the combustion of fuels or any other organic materials in an atmosphere with a limited amount of oxygen [1]. It is considered as one of the biggest pollutants in the Earth's atmosphere and one of the biggest environmental problems in Latin America [2]. Its presence in the atmosphere in high concentrations is fatal for humans. Automobiles are the biggest producers of CO, and intoxication by this pollutant is one of the most common types of poisoning. It can disable oxygen transport to cells and cause dizziness, headaches, nausea, unconsciousness, and even death when exposed to high levels [3].
e Mexican norm NOM-021-SSA1-1993 standard states that this pollutant must not exceed 11.00 parts per million (ppm), which is equivalent to a moving average of 12,595 μg/ m 3 over eight hours once a year, in order to protect the population at large [4].
On the other hand, in recent years, the statistical modeling of extreme events has been applied to numerous fields, such as hydrology and the environment, and the extreme value theory has been used extensively. Resnick et al. [5][6][7] presented the mathematical support for the extreme value theory, and Beirlant et al. [8,9] presented its statistical basis. Within this theory, modeling with spatial extremes has recently become of great interest and has given rise to two different methods of modeling. e first is based on the max-stable processes which are an infinite-dimensional generalization of the multivariate extreme value theory and which are the only possible limit models for properly renormalized block maxima and whose block size increases to infinity [9][10][11][12][13]. In this first method, recent applications have been developed in both the univariate and multivariate cases and both frequentist and Bayesian approach. For example, Vettori et al. [14] proposed a new class of multivariate max-stable processes that extend the Reich-Shaby model [15] to the multivariate setting, and their model admits a hierarchical structure which facilitates Bayesian inference. ey fit this model to the maxima of air pollution concentration and temperatures in the Los Angeles area. e contribution of Vettori et al. [14] is within this first method in the multivariate case and using the Bayesian approach. e second method for modeling spatial extremes, which the contribution of our work focused, uses Bayesian hierarchical models with a latent spatial structure, in which the spatial variations of the parameters in the marginal distribution can be described. One of the first applications of latent processes in the spatial extremes approach was presented by Coles and Casson [16,17], who used a Bayesian hierarchical model to model wind speed data of hurricanes on the U.S. Gulf Coast.
Recent applications of this second approach can be found in [18], who developed a hierarchical spatial model for constructing maps of extreme precipitation return levels for a region of Colorado. ey modeled the exceedances over a threshold u common to all locations using the generalized Pareto distribution and an isotropic and stationary exponential covariance function to induce the spatial dependence in the latent process for the intensity and scale parameters. To generate the return level map, both the exceedance and its frequency of occurrence were modeled and a hierarchical model for both of these was constructed. Turkman et al. [19] employed a random walk to characterize the temporal properties and smooth the spatial dependence, constructing a spatiotemporal model that was applied to wind speeds in Portugal. Gaetan and Grigoletto [20] proposed a spatial model for analyzing the annual maximum precipitation values over the Triveneto region in Italy. A two-level hierarchical model was proposed using nonstationary spatial dependences and a spatial trend in the generalized extreme value distribution parameters. Cooley and Sain [21] developed a Bayesian hierarchical model for characterizing extreme precipitation by simulating a regional climate model over its spatial domain. Sang and Gelfand [22] fit a spatiotemporal model to extreme precipitation by assuming conditional independence in the data given the spatially correlated parameters and provided spatiotemporal forecasts by evaluating the uncertainty with Monte Carlo methods. Nakajima et al. [23,24] proposed a state-space approach for modeling the temporal dependence in an extreme value process and proposed an efficient algorithm to implement the Markov chain Monte Carlo (MCMC) method to perform Bayesian inference. Apputhurai and Stephenson [25] employed a spatiotemporal latent process for modeling extreme precipitation events in southeast Australia, in which the temporal part is captured by modeling trends in the location and scale parameters, with the spatial part captured using an anisotropic Gaussian random field. Ghosh and Mallick [26] considered a spatiotemporal approach that incorporates the spatial dependence in the likelihood, whereas the temporal dependence is maintained inside the latent structure. Yan and Moradkhani [27] proposed a Bayesian hierarchical model for the analysis of frequent floods, which is an alternative to the traditional analysis because it captures the spatial dependence in its inner structure. Wang and So [28] proposed a Bayesian hierarchical model that addresses the problem of spatial extremes with multiple durations. e effect of these extreme durations is characterized by pooling the extremes with different durations and merging the duration into a latent spatial process structure as one of the covariates. García et al. [29] used a hierarchical spatiotemporal Bayesian model with a generalized extreme value parameterization to analyze the temporal trend in extreme rainfall in the region of Extremadura, Spain, for the period 1961-2009. Sharkey and Winter [30] proposed a spatial extreme value model using the Bayesian hierarchical modeling, using an adjusted likelihood to account for the spatial and temporal dependence in the data when performing inference on the model parameters, by imposing a condition of spatial similarity on the model parameters, and produced a map of probabilities of extreme events that have similar behavior at neighboring sites.
Many studies have focused on some pollutants and air quality. For example, Menezes et al. [31] studied NO 2 in Portugal from a spatiotemporal approach using geostatistical tools. From the extreme value classical theory point of view without considering the spatial component, Kuchenhoff and amerus [32] adjusted three different models to analyze extreme value of daily average of ozone and nitrogen dioxide concentrations measured at two monitoring sites in Munich. Barrios and Rodrigues [33] studied the occurrence and duration of ozone exceedances in Mexico City using a queueing theory model. Rodríguez et al. [34] studied the ozone trends in Mexico City using the generalized extreme value distribution, where they estimated a trend parameter in each region as well as an overall trend parameter. Martins et al. [35] applied the extreme value theory in two large urban regions of Brazil, and the probabilities of exceedance and the return period of high concentration of pollutant were calculated for carbon monoxide, nitrogen oxides, ozone, and suspended particles for both regions.
However, no studies have utilized the extreme value spatial approach.
us, this work focuses on the spatial analysis of CO maxima in the Metropolitan Area of Valley of Mexico using the Bayesian methodology. In Mexico City, with more than 21 million inhabitants in its metropolitan area, air pollution has long been a major concern [34]. In densely populated areas such as Metropolitan Area of the Valley of Mexico, this pollutant can be considered a public health problem. is region constitutes a densely populated area limited by mountains that impede the circulation of air and within which extreme cases of air pollution are recorded. In this region, CO concentrations present spatial variations due to the vast amount of surface extension that makes up the region and the unequal distribution of human settlements. Environmental Atmospheric Monitoring System (SIMAT for its Spanish initials) that is composed, among other components, of the Automated Atmospheric Monitoring Network (RAMA for its Spanish initials) continuously measures different pollutants including CO which has been the most consistent pollutant in the area. erefore, the goal of this article is to propose a hierarchical model with a spatial structure in the extreme value distribution parameters, which is generalized through a latent spatial process for generating prediction maps for our CO data.

Description of the Data and Study Region.
e study region is the Metropolitan Area of the Valley of Mexico (MAVM), which is shown in Figure 1. e region includes Mexico City and 29 municipalities of the State of Mexico and is located at 19°20 North latitude and − 99°05 West longitude. e region has an average elevation of 2240 meters above sea level and covers an area of 3540 km 2 . e elevation at which the Valley of Mexico is located causes the oxygen content to be 23% lower than at sea level, which tends to make combustion processes more polluting.
In the MAVM, there is currently an extensive infrastructure for the measurement and recording of atmospheric pollutant levels and of the main meteorological factors that affect the air pollution levels. is infrastructure includes RAMA, which consists of 30 monitoring stations throughout the region. e network monitors the atmospheric concentrations of the main pollutants, such as sulphur dioxide (SO 2 ), carbon monoxide (CO), nitrogen dioxide (NO 2 ), ozone (O 3 ), suspended particles (PM 10 and PM 2.5 ), total suspended particles (TSP), and lead (Pb). Of these pollutants, CO has consistently been the most emitted pollutant in the MAVM, with values that have varied between 66% and 79% of the total emissions [36]. e data analyzed consisted of daily measurements of the CO levels recorded in Mexico City and its metropolitan area. e data were obtained from SIMAT of Mexico City and consisted of daily measurements of the CO levels recorded at 9 meteorological stations (see Figure 1) over the period 2008-2015. e data are available at http://www.aire.cdmx. gob.mx.

Extreme Value eory.
Extreme value theory (EVT) deals mainly with modeling the tail of a distribution. ere are two approaches for obtaining extreme values, which are then used for modeling: block maxima and peak-overthreshold. e former considers taking the maxima in blocks of equal size. at is to say, given a succession of identically distributed independent random variables, X 1 , X 2 , . . . , if a block of size n is taken, then M n � max X 1 , X 2 , . . . , X n . If there are successions of constants a n > 0 and b n such that is the generalized extreme value (GEV) distribution [37], whose cumulative distribution function (CDF) is which is defined for 1 + (ξ(z − μ)/σ) > 0, − ∞ < μ, ξ < ∞, and σ > 0, where μ, σ, and ξ denote the location, scale, and shape parameters, respectively [9]. e ξ parameter controls the behavior of the distribution's tail. If ξ < 0, the GEV distribution corresponds to the Weibull distribution with an upper-bounded tail; if ξ � 0, it corresponds to the Gumbel distribution with an unbounded tail; and if ξ > 0, it corresponds to the Fréchet distribution with a heavy tail. For additional details about the second approximation, see [9,38]. A special case of the GEV distribution is the unit Fréchet distribution with a CDF G(y) � exp(− (1/y)). If the transformation Z � 1 + (ξ(y − μ)/σ) is considered, where Y follows a GEV distribution, then Z has a unit Fréchet distribution, meaning that the marginal GEV distributions can be separated from their dependence structure. One of the goals in the analysis of extreme values is to estimate a quantile of the distribution of Y, which is the solution of G(z p ) � p that results in for ξ ≠ 0, whereas if the limit ξ ⟶ 0 is taken, it results in is taken, then z p is called the return level in T observations, which is interpreted as the level exceeded on average every T observation. Obtaining this level is very convenient for interpreting extreme value models rather than just obtaining the values of the individual parameters.

Model Formulation.
To analyze our data, we propose a model that includes the spatial variation in the parameters of the extreme value distribution which can be described by a latent spatial process [12,16]. More specifically, it is assumed that the response variables Y(x) { } are conditionally independent on an unobserved latent spatial process S(x), x ∈ X { } and that the GEV distribution parameters μ(x), σ(x), ξ(x) vary smoothly for each x ∈ X according to this spatial process. In addition, S(x) { } is assumed to be mutually independent from each GEV distribution parameter, although this assumption can be relaxed [12]. For example, μ(x) is modeled as where f μ (x; β μ ) is a deterministic function that depends on the regression parameters β μ � (β 0 , β 1 , β 2 ) and S μ (x; α μ , λ μ ) is a stationary Gaussian process with a zero mean and a covariance function c μ (h) � α μ exp(− (|h|/λ μ )), where α μ and λ μ are the sill and range, respectively, which are both unknown and h ∈ R + is the Euclidean distance between location 1 and location 2. Similar expressions are considered for the parameters σ(x) and ξ(x). Moreover, we assume that S μ (x; α μ , λ μ ), S σ (x; α σ , λ σ ) y S ξ (x; α ξ , λ ξ ) are mutually independent. us, if it is conditioned over the three Gaussian processes at locations (x 1 , . . . , x D ), it is assumed that the maxima are independent with i � 1, . . . , n and d � 1, . . . , D. From the Bayesian approach, it is necessary to define a priori distributions for the parameters α μ , α σ , α ξ , λ μ , λ σ , λ ξ , β μ , β σ , β ξ , and following [12], Mathematical Problems in Engineering we take an inverse gamma distribution for α μ and a multivariate normal distribution for β μ , which are conjugated a priori, and a noninformative gamma distribution for λ μ . Similarly, a priori distributions are taken for the parameters σ(x) and ξ(x). To obtain samples of the joint posterior distribution, we use an efficient implementation of the Markov chain Monte Carlo algorithm included in the SpatialExtremes package in R Studio [39,40]. is algorithm is described in detail in [12].

Methodology.
To model the CO data in the study region with the previously described model, we propose the following methodology: Use the block maxima approach, in which each block is formed by three consecutive days of observations to guarantee their independence [9]. Adjust the GEV distribution (1) using marginal parameters described by the trend surfaces where lat(x) and lon(x) are the latitude and longitude, respectively, of the stations where the data were obtained.
Propose the a priori distributions for the parameters; a multivariate normal distribution with a zero mean and a large variance is proposed for the regression parameters, whereas for the parameters of the covariance function for the latent process S(x), α and λ, informative a priori inverse gamma and gamma distributions, respectively, are proposed [41]. e parameters for these distributions are shown in Table 1. ese distributions were chosen after an exploratory analysis of the behavior of the posterior distributions. Prepare the maps for each parameter of the GEV distribution and for the return level with the estimated trends.

Results
A statistical summary of carbon monoxide concentrations by monitoring stations, as well as their geographic information, such as latitude, longitude and coding, is shown in Table 2. Figure 2 shows the time series for Pedregal station from 2008 to 2013, in which the behavior of three-day block maxima of carbon monoxide is illustrated for each year. As previously mentioned, the Mexican norm NOM-021-SSA1-1993 standard states that this pollutant must not exceed 11.00 parts per million (ppm). Note that all stations have carbon monoxide levels below the permitted level according to the last column which shows the maximum recorded for each station. Figure 3 shows the boxplot of the CO maxima at each of the 9 monitoring stations considered in the study presented in Table 2. e station with the lowest CO maxima is located   Table 3. ese parameters were obtained after 100,000 iterations of the Markov chain with a thinning of 10 and a warm-up period of 5000 iterations. e Gibbs sampler algorithm produces samples of the posterior for all unknown quantities of our model and after a few minutes it achieves convergence.
Note that there are no variations (sign changes) in the parameters β 1 and β 2 for the location parameter μ(x). is seems reasonable because there is no significant change in elevation in the MAVM, which is on average 2240 meters above sea level. is is due to the fact that all of the meteorological stations in this study have an elevation of between 2221 and 2326 meters, something which also occurs with the pattern of variation for scale parameter σ. Because of the shape of parameter ξ < 0, there is a Weibull distribution for the results, although it could correspond to a Gumbel distribution because it is very close to zero, and the credible interval contains the positive part.
In environmental models, especially in Bayesian statistics, it is common to perform a sensitivity analysis of the results with regard to the selection of the a priori distributions. Figure 4 shows the posterior distributions for α and λ for each parameter of the GEV distribution, which have different a priori distributions. Although they are very different in all cases, the results in Table 3 and the maps in Figures 5-8 are similar. is means that in our analysis, thesey a priori distributions do not have a significant impact on the subsequent a posteriori distribution for these   at is to say, one could use other hyperparameters for the a priori distributions shown in Table 1 and the results presented in this article would continue to be valid. e QQ plot for our data and the GEV distribution with parameters given in (2) is shown in Figure 9.
e results show that simultaneous learning in α and λ parameters was not possible because the posterior distribution for each parameter does not differ significantly from the a priori distribution; this occurred in each parameter of the GEV distribution. ese results are shown in Figure 10.   Table 3: Estimated parameters for the posterior distribution and the latent process. e estimation corresponds to the posterior mean and its corresponding credible interval of 95%, which is shown in parenthesis.    Figure 8 shows the map for the oneyear return level. ese maps were generated by performing a conditional simulation of three independent Gaussian processes for each state of the Markov chain given the actual values of μ, σ, and ξ. is realization is then used to calculate the return levels at the locations. ese maps show the parameters' marginal spatial behavior, which allows the local variations in the return levels to be captured, something which is not possible using only the deterministic trends. erefore, this shows the main strength of the latent variable approach: the use of stochastic processes to model the spatial behavior of the marginal parameters enables us to capture complex local variation in the return levels that deterministic trend surfaces cannot reproduce.
One of the goals of the proposed study is the use of information from multiple monitoring stations within a single model, thus taking advantage of existing relationships between data more efficiently and generating more robust results than when using only one monitoring station. We achieved this by adjusting a nonstationary GEV model with a radial-based Gaussian correlation structure, assuming a Gaussian process for the parameters. Unlike most models used in geostatistics which use the normal distribution to directly model the concentrations of the air pollutant, we have used the generalized extreme value distribution by adjusting their respective GEV parameters to each spatial location to model the nonlinear spatial features present in trends of the maxima of carbon monoxide (CO) concentrations.

Discussion
Similar to the findings in other studies related to the spatial distribution of air pollutant maxima [42], we have found variations and geospatial tendencies in the analyses of extremes of CO concentrations using a nonstationary extreme value model. ese variations have been shown through maps of the location, scale, and shape parameters of the GEV distribution. In the case of [43], the authors analyzed the distribution of the PM10 maxima in the MAVM and found patterns similar to the CO maxima, except for the northern region of the study area in which estimates with low levels were obtained. ese results are similar despite the fact that a different approach to define the predictor of the GEV parameters was used. Based on the principle that close observations in space tend to have more similar values than more distant observations, we assume a stochastic process with a covariance structure dependent on the distance between the covariates as a predictor of the parameters of the GEV distribution. Furthermore, we assume an exponential model for the covariance structure of the stochastic process, similar to some geostatistics models based on Gaussian processes, although, unlike the latter, the GEV distribution is used as a response variable.
In the spatial plane, we observed that the parameters of the distribution of extreme values vary locally. In particularly, the locality parameter ( Figure 5) shows that pollutant concentrations increase in the central zone of the study area and gradually decrease towards the periphery.
is is due to the high number of automotive vehicles and the concentration of factories in this area. Similarly, as we move towards the periphery, pollutant concentrations decrease, due to the existence of natural areas, such as Ajusco, where there are still protected natural areas that constitute a natural vent for the rest of the conurbation area. is was also the case for the other distribution parameters (Figures 6  and 7). Regarding the map of the estimated one-year return level for CO data (Figure 8), the northwest zone presents the lowest level of this pollutant compared with the southeast zone, which, as previously mentioned, is due to the forested areas. is shows the main strength of the latent variable approach: the use of stochastic processes to model the spatial behavior of the marginal parameters enables us to capture complex local variations in the return levels that deterministic trend surfaces cannot reproduce.
It is interesting to compare the sectoral distribution of CO emissions in the north, center, and south of the MAVM. Although the northern region is an industrialized area, the highest 1-year return levels of CO maxima are found in the central region, which also has the highest population density. is is consistent with the results of [43], which indicate that the domestic and transport sectors emit more CO emissions than the industrial sector. e southern region has a lower population density with less industry, and, therefore, resulted in consistently lower 1-year return level estimates being obtained for this region.

Conclusions
In the present research, we have proposed a parsimonious and robust spatial model of extreme values to evaluate the risk of extreme events of air pollution due to carbon monoxide (CO). Prediction maps are obtained for the generalized extreme value distribution parameters, in which a marginal behavior is observed allowing us to capture local variations in the return levels. is would not be possible using only the deterministic methods for estimating the trends of GEV parameters. e estimation of parameters was carried out using a Bayesian approach in order to include the proposed correlation structure of the linear predictor of the GEV parameters in a natural manner, as well as to eliminate possible invalid parameter values. We found that the return levels for a 1-year return period of CO concentration above 8 ppm in the MAVM are concentrated in the central part of the region studied and the areas with the lowest estimates are distributed in the periphery. ese areas with high levels in the CO maxima coincide with the most densely populated regions and are consistent with the results that indicate that the domestic sector is one of the main sources of carbon emissions.
e results obtained in this research on the spatial distribution of CO maxima, in addition to evaluating and monitoring the risk of extreme events, could be used for further study on carbon emissions and climate as validation and testing tools and will help government authorities to establish policies for the prevention, monitoring, and surveillance of extreme concentrations of CO pollution. Extensions of this work should consider more complex smoothing functions for the GEV parameters to capture the additional nonlinear components of the trend of extreme values of CO concentrations. Moreover, future research should explore different correlation structures for the Gaussian process for the parameters of the GEV distributions.
Data Availability e carbon monoxide pollution data used to support the findings of this study have been deposited in the http://www. aire.cdmx.gob.mx repository.

Conflicts of Interest
e authors declare no conflicts of interest.