Critical Episodes of PM10 Particulate Matter Pollution in Santiago of Chile, an Approximation Using Two Prediction Methods: MARS Models and Gamma Models

This has meant that countries take a series of environmental management measures to control particulate material emissions and also to generate early prediction of high air pollution episodes. Some measures are: a systematic shift to cleaner fuels, restriction of daily circulation of a certain percentage of motor vehicles, temporary shutdown of some industries, and so on. Air pollution causes are diverse, being anthropogenic activities the major contributor to the problem. But the air pollution level is also influenced by other factors such as climate and topography. Climate has a decisive influence on the persistence of air pollutants, and the winds, temperature and solar radiation drastically alter the dispersion and the type of contaminants present at one time. Topography influences the movement of air masses and hence the persistence of pollution levels in a given geographical area. The combination of these ultimately determines the quality of air 20.


Introduction
Several studies worldwide have reported the effects of air pollution on human health, especially those related with exposure to particulate matter 1,8,17,20,21,25 , research it is currently focussed on studying the acute and short-term effects, especially the mortality and morbidity impact due to cardiovascular and respiratory reasons 1,8,17,21,25 .
This has meant that countries take a series of environmental management measures to control particulate material emissions and also to generate early prediction of high air pollution episodes.Some measures are: a systematic shift to cleaner fuels, restriction of daily circulation of a certain percentage of motor vehicles, temporary shutdown of some industries, and so on.Air pollution causes are diverse, being anthropogenic activities the major contributor to the problem.But the air pollution level is also influenced by other factors such as climate and topography.Climate has a decisive influence on the persistence of air pollutants, and the winds, temperature and solar radiation drastically alter the dispersion and the type of contaminants present at one time.Topography influences the movement of air masses and hence the persistence of pollution levels in a given geographical area.The combination of these ultimately determines the quality of air 20 .
Prediction of critical episodes of air pollution in large cities has become an environmental management tool aimed to protect the health of the population, allowing health authorities to know with some certainty the likely levels of air pollution border within a certain time interval.This prediction has been addressed through different models combining deterministic and probabilistic approaches using various types of information 6,9,13,14,22,28 .The official methodology in use by the administrative authority of the Metropolitan Region of Santiago de Chile to forecast PM10 concentrations is the Cassmassi Model, proposed in 1999 by Joseph Cassmassi 2 .It uses multiple linear regression to predict the maximum concentration of 24-hour average PM10 for the next day (00 to 24 hours).This model includes observed meteorological variables, observed and forecasted weather conditions rates, pollutants expected concentrations rates of expected variations in emissions and others predictors.
Different statistical methods have been used in Chile to model airborne PM10 concentrations of air pollutants including time series 20 , neural networks 13,22 and regression models based on multivariate adaptive smoothing functions (splines) MARS 23 .The predictive efficiency of these models is variable and is closely associated with the behavior and evolution of environmental characteristics.Models that use extreme value theory are widely used for this purpose, especially for episodes that occur over short periods of time and present extreme values or exceedances of emergency limits established by the authority 3 .
The aim of this study is to compare the predictive efficiency of multivariate predictive models Gamma vs MARS to predict "tomorrow" maximum concentration of PM10 in Santiago de Chile in the period between April 1 and August 31 of the years 2001, 2002, 2003 and 2004.

Information sources
We used the databases of PM10 collected at the Pudahuel monitoring station, that it is element of the MACAM2-RM monitoring network, for the years 2001, 2002, 2003 and 2004.For each year measurements were selected from April 1 to August 31, which correspond to the time of year with less ventilation in the Santiago basin.We worked with the moving average of 24 hours.For missing data, imputed values were generated by a double exponential smoothing with a smoothing coefficient α = 0.7 (see Annex 1).We selected this monitoring station because most of the year presents the highest levels of PM10 concentration.It is also the most influential in taking administrative decisions about forecasting critical episodes for the next day.Moreover, because of environmental management measures implemented when declaring a critical episode of PM10 pollution, the behavior of the time series, is affected generating lower levels of concentrations that do not reflect actual observed concentration that would have occurred without environmental management measures , so we penalize this effect by introducing a correction constant given by 1 24  24

Procedures
152 multi-dimensional observations were used; they consisted of a response variable PM10 and 13 predictors.The modeling includes delays of 1 and 2 days for the variables of interest, corresponding (N+1) to tomorrow, i.e. the day to be modeled.The predictor variables for today (delay 1) were defined as pm0 the hourly average PM10 concentration at 0:00 hrs of day N, pm6 the average hourly PM10 concentration at 6:00 hrs on day N, pm12 the average hourly PM10 concentration at 12:00 hrs on day N, pm18 the hourly average PM10 concentration at 18:00 hrs on day N.
Some predictor variables incorporate delays of 2 days; they are: pm10h maximum PM10 concentration of 24-hour moving average between19 hrs of day N-1 and 18 hrs day N, mth maximum temperature between 19 hrs of day N-1 and 18 hrs o day N, mhrh minimum relative humidity between 19 hrs of day N-1 and 18 hrs of day N, dth difference between maximum and minimum temperature between 19 hrs of day N-1 and 18 hrs of day N, vvh , average wind velocity 19 hrs of day N-1 and 18 hrs of day N.The predictors of tomorrow (N+1) correspond to: mtm maximum temperature of day N+1, mhrm minimum relative humidity of day N+1, dtm difference between the maximum and minimum temperature of day N+1 and vvm average speed wind of day.The response in this study (pm10m) is the maximum concentration of the 24 hs moving average of PM10 of a day N+1.The values of the variables of tomorrow are forecasts validated and reported by the Chilean Meteorological Office using models Mesoscale Modeling System (MM5), which is a numerical model that uses the equations of physics of the atmosphere for weather forecasting in limited areas at the regional level 15 , The authorities of the National Environment Commission (CONAMA), have defined four levels of PM10, in order to make management decisions when critical events occur: good 0 -193 (µg/m 3 ); alert 194 -239 (µg/m 3 ), pre-emergency 240 -329 (µg/m 3 ) and emergency PM10 > 330 (µg/m 3 ) 3 .For our study we dichotomized the response into two classes I: pm10m < 240 (µg/m 3 ) and II: pm10m > 240 (µg/m 3 ), ie, "good or alert" versus "preemergency or emergency."The aim of the dichotomy is to generate 2 x 2 cross-tabulations between the observed and predicted values of the response for each proposed model.

Construction of Gamma models and MARS models
The fit of the models of a given year was validated with the information of the following year, thus ensuring the independence of the data used for validation with respect to those used in its construction.Therefore no predictions are given for the model year built with year 2004 information since there is no information for 2005.Each model was estimated with data from the period between April 1 and August 31 of a year and applied to the following year's data for the same period, evaluating the fit of these estimates compared with actual observations for that second year.

Gamma regression
Gamma models are used in situations where the variable has non negative values; were originally used for continuous data, but now the family of Gamma generalized linear models is used with count data 7 .In general, these models consider different ways of how to work the response variable, such as exponentiation of the response using the log-gamma transformation 12 .
The probability density function for the generalized gamma function is given by () (; , , ) ; 0 () , where 2 ln( ) ,( ) The parameter  is equal to t x  where x is the matrix of predictors including the intercept and  is the vector of coefficients.For generalized Gamma distribution the expected value conditional on x is given by: where 01 ˆ(1 / ) exp( ln( ( ))) and ln( )  is parameterized as 01 ln( ( )) Since it is required to report estimates and results in the original measurement scale, we work with the exponentiation of the model using the log-gamma transformation () e x p ( )    and ensure that the transformation does not affect the interpretations which refers directly to the original scale 12 .

Multivariate Adaptive Regression Splines (MARS)
MARS is a methodology proposed by Jerome Friedman in August 1991, it tries to build a model of non-linear regression that is based on a product of functions called base smoothing functions (splines).These functions incorporated into its structure predictors entering the model as part of a function, not directly as in classical regression, produces a model for the response in study that may be continuous or binary that automatically selects the predictors present in the final equation, they are incorporated in the smoothing basal functions 5,10,11 .

Model for a predictor
The methodology MARS, proposed by Friedman 5 , selects K nodes of the predictor variable x, denoted by k t , 1,......., kK  , which could correspond to each of the observations of the variable; then K +1 defined regions on the range of x, where it is associated to each node the linear smoothing function, generating a family of basis functions:   () 0,......, () 1,....., it is known as a truncation function .For the approximation of order q, we estimate the function , usually the order of smoothing to be taken must be less than or equal to three, so the function and its q -1 first derivatives are continuous.This restriction and the use of polynomial functions in each subregion produce smooth and tight functions.

Generalization to p predictors
For the vector of predictors 12 ( , ,...., ) the smoothing function is defined analogously to the univariate case.In this case the space p R is divided into a set of disjoint regions and within each region a polynomial of p variables is fitted.
For p > 2 disjoint regions are considered to define the smoothing approximation as tensor products of disjoint intervals in each of the variables associated to the node location.So placing K j nodes in each variable produces a product of K j +1 regions, j = 1,..,p.A set of basis functions generating the space of smoothing functions for the entire set of regions, is the tensor product of the corresponding basal one-dimensional smoothing functions associated with the location of the nodes in each variable given by:   


The selection of basis functions looks for a good set of regions to define a smoothing approach adequated to the problem; MARS generates the basis functions by a stepwise process.It starts with a constant in the model and then begins the search for a variable-node combination that improves the model.The improvement is measured in part by the change in the sum of squared errors (MSE) , adding basis functions is done whenever it reduces reduce the MSE.
To evaluate this model, Friedman proposes using the Generalized Cross Validation statistic B is the design matrix, the numerator is the lack of fit on the training data set and the denominator is a penalty term that reflects the complexity of the model.
To compare the models we considered the following statistics: (a) Pearson's linear correlation between the observed and the predicted value and (b) mean absolute error ratio (mpab) between the observed (obs) and the predicted value (pred) given by


that is equivalent to evaluate the average errors committed by both models in the predictions.We also considered the concordance proportion in each class.
In order to compare the settings of the models and the regions selected for predicting the response variable two MARS models were constructed to each year, one based on 20 and another using 40 base functions, allowing us to choose that pattern which best fits the expected response based on the partitions of the predictor variables 11 .In the case of Gamma regression we used a logarithm link function, with three sets of predictors: the first corresponds to all variables, the second set involved yesterday and today variables and the third included the variables: pm0, pm6 , pm18, dtm, dth and vvm.This last set of variables would better describe the behavior of the pm10 concentrations, as has been described by other authors 18 .

Results
Table 1 shows the descriptive statistics of the variables incorporated in the final modeling of PM10 concentrations.We can see that the maxima for the years 2001, 2002 and 2003 exceed the value of 240 (µg/m 3 ), such behavior is not seen for 2004.No. 2 shows the successes of the class I and II by the Gamma models and the MARS models.The correlations are significant for the three models per year, the mpab remain high for all models, except the 40 fb MARS 2002 model where we can see a 19% mean absolute error similar to the Gamma Model.In general for all three years the Gamma models have better performance than MARS for the PM10 concentrations < 240 (µg/m 3 ); MARS models have better performance for PM10 exceeding 240 (µg/m 3 ).No. 3) with maximum value 165 (µg/m 3 ) .This implies that for PM10 concentrations at 18 pm (pm18) above 137 (µg/m 3 ) and wind speed for tomorrow of 1,522 m/s, the contribution to the PM10 tomorrow concentration due to the interaction has a maximun of 165 (µg/m 3 ).On the other hand, figure (b) shows that for concentrations above 240 (µg/m 3 ), the variable pm0 has a maximum contribution to the response of 70 (µg/m 3 ), which represents the value with which the base function BF4 contributes to the predicted value.cutting points that act as threshold values for each predictor variable selected, indicating the change in the contribution generated by the basis function to the response under study.

to
After selecting the optimal model, MARS fits the model removing one variable, in order to determine the impact on the quality of the model due to eliminate that variable and assigns a relative ranking from the variable most important to the least important.Thus, replacement or competing variables are defined in the MARS methodology allowing us to treat missing values generating a special basis function for those variables that have no information, ie, generate an allocation basis function whose purpose is to impute the average value of the predictor without information.
As could be determined in this study, the MARS models performed similarly in their predictive efficiency when changing the number of basis functions, regardless of the year for which the model was built and the year used for validation.Partitioning the range of the predictor variables did not improve the quality of predictions, showing that sometimes a less fine partition (minus subregions) methodology was more robust than a thin partition.This could be explained as the time series of PM10 over time show a downward trend and variations in lower concentrations since the intervention measures applied 3 .
The Pudahuel monitoring station registered changes in annual and monthly concentrations of PM10 between 1998 and 2004.These changes were due mainly to global decontamination measures, extraordinary administrative actions on critical air pollution events, use the forecast model of critical events to reduce their negative impact and meteorological factors.Such changes affect the performance of these models allowing a less complex structure without loss of efficiency because the relationship established between the predictors is not as complex, this is reflected in the basis functions constructed.In turn there are differences from one year to another, which could be influenced by weather conditions of each year, such as dry year, El Niño and other climate changes.Another factor to evaluate and that could be relevant in this behavior is the increase in car ownership in recent years.
Additionally, different methodologies have been implemented to model the concentration of particulate matter in the metropolitan region of Santiago de Chile.For example, Silva and colleagues applying time series using transfer functions involving meteorological variables, reported a 40% average ratio of absolute error in the prediction of critical events 24 .Moreover Perez and colleagues, using neural networks with pre-smoothing, reported approximately 30% mean absolute error in predicting the critical episodes 19 .Subsequent work by the same author using a neural network shows results superior to classical linear regression 18 .Furthermore, Silva and colleagues assessed two methodological approaches to the problem of predicting air pollution by particulate matter, reporting that MARS generated better predictions than the discriminant analysis 23 .
Application of these models to the monitoring station Pudahuel shows that the predictor variables that best predict the answer would be (i) some PM10 concentrations: pm0, pm6 and pm18 and (ii) meteorological variables: vvm dtm; this is consistent since MARS selected variables related to persistence of ventilation conditions, which are related to meteorology 23.Prediction methodologies give us adequate models to study particulate matter pollution.Gamma regression was generally lower in Class II hits than MARS models, except for predicting from 2003 to 2004, so MARS appears as a better prediction tool of pollution episodes above the 240 (µg/m 3 ).Additionally, an advantage of the Gamma model is that generally makes better predictions for the class I (PM10 < 240 µg/m 3 ), ie, is sensitive to values of concentrations of particulate matter associated to air quality good or warning; this would be given by the behavior of the variable of interest.For example, for 2001 the intervention measures explain the best fit of the Gamma modeling in that year, whereas later on these measures had a moderate impact on the values of the series of PM10, making these models less efficient compared to MARS, as happens for the years 2002, 2003 and 2004.This study basically aims to proposed models being able to detect concentrations above the threshold of 240 µg/m 3 that mandate periods of epidemiological alert in Santiago de Chile; this consideration would place the MARS modeling as a tool statistically more powerful than Gamma modeling.This last point is consistent with previous findings showing that M A R S i s m o r e e f f i c i e n t than other techniques 4,23,27 .This could be explained by the smoothing approximation that uses this methodology generating breakdowns in the predictors time series and locally adjusting the basis functions in function on such nodes.

Appendix 1. -Exponential smoothing
This technique uses a smoothing constant; if the constant is close to 1 it affects very much the new forecast and conversely when this constant is close to 0, the new forecast will be very similar to the old observation .If you want a sharp response to changes in the predictor variable you must choose a large smoothing constant.The formula that relates the coefficient and the time series is given by [2]  [2] 1 (1 ) ; to generate the smoothing is necessary to have the values S 0 and S t , where x t corresponds to the values of the original series 16,26 .
www.intechopen.comCritical Episodes of PM10 Particulate Matter Pollution in Santiago of Chile, an Approximation Using Two Prediction Methods: MARS Models and Gamma Models 145 www.intechopen.comCriticalEpisodes of PM10 Particulate Matter Pollution in Santiago of Chile, an Approximation Using Two Prediction Methods: MARS Models and Gamma

Fig. 1 .
Fig. 1.Shows that the Gamma model predictions are further away from observed PM10 than MARS predictions for high PM10 concentrations for year 2003 using model of 2002, but Gamma model gives better predictions for values below 200 (µg/m 3 ).

Fig. 2 .
Fig. 2. Corresponds to the MARS model with 40 basis functions for 2002 and illustrates in part (a) the interaction of the basis functions max (0; pm18-137) and max (0; vvm-1, 522) that generate the basis function BF8 which represents an interaction surface (TableNo.3) with maximum value 165 (µg/m 3 ) .This implies that for PM10 concentrations at 18 pm (pm18) above 137 (µg/m 3 ) and wind speed for tomorrow of 1,522 m/s, the contribution to the PM10 tomorrow concentration due to the interaction has a maximun of 165 (µg/m 3 ).On the other hand, figure (b) shows that for concentrations above 240 (µg/m 3 ), the variable pm0 has a maximum contribution to the response of 70 (µg/m 3 ), which represents the value with which the base function BF4 contributes to the predicted value.
Table No.3 shows the explicit model for the MARS models with 20 and 40 basis functions and the Gamma models for the years 2001, 2002 and 2003.The complexity of the model is seen in the number of basis functions incorporated into the explicit model, the model for 2002 has nine basis functions of which five are interactions of univariate basis functions, the BF9 and BF8 functions correspond to interactions of the mirror variable for vvm and the corresponding basis function associated with pm18.

Fig. 3 .
Fig. 3. Shows MARS model with 20 basis functions for 2003 and illustrates in part (a) the interaction of the basis functions max (0;72-pm18) and max (0; vvm-0,56) that generate the basis function BF10 which represents an interaction surface (Table N .3).In the other figure (b) shows the surface contour are four regions where the letters A, B, C and D, D represents the area for values less than 0,56 meters per second (vvm) and particulate matter concentrations at 18 hours greater than 72 micrograms per cubic meter that generated by the interaction of both base functions with the maximum contribution values to the response variable of interest.

Table 1 .
Mean and standard deviation of predictor variables used to model PM10.Years 2001 to 2004.

Table 2 .
Results of MARS and Gamma modeling.