Predicting Fire Season Intensity in Maritime Southeast Asia with Interpretable Models

There have been many extreme fire seasons in Maritime Southeast Asia (MSEA) over the last two decades, a trend that will likely continue, if not accelerate, due to climate change. Fires, in turn, are a major driver of atmospheric carbon monoxide (CO) variability, especially in the Southern Hemisphere. Previous studies have explored the relationship between climate variability and fire counts, burned area, and atmospheric CO through regression models that use climate mode indices as predictor variables. Here we model the connections between climate variability and atmospheric CO at a level of complexity not yet studied and make accurate predictions of atmospheric CO (a proxy for fire intensity) at useful lead times. To do this, we develop a regularization-based statistical modeling framework that can accommodate multiple lags of a single climate index, which we show to be an important feature in explaining CO. We use this framework to present advancements over previous modeling efforts, such as the inclusion of outgoing longwave radiation (OLR) anomalies, the use of weekly data, and a stability analysis that adds weight to the scientific interpretation of selected model terms. We find that the El Niño Southern Oscillation (ENSO), the Dipole Mode Index (DMI), and OLR (as a proxy for the Madden-Julian Oscillation) at various lead times are the most significant predictors of atmospheric CO in MSEA. We further show that the model gives accurate predictions of atmospheric CO at lead times of up to 6 months, making it a useful tool for fire season preparedness.

[...] indices and their proxies). The following subsections describe the data used as our response and predictor variables.

For the response, we use carbon monoxide column-averaged volume mixing ratios (referred to as simply CO) from the MOPITT instrument onboard the Terra satellite [...] only retrievals from the joint near infrared (NIR) and thermal infrared (TIR) product. Daytime retrievals over land have a higher sensitivity to CO than nighttime or ocean retrievals due to higher thermal contrast. We use the joint product because it includes additional information from reflected solar radiation over land (Worden et al., 2010). See [...] resulting in 19 years of data and 991 weekly observations. We compute the seasonal cycle by taking an average over the 19 years of data for each week. We then remove this seasonal cycle from the weekly time series so that our models are better able to capture the anomalous CO observations corresponding to large burn events. Figure 2 shows the weekly CO observations, climatological average, and resulting anomalies for the MSEA region.

Finally, since we are interested in using CO as a proxy for fires, we only model anomalies during the fire season in the Southern Hemisphere, defined here as September through [...]

[...] other aspects of climate over certain spatial regions. A well known example is ENSO, which captures quasi-periodic variability in sea surface temperature and wind in the Pacific Ocean (Neelin et al., 1998; Trenberth, 2013 [...] Thompson and Wallace (2000). This index captures Antarctic [...] Figure 1(b) via the arrows labeled SAM. We expect a relationship between these indices and CO, as each index is related to regional climate (e.g., rainfall), which in turn affects drought, fire, and ultimately CO concentrations.
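The climatology removal described above for the weekly CO series (average each week of the year across all years, then subtract that average) can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `weekly_anomalies`, the week labels, and the toy values are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import fmean

def weekly_anomalies(weeks, values):
    """Remove a week-of-year climatology from a weekly series.

    weeks  : week-of-year label for each observation (e.g. 1..52)
    values : observed value for each observation (same length as weeks)
    Returns (climatology, anomalies), where climatology maps each week
    label to its multi-year mean and anomalies is value minus that mean.
    """
    if len(weeks) != len(values):
        raise ValueError("weeks and values must have the same length")
    by_week = defaultdict(list)
    for week, value in zip(weeks, values):
        by_week[week].append(value)
    # Seasonal cycle: one mean per week of the year, taken over all years.
    climatology = {week: fmean(vals) for week, vals in by_week.items()}
    anomalies = [value - climatology[week] for week, value in zip(weeks, values)]
    return climatology, anomalies

# Toy data (made up): two "years" of three "weeks" each.
weeks = [1, 2, 3, 1, 2, 3]
values = [10.0, 20.0, 30.0, 14.0, 20.0, 26.0]
climatology, anomalies = weekly_anomalies(weeks, values)
```

The same function would apply unchanged to any other weekly series whose anomalies are defined in the same manner.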

In addition to these four indices, we also want to include variability captured by [...] This introduces multiple coefficient estimates for a single physical phenomenon, which makes it harder to model and hinders model interpretability.

Instead of using these EOFs, we use outgoing longwave radiation (OLR) anomalies to capture variability described by the MJO in our models. OLR is a metric that describes how much energy is leaving the atmosphere and is one climate variable used [...]

We aggregate OLR values over the same spatial region that defines the MSEA region shown in Figure 1, and we create anomalies in the same manner as the CO anomalies [...]

where CO(t) is the CO anomaly at time t, μ is a constant mean displacement, a_k, b_ij, and c_l are coefficients, χ are the climate indices, τ is the lag value for each index in weeks, ε(t) is a random error component, and k, i, and j iterate over the number of climate indices used in the analysis. Note that we standardize the climate indices, χ, before fitting the model so that coefficient estimates can be directly compared. We consider lags between one and 52 weeks for each index, excluding zero week lags so that our models can be used for prediction. We also enforce strong hierarchy, meaning that any covariate that appears in an interaction or squared term must also appear as a main effect. Strong hierarchy has long been recommended for models with interactions, as it helps avoid misinterpretation of the included covariates (Nelder, 1977). See the Supporting Information file for more details on strong hierarchy.
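The definitions above are consistent with a model of the form CO(t) = μ + Σ_k a_k χ_k(t − τ_k) + Σ_{i,j} b_ij χ_i(t − τ_i) χ_j(t − τ_j) + Σ_l c_l χ_l(t − τ_l)^2 + ε(t), with χ denoting the standardized climate indices. This form is inferred from the stated terms (main effects, interactions, squared terms) and should be read as an assumption, since the equation itself is not reproduced here. Under that assumption, the sketch below shows how lagged covariates and a strong hierarchy check could be built. The helper names (`standardize`, `design_row`, `satisfies_strong_hierarchy`) and the toy series are hypothetical, and no model fitting is performed.

```python
from itertools import combinations

def standardize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    n = len(series)
    mean = sum(series) / n
    sd = (sum((v - mean) ** 2 for v in series) / n) ** 0.5
    return [(v - mean) / sd for v in series]

def design_row(indices, main_terms, t):
    """Covariates at time index t for the given (index_name, lag_weeks) main effects.

    Main effects are keyed by (name, lag); interaction and squared terms
    are keyed by a pair of main-effect keys.
    """
    row = {}
    for name, lag in main_terms:
        if lag < 1:
            raise ValueError("zero or negative lags cannot be used for prediction")
        if t - lag < 0:
            raise ValueError("not enough history for this lag")
        row[(name, lag)] = indices[name][t - lag]
    for a, b in combinations(main_terms, 2):
        row[(a, b)] = row[a] * row[b]  # interaction term
    for a in main_terms:
        row[(a, a)] = row[a] ** 2      # squared term
    return row

def satisfies_strong_hierarchy(selected_terms):
    """True if both parents of every interaction or squared term are main effects."""
    mains = {term for term in selected_terms if isinstance(term[0], str)}
    higher_order = [term for term in selected_terms if not isinstance(term[0], str)]
    return all(a in mains and b in mains for a, b in higher_order)

# Toy series (made up, left unstandardized so the arithmetic is easy to follow).
indices = {"DMI": [1.0, 2.0, 3.0, 4.0, 5.0], "NINO": [2.0, 0.0, -2.0, 0.0, 2.0]}
# Two lags of the same index are allowed alongside a lag of another index.
main_terms = [("DMI", 1), ("DMI", 3), ("NINO", 2)]
row = design_row(indices, main_terms, t=4)
z = standardize([1.0, 2.0, 3.0])
```

Keying higher-order terms by their parent main effects makes the hierarchy check a simple set lookup, which is one reason for this representation; other encodings would work equally well.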

Although the high frequency variability present in the weekly climate index data has important near-term effects, we do not expect it to have a large impact on the amount, type, and dryness of available fuel far into the future. This is because we believe that short anomalies do not last long enough to drastically alter large scale fuel reserves. Therefore, we want covariates with longer lags to capture progressively lower frequency components of the climate indices.
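As a purely illustrative sketch of this idea, one could smooth an index more heavily before using it at a longer lag. The smoothing method actually used is not given in this excerpt, so everything below is an assumption: a trailing moving average stands in for the smoother, and the rule that the window grows by one week per `weeks_per_step` weeks of lag is hypothetical.

```python
def trailing_mean(series, window):
    """Trailing moving average: entry t averages the last `window` values up to t.

    Near the start, where fewer than `window` values exist, the average
    is taken over the values available so far.
    """
    out = []
    running = 0.0
    for t, value in enumerate(series):
        running += value
        if t >= window:
            running -= series[t - window]  # drop the value that left the window
        out.append(running / min(t + 1, window))
    return out

def window_for_lag(lag, weeks_per_step=4):
    """Hypothetical rule: the smoothing window (in weeks) grows with the lag."""
    return max(1, lag // weeks_per_step)

def smoothed_lagged_value(series, lag, t):
    """Value of the lag-appropriately smoothed series at time index t - lag."""
    if lag < 1 or t - lag < 0:
        raise ValueError("lag must be at least 1 and leave enough history")
    smooth = trailing_mean(series, window_for_lag(lag))
    return smooth[t - lag]

# Toy example (made-up values).
example = trailing_mean([1.0, 2.0, 3.0, 4.0], window=2)
```

Because the average is trailing, the smoothed value at time t − τ uses only observations at or before t − τ, so the covariate remains usable for prediction.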

To accomplish this, we apply more smoothing to the climate mode indices as the [...]

Here we examine the physical implications of the models fit using the procedure described in Section 4. We focus on connections between climate and atmospheric chemistry in MSEA through an analysis of selected indices and lag values.

[...] because rainfall leads to vegetation growth, which ultimately provides more fuel for fires. The length of this lag is longer, implying that it takes around 12 weeks for the increased vegetation growth to impact CO concentrations.

manuscript submitted to JGR: Atmospheres

The effect of these two DMI lags is compounding. That is, more vegetation as a result of DMI-driven rainfall at a 12 week lag leads to more fuel when a subsequent positive DMI anomaly creates dry conditions. This is supported by the negative coefficient on the interaction between the DMI lag of 12 weeks and one week present in the largest model in Figure 4. Because the coefficient is negative, there is less CO on average when the DMI has the same phase (i.e., either a positive or negative anomaly) at both a 12 and one week lag. Conversely, when the two lags have opposite phases, their product is negative and the interaction term contributes positively to CO.

[...] weekly data (i.e., the model shown in Figure 4). We see a noticeable increase in model performance when using the weekly data, suggesting that the weekly data is able to cap[...]

[...]ing data, and test it on the left out year. The RMSE is then averaged over the training and testing set pairs and plotted as a function of minimum-lag-threshold. We again see that performance falls off, although gradually.

We think that the gradual nature of the decline in model performance is a result of the climate indices exhibiting high auto-correlation (not shown). Since many of the short lags are highly correlated with longer lags of the same index, we think that these longer lags are able to explain much of the same CO variability when the shorter lags are excluded. This is again promising, as it means that predictions can be made reasonably far in advance (on the order of a half year) without dramatically compromising performance.
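The leave-one-year-out evaluation described above can be sketched as follows. This is a minimal sketch under stated assumptions: a single-covariate least-squares line stands in for the regularized model, and the names `fit_simple_ols`, `leave_one_year_out_rmse`, and `candidate_lags`, as well as the toy data, are hypothetical.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean squared error between observations and predictions."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def fit_simple_ols(x, y):
    """Least-squares line y = b0 + b1 * x (a stand-in for the regularized model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxy = sum((v - mx) * (w - my) for v, w in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def leave_one_year_out_rmse(years, x, y):
    """Hold out each year in turn, train on the rest, and average the test RMSE."""
    scores = []
    for held_out in sorted(set(years)):
        train = [i for i, yr in enumerate(years) if yr != held_out]
        test = [i for i, yr in enumerate(years) if yr == held_out]
        b0, b1 = fit_simple_ols([x[i] for i in train], [y[i] for i in train])
        preds = [b0 + b1 * x[i] for i in test]
        scores.append(rmse([y[i] for i in test], preds))
    return sum(scores) / len(scores)

def candidate_lags(min_lag, max_lag=52):
    """Lags (in weeks) still allowed under a given minimum-lag-threshold."""
    return list(range(min_lag, max_lag + 1))

# Toy data (made up): the response is an exact linear function of the covariate.
years = [2001, 2001, 2002, 2002, 2003, 2003]
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
cv_rmse = leave_one_year_out_rmse(years, x, y)
```

In a minimum-lag-threshold study, this evaluation would be repeated once per threshold, each time restricting the candidate covariates to the lags returned by `candidate_lags`.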

To further visualize model performance at increasingly large minimum-lag-thresholds, we consider predictions for the 2015 CO event in the MSEA region. Figure 9 shows predictions from the models corresponding to the minimum-lag-thresholds from Figure 8. [...] These results indicate that our models can be useful for predicting the structure [...]

[...] Finally, we show that including multiple lags of the DMI is important for explaining CO variability in MSEA.

We also perform a resampling-based sensitivity analysis to quantify the robustness of the model fit to all of the data. We find that the models forced to retain the covariates from the model fit to all of the data perform as well as or better than the models allowed to completely change based on the training set. This provides justification for using the models from Figure 4 as the representative models for the MSEA region. Additionally, we determine which covariates are most likely to remain in the model when trained on slightly different data, finding that the terms in the most parsimonious model from Figure 4 are also the most robust. This justifies assigning scientific weight to the selection of these terms, as it suggests that they are capturing a physically-based relationship and are not simply artifacts of the specific training set used.

We show that our model for the MSEA region can explain around 70% of the variability in the weekly CO anomalies solely using climate indices as predictor variables. [...]

Finally, we perform a minimum-lag-threshold study to assess the predictive capabilities of our models at longer lead times. We find that models for the MSEA region are still able to explain around 65% of the weekly atmospheric CO variability when forced to only use lags greater than 35 weeks. This indicates that predictions can be made relatively far in advance without losing the overall structure and general amplitude of the CO anomalies. If these models are to provide advance warning of fire season intensity, then longer lead times are beneficial because they extend the time available to prepare.
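The idea behind the resampling-based sensitivity analysis summarized above (refit on slightly different data and record which covariates remain) can be illustrated with a small selection-frequency sketch. The resampling scheme and the selection procedure are not specified in this excerpt, so both are assumptions here: subsets are formed by leaving out one year at a time, and a simple correlation threshold stands in for the regularized fit. The names `select_terms` and `selection_frequency` and the toy data are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    syy = sum((w - my) ** 2 for w in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((v - mx) * (w - my) for v, w in zip(x, y))
    return sxy / (sxx * syy) ** 0.5

def select_terms(covariates, response, threshold=0.5):
    """Stand-in selector (NOT a regularized fit): keep terms with |corr| >= threshold."""
    return {name for name, column in covariates.items()
            if abs(pearson(column, response)) >= threshold}

def selection_frequency(years, covariates, response, selector):
    """Fraction of leave-one-year-out subsets in which each covariate is selected."""
    counts = {name: 0 for name in covariates}
    folds = sorted(set(years))
    for held_out in folds:
        keep = [i for i, yr in enumerate(years) if yr != held_out]
        sub_cov = {name: [column[i] for i in keep] for name, column in covariates.items()}
        sub_resp = [response[i] for i in keep]
        for name in selector(sub_cov, sub_resp):
            counts[name] += 1
    return {name: count / len(folds) for name, count in counts.items()}

# Toy data (made up): "signal" tracks the response, "noise" alternates in sign.
years = [1, 1, 2, 2, 3, 3]
response = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
covariates = {
    "signal": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
freq = selection_frequency(years, covariates, response, select_terms)
```

Terms with a selection frequency near one are the ones that persist across training subsets, which is the sense in which a covariate is called robust in the discussion above.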
The NCAR MOPITT project is supported by the National Aeronautics and Space Administration (NASA) Earth Observing System (EOS) Program. The MOPITT team also acknowledges support from the Canadian Space Agency (CSA), the Natural Sciences and