Impact of the 2015 wildfires on Malaysian air quality and exposure: a comparative study of observed and modeled data

In September and October 2015, Equatorial Asia experienced the most intense biomass burning episodes over the past two decades. These events, mostly enhanced by the extremely dry weather associated with the occurrence of strong El Niño conditions, resulted in the transnational transport of hazardous pollutants from the originating sources in Indonesian Borneo and Sumatra to the highly populated Malaysian Peninsula. Quantifying the population exposure form this event is a major challenge, and only two model-based studies have been performed to date, with limited evaluation against measurements. This manuscript presents a new data set of 49 monitoring stations across Peninsular Malaysia and Malaysian Borneo active during the 2015 haze event, and performs the first comparative study of PM10 (particulate matter with diameter < 10 µm) and carbon monoxide (CO) against the output of a state-of-the-art regional model (WRF-Chem). WRF-Chem presents high skills in describing the spatio-temporal patterns of both PM10 and CO and thus was applied to estimate the impact of the 2015 wildfires on population exposure. This study showed that more than 60% of the population living in the highly populated region of the Greater Klang Valley was systematically exposed to unhealthy/hazardous air quality conditions associated with the increased pollutant concentrations from wildfires and that almost 40% of the Malaysian population was on average exposed to PM10 concentrations higher than 100 µg m−3 during September and October 2015.


Introduction
Equatorial Asia is a region of widespread biomass burning from multiple sources. These include peat burning, agricultural fires, centralized waste disposal and open fugitive waste burning (Nara et al 2011). Among these sources, wildfires associated with forest and peat burning resulting from agricultural practices are some of the most significant (Carlson et al 2012, Margono et al 2014 and have a prominent seasonality across the region (van der Werf et al 2010). Emissions from biomass burning are typically loaded with aerosols and carbon monoxide (CO) (van der Werf et al 2010, Wiedinmyer et al 2011) and significantly vary in spatial extent and duration from year-to-year. These annual variations are in part dependent on Figure 1. Locations of PM 10 and CO monitoring stations (red dots) used in this study and cities with more than 1 million people (black squares). In blue are six representative sites (stations number 2, 4, 7, 28, 44 and 49) which are analyzed in detail in this work. The Greater Klang Valley area (GKV), defined here as being the states of Klang, Petaling, Hulu Langat, Gombak, Sepang and Selangor combined with the federal states of Kuala Lumpur and Putrajaya, is also highlighted with a darker blue shading in Peninsular Malaysia. agricultural activities, variation in wildfire distributions and wider seasonal weather pattern changes (Field et al 2009, Marlier et al 2013, Siegert et al 2001, van der Werf et al 2008. In this work, we focus on the fires in the autumn 2015, when the region experienced a severe draught associated with the intense El-Niño conditions and subsequent widespread and uncontrolled wildfires (Tacconi 2016). Over this period, the burned land in the area was unusually large compared to previous years: approximately three times larger than in 2013 and 1.2 larger than in 2014 (figure S1).
Wildfire emissions can be transported for hundreds to thousands of kilometers from the originating fires and may thus degrade air quality conditions over highly densely populated areas far away from the burning regions (See et al 2006). In this work we focus on Malaysia, a very densely populated region with 26.4 million people (Bright et al 2014) living in an area of approximately 329 000 km 2 . The combined population of Kuala Lumpur and Selangor region is reported to be almost 7 million in 2013 (approximately 25% of the total Malaysian population) and the population of the Greater Klang Valley (GKV, figure 1) is expected to reach 10 million by 2020 (PEMANDU 2015). Malaysia experiences recurring haze events annually, most notably during the dry season when emissions from forest and agricultural fires in Malaysia and Indonesian Sumatra are transported by south-easterly winds to densely populated urban areas such as Kuala Lumpur and the surrounding GKV (Afroz et al 2003, Azmi et al 2010, Dominick et al 2012. Particulate air pollution is widely acknowledged as one of the most significant global health risks (Lelieveld et al 2015), estimated to cause approximately seven million deaths annually (WHO 2012). In health terms, the poor air quality associated with local sources as well as with haze events increases occurrences of asthmatic episodes and has adverse impacts on cardiovascular and respiratory diseases in Equatorial Asia (Emmanuel 2000, Frankenberg et al 2005, Sastry 2002. It is therefore of high priority to quantify the extent of the exposure from major events, such as the one in autumn 2015. Estimating the health impacts from population exposure to harmful air pollutants is challenging because it requires both an accurate description of spatio-temporal patterns and trends of air pollutants and robust relationships linking air quality conditions to human morbidity and mortality. Most literature studies have assessed trends of population exposure over multiple decades (Brauer et al 2016, Duncan et al 2014 based on air pollutants satellite observations. This approach is particularly effective in regions where no model simulations or ground-based observations are available, although it strongly depends on the satellite spatio-temporal coverage, data availability and uncertainties related to the weak spectral signatures of most pollutants (Duncan et al 2014). Satellite observations are thus often integrated with output from Earth System Models, including descriptions of atmospheric chemistry as well as aerosol dynamics, thereby allowing more accurate inference of near-surface air pollution (Lelieveld et al 2015). Recent studies (e.g. the NERC Coordinated Airborne Studies in the Tropics (CAST) campaign and the NASA Airborne Tropical TRopopause EXperiment (ATTREX)) have also emphasized the need for targeted campaigns as well as regional scale modeling to accurately capture air quality patterns and atmospheric chemical composition in tropical regions. Integrated approaches, including both direct observations and state-of-the-art model simulations, are thus required to understand this heterogeneous, dynamically variable (and sensitive receptor rich) environment.
To date, two studies have estimated the total population exposure to degraded air quality conditions and the enhanced premature mortality as a result of the 2015 wildfires (Crippa et al 2016, Koplitz et al 2016. In both studies the authors concluded that those wildfires impacted a large fraction of the total population living in Equatorial Asia, although both their exposure and mortality estimates are highly uncertain. This is mostly due to the very limited observations publicly available in the area during the haze episode which prevented a thorough evaluation of those model output. To overcome these limitations, in this work we present a new data set consisting of hourly observations of PM 10 and CO concentrations measured at 49 monitoring stations across Peninsular Malaysia and Malaysian Borneo during the 2015 wildfires. These hourly observations are used here for three main purposes: (i) to evaluate simulations of the Weather Research and Forecasting model with Chemistry (WRF-Chem) applied at high spatial and temporal resolution; (ii) to attribute sources of model bias and thus to provide insights for improving model setup, parameterizations and input data (e.g. emissions); (iii) to quantify the increment in pollutant concentrations directly attributable to the elevated burning activity in autumn 2015 and the resulting additional 'burning' public exposure in Malaysia. This study emphasizes the major role played by transnational transport of pollutants from the originating burnings in affecting air quality and population exposure across Malaysia and thus the need of a coordinated transnational effort to plan for mitigation strategies aimed at reducing wildfires impacts. Further, results from this work provide valuable information in identifying the areas mostly affected by the wildfires and thus can be used by policy makers to control fire occurrence and better plan for land-use changes.

Study Domain and Data Description
The monitoring stations used for this study are operated on behalf of the Malaysian Department of Environment (DOE) by Alam Sekitar Sendirian Berhad as the Malaysian Continuous Air Quality Monitoring network. Stations in Peninsular and Bornean Malaysia were selected based on the availability of PM 10 and CO data for the selected burning and post burning months (September 1st to November 30th 2015). During this period, 49 stations measured PM 10 (12 of these in Bornean Malaysia) with 40 of these measuring both PM 10 and CO (again with 12 of these in Bornean Malaysia, see figure 1) at hourly resolution. The stations are classified as residential, industrial, traffic, background and PM 10 . The majority (more than 50%) are identified as residential and are mainly located at or near roadside sites. The instruments used for monitoring PM 10 and CO across this network are standardized to follow the equivalence guidelines set down for monitoring by the US Environmental Protection Agency. For PM 10 this was achieved either using beta attenuation or tapered element oscillating microbalance instruments. For CO, gas filter correlation instruments are used across the network. Simulated spatial patterns are evaluated by comparing with Aerosol Optical Depth (AOD) and CO satellite observations. AOD is an indicator of the total aerosol load in an atmospheric column whereas CO is one of the main component of biomass burning smoke, both related to surface PM concentrations associated with wildfires (see details in the supplementary materials available at stacks.iop.org/ERL/13/044023/mmedia).

Model Description
The Weather Research and Forecasting model with Chemistry (WRF-Chem, Version 3.5, (Grell et al 2005)) was used over the study domain (93 E-137 E; 11S-15 N) with a 10 km resolution in the horizontal and 51 eta levels in the vertical up to 10 hPa (irregular levels to better represent the boundary layer). WRF-Chem is applied at high spatial and temporal resolution as it was found to add value to the representation of spatio-temporal patterns of aerosol, meteorological and chemical properties compared to analogous coarser runs (Crippa et al 2017). The simulations cover the period of September 1st to November 30th 2015 at hourly resolution, but were aggregated at daily level for some of the analyses presented in this work. Details on model setup, including emission data, boundary conditions and parameterizations adopted are described in the supplementary materials.

Population Exposure Methods
The Malaysian DOE adopted a revised Air Pollution Index (API) definition in 1996 modeled on the US Pollutant Standards Index (PSI) system to assess air quality conditions. The DOE API system translates measured concentrations of O 3 , CO, NO 2 , SO 2 and PM 10 into API bands (0-500) and descriptors ('Good' to 'Hazardous', table S3). API levels below 100 are considered 'safe levels' and above 100 generally unhealthy. Individual index values are calculated for each pollutant and the maximum value is then taken forward as the final (single) API value (DOE 2000). Details on API calculation are provided in the supplementary materials. Given the less stringent thresholds used as breakpoints in the API calculation currently adopted in Malaysia compared to those of developed countries (i.e. United States, Canada, Europe and United Kingdom), the National University of Malaysia is currently reviewing the existing API definition for the Malaysian DOE.
For a comparison with model simulations, observations from the monitoring network are interpolated on a regular grid identical to that of WRF-Chem via spatial interpolation (kriging, (Cressie 1993)), with higher weights to closer observations, according to an estimated spatial dependence structure (see section 3.1). In order to compute population exposure to unhealthy air quality conditions, we merged the API calculations with population data from the 2013 Landscan High Resolution Global Population Dataset (Bright et al 2014). This data product provides ambient population density data globally gridded at 30 arc seconds corresponding to approximately 1 km by 1 km at the equator. This was mapped to the WRF-Chem grid used in this study (0.1 latitude/longitude degree grid squares) and population counts were calculated by multiplying the regridded density data by the grid cell surface area.

Results and discussion
3.1. Data-model comparison WRF-Chem simulations were run for the three month period of this study and CO and PM 10 hourly output and daily averages were compared with the corresponding observational data from the monitoring sites in Peninsular and Bornean Malaysia. Figure 2 shows an example of comparison between WRF-Chem output (PM 10 and CO concentrations) and observed data, for two different sites (coded as 2 and 28 in figure 1 and  figure S4).
From these plots it is clear that the WRF-Chem run with fire emissions matches the observed data significantly better than the run without fires. Indeed the product moment correlation coefficient (R) increases from 0.20-0.76 at site 2 (for PM 10 ) and from 0.16-0.73 at site 28 (for CO), and especially in September/October the model is able to capture the increase in both pollutants markedly better when fire emissions are included. The overall degree of improvement in R for both PM 10 and CO for all sites is shown in figure S5. For PM 10 , the median correlation value for the distributions with no fire emissions included is −0.04 (0.03 for CO) and with fires included is 0.59 (0.56 for CO). For both pollutants, the results strongly suggest that the run with fires matches the observations more closely, and therefore subsequent analyses and comparisons will be performed considering only this simulation. Figure S6 shows the correlation coefficient (R) for observed and simulated CO (figure S6 (a)-(b)) and PM 10 (figure S6 (c)-(d)) at all the sites. The statistical behavior in space for the two islands, as apparent from the variograms (not shown), is markedly different, and separate analyses were performed for the different islands. The interpolated correlation surface was obtained using kriging with an exponential variogram with nugget (Cressie 1993). Correlations in Bornean Malaysia are markedly more constant (figure S6(b) and (d)), especially for CO ( figure S6(b)), thus the interpolated surface looks almost flat. In Peninsular Malaysia (figure S6(a) and (c)), there is good agreement in the simulated CO and PM 10 , especially in the northeast region. The correlation coefficient degrades markedly in the south/south-west, including the GKV and Singapore (refer to figure 1 for their location), with R values as low as 0.4, thus indicating lower agreement between WRF-Chem and observations. This may suggest that local emissions, possibly from anthropogenic sources neglected or underestimated by the model, may also significantly impact local air quality conditions. The model exhibits different skills in capturing the magnitude of CO and PM 10 concentrations, consistently with the correlation analysis results. During fire months CO concentrations are underestimated by the model with larger biases along the southern and western coastlines of Peninsular Malaysia which are more populated and urbanized than Bornean Malaysia and inland of Peninsular Malaysia where smaller biases in CO occur (figure S7). On the converse, most sites indicate a slightly positive/negative Normalized Mean Bias Factor (NMBF, see supplementary materials for its definition) in simulated PM 10 concentrations ( figure  S7).
To investigate reasons of this spatial variability we compared hourly data at four selected sites located in either a predominantly urban (site 4 and 47) or rural (site 7 and 49) emission context (figure 1). The magnitude of the bias in PM 10 during fire months is comparable at rural and urban sites (i.e. similar Normalized Mean Absolute Error Factor, NMAEF, defined in the supplementary materials), although its sign is variable (i.e. positive and negative NMBF), as also indicated by the comparison of hourly data (figures S8 and S9 and table S2). Conversely, the analyzed rural sites show a transition from positive (during fire months) to highly negative NMBF in the absence of fires (table  S2). This indicates that during non-fire months the model is not able to accurately capture the background PM 10 concentrations, which may comprise a significant secondary component (Kim Oanh et al 2006, Tahir et al 2013 in Equatorial Asia, as also revealed by the systematic negative bias in the hourly simulated data (figures S8 and S9). The adopted aerosol scheme within WRF-Chem indeed includes only a simplified treatment of secondary aerosol (i.e. without secondary organic aerosols), due to the computational cost of running high resolution simulations, and may thus be responsible for part of the model bias. At the urban sites the bias is smaller in November, possibly as a result of the major role played by local primary PM 10 emissions in dictating its atmospheric concentrations.
We further investigate possible sources of model bias by analyzing the pollutants' diurnal cycle. Being a primary pollutant, CO atmospheric concentrations are mainly dictated by direct emissions and transport phenomena. The bias in simulated CO is larger at urban than at rural locations and relatively unchanged in magnitude over time (table S2). This suggests that some anthropogenic CO emissions are missing in WRF-Chem and/or that urban scale pollution events are not fully captured by a 10 km resolution model run. Further evidence on the origin of the bias can be drawn by comparing the diurnal evolution of simulated and observed CO and PM 10 concentrations during fire and non-fire activity. Observations at the urban sites indicate a clear diurnal cycle of CO (particularly during non-fire periods) with concentrations sharply increasing generally during 6-8 AM (figures S8), as a result of traffic/rush hour emissions. At rural sites this behavior is absent (or less marked as at site 49, figure S9). This result thus strongly suggests that some primary emission sources of CO are not accounted for in the EDGAR-HTAP emission inventory (Janssens-Maenhout et al 2015). In conclusion, the analysis of hourly observed and simulated data indicates that the model may not always accurately capture variations occurring at fine spatial and temporal scales, such as urban scale pollution, partly because of the inaccurate anthropogenic emissions. Detailed ground-based observations may thus help to interpret model biases and may be possibly used to verify and improve model input (e.g. by evaluating emission inventories) as well as to identify the optimal model setup (e.g. by identifying the key pollution-related physical-chemical processes that the adopted model parameterizations should be able to account for).
We also evaluated model skills in reproducing AOD and CO spatial patterns based on MODIS and MOPITT data respectively, averaged during the months of September and October, which are dominated by the wildfires (figure S10). No sharp gradients are present between simulated and observed AOD and CO, thus indicating the model is not affected by a systematic bias associated with long-range transport (figure S10(c) and (f)). Further, WRF-Chem underestimates AOD from MODIS by a factor of 1.3 (Normalized Mean Bias Factor, NMBF = −0.34 and NMAEF = 0.85) for AOD whereas it shows high agreement with CO observations from MOPITT (NMBF = −0.09 and NMAEF = 0.17) during September and October.
To further assess the model ability to capture the observational records, we compared the average PM 10 for September and October derived from the model output with the interpolated values from the monitoring stations using a kriging with exponential variogram with nugget. A very good agreement is found in the Malaysian Peninsula, where WRF-Chem is able to capture the high concentrations in the GKV and in Singapore, and the lower values in the north-east regions of the peninsula (figure 3). The interpolated values from the monitoring stations are overall smoother than the simulations, especially inland, where the lack of coverage inevitably leads the estimation to be less reliable. In Bornean Malaysia, both observations ( figure 3(b)) and WRF-Chem (figure 3(d)) do not indicate high PM 10 concentrations, especially near the northern coast, whereas the model simulates higher concentrations in the south-west (figure 3(d)) as directly accounting for wildfire emissions from Indonesian Borneo. It is possible to interpolate the pollutant concentrations in Indonesian Borneo in principle, but since no monitoring stations are available in that region, the reconstructed field would depend on the stations in Malaysia, which would not be as severely affected by the fires, and hence would show considerably lower values. This is apparent from large discrepancy in figure 3(b) compared to figure 3(d) at the south-west border with Indonesia. WRF-Chem simulations are instead able to

Exposure and health impacts
Population exposure to the 2015 pollution episode was quantified by computing the total population exposed to different levels of PM 10 in September and October. In figure 4, an exposure comparison in both the Malaysian Peninsula and Borneo is shown, with threshold values chosen at 70 100 and 130 g m −3 , all above the daily WHO limit of 50 g m −3 (which would result in the entire Malaysian region being covered). The exposure maps derived from observations (figures 4 (a)-(b)) appear to be smoother than the ones from WRF-Chem (figure 4 (c)-(d)), as already observed in the pollution maps in figure 3. However, both maps similarly highlight the GKV as the area mostly affected by pollution, with similar exposure estimates. Based on WRF-Chem simulated concentrations, in Peninsular Malaysia we estimated that 21 million and 12 million people were exposed to mean PM 10 concentrations higher than 70 and 100 g m −3 , respectively (figure 4(c)). The estimates based on the observations are instead higher for the 100 g m −3 threshold (i.e. 16 million people), as a result of the low station density in the southerly highly populated areas. In Bornean Malaysia, the comparison between population exposure based on interpolated observations and WRF-Chem output shows much closer total estimates for all three thresholds.
A complete exposure study based only on observational data is not possible in the context of this work, as PM 10 and CO are insufficient for the calculation of pollution indices such as the API, which identifies the relative risk from five criteria pollutants (i.e. CO, O 3 , NO 2 , SO 2 , PM 10 , see table S3). It is however possible to compute the API for each day using WRF-Chem simulated pollutant concentrations and estimate the number of people persistently exposed to unhealthy conditions. During the 2015 wildfires, the simulated unhealthy air quality conditions appear to be always dictated by the high PM 10 concentrations. Given the agreement between modeled and observed pollutant concentrations indicated by our model evaluation, population exposure to degraded air quality can be also inferred using spatially interpolated PM 10 observations ( figure 4 and figure S11). We estimated that 4.5 million people, mostly concentrated in the GKV, were exposed to unhealthy air quality conditions on one day in two during September and October ( figure 4). This corresponds to ∼64% of the total population living in the Greater Klang Valley area. The unhealthy air quality conditions were almost exclusively caused by the increased PM 10 concentrations due to wildfires. Indeed, the normal population exposure to unhealthy conditions would be only ∼765 000 people as derived from API computed in the run without fire emissions. The currently adopted API definition is considerably more conservative than the Pollutant Standards Index formulation adopted by the Singapore National Environment Agency (NEA 2016), whose application would lead to almost 18.5 million people (70% of the Malaysian population) exposed on one day in two to unhealthy air quality conditions (Crippa et al 2016). Further, when comparing with the WHO guidelines for 24 h r PM 10 concentrations of 50 g m −3 , the entire Malaysian domain exceeds that threshold on at least one day in two during September and October, thus exposing more than 26 million people to PM 10 concentrations above the recommended threshold by WHO. This also indicates the less restrictive pollution control measures/levels implemented on the regional scale.

Conclusions
This study investigates the impact of the 2015 haze episode caused by the Indonesian wildfires on air quality and population exposure in Malaysia, by comparing a new observational data set including 49 monitoring sites with two high-resolution regional model simulations. Major findings of this work include: -The degraded air quality conditions experienced in autumn 2015 in Malaysia were primarily caused by transnational pollution transport from wildfires occurring in Indonesia and other biomass burning, as indicated by the higher agreement of simulated pollutant concentrations with observations when the model is run including fire emissions. However further research is required for the same time period over different years to generalize these findings.
-WRF-Chem, run at high spatial and temporal resolution, presents high skills in reproducing pollutant concentrations observed at multiple sites in Malaysia (i.e. R > 0.6 at 45% and 39% of the stations for PM 10 and CO, respectively) and therefore can be used to make robust conclusions on regional air quality and population exposure.
-Analyses of simulated PM 10 concentrations indicate that 10 million people were exposed to mean concentrations double the daily WHO limit during September and October 2015.
-Based on WRF-Chem simulations, 64% of the population living in the Greater Klang Valley was systematically exposed to unhealthy air quality conditions.
Results from this study provide crucial evidence that air quality in Malaysia is significantly linked to the emission context of the whole Equatorial Asia and that a coordinated effort between emitting and exposed countries is essential to improve air quality and to reduce population exposure (Tacconi 2016). This work has also shown that the application of a state-of-the-art regional model at high resolution integrated with a ground-based observational data can provide unique and robust information on the spatio-temporal patterns of pollution in Equatorial Asia, which will help policy makers to identify the most vulnerable areas and thus plan for coordinated mitigation strategies.