Smoke and COVID-19 case fatality ratios during California wildfires

Recent evidence has shown an association between wildfire smoke and COVID-19 cases and deaths. The San Francisco Bay Area, in California (USA), experienced two major concurrent public health threats in 2020: the COVID-19 pandemic and dense smoke emitted by wildfires. This provides an unprecedented context to unravel the role of acute air pollution exposure on COVID-19 severity. A smoke product provided by the National Oceanic and Atmospheric Administration Hazard Mapping System was used to identify counties exposed to heavy smoke in summer and fall of 2020. Daily COVID-19 cases and deaths for the United States were downloaded at the county level from the CDC COVID Data Tracker. Synthetic control methods were used to estimate the causal effect of the wildfire smoke on daily COVID-19 case fatality ratios (CFRs), adjusting for population mobility. Evidence of an impact of wildfire smoke on COVID-19 CFRs was observed, with precise estimates in Alameda and San Francisco. Up to 58 (95% CI: 29, 87) additional deaths for every 1000 COVID-19 incident daily cases attributable to wildfire smoke were estimated in Alameda in early September. Findings indicated that extreme weather events such as wildfire smoke can drive increased vulnerability to infectious diseases, highlighting the need to further study these colliding crises. Understanding the environmental drivers of COVID-19 mortality can help protect vulnerable populations from these potentially concomitant public health threats.


Introduction
Wildfires have increased in duration, intensity, and severity under climate change (Williams et al 2019, Goss et al 2020), with record-breaking events occurring annually for the past five years (CAL FIRE 2020), driving high exposure to air pollution across wide geographic areas (EPA 2020). Fine particulate matter (PM2.5), particles under 2.5 µm in diameter composed of metals, ions, organic compounds, and other materials, is one of the main components of wildfire smoke. Fine particulate matter from wildfires accounts for 50% of total primary PM2.5 emissions in California, and this share is increasing under climate change (Valavanidis et al 2008, Ford et al 2018). PM2.5 is associated with a range of adverse health effects: exposure to fine particulate matter causes inflammation and oxidative stress, exacerbating existing conditions and increasing the risk of adverse health outcomes (Valavanidis et al 2008, Yang et al 2019). Moreover, wildfire smoke is several fold more harmful to health than similar levels of PM2.5 from other sources (Aguilera et al 2021a, 2021b). Wildfire-specific particulate matter is strongly associated with respiratory health impacts, attenuating the pulmonary immune system and triggering asthma and chronic obstructive pulmonary disease (COPD) symptoms (Reid et al 2016).
The 2020 wildfire season was the most active fire year on record along the West Coast and produced the worst air quality in decades, concurrently with the public health threat of the COVID-19 pandemic (Hernández Ayala et al 2021). Approximately 2 million cases and over 25 000 deaths from COVID-19 occurred in California in 2020 (CDC 2020). Several studies have indicated an association between ambient particulate matter exposure and COVID-19 transmission and mortality (Cole et al 2020, Liang et al 2020, Wu et al 2020, Bourdrel et al 2021). Inhalation of air pollution can produce inflammation and expression of the alveolar angiotensin-converting enzyme 2 (ACE2), an important receptor for the host defense against SARS-CoV-2, the virus that causes COVID-19 disease (Katoto et al 2021). Wildfire-specific PM2.5 has also been hypothesized as a driver of severe COVID-19, including death (Henderson 2020). Recent studies have also considered the association between particulate matter and COVID-19 incidence and mortality during the wildfire season and have found an effect on COVID-19 test positivity rates (Kiser et al 2021), cases and deaths (Meo et al 2021, Zhou et al 2021). However, none have considered case fatality ratios (CFRs), the proportion of deaths among all people diagnosed with the disease, as the outcome of interest. In contrast to previous measures, the CFR is an indicator of disease severity among detected cases and can reveal evidence of the role of wildfire smoke in exacerbating COVID-19 symptoms and disease progression.
There are strong mechanistic postulations for why wildfire smoke would increase COVID-19 severity. SARS-CoV-2, the virus that causes COVID-19, can cause a wide variety of symptoms ranging from loss of taste and smell to chest tightness and fever (Struyf et al 2020). In severe infections, the virus can cause alveolar damage in the lung and interstitial pneumonia (Aguiar et al 2020). A systematic review on predictive symptoms found that dyspnea was predictive of severe infection, while COPD, cardiovascular disease and hypertension symptoms were predictors of admission to intensive care units (Jain and Yuan 2020). Wildfire smoke has been shown to also intensify risk for these health concerns, particularly increasing risk of dyspnea (Black et al 2017) and COPD (Reid et al 2019). As both COVID-19 symptoms and wildfire smoke exposure drive similar effects on the respiratory system, it is thus plausible that exposure to wildfire smoke could increase risk of severe COVID-19, driving increased severity of disease and mortality rates.
Wildfire smoke as an acute exposure, i.e. impacting human health in the short term, in the Western United States in the summer and fall of 2020 provides an unprecedented context to unravel the role of acute air pollution exposure on COVID-19 severity. CFRs can be used as a measure of disease severity, representing the proportion of people diagnosed with COVID-19 who die. By considering trends in CFRs at the county level, we can identify and weight control counties to match the trend in the outcome before any wildfire smoke occurred. As long as differences between counties are unchanged during the study period and a suitable weighted control is identified, any difference in the outcome after the exposure occurred can be attributed to wildfire smoke. We propose the use of this framework to understand the effect of wildfire smoke on COVID-19 CFRs.

Study area
As a densely populated area with over 1500 COVID-19 deaths from March to early November 2020 (CDC 2020) and with exposure to record high wildfire smoke during the height of the pandemic in late summer and early fall, the San Francisco Bay Area (SFBA) is an ideal region to study this question. The region had a record number of Spare the Air days in 2020, a system developed to alert residents to unhealthy air quality according to the Environmental Protection Agency Air Quality Index (US EPA 2021), with 30 consecutive alert days occurring from mid-August through September 2020 due to ongoing wildfires in the region (Spare the Air 2020). We focused the study on SFBA Counties (Alameda, Contra Costa, Marin, Napa, San Francisco, San Mateo, Santa Clara, Santa Cruz, Solano, Sonoma) to explore whether wildfire smoke exposure played a role in changing COVID-19 CFRs.

Overview of analytical method
We used synthetic control methods (SCMs) to understand the effect of wildfire smoke exposure in driving COVID-19 CFRs. The benefit of SCM is that it adjusts by design for unmeasured confounding that does not change over the study period, treating wildfire smoke as a natural experiment within a causal framework (Bouttell et al 2018). This not only allows the study of the potential role of wildfire smoke in exacerbating COVID-19 mortality, but also provides an opportunity to quantify the association between acute particulate matter exposure and severe COVID-19 within a causal analysis framework.
Synthetic control is a quasi-experimental method that capitalizes on a natural experiment, in this case the timing of a wildfire smoke event, to estimate the causal effect of the exposure on an outcome of interest, in this case the COVID-19 case-fatality ratio (CFR). A synthetic control is created by constructing (through weighting) a counterfactual whose level and trend most closely match those of the treated unit before the wildfire smoke event; the difference in outcome between the treated unit and the synthetic control after the event is attributable to the exposure (Bouttell et al 2018). Counties that were outside of California and not exposed to any wildfire smoke event were considered in the pool of possible control groups. A weighting procedure was then applied to build a synthetic control for each exposed county using an SCM. SCMs can be seen as an extension of difference-in-differences methods when multiple control groups are available. The main idea behind this method is to calculate a weighted average of the controls' outcomes and covariates that mimics the trend of the treated group of interest (i.e. the county exposed to wildfire smoke) before the treatment. This weighted average, which aims to minimize the difference from the observed values of the treated unit, is used to build the synthetic control group. Finally, the synthetic control outcome in the post-intervention period represents the counterfactual outcome of the treated group in the absence of the wildfire smoke.
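The weighting idea described above can be illustrated with a minimal sketch. It is not the authors' implementation (the study used the Stata 'synth' and 'synth_runner' commands, which also match on covariates such as mobility); it only matches the pre-event outcome, and the function name, the toy data, and the use of SciPy's SLSQP solver are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize


def synthetic_control_weights(treated_pre, controls_pre):
    """Find non-negative weights summing to 1 that best reproduce the
    treated unit's pre-event outcome from the control units' outcomes.

    treated_pre:  array of shape (T,)   pre-event outcome of the treated county
    controls_pre: array of shape (T, J) pre-event outcomes of J control counties
    """
    n_controls = controls_pre.shape[1]

    def pre_period_loss(w):
        # Sum of squared pre-event gaps between treated and weighted controls
        return float(np.sum((treated_pre - controls_pre @ w) ** 2))

    result = minimize(
        pre_period_loss,
        x0=np.full(n_controls, 1.0 / n_controls),  # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_controls,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
        options={"ftol": 1e-12, "maxiter": 500},
    )
    return result.x


# Hypothetical toy data: CFRs in deaths per 1000 cases, 4 control counties
rng = np.random.default_rng(0)
controls_pre = rng.uniform(10.0, 50.0, size=(30, 4))   # 30 pre-event days
controls_post = rng.uniform(10.0, 50.0, size=(10, 4))  # 10 post-event days
true_weights = np.array([0.6, 0.4, 0.0, 0.0])
treated_pre = controls_pre @ true_weights
treated_post = controls_post @ true_weights + 20.0     # simulated smoke effect

weights = synthetic_control_weights(treated_pre, controls_pre)
# Daily post-event gap between the treated county and its synthetic control
gap = treated_post - controls_post @ weights
```

In this toy setting the daily gap after the event recovers the simulated effect, which is the quantity interpreted as attributable to the exposure when the assumptions of the method hold.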
There are three key assumptions of this methodology: (a) the outcome of interest in the treated unit and the synthetic control must have a similar fit before the wildfire event occurred, otherwise known as the parallel trends assumption; this was verified visually and empirically by estimating the weekly difference between each County and its synthetic control before the smoke event; (b) the exposure has no effects in the potential control units; and (c) the only change that occurred in the treated and untreated groups is the natural experiment of interest, otherwise known as the common shock assumption (Bouttell et al 2018). We only considered counties that were not on the West Coast (and not exposed to wildfire smoke) as eligible control groups to avoid any spillover into these control groups. A violation of the common shock assumption in relation to COVID-19 policies is unlikely because, to our knowledge, no specific policies coincided with the wildfire event; we also conducted several sensitivity analyses (see below) in this regard. When these assumptions are met, the difference between the outcomes in the treated unit and the synthetic control units after the event can be used to infer causal effects. By weighting control counties to have the most similar trend in the outcome before the wildfire smoke exposure occurred, any socio-demographic, exposure, or policy-related difference between counties that is unchanged during the period of study is accounted for by design through the selection of controls.

Data sources
The National Oceanic and Atmospheric Administration (NOAA) provides a Hazard Mapping System (HMS) Fire and Smoke product (NOAA 2020), which was used to identify smoke plumes at the daily level. In our main analyses, a county was considered exposed on a given day if 70% or more of the county's area was covered by heavy smoke based on the HMS smoke density product, so that the majority of the population living in the county could be presumed to be exposed to smoke. Daily COVID-19 cases and deaths for the United States were downloaded at the county level from the CDC COVID Data Tracker (CDC 2020). The daily cases and deaths were smoothed using a 14 day moving average to reduce daily reporting anomalies due to the weekly cycle of reporting and to distinguish a trend.
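As a minimal sketch of the exposure definition and smoothing step described above, the following assumes a daily series of the share of a county's area under heavy smoke and a daily count series. The function names and toy values are hypothetical, and a trailing 14 day window is an assumption, since the text does not state whether the moving average was trailing or centered.

```python
import pandas as pd


def first_exposure_day(heavy_smoke_share, threshold=0.70):
    """Return the first day on which the share of the county's area covered
    by heavy smoke reaches the threshold, or None if it never does."""
    exposed = heavy_smoke_share >= threshold
    return exposed.idxmax() if exposed.any() else None


def smooth_counts(daily_counts, window=14):
    """Trailing moving average; days without a full window are left missing."""
    return daily_counts.rolling(window=window, min_periods=window).mean()


# Hypothetical daily heavy-smoke coverage for one county
days = pd.date_range("2020-08-17", periods=6, freq="D")
share = pd.Series([0.0, 0.2, 0.75, 0.9, 0.4, 0.8], index=days)
event_day = first_exposure_day(share)

# Hypothetical daily case counts 1, 2, ..., 28
cases = pd.Series(range(1, 29), dtype=float)
smoothed = smooth_counts(cases)
```

Raising the threshold to 0.90, or applying the same function to a medium-smoke coverage series, would correspond to the alternative exposure definitions explored in the sensitivity analyses.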
Daily mobility information at the county level was downloaded from the U.S. Department of Transportation Bureau of Transportation Statistics (Bureau of Transportation Statistics 2020). These data are estimated by the Maryland Transportation Institute and Center for Advanced Transportation Technology Laboratory at the University of Maryland using mobile device data. They include data from mobile devices that meet quality standards including temporal frequency and spatial accuracy of anonymized location point observations, temporal coverage and representativeness for the device, and spatial representativeness of the sample. A multi-level weighting method was applied to expand the study sample to county-level estimates. In this study, a variable estimating the number of people not staying at home each day at the county level was extracted from this dataset for analysis.

Statistical analysis
All SFBA counties with more than 100 cases in the study period were considered as eligible Counties for the synthetic control analysis. Daily CFRs were computed at the County level and considered as the outcome of interest. CFRs were preferred to other mortality measures for answering the research question of interest since they account for the epidemic curve by computing the ratio of persons with COVID-19 who died to all persons diagnosed. To account for the potential lag from diagnosis to severe disease, the number of deaths on a given day was divided by the average number of cases diagnosed 7 d prior; this produced a daily measure of the case-fatality ratio for each county.
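The lagged CFR described above can be sketched as follows. This is one reading of the text, in which smoothed deaths on day t are divided by smoothed cases on day t - 7; the function name and the toy series are hypothetical, and days with zero lagged cases are left missing rather than producing a division by zero.

```python
import pandas as pd


def lagged_cfr(smoothed_deaths, smoothed_cases, lag_days=7):
    """Daily CFR: deaths on day t divided by cases diagnosed lag_days earlier."""
    lagged_cases = smoothed_cases.shift(lag_days)
    # Mask zero denominators so the ratio is missing instead of infinite
    return smoothed_deaths / lagged_cases.where(lagged_cases > 0)


# Hypothetical smoothed series: 200 cases and 4 deaths per day for 10 days
cases = pd.Series([200.0] * 10)
deaths = pd.Series([4.0] * 10)

cfr = lagged_cfr(deaths, cases)   # proportion of diagnosed cases that died
cfr_per_1000 = 1000.0 * cfr       # deaths per 1000 cases, as reported in the results
```

Setting `lag_days=0` gives the same-day CFR used as an alternative outcome definition in the sensitivity analyses.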
For the main analyses, days for which 70% or more of a County's area was covered by heavy smoke based on satellite imagery were considered to be the natural experiment of interest, using all other counties in the United States that had less than 1% smoke exposure as potential controls (several sensitivity analyses were conducted, see supplementary files available online at stacks.iop.org/ERL/17/014054/mmedia). We allowed for 1% exposure since it is negligible and because dropping these Counties would result in a very small pool of potential control Counties. The 'synth' and 'synth_runner' commands in the statistical software package Stata were applied to identify control Counties that had a similar trend to each of the eligible Counties before the smoke exposure occurred. Synthetic control models were adjusted for daily mobility, used as a proxy for change in contact rates, using county-level data on the population not staying at home provided by the U.S. Bureau of Transportation Statistics (Bureau of Transportation Statistics 2020). The SCM computes 1-sided p-values by running placebo tests; each p-value represents the proportion of placebo effects that are as large as or larger than the main effect for each period following wildfire smoke exposure. 95% confidence intervals were constructed from the p-values of the synthetic control results, consistent with previous work using this method (Mitze et al 2020) and as proposed by Altman and Bland (2011); details are provided in supplementary text 1.
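The inference step can be sketched in two parts: a placebo-based p-value and a confidence interval back-calculated from it. The sketch below uses the approximation published by Altman and Bland (2011), z = -0.862 + sqrt(0.743 - 2.404 ln p), which was stated for two-sided p-values; how the authors adapted it to the 1-sided placebo p-values is described in their supplementary text and is not reproduced here. Function names and example numbers are hypothetical.

```python
import math


def placebo_p_value(treated_effect, placebo_effects):
    """One-sided p-value: share of placebo effects at least as large as
    the effect estimated for the treated unit."""
    n_as_large = sum(1 for effect in placebo_effects if effect >= treated_effect)
    return n_as_large / len(placebo_effects)


def ci_from_p_value(estimate, p_value, z_crit=1.96):
    """Approximate confidence interval from an estimate and its p-value,
    following the Altman and Bland (2011) back-calculation."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must lie strictly between 0 and 1")
    z = -0.862 + math.sqrt(0.743 - 2.404 * math.log(p_value))
    standard_error = abs(estimate / z)
    return estimate - z_crit * standard_error, estimate + z_crit * standard_error


# Hypothetical example: effect of 5 extra deaths per 1000 cases, 10 placebo runs
p = placebo_p_value(5.0, [1.0, 2.0, 6.0, 3.0, 0.5, 7.0, 4.0, 2.0, 1.0, 3.0])
lower, upper = ci_from_p_value(10.0, 0.05)
```

With a p-value of 0.05 the lower bound falls approximately at zero, which is a quick sanity check on the approximation. The back-calculation is undefined when no placebo effect is as large as the main effect (p = 0), so any such case would need separate handling.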

Sensitivity analyses
A series of analyses was conducted to assess the sensitivity of the findings, focusing on Alameda County. First, different exposure measures were considered: in one, a county was considered exposed to wildfire smoke if more than 90% of the County's area was covered by heavy smoke; in another, if 70% or more was covered by medium smoke. Second, we used a different method to measure COVID-19 CFRs by dividing the number of deaths from COVID-19 by the number of diagnosed cases on the same day. Lastly, an interactive fixed-effect counterfactual estimator was computed to compare results from a different methodology (Liu et al 2020). All datasets and code used for this project can be found online.

Results
SFBA Counties that had more than 100 COVID-19 deaths in our study period (June 1st to November 10th, 2020) were considered eligible and six Counties met the criteria: Alameda, Contra Costa, Marin, San Francisco, Santa Clara and Sonoma (table 1). The first day when 70% of a County's area was covered by heavy smoke ranged from 19th to 22nd August. By 22nd August, smoke exposure was ubiquitous across all eligible SFBA Counties (figure 1). A good synthetic control was reached for five Counties, as shown by a visual representation of the parallel trends assumption (figure 2). The parallel trends assumption was violated for Marin County, as the trend in the synthetic control units did not match that of the county before the wildfire smoke occurred (supplementary figure 1, table S1), and thus Marin was excluded from further analysis. We found evidence of an impact of wildfire smoke on COVID-19 CFR. The results varied by County, with the strongest effects observed in Alameda and San Francisco. A clear signal was identified in these Counties based on 95% confidence intervals (supplementary figure 2). Up to 58 (95% CI: 29, 87) additional deaths for every 1000 COVID-19 incident cases attributable to wildfire smoke were estimated in Alameda in September (figure 2, table S2). In Alameda County, a daily increase of 4-58 deaths per 1000 cases was observed for 35 d following smoke exposure. In San Francisco, the increase in CFRs was observed later, with 14-42 additional daily deaths per 1000 cases from 35 to 49 d following smoke exposure (table S2). Other Counties showed no effects of smoke on COVID-19 CFRs. The variability in effect observed between Counties may be related to Alameda and San Francisco being the most densely populated in the SFBA (table 1).
Although not all Counties in our analysis indicate a strong effect of wildfire smoke on COVID-19, there was a sharp increase in the CFR following wildfire smoke exposure in the two aforementioned Counties, indicating the importance of further exploring the compounding impacts of these colliding crises.
Sensitivity analyses showed that the results were robust to the different exposure metrics and case fatality measures used. Computing CFRs using the diagnosed cases 7 d prior and using same-day measures showed similar effects (figure 2 and supplementary figures 3 and 4). Results were also consistent when changing the event date according to different exposure metrics, such as 90% heavy smoke coverage and 70% medium smoke coverage (based on the thickness of smoke qualitatively described in the HMS product used in this study (NOAA 2020)), although somewhat smaller effects were observed with medium smoke exposure (supplementary figure 5). Similar effects were observed with the interactive fixed effect counterfactual estimator (supplementary figure 7). All selected control Counties and their weights for each of the SFBA eligible counties are described in the supplementary material (supplementary tables S3-8).

Discussion
By applying SCM to study wildfire smoke as a potential driver of COVID-19 CFR, we highlight the application of this methodology to an emerging and poorly understood topic. Although several studies have considered the role of air pollution in driving severe COVID-19, most associated an increase in particulate matter concentrations with COVID-19 cases and deaths. We capitalized on the California wildfires as a natural experiment, which allowed us to identify wildfire smoke as a driver of COVID-19 CFR within a counterfactual framework. The SCM produces a daily estimate of the number of deaths per person diagnosed with COVID-19 in each SFBA County that is attributable to wildfire smoke and that could have been prevented if the wildfire smoke had not occurred in that County.
We found varying results for different counties in the SFBA. A strong effect was observed in Alameda County, with CFRs rising by up to 300% following wildfire smoke exposure (figure 2). The greatest effect was observed 18-24 d following the start of the smoke exposure; for example, an increase of up to 58 (95% CI: 29, 87) deaths per 1000 cases was estimated on 8th September, 20 d after the smoke exposure began (table S7). San Francisco County also showed an increase in CFR, although the observed effect was delayed by several weeks. Up to 42.67 (95% CI: 4.07, 81.27) additional daily deaths per 1000 cases were observed 44 d following smoke exposure in San Francisco County. This delayed response to wildfire smoke may have been driven by local terrain and meteorological conditions, which may have altered smoke concentration and chemistry as the plumes travelled through the atmosphere.
For the remaining three counties (Contra Costa, Santa Clara and Sonoma), no effect was observed and there even appears to be a protective effect in Sonoma County, with the CFR increasing more in the weighted synthetic control than in Sonoma itself, although this effect was not precise at the 95% confidence level (figure S2). It is unclear why we observe these contrasts between neighboring Counties, but they may be related to differences in population density, variations in socio-demographics within each County and differences in behavioral change related to wildfire smoke. Nevertheless, the effects observed in Alameda and San Francisco indicate the need to further investigate the role of wildfire smoke in driving COVID-19 CFRs. Although the observed effects are not stable across Counties, this work serves as an exploratory analysis of these effects using a novel methodological approach. More work is needed to understand these varying results. Extreme weather events such as wildfires have been shown to directly impact population mobility (Hatchett et al 2021). Future research could, for example, disentangle the specific short-term and delayed role of wildfire smoke in affecting mobility patterns and its association with COVID-19 outcomes; this would be best addressed using individual-level data sources.
The implications of this work are two-fold: it can provide insight on risk factors of severe COVID-19 and inform hospital preparedness in the context of the ongoing pandemic, including in locations regularly exposed to wildfires where the levels of new COVID-19 infections are still high and vaccination rates are low. Such evidence can be particularly crucial to develop resilient communities and health care systems in the context of seasonal infectious diseases compounded by extreme weather events in a changing climate. Posing this question on the interactive effect of COVID-19 and wildfire smoke exposure can help in understanding the implications of these dual health risks on the human biological system.
Assessment of the role of wildfire smoke in severe COVID-19 outcomes can also be used to inform hospitals of increased risks during wildfire smoke exposure. Many local and regional hospitals have reached or even exceeded capacity in the context of the COVID-19 emergency (HHS 2021). While several models have been proposed to estimate demand for hospital beds (Capistrán et al 2020, Ferstad et al 2020), epidemic forecasting of the impacts of the COVID-19 pandemic has been largely unsuccessful (Ioannidis et al 2020). Understanding the potential role of environmental exposures in driving severe COVID-19 symptoms and disease progression can contribute to informing local hospital preparedness in the context of wildfires. Forecasting days with exposure to wildfire smoke could be used to estimate an increase in COVID-19 related hospitalizations and inform measures to prevent overwhelming of the healthcare system during a wildfire.
There are several limitations that should be mentioned to contextualize our study results. First, CFRs are only one measure of severe COVID-19 disease among many potentially relevant outcomes. Considering measures such as COVID-19 hospitalization rates may also be appropriate due to the potential acute role of wildfire smoke in driving increased hospital visits from COVID-19; we focused on CFRs to consider the most severe outcomes for our analysis. Second, there are inherent challenges to studying an epidemic that is underway; while SCMs overcome some of these, there are potential differences in COVID-19 response measures that may have coincided with the wildfire event at the local scale and that could challenge the common shock assumption; adjusting for daily County-level mobility should decrease this potential bias. Additionally, the wildfire smoke events may have changed people's behavior in seeking medical care due to recommendations for residents to stay in their homes, which in turn may affect the number of diagnosed COVID-19 cases; however, using lagged CFRs helped reduce this potential bias.
Moreover, HMS data, which provide smoke plume coverage based on satellite imagery, do not differentiate between smoke at ground level and smoke higher up in the troposphere; this could lead to exposure misclassification. Also, the product is based on visual classification of plumes in satellite imagery, and there may be errors in this classification as well as in the extrapolation to the county level. For example, we assume that when a given percentage of a County is covered by smoke based on satellite imagery, the population living there is exposed. However, the majority of California was exposed to a thick layer of wildfire smoke in August and early September 2020, and PM2.5 levels measured at ground-level monitoring stations reached unparalleled values; thus, this misclassification is likely inconsequential in this context. Also, sensitivity analyses were applied to consider different percentages of smoke coverage, and consistent results were found with these various measures.
In conclusion, this study provides evidence of wildfire smoke driving a higher COVID-19 CFR. The effect observed in certain Bay Area Counties indicates a need to continue to study the role of air pollution in driving severe COVID-19 using novel approaches and methodologies. Extreme weather events such as wildfire smoke can drive increased vulnerability to infectious diseases, highlighting the need to further study these colliding crises to increase preparedness for future pandemic threats in a changing climate.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://github.com/benmarhnia-lab/wildfires_covid19.

Funding
This work was supported by the California Environmental Protection Agency Office of Environmental Health Hazard Assessment #19-E0022. Lara Schwarz was supported by the Fogarty International Center of the National Institutes of Health under Award Number D43TW009343 and the University of California Global Health Institute.

Author contributions
LS, AD and TB conceived the project and assisted in writing the paper. LS led the writing and the first draft of the paper. LS, AD and RA carried out the data curation, data analysis and the wildfire smoke exposure modelling. AD, RA, AG, RB, & TB discussed the results, and contributed to the final paper. All authors have read and agreed to the published version of the manuscript.

Conflict of interest
The authors declare that they have no competing interests.