Disentangling the effects of air pollutants with many instruments

https://doi.org/10.1016/j.jeem.2021.102489

Abstract

Air pollution poses a major threat to human health. Far from unidimensional, air pollution is multifaceted, but quasi-experimental studies have struggled to grasp the consequences of the multiple hazards. By selecting optimal instruments from a novel and large set of altitude–weather instrumental variables, we disentangle the impact of five air pollutants in a comprehensive assessment of their short-term health impact in the largest urban areas of France over 2010–2015. We find that higher levels of at least two air pollutants, ozone and sulfur dioxide, lead to more respiratory-related emergency admissions. Children and the elderly are the most affected. Carbon monoxide increases emergency admissions for cardiovascular diseases, while particulate matter is found to increase the cardiovascular-related mortality rate, and sulfur dioxide the respiratory-related mortality rate. In a five-pollutant context, we show that an analyst who ignored the interrelations between air pollutants would have reached partially false conclusions.

Introduction

To protect human health, urban environmental regulations increasingly rely on ambient pollutant concentrations both to inform and to take action. Thanks to the high-frequency monitoring systems in place in large cities, local authorities may implement driving restrictions, impose lower speed limits or ban industrial activities when a pollutant concentration exceeds a regulatory threshold. The damage avoided when concentrations fall is central to the design of these environmental policies, first and foremost the damage from the respective health impacts of the pollutants forming the urban air pollution mixture. Whereas recent quasi-experimental evidence on air quality as a whole is clear-cut, disentangling the effects of distinct air pollutants has been a long-discussed challenge. Although it has received considerable attention, it remains a key difficulty in observational studies. In the discussions for updating the global air quality guidelines, the World Health Organization has set “Causality and independence of effects including multi-pollutant effect estimates as a basis for joint health impact assessment” as a key point for debate, highlighting the relevance of describing and regulating jointly several pollutants (WHO, 2018).

In this paper, we conduct a large-scale quasi-experimental study of the short-term concomitant effects of five air pollutants on morbidity and mortality in the largest urban areas of France over 2010–2015. We address the challenge posed by the highly correlated daily variations of air pollutants by leveraging a novel and large set of instruments that extensively describe altitude–weather conditions, e.g. winds or temperature profiles. By mining predictive relationships to find instruments for each air pollutant separately, we disentangle the concomitant effects of the main pollutants of the urban mixture. Our first contribution is to provide causal evidence on the separate effects of five air pollutants on short-term cardiovascular and respiratory morbidity and mortality, in the real urban environment, while controlling for the other pollutants. Our second contribution is to propose a novel set of instruments which allows precise estimation when combined with the IV-Lasso method of Belloni et al. (2012).

Concerns about extrapolating associational estimates have been voiced insistently (Currie et al., 2011, Dominici et al., 2014, Bind, 2019), but causal estimation remains challenging. Air pollution is not allocated randomly through time and space and may serve as a surrogate for a number of economic and population variables (e.g. traffic, industrial activities, bank holidays), hence the well-known challenge of isolating exogenous air pollution variations, even at high frequency. While causal estimates are considered the gold standard to inform public policies, quasi-experimental studies are still scarce, and typically able to isolate not the effect of a given pollutant but rather that of a cocktail of several ingredients. To identify air pollution effects, the quasi-experimental literature has been very creative in finding external shocks affecting air pollution independently of health outcomes. Authors have taken advantage of plausibly exogenous shocks such as airport congestion (Schlenker and Walker, 2015), daily boat traffic (Moretti and Neidell, 2011), changes in local traffic (Currie and Walker, 2011, Knittel et al., 2016, Simeonova et al., 2018) or recession (Chay and Greenstone, 2003). The nature of the shocks underpinning the estimations entails quasi-random variations of air quality, but not pollutant-specific variations.

Car traffic engenders emissions of particulate matter, carbon monoxide and nitrogen oxides as primary pollutants, and indirectly ozone, a secondary pollutant formed from primary sources. Lower economic activity entails a slowdown in emissions from industry, reducing among others sulfur dioxide and particulate matter. Pollutant concentrations often vary together as they share some common sources, but approximating air pollution by a unidimensional phenomenon might be questionable, not least because some air pollutants are strongly anti-correlated due to chemical equilibrium. As a result, studies relying on these global sources of variation are not well suited to separating the causal effects of distinct air pollutants. Some recent studies resort to finding exogenous shocks specific to one pollutant, e.g. Halliday et al. (2019), who use volcanic eruptions whose chemical composition is very specific, hence justifying a single-pollutant model. Using changes in wind direction as instruments, Deryugina et al. (2019) show that the PM2.5 impact on elderly mortality is more robust than that of other pollutants, instrumenting jointly for three pollutants (particulate matter, carbon monoxide and ozone). Yet a broader set of instruments may increase the ability to separate the impact of more air pollutants, showing that PM may not be the only pollutant affecting daily mortality. In contrast, most of the existing literature is based on single-pollutant models, and authors generally acknowledge that the pollutant under study may serve as a surrogate for another.

In parallel, the emergence of novel econometric and data science techniques has fostered the hope that the causal effects of each air pollutant could be more precisely estimated (Carone et al., 2020). In this spirit, a number of studies explore the consequences of multi-pollutant exposure with random forests or clustering over pollution profiles, but their evidence ultimately remains correlational (Zanobetti et al., 2014, Bobb et al., 2015, Tavallali et al., 2020). This paper moves beyond both literatures by using a wide set of physical instruments that allows us to separate the impact of five air pollutants in a quasi-experimental setting and over a broad set of outcomes. We here observe and consider simultaneously six long-regulated hazardous pollutants which have been the focus of the first and often-revised national and international standards for protecting human health: particulate matter of less than 2.5 micrometers (PM2.5) and of less than 10 micrometers (PM10), carbon monoxide (CO), nitrogen dioxide (NO2), ozone (O3) and sulfur dioxide (SO2). On a relatively small sample, we bridge the gap between, on the one hand, quasi-experimental studies which often lacked sufficiently distinct exogenous shocks to disentangle air pollutant effects and, on the other hand, associational studies using data-mining techniques within multi-pollutant models.

For this study, we use a novel and large set of instruments, altitude–weather variables: thermal inversions, planetary boundary layer height, altitude winds and altitude pressures derived from a general climate model, the LMDZ model from the Laboratoire de Météorologie Dynamique. We exploit the richness of a great number of instrumental variables to predict the variation of each pollutant. Indeed, atmospheric dynamics, such as wind effects, play a key role in the mixing, chemistry and dispersion of urban air pollution, and thus in the ambient air pollution inhaled by the population. The exclusion restriction for this type of IV strategy is that, after largely and flexibly controlling for surface weather variables and city-specific seasonal fixed effects, changes in altitude–weather variables are unrelated to changes in population health outcomes except through their influence on air pollutant concentrations. The specification includes very flexible month-by-year-by-city fixed effects and day-of-the-week-by-city fixed effects, so the estimates are identified from deviations within month–year–city cells on similar weekdays, and we control for daily temperature, humidity, precipitation and wind strength specified as polynomials of order two, as well as sunlight and the presence of snow. Controlling for surface weather is important inasmuch as these variables have a direct effect on health and are correlated with our instruments, so we examine a number of robustness checks to the main specification. Individually, some of these instruments have been used to instrument a unidimensional air pollution component. Arceo et al. (2016), Jans et al. (2018), Chen et al. (2018) and Sager (2019) rely on thermal inversions, an inversion of the gradient of vertical temperature profiles which favors polluted conditions. Deryugina et al. (2019) and Anderson (2019) use wind characteristics. Schwartz et al. (2016) use surface wind speed and the planetary boundary layer height, a key driver of ground-level air quality although still under-used in the literature.
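The identification argument in the preceding paragraph can be written compactly as a two-stage system. The display below is only a sketch under assumed notation, none of which is taken from the paper: $Y_{c,t}$ is a health outcome in city $c$ on day $t$, $P^{k}_{c,t}$ the concentration of pollutant $k$, $Z_{c,t}$ the vector of altitude–weather instruments, $W_{c,t}$ the surface weather controls entering through flexible functions $g$, $\mu_{c,m(t)}$ a month-by-year-by-city fixed effect and $\delta_{c,d(t)}$ a day-of-the-week-by-city fixed effect.

```latex
\begin{align}
P^{k}_{c,t} &= Z_{c,t}'\,\pi^{k} + g^{k}(W_{c,t}) + \mu^{k}_{c,m(t)} + \delta^{k}_{c,d(t)} + v^{k}_{c,t},
  \qquad k = 1,\dots,5, \\
Y_{c,t} &= \sum_{k=1}^{5} \beta_{k}\, P^{k}_{c,t} + g(W_{c,t}) + \mu_{c,m(t)} + \delta_{c,d(t)} + \varepsilon_{c,t}, \\
\mathbb{E}\!\left[ Z_{c,t}\,\varepsilon_{c,t} \mid W_{c,t},\, \mu_{c,m(t)},\, \delta_{c,d(t)} \right] &= 0 .
\end{align}
```

Under this notation, the first line is the first stage for each pollutant, the second line is the multi-pollutant outcome equation whose coefficients $\beta_{k}$ are the effects of interest, and the third line states the exclusion restriction: conditional on surface weather and the fixed effects, the instruments affect health only through the pollutant concentrations.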

To derive pollutant-specific causal effects, we use optimal instrument selection among a high-dimensional set of altitude–weather variables, relying on the econometric theory of Belloni et al. (2012), Belloni et al. (2016) and Chernozhukov et al. (2015). These recent techniques allow us to select instruments in an optimal way, avoiding ad hoc selection and enhancing precision in a setting where it is decisively needed. Compared to the literature drawing causal inference from the unpredictable components of weather variations, the originality here is to use a large set of altitude–weather conditions as opposed to a sub-component, and to let the data reveal the strongest underlying relationships. We may indeed find many other and more complex phenomena linking altitude–weather variables to ground-level pollution by leveraging the rich set of instruments at hand. By isolating different exogenous sources of variation for each pollutant with an IV-Lasso, we demonstrate the empirical added value of these recent high-dimensional econometric methods, whose applications are too often confined to repeating existing analyses. Indeed, it is in practice very rare to rely on a naturally large set of instruments.
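To make the idea of Lasso-based instrument selection concrete, here is a minimal sketch on simulated data. It is not the authors' implementation. It assumes a single pollutant, a penalty fixed by hand (Belloni et al. (2012) derive a data-driven penalty and recommend a post-Lasso refit), and it uses the Lasso fitted values directly as the instrument. All names (`simulate`, `lasso_coordinate_descent`, `iv_slope`) and all numbers are hypothetical.

```python
import random


def center(values):
    """Subtract the sample mean so that no intercept is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def soft_threshold(value, penalty):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(columns, target, penalty, sweeps=30):
    """Minimise (1/2n)*||target - Z b||^2 + penalty*||b||_1 by coordinate descent.

    columns: list of p centered instrument columns, each of length n.
    """
    n = len(target)
    coefs = [0.0] * len(columns)
    residual = list(target)
    norms = [sum(z * z for z in col) / n for col in columns]
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Covariance of column j with the partial residual (column j added back).
            rho = sum(z * r for z, r in zip(col, residual)) / n + norms[j] * coefs[j]
            new_coef = soft_threshold(rho, penalty) / norms[j]
            delta = new_coef - coefs[j]
            if delta != 0.0:
                residual = [r - delta * z for r, z in zip(residual, col)]
                coefs[j] = new_coef
    return coefs


def simulate(n=1000, n_instruments=15, beta=0.5, seed=0):
    """Hypothetical data: only instruments 0 and 1 move the pollutant,
    and an unobserved confounder drives both pollutant and outcome."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n_instruments)]
    confounder = [rng.gauss(0, 1) for _ in range(n)]
    pollutant = [columns[0][i] + 0.8 * columns[1][i] + confounder[i]
                 + 0.5 * rng.gauss(0, 1) for i in range(n)]
    outcome = [beta * pollutant[i] + 2.0 * confounder[i] + rng.gauss(0, 1)
               for i in range(n)]
    return columns, pollutant, outcome


def iv_slope(instrument, regressor, outcome):
    """Just-identified IV slope; reduces to OLS when instrument is the regressor."""
    num = sum(w * y for w, y in zip(instrument, outcome))
    den = sum(w * x for w, x in zip(instrument, regressor))
    return num / den


columns, pollutant, outcome = simulate()
columns = [center(col) for col in columns]
x, y = center(pollutant), center(outcome)

# First stage: the Lasso keeps only instruments that predict the pollutant.
coefs = lasso_coordinate_descent(columns, x, penalty=0.2)
selected = [j for j, b in enumerate(coefs) if b != 0.0]
fitted = [sum(coefs[j] * columns[j][i] for j in selected) for i in range(len(x))]

beta_ols = iv_slope(x, x, y)       # contaminated by the confounder
beta_iv = iv_slope(fitted, x, y)   # uses only instrument-driven variation
print("selected instruments:", selected)
print("OLS slope:", round(beta_ols, 3), "IV slope:", round(beta_iv, 3), "(true 0.5)")
```

In this toy setting the OLS slope is pulled away from the true value by the confounder, while the slope based on the Lasso-predicted pollutant stays close to it. In the paper's setting the same logic would apply to each pollutant separately, with far more candidate instruments and with controls and fixed effects partialled out first.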

This study contributes to the recent literature in economics which estimates the health effects of air pollution in quasi-experimental settings (Currie et al., 2011, Schlenker and Walker, 2015, Deryugina et al., 2019). We combine daily air pollutant concentration data with administrative data on location-specific daily mortality and emergency hospital admissions for cardiovascular and respiratory diseases across age groups. These data cover, over six years (2010–2015), the ten largest urban areas of France, where about 40% of the French population lives. Our results show that ozone and sulfur dioxide increase emergency admissions for respiratory diseases, independently of each other and even after controlling for the other pollutants. Quantitatively, we find 4% more respiratory admissions when O3 goes up by 10 μg/m³ (about half a standard deviation) and 7% more respiratory admissions when SO2 goes up by 1 μg/m³ (two thirds of a standard deviation). These aggregate effects are mostly driven by emergency admissions of young children and the elderly. Although not in all specifications, some of the models suggest an additional impact of carbon monoxide on respiratory emergency admissions. On cardiovascular diseases, we find an impact of carbon monoxide: an increase of 100 μg/m³ (about half a standard deviation) leads to 4% additional emergency admissions. Moreover, we find an effect of PM2.5 on cardiovascular-related mortality: an increase of 10 μg/m³ (about a standard deviation) leads to a 5% higher mortality rate for deaths with at least one cardiovascular cause (or a 2% increase in the mortality rate). An increase of 1 μg/m³ in SO2 translates into a 10% higher mortality rate for deaths with at least one respiratory cause (or a 2% increase in the mortality rate). These short-term health effect estimates remain significant when controlling the family-wise error rate, i.e. the probability of at least one false rejection among the five hypothesis tests (one per candidate pollutant).
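The family-wise error rate mentioned at the end of this paragraph can be controlled in several ways, and the text here does not say which procedure is used. As an illustration only, the sketch below applies the Holm step-down procedure, one standard choice, to five p-values. The p-values are made up for the example and are not results from the paper; `holm_reject` is a hypothetical helper name.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: controls the probability of at least one
    false rejection among the tests at level alpha.

    p_values: dict mapping a hypothesis label to its p-value.
    Returns a dict mapping each label to True (rejected) or False.
    """
    labels = sorted(p_values, key=lambda name: p_values[name])
    n_tests = len(labels)
    decisions = {name: False for name in p_values}
    for rank, name in enumerate(labels):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p_values[name] <= alpha / (n_tests - rank):
            decisions[name] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return decisions


# Made-up p-values, one per candidate pollutant (illustration only).
example_p_values = {"PM2.5": 0.30, "CO": 0.20, "NO2": 0.60, "O3": 0.004, "SO2": 0.011}

holm = holm_reject(example_p_values)
bonferroni = {name: p <= 0.05 / len(example_p_values)
              for name, p in example_p_values.items()}
print("Holm:", holm)
print("Bonferroni:", bonferroni)
```

With these illustrative numbers, Holm rejects the null for O3 and SO2, whereas the more conservative Bonferroni rule rejects it only for O3. This shows why the choice of correction matters when five pollutants are tested at once.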

Our last contribution is to shed light on the shortcomings of single-pollutant models compared to multi-pollutant models, by providing an extensive comparison of the results in both paradigms. While most pollutants appear to have a strong causal effect on short-term health in single-pollutant models, multi-pollutant models offer a more nuanced picture. In single-pollutant models, some pollutants may act as surrogates for the others, leading to misleading conclusions. For all outcomes, we reject the equality of estimates from single-pollutant IVs with those of a multi-pollutant IV-Lasso. When instruments are specifically chosen for each pollutant, we reject equality between single- and multi-pollutant models for mortality outcomes. These results call into question the proxy paradigm, which is often the rule in empirical analysis. For instance, while NO2 has been advocated as a good candidate to proxy for all pollutants in Levy et al. (2014), we find no short-term effect of this pollutant when other pollutants enter the equation. In addition, controlling for four other pollutants and selecting the instruments optimally eliminates the odd finding that O3 leads to a decrease in mortality or emergency admissions. This spurious result is usually explained by the strong negative correlation between this pollutant and the other pollutants.
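The surrogate mechanism described above can be reproduced in a small simulation. This is a sketch under strong assumptions, not the paper's analysis. There are two hypothetical pollutants that are negatively correlated, only one of them harms health, and plain OLS is used with no endogeneity, so the only problem is the omitted pollutant. All coefficients are invented for illustration.

```python
import random


def center(values):
    """Subtract the sample mean so that no intercept is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def single_slope(x, y):
    """OLS slope of y on one centered regressor."""
    return dot(x, y) / dot(x, x)


def joint_slopes(x1, x2, y):
    """OLS slopes of y on two centered regressors (2x2 normal equations)."""
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(1)
n = 2000
# Hypothetical pollutants: 'harmless' moves against 'harmful', as ozone
# tends to move against other urban pollutants.
harmful = [rng.gauss(0, 1) for _ in range(n)]
harmless = [-0.7 * h + 0.5 * rng.gauss(0, 1) for h in harmful]
# Only the harmful pollutant enters the health outcome.
health = [1.0 * h + rng.gauss(0, 1) for h in harmful]

x1, x2, y = center(harmful), center(harmless), center(health)
alone = single_slope(x2, y)
b_harmful, b_harmless = joint_slopes(x1, x2, y)

print("single-pollutant slope for the harmless pollutant:", round(alone, 3))
print("two-pollutant slopes (harmful, harmless):",
      round(b_harmful, 3), round(b_harmless, 3))
```

Studied on its own, the harmless pollutant appears protective because it stands in for the absence of the harmful one. Once both pollutants enter the regression, its coefficient is close to zero and the harmful pollutant's coefficient is close to its true value of 1.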

More generally, our results tie into the literature on designing policy instruments in a multi-pollutant context, e.g. Montero (2001), Ambec and Coria (2013) and Fullerton and Karney (2018). Rich economic valuations of environmental policies that take several major pollutants into account, such as Holland et al. (2018) or Clay et al. (2019), rely substantially on integrated assessment models in which the health impact measurement is a key step, taken from studies that generally use the proxy approach. This paper contributes to quantifying the respective marginal benefits of reducing distinct pollutants, in a context of increasing interest in regulating air pollutants jointly. In addition, our results call into question the current implementation of Air Quality Indexes, which are generally specified as the maximum over pollutant sub-indexes, ruling out concomitant effects. Real-time AQI information about air pollution has been shown to be critical for defensive investment and protective behavior (Neidell, 2009, Zhang and Mu, 2018, Barwick et al., 2019).
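The remark on Air Quality Indexes can be illustrated with a toy index. The sketch below assumes a linear sub-index (concentration divided by a reference level, times 100) and invented reference levels. Real indexes use official piecewise breakpoints, so the functions, thresholds and concentrations here are hypothetical.

```python
def sub_index(concentration, reference_level):
    """Toy sub-index: 100 means the concentration equals the reference level."""
    return 100.0 * concentration / reference_level


def aqi_max(concentrations, reference_levels):
    """Index defined as the maximum over the pollutant sub-indexes."""
    return max(sub_index(concentrations[name], reference_levels[name])
               for name in concentrations)


# Invented reference levels and concentrations (micrograms per cubic meter).
reference_levels = {"O3": 180.0, "SO2": 100.0}
day_a = {"O3": 120.0, "SO2": 5.0}    # only ozone is elevated
day_b = {"O3": 120.0, "SO2": 40.0}   # same ozone, much more sulfur dioxide

print("AQI day A:", round(aqi_max(day_a, reference_levels), 1))
print("AQI day B:", round(aqi_max(day_b, reference_levels), 1))
```

Both days receive the same index value because ozone has the largest sub-index on each, even though sulfur dioxide is eight times higher on the second day. If sulfur dioxide has its own health effect, as the results suggest, a maximum-based index does not convey the additional risk.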

The article proceeds as follows. In the second section, we introduce background information on pollutants, estimation of health impacts and on pollutants’ interaction with weather conditions. In the third section, we present jointly the data and the mechanisms at work. Then, we present and discuss the empirical strategy and the instruments’ selection procedure in the fourth section. Finally, we present our results and then conclude.

Section snippets

Air pollution or air pollutants?

Air pollutant concentrations are highly correlated in the urban setting but air pollution is without doubt multidimensional in its nature and consequences. The air we breathe contains particulate matter of various sizes and various gases, which may affect differently our health. In this paper, we consider the air pollutants which gather the strongest evidence according to WHO (2018): PM, O3, NO2, SO2 and CO. These are pollutants in WHO’s “Group 1”, which “should be considered of greatest

Data

In this section, we describe the data sources, which all share the following scope: the ten most populated urban areas in France over the 2010–2015 period. Table A.1 in the Appendix reports the population, and Fig. 1 the geographical location and extent of the urban areas. The largest urban area is the Paris region, where more than twelve million people live. Most of the other urban areas have about a million inhabitants. The urban areas are well spread out across the French territory.

Within

The causal damages of air pollution: preliminary evidence

In this section, we provide basic preliminary evidence on the detrimental impact of air pollution on health by relying on isolated instruments. We show how these instruments are strong predictors of air pollution. We make a case empirically for the need to rely on instrumentation for eliminating confounders. We caution against single-pollutant models when instruments are far from being pollutant-specific.

Eliminating confounders. In this section, we empirically evidence that air pollution is

Results

We first disentangle the impact of distinct air pollutants on four short-term health outcomes using the Lasso-selected subset of the large set of potential altitude–weather instruments. To discuss the shortcomings of single-pollutant models, we then provide results that consider each pollutant separately, as is standard in this literature.

Conclusion

This paper shows how distinct pollutants have strong and independent effects on the short-term respiratory health of the urban population. We develop a two-step strategy, showing first how air pollution is causally linked to daily emergency admissions and mortality rates and second how optimally selecting many more instruments makes it possible to disentangle the effects of several pollutants. We provide causal evidence on the separate effects of ozone and sulfur dioxide on respiratory diseases, jointly

References (76)

  • Gately C.K. et al. Urban emissions hotspots: Quantifying vehicle congestion and air pollution using mobile phone GPS data. Environ. Pollut. (2017)
  • Hansen C. et al. Instrumental variables estimation with many weak instruments using regularized JIVE. J. Econometrics (2014)
  • Jans J. et al. Economic status, air quality, and child health: Evidence from inversion episodes. J. Health Econ. (2018)
  • Kan H. et al. Short-term association between sulfur dioxide and daily mortality: The Public Health and Air Pollution in Asia (PAPA) study. Environ. Res. (2010)
  • Munir S. et al. Modelling the impact of road traffic on ground level ozone concentration using a quantile regression approach. Atmos. Environ. (2012)
  • Sager L. Estimating the effect of air pollution on road safety using atmospheric temperature inversions. J. Environ. Econ. Manag. (2019)
  • Zanobetti A. et al. Health effects of multi-pollutant profiles. Environ. Int. (2014)
  • Zhang J. et al. Air pollution and defensive expenditures: Evidence from particulate-filtering facemasks. J. Environ. Econ. Manag. (2018)
  • Adda J. Economic activity and the spread of viral diseases: Evidence from high frequency data. Q. J. Econ. (2016)
  • Anderson M.L. As the wind blows: The effects of long-term exposure to air pollution on mortality. J. Eur. Econom. Assoc. (2019)
  • Arceo E. et al. Does the effect of pollution on infant mortality differ between developing and developed countries? Evidence from Mexico city. Econ. J. (2016)
  • Banzhaf H.S. et al. Do people vote with their feet? An empirical test of Tiebout. Amer. Econ. Rev. (2008)
  • Barreca A. et al. Adapting to climate change: The remarkable decline in the US temperature-mortality relationship over the twentieth century. J. Political Econ. (2016)
  • Barwick P.J. et al. From Fog to Smog: The Value of Pollution Information. Technical Report (2019)
  • Bauernschuster S. et al. When labor disputes bring cities to a standstill: The impact of public transit strikes on traffic, accidents, air pollution, and health. Am. Econ. J.: Econ. Policy (2017)
  • Belloni A. et al. Sparse models and methods for optimal instruments with an application to eminent domain. Econometrica (2012)
  • Belloni A. et al. Inference in high-dimensional panel models with an application to gun control. J. Bus. Econom. Statist. (2016)
  • Bind M.-A. Causal modeling in environmental health. Annu. Rev. Public Health (2019)
  • Bobb J.F. et al. Bayesian kernel machine regression for estimating the health effects of multi-pollutant mixtures. Biostatistics (2015)
  • Cameron A.C. et al. Bootstrap-based improvements for inference with clustered errors. Rev. Econ. Stat. (2008)
  • Carone M. et al. In pursuit of evidence in air pollution epidemiology: The role of causally driven data science. Epidemiology (2020)
  • Chay K.Y. et al. The impact of air pollution on infant mortality: evidence from geographic variation in pollution shocks induced by a recession. Q. J. Econ. (2003)
  • Chen S. et al. Air Pollution and Mental Health: Evidence from China. Technical report (2018)
  • Chernozhukov V. et al. Post-selection and post-regularization inference in linear models with many controls and instruments. Amer. Econ. Rev. (2015)
  • Chernozhukov V. et al. hdm: High-dimensional metrics (2016)
  • Cheruy F. et al. Combined influence of atmospheric physics and soil hydrology on the simulated meteorology at the SIRTA atmospheric observatory. Clim. Dynam. (2013)
  • Clay K. et al. The external costs of shipping petroleum products by pipeline and rail: Evidence of shipments of crude oil from North Dakota. Energy J. (2019)
  • Coindreau O. et al. Assessment of physical parameterizations using a global climate model with stretchable grid and nudging. Mon. Weather Rev. (2007)

We thank the audience at the AERE Virtual 2020, CREST seminar, Causal Machine Learning Workshop 2020 in St Gallen, the ESEM 2019, EAERE 2019, LAGV 2019, ESPE 2019, INSEE seminar and 10th French Econometrics Conference 2018 for their suggestions and useful comments. We thank David Benatia, Jérémy l’Hour and Christophe Gaillac for insightful discussions, and Laurent Gobillon, Dominique Goux, Sébastien Roux, Antoine Dechezleprêtre and Pauline Givord for their comments and suggestions. We thank the organizers and participants of the Machine Learning for Economics workshop at the Barcelona GSE Forum 2019, whose comments helped improve this paper. We are especially grateful to Frédérique Chéruy for the help with the LMDZ model, and to Frédéric Hourdin. We thank Pierre Bayart and Chantal Vilette at Insee for their help with the civil registry data. We thank the CépiDc-Inserm for providing us with the mortality by cause database. We thank Alireza Banaei, Fatma Kaci, Anne Bataillard, Max Bensadon and Francoise Bourgoin from ATIH for providing us with the hospital admission database, and Anne Laborie, Gilles Levigoureux and Frédéric Penven from ATMO France and the AASQA for providing us with the air pollution datasets. This work has been partly funded by a French government subsidy managed by the Agence Nationale de la Recherche under the framework of the Investissements d’avenir programme reference ANR-17-EURE-001.
