Future air quality in Europe: a multi-model assessment of projected exposure to ozone

In order to explore future air quality in Europe at the 2030 horizon, two emission scenarios developed in the framework of the Global Energy Assessment including varying assumptions on climate and energy access policies are investigated with an ensemble of six regional and global atmospheric chemistry transport models. A specific focus is given in the paper to the assessment of uncertainties and robustness of the projected changes in air quality. The present work relies on an ensemble of chemistry transport models giving insight into the model spread. Both regional and global scale models were involved, so that the ensemble benefits from medium-resolution approaches as well as global models that capture long-range transport. For each scenario a whole decade is modelled in order to gain statistical confidence in the results. A statistical downscaling approach is used to correct the distribution of the model projection. Last, the modelling experiment is linked to a hindcast study published earlier, where the performances of all participating models were extensively documented. The analysis is presented in an exposure-based framework in order to discuss policy-relevant changes. According to the emission projections, ozone precursors such as NOx will drop to 30% to 50% of their current levels, depending on the scenario. As a result, annual mean O3 will slightly increase in NOx-saturated areas but the overall O3 burden will decrease substantially. Exposure to O3 levels detrimental to health (SOMO35) will be reduced to 45% to 70% of its current level. The fraction of stations where the daily maximum O3 currently exceeds 120 µg m−3 on more than 25 days per year will drop from 43% to between 2% and 8%. We conclude that air pollution mitigation measures (present in both scenarios) are the main factors leading to the improvement, but an additional cobenefit of at least 40% (depending on the indicator) is brought about by the climate policy.

Introduction
With the growing demand by air quality legislators for technical underpinning of emission control policies, addressing uncertainties and robustness is essential to provide information that can be used as a basis for defining efficient mitigation measures. While our understanding of physical and chemical processes, the documentation of social and economic activity, as well as the computing power continue to grow at a steady pace, it remains to be demonstrated that the robustness of predictions of air quality models has increased in line with their growing complexity.
In the present paper we will address uncertainties in three important aspects of air quality risk assessments, inter alia: emission projections, transport and transformation, and exposure downscaling.
1. Emission projections: a large fraction of the uncertainty in the projections is due to the air pollutant emissions prescribed in social, economic and technological scenarios. Uncertainties about the level of economic activity, the available technologies, their emission reduction potential, and their acceptability by end-users constitute many obstacles for providing quantitative estimates of future emissions of pollutants.
2. Atmospheric transport and transformation: in order to assess the impacts in terms of air pollution, the transformation of these primary emissions into secondary species (such as ozone or secondary organic aerosols) has to be taken into account, as well as their transport and processing in the atmosphere. Usually this is done using chemistry transport models (CTMs), numerical tools that evolved gradually from simple advection codes to complex systems that now take into account photochemical processes and heterogeneous chemistry, impacts of climate change, long-range transport, etc.
3. Exposure downscaling: in order to be policy-relevant, the impacts derived from the CTMs must be converted into exposure metrics at an appropriately fine scale to assess impacts on human health and ecosystems.
Most existing future projections of anthropogenic emissions of air pollutants are based on global models, such as those developed as part of the Intergovernmental Panel on Climate Change (IPCC). There has been an ongoing effort within the IPCC to include a more detailed representation of short-lived gases in emission scenarios (Nakicenovic et al., 2000; IPCC, 2007). The more recent scenarios developed as part of the IPCC AR5 report, the RCPs (Representative Concentration Pathways; van Vuuren et al., 2011), include a representation of a number of air pollutants with a level of detail in terms of underlying technologies and legislations that varies across models (see, for example, Riahi et al., 2011). However, the RCP scenarios were primarily developed to encompass a range of long-term climate change outcomes and do not specifically look at the uncertainties in air pollution development in the shorter term. For our analysis we use a set of air quality scenarios from the Global Energy Assessment (GEA; Riahi et al., 2012). These scenarios are based on a similar set up described in  and are an outcome of combining a global energy systems model, MESSAGE (Messner and Strubegger, 1995; Riahi et al., 2007), with air pollution legislations at a technology-specific level from the GAINS model (Amann et al., 2011). The GEA scenarios thus provide a platform to investigate air quality and climate mitigation co-benefits, following a methodology that explicitly includes technology-based air quality legislation and changes in spatial emission patterns.
Uncertainties in the determination of future air quality using atmospheric models can be estimated using different methods. A first approach consists of building a model of the anticipated response of the atmosphere to a given change in the anthropogenic emission influx.
Atmospheric response models are an approximation of full chemistry-transport models that allow the investigation of many scenarios; they are especially relevant for the optimization of air quality management strategies. This approach is found in the GAINS model (Amann et al., 2011; Schöpp et al., 1998), where the atmospheric response is represented with a statistical fit to a large number of EMEP model calculations which use incremental emission changes. Other atmospheric response approaches include the direct decoupled method, adjoint modelling and source apportionment (see the recent review of Cohan and Napelenok, 2011). A second approach consists of implementing an ensemble of comprehensive CTMs, often at the cost of investigating a more limited number of scenarios. The analyses of such ensembles are frequently used for air pollution forecasting (Zyryanov et al., 2012) and model inter-comparison. Ensemble long-term studies of the evolution of atmospheric chemistry were reported at the global scale (Shindell et al., 2006; Textor et al., 2006) as well as for regional air quality (van Loon et al., 2007; Vautard et al., 2006b; Cuvelier et al., 2007).
Here we will follow the ensemble approach. The statistical significance of the simulations is enhanced compared to existing assessments at the regional scale since we performed simulations over 10 yr, hence minimizing the sensitivity to inter-annual variability. The strength of the ensemble implemented here is further supported by a recently published companion study that evaluated the capacity of the same ensemble at reproducing air quality trends and variability in a hindcast mode (Colette et al., 2011), hereafter referred to as C2011.
Atmos. Chem. Phys., 12, 10613-10630, 2012 www.atmos-chem-phys.net/12/10613/2012/
The downscaling of exposure metrics also needs to be investigated since average modelled trace species concentrations are not the most relevant proxy for the assessment of impacts on human health and ecosystems. However, model results contain biases for various reasons including, but not limited to, the spatial resolution. Given the sensitivity of threshold-based exposure indicators, it is essential to explore the implementation of bias correction techniques to improve the robustness of projections. Bias corrections and statistical downscaling strategies that are particularly relevant in the context of future projections can be found in the literature. However, their implementation has been limited up to now to the field of climate research (Michelangeli et al., 2009; Déqué, 2007). We propose to apply these techniques to air quality projections in order to derive unbiased proxies of future exposure to air pollution.
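To make the idea of correcting a modelled distribution concrete, the following is a minimal sketch of empirical quantile mapping, a simple member of the family of bias correction techniques referred to above. It is not the method applied in this paper and not the CDF-t approach of Michelangeli et al. (2009); the function name `quantile_map`, the nearest-rank rule and the sample values are assumptions introduced for illustration only.

```python
from bisect import bisect_right


def quantile_map(value, model_ref, obs_ref):
    """Map a modelled value onto the observed distribution (nearest-rank rule).

    model_ref and obs_ref are samples for a common reference period.
    This is a hypothetical helper, not part of any library.
    """
    model_sorted = sorted(model_ref)
    obs_sorted = sorted(obs_ref)
    # Non-exceedance probability of the value in the modelled reference sample.
    p = bisect_right(model_sorted, value) / len(model_sorted)
    # Observed value found at the same probability, clamped to the sample range.
    idx = min(len(obs_sorted) - 1, max(0, round(p * len(obs_sorted)) - 1))
    return obs_sorted[idx]


# Hypothetical daily maximum ozone samples (µg m−3) for a reference period.
modelled = [10.0, 20.0, 30.0, 40.0]
observed = [15.0, 25.0, 35.0, 45.0]
corrected = quantile_map(30.0, modelled, observed)
```

In this toy case the model is uniformly 5 µg m−3 too low, and a modelled value of 30 is mapped onto the observed value at the same rank. Values outside the reference range are clamped to the observed extremes, which is one of several possible design choices and a known limitation of plain quantile mapping for future projections.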
Section 2 of this paper will introduce the scenarios of emissions for air pollutants. In Sect. 3, the ensemble of CTMs participating in the study is introduced and a general description of model behaviour is discussed. In Sect. 4, the results are investigated for the background changes as well as exposure metrics, including their statistical downscaling.

Emission scenarios
The GEA emission pathways assume a median growth in global population up to 9.2 billion inhabitants in 2050, in agreement with the United Nations projections (United Nations, 2009). The average gross domestic product (GDP) growth rate over 2005 to 2030 is 2.7 % globally and 1.7 % for Europe. As described in Riahi et al. (2012), emissions include CH4, SO2, NOx, NH3, CO, VOCs (Volatile Organic Compounds), BC (Black Carbon), OC (Organic Carbon), and PM2.5 (particulate matter smaller than 2.5 µm in diameter). The emission data are originally computed on the basis of 11 world regions and consider in particular two main blocks for the European continent (Western and Eastern). They are then spatialised on a 0.5 degree resolution grid. Base year (2000) emissions and spatial distributions are based on ACCMIP and are identical to the RCPs. As described in , for both scenarios, future spatial emissions are estimated with an exposure-driven spatial algorithm where emissions decrease faster in those cells with the highest exposure. These exposure-driven trends are designed by means of comparison with emission trends over the recent past in order to capture the fact that there is more scope for emission abatement in densely populated areas. Butler et al. (2012) documented the significance of this redistribution algorithm in the case of the RCP8.5 by comparison with a scenario where the reductions of emissions are homogeneous.
The emission trajectories cover the whole 21st century, but we focus on 2030 because of its relevance for short-term policy making. The 2030 time period has also the advantage for air quality modelling that the climate signal is relatively weak (Katragkou et al., 2011;Langner et al., 2012), so that we used present-day meteorological conditions to drive the CTMs (see further details in Sect. 3.1) and we left the investigation of the relative impact of climate and emission changes to future work. In addition, this relatively short time scale allows representing explicitly the technological emission abatement measures.
We selected two of the GEA scenarios for the air quality simulations:
1. A Reference Case (hereafter referred to as 'reference'). It corresponds to a high energy demand scenario that includes all current and planned air quality legislation until 2030. The reduction of global annual energy intensity is slightly faster than observed over the recent past at 1.5 % until 2050; there are no policies on climate change and energy access. The climate response in 2100 is comparable to the RCP8.5 in terms of global radiative forcing.
2. Sustainable Climate Policy Case (hereafter referred to as 'sustainable'). This scenario assumes underlying climate change policies, in particular a global temperature target of 2 degrees C by 2100 and energy efficiency improvements leading to an annual energy intensity reduction of 2.6 % until 2050. Also included are moderate energy access policies that reduce global use of solid fuels in cooking by 2030.
The total NOx and VOC emissions of the GEA scenarios in 2005 and 2030 for the 27 countries of the European Union (EU27) are given in Table 1 and further mapped over Europe in Fig. 1. Also included for comparison is the EMEP inventory for 2005 (Vestreng et al., 2009). The EMEP inventory is developed from national emissions officially reported by the countries and it constitutes a benchmark widely used in air quality studies. In addition, this inventory has been used in the C2011 study that relies on the same ensemble of CTMs as the present paper.
In general, we observe that while there is an overall agreement between the two sets for 2005, there are differences, especially in the spatial distribution at finer scale. For example, the GEA emissions exhibit less spatial variability than EMEP emissions in the main hotspot constituted by the larger Benelux area. A smoother representation of emission hotspots was expected given that the underlying maps for the GEA 2005 emissions are derived from ACCMIP global fields, whose resolution is coarser than regional inventories.
The total NOx emissions over Europe in the GEA base year (2005) are higher than in the EMEP inventory for the same year (Table 1). ACCMIP data were matched to the EMEP regional totals for the year 2000, but 2005 is actually a projection and, again, some differences were expected. A similar behaviour is documented by   in their comparison of RCPs, ACCMIP and EMEP data. The larger amount of total NOx emissions is especially notable over rural areas in Fig. 1. The VOC to NOx ratio for present-day conditions is 82 % and 73 % in the EMEP and GEA data for 2005, respectively. This ratio is relevant to define the chemical regime that dominates in the ozone formation process. Thus the spread between the projection and the officially reported data illustrates well the uncertainties remaining in the input data for atmospheric chemistry modelling.
As seen in Table 1, current air quality legislations in the 'reference' scenario reduce NOx by 50 %, while the inclusion of climate-change policies ('sustainable') leads to large-scale reductions of 69 % as compared to 2005 levels. In other words, total NOx emissions in the 'sustainable' scenario are 38 % lower than in the 'reference' scenario in 2030, so that the cobenefit brought about by the climate policy in terms of NOx emissions is 38 % (Colette et al., 2012). Most of these reductions occur in the transport sector.
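The 38 % figure follows directly from the two reductions quoted above, since it compares the emissions remaining in 2030 under each scenario. The short sketch below reproduces that arithmetic; the function and variable names are illustrative and not taken from the GEA dataset.

```python
# Fractional NOx reductions relative to 2005, as quoted in the text.
reduction_reference = 0.50
reduction_sustainable = 0.69


def relative_cobenefit(red_ref, red_sus):
    """Fraction by which 'sustainable' emissions are below 'reference' ones."""
    remaining_ref = 1.0 - red_ref  # share of 2005 emissions left in 'reference'
    remaining_sus = 1.0 - red_sus  # share of 2005 emissions left in 'sustainable'
    return 1.0 - remaining_sus / remaining_ref


cobenefit = relative_cobenefit(reduction_reference, reduction_sustainable)
print(f"{cobenefit:.0%}")  # prints 38%
```

The remaining shares are 50 % and 31 % of the 2005 level, and 31/50 = 0.62, hence emissions that are 38 % lower in the 'sustainable' scenario.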
The decrease is observed to be larger over formerly high-emission areas and, in many cases, large urban centres cannot be distinguished on the NOx map for 2030 in the 'sustainable' scenario, thus indicating that combined policies on air pollution and climate change will be effective in Europe in achieving large-scale reductions in emissions.
Last, the NOx emissions prescribed in the GEA scenarios were split into NO and NO2 contributions using country-dependent ratios derived for 2020 (CAFE, 2005) in order to account for the significant change in this speciation reported over the recent past. These NO/NO2 ratios were derived using the GAINS model, but they are not part of the GEA dataset. Only the regional CTMs accounted for this change whereas the global models used a constant ratio.

Participating models
Using an ensemble of CTMs constitutes a major strength of the present study. Six modelling groups were involved and they include a variety of approaches: global or regional offline chemistry transport as well as one online regional model. The CTMs involved in this study are:
1. BOLCHEM (Mircea et al., 2008) is a regional online coupled atmospheric dynamics and composition model that computes both chemistry and meteorology accounting for the relevant interactions. It is developed and operated by the Institute of Atmospheric Sciences and Climate of the Italian National Council of Research.
2. CHIMERE is a regional CTM developed and distributed by Institut Pierre Simon Laplace (CNRS) and INERIS. In the present case it was implemented by INERIS.
3. The EMEP MSC-W model, hereafter referred to as EMEP model, is a regional CTM developed, distributed and operated at the EMEP Centre MSC-W, hosted by the Norwegian Meteorological Institute.
4. The EURAD model (Jakobs et al., 2002) is a regional CTM operated at FRIUUK for continental and local air quality forecasting in the Ruhr area.

MOZART (Model for OZone And Related chemical Tracers) is a global chemistry transport model developed jointly by the United States National Center for Atmospheric Research, the Geophysical Fluid Dynamics Laboratory, and the Max Planck Institute for Meteorology. The MOZART-4 version of the model (Emmons et al., 2010) was implemented by CNRS and NOAA for this study.
A detailed description of the models is given in C2011 and the setup of the simulations presented here is also very close to the configuration of that hindcast experiment. All the groups simulated 10 meteorological years corresponding to the early 21st century for each of the three emission scenarios described above. Five models used the same reanalyses of historical years (downscaled with a mesoscale model for the regional tools) as in the C2011 paper, and the remaining model used downscaled control climate simulations representative of the early 21st century. The boundary conditions for the regional models are identical to C2011 and therefore also representative of the early 21st century (LMDzINCA fields for CHIMERE and BOLCHEM and observation-based O3 climatology for EMEP and EURAD). The only changes in terms of model setup (excluding anthropogenic emission changes discussed in Sect. 2) are:
It should be noted that in the present work, as well as in the hindcast study of C2011, apart from using the same anthropogenic emissions, the modelling setup was not heavily constrained since the scope was to investigate the envelope of AQ trajectories.

Overview of the ensemble
The 6-member ensemble of CTMs was thoroughly evaluated in C2011, and the reader is referred to that paper for an assessment of model performances in a hindcast perspective (i.e. using past emissions and reanalysed meteorology). In the remainder of the paper we will focus on composite maps of the ensemble. Nevertheless it is useful to provide an evaluation of the spread amongst model results. Figure 2 displays the average summertime (June, July, and August) surface O3 concentrations over the 10 yr of simulation. All the models display a similar geographical pattern dominated by the land-sea gradient (especially over the Mediterranean) driven by deposition processes. Only BOLCHEM really stands out of the distribution with a much lower ozone background because of higher NO2 levels attributed in C2011 to vertical mixing and heterogeneous chemistry. The magnitude of the local minima over the Benelux hotspot driven by titration processes differs across the models. It is noteworthy that this local minimum is captured by MOZART, since this feature is not common in global models.
The model performances are further documented in Table 2, which provides a comparison between each model and observed values reported at AIRBASE stations (the public air quality database maintained by the European Environmental Agency, http://air-climate.eionet.europa.eu/databases/AIRBASE/). The mean bias, root mean square error and correlation of the daily maximum ozone over the June-July-August months are provided for all models reporting hourly data (BOLCHEM, CHIMERE, EMEP, EURAD, and OsloCTM2). For EMEP only the mean bias is given: since the CTM relies on meteorological fields from a climate free-run for this experiment, there was no scope for a synchronous comparison with observations. The low bias of BOLCHEM mentioned before appears in the median score, as does the high bias of CHIMERE, which is only compensated by a high correlation to achieve an average root mean square error. Whereas EMEP reported a behaviour similar to CHIMERE in the C2011 study, it exhibits here a negative bias attributed to the different choice of meteorological forcing.
The maps of the coefficient of variation (CV: ratio of the standard deviation of the 6 models to the ensemble mean) across the ensemble allow the spread of the models to be discussed further. For NO2 (Fig. 3a), the CV is computed from the annual average of each model, while for O3 we use the summer average. The low CV of NO2 over high-emission areas illustrates the consistency of anthropogenic emissions handling in the 6 models, which is an important strength of the ensemble. Differences are found in coastal areas because of the higher sensitivity to emission injection heights in areas where the marine boundary layer can be shallow. This sensitivity yields an important spread of O3 (Fig. 3b) over the Mediterranean region (offshore Marseille and North of Algeria).
The O3 spread is also high over the Benelux/UK region even though the models are consistent in their representation of ozone precursors such as NOx in this area. While this higher O3 spread highlights uncertainties in existing models, it also advocates for the use of such an ensemble to cover the envelope of possible behaviours.
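The scores of Table 2 and the coefficient of variation used above can be written compactly. The sketch below is illustrative only: the function names and sample values are hypothetical, and the use of the population standard deviation for the CV is an assumption, since the text does not state which estimator was used.

```python
from math import sqrt
from statistics import mean, pstdev


def mean_bias(model, obs):
    """Average of model minus observation over paired values."""
    return mean(m - o for m, o in zip(model, obs))


def rmse(model, obs):
    """Root mean square error over paired values."""
    return sqrt(mean((m - o) ** 2 for m, o in zip(model, obs)))


def correlation(model, obs):
    """Pearson correlation coefficient over paired values."""
    mm, mo = mean(model), mean(obs)
    cov = sum((m - mm) * (o - mo) for m, o in zip(model, obs))
    var_m = sum((m - mm) ** 2 for m in model)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / sqrt(var_m * var_o)


def coefficient_of_variation(values):
    """Standard deviation across ensemble members divided by their mean.

    Assumption: population standard deviation (pstdev).
    """
    return pstdev(values) / mean(values)


# Hypothetical daily maximum ozone (µg m−3) at one station.
observed = [100.0, 110.0, 120.0]
modelled = [110.0, 120.0, 130.0]
bias = mean_bias(modelled, observed)
error = rmse(modelled, observed)
corr = correlation(modelled, observed)

# Hypothetical summer mean O3 of six models in one grid cell.
cv = coefficient_of_variation([78.0, 82.0, 85.0, 80.0, 90.0, 65.0])
```

The toy station illustrates the compensation discussed for CHIMERE: a model can carry a constant positive bias and still correlate perfectly with observations, in which case the root mean square error equals the bias.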

Evolution of NO2 concentrations
The ensemble medians of simulated annual NO2 concentrations for all the emission scenarios are given in Fig. 4. The present-day situation is given in the first row according to the GEA emissions (Fig. 4a). We also include results based on the EMEP officially reported emissions (Fig. 4b). These two representations of NO2 levels for the early 21st century are very similar. The order of magnitude for background levels is consistent. But there are differences over high-emission areas, due to smoother local maxima of emissions in the GEA dataset for 2005, for example in the large urban hotspots such as the Benelux area, large Spanish cities, Paris, Milan and Krakow as well as the ship tracks. On the other hand, NO2 levels are much higher when using the GEA dataset for Helsinki and the Marseille plume in South-Eastern France. These differences in the spatial variability of the NO2 fields are attributed to the global inventories used to spatialise the base year emissions of the GEA dataset (ACCMIP, Lamarque et al., 2010).
The projections for 2030 given in the last two rows of Fig. 4 exhibit a large decrease of NO2 levels throughout Europe for both scenarios. This finding confirms the impact of the strong policy regulation of anthropogenic air pollution to be enforced during the next 20 yr. The differences compared to GEA emissions for 2005 reach almost 20 µg m−3 over high-emission areas for the 'sustainable' scenario. For the 'reference' trajectory, NO2 concentrations over the main hotspots of the larger Benelux/UK/Germany area and the Po Valley still stand out from the background. For the 'sustainable' scenario, emission reductions are such that the concentration in these hotspots does not exceed background levels. Apart from these differences in the magnitude of the change, since the downscaling algorithms are identical for all GEA scenarios, the patterns of reduction are identical for the 'reference' and 'sustainable' scenarios (Fig. 4d and f).

Evolution of O3 concentrations
The average annual O3 concentrations and changes are displayed in Fig. 5. The background fields are very similar for the present-day conditions with the GEA and EMEP emissions. Important differences are however simulated over the high-emission areas around the greater Benelux region. As mentioned before (Sect. 4.1.1), the spatial gradient of NOx emissions is lower over these large urban centres in the GEA emissions compared to the EMEP inventory (even if both emission datasets agree in terms of total mass emitted), so that the titration effect (illustrated by a local minimum of O3 when using EMEP emissions) disappears. As reported by Beekmann and Vautard (2010) and Tarasson et al. (2003), this region is in the process of becoming less saturated in NOx. But it appears that in the GEA emissions for 2005 the Benelux hotspot is less saturated in NOx than in the official EMEP inventory for the same date. Again the lower spatial variability of the GEA dataset is related to the underlying global maps used to produce the spatialisation. Important differences are also found over the Mediterranean, despite the use of identical meteorological forcing ruling out possible changes in incoming solar radiation or deposition fluxes. These differences in remote areas are thus also attributed to differences in the total mass of precursors over Western Europe that builds up as ozone after having undergone long-range transport over the sea (Kanakidou et al., 2011).
In the projections for 2030, ozone air pollution decreases over Europe. In both the 'reference' and 'sustainable' scenarios, there is a widespread decrease of O3 over the southern part of the domain. Annual means of daily ozone increase over the Benelux/UK/Germany/Northern France area as a result of a less efficient titration by NOx, which shows that the area was still saturated in NOx. It should be noted that this feature, which stands out in the ensemble median, is also captured by global models albeit with a slightly smaller magnitude (especially for OsloCTM2 that operates at a coarser resolution).
The fact that the decrease of ozone in southern Europe in the future is accompanied by an ozone increase over NOx-saturated areas by 2030 is a well documented concern in the context of air quality management in Europe, as discussed in previous papers (Amann and Lutz, 2000; Thunis et al., 2008; Szopa et al., 2006). It is noteworthy that our study confirms this conclusion when using an updated set of projections and a representative ensemble of air quality models.
The hindcast analysis of C2011 confirmed that such an increase of O3 associated with a decrease of NOx emissions over the NOx-saturated hotspot in the Benelux was found in surface observations for the past decade. Even though the trends were small in magnitude, they were significant at most urban and suburban sites. At rural sites the patterns were more variable, with more significant positive trends around the greater Benelux area than elsewhere. Because of their relatively coarse resolution, the CTMs involved in the hindcast were more successful in capturing the geographical patterns of the trends observed at rural than urban stations. Whereas increasing trends in urban areas were widespread in the observations, the models could only capture this behaviour over the main NOx-saturated area of the larger Benelux region. The joint analysis of the present projection and the published hindcast allows us to conclude that the projected increase of O3 modelled over the greater Benelux area could actually apply to urban areas beyond the Benelux region in Europe, even if it is not explicitly resolved by the models implemented here.
In addition we recall the fact that the 2005 GEA emissions are less saturated in NOx than the EMEP inventory. Consequently the upward trend seen in the results for the Benelux area can only be underestimated.
From this study, we conclude that annual mean O3 concentrations will increase over high-emission areas, this increase being probably underestimated in the maps presented here (if EMEP emissions are considered as a reference). It is essential to note however that we focus only on annual mean concentrations of O3. It should be recognized that average ozone is sensitive to the NOx titration effect that influences mainly low O3 levels. The higher quantiles of the O3 distribution will respond in a quite different manner, and decreases of ozone peaks in conjunction with an increase of the ozone mean have been reported before (Vautard et al., 2006a; Wilson et al., 2012). A thorough investigation of the evolution of the proxies that are more relevant for air quality exposure studies will be discussed in Sects. 4.2 and 4.3.

Evolution of the exposure to ozone air pollution
When focusing on the evolution of quantities that follow skewed distributions it is essential to discuss the changes in all the statistical properties of the distribution in addition to the average changes presented in Sect. 4.1. In the field of air pollution, this requirement is further supported by the non-linearity of the transformation of polluting substances (e.g. the seasonal cycle in photochemistry), the variability of the exposure (e.g. the impact of the phenology: plants being more exposed during growth phases), and the threshold effects (damaging impacts of some pollutants being negligible below some background level).
As an alternative, some authors focused on the trends of given statistical metrics (5th, 10th, 90th or 95th quantiles) (Wilson et al., 2012; Vautard et al., 2006a). In the present work we chose to use exposure proxies that are relevant for vulnerability studies and often used in a policy making context (EEA, 2009; Ellingsen et al., 2008). These proxies are designed to capture the non-linear features of the distributions that matter for exposure purposes. In our analysis, we will include five exposure indicators:
- MTDM: the mean of the ten highest daily maximum ozone concentrations (based on hourly data) over April to September, expressed in µg m−3.
- Nd120: the number of days with maximum ozone over the warning threshold of 120 µg m−3 (based on 8-h running means).
- SOMO35: the annual sum of the excess of daily maximum ozone over 35 ppbv (based on 8-h running means), expressed in µg m−3.
- AOT40c: accumulated ozone over 40 ppbv from 8 a.m. to 8 p.m. over May to July, expressed in µg m−3 h and based on hourly data.
- AOT40df: same as AOT40c but over April to September.
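The definitions above can be sketched in a few functions. This is a simplified illustration rather than the processing chain used for the ensemble: the conversion of 1 ppbv of ozone to 2 µg m−3 is an assumed round value, the 8-h running mean is restricted to a single day (windows crossing midnight are ignored), the selection of months is left to the caller, and all function names are hypothetical.

```python
PPB_TO_UGM3 = 2.0  # assumed round conversion factor for O3 near the surface


def daily_max_8h_mean(hourly):
    """Largest 8-h running mean within one day of 24 hourly values."""
    windows = (hourly[i:i + 8] for i in range(len(hourly) - 7))
    return max(sum(w) / 8.0 for w in windows)


def nd120(daily_max_8h):
    """Number of days whose daily maximum 8-h mean exceeds 120 µg m−3."""
    return sum(1 for c in daily_max_8h if c > 120.0)


def somo35(daily_max_8h):
    """Sum of the daily excess over 35 ppbv, counting only positive excess."""
    threshold = 35 * PPB_TO_UGM3
    return sum(max(0.0, c - threshold) for c in daily_max_8h)


def aot40(hourly_with_hour):
    """Accumulated hourly excess over 40 ppbv between 8 a.m. and 8 p.m.

    hourly_with_hour holds (hour_of_day, concentration) pairs; hours 8 to 19
    are taken to cover the 8 a.m. to 8 p.m. window (an assumption).
    """
    threshold = 40 * PPB_TO_UGM3
    return sum(max(0.0, c - threshold)
               for hour, c in hourly_with_hour if 8 <= hour < 20)


def mtdm(daily_max_1h):
    """Mean of the ten highest daily maxima."""
    top = sorted(daily_max_1h, reverse=True)[:10]
    return sum(top) / len(top)


# Hypothetical daily maximum 8-h means (µg m−3) for four days.
days = [60.0, 90.0, 130.0, 125.0]
exceedance_days = nd120(days)
somo = somo35(days)
```

With these four days, two exceed 120 µg m−3, while three contribute to SOMO35. This shows why threshold-based indicators respond differently from the mean and why they are sensitive to small biases near the threshold.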
Some of these metrics are particularly relevant for human health exposure (SOMO35 and Nd120). Others are designed to capture detrimental effects on vegetation (AOT40c for crops and AOT40df for deciduous forests). The current legislation in Europe (EC, 2008) defines target values for Nd120 (25 days a year averaged over 3 yr) and AOT40c (18 000 µg m−3 h averaged over 5 yr). MTDM is not a regulatory proxy but it is a good indicator of photochemical processes. For all models, the indicators were derived from the first model layer concentration, except for the EMEP model for which a downscaled 3 m concentration was provided because of the thickness of the first model layer (with centre at ca. 45 m); the downscaling methodology is described in . MOZART could not be included in this exposure assessment since only average daily fields were archived.
The results obtained from the different simulations are given in Table 3. For each indicator, we give the average over the whole of Europe (20° W, 30° E, 33° N, 65° N), as well as weighted indicators that better capture air pollution impacts in sensitive areas. At each grid point, the indicator is multiplied by a weighting function, and the weighted indicator is then aggregated over the whole domain. For AOT40, the weighting function is given by a land use database to estimate the crops (AOT40c) or forest (AOT40df) fraction in a given grid cell (unitless). For all the other indicators, the weighting is performed according to the population density (in thousands of inhabitants per grid cell), based on a projection for 2030 (United Nations, 2009). Table 3 shows that despite an increase of average ozone over a significant part of the domain (Sect. 4.1.2), the exposure to ozone pollution will decrease in the next 20 yr in Europe. All the indicators exhibit a downward trend. The improvement is systematically better for the "sustainable" trajectory. Specifically, Nd120, which is used for regulatory measures in Europe, is quite efficiently reduced. The relative change is smaller for MTDM, which corresponds approximately to the mean of the ozone concentrations above the 95th quantile (EEA, 2009). We can thus conclude that the future emission reductions will efficiently curb most ozone peaks, but the higher end of the distribution of ozone peaks will remain.
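The weighting step described above amounts to a cell-by-cell product followed by an aggregation. The following Python sketch shows one possible reading; the function name and the optional `normalize` switch (division by the sum of the weights) are assumptions, since the text does not specify whether the aggregate is a sum or a weighted mean.

```python
import numpy as np

def weighted_indicator(indicator, weight, normalize=False):
    """Aggregate a gridded indicator after multiplying by a weighting field.

    indicator, weight: arrays of identical shape (lat, lon). The weight is
    the crop or forest fraction for AOT40, the population for the others.
    normalize=True returns a weighted mean instead of a weighted sum
    (an assumption, not stated in the text).
    """
    indicator = np.asarray(indicator, dtype=float)
    weight = np.asarray(weight, dtype=float)
    total = np.sum(indicator * weight)
    return float(total / np.sum(weight)) if normalize else float(total)
```

Whatever the aggregation choice, the ratio between a scenario and the control is unaffected by the normalization as long as the same weighting field is used for both.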
Exposure-weighted proxies are more relevant for future impact studies using dose-response relationships (Holland et al., 2010). One can note that the difference between the 'reference' and 'sustainable' scenarios (indicated in the last column of the table) is larger for most population-weighted proxies. The co-benefits for air quality indirectly brought about by the climate policy are thus particularly efficient in high-exposure areas. Table 3 also provides information about the model spread given in the first three columns as the coefficient of variation. Some proxies are more robust than others. The confidence for SOMO35 is higher than for AOT40 or Nd120 that are more sensitive to threshold effects.
The spread of the relative change is lower than the spread of absolute values, as shown by the standard deviation of the relative change between 'reference', 'sustainable' and 2005. This is a very positive feature of the setup as, even if the models are scattered, their relative changes agree well for the projections. For each indicator, the standard deviation across the model ensemble is small enough to avoid an overlap between the relative change for the 'reference' and 'sustainable' scenarios (columns 4 and 5). We can thus state with confidence that the conclusions drawn above are robust and not model-dependent.
Atmos. Chem. Phys., 12, 10613-10630, 2012 www.atmos-chem-phys.net/12/10613/2012/
Table 3. Exposure indicators averaged over Europe. Modelled exposure indicators before (raw) and after (weighted) applying a weighting function designed to highlight changes in sensitive areas. For the first three columns, we provide, for each scenario, the median over the whole domain, averaged across all models delivering hourly data. The number in parentheses is the coefficient of variation (in %). The two following columns provide the ratio between scenario and reference (in %) and the standard deviation of this percentage across the ensemble (in brackets). In the last column, we give the difference between the latter two ratios.
The robustness of this assessment is further supported by a comparison with the GAINS model (Amann et al., 2011). Besides its emission and optimisation capabilities, GAINS also includes a module for impact assessment (used in the optimisation procedure). SOMO35 changes for the GEA scenarios could thus be computed with an atmospheric response model. According to GAINS, SOMO35 by 2030 would be reduced to 77.1 % and 61.3 % of 2005 levels for the "reference" and "sustainable" scenarios, respectively. These figures were 82.5 % and 57.5 % for the ensemble of CTMs, showing a very good agreement between the two very distinct types of modelling approaches. It is encouraging that GAINS, whose atmospheric-chemistry responses were derived from one model only, still produces answers consistent with the ensemble of CTMs.

Downscaled exposure to ozone
Using ensembles allows the documentation of the uncertainties associated with the models, but it does not compensate for all the biases that models carry. Besides possible uncertainties of the numerical methods, CTMs have shortcomings related to their spatial resolution, the driving meteorology, the boundary conditions, as well as anthropogenic and natural emission data. In this section, we implement a statistical downscaling technique to correct the modelled distribution over a control period and in the projections.

Probabilistic downscaling methodology
The bias correction implemented here is a probabilistic downscaling method called CDF-t, for Cumulative Distribution Function transform (Michelangeli et al., 2009). It is derived from the quantile matching technique while expanding it to take into account the changes in the shape of the distribution for the projection. Quantile matching (Déqué, 2007) builds on the knowledge of modelled and observed CDFs for a control period. The matching consists of comparing the quantiles of the two distributions and replacing each value of the modelled distribution by the value of the reference distribution that has the same probability. By scaling the quantile-quantile relationship, this method improves the whole range of the distribution and allows a better representation of values whose frequency (or probability) is systematically underestimated in the model. CDF-t expands quantile matching by taking into account the evolution of the projected distribution, whereas quantile matching relies only on present-day information. It uses the relationship between modelled and observed CDFs for a given variable during a control period as well as the change between the control and projected distribution to scale its value in the future.
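The two steps described above can be sketched in Python with empirical CDFs. This is a simplified illustration and not the implementation used in the study: the function names are hypothetical, and the composition F_obs,proj(x) = F_obs,ctrl(F_mod,ctrl^-1(F_mod,proj(x))) is our reading of the CDF-t principle of Michelangeli et al. (2009). Values are confined to the pooled range of the three samples, so the extreme tails are handled crudely.

```python
import numpy as np

def ecdf(sample, x):
    """Empirical CDF of `sample` evaluated at the points `x`."""
    s = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(s, x, side="right") / s.size

def quantile_match(model_ctrl, obs_ctrl, values):
    """Replace each modelled value by the observed value of equal probability."""
    return np.quantile(obs_ctrl, ecdf(model_ctrl, values))

def cdft(model_ctrl, obs_ctrl, model_proj, n_grid=2000):
    """Correct projected values while letting the target CDF evolve (sketch)."""
    pooled = np.concatenate([model_ctrl, obs_ctrl, model_proj])
    grid = np.linspace(pooled.min(), pooled.max(), n_grid)
    # Estimated "observed" CDF for the projection period:
    # F_obs,proj(x) = F_obs,ctrl( F_mod,ctrl^-1( F_mod,proj(x) ) )
    f_obs_proj = ecdf(obs_ctrl, np.quantile(model_ctrl, ecdf(model_proj, grid)))
    # Invert F_obs,proj on the grid, keeping the first point of each plateau.
    probs, first = np.unique(f_obs_proj, return_index=True)
    # Quantile matching between the projected model CDF and F_obs,proj.
    return np.interp(ecdf(model_proj, model_proj), probs, grid[first])
```

As a sanity check, if a model is biased high by a constant amount and its projection shifts down by 5 units, the corrected projection should resemble the observed distribution shifted down by the same 5 units, which plain quantile matching trained on the control period alone would not guarantee.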
The main underlying assumption is that the transformation remains stationary in time, which is not granted if model biases change in the future. This limitation raises specific concern for photochemical modelling since ozone formation regimes are expected to change in the future (switching from NOx saturated to NOx limited) and therefore a bias correction developed over a past period might prove less efficient in the future. Note that any bias correction technique would carry the same underlying hypothesis. An alternative in the field of climate downscaling consists of accounting for a conditionality depending on the weather regime (Vrac et al., 2007). However, this is not an option here as one would need to build upon observations at a single location where both regimes occur. For the same reason it is not possible to quantify the uncertainty. Such an assessment should be the priority for future work when long time series exhibiting photochemical regime changes become available. In the meantime, we considered that it was worth doing our utmost to minimize model biases by exploring existing statistical correction techniques that constitute a significant refinement compared to basic bias correction approaches. Specifically, the CDF-t was performed by matching the model outputs interpolated bilinearly to the location of AIRBASE stations. For each model, this matching was performed for about 1700 stations (677 urban, 505 suburban and 525 rural sites) over Europe and 10 yr of simulation. Since the goal of this scaling was to derive exposure indicators, the matching was performed for hourly model extractions in order to retrieve adequate diurnal cycles. The reference (control) period used to train the matching algorithm was the 1998-2007 decade with GEA emissions for 2005.
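The bilinear interpolation of a gridded field to a station location can be sketched as follows. This is a generic illustration assuming a regular longitude-latitude grid with increasing coordinates; the function name and the `field[lat_index, lon_index]` layout are assumptions and do not describe the tools used by the modelling groups.

```python
import numpy as np

def bilinear(field, lons, lats, lon, lat):
    """Interpolate field[lat_index, lon_index] to one station location."""
    # Index of the grid cell whose lower-left corner is (lons[i], lats[j]).
    i = int(np.clip(np.searchsorted(lons, lon) - 1, 0, len(lons) - 2))
    j = int(np.clip(np.searchsorted(lats, lat) - 1, 0, len(lats) - 2))
    # Fractional position of the station inside the cell.
    tx = (lon - lons[i]) / (lons[i + 1] - lons[i])
    ty = (lat - lats[j]) / (lats[j + 1] - lats[j])
    return ((1 - tx) * (1 - ty) * field[j, i]
            + tx * (1 - ty) * field[j, i + 1]
            + (1 - tx) * ty * field[j + 1, i]
            + tx * ty * field[j + 1, i + 1])
```

Applied hour by hour, such an extraction yields one modelled time series per station, which is the input required by the station-wise distribution matching.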
One should keep in mind that this matching is performed on a station-by-station basis. The discussions in the present section are thus heavily influenced by the spatial distribution of the monitoring network, which is far from being as spatially representative as the features discussed in Sect. 4.2.
The distributions of ozone biases for all stations before and after applying the CDF-t are given in Fig. 6a. The boxes provide the three inner quartiles and the whiskers show the points lying outside the 25th and 75th quantile plus 50 % of the interquartile distance. The distributions are based on annual values so that they contain about 17 000 points (1700 stations times 10 yr). The results of C2011 (i.e. spanning the same meteorological decade but with EMEP anthropogenic emissions) are also displayed. Overall, we find similar performances with both the official EMEP emissions and the GEA 2005 control data. However, for CHIMERE and EMEP the biases are slightly higher with GEA emissions, because of the lower NO2 levels discussed above (Sect. 4.1.1). Nevertheless, Fig. 6a confirms that the biases obtained with GEA emissions are not unusual for these models when used at coarse resolution for long-term simulations such as those performed here. The whiskers also show that a significant number of stations exhibit a very large annual bias: those are the urban stations, where the titration is very efficient and not captured at this resolution. The distribution of biases computed with the CDF-t corrected hourly ozone time series is much more satisfactory, illustrating the efficiency of this technique.

Downscaling projections of ozone exposure indicators
In Fig. 6b, we provide the distributions of SOMO35 in the AIRBASE observations over the 1998-2007 decade as well as for each model and the three scenarios (one control and two experiments). The correction technique performs very well: the distributions for the control (2005) simulations are very close to the observations. In the projections, by 2030, SOMO35 decreases very consistently for all models and all scenarios. Most of the improvement is brought about by the implementation of the current legislation on AQ policy, whereas the models show an additional improvement when accounting for the co-benefits of climate policies (i.e. moving from "reference" to "sustainable"). The magnitude of the response is however variable across models, which illustrates well the relevance of using ensemble approaches. A synthesis of Fig. 6b for all indicators is given in Table 4. It provides the exposure proxies for each indicator and each scenario, averaged over the ensemble. For each model, a given proxy is computed on an annual basis at each station, giving a total of about 17 000 estimates. For each model, we take the median of that distribution and then the mean across all models to provide a single number in Table 4. The coefficient of variation (standard deviation over all models divided by the mean, expressed in %) is also given to provide an insight into the model spread. Exposure indicators derived from the observational records are also provided in the first column for comparison purposes, as well as their coefficient of variation, which reflects the spatial variability. The similarity of these numbers with the CDF-t corrected model estimate illustrates again the efficiency of the downscaling technique. To emphasize the relevance of AOT for ecosystems we use only rural stations for AOT40c and AOT40df but all types of stations are used otherwise.
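The aggregation used to build each entry of Table 4 (median per model, then mean and coefficient of variation across models) can be sketched in Python. The function name, the dictionary layout and the choice of a population standard deviation (`ddof=0`) are assumptions; the text does not state which estimator was used.

```python
import numpy as np

def ensemble_summary(values_by_model, ddof=0):
    """Return (ensemble mean, coefficient of variation in %) of model medians.

    values_by_model maps a model name to its annual, per-station estimates
    of one indicator (about 17 000 values per model in the study).
    """
    medians = np.array([np.median(v) for v in values_by_model.values()])
    mean = medians.mean()
    cv = 100.0 * medians.std(ddof=ddof) / mean
    return float(mean), float(cv)
```

With hypothetical models "A", "B" and "C" whose medians are 10, 20 and 30, the ensemble mean is 20 and the coefficient of variation is about 41 %.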
Because the ensemble is dominated by models exhibiting a positive bias in ozone, SOMO35 was largely overestimated before applying the CDF-t. When looking at bias-corrected estimates, SOMO35 for the 'reference' and 'sustainable' scenarios drops to 58.3 and 32.3 % of the 2005 levels, respectively. These figures were about 10 points higher before applying the correction of the distribution. The spread of SOMO35 across models also drops from 30 % to 8 % for the control experiment after having applied the CDF-t.
Nd120 is also a proxy relevant for health exposure; in addition it is used for regulatory purposes. The mean of Nd120 for all models, after bias correction, is about 22.2 days per year for the control experiment. Again, this estimate is based (for each model) on the median of the distribution of Nd120 modelled at 1700 stations over 10 yr. It shows that the air quality conditions are currently very poor: the median is very close to the target value of 25 days per year, i.e. the target is not met at 43 % of the stations. Table 4 reveals that the model uncertainty on Nd120 is very high in the "reference" and "sustainable" projections, with a coefficient of variation of 118 and 141 % for the uncorrected estimate, respectively. This finding illustrates that Nd120 is more sensitive to threshold effects than SOMO35. The relative change between current and projected levels is more robust than the absolute values. We find that Nd120 will drop compared to current levels. By 2030, the target of 25 exceedances per year would be met at 92 % and 98 % of the stations for the "reference" and "sustainable" scenarios, respectively, on average across the ensemble of models.
MTDM is much more robust than Nd120 in terms of model spread, but the change in the future is much smaller too: the projections will reach about 82 to 72 % of current levels.
Table 4. Exposure indicators at the location of air quality monitoring stations. Observed and modelled ozone exposure indicators before (raw) and after (CDF-t) applying the statistical downscaling for the control (2005) and projections for 2030: "reference" and "sustainable". For the first column the observed median over all stations is given as well as the coefficient of variation across the entire network (in parentheses, %). For the following 3 columns we provide the mean over all models, the proxy for each model being the median of the distribution of each indicator at each station and for 10 yr. For AOT40c and AOT40df, only rural stations are used, whereas all types of stations are used for the other metrics. The numbers in parentheses are the coefficients of variation (in %). The last two columns provide the ratio between scenario and reference (in %), and the standard deviation of this ratio in the ensemble, in brackets.
Similarly to Nd120, the spread of AOT40 (for crops and for forests) is high. But again, the spread of the relative change is more reasonable and we find that AOT40c and AOT40df will be divided by about two to four in the "reference" and "sustainable" scenarios, respectively. As a consequence, the fraction of stations where the target of 18 000 µg m−3 h is not met will decrease from 32 % in the control experiment to 9 % and 3 % in the "reference" and "sustainable" scenarios, respectively.

To summarize the discussion on the intra-model uncertainty, we found that the change brought by the statistical downscaling was much larger for the absolute values of SOMO35, AOT40c and AOT40df than for Nd120 and MTDM. We can thus conclude that the CTMs capture peak levels more accurately than background ozone. This distinction also holds for the relative changes, except for AOT40c and AOT40df, which are less sensitive. The difference in the impact of the CDF-t correction depending on the scenario illustrates well the sensitivity of threshold-based indicators, which cannot be fully compensated by the downscaling technique. The inter-model uncertainty, as shown by the model spread in the ensemble, is high for AOT40c and AOT40df in the case of the "sustainable" scenario. It is also high for Nd120, showing that, even if the intra-model uncertainty is small on average (limited need for statistical correction), the envelope of models can be large.

Climate/Air quality cobenefits
Policy measures for the mitigation of air pollution and climate change overlap, and integrated assessments are required to assess their interlinkages. The two scenarios investigated in the present paper differ only in their representation of climate policies (the legislation regarding air pollution is identical) yet very significant differences are observed. These differences constitute what is commonly referred to as a cobenefit of climate policies for air pollution matters.
The total NOx emission in the "sustainable" scenario for 2030 is 38 % lower than in the reference for the same year (Table 1 and Sect. 2), hence a 38 % cobenefit of a sustainable climate policy for anthropogenic emissions of nitrogen oxides. According to the ensemble of CTMs, the cobenefit in terms of atmospheric concentration of NO2 is also 38 % when using NO2 fields weighted by the population. Identical figures are found because most nitrogen oxides found in the atmosphere are actually primary emissions and emissions are highly correlated to the population.
For ozone, cobenefits in terms of annual mean are very small (only 4 %) because the trends in the upper and lower parts of the distribution compensate each other. In addition, the natural background ozone is unchanged and somewhat masks the cobenefits. Using the bias-corrected exposure indicators introduced in Sect. 4.3.2 is more relevant.
The comparison of the 'sustainable' and 'reference' scenarios (Table 4) shows that the cobenefit is limited for the most extreme events (10 % for MTDM), which are heavily influenced by outstanding meteorological conditions. But the benefit is very clear regarding the detrimental impact of air pollution on human health (45 % for SOMO35). For ecosystems, the cobenefit is even larger than for the primary NOx emissions (56 % and 60 % for AOT40df and AOT40c, respectively). The cobenefit for regulatory purposes (Nd120) is also high: 78 %.
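The cobenefit figures can be reproduced from the scenario ratios with a one-line calculation. The sketch below assumes that the cobenefit is the relative reduction of the 'sustainable' value with respect to the 'reference' value, a definition inferred from the NOx example (38 % lower emissions, hence a 38 % cobenefit); the function name is hypothetical.

```python
def cobenefit(reference, sustainable):
    """Relative reduction (in %) of 'sustainable' with respect to 'reference'."""
    return 100.0 * (reference - sustainable) / reference

# Bias-corrected SOMO35 in 2030, as a percentage of 2005 levels (Sect. 4.3.2).
somo35_cobenefit = cobenefit(58.3, 32.3)
print(f"SOMO35 cobenefit: {somo35_cobenefit:.1f} %")
```

With the bias-corrected SOMO35 ratios of 58.3 % and 32.3 %, this gives about 44.6 %, consistent with the 45 % quoted above.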

Conclusion
Anticipating future air quality is a major concern and it has been the focus of many atmospheric chemistry research projects over the past decades (Amann and Lutz, 2000; van Loon et al., 2007; Stevenson et al., 2006; Szopa et al., 2006; Tuinstra, 2007). We present the results of a multi-model exercise aimed at addressing this issue for Europe. Our analysis is based on an ensemble of air quality models covering both regional and global spatial scales that are implemented in a coordinated manner for future projections of anthropogenic emissions at the 2030 horizon. The two scenarios explored were developed in the framework of the Global Energy Assessment. The focus is on climate cobenefits for air quality: the scenarios include identical measures for air quality legislation but they differ in terms of climate policy. One of the scenarios is a baseline, while the other aims at limiting global warming to 2 °C by the end of the century. The analysis is based on multi-annual simulations investigated with downscaling techniques that are novel to the field of air quality modelling in order to assess exposure changes. The discussion of uncertainties in intra-model biases (using a statistical bias correction) and in inter-model spread (investigating the ensemble variability) increases the robustness of the conclusions. By 2030, total NOx emissions in Europe are reduced to about half of their current (2005) levels in the scenario that includes air quality policies but no measures to mitigate climate change. When stringent climate policies are included, NOx emissions in 2030 are decreased to a third of present-day levels.
As a result, ozone decreases substantially throughout the domain, even though an increase is found for the mean annual ozone over areas currently saturated in NOx. However, we also demonstrate that this change of annual mean ozone is not representative of exposure to ozone pollution. Air quality indicators specifically designed to capture the fraction of the ozone distribution that is detrimental to human health (SOMO35, Nd120) or vegetation (AOT40c, AOT40df) are efficiently reduced for both scenarios by 2030.
By 2030, SOMO35 levels (average over Europe) will reach about 80 to 55 % of their current value. These changes are quite consistent across the ensemble (inter-model uncertainty). Furthermore, the estimates of SOMO35 obtained with the GAINS model (which are derived from statistical fits to an older version of the EMEP model) also give similar figures. This consistency gives confidence in the use of the GAINS model for assessing policy-relevant changes in Europe. Using a statistical correction of the distribution at the location of monitoring stations shows that the relative change of SOMO35 is sensitive to the biases of the models, which contradicts the common argument that the impact of model biases is minimised when looking at relative trends. We estimate the relative change to be underestimated by about 10 % with the uncorrected model output.
Consequently, average SOMO35 levels in Europe in 2030 would be 70 % to 45 % of current values for the 'reference' and 'sustainable' scenarios, respectively.
As far as the relative change of AOT40 is concerned, the indicator of crop exposure to detrimental ozone levels (AOT40c) is estimated to reach about 60 % and 25 % of its present level in 2030 for the "reference" and "sustainable" scenarios, respectively. The projections for AOT40c are expected to meet the current target (18 000 µg m−3 h) for the vast majority of stations but the long-term objective (6000 µg m−3 h) will likely not be met over most of Europe if climate policies are not enforced. These estimates are robust both in terms of model spread (ensemble) and model uncertainty (difference between the raw and distribution-corrected estimates). Absolute estimates of AOT40 are very sensitive to the statistical correction but it appears that AOT40 relative changes are less sensitive to model biases.
According to the current European legislation, maximum daily ozone should not exceed 120 µg m−3 more than 25 days a year. In the control simulation representing current conditions, this limit is exceeded at 43 % of the monitoring stations. These fairly poor air quality conditions are consistent with the air quality assessments of the European Environment Agency (EEA, 2011). The projections point towards a decrease of this indicator. The fraction of European population exposed to exceedance of the 120 µg m−3 limit value (derived from the population-weighted indicator) will decrease substantially, by 55 to 85 % in the "reference" and "sustainable" scenarios, respectively. The estimate of the relative change is robust in terms of inter-model spread (across the ensemble) as well as intra-model uncertainty (low sensitivity to the statistical downscaling). On average, 92 % to 98 % of the stations will comply by 2030 for the "reference" and "sustainable" scenarios, respectively.
The present study opens the way for more comprehensive assessments of future air quality. Including the impact of climate on air quality (Meleux et al., 2007;Szopa et al., 2006;Stevenson et al., 2006;Katragkou et al., 2011) is the focus of several on-going studies. The full coupling might follow although uncertainties in the indirect impact of aerosols remain large and without this factor, two-way feedbacks are limited compared to the one-way influence of climate on air quality (Raes et al., 2010).
Investigating the respective contributions of long-range transport and local air quality management is also important to support decision making (Dentener et al., 2005; Szopa et al., 2006; Stevenson et al., 2006; Katragkou et al., 2010). The ensemble of CTMs in the present paper included two global models that exhibited similar trends, suggesting that the local effect dominates. But this statement certainly needs to be refined.
Last, we illustrated the relevance of implementing statistical downscaling techniques for air quality purposes. Whereas such approaches are commonplace in climate studies (e.g. for assessment of future wind energy potential), they are under-exploited in the air quality community. The sensitivity of these techniques in the context of changing chemical regimes nevertheless remains an important topic for future work.