Methodology for evaluating lateral boundary conditions in the regional chemical transport model MATCH (v5.5.0) using combined satellite and ground-based observations

Hemispheric transport of air pollutants can have a significant impact on regional air quality, as well as on the effect of air pollutants on regional climate. An accurate representation of hemispheric transport in regional chemical transport models (CTMs) depends on the specification of the lateral boundary conditions (LBCs). This study focuses on the methodology for evaluating LBCs of two moderately long-lived trace gases, carbon monoxide (CO) and ozone (O3), for the European model domain and over a 7-year period, 2006-2012. The method is based on combining the use of satellite observations at the lateral boundary with the use of both satellite and in situ ground observations within the model domain. The LBCs are generated by the global European Monitoring and Evaluation Programme Meteorological Synthesizing Centre – West (EMEP MSC-W) model; they are evaluated at the lateral boundaries by comparison with satellite observations of the Terra-MOPITT (Measurements Of Pollution In The Troposphere) sensor (CO) and the Aura-OMI (Ozone Monitoring Instrument) sensor (O3). The LBCs from the global model lie well within the satellite uncertainties for both CO and O3. The biases increase below 700 hPa for both species. However, the satellite retrievals below this height are strongly influenced by the a priori data; hence, they are less reliable than at, e.g. 500 hPa. CO is, on average, underestimated by the global model, while O3 tends to be overestimated during winter and underestimated during summer. A regional CTM is run with (a) the validated monthly climatological LBCs from the global model; (b) dynamical LBCs from the global model; and (c) constant LBCs based on in situ ground observations near the domain boundary. The results are validated against independent satellite retrievals from the Aqua-AIRS (Atmospheric InfraRed Sounder) sensor at 500 hPa, and against in situ ground observations from the Global Atmosphere Watch (GAW) network.
It is found that (i) the use of LBCs from the global model gives reliable in-domain results for O3 and CO at 500 hPa. Taking AIRS retrievals as a reference, the use of these LBCs substantially improves spatial pattern correlations in the free troposphere as compared to results obtained with fixed LBCs based on ground observations. Also, the magnitude of the bias is reduced by the new LBCs for both trace gases. This demonstrates that the validation methodology based on using satellite observations at the domain boundary is sufficiently robust in the free troposphere. (ii) The impact of the LBCs on ground concentrations is significant only at locations in close proximity to the domain boundary. As the satellite data near the ground mainly reflect the a priori estimate used in the retrieval procedure, they are of little use for evaluating the effect of LBCs on ground concentrations. Rather, the evaluation of ground-level concentrations needs to rely on in situ ground observations. (iii) The improvements of dynamic over climatological LBCs become most apparent when using accumulated ozone over threshold 40 ppb (AOT40) as a metric. Also, when focusing on ground observations taken near the inflow boundary of the model domain, one finds that the use of dynamical LBCs yields a more accurate representation of the seasonal variation, as well as of the variability of the trace gas concentrations on shorter timescales. Published by Copernicus Publications on behalf of the European Geosciences Union. 3748 E. Andersson et al.: Methodology for evaluating lateral boundary conditions

Introduction
Hemispheric transport of aerosols and trace gases receives increasing attention owing to its impact on air quality, climate and visibility. Several recent studies have focused on hemispheric transport and, related to that, the significance of lateral boundary conditions (LBCs) in regional chemical transport modelling. The growing interest in hemispheric transport has partly been prompted by an increase in the average amount of pollution that is transported over hemispheric scales (e.g. Fiore et al., 2009). Hemispheric transport can also have a strong episodic impact on regional air quality (Fiore et al., 2002; Oltmans et al., 2006). Observational data from satellites of various tracers, such as carbon monoxide (CO) (Heald et al., 2003) and ozone (O3) (Zhang et al., 2008), corroborate that hemispheric transport of air pollutants can be important for regional and local air quality. Further, while air quality studies traditionally have a strong focus on near-surface concentration fields, climate effects of air pollution involve aerosols and radiatively active trace gases throughout the atmospheric column. Concentration fields aloft are typically even more strongly influenced by long-range transport than near-surface concentrations. Thus, in modelling systems that couple regional climate and regional air quality models (e.g. Thomas et al., 2015), hemispheric transport of pollutants is likely to play an important role.
In regional models, the impact of hemispheric transport is described by LBCs in the inflow region. The significance of LBCs for regional air quality modelling has been analysed by several investigators (e.g. Mathur, 2008; Rudich et al., 2008; Song et al., 2008). In general, the impact of LBCs on in-domain concentration fields can be quite significant; it increases with species lifetime and decreases with the transport time from the domain boundary. For instance, Barna and Knipping (2006) studied sulfate concentrations in a regional model covering the USA and Mexico; it was found that, depending on meteorological conditions and on the choice of boundary conditions, between 4 and 25 % of the sulfate concentration at the surface and at a location far away from the boundaries can be attributed to particulate sulfate or sulfur dioxide precursors entering the model domain at the boundaries. Jiménez et al. (2007) found that ground-level ozone concentrations on the Iberian Peninsula are strongly influenced by the boundary conditions of both ozone and ozone precursors. It has also been pointed out that ecologically sensitive regions can be particularly susceptible to negative impacts of air pollution (e.g. Pour-Biazar et al., 2010); in such cases the role of hemispheric air pollution transport can be even more significant.
Traditionally, regional models have often relied on prescribed boundary conditions that do not adequately capture temporal and spatial variations. This approach can be particularly problematic during episodes of elevated emissions outside the model domain, such as dust-storm episodes, volcanic eruptions, or forest fires, that are transported across the domain boundary. While global models do not face challenges related to lateral boundary conditions, they are often too coarse for investigating, e.g. regional air quality standard attainment. However, they can be used to provide boundary conditions for regional air quality models that can introduce improvements over fixed boundary conditions. For example, Tang et al. (2007) compared temporally and spatially varying boundary conditions to either time-averaged or time- and horizontally averaged boundary conditions; the regional model was run over the continental USA as well as over a smaller sub-domain with a finer resolution. The dynamic boundary conditions yielded the best correlation with aircraft observations of O3 and CO concentrations, especially in the high-resolution model.
A direct evaluation of the boundary conditions is often complicated by the sparsity of observational data. For this reason, one often performs an indirect evaluation by comparing model results within the computational domain with observations. Tang et al. (2009) investigated the benefit of using dynamic boundary conditions derived from either ozonesonde observations or from global models for forecasting ozone concentrations in the continental USA. The results confirmed that the boundary conditions can have a strong impact on simulated ozone concentrations near the surface and aloft, especially near the inflow boundary. Further, while the use of dynamic boundary conditions from global models can improve correlations between predicted and measured surface ozone, this approach can also contribute to an increased model bias.
A common problem in the evaluation of boundary conditions by comparing model results to in-domain observations is to disentangle the impact of boundary conditions from all other parameters and processes that influence the model results. Satellite observations offer a good spatio-temporal coverage, thus allowing one to evaluate boundary conditions directly at the boundary. This approach has been chosen by, e.g. Henderson et al. (2014), who investigated lateral boundary conditions of ozone and carbon monoxide in a regional air quality model for the continental USA. Pfister et al. (2011) made combined use of measurements from aircraft, ozone sondes, and observations of CO and O3 from the TES (Tropospheric Emission Spectrometer) instrument onboard NASA's Aura satellite, as well as modelling results from the global model MOZART-4. The study focused on the inflow of air pollution into California during the summer months. The authors found that the global model was able to reproduce about half of the free tropospheric variability when confronted with observational data. When used as LBCs in a regional model, the variability in the pollution inflow strongly impacted the surface concentrations of CO and O3 over California. In their conclusions the authors identify the evaluation of LBCs in regional models as one of the essential elements in regional model validation studies. They found it essential to evaluate both the spatio-temporally averaged background fields and the spatial and temporal variation of the LBCs. However, they also concluded that, owing to high computational demands, nesting of global/hemispheric models with regional models may not be practicable in all types of applications.
An alternative to the use of global models is to derive LBCs from satellite observations. For instance, Pour-Biazar et al. (2011) employed ozone observations from the OMI (Ozone Monitoring Instrument) sensor onboard NASA's Aura satellite as well as aerosol optical depth obtained from MODIS onboard NASA's Terra and Aqua satellites to produce LBCs for the regional air quality model CMAQ (Community Multiscale Air Quality), which was run for the continental USA. The analysis showed significant improvements for O3 concentrations in the free troposphere and for PM2.5 in the boundary layer.
This study aims to evaluate LBCs from a global chemical transport model (CTM), the European Monitoring and Evaluation Programme (EMEP) MSC-W model, by comparing them with satellite retrievals and by investigating the impact of the LBCs when they are implemented in a regional CTM, the MATCH (Multi-scale Atmospheric Transport and CHemistry) model developed by the Swedish Meteorological and Hydrological Institute. A large number of regional models are being used in Europe, so a robust methodology for the evaluation of lateral boundary fields for the European model domain has a potentially large user community. A similar direct evaluation, on domain boundaries, was conducted by Henderson et al. (2014) over the North American domain, but no similar study has, to our knowledge, been done over the European domain.
The evaluation methodology is divided into two major parts. First, the LBCs from the global EMEP model are directly compared, at the lateral boundaries of the European model domain, with satellite retrievals from the Terra-MOPITT (Measurements Of Pollution In The Troposphere) and Aura-OMI instruments. Second, the study investigates the impact of LBCs on regional concentration fields by applying the LBCs from the global CTM to the regional CTM, MATCH. The MATCH model results are compared to satellite retrievals from the Aqua-AIRS (Atmospheric InfraRed Sounder) instrument as well as to ground-based measurements. Taking the LBCs from the global CTM has the benefit that the impacts of dynamical and climatological LBCs can be compared, which would not be possible with satellite retrievals owing to their time resolution. The latter part of the evaluation is done by addressing the following questions: (i) How strongly are concentrations near the surface influenced by the LBCs? (ii) How are the concentrations influenced aloft in the troposphere at 500 hPa? (iii) What are the benefits of using dynamic vs. climatological LBCs?
The outline of this paper is as follows. Section 2 presents the models and the observations together with a more detailed description of the methodology of this evaluation, which is the main focus of this paper. Section 3 shows the results from the evaluation processes, and concluding remarks are given in Sect. 4.

The EMEP model
The European Monitoring and Evaluation Programme Meteorological Synthesizing Centre – West (EMEP MSC-W) chemical transport model has been developed for the EMEP at the MSC-W (see www.emep.int). The EMEP model has been specifically developed to support policy work of the Convention on Long-range Transboundary Air Pollution (CLRTAP; http://www.unece.org/env/lrtap/lrtap_h1.html). EMEP model results have played an important part in the development of emission reduction scenarios, for both the convention (now comprising 51 Parties, including USA and Canada) and increasingly for the European Commission (Amann et al., 2011; Simpson, 2013).
The EMEP MSC-W model (rv4.5(svn2868)) has been described in detail by Simpson et al. (2012) (with updates in Simpson et al., 2013; Tsyro et al., 2014). Although traditionally run on the European scale with grid sizes of around 30-50 km (Jonson et al., 2006; Simpson et al., 2006b; Fagerli and Aas, 2008; Bergström et al., 2012), the model is increasingly being used for smaller-scale applications, e.g. 2-5 km grids over the UK and Norway (Vieno et al., 2010; Karl et al., 2014), or globally (Sanderson et al., 2008; Jonson et al., 2010). In standard usage, the EMEP model has 20 vertical layers extending from the ground to 100 hPa (about 16 km), using terrain-following coordinates. The lowest layer has a depth of about 90 m. Meteorological data are taken from the European Centre for Medium-Range Weather Forecasts Integrated Forecasting System (ECMWF-IFS) model (http://www.ecmwf.int/en/research/modelling-and-prediction).
EMEP model results for O3 from the global model version were compared with ozone-sonde data by Jonson et al. (2010) and found to reproduce observed values. Model results for CO, generated from both the European and global-scale runs, have been compared with column data from Fourier transform infrared (FTIR) measurements at six sites (Angelbratt et al., 2011). Comparisons were complicated by the 100 hPa limit of the EMEP model and the fact that some of the stations were high-altitude sites (hence sometimes above the planetary boundary layer), but mean CO concentrations were captured within 10-22 % by the European-scale model, and within 1-9 % by the global model. Further description and model runs of the global EMEP model can be found in the EMEP Status Report (Fagerli et al., 2014).
For the present study, the EMEP model has been run on a global scale, with a horizontal resolution of 1° × 1° latitude/longitude. Concentrations of CO, O3, and other components have been exported every 3 h for use as LBCs to MATCH.

The MATCH model
MATCH is a three-dimensional, Eulerian offline model that has been developed at the Swedish Meteorological and Hydrological Institute (SMHI). It is highly flexible and can be used for different scenarios, regions, and scales. The modelling system includes a three-dimensional variational data assimilation module (Kahnert, 2008, 2009) and an aerosol dynamics model (Kokkola et al., 2008; Andersson et al., 2015). Studies have been performed at both urban scales (Gidhagen et al., 2012) and regional scales (Andersson et al., 2006). As with the EMEP model, it is also part of the core services in the European air quality ensemble forecasting system that has been developed in the EU FP7-project Monitoring Atmospheric Composition and Climate (MACC) (http://www.gmes-atmosphere.eu/about/project_structure/regional/r_ens/). For full descriptions of the model, see Robertson et al. (1999) and Andersson et al. (2015).
In this study we use the developmental top version of MATCH, which is based on the latest version 5.5.0. The MATCH model is set up over Europe, covering a range of 35° of longitude and 43° of latitude in a rotated lat-long grid, with a horizontal resolution of 1° × 1°. The model has 40 vertical hybrid η layers ranging from the surface up to about 13 hPa. These η levels vary at each grid point to better follow the topography. The meteorological input data are read every 3 h and interpolated to hourly fields. Here, data from the numerical weather prediction model HIRLAM (HIgh-Resolution Limited-Area Model) (Undén et al., 2002) are used, where analysed data are available every 6 h, and forecast data are available every 3 h.
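The temporal interpolation of the meteorological input can be illustrated with a minimal sketch. It assumes plain linear interpolation in time between consecutive 3-hourly fields; the text does not state which scheme MATCH uses, and the function name `interpolate_to_hourly` is hypothetical.

```python
import numpy as np

def interpolate_to_hourly(fields_3h):
    """Linearly interpolate 3-hourly fields to hourly resolution.

    fields_3h: array of shape (n_times, ...) valid at 0, 3, 6, ... hours.
    Returns an array of shape (3 * (n_times - 1) + 1, ...).
    Linear interpolation is an assumption made for illustration only.
    """
    f = np.asarray(fields_3h, dtype=float)
    t_in = 3.0 * np.arange(f.shape[0])      # input times in hours
    t_out = np.arange(int(t_in[-1]) + 1)    # hourly output times
    flat = f.reshape(f.shape[0], -1)        # one column per grid point
    out = np.stack(
        [np.interp(t_out, t_in, flat[:, j]) for j in range(flat.shape[1])],
        axis=1,
    )
    return out.reshape((t_out.size,) + f.shape[1:])
```

For example, three input fields valid at 0, 3, and 6 h yield seven hourly fields, with the original fields reproduced exactly at the input times.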
Boundary conditions can be specified in MATCH in three different ways. The simplest option is to specify fixed lateral values at the western, eastern, northern, southern, and top boundaries. The second option is to specify vertical boundary profiles at discrete latitudes, with the profiles at intermediate latitudes derived by linear interpolation. The third option is to read in gridded boundary fields; one can either use dynamic boundary conditions from a large-scale model, or some climatology based on time-averaged model results or on satellite retrievals. Previously, the first two options have predominantly been used, but in this study the constant lateral boundaries for CO and O3 are replaced by and compared to dynamic and climatological lateral boundary fields from the global EMEP model. The climatological LBCs, which are used in Sect. 3.1, consist of a monthly climatology based on data from 2006 to 2012, whereas the dynamic LBCs correspond to 3-hourly data for the same period. These two different runs will henceforth be referred to as ELBCc and ELBCd, respectively. The original LBCs (henceforth referred to as ORIG) are monthly and seasonally varying boundary conditions, which are partially based on large-scale model runs reported in Näs et al. (2003) and back-trajectory analysed measurements from 1999 and EMEP stations close to the model-domain boundaries (Solberg et al., 2005). All ORIG LBCs are described and tabulated in Andersson et al. (2006).
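The difference between the climatological and the dynamic boundary fields can be made concrete with a short sketch. It assumes that the monthly climatology is a plain average of all 3-hourly boundary fields falling in the same calendar month over the whole period; the exact averaging used for ELBCc is not specified in the text, and the function names are hypothetical.

```python
import numpy as np

def monthly_climatology(values, months):
    """Average a time series of boundary fields by calendar month.

    values: array of shape (n_times, ...) with boundary concentrations
    months: integer array of shape (n_times,), calendar month 1..12
    Returns an array of shape (12, ...); months without data are NaN.
    """
    values = np.asarray(values, dtype=float)
    months = np.asarray(months)
    clim = np.full((12,) + values.shape[1:], np.nan)
    for m in range(1, 13):
        sel = months == m
        if sel.any():
            clim[m - 1] = values[sel].mean(axis=0)
    return clim

def climatological_lbc(clim, months):
    """Expand the 12-month climatology back onto a model time axis."""
    return clim[np.asarray(months) - 1]
```

In this picture, the dynamic run reads `values` directly at each 3-hourly step, whereas the climatological run reads `climatological_lbc(clim, months)`, which retains the seasonal cycle but removes all variability on shorter timescales and between years.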

Satellite retrievals
Following the work by Henderson et al. (2014) for the North American region, where the global CTM GEOS-Chem was used to generate boundary conditions, we evaluate LBCs for the European model domain by comparison with satellite retrievals. The evaluation is done by collocating and extracting grid cells corresponding to the regional boundaries surrounding Europe (see Fig. 1); time averages are created for the period 2006-2012.
The O3 model data are compared with satellite retrievals from the OMI sensor onboard the Aura satellite. The OMI sensor uses two wavelength channels to retrieve an ozone partial column profile, OMI UV1 (270.0-308.5 nm) and the first part of OMI UV2 (311.5-330.0 nm), where the longer wavelengths at 330 nm are more affected by the changes of ozone in the troposphere (Kroon et al., 2011). The retrieval algorithm is based on the optimal estimation method (Rodgers, 2000); for a full description of the retrievals see Bhartia (2002). The OMI data used in this evaluation correspond to level 2 data, version 3 (OMO3PR), for the whole period of 2006-2012. The data are filtered by requiring all processing quality flags to be zero; see the user guide for ozone products (http://disc.sci.gsfc.nasa.gov/Aura/additional/documentation/README.OMI_DUG.pdf). During the evaluation period, the OMI instrument suffered from three different row anomalies, the first one starting on 25 June 2007, the second one starting on 11 May 2008, and the third one starting on 24 January 2009. These anomalies affect all wavelengths at certain viewing angles of OMI, but they are filtered out by requiring the variable "ReflectanceCostFunction" to be less than 30 (Kroon et al., 2011; J. F. de Haan, personal communication at Koninklijk Nederlands Meteorologisch Instituut, KNMI, 2015). The data were downloaded from the online archive (ftp://aurapar2u.ecs.nasa.gov).
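The two filtering criteria described above can be sketched as a boolean mask. The sketch assumes that the quality flags and the "ReflectanceCostFunction" values have already been read into arrays with one entry per retrieval; the function and argument names are hypothetical and do not correspond to any particular reader library.

```python
import numpy as np

def omi_valid_mask(quality_flags, reflectance_cost, max_cost=30.0):
    """Return True for retrievals that pass both filtering criteria.

    quality_flags:    integer processing quality flags (0 means no flag set)
    reflectance_cost: values of the "ReflectanceCostFunction" variable
    max_cost:         threshold used to screen out the row anomalies
    Missing cost values (NaN) fail the comparison and are rejected.
    """
    flags = np.asarray(quality_flags)
    cost = np.asarray(reflectance_cost, dtype=float)
    return (flags == 0) & (cost < max_cost)
```

The mask can then be used to index the retrieved profiles before they are collocated with the model boundary cells.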
The CO model data are evaluated against satellite retrievals from the MOPITT instrument onboard the Terra satellite. MOPITT detects CO by gas correlation radiometry and retrieves the data by a differential absorption method in two infrared spectral bands. The full description of the retrieval algorithm is found in Deeter et al. (2003). The retrieved MOPITT data used in this study correspond to level 3 version 6, using both the thermal and near-infrared spectral bands (MOP03JM), with no additional filtering. MOPITT data for August and September of 2009 were not available and were therefore not evaluated.
When comparing the vertical distribution of model data with satellite retrievals it is important to let the model data undergo the same degree of smoothing and get the same a priori and averaging kernel dependence as the satellite retrievals. This is done by applying Eq. (1), which is taken from the MOPITT product user's guide (Deeter, 2009, 2013):

$$\hat{y}_{\mathrm{rtv}} = y_{\mathrm{a}} + A \left( y_{\mathrm{m}} - y_{\mathrm{a}} \right), \qquad (1)$$

where $\hat{y}_{\mathrm{rtv}}$ corresponds to retrieved or smoothed data, $y_{\mathrm{a}}$ is the a priori profile that is used to constrain the retrievals to fall within physically realistic solutions, $A$ is the averaging kernel, and $y_{\mathrm{m}}$ is the original prediction, in this case the EMEP model data. It is important to note that the averaging kernel in MOPITT is used for logarithmic concentration fields, i.e. $\log_{10}(\mathrm{VMR})$ (volume mixing ratio). This expression can also be used for the OMI data, but with the difference of using the natural logarithm (Kroon et al., 2011). The smoothing error, which is added to the model data through Eq. (1), is associated with the shape and magnitude of the measurement weighting functions and diminishes either when the averaging kernels go towards delta functions or when the difference between $y_{\mathrm{a}}$ and $y_{\mathrm{m}}$ gets smaller (Deeter et al., 2012).
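The smoothing step can be sketched in a few lines: the model profile and the a priori profile are converted to logarithmic space, the averaging-kernel-weighted difference between them is added to the a priori, and the result is converted back to mixing ratios. The sketch assumes that all three inputs are already given on the same retrieval levels; the function name and the `log_base` argument are hypothetical.

```python
import numpy as np

def smooth_profile(y_model, y_apriori, avg_kernel, log_base=10.0):
    """Apply y_rtv = y_a + A (y_m - y_a) in logarithmic space.

    y_model, y_apriori: volume mixing ratios on the retrieval levels
    avg_kernel:         averaging kernel matrix A (n_levels x n_levels)
    log_base:           10 for log10(VMR) (MOPITT), np.e for ln(VMR) (OMI)
    Returns the smoothed model profile as volume mixing ratios.
    """
    scale = np.log(log_base)
    y_m = np.log(np.asarray(y_model, dtype=float)) / scale
    y_a = np.log(np.asarray(y_apriori, dtype=float)) / scale
    A = np.asarray(avg_kernel, dtype=float)
    y_rtv = y_a + A @ (y_m - y_a)
    return log_base ** y_rtv
```

Two limiting cases illustrate the behaviour: with an identity averaging kernel the model profile is returned unchanged, and with a zero averaging kernel the a priori profile is returned, which corresponds to altitudes at which the instrument has no sensitivity.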
In the analysis, two months are chosen, January and August, to represent the winter and summer seasons and the periods of low and high ozone levels. The statistical metrics used throughout this study are the bias and the correlation, according to Eqs. (2) and (3).

$$\mathrm{bias} = \frac{1}{N} \sum_{i=1}^{N} \left( x_{\mathrm{m},i} - x_{\mathrm{o},i} \right), \qquad (2)$$

$$\mathrm{corr} = \frac{1}{N} \sum_{i=1}^{N} \frac{\left( x_{\mathrm{m},i} - x_{\mathrm{m,avg}} \right) \left( x_{\mathrm{o},i} - x_{\mathrm{o,avg}} \right)}{\sigma_{\mathrm{m}} \sigma_{\mathrm{o}}}, \qquad (3)$$

where $N$ is the number of data points, $x_{\mathrm{m}}$ corresponds to the model data, $x_{\mathrm{o}}$ to the measurement data, and $x_{\mathrm{avg}}$ and $\sigma$ are the arithmetic mean and standard deviation of each data set, respectively.

Retrievals from the AIRS sensor onboard the Aqua satellite are employed for validating MATCH results for O3 and CO computed with different sets of LBCs. The AIRS sensor provides several physical retrievals, among them the trace gases used in this study, CO and O3. AIRS is a hyperspectral instrument that sounds in the thermal spectrum and provides the longest record (since 2002) of the profiles of these gases retrieved simultaneously from the same sensor (Chahine et al., 2006). Over the last decade, retrieval algorithms have been continuously improved and validated. The uncertainties and sensitivities are also better understood and documented (Divakarla, 2008; Fetzer, 2006; Xiong et al., 2008; McMillan et al., 2011; Warner et al., 2013). We used the monthly level 3 data (1° × 1° resolution), which best suit our purpose of evaluation (i.e. investigating the large-scale statistics), from the most recent version 6 release of the products (AIRS Science Team, 2013; Tian, 2013). Thomas and Devasthale (2014) and Devasthale and Thomas (2012) have previously demonstrated the usefulness of AIRS level 3 data in investigating the large-scale variability of CO over the northernmost part of the study area. All satellite retrievals from OMI, MOPITT, and AIRS were downloaded from NASA's REVERB website (http://reverb.echo.nasa.gov/reverb).
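The bias and correlation metrics used throughout the evaluation can be sketched as follows. The sketch assumes the standard mean bias and Pearson correlation, with population standard deviations; the relative bias in percent is an additional assumption, included only because several biases are quoted in percent, and its normalisation by the observed mean is hypothetical.

```python
import numpy as np

def bias(x_m, x_o):
    """Mean difference between model data x_m and measurement data x_o."""
    x_m = np.asarray(x_m, dtype=float)
    x_o = np.asarray(x_o, dtype=float)
    return float(np.mean(x_m - x_o))

def relative_bias_percent(x_m, x_o):
    """Mean bias normalised by the observed mean (assumed normalisation)."""
    return 100.0 * bias(x_m, x_o) / float(np.mean(x_o))

def correlation(x_m, x_o):
    """Pearson correlation using arithmetic means and standard deviations."""
    x_m = np.asarray(x_m, dtype=float)
    x_o = np.asarray(x_o, dtype=float)
    cov = np.mean((x_m - x_m.mean()) * (x_o - x_o.mean()))
    return float(cov / (x_m.std() * x_o.std()))
```

A model that reproduces the shape of the observations but at half their magnitude has a correlation of 1 and a relative bias of -50 %, which shows why the two metrics are reported side by side.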

In situ ground observations
The ground stations used in this study are summarised in Table 1 and Fig. 1. All the stations, except one, have hourly data for at least 6 out of the 7 years 2006-2012, for both CO and O3. The station that does not have hourly data is the Irish station Mace Head, which has continuous event data for CO between 2006 and 2012. All measurement data were downloaded from the Global Atmosphere Watch – World Data Centre for Greenhouse Gases (GAW-WDCGG) website (http://ds.data.jma.go.jp/gmd/wdcgg/wdcgg.html).

Evaluation of lateral boundary conditions
In Fig. 2, the maximum ranges within which the satellite retrievals and the smoothed model results vary are shown as red and grey shaded areas, respectively. The red horizontal lines represent the uncertainties of the satellite observations. As expected, the smoothed model results always lie between the original model results and the a priori estimate. Around 500 hPa, the smoothing procedure produces results that are least influenced by the a priori; above that altitude and, even more so, near the surface, the smoothed model results follow the a priori more closely. This reflects the fact that the averaging kernel peaks in the mid-troposphere, where the satellite observations are most sensitive (Deeter et al., 2007).
The comparison of smoothed model results in Fig. 2 is complemented by Fig. 3, which illustrates the bias ranges at each altitude bin. The rows and columns are as in Fig. 2. The blue boxes represent the interquartile range, the black dotted lines show the range of bias at each altitude, and the blue line inside each box represents the median. The vertical lines represent the ±10 and ±30 % bias ranges. The most important fact we learn from Figs. 2 and 3 is that the simulated CO lies well within the uncertainties of the satellite retrievals at all four boundaries and months. Figure 3 confirms the relatively small difference between the EMEP model and the MOPITT data. The largest differences between EMEP and MOPITT appear below 700 hPa, with an average bias of −17 %. At altitudes in the range of 700-400 hPa, the agreement between the smoothed model results and the MOPITT retrievals tends to be least biased. As pointed out earlier, this is also the range where the instrument is most sensitive. Hence the smoothing procedure produces results in this altitude range that rely least on the a priori estimate.
Ozone concentrations retrieved from the OMI instrument and computed by the EMEP model are shown in Fig. 4, and Fig. 5 presents the corresponding bias ranges. The smoothed EMEP results for ozone lie well within the uncertainties of the satellite retrievals, except at around 800 hPa at the southern boundary in January. Here, Fig. 5 shows a correspondingly large bias. In January there is a general overestimation of O3 by EMEP with an average bias of 7 %, whereas the August months show an average underestimation of about 5 %.
In this comparison, it is important to keep in mind the effect of the smoothing procedure. As pointed out earlier, this approach ensures that the comparison of model results and satellite retrievals is self-consistent. This is particularly important at those altitudes at which the instrument is least sensitive, which usually includes the altitude range near the surface. However, self-consistency alone does not guarantee the reliability of the validation. In the mid-troposphere, where the instruments tend to be most sensitive, the smoothing procedure alters the model results only little, and the satellite retrievals are mostly influenced by the measured signal rather than by the a priori estimate. Thus the comparison can be expected to provide us with a reliable model validation procedure at these altitudes. By contrast, near the surface both the satellite retrievals and the smoothed model results are strongly influenced by the a priori estimate. It is, therefore, by no means obvious that the model validations presented in Figs. 2-5 allow us to conclude much about the reliability of the EMEP LBCs near the surface.

To learn more about the effect of boundary fields on in-domain concentrations, we continue the investigation with an indirect validation of the model-derived LBCs. To this end, we force the regional MATCH model with the EMEP LBCs and compare the results to independent satellite retrievals from the Aqua-AIRS instrument in the free troposphere, and to in-domain ground concentrations from the GAW network. One important question in this comparison is to what extent the validation procedure we performed for the EMEP LBCs can be relied upon in the free troposphere and near the surface. We also force the MATCH model with its originally used boundary conditions (ORIG), which are based on combined use of a global model climatology and on ground observations near the lateral boundaries. We compare the independent observations to MATCH results obtained with ORIG as well as with dynamic (ELBCd) and climatological (ELBCc) LBCs from the global EMEP model to assess possible improvements achieved with the validated EMEP boundary conditions, and to assess the possible benefits of using dynamic rather than climatological LBCs.

Evaluation of MATCH results near the surface
To better understand the impact of the new LBCs at the surface, and to find out what possible benefits there might be in using dynamical vs. climatological LBCs, the MATCH model runs are compared at the lowest model layer with each other and to the ground observations. The choice of the model level that corresponds to a station is based upon the work done by Loibl et al. (1994), where the relative altitude between the station's altitude above sea level and the minimum altitude within a certain search radius, typically around 5 km, is used to find the corresponding model level. In this study the search radius is about 5 km, where the reference topography has a resolution of 1 arcmin (1.8 km) and can be found at https://www.ngdc.noaa.gov/mgg/global/global.html.

To obtain information about the general behaviour of the ELBC and ORIG runs of MATCH, a comprehensive statistical analysis was done, investigating different time periods and statistical metrics for all stations. As a summary of the findings, Taylor diagrams (Taylor, 2001) were constructed, with statistics from the period 2006-2012. Figures 8 and 9 show the Taylor diagrams with bias indicators for CO and O3. Essentially, these diagrams summarise four statistical parameters: the root mean square error, the correlation, the ratio between the variances (model/measurements), and the bias. The correlation is given as the cosine of the azimuthal angle and can be read on the perimeter, with lines indicating different correlations. The root mean square error corresponds to the distance from the REF indicator on the x axis, with dotted semicircles around this point indicating the distance. The normalised variance, or standard deviation, is given as the radial distance from the origin and is indicated with dashed quarter circles. The bias is indicated with markers, listed with triangles and circles in the diagram.
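The quantities summarised in a Taylor diagram can be computed with a short sketch. It follows the construction in Taylor (2001), in which the distance to the reference point is the centred root mean square difference, i.e. the root mean square difference after removing the mean of each data set; the normalisation by the observed standard deviation and the function name are assumptions made for illustration.

```python
import numpy as np

def taylor_statistics(model, obs):
    """Return the statistics displayed in a normalised Taylor diagram."""
    m = np.asarray(model, dtype=float)
    o = np.asarray(obs, dtype=float)
    m_anom = m - m.mean()
    o_anom = o - o.mean()
    sigma_m = m.std()
    sigma_o = o.std()
    corr = np.mean(m_anom * o_anom) / (sigma_m * sigma_o)
    crmsd = np.sqrt(np.mean((m_anom - o_anom) ** 2))
    return {
        "correlation": float(corr),             # cosine of the azimuthal angle
        "norm_std": float(sigma_m / sigma_o),   # radial distance from the origin
        "norm_crmsd": float(crmsd / sigma_o),   # distance from the REF point
        "bias": float(m.mean() - o.mean()),     # shown by the marker type
    }
```

The three geometric quantities are linked by the law of cosines, norm_crmsd² = 1 + norm_std² − 2 · norm_std · correlation, which is why they can be read from a single point in the diagram, whereas the bias has to be shown separately through the markers.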
When changing from ORIG to ELBC, the CO results in Fig. 8 show very little change in the statistical parameters, except for the bias. The correlation is slightly improved for three of the stations, Kollumerwaard (KMW), Krvavec (KVV), and Mace Head (MHD), whereas at the other stations it is unchanged or slightly worse. The normalised variance is low, below one, for all MATCH runs, and does not differ much between the runs. The root mean square error does not change significantly either. The largest change is seen in the bias, where both new ELBC runs underestimate the amount of CO more than the ORIG run, at all stations. These results are in line with earlier findings from Monks et al. (2015) and Stein et al. (2014). Monks et al. (2015) conducted a comprehensive study in which eleven models were inter-compared for CO, O3, and OH, and concluded that all the models generally underestimate CO in the Arctic and the Northern Hemisphere. Stein et al. (2014) investigated the wintertime underestimation of CO in the Northern Hemisphere and concluded that it partially comes from an underestimation of wintertime road traffic emissions, too high dry deposition rates in boreal forests, and possibly from errors in the geographical and seasonal distribution of OH concentrations.
Figure 9 summarises the O3 statistics for the nine ground observation sites. The new ELBC MATCH model runs increase the amount of O3 and clearly improve the variance, with the normalised standard deviation moving closer to one (REF). The bias varies among the stations, but overall it is improved. Nevertheless, the correlation gets slightly worse for all stations except MHD, where it remains unchanged.
Time series of CO and O3 for the year 2011 for all three model runs at the Mace Head station are investigated and shown in Figs. 10 and 11, respectively. These figures show the daily maximum of ozone and the CO mixing ratios at the same time of day at which the O3 mixing ratios peak, which usually occurs in the afternoon. We chose the time of the O3 maxima to avoid problems with nocturnal shallow boundary layers. It is evident that there are bias problems for both ELBC runs and both trace gases. CO is underestimated more strongly with the new ELBC runs than with the ORIG run, and O3 is slightly more overestimated compared to the ORIG run. Examining the correlation, the ELBCd run captures more of the variability, especially on shorter timescales. Looking only at the summer season of 2011, the correlation for ELBCd is higher: 0.77 compared to 0.56 (ELBCc) and 0.59 (ORIG) for CO, and 0.72 compared to 0.64 (ELBCc) and 0.66 (ORIG) for O3. For the whole year of 2011, the larger-scale (seasonal) variation dominates, and the correlation is unchanged for CO and improved for O3: 0.77 (ELBCd) and 0.78 (ELBCc) compared to 0.66 (ORIG). The wintertime O3 is also much improved with the new ELBCs.
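The sampling used for these time series (daily O3 maximum, and CO at the hour of that maximum) can be sketched as follows. This is only an illustration under simple assumptions: the function name `co_at_daily_ozone_peak` is hypothetical, the input is assumed to be complete hourly data grouped by day, and missing values are not handled.

```python
def co_at_daily_ozone_peak(o3_by_day, co_by_day):
    """For each day, return (peak_hour_index, o3_max, co_at_peak_hour).

    `o3_by_day` and `co_by_day` are lists of days; each day is a list of
    hourly mixing ratios on the same time axis (hypothetical layout).
    """
    result = []
    for o3_day, co_day in zip(o3_by_day, co_by_day):
        if len(o3_day) != len(co_day) or not o3_day:
            raise ValueError("each day needs matching, non-empty O3 and CO data")
        # Index of the hour at which O3 peaks on this day
        peak = max(range(len(o3_day)), key=lambda h: o3_day[h])
        result.append((peak, o3_day[peak], co_day[peak]))
    return result


# Two hypothetical days with three samples each (ppb)
o3 = [[30.0, 45.0, 40.0], [50.0, 20.0, 55.0]]
co = [[120.0, 110.0, 115.0], [130.0, 125.0, 100.0]]
samples = co_at_daily_ozone_peak(o3, co)
```

Sampling CO at the O3 peak hour keeps both species on a common, well-mixed afternoon time base, which is the motivation given above for avoiding nocturnal shallow boundary layers.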
We have also investigated changes in a rather sensitive metric, the accumulated ozone over threshold 40 ppb (AOT40) (Fuhrer et al., 1997). AOT40 is an important metric when studying ozone impact on vegetation (Fuhrer et al., 1997); it is also very sensitive to small variations in O3 (Simpson et al., 2006a), and can thus highlight the differences among the model runs. The AOT40 is derived for the months April to September of 2011 and for 07:00-19:00 UTC. Figure 12 shows the AOT40 for all the considered measurement stations in the model domain. Clearly, the use of the ELBCs causes significant changes in the AOT40. In most cases this gives a better comparison with measurements, although it should be noted that this alone is not proof of better boundary conditions (BCs): many other processes also affect the bias with respect to AOT40, such as dry deposition rates or chemical production rates (Tuovinen et al., 2007). The one station that deviates from this improvement is Mace Head, where the ORIG BCs give the best results. The reason for this is most likely the use of Mace Head data in setting the values used in the ORIG BCs. The dynamic boundary conditions, ELBCd, also yield a better agreement with the observations than the climatological boundary conditions, ELBCc, at six of the nine stations. However, ELBCc and ELBCd give rather similar AOT40 levels, suggesting that climatological ELBCs can be good enough even for this rather sensitive ozone metric. On average, the ORIG results underestimate the AOT40 by 41 %, whereas the new ELBC runs overestimate it by 10 and 29 %. Sensitivity tests show that (as expected) AOT40 is most sensitive to the O3 BCs, and this is also consistent with the findings in Schulz et al. (2014), who compared the global EMEP model to the regional EMEP MSC-W model and ground-based observations, and where the global EMEP model overestimates the amount of O3 at the surface.
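The AOT40 accumulation described above can be sketched as follows. The function name `aot40` and the input layout are hypothetical. The sketch assumes hourly mean mixing ratios in ppb, each stamped with the start of its hour in UTC, and treats the 07:00-19:00 UTC window as the twelve hours starting at 07:00 through 18:00; the exact window convention used in the study is not stated here.

```python
from datetime import datetime, timedelta


def aot40(records, threshold_ppb=40.0, months=(4, 5, 6, 7, 8, 9),
          start_hour=7, end_hour=19):
    """Accumulate hourly O3 exceedances over the threshold, in ppb h.

    `records` is an iterable of (utc_timestamp, o3_ppb) pairs with one
    value per hour (an assumption of this sketch).
    """
    total = 0.0
    for timestamp, o3_ppb in records:
        in_season = timestamp.month in months
        in_window = start_hour <= timestamp.hour < end_hour
        if in_season and in_window:
            # Only the part of the mixing ratio above the threshold counts
            total += max(0.0, o3_ppb - threshold_ppb)
    return total


def synthetic_day(day, offset_ppb):
    """Hypothetical diurnal ramp: O3 = offset + hour (ppb), for 24 hours."""
    return [(day + timedelta(hours=h), offset_ppb + h) for h in range(24)]


summer_day = datetime(2011, 6, 1)
winter_day = datetime(2011, 1, 15)

base = aot40(synthetic_day(summer_day, 30.0))      # 36.0 ppb h
shifted = aot40(synthetic_day(summer_day, 32.0))   # 55.0 ppb h
off_season = aot40(synthetic_day(winter_day, 60.0))  # 0.0, outside April-September
```

In this synthetic example a uniform 2 ppb offset raises the accumulated value from 36 to 55 ppb h, which illustrates why a threshold metric of this kind responds so strongly to small changes in the O3 boundary values.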

Evaluation of MATCH results at 500 hPa
The question we address here is how the LBCs influence the in-domain concentrations aloft in the troposphere at 500 hPa. To get a general view of how MATCH performs in the free troposphere, the mean vertical profiles for CO and O3 are inspected. Figure 13 shows mean vertical profiles of CO and O3 for the three MATCH runs together with the EMEP model at MATCH η levels (surface to about 13 hPa). The profiles are averaged over 2006-2012 at the Mace Head station, which again is used since it is located close to the western boundary. The two ELBC runs and the EMEP results have similar vertical profiles, while the MATCH results obtained with the original prescribed boundary values display a much weaker vertical variation. Judging by the satellite retrievals in Figs. 2 and 4, the vertical variation obtained with the EMEP model and with MATCH using the new boundary conditions is qualitatively more realistic than the corresponding results computed with the original boundary values, especially higher up in the free troposphere, where the differences between EMEP and the satellite retrievals are smaller. Compared with the long-term average of measurements (2006-2012) at the Mace Head station (represented by the black triangle), there seems to be a rather large bias for CO close to the surface, with an underestimation of about 20 % for the ELBC runs, and about 4 % for the ORIG run. As stated earlier in the comparison near the surface, many models have problems with underestimating the amounts of CO.
As for O3, the new ELBC runs seem to produce too much near-surface ozone compared to the ground observations, while the ORIG run produces too little, as stated in the previous section. For this long-term mean, the ELBC runs have a positive bias of about 15 % and the ORIG run has a bias of about −9 %.
In order to gauge the significance of the differences in ozone concentrations aloft obtained with the different LBCs, we take a look at how these ozone profiles translate into radiative forcing rates. To this end, we run a one-dimensional radiative transfer model. We use a standard US atmosphere (Anderson et al., 1986), in which we replace the tropospheric ozone concentrations up to an altitude of 100 hPa by the Mace Head profiles shown in Fig. 13. We consider a dark ocean surface with a spectrally constant albedo of 7 %, and we perform the computations for a solar zenith angle of 50° (which is typical for Mace Head around noon at equinox). We use the radiative transfer tool uvspec (Kylling et al., 1998), which is included in libRadtran, where we use the DISORT (DIScrete Ordinate Radiative Transfer) code (Stamnes et al., 1988) with six streams as a radiative transfer solver in conjunction with Kato's correlated-k band model (Kato et al., 1999). The radiative fluxes are computed over the spectral range from 250 nm to 4.5 µm. Not surprisingly, the ELBCd and ELBCc cases yield forcing rates that agree to within 99 % throughout the atmospheric column. The ELBC ozone profiles yield a radiative forcing rate of −1.8 W m−2 at the surface, and +0.7 W m−2 well above the troposphere at 18 km altitude. By contrast, the ozone profile obtained with the MATCH run based on the original LBCs yields a radiative forcing of only −1.1 W m−2 at the surface, and +0.4 W m−2 at 18 km. Thus, the magnitude of the radiative forcing of tropospheric ozone computed with the original LBCs is considerably lower than that computed with the EMEP-based LBCs.
This example clearly illustrates the impact of LBCs and concentration fields aloft on the climate forcing effect of tropospheric ozone. Thus, we take a closer look at the MATCH results at 500 hPa obtained with different LBCs, and compare the simulations to independent satellite observations from AIRS. In this comparison we do not smooth the model data according to Eq. (1), as was done in Sect. 3.1. There are primarily two reasons for this. First, smoothing a data set increases the reliability of the vertical distribution, but we are only interested in one particular pressure level. Second, the chosen 500 hPa level is the level at which the satellite retrievals are least dependent upon the a priori; thus, the a priori has very little impact on the retrieval result at that level. In addition, we are more interested in investigating the pattern correlations than the bias (which is more affected by the smoothing error). It is also noted that AIRS, in general, has a high sensitivity in the mid-troposphere at around 500 hPa (Warner et al., 2010). During the winter half year, when the surface temperatures are very cold over the study area and the lower troposphere is likely to be stratified due to inversions, the thermal contrast between the surface and successive layers in the troposphere is weakest (especially in the presence of a near-isothermal vertical structure). In such a case, the maximum information content and averaging kernels peak around 500 hPa (Warner et al., 2010). This means that even in winter AIRS is most sensitive in the mid-troposphere.
Figures 14 and 15 show the three MATCH runs (ELBCd, ELBCc, and ORIG) together with AIRS data at 500 hPa, for CO and O3, respectively, averaged over 2006-2012. The new ELBCs clearly impact the MATCH results. In comparison to AIRS, the CO results show a clear improvement in the pattern correlation, from 0.71 (ORIG) to 0.85 (ELBC runs); also, the north-south gradient in the ELBC results is stronger than in the ORIG results, and compares better to the north-south gradient in the AIRS retrievals. The ELBC runs do not deviate much from each other, largely due to the long averaging period. Looking at the time-averaged variance over the model domain, the ORIG run has 3 times larger variance than AIRS, whereas the ELBC runs have about the same variance as AIRS.
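A pattern correlation of the kind quoted above can be sketched as the Pearson correlation between two time-averaged two-dimensional fields, taken across all grid cells. The function name `pattern_correlation` is hypothetical, and the sketch assumes that both fields are already on the same grid and that all cells are weighted equally; whether the study applied any area weighting is not stated here.

```python
import math


def pattern_correlation(field_a, field_b):
    """Unweighted Pearson correlation across the cells of two 2-D fields.

    Each field is a list of rows; both must share the same shape
    (an assumption of this sketch, e.g. after regridding to AIRS).
    """
    a = [v for row in field_a for v in row]
    b = [v for row in field_b for v in row]
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("fields must have the same shape and >= 2 cells")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("fields must not be spatially constant")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical 2 x 3 fields: the second is a linear rescaling of the first
model_field = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
satellite_field = [[5.0, 7.0, 9.0], [11.0, 13.0, 15.0]]
r = pattern_correlation(model_field, satellite_field)
```

Because the correlation is insensitive to a constant offset or scaling between the two fields, it measures agreement in spatial structure (such as the north-south gradient) rather than bias, which suits a comparison against unsmoothed model data.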
The O3 results in Fig. 15 also show an improvement in the north-south gradient and in the pattern correlation, from 0.70 (ORIG) to 0.78 (ELBC runs). Looking at the time correlation averaged over the two-dimensional domain, the best correlation is found with a lag of 1 to 3 months, meaning that the ORIG run lags about 1 month and the ELBC runs lag 3 months behind the AIRS data. This is an important observation, highlighting the need for further investigations of how the MATCH model and other models are performing in the free troposphere in order to be able to couple chemical transport models with climate models.
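The lag analysis mentioned above can be illustrated with a minimal sketch that searches for the lag giving the highest correlation between a model series and a reference series. The function names are hypothetical, and the sketch works on a single monthly time series, whereas the study reports a time correlation averaged over the two-dimensional domain.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(model, reference, lag):
    """Correlate model[t + lag] with reference[t].

    A positive `lag` means the model lags behind the reference.
    """
    n = len(reference)
    if len(model) != n or lag < 0 or lag > n - 2:
        raise ValueError("series must match in length and lag must fit")
    return pearson(model[lag:], reference[:n - lag])


def best_lag(model, reference, max_lag):
    """Lag (0..max_lag) with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda k: lagged_correlation(model, reference, k))


# Hypothetical 7-year monthly seasonal cycle; the model is delayed by 3 months
months = range(84)
reference = [math.sin(2.0 * math.pi * t / 12.0) for t in months]
model = [math.sin(2.0 * math.pi * (t - 3) / 12.0) for t in months]
lag_found = best_lag(model, reference, max_lag=6)
```

For a series dominated by the seasonal cycle, the lag found this way mainly reflects a phase shift of that cycle, so the search range should stay well below one period (12 months) to avoid ambiguous matches.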

Conclusions
The main goal of this study was to test a methodology for evaluating lateral boundary conditions (LBCs) for the two long-lived atmospheric species CO and O3. The methodology is based on a combined approach of performing (i) a direct comparison of LBCs derived from a global model with satellite observations at the domain boundaries, (ii) an in-domain comparison of a regional model run with satellite retrievals aloft, and (iii) an in-domain comparison of regional model runs with in situ observations at the ground. Thus, our methodology combines the direct evaluation approach discussed in Henderson et al. (2014) with indirect testing methods used in various studies (e.g. Tang et al., 2007). Boundary fields generated from the global EMEP model, to our knowledge, have not been validated previously by use of satellite data.
The direct evaluation of the global EMEP model shows good agreement, well within the uncertainties of the satellite retrievals. However, it is important to stress that the satellite data sets are retrieved with several assumptions, and they each have specific limitations. MOPITT has a lower sensitivity in the lower troposphere due to lower thermal contrasts between the surface-skin temperature and the surface-level air temperature, which leads to higher sensitivity on land, in daytime and at midlatitudes (Deeter et al., 2007). OMI has known biases, especially in the troposphere, where there are uncertainties, not only in distinguishing the tropospheric column from the stratospheric, but also in how ozone is distributed in the troposphere (Kroon et al., 2011).
To illustrate the impact of the lateral boundary fields, we forced the MATCH regional CTM, set up over the European domain, with boundary fields obtained from the global EMEP model. This was done by using (a) dynamic boundary fields, and (b) climatological boundary fields obtained by averaging, for each month, EMEP results from a 7-year model run. The performance of the MATCH model aloft at 500 hPa is substantially improved with the use of the new ELBCs, where the pattern correlations increase from 0.68 (ORIG) to 0.83 (ELBC runs) for CO, and from 0.73 (ORIG) to 0.81 (ELBC runs) for O3. Using AIRS as a reference, the model goes from a 12 % overestimation to a 9 % underestimation for CO and from an underestimation of 28 % to a small overestimation of 3 % for O3. Note, however, that the bias is not considered a strong metric in this comparison at 500 hPa, since we have not smoothed the model data.
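The construction of the climatological boundary fields (averaging, for each calendar month, over the 7-year run) can be sketched for a single boundary value as follows. The function name `monthly_climatology` and the record layout are hypothetical; in practice the same average would be taken independently for every boundary grid point, level, and species.

```python
from collections import defaultdict


def monthly_climatology(records):
    """Average values per calendar month over all available years.

    `records` is an iterable of (year, month, value) tuples, where
    `value` is, e.g., a monthly mean mixing ratio at one boundary cell
    (hypothetical layout). Returns {month: multi-year mean}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _year, month, value in records:
        if not 1 <= month <= 12:
            raise ValueError("month must be in 1..12")
        sums[month] += value
        counts[month] += 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}


# Hypothetical CO boundary values (ppb) for two years
records = [
    (2006, 1, 100.0), (2007, 1, 120.0),
    (2006, 7, 80.0), (2007, 7, 90.0),
]
climatology = monthly_climatology(records)
```

A climatology of this kind keeps the mean seasonal cycle but removes interannual and synoptic variability at the boundary, which is the distinction between the ELBCc and ELBCd set-ups.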
At the surface, it was less straightforward to draw general conclusions from the direct comparison with the ground-based measurements. The most significant improvements in temporal correlations are observed at Mace Head, for the ELBCd run. This station lies closest to the western boundary; therefore it is more strongly influenced by the LBCs and less strongly by in-domain local sources. The improvements in temporal correlation are more pronounced when focusing on shorter time periods (e.g. the summer season). This is consistent with Tang et al. (2007). Comparing the AOT40 results between the ELBC runs, the dynamical ELBCd set-up of the LBCs produces, in general, amounts of O3 closer to the measurements. This difference in performance between the ELBC runs can become important when studying air quality and health impacts on shorter timescales, but if investigating the climate and its changes, a climatology can well represent the average amount of long-lived trace gases. In general the new ELBCs caused significant changes to the AOT40, where the new ELBCs give results closer to the measurements.
The use of a global CTM to provide LBCs in regional modelling certainly impacts the longer-lived trace gases both at the surface and aloft, in the free troposphere. Our results confirm that LBCs evaluated by satellite observations at the boundary can be expected to provide accurate results in the free troposphere; however, they also reveal the limitations of the methodology for ensuring the accuracy of boundary-layer concentrations. This indicates that the significance of LBCs for ground concentrations may have been overestimated in previous studies. Even though we consider long-lived species, we find that the

Figure 1. Map of the European model domain of the regional model, also showing the ground-based measurement stations, summarised in Table 1.

Figure 2 presents the comparison of CO at the lateral boundaries. The four columns show results for the southern, northern, eastern, and western boundaries, while the top and bottom rows show results for January and August, respectively. The original EMEP model results are represented by grey dots, the smoothed EMEP results are shown as black solid lines, and the MOPITT retrievals and the a priori estimate are depicted as solid red and dashed blue lines, respectively.

Figure 2. Carbon monoxide mixing ratios for January (first row) and August (second row) at the four cardinal boundaries (denoted SB, NB, EB, and WB), observed by MOPITT (red solid line) and simulated retrievals from EMEP (black solid line). The retrievals (EMEP-R) are calculated by using Eq. (1) with EMEP model data (grey dots) and applying MOPITT's averaging kernel and adding the a priori profile (blue dashed line). The red and grey shaded areas correspond to the range of values in which the satellite and retrieval values vary at each level. The satellite uncertainties are represented by the red horizontal lines.

Figure 3. Retrieved bias for each altitude, shown as box plots, between EMEP and MOPITT for CO and the same months and cardinal boundaries as Fig. 2. The blue boxes correspond to the interquartile range, the black dotted lines show the range of bias at each altitude, and the blue line within each box represents the median. The four black vertical lines show the ±10 and ±30 % bias range.

Figure 4. Same as Fig. 2 but for ozone and the satellite retrievals from OMI.
Figures 6 and 7 show the CO and O3 results of the different MATCH model runs at the lowest model layer, averaged over the period 2006-2012. The new ELBC runs clearly show a reduction of the CO mixing ratios throughout the model domain, on average by 15 %. On the other hand, O3 increases all over the model domain, on average by 21 %.

Figure 6. CO volume mixing ratios at the lowest model layer for the ELBCd (top left subplot), ELBCc (top right subplot) and ORIG (bottom subplot) runs of the MATCH model. The results are averaged over the entire period 2006-2012.

Figure 8. Taylor diagram showing CO statistics for all the chosen GAW stations, for the three MATCH model runs (ELBCd in red, ELBCc in blue, and ORIG in green), with statistics from 2006 to 2012. The correlation is given by the cosine of the angle from the horizontal axis; the root mean square error corresponds to the distance from the "REF" indicator on the x axis; the ratio between the variances of the model and the measurements, here referred to as the normalised standard deviation, is represented by the radial distance from the origin; and the bias is symbolised next to each marker. Standard deviations larger than 1.75 are represented with their standard deviation/correlation as numbers underneath the diagram. The bias symbols are indicated in the list to the top left.

Figure 10. Time series showing the summer season of 2011 for CO, at the Mace Head station.

Figure 12. The AOT40 in ppb(v) h for the different in-domain ground-based measurement stations and the different LBC set-ups. The AOT40 is derived for a corresponding growing season in 2011, representing the months of April to September and the hours 07:00-19:00 UTC.

Figure 13. The 7-year average vertical profiles at the Mace Head station location for CO (to the left) and O3 (to the right), for the MATCH results and EMEP. The average of the ground-based measurements at the Mace Head station is also shown as a black triangle. The levels correspond to η levels, which vary with surface and model top pressure at each grid point; the pressure level 500 hPa approximately corresponds to η level no. 18.

Figure 14. CO volume mixing ratios at 500 hPa for the ELBCd (top left), ELBCc (top right), and ORIG (lower left) runs and for AIRS (lower right).

Table 1. The names and abbreviations of ground measurement stations used in the evaluation of O3 and CO in MATCH.

www.geosci-model-dev.net/8/3747/2015/ Geosci. Model Dev., 8, 3747-3763, 2015 3754 E. Andersson et al.: Methodology for evaluating lateral boundary conditions
Table 1 at the surface. The model data are collocated with the measurement data by extracting the grid cell lying closest to the measurement station in latitude, longitude, and local time. For the surface comparison we use a relative altitude method to extract the best corresponding model layer. This latter method