Gas transport in firn: multiple-tracer characterisation and model intercomparison for NEEM, Northern Greenland

. Air was sampled from the porous ﬁrn layer at the NEEM site in Northern Greenland. We use an ensemble of ten reference tracers of known atmospheric history to characterise the transport properties of the site. By analysing uncertainties in both data and the reference gas atmospheric histories, we can objectively assign weights to each of the gases used for the depth-diffusivity reconstruction. We deﬁne an objective root mean square criterion that is minimised in the model tuning procedure. Each tracer constrains the ﬁrn proﬁle differently through its unique atmospheric history and free air diffusivity, making our multiple-tracer characterisation method a clear improvement over the commonly used single-tracer tuning. Six ﬁrn air transport models are tuned to the NEEM site; all models successfully reproduce the data within a 1 σ Gaussian distribution. A comparison between two replicate boreholes drilled 64 m apart shows differences in measured mixing ratio proﬁles that exceed the experimental error. We ﬁnd evidence that diffusivity does not vanish completely in the lock-in zone, as is commonly assumed. The ice age- gas age difference ( 1 age) at the ﬁrn-ice tran-sition is calculated to be 182 + 3 − 9 yr. We further present the ﬁrst intercomparison study of ﬁrn air models, where we introduce diagnostic scenarios designed to probe speciﬁc aspects of the model physics. Our results show that there are major on differences in the way the models handle advective transport. Furthermore, diffusive fractionation of isotopes in the ﬁrn is constrained the which has consequences for attempts the isotopic of trace gases back in using ﬁrn air and ice core records.

Abstract. Air was sampled from the porous firn layer at the NEEM site in Northern Greenland. We use an ensemble of ten reference tracers of known atmospheric history to characterise the transport properties of the site. By analysing uncertainties in both data and the reference gas atmospheric histories, we can objectively assign weights to each of the gases used for the depth-diffusivity reconstruction. We define an objective root mean square criterion that is minimised in the model tuning procedure. Each tracer constrains the firn profile differently through its unique atmospheric history and free air diffusivity, making our multiple-tracer characterisation method a clear improvement over the commonly used single-tracer tuning. Six firn air transport models are tuned to the NEEM site; all models successfully reproduce the data within a 1σ Gaussian distribution. A comparison between two replicate boreholes drilled 64 m apart shows differences in measured mixing ratio profiles that exceed the experimental error. We find evidence that diffusivity does not vanish completely in the lock-in zone, as is commonly assumed. The ice age-gas age difference ( age) at the firn-ice transition is calculated to be 182 +3 −9 yr. We further present the first intercomparison study of firn air models, where we introduce diagnostic scenarios designed to probe specific aspects of the model physics. Our results show that there are major

Introduction
The compacted snow (firn) found in the accumulation zone of the major ice sheets acts as a unique archive of old air, preserving a continuous record of atmospheric composition from the present up to a century back in time . Sampling of this archive has allowed for reconstruction of the recent atmospheric history of many trace gas species (e.g. Butler et al., 1999;Sturrock et al., 2002;Aydin et al., 2004;Montzka et al., 2004;Assonov et al., 2007;Martinerie et al., 2009) and their isotopologues (e.g. Francey et al., 1999;Ferretti et al., 2005;Bernard et al., 2006). Because of its temporal range it naturally bridges the age gap between direct atmospheric observations and the ice core record (Etheridge et al., 1998). Firn air analysis has some significant advantages over the ice core technique. First, firn air can be sampled directly using a pumping line (Schwander et al., 1993), making an ice extraction step unnecessary. Second, large sample sizes can be obtained making this method very suited for studying, e.g. recent changes in the isotopic composition of trace gases (Sugawara et al., 2003;Röckmann et al., 2003;Ishijima et al., 2007). Third, because bubble occlusion introduces additional smoothing, firn air records can achieve higher temporal resolution than ice cores (Trudinger et al., 2004).
Another important motivation for studying firn air is the interpretation of ice core records. All air found in glacial ice has first traversed the firn, and has thus been affected by its transport properties and the bubble close-off process. Some of the most commonly described artifacts of firn air transport are: gravitational fractionation (Craig et al., 1988;Schwander, 1989), thermal fractionation in the presence of temperature gradients (Severinghaus et al., 2001), diffusive smoothing of rapid atmospheric variations (Spahni et al., 2003), molecular size fractionation during bubble close-off Severinghaus and Battle, 2006), diffusive isotopic fractionation (Trudinger et al., 1997) and age, the finite age difference between gas bubbles and their surrounding ice (Schwander et al., 1997). In certain cases the peculiarities of firn air transport actually give rise to new proxies, e.g. for temperature (Severinghaus and Brook, 1999;Landais et al., 2004;Dreyfus et al., 2010) and local summer insolation .
Diffusion is the dominant mechanism by which variations in atmospheric composition are transferred into the firn. The effective diffusivity decreases with depth as the pore space decreases; unfortunately, diffusivities measured on small firn samples do not adequately describe the behaviour of the whole firn (Fabre et al., 2000). Consequently, the diffusivity profile with depth needs to be reconstructed for each firn air site independently through an inverse method (Rommelaere et al., 1997;Trudinger et al., 2002;Sugawara et al., 2003). The procedure consists of forcing a firn air transport model with the atmospheric history of a selected reference gas, often CO 2 , and subsequently optimising the fit to measured mixing ratios in the firn by adjusting ("tuning") the effective diffusivity profile.
With few exceptions, the firn air modeling studies found in literature tune their effective diffusivity profile to a single tracer. For the NEEM firn air site in Northern Greenland we have chosen to tune the effective diffusivity to an ensemble of ten tracers, namely CO 2 , CH 4 , SF 6 , CFC-11, CFC-12, CFC-113, HFC-134a, CH 3 CCl 3 , 14 CO 2 and δ 15 N 2 . The studies by Trudinger et al. (1997Trudinger et al. ( , 2002Trudinger et al. ( , 2004 also use a wide variety of tracers, including halocarbons, greenhouse gases and radiocarbon ( 14 CO 2 ), to characterise firn air transport. The main difference with these works is that we show how multiple tracers can be combined in a methodical tuning process. By systematically analysing all the uncertainties, both in the data and in the atmospheric histories of the references gases, we assign a unique weight to each data point in the tuning procedure. Using these uncertainty estimates we define an objective root mean square deviation (RMSD) criterion that is minimised in the tuning. We introduce tracers that have not previously been used in firn air studies, and provide atmospheric reconstructions for these species with realistic uncertainty estimates.
Our approach has several advantages. A central difficulty in reconstructing the diffusivity profile is that the problem is under-determined, with (infinitely) many solutions optimising the fit (Rommelaere et al., 1997). By adding more tracers, each with a unique atmospheric history, the diffusivity profile is constrained more strongly. The final reconstructed profile is a trade-off between the demands of the different tracers. Second, many trace gases of interest, such as halocarbons, have a free air diffusivity that differs from CO 2 by up to a factor of 2. It is not a priori clear whether a profile tuned to CO 2 alone provides a good solution for these gases. The tracers in this study have a wide range of free air diffusivities, where the fastest tracer (CH 4 ) has a free air diffusivity that is three times that of the slowest one (CFC-113). Third, when using ten tracers the available dataset is much larger, with several data points at each sampling depth. This makes the final result more robust, i.e. less susceptible to effects of outliers and analytical biases. Finally, our method is less sensitive to errors in the reconstructed atmospheric history of individual reference gases. Our analysis shows that uncertainties in the atmospheric reconstruction are the largest source of potential error.
Apart from presenting a new methodology for characterising firn air sites, this work will also serve as a reference for other studies using NEEM firn air. The modeling in a number Atmos. Chem. Phys., 12, 4259-4277, 2012 www.atmos-chem-phys.net/12/4259/2012/ of forthcoming publications in this ACP special issue will be based on the diffusivity reconstructions presented here.
We further present a model intercomparison between six state of the art one-dimensional firn air transport models. All the models are tuned to the same dataset, using the same physical firn parameters, such as porosity, free-air diffusion coefficients, etc. To diagnose model performance we introduce four synthetic scenarios that are designed to probe specific aspects of the model physics.
For many sections in this work more information is available in the Supplement. Because of the vast amount of material we have chosen to structure it the same as this work for easy referencing. Sections marked with an asterisk ( * ) in this work have a corresponding section in the Supplement where additional information can be found. The Supplement also includes all the firn air data and atmospheric reconstructions used in this study.

NEEM 2008 firn air campaign *
The firn air used in this study was sampled during 14-30 July 2008 from two boreholes near the NEEM deep ice core drilling site, Northern Greenland (77.45 • N 51.06 • W). The sampling site was 1.5 km outside of the main camp and chosen to avoid contamination by going upwind of the prevailing wind direction. The two boreholes, S2 and S3, were separated by 64 m. Firn air was sampled by drilling through the firn layers to the desired depth, and lowering the firn air sampling device into the borehole. The device consists of a purge and sample line, running through an inflatable bladder to seal off the firn air from the overlying atmosphere during sampling. S2 was sampled using the firn air system of the University of Bern (Schwander et al., 1993); S3 with the US firn air system . For this reason the boreholes are referred to as the "EU" and "US" holes, respectively. The CO 2 mixing ratio was monitored on the purge line with LICOR (US site) and MAIHAK (EU site) CO 2 analysers to assess whether the firn air being pumped had been purged of modern air prior to sampling.

Physical characterisation of NEEM firn air site *
The annual mean site temperature is −28.9 • C, as obtained from borehole thermometry on the EU hole; an isothermal firn column is used for this modeling exercise. Atmospheric pressure is 745 hPa. We use the assumption commonly made in firn air modeling that the firn density profile is in steady state, and that the site has been climatically stable over the study period. For the firn density profile ρ(z) we use an empirical fit to the NEEM main ice core density measurements averaged over 0.55 m segments (Fig. 1a). For each of the three stages of the densification process (grain boundary sliding, sintering and bubble compression; Arnaud et al., 2000) LIZ DZ we use a combination of quadratic and exponential functions to fit the data, keeping the derivative continuous over the transitions between the stages. The fit is generated on a z = 0.2 m grid, which is the spatial resolution used in this study. Following the temperature relationship given by Schwander et al. (1997), a solid ice density ρ ice = 0.9206 g cm −3 is used.
We use an accumulation rate of A = 0.216 m yr −1 ice equivalent, as derived from the 2007 NEEM S1 shallow core (D. Dahl-Jensen, personal communication, 2010). Since the model runs cover the period 1800-2008 CE, we used the average rate over the same period as a best estimate. This longterm average is slightly lower than our best estimate for the current day accumulation of 0.227 m yr −1 .
The total porosity s = 1 − ρ/ρ ice is divided into open and closed porosity (s op , s cl ) using the parameterisation of Goujon et al. (2003): where s co is the mean close-off porosity. The deepest point at which firn air could be sampled successfully was z = 77.75 m. The depth of total pore closure (s op = 0) was determined experimentally in the field by drilling the EU hole to a depth of z = 83 m, and subsequently raising the bladder in 0.5 m steps. The deepest point at which air could be pulled was z = 79 m, though the flow was insufficient for sampling. To be consistent with this observation we use s co = 9.708 × 10 −2 in Eq. (1), which leads to total pore closure (s op = 0) at depth z = 78.8 m, as shown in Fig. 1a. Traditionally, the firn column is divided into three zones, based on the gravitational enrichment with depth (Sowers et al., 1992). For NEEM the δ 15 N 2 together with the zonal division is shown in Fig. 1b. Using the barometric line fitting method , we obtain a convective zone (CZ) thickness of 4.5 m. The zone below is called the diffusive zone (DZ), as diffusion dominates the transport at these depths. The lock-in depth is defined as the depth at which gravitational enrichment stops, which happens at z = 63 m. Below we find the lock-in zone (LIZ), where advection with the ice matrix dominates the transport. This is also the region where the majority of the bubble occlusion occurs, as can be seen from the s cl curve.

Gas measurements *
A total of 345 samples were taken from the boreholes, with atmospheric samples taken at three occasions during the sampling period. Different flask types were used (SilcoCan, stainless steel flasks and glass flasks); for the tracers used in this study no effect of flask type could be found in data comparison. We use firn air data from six different laboratories; an overview is given in Table 1.
For CH 4 and SF 6 , data from IUP were corrected for known calibration offsets to place all data on the NOAA scales used in the atmospheric reconstructions; corrections were always smaller than 3 ppb and 0.06 ppt, respectively. After the calibration correction no systematic offsets were observed between the different laboratories. As discussed in Sect. 2.6, there are significant offsets between the two boreholes, and data from the holes are not combined but treated separately. The 14 CO 2 data and atmospheric reconstruction were both converted to a (mass conserving) ppm scale to allow 14 CO 2 to be modeled like a regular tracer. Wherever multiple data points were available for a certain depth, they were averaged in the following order: first replicate measurements from one laboratory were averaged, then the resulting values from the different labs were averaged. In this way one composite dataset was created that contains 204 data points for the 10 tracers on the EU hole, and 77 data points for the 4 tracers on the US hole. The δ 15 N 2 profile is the same for both holes, and a combination of EU and US depths are used in the final dataset. This gives a final dataset of 260 data points.

Atmospheric histories of reference tracers *
A best-estimate atmospheric history was reconstructed for each of the reference tracers used in the tuning. Additionally, we reconstructed a δ 13 CO 2 history which was used to convert the 14 CO 2 history to a ppm scale (Stuiver and Polach, 1977). All the atmospheric histories start in the year 1800 and have monthly resolution. We use 2008.54 (mid July) as the decimal sampling date in the models. Four different types of data were used in the reconstructions: 1. Direct atmospheric measurements from sampling networks or archived air. Northern hemisphere high latitude stations (i.e. Alert, Summit and Barrow) are used whenever available. When using data from other stations a correction should be made to account for the latitudinal gradient; this is done whenever the gradient could be reliably determined. We used station data from the NOAA-ESRL network for CO 2 , CH 4 , SF 6 and HFC-134a (Conway et al., 1994;Dlugokencky et al., 1994;Geller et al., 1997;; from ALE/GAGE/AGAGE for CFC-11, CFC-12, CFC-113 and CH 3 CCl 3 (Prinn et al., 2000(Prinn et al., , 2005Cunnold et al., 1997;Fraser et al., 1996); from CSIRO for δ 13 CO 2 (Francey et al., 2003). All ALE/GAGE/AGAGE data were converted to the latest NOAA calibration scales that are also used in the firn air analysis; all δ 13 CO 2 data are kept on the CSIRO calibration scale. Furthermore, we used the SIO Mauna Loa record for CO 2 between 1958-1985 CE (Keeling et al., 2009), and the Cape Grim air archive for δ 13 CO 2 between 1978-1991 (Francey et al., 1999). The 14 CO 2 history between 1963-1993 is based on Fruholmen, Norway, for its proximity to Greenland (Nydal and Lövseth, 1996); data from other stations were used to complete time coverage (Manning and Melhuish, 1994;Levin and Kromer, 2004;Levin et al., 2008).
2. High resolution firn air/ice core measurements from the high accumulation Law Dome sites, Antarctica. The reconstructions of CO 2 before 1958, and CH 4 before 1983 are based on Etheridge et al. (1996Etheridge et al. ( , 1998 and MacFarling Meure et al. (2006); δ 13 CO 2 before 1976 is based on Francey et al. (1999).
4. Emission-based estimates using a 2-D atmospheric transport model that includes latitudinal source and sink distribution (Martinerie et al. (2009) and references therein). This has been used to complete time coverage for SF 6 and halocarbons before the onset of direct atmospheric measurements.

Gravitational correction
The gravitational fractionation of gases in the firn column is well established both theoretically and experimentally (Craig et al., 1988;Schwander, 1989;Sowers et al., 1992). Before the modeling, all data were corrected for the effect of gravity: where [X] corr ([X] meas ) is the gravity corrected (measured) mixing ratio of gas species X, M = M X − M air is the molar mass difference between gas X and air, and δ grav (z) is the gravitational fractionation per unit mass difference at depth z (Supplement). The δ grav (z) values are based on measurements of δ 86 Kr ( 86 Kr/ 82 kr) with depth, and corrected for the effect of thermal fractionation (Severinghaus et al., 2001). The rationale for using Kr rather than N 2 , is that its free-air diffusivity is closer to that of most of the tracers we use, so it should represent the disequilibrium effects on gravitational fractionation more accurately. The models are run with either the molecular weight of all gases set equal to that of air (M = M air ) or gravity set to zero (g = 0). This empirical gravitational correction ensures that the effect of gravity is included correctly, and could potentially also be used to correct for mass-dependent sampling artifacts (Severinghaus and Battle, 2006).

Differences between the EU and US boreholes
Though the boreholes are separated by a mere 64 m, within the lock-in zone we find differences in the mixing ratio profiles for CO 2 , CH 4 ( Fig. 2) and SF 6 (not shown) that exceed the estimated uncertainty of the combined sampling-measurement procedure (indicated with errorbar, see Sect. 2.7 for details). Note that we cannot make the same comparison for the other gas species since only data from the EU hole are available. The estimated uncertainty on the depth measurements is 5 cm, whereas at certain depths an error of 50-60 cm is required to explain the observed borehole offset. So although depth measurement errors can certainly contribute to the offset, they cannot explain the full magnitude of it. We attribute the differences to lateral inhomogeneity in the firn stratigraphy, possibly originating from surface wind features that have been preserved in the densification process.
Because of these differences we have chosen to model both holes separately. Furthermore, for SF 6 we observe a ∼ 0.25 ppt offset in the 5-50 m depth range, contrary to the CO 2 and CH 4 profiles that agree well between the holes at these depths. This excludes differences in age distribution of the air as the origin. Alternative explanations we considered, such as incomplete flask flushing, sample contamination, procedural blanks and bladder outgassing, could all be excluded. Since we found no objective reason to reject data from either hole, we account for the discrepancy by assigning an additional errorbar to the SF 6 data from both holes in this depth range (see below).

Full uncertainty estimation *
Having accurate uncertainty estimates is essential for our multiple-tracer methodology. A unique uncertainty estimate has been assigned to each of the 260 individual data points used in this study based on the following seven potential sources of uncertainty: (1) analytical precision as specified by the laboratories; (2) uncertainty in atmospheric reconstructions; (3) contamination with modern air in the deepest firn samples; the estimates are based on HFC-134a, SF 6 and CFCs, which should be absent in the deepest samples; (4) sampling effects estimated from inter-laboratory and inter-borehole offsets; (5) possibility of in-situ CO 2 artifacts (e.g. Tschumi and Stauffer, 2000) and CO 2 enrichment due to close-off fractionation Severinghaus and Battle, 2006); (6) undersampling of seasonal cycle in the monthly atmospheric reconstruction (CO 2 only); and (7) large unexplained EU-US borehole difference in the diffusive zone (SF 6 only). The errors are assumed to be independent of each other. The uncertainties in the atmospheric reconstructions are converted from a time scale to a depth scale by running them through the CIC firn air model (Sect. 3.2). Details are given in the Supplement. The relative contribution of the seven sources, averaged over the firn column, is given in Table 2. For most tracers the atmospheric reconstruction is the largest contributor to the total uncertainty, followed by the analytical precision.

Tuning of the diffusivity profile *
How the diffusive transport changes with depth needs to be reconstructed through an inverse method for each firn air site independently. The procedure consists of forcing the transport models with the atmospheric history of one or several selected reference tracer(s), and subsequently optimising the fit to the measured mixing ratios in the firn by adjusting the effective diffusivity values at each depth (Rommelaere et al., 1997;Trudinger et al., 2002;Sugawara et al., 2003).
Here we will use diffusion as a generic term for mass transfer processes that are driven by a concentration gradient, and well described by Fick's law. The diffusivity profile we reconstruct is composed of two parts: Molecular diffusion is a microscopic process originating in the thermal motion of the molecules constituting the firn air. The effective molecular diffusion coefficient of gases in the open porosity decreases with depth as the diffusive path becomes increasingly tortuous due to the densification process (Schwander et al., 1988). The diffusion coefficient of gas X at depth z is given by where D 0 X is the diffusion coefficient in free air, τ (z) is the tortuosity at depth z, and γ X = D X /D CO 2 is the ratio of diffusion coefficients (Trudinger et al., 1997). It is clear from Eq. (3) that the molecular diffusivity profiles for the different gas species scale linearly with each other. The γ X used in this study are based on measurements by Matsunaga et al. (1993Matsunaga et al. ( , 1998Matsunaga et al. ( , 2002a, or a theoretical formula (Chen and Othmer, 1962) for gas species where no experimental data are available (Supplement). Eddy diffusion refers to mass transfer caused by air flow patterns in the open porosity which the models cannot resolve directly, and are instead parameterised through the inclusion Atmos. Chem. Phys., 12, 4259-4277, 2012 www.atmos-chem-phys.net/12/4259/2012/ of a diffusion coefficient D eddy (z). Unlike molecular diffusion, D eddy is equal for all gases as the process is macroscopic in origin. The first contribution to D eddy is convective mixing in the top few meters due to seasonal temperature gradients and wind pumping (Colbeck, 1989;Severinghaus et al., 2001;Kawamura et al., 2006). Some of the models also include a D eddy term in the deep firn to represent dispersive mass transfer. The classical example of dispersive mixing is Taylor dispersion, where (viscous) shear flow in a circular tube enhances the effective diffusivity of a solute (Aris, 1956). Viscous flow can be induced in the deep firn by low frequency pressure fluctuations at the surface (Schwander, 1989), as well as air expulsion due to pore compaction (Rommelaere et al., 1997). In soils, viscous flow induced by atmospheric pressure fluctuations has been shown to be an important transport mechanism (Massmann and Farrier, 1992). Pore compaction causes a non-uniform velocity distribution between air expelled upwards, and air remaining behind in the pores. This spread in velocities results in additional (dispersive) mixing.
The extensive uncertainty analysis we introduced in Sect. 2.7 assesses how reliable each data point is, and consequently how much weight should be assigned to it in the tuning procedure. This allows us to combine multiple tracers in the tuning, and the final reconstructed diffusivity profile is a trade-off between the constraints placed by the different tracers. All models in this study tuned their effective diffusivity profiles to minimise the root mean square deviation where the d i are the data for a given borehole (all tracers), m i are the linearly interpolated modeled values at the same depths, and σ i give the total uncertainties. The index i runs over all data points, where N = 204 for the EU hole, and N = 77 for the US hole.  Severinghaus and Battle, 2006). Table 3 summarises a selection of relevant model characteristics. The second column indicates the kind of coordinate system used by each model. The CSIRO model is unique in using a coordinate system that moves downwards with descending ice layers (Lagrangian coordinates), giving a downward ice velocity w ice = 0 relative to the reference frame. The other models use a static (Eulerian) coordinate system where the surface stays at z = 0 and the ice layers move down at a finite velocity w ice = Aρ/ρ ice . The choice of coordinate system has consequences for the way advection of air with the ice matrix is handled (third column). In the CSIRO model, the moving coordinate system automatically leads to advective transport without the need to include it explicitly. The models expressed in Eulerian coordinates use either an advective flux as described by Rommelaere et al. (1997), or use boxes of air that are shifted downwards at regular time intervals (Schwander et al., 1993). Convection (fourth column) is handled in two different ways. The LGGE-GIPSA model uses a D eddy term that is tuned by the RMSD minimisation algorithm, in combination with zero gravitational fractionation for z < 4 m. The other models use the parameterisation of Kawamura et al. (2006) where a D eddy term is included that falls off exponentially with depth. The fifth column of Table 3 indicates whether the time stepping used in solving the diffusion equation was either explicit (Euler forward), implicit (Euler backward) or mixed implicit-explicit (Crank-Nicolson method). Finally, some of the models were tuned through manual adjustments of the diffusivity profile, while others used an automated control method (e.g. Rommelaere et al., 1997;Trudinger et al., 2002).

Fit of modeled profiles to the data *
The modeled profiles for all 10 tracers from the EU borehole are shown in Fig. 3, profiles from the US borehole can be found in the Supplement. For CO 2 (Fig. 3a) we find a pronounced mismatch for z > 70 m which is reproduced consistently by all models and in both boreholes. The atmospheric reconstruction is based on the Law Dome record (Etheridge et al., 1996), and consequently the largest source of uncertainty is the interhemispheric CO 2 gradient. To explain the observed mismatch with an error in the atmospheric history would require a 6 ppm interhemispheric gradient in the 1950s, which can be ruled out (Keeling et al., 2010). Also contamination with modern air can be ruled out for these flasks, based on measurements of halocarbons. We cannot, however, exclude in-situ production of CO 2 from, e.g. organic material or (bi)carbonates found in Greenland ice (e.g. Tschumi and Stauffer, 2000). There might also be a small close-off fractionation of CO 2 , of order 1 ‰ or less, based on its effective molecular diameter Severinghaus and Battle, 2006).
Both CH 3 CCl 3 and 14 CO 2 (Fig. 3d-e) have a peak within the LIZ. The peak height and position, which are sensitive to the shape and magnitude of the diffusivity profile, provide an important constraint in the tuning procedure. It is exactly at these peaks that the divergence between the models is most easily visible. It must be noted, however, that equally large model differences are found for other tracers as well (e.g. CO 2 and HFC-134a). Furthermore, models that fit the peak height exactly do not necessarily provide the best overall fit to the data.
To quantitatively assess how well the modeled profiles fit the data, we make a histogram of (m i − d i )/σ i where the index i goes over all 204 data points of the EU borehole. The σ i are the unique uncertainties we assigned to each data point. This is shown in Fig. 4 together with a Gaussian distribution of width σ = 1 and a surface area equal to that of the histogram. The figure furthermore shows the RMSD from the data as given by Eq. (4). For all models we find a distribution that is more narrow than the Gaussian distribution, meaning that within the assigned uncertainties all data from the borehole can be modeled consistently. The good agreement gives confidence in the correctness of our reconstructed atmospheric histories, which will be of use in future firn air studies. For the US hole the RMSD ranges between 0.60-1.06. Two models stand out as having a larger RMSD of 0.92 (CSIRO and OSU). For the CSIRO model this is caused by the absence of a back flow due to pore compression in the transport description (Sect. 4.4.4). For the OSU model we attribute the larger RMSD to the tuning procedure for the molecular diffusivity, which has fewer degrees of freedom than the procedures used by the other models (Supplement).
Differences in RMSD give information on the performance of models, or model configurations, relative to each other; the RMSD as defined by Eq. (4) should not be interpreted in an absolute sense.

Diffusivity profiles *
All the models are tuned separately to optimise the fit to the data. Figure 5a shows the reconstructed molecular diffusivity profiles for CO 2 . The spread between the solutions is caused by two factors. First, it reflects differences in model physics, which are compensated for by adjustments to the diffusivity profile. Second, it relates to the degree in which the inverse problem of diffusivity reconstruction is under-determined. By using 10 different tracers the problem is more strongly constrained than in the case of using only CO 2 , but the solution is not unique. A second important observation is that all models require a non-vanishing diffusivity on the order of 2 × 10 −9 m 2 s −1 within the LIZ (see Fig. 5b). This contradicts the notion that diffusivity vanishes completely in the LIZ due to the presence of impermeable layers, which is frequently expressed in the firn air literature (e.g. Battle et al., 1996;Trudinger et al., 2002;Assonov et al., 2005). Our findings confirm a recent study at Megadunes, Antarctica, that also reports a finite (eddy) diffusivity in the LIZ (Severinghaus et al., 2010). We believe this result to be robust, since it is reproduced by all six firn air models. In particular, the well reproduced height of the 14 CO 2 bomb spike, which falls entirely within the LIZ, provides strong evidence for finite LIZ diffusion at NEEM; letting diffusivity go to zero leads to a 14 CO 2 peak that is too narrow and overshoots the measured values by ∼ 7σ (Fig. 6).
How the LIZ diffusivity is parameterised, however, varies strongly among the models as shown in Fig. 5c. The models use either molecular diffusion, dispersive mixing (eddy diffusion) or a mixture of both. To fit the δ 15 N 2 data, most models require some dispersive mixing in the LIZ, since molecular diffusion alone would lead to continued gravitational Atmos. Chem. Phys., 12, 4259-4277 fractionation. The LGGE-GIPSA model circumvents this problem by using a combination of Fick's and Darcy's transport to describe almost stagnant transport in the lower zones (Witrant et al., 2011), which allows for using purely molecular diffusion in the LIZ. The SIO model uses the ratio of eddy to molecular diffusion in the LIZ as a tuning parameter in the RMSD minimisation, and finds that the fit is optimised for 27 % eddy diffusion (Supplement). Models that reproduce the observations equally well can have completely different parameterisations, so our analysis does not tell us which scheme is more likely to be correct. These model differences do have important consequences for the modeling of isotopic ratios, as is discussed in Sect. 4.4.1.

Gas age distributions *
Firn air does not have a single age, but rather a distribution of ages (Schwander et al., 1993). The modeled gas age distribution, also referred to as Green's function, transfer function, response function or age spectrum, provides a complete description of the model transport properties (Rommelaere et al., 1997). Figure 7 compares CO 2 age distribution densities for the models at the lock-in depth (z = 63 m) and at the bottom of the LIZ (z = 78 m, close to the deepest EU hole sample at z = 77.75 m). The relevant characteristics of the age distributions are summarised in Table 4. As a measure of the relative spread in model results, we list the 2σ standard deviation divided by the mean (µ).
The LIZ lies between the two depths depicted here. From the closed porosity parameterisation (Eq. 1), we find that ∼ 95 % of the air is trapped in this zone. The gas age distribution found in NEEM ice below the full close-off depth (s op = 0) will be intermediate to these two end members, with a tail of older air trapped already in the DZ. Interestingly, at the lock-in depth (z = 63 m), estimates of both the age and the distribution width show a spread of around 25 % between models that reproduce the data equally well. Clearly these properties are not as well determined as one would expect based on the similarity between the modeled curves of Fig. 3 alone. On traversing the LIZ to z = 78 m the relative spread in modeled ages is reduced; the absolute spread grows slightly to about 5 yr, though. The spread in distribution widths remains large.
For the US borehole the spread in the calculated mean ages at z = 63 m is even larger (41 %). This large spread is due to the fact that we have only four tracers on the US borehole, and therefore the mean ages are less strongly constrained by the data.
We can use the gas age distributions to calculate the NEEM modern day age, i.e. the difference between gas age and ice age below the firn-ice transition. From annual layer counting and matching of reference horizons to the NGRIP GICC05 timescale (Rasmussen et al., 2006), the ice at depth z = 63 (78) m is estimated to be 190.6 (252.5) ± 1 yr old (S. O. Rasmussen, personal communication, 2011). To calculate the true age we would require the gas age in the closed porosity. With the model results of Table 4 we can only calculate age op , i.e. the difference between ice age and gas age in the open pores. By averaging results from the six firn air models we find a age op of 181 ± 2 and 183 ± 2 yr at z = 63 and 78 m, respectively. This shows that the LIZ, ice and air are aging at roughly the same rate with depth. An estimated 4-5 % of the air is already trapped above 63 m, which biases our air age estimate young. This air fraction can introduce an error of at most 8 yr. Based on these considerations, our best estimate of the true age is 182 +3 −9 yr (181 +3 −9 yr) for the EU (US) borehole.

Borehole comparison
Here we compare the reconstructed diffusivity profiles and gas age distributions for the EU and US boreholes. Figure 8a shows the total diffusivity with depth for CO 2 on a semilog scale, where we have averaged over the solutions from the different firn air models. We observe that the largest divergence between the boreholes occurs in the LIZ. Also the gas age distributions show divergence mainly in the LIZ. Figure 8b shows the age distributions at two depths, where again the curves represent an average over model output from the different firn air models. At the lock-in depth (z = 63 m) the age distributions of both boreholes are still very similar. After traversing the LIZ (z = 76 m, chosen to be near the final sampling depth on the US hole), we find that the firn air in  Trudinger et al., 2002) at the lock-in depth (z = 63 m) and at the bottom of the LIZ (z = 78 m). All values given in years. The 2σ standard deviation divided by the mean µ gives a measure of the spread in the model results.

Model
Mean the US hole has undergone more diffusive mixing, leading to a broader distribution width. Note that these differences in transport properties appear to have only a minor impact on the age calculated in Sect. 4.2. Polar firn is a layered medium that exhibits density variations with depth caused by seasonal variations in local climatic conditions and deposited snow density (e.g. Hörhold et al., 2011). Snow drift and redeposition at the surface lead to lateral inhomogeneities in the firn stratigraphy. The borehole comparison suggests that firn air transport in the LIZ is very sensitive to these lateral variations, whereas transport in the DZ is not. This is not unexpected since the formation of the LIZ has been linked to the degree of layering (Landais et al., 2006). If the amplitude of density variations is sufficiently large, the high-density layers will reach the closeoff density before the low-density layers do, thus creating sealing layers that impede vertical diffusion and prevent further gravitational fractionation of the gas mixture (Martinerie et al., 1992;Schwander et al., 1993;Sturrock et al., 2002). In Sect. 4.1 we presented evidence for finite vertical mixing in the LIZ despite the presence of such impeding layers. The amount of vertical mixing is expected to depend on the horizontal extent of the sealing layers, as well as the position of (micro)cracks and passageways. This could lead to a strong lateral variation in LIZ diffusion as observed. Note that such lateral processes cannot be represented adequately in onedimensional transport models.

Synthetic diagnostic scenarios *
To diagnose model properties further, we developed four synthetic scenarios that probe specific aspects of the model physics. We present the individual scenarios below, and discuss the model differences they reveal.

Scenario I: diffusive fractionation
For gas species whose atmospheric mixing ratios change over time, the disequilibrium in the firn leads to an isotopic fractionation with depth, even in the absence of changes in atmospheric isotopic composition (Trudinger et al., 1997). As a scenario we use a monotonic CO 2 increase of ∼1 ppm yr −1 without seasonal cycle, chosen to resemble the current anthropogenic increase (Keeling et al., 2010). The surface isotopic ratio is kept fixed at δ 13 CO 2 = 0. Because the 13 CO 2 isotopologue has a lower diffusivity than the 12 CO 2 isotopologue, it is transported less efficiently into the firn, leading to a depletion in δ 13 CO 2 with depth. Ice core and firn air records need to be corrected for the effect of diffusive fractionation (DF). For example, the (site specific) corrections for the recent anthropogenic increase are ∼0.1 ‰ for δ 13 CO 2 at DE08, Law Dome (Francey et al., 1999), and 1.2 ‰ for δ 13 CH 4 at WAIS-Divide (Mischler et al., 2009). Figure 9a shows the modeled isotopic signal with depth for the different models. Results in the DZ are consistent, however they diverge strongly once the LIZ is reached. For δ 13 C of CO 2 , the observed range between the different model solutions in the deepest firn is 0.05 ‰, which is around 40 % of the total DF signal. For isotopologues of CH 4 the DF is larger, because of the larger relative mass difference between the isotopologues. Based on the observed DF model disagreement we estimate an uncertainty of 0.5 ‰ for both δ 13 C and δD of CH 4 . For δD of CH 4 this uncertainty is smaller than the typical measurement uncertainty, as well as the atmospheric signal observed in the firn (Bräunlich et al., 2001;Mischler et al., 2009). However, for both δ 13 CO 2 and δ 13 CH 4 the model disagreement exceeds the typical instrumental precision. At this moment we have no objective way of determining which of the models predicts the DF correctly. This additional uncertainty should therefore be recognised when interpreting ice core and firn gas isotopic records of periods with rapid atmospheric variations. Sites with a thick LIZ, combined with a low accumulation rate, allow for reconstructing atmospheric mixing ratios further back in time. However, a thick LIZ also means a larger uncertainty in DF (and thereby isotopic reconstruction) due to the poorly constrained LIZ transport.
Molecular diffusion leads to DF between isotopologues because their diffusion coefficients are different. Advection, convection and dispersive mass transfer do not discriminate between isotopologues, and consequently do not fractionate the gas mixture. The observed model disagreement originates in the way in which LIZ transport is parameterised (Fig. 5c). The CIC and OSU models both change from purely molecular diffusion in the DZ to purely eddy diffusion in the LIZ. Once the molecular diffusion approaches zero (and D eddy ≫ D X ), the DF ceases to increase with depth giving a horizontal line. The LGGE-GIPSA model, on the other hand, uses molecular diffusion all the way down to the close-off depth, giving a continued fractionation with depth. One way the DF uncertainty could be incorporated in future studies is by testing different parameterisations for LIZ diffusion, or by using transfer functions generated by different models.
The Law Dome-based records of δ 13 CO 2 (Francey et al., 1999) and δ 13 CH 4 (Ferretti et al., 2005) were both corrected for the effects of DF. The correction should not be affected much by our findings here for several reasons. Most importantly, the accumulation rate (and therefore the advective flux) at the Law Dome DE08 and DE08-2 sites is very high. This strongly reduces DF in the LIZ. Furthermore, the LIZ is relatively thin (Trudinger, 2001). The study by Ferretti et al. (2005) combines ice core samples from the aforementioned high accumulation Law Dome sites with firn air measurements from the DSSW20K site, which has a rather low accumulation rate (∼ 0.16 m yr −1 ice equivalent). After DF correction the results agree well. Other published firn air and ice core isotopic records that cover the recent anthropogenic increase have also been corrected for the effects of DF (e.g. Bräunlich et al., 2001;Sugawara et al., 2003;Röckmann et al., 2003;Sowers et al., 2005;Bernard et al., 2006;Ishijima et al., 2007;Mischler et al., 2009). Until the true nature of LIZ mixing is better understood, there is no way of knowing whether the DF magnitude calculated in these studies is correct. Our findings imply that the uncertainty in these atmospheric reconstructions might have been underestimated.

Scenario II: attenuation of a sine wave with depth
Rapid atmospheric variations are attenuated in the firn, and recorded in the ice with a reduced amplitude (Spahni et al., 2003). To compare the firn smoothing effect between the models, we force them with a sinusoidal atmospheric variation in CO 2 that has a 15 yr period. Due to diffusion, the amplitude of the signal decreases with depth in the firn. For mathematical convenience we impose two separate atmospheric scenarios: where t is the time in years. Note that both signals are attenuated to the same degree, and that the π/2 phase angle between them is preserved. To get the signal amplitude with depth, we combine them in the following way: The outcome is shown in Fig. 9b. We observe that signal attenuation in the LIZ is stronger than in the DZ. This is because transport in the DZ is efficient; the mean CO 2 age at the lock-in depth is around 9 yr, which is less than the period of the atmospheric oscillation. The mixing ratios in most of the DZ will be "in phase" with the atmospheric signal, giving little attenuation. By comparison, air in the LIZ is nearly stagnant. It takes around 60 yr to be transported over 15 m, corresponding to 4 periods of the sinusoid. As air of different ages (i.e. different phases of the atmospheric cycle) gets mixed, the signal amplitude is reduced. Within the DZ the models agree well. In the LIZ there are two intervals (65 < z < 68 m and 74 < z < 78 m) where the CSIRO model has significantly less attenuation with depth. These intervals correspond to the depths where the reconstructed CSIRO diffusivity goes to zero (Fig. 5b). On the US borehole this effect is even more pronounced (Supplement).
In Sect. 4.1 we touched upon the question of whether or not there is continued diffusion in the LIZ; our diffusivity reconstructions indicate that there is at NEEM. Scenario II shows that this remnant LIZ diffusivity causes very strong attenuation of the atmospheric signal. Therefore, the question of LIZ diffusivity has important consequences for the temporal resolution at which we can hope to reconstruct atmospheric variations back in time using ice cores.

Scenario III: balance of transport fluxes
The total gas flux is the sum of the advective, diffusive and convective fluxes. Within the DZ, molecular diffusion overwhelms the other transport mechanisms; in this scenario we bring the different fluxes into balance by considering a hypothetical gas X with a very small diffusion coefficient D 0 X = 0.025D 0 CO 2 , and a mass M X = M air +1×10 −3 kg mol −1 . The relative strength of the different mechanisms is shown by the Atmos. Chem. Phys., 12, 4259-4277, 2012 www.atmos-chem-phys.net/12/4259/2012/ gravitational enrichment with depth. This is the only of the four scenarios where gravity is turned on in the model runs. Figure 9c shows the model output together with the true gravitational enrichment as measured for 15 N 2 . In thermodynamic equilibrium the enrichment is given by the barometric equation: δ grav (z) = [exp( Mgz/RT )−1] ≈ Mgz/RT (Sowers et al., 1992). The advective and convective fluxes drive the air column out of equilibrium reducing δ grav (z) below the barometric case. For gas X the diffusive flux is not large enough to oppose the other two, leading to a final δ grav which is about half that of 15 N 2 . In this way the enrichment with depth represents the balance between diffusion on one hand, and advection and convection on the other.
The differences in implementation of the convective mixing are clearly visible. The LGGE-GIPSA model uses no gravitational enrichment in the top 4 m, followed by a tuned eddy diffusion. The convective mixing in the other models, which all use the parameterisation by Kawamura et al. (2006), penetrates much deeper into the firn. In the DZ there are also clear differences; here the gravitational slope mostly represents the balance between diffusion and advection. The CSIRO has a stronger advection term due to the absence of pore compression back flow (see below); this is reflected in a more reduced slope. A higher advective flux can partially be compensated in the diffusivity tuning by a lower diffusivity, and vice versa. This explains why the diffusivity profile reconstructed through the CSIRO model is slightly lower than those of the other models (see Fig. 5a).

Scenario IV: advective transport
Pore closure traps air in bubbles, which are subsequently advected downwards with the ice matrix. A downward flux of air in the open pores compensates for this removal process in the firn-ice transition. In the last scenario we study hypothetical gas Y , which is not subject to diffusion or convection, and is transported into the firn by the advective flux alone. Both eddy and molecular diffusion are set to zero. Its atmospheric history is a linear decrease by 1 ppm yr −1 , until it reaches zero at the final model date. This leads to a mixing ratio of gas Y in the firn that equals the age of the air at each depth. The results are shown in Fig. 9d. In the absence of diffusion, most models find an age of around 900 yr at the bottom of the firn column. The CSIRO model stands out with a much lower gas age. Differences in total air flux, and thereby the total air content in mature ice, affect the advective transport. Also, artificial numerical diffusion in the models reduces the calculated gas ages.
There are significant differences in the way advective transport is treated by the models. The key difference is whether or not the model includes an upward air flow relative to descending ice layers to account for compression of open pores (Rommelaere et al., 1997;Sugawara et al., 2003). Note that this term is smaller than |w ice |, and that the net air flow in the open pores is always downwards. The compres-sion term is neglected in the CSIRO model (Trudinger et al., 1997), which leads to a gas age that equals the ice age (since w air = w ice ). Note also that the observed difference is not related to the choice of reference frame (static vs. moving). In the LIZ air transport is dominated by advection, so this is where we expect to see the effect of the back flow most clearly. Indeed we see that in the LIZ the CSIRO model diverges most strongly from the other models, e.g. in the total reconstructed diffusivity (Fig. 5b) and the diffusive smoothing of scenario II (Fig. 9b).
Neglecting the back flow term corresponds to a physical model of the firn where all the air is transported downwards with the ice layers. This would be possible if, e.g. the denser firn layers succeed in completely sealing off the air below (Martinerie et al., 1992;Schwander et al., 1993). In the case such impermeable layers are present in the LIZ, pressure will build up in the pore space below. At some of the Law Dome sites (e.g. DSSW20K) outgassing was observed from freshly drilled boreholes, which could be caused by the venting of such pressurised pores to the atmosphere. However, at these sites the effects of wind pumping could not be excluded as the origin of the observed air flow. At NEEM, no outgassing from pressurised pores was observed. More evidence for the absence of sealing layers comes from the observation of diffusive mixing in the LIZ (Sect. 4.1). Finally, the total air content implied by the assumption of fully pressurised pores in the LIZ is incompatible with observed air content in mature ice (Martinerie et al., 1992).
For these reasons we believe that at NEEM the back flow due to pore compression can not be neglected when describing the firn air transport. This makes the assumptions in the CSIRO model invalid, explaining why it obtains a higher RMSD than most other models (Fig. 4). Preliminary tests show that by including the advective back flux in the CSIRO model the RMSD reduces to 0.74.

Summary and conclusions
We presented a new multi-tracer method for characterising the firn air transport properties of a site. We applied the method to NEEM, Northern Greenland, by tuning six firn air models to an ensemble of ten tracers. The firn air models used in this study were able to fit the data within a 1-σ normal distribution, meaning that the site can be modeled consistently within the estimated uncertainties. Each of the reference tracers constrains the firn profile differently through its unique atmospheric history and free air diffusivity, making our multiple-tracer characterisation method an improvement over the commonly used single-tracer tuning.
Six different firn air transport models were tuned to the NEEM site in this fashion, allowing a direct model comparison. We find that all the models require a non-vanishing diffusivity in the lock-in zone to reproduce the measured profiles of the reference gases. This is contrary to the commonly held notion that diffusive mixing is absent in the lock-in zone. A comparison between replicate boreholes located 64 m apart suggests that this lock-in zone diffusion is sensitive to lateral inhomogeneities in firn stratigraphy. Furthermore, we find that despite the fact that the models reproduce the measured profiles equally well, the gas age distributions they calculate can differ substantially. Often-used quantities, such as the mean age and distribution width, differed among the models by up to 25 %. The modern day age was calculated to be 182 +3 −9 yr. Finally, we introduced four diagnostic scenarios that are designed to probe specific aspects of the model physics, from which we can draw two main conclusions. First, we studied the consequences of the different ways in which the models implement advection, in particular whether they include a back flow term to account for pore compression. Including the back flow term significantly improves the fit to the data, suggesting this term can not be neglected, as is done in the CSIRO model. Second, we find that the effect of isotopic diffusive fractionation is poorly constrained by the models. Near the close-off depth, the calculated diffusive fractionation differs by 40 % between the models. These differences are related to the way in which lock-in zone diffusion is parameterised. Molecular diffusion leads to continued fractionation, whereas dispersive transport (eddy diffusion) does not. Further work is needed to elucidate the true nature of lock-in zone diffusion. Meanwhile, when interpreting firn air and ice core isotopic records, an additional uncertainty needs to be included to account for the model discrepancy found here.