A Lagrangian model of airmass photochemistry and mixing using a trajectory ensemble : the Cambridge Tropospheric Trajectory model of Chemistry And Transport ( CiTTyCAT ) version 4 . 2

Introduction Conclusions References


Introduction
The Cambridge Tropospheric Trajectory model of Chemistry and Transport (CiTTyCAT), was developed by Wild (1995) and published in the peer-reviewed literature in Wild et al. (1996).Since that time it has been extensively utilised and modified.Evans (1999) and Evans et al. (2000) used CiT-TyCAT to investigate the production and loss of ozone in airmasses arriving at the west coast of Ireland.Methven et al. (2001) introduced the technique of origin averaging to associate air mass origins with concentration signatures of trace species.Emmerson (2002) and Emmerson et al. (2004) T. A. M. Pugh et al.: The CiTTyCAT Lagrangian model added a module to calculate aerosol formation from condensables, and applied this to secondary organic aerosol formation in the UK.Arnold et al. (2004) used CiTTyCAT to investigate the lifetime of acetone in the upper troposphere in response to new calculations of quantum yields for acetone photolysis.Donovan et al. (2005) studied the effects of urban trees on ozone formation and pollutant deposition.Real et al. (2007Real et al. ( , 2008Real et al. ( , 2010) ) and Cain et al. (2012) used the model to investigate intercontinental pollutant transport across the North Atlantic, and isolated the impact of aerosols from forest fire plumes.Cain (2009) also applied the model to study transport and processing of ozone in the West African monsoon flow.As part of these investigations, Cain (2009) and Cain et al. (2012) utilised a novel method of defining a background profile of composition, using an ensemble of trajectories as described in detail below.Most recently Ryder (2005), Hewitt et al. (2009) and Pugh et al. (2010aPugh et al. ( ,b, 2011) ) updated the model chemistry to include detailed isoprene and monoterpene degredation schemes, and applied this to the study of biogenic volatile organic compounds (VOCs) and ozone over a South-East Asian tropical rainforest, including an investigation of the OH recycling hypothesis of Lelieveld et al. (2008).This paper brings together many of these individual developments, and provides the first peer-reviewed documentation with the aim of providing a comprehensive description of the CiTTyCAT model.
Atmospheric chemistry models can be broadly grouped into three classes: box/column models, Lagrangian models, or Eulerian models (e.g.Seinfeld and Pandis, 2006, Chapter 25).Eulerian modelling, where air is advected relative to a fixed grid and species are modelled at all locations at every timestep, is well suited for global or regional simulations, for instance where the aim is to test effects on global climate or model regional air pollution.It is not currently possible to achieve the global resolution required to resolve air-masses as they are stretched and folded by the flow down to scales where their contrast with surroundings is lost.The mixing timescale derived from observations (7-15 days for the free troposphere; Arnold et al., 2007) is much longer than the average exponential stretching rate (only 1-2 days).The effect of coarse resolution is to average emissions, chemistry and transport over grid boxes, implying unrealistically rapid mixing which, given the non-linearity of atmospheric chemistry, can lead to significant errors (e.g.Tan et al., 1998;Wild and Prather, 2006;Real et al., 2010).In addition, the complexity of the chemistry is often reduced by lumping of species (grouping together several real species under one model species), particularly in hydrocarbon chemistry (e.g.Emmerson and Evans, 2009).The errors associated with species lumping used to simplify the chemistry can be significant, particularly when the model is applied outside the chemical conditions for which the chemical scheme was devised (Taraborrelli et al., 2009).For these reasons, particularly when looking at case studies following air-masses for many days, it is advantageous to use a model where advection is separated from other processes.In particular the separation of advection and chemical processing is necessary to evaluate current understanding of the chemistry, since in the real atmosphere variability in measurements of tracers such as ozone from aircraft or surface sites is dominated by changes in air mass origin (see e.g.Methven et al., 2003Methven et al., , 2006)).Two possibilities for effecting this separation are to use a box/column model or a Lagrangian model.Below we provide a brief introduction to, and rationale for, Lagrangian atmospheric chemistry modelling, drawing primarily on published CiTTyCAT studies for illustrative examples.
A Lagrangian photochemical model evolves composition following airmass trajectories that are usually pre-calculated off-line.As the frame of reference moves with the air, no advection term is required.Such trajectory models have a multitude of uses, for instance pollution case studies (Strong et al., 2006(Strong et al., , 2010;;Derwent et al., 2007) or investigation of intercontinental air mass evolution (Real et al., 2007(Real et al., , 2008)).Although some problems are avoided through the use of a Lagrangian model, there are certain disadvantages.Accurate initialisation of chemical concentrations is important, and may be challenging where these are not constrained by observations.The central concept of a distinct parcel of air being advected is itself subject to limitations (Liu and Seinfeld, 1975;Ghim and Seinfeld, 1988).Consider a small volume of air bounded by a material surface which everywhere moves with the velocity of air.Even if it were possible to follow the flow on all scales, molecular diffusion would result in mixing of material between the parcel and its surroundings.In practice, any material surface, no matter how small, is stretched and folded by the flow and this greatly enhances the rate of mixing.In a Lagrangian model a conceptual partition is introduced between advection by the resolved flow and mixing, enhanced by unresolved turbulence, between an airmass and its surroundings.This mixing must be parameterised in Lagrangian model, requiring knowledge of the composition of adjacent parcels.Solutions typically involve a relaxation term to a constant background composition (e.g.Real et al., 2007Real et al., , 2008)), running a "background" box prior to running the box of interest (e.g.MacKenzie et al., 1991), or using evolving fields from a Eulerian model to define the background.Stochastic mixing has also been applied in the Lagrangian framework (Pisso et al., 2009).There is no rigorous partition between the resolved and unresolved flow.However, in practice meteorological analyses are used to define the resolved flow, and these are a blend of global observations and a numerical weather prediction model obtained using data assimilation.Assumptions of balances between variables are incorporated into data assimilation of observations that are both sparse and uncertain.The unresolved flow is noisy with short decorrelation timescales and in the Lagrangian approach these are assumed to be represented by some form of random walk or diffusion.Depending on the meteorological conditions, the timescale for rendering the original air mass incoherent through stirring

Mode Example applications
Single box mode Multi-parameter sensitivity studies, case studies over homogeneous terrain, chemical mechanism evaluation.

Two box mode
Accounting for representation of mixing between two largely discrete entities, e.g.nocturnal boundary layer and residual layer.
Trajectory mode Pollutant transport, urban plume evolution, source attribution Ensemble-trajectory mode Long-range transport events where an evolving background mixing ratio is needed to correctly represent the plume.
by large-scale motions and mixing by unresolved turbulence may range from a few hours to a few weeks, although they are typically of the order of a week (Stohl and Trickl, 1999;Arnold et al., 2007).Box/column models typically consist of a single stationary box or column of boxes.They can be used to study the chemistry at a single point, often considered to be characteristic of a wide area.Box/column models can be constrained or unconstrained.Constrained box/column models use point measurements of trace species as boundary conditions on their calculations.As the measurements implicitly account for any changes in air mass composition caused by advection, they are particularly useful for testing chemical schemes by attempting to reproduce a particular measurement of a short-lived chemical such as OH (e.g.Emmerson et al., 2005Emmerson et al., , 2007;;Kubistin et al., 2010).They can also be used to estimate the concentrations of unmeasured species based upon the current state of knowledge of the chemistry (e.g.Carslaw et al., 2001;Tan et al., 2001).Unconstrained box/column models calculate species concentrations forward in time from initial conditions and are typically run to steady state or a steady recurring diurnal cycle (e.g.Pugh et al., 2010a).Typically an emissions term will be included if the model is positioned near a suitable source.Because emissions into a box are mixed instantaneously throughout the box, a dilution term must also be specified, which for surface emissions is typically rather loosely defined as the depth of the atmospheric boundary layer (see Pearson et al., 2010, for a discussion of various measures for the boundary layer depth relevant to photochemical box modelling).In some cases a venting term may also be applied (e.g.Pike et al., 2010).Unconstrained box/column models are particularly useful for case studies where the emission fluxes, rather than long-lived tracer concentrations, can be constrained, and for planning and scenario-based work where measurements are not available.
CiTTyCAT is capable of operating in four modes: singlebox, two-box, single-trajectory and an ensemble-trajectory mode (Table 1).There is an option to constrain certain species in box mode.CiTTyCAT is particularly suitable for process and sensitivity studies of long-range transport and terrestrial tropospheric chemistry.
CiTTyCAT calculates the evolution of air mass composition, along one or many given trajectories.The trajectories are calculated by a separate model.In most applications of CiTTyCAT meteorological analyses from the ECMWF are used to calculate trajectories using either the ROTRAJ model (Methven, 1997) or FLEXTRA (Stohl et al., 1995).Several other Lagrangian models take a different approach, and calculate both trajectories and composition, in a variety of different ways.For example, ATLAS (Alfred Wegener InsTitute LAgrangian Chemistry/Transport System, Wohltmann and Rex, 2009) calculates trajectories throughout the stratosphere (with a basic troposphere), and optionally includes a chemical scheme with 49 tracers and 170 reactions.ATLAS was designed to be used for both high-resolution process studies and for decadal simulations of the global stratosphere, and handles both transport and chemistry in one model.
An extension of trajectory models (which use the resolved winds to calculate trajectories) is the Lagrangian particle dispersion model (LPDM).LPDMs release particles that are advected by large-scale winds plus a stochastic term to represent unresolved turbulent motions.A well-used and cited LPDM is FLEXPART (Stohl et al., 2005).The UK Meteorological Office's NAME model is an LPDM that now includes the STOCHEM chemistry scheme (70 chemical species and 160 gas-phase reactions, Collins et al., 2000).NAME uses a hybrid Lagrangian-Eulerian approach: the chemistry is calculated on a fixed grid, where all particles located within a grid cell are brought together for the chemistry calculation (Jones et al., 2007).
Using either approach (trajectory or LPDM) it is necessary to exchange material between trajectories to represent mixing.In LPDMs turbulent mixing is represented by the random walk of particles and then a grid is imposed and the properties of particles within grid cells are mixed by averaging.With passive tracers this can be done once at the end of the trajectories (a retroplume calculation).There are choices to be made about the grid-size and degree of mixing.However, to represent the nonlinear coupling between mixing and chemistry it would be necessary to perform regular re-gridding.CiTTyCAT takes an alternative approach to mixing.Rather than use a random walk, it is assumed that mixing is dominated by diffusion of a vertical profile (obtained by averaging the trajectory ensemble) plus a constant flux parameterisation in the boundary layer.The advantage of this approach is that the mixing ratio tendencies associated with mixing can be added to those from photochemical reactions so that the processes are coupled on the ordinary differential equation (ODE) integrator time-step (rather than a much longer re-gridding timescale).This is particularly important for short-lived species such as NO x and isoprene.
As CiTTyCAT uses pre-calculated trajectories, many sensitivity tests can be carried out on the same set of trajectories.Using a LPDM, this is not possible, as the stochastic parameterisation of turbulence means that the same particle releases will not follow exactly the same paths each time they are run.
The Edinburgh-Lancaster Model for Ozone (ELMO-2) is a UK-specific Lagrangian chemistry-transport model (CTM) (Strong et al., 2010), which simulates chemistry along HYS-PLIT trajectories (Draxler and Rolph, 2003), pre-calculated backwards from a planetary boundary layer measurement.ELMO-2 includes the STOCHEM chemistry scheme and a well mixed planetary boundary layer, however there is no mixing when trajectories are in the free troposphere.Although ELMO-2 and CiTTyCAT are based on the same approach, the latter is more sophisticated in its treatment of chemistry and mixing.
The current version of the model, described herein, is 4.2.Section 2 describes the basic model routines common to all modes.This is followed by a description and evaluation of the box modes in Sect.3, and of the trajectory modes in Sect. 4. Section 5 describes special provision for sensitivity studies and a summary is provided in Sect.6. Model performance statistics and software requirements are listed in Appendix A.

Basic model overview
CiTTyCAT is written in Fortran 90.It is compiled and run on linux/unix using the Make utility and consists of some 50 000 lines of code in 64 different subroutines.Figure 1 shows the model layout and in the following section the routines common to all modes are described.The primary dimensional units for CiTTyCAT are centimetres, grams and seconds.Altitude is described on pressure surfaces.A flow chart explaining how to run the model is given in Appendix B.

Differential equations for evolution of chemical species
The model is formulated on the conservation law for the mass of constituents: where c denotes mixing ratio of a trace species, ρ is air density (molecules cm −3 ), F (molecules cm −2 s −1 ) is a 3dimensional (3-D) non-advective mass flux vector representing any transport not attributable to advection by the resolved 3-D velocity vector, u (cm s −1 ), and S represents material sources or sinks (s −1 ).Combining with the mass continuity equation: where the Lagrangian derivative Dc/Dt = ∂c/∂t + u.∇c represents the rate of change following fluid parcels.In the model, two assumptions are made regarding the nonadvective fluxes.Firstly, the vertical flux divergence dominates, which is a typically a good approximation since vertical gradients are often much greater than horizontal.Secondly, the non-advective flux term is assumed to be composed of a diffusive term and a linear flux profile across the boundary layer of prescribed depth, h (cm): where F z is the vertical component of the non-advective flux, F s is the flux at the ground (z = 0) and F d is the diffusive part which obeys a flux-gradient relation: where κ is a diffusion coefficient (cm 2 s −1 ) and a no flux boundary condition on the diffusion term is used at z=0.In Sect.4.2, the details of a numerical scheme to define the diffusive flux in terms of a background vertical profile, C(z), are derived by vertically averaging a trajectory ensemble and a sub-grid mixing term is given.
Writing the surface flux in terms of emissions, ρE, and loss by dry deposition, ρV dry , and wet deposition in the form of a material sink at rate, r wet ,we obtain the model equation: where P (s −1 ) represents photochemical production terms not dependent on c, L is photochemical loss rate (s −1 ), E is the emission rate into the boundary layer (cm s −1 ), h is the mixing height of the boundary layer (cm), V dry is the dry deposition velocity (cms −1 ) and r wet (s −1 ) is the first order wet removal rate coefficient (see Sect. 2.5 for details of the deposition schemes).The surface flux term in Eq. ( 4) is only active when the air parcel is inside the planetary boundary layer (0 < z < h) and density is assumed to be approximately uniform across the boundary layer.
The different modes of operation for the model refer only to the way in which the environment of an airmass changes with time and the formulation of the diffusive flux divergence.In the "box mode", the airmass temperature and pressure are prescribed and the airmass is assumed to be stationary in one location for the purposes of calculation of solar zenith angle, the photolysis scheme and emission/deposition fluxes.In "trajectory mode", temperature, pressure, humidity mixing ratio, boundary layer height and precipitation rate (if used by the wet deposition scheme) are all given by interpolating analyses (or numerical weather forecasts) in space and time to points along pre-calculated trajectories.
In box mode and single trajectory mode, the diffusive mixing is either assumed to be zero (F d = 0) or the flux divergence term is represented by a relaxation to a background mixing ratio at a specified mixing rate (see Sect. 4.2).The background mixing ratio can either be specified for each species or obtained by interpolation from a global 3-D model (usually at fairly coarse resolution).In these modes there can be no communication of emissions and deposition through the boundary layer top except indirectly through time varying boundary layer depth, h.
In ensemble trajectory mode, the diffusive flux divergence term is modelled by mixing between multiple (usually more than 100) trajectories, each of which evolves photochemically.In this case, vertical mixing between the trajectories propagates the effect of surface fluxes beyond the boundary layer top.The formulation of the model is flexible and allows for a more sophisticated treatment of mixing via non-advective fluxes.However, the simple approach taken here performs well when evaluated against observational data from the ITCT-Lagrangian (International Transport and Chemical Transformation) experiment (Methven et al., 2006).

Chemistry schemes
The CiTTyCAT chemistry scheme is designed to be flexible, allowing easy modification for different studies.Rate coefficients for any of the bimolecular and termolecular reactions can be modified in the appropriate data file as updated values become available, and this procedure does not require recompilation.This section describes the current state of the chemistry scheme and then outlines the procedure for performing modifications to this code using the DELOAD module (Brown et al., 1993).Currently 185 gas-phase species undergo 373 bimolecular, 37 termolecular and 120 photolysis reactions.

DELOAD
One of the tenets of the CiTTyCAT approach is that it should be relatively straightforward to change the chemical scheme, particularly for individual species or sensitivity studies.The CiTTyCAT reaction files for bimolecular, termolecular and photolysis reactions are stored as ASCII files that can be fed to the DELOAD programme (Nejad, 1986;Brown et al., 1993), which then generates Fortran code containing the production and loss terms for the ordinary differential equations.The DELOAD module is maintained separately from the model, and is run only when reactions are added to, or removed from, the chemical scheme.The kinetic data contained within the reaction files can be modified without having to run DELOAD or recompile the model.

Core chemistry
The core CiTTyCAT chemistry contains inorganic, NO x , O x and HO x chemistry, along with a treatment of methane and sulphur species.The anthropogenic hydrocarbon scheme (Hough, 1991) includes oxidation by OH of 9 species from ethene to benzene and has been updated to include acetone and methanol (e.g.Arnold et al., 2004).All rate coefficients in these schemes have been updated according to IUPAC (2006) if available, and JPL06-2 (Sander et al., 2006) or the Master Chemical Mechanism version 3.1 (Saunders et al., 2003) if not.Toluene and xylene chemistry follow the more explicit scheme of Jenkin (1996), which produces a potentially condensable final product that can be used to drive an aerosol scheme.Isoprene and monoterpene chemistry are detailed in the following sections.

Isoprene chemistry
The original CiTTyCAT isoprene treatment (Evans et al., 2000) did not include organic nitrates, which have been shown to be very influential for the NO x budget (e.g.Poschl et al., 2000;von Kuhlmann et al., 2004;Horowitz et al., 2007).The second edition of the Mainz Isoprene Mechanism (MIM2) (Taraborrelli et al., 2009), has now been added to the model.MIM2 gives a much more comprehensive treatment of isoprene oxidation, having 69 species undergoing 178 reactions, and contains a more detailed representation of intermediate products such as peroxy radicals, alkyl nitrates and peroxyacylnitrates.MIM2 closely follows the Master Chemical Mechanism (MCM) chemistry (Saunders et al., 2003), with lumping of species only used when the species in question have very similar reactivities and are nearly always in constant ratio regardless of the conditions.In line with the CiTTyCAT philosophy, it is designed to be flexible to allow future updates in chemistry to be easily incorporated in a way which MIM did not.Given the current uncertainty in isoprene chemistry and its importance for gas-and aerosolphase chemistry (e.g.Lelieveld et al., 2008;Kiendler-Scharr et al., 2009;Pugh et al., 2010a;Stone et al., 2011), the absence of tuning of reaction rate coefficients in MIM2 (in order to improve the fit for certain species at the expense of others; a common practice in condensed mechanism development) is useful as modifications can be made to the chemistry without unintentionally de-tuning the scheme All bimolecular and termolecular reactions and rate coefficients are implemented as described in Taraborrelli et al. (2009).However, as CiTTyCAT is designed to be used at all altitudes in the troposphere, the simple MCM photolysis rates assigned in Taraborrelli et al. (2009) have been replaced with absorption cross-sections and quantum yield data, which has been assigned as listed in Table 2.As measured values are not available for many species, proxy species have been assigned following the MCM (Saunders et al., 2003).These values are then fed into the CiTTyCAT photolysis scheme (Sect.2.3).The photolysis reactions for PAN and PAN-like species have also been invoked as these are likely to be important in the upper troposphere.

Monoterpene chemistry
After isoprene, monoterpenes are believed to be the most abundantly emitted group of BVOCs, with emissions of ∼130 Tg yr −1 (Guenther et al., 1995).Reacting rapidly with OH, O 3 and, where present, NO 3 , (Atkinson and Arey, 2003), monoterpenes play an important role in the chemistry of the boundary layer, including significant potential to form aerosol (Hallquist et al., 2009).The α-pinene chemistry scheme incorporated in CiTTyCAT (Jenkin, 1996) is designed to generate realistic yields of condensible secondary products which may contribute to secondary organic aerosol formation.As α-pinene is one of the less reactive monoterpenes with respect to OH -having a lifetime of 2.6 h at [OH] = 2 × 10 6 molecules cm −3 (Atkinson and Arey, 2003) -chemistry of the more reactive monoterpene limonene (lifetime with respect to OH of 49 min at [OH] = 2 × 10 6 molecules cm −3 , Atkinson and Arey, 2003) has also been included (Ryder, 2005).The limonene scheme used is that of Stockwell et al. (1997).This limonene scheme should be used with caution in studies where the isopreneexclusive oxidation product methacrolein (MACR) is a key output, as it apportions a fraction of limonene oxidation products to MACR.As no other chemistry in the model depends on these monoterpene schemes, either scheme can be effectively turned off by setting emissions and initial mixing ratios of the relevant monoterpene to zero.

Heterogeneous chemistry
This version of CiTTyCAT does not include heterogeneous chemistry, although DELOAD is able to handle such reactions.Including such chemistry requires assumptions regarding aerosol/cloud particle composition and size which have not been justifiable in recent applications of the model.Quasi-heterogeneous chemistry is included in the gas-phase for N 2 O 5 hydrolysis.Future applications of the model must assess whether heterogeneous chemistry is important for their case, and add reactions if necessary.

Photolysis
This version of CiTTyCAT currently uses an isotropic twostream approach, based on the Harwell model of Hough (1988).The poor performance of this method at low solar zenith angles is circumvented by using a Chapman function to calculate an effective solar zenith angle for all angles greater than 75 • (Wild, 1995).Real et al. (2007) previously coupled CiTTyCAT with the Fast-J photolysis scheme of Wild et al. (2000), which offers several advances over the two-stream model, including being proven to handle accurately an arbitrary mix of cloud and aerosol layers.Fast-J Table 2. Source of photolysis data for the MIM2 species used in CiTTyCAT.The MCM primary species are listed.Some of the MCM species are used as proxies for species for which dedicated data is not available, as per Saunders et al. (2003)
is currently being implemented in the combined CiTTyCAT model.
As the O 3 column shows considerable latitudinal variation, a zonally averaged O 3 climatology (Li and Shine, 1995) is read in according to the model latitude and month.This climatology used the Solar Mesosphere Explorer near-infrared airglow data and is available at http://badc.nerc.ac.uk/data/ ugamp-o3-climatology/ugamp help.html.This dataset has 19 pressure levels between 997 mb and 4 mb, which are interpolated to the pressure levels used by the photolysis scheme.Two further levels are defined in the stratosphere, at 0.03 mb and 0.04 mb, which do not change latitudinally or temporally.These are based on climatology data assembled by Rumbold (2007), which is scaled to the main climatology at 4 mb.Alternatively, the US Standard Atmosphere can be used.Cloud coverage can be user defined, use ECMWF output, or set using the incorporated mean global climatology based on London (1957) and Rodgers (1967).Three discrete cloud levels are available; low (assumed to be stratus), medium (mixture of altostratus and altocumulus) or high (cirrus).The wide variability of aerosol levels found in the lower atmosphere makes treatment of particular scattering and absorption difficult to generalise; an average aerosol profile is coded into the model, based on that of Braslau and Dave (1973), with elevated levels of aerosol in the lowest model levels appropriate for urban conditions (Demerjian et al., 1980).If required this may be replaced with another profile interpolated to kilometre levels.A global mean column optical depth of 0.1 is used as a default (Hough, 1988).
The photolysis scheme has been tested with surface NO 2 photolysis data from Mauna Loa at 3400 m (Shetter et al., 1992), and is in very good agreement with this data (Wild, 1995).A further test was carried out by running the model for Danum Valley, Malaysian Borneo, with cloud cover adjusted to replicate the measured j(O 1 D) profile at that location.The modelled j(NO 2 ) agreed well with measurements made at that site (Fig. 2).Note, however, that the two-stream scheme has problems converting observed cloud/aerosol cover to the correct photolytic flux.In instances where this is critical, Fast-J should be used instead.

Surface emissions
A variety of surface emission schemes is incorporated, covering biogenic and anthropogenic sources.A strong emphasis has been given to biogenic emissions which have been the topic of considerable interest and several major field campaigns in recent years (e.g.Karl et al., 2007;Kuhn et al., 2007;Lelieveld et al., 2008;Hewitt et al., 2010).However, further emission sources such as biomass burning, aircraft and oceanic emissions or higher resolution datasets can easily be added where these datasets are available.For instance, the new global dataset of Lamarque et al. (2010) developed for the IPCC fifth assessment report would be straightforward to add if so required.Emission fluxes are read in at each model step, converted to number fluxes (molecules cm −2 s −1 ), and instantaneously mixed over the prescribed boundary layer height (if simple or no mixing is used).If vertical mixing with an evolving background is used, the emissions fluxes are part of the surface boundary condition for the flux profile.

Anthropogenic emissions
Global anthropogenic emissions of CH 4 , CO, NO and non-methane VOCs come from the EDGAR Fast Track 2000 global database (Olivier et al., 2001) providing averages for the year 2000 on a 1 • × 1 • grid.EDGAR provides a lumped non-methane VOC value which must be speciated for input to the large number of anthropogenic species represented in CiTTyCAT.This presents a challenging task on the global scale, as VOC speciation varies widely across the globe.CiTTyCAT uses a speciation based on the European CORINAIR emissions (Evans, 1999); this speciation is not necessarily suitable for locations outside Europe.An optional temporal modification according to the time of day and day of week is included.For studies over South-East Asia, the 0.5 • × 0.5 • resolution Regional Emissions inventory in Asia (REAS), which provides total emissions for the year 2003 (Ohara et al., 2007), has been added.This inventory also provides speciated hydrocarbon data, carbon monoxide, methane and anthropogenic NO x .This avoids the use of the CORINAIR VOC speciation.It is beyond the scope of this paper to assess the validity of these emission datasets, which is the responsibility of the individual user.However, it is noted that different inventories often give quite different values for the same grid square.

Soil NO x emissions
Soil NO x is a potentially important contributor to the global NO x budget (e.g.Yienger and Levy, 1995;Delmas et al., 1997) and therefore the 1 • × 1 • dataset of Yienger and Levy (1995) has been added to the model.This database gives monthly emissions avergaed over the period 1985-1995 and was created using an empirical model, incorporating the effects of different biomes, pulses after rainfall and canopy uptake.Other databases exist, however, the global emissions of soil NO x inventories show significant variation from 5.5 Tg N yr −1 (Yienger and Levy, 1995), through 8.6 Tg N yr −1 (Steinkamp and Lawrence, 2011) and 9.7 Tg N yr −1 (Potter et al., 1996) to 21.0 Tg N yr −1 (Davidson and Kingerlee, 1997) and a decision between them cannot currently be made on the basis of measurement validation.The Yienger and Levy (1995) database was adopted because it implicitly accounts for the effect of canopy reduction on soil emissions, a vital component for a boundary layer and troposphere model that is fed by out-ofcanopy emissions.Canopy losses are thought to reduce the raw soil NO x fluxes by around 50 % worldwide (Yienger and Levy, 1995;Ganzeveld et al., 2002).It is straightforward for a user to incorporate a specific soil NO x emissions data set into the model, for example over a particular region of interest.

Biogenic hydrocarbon emissions
CiTTyCAT offers a range of options for biogenic emissions.Isoprene emissions can be taken directly from the global inventory of Müller et al. (2008), although recent measurements show this inventory to overestimate substantially the emissions of tropical rainforest in South-East Asia (Pugh et al., 2010a;Langford et al., 2010).Alternatively emissions can be calculated on-line using the MEGAN model (Guenther et al., 2006).This approach is much more flexible than using an inventory and also allows emissions of monoterpenes or any other biogenically emitted gases for which emission factors are available to be calculated.Also available is the MEGAN light and temperature dependence algorithm alone, neglecting the other datasets fed into MEGAN.Finally the model can be driven with user supplied biogenic emissions.We discuss these implementations in the following sections.No marine sources of biogenic VOC emissions are currently considered as they are very small compared to land emissions, although Arnold et al. (2009) suggest that oceanic emissions can maintain a few tens of pptv of isoprene in remote marine areas and this may be enough to increase radical sources such as HCHO in the marine boundary layer.

MEGAN model on-line
MEGAN estimates above-canopy fluxes of biogenically produced species including isoprene, several monoterpenes and sesquiterpenes, NO, CH 4 and CO.It takes into account both past and current temperature, photosynthetically active radiation (PAR) and leaf area index (LAI), along with solar zenith angle and maps of plant functional type.It is also capable of accounting for soil moisture and production and loss within the canopy.The basic MEGAN algorithm is, where, (µg m −2 h −1 ) is an emission factor which represents emission into the canopy under standard conditions, and is modified for different conditions by the canopy environment activity factor (γ CE ), leaf age emission activity factor (γ age ) and soil moisture emission activity factor (γ SM ).The parameter, ζ , is a ratio which accounts for production and loss within plant canopies.A more detailed description of the algorithms employed, along with various alternative techniques is given in Guenther et al. (2006).
As the MEGAN algorithms are computationally straightforward, and the principal variables required to generate emissions -temperature and PAR -are available in CiTTy-CAT, running MEGAN on-line is a low-cost solution providing flexible emission estimates.CiTTyCAT directly supplies the current surface temperature (either calculated internally from the box temperature or brought forward from the trajectory) and PAR (calculated internally by the photolysis scheme or specified on a trajectory).Daily average temperature and PAR is sourced from a monthly mean global temperature dataset on a 96 × 73 grid.LAI is sourced from the monthly-mean datasets for the year 2003 supplied on the MEGAN data portal website (see http://acd.ucar.edu/∼ guenther/MEGAN/MEGAN.htm).This data is sufficient to drive MEGAN using the PCEEA algorithm (Guenther et al., 2006) to calculate the canopy environment activity factor, in place of the MEGAN canopy environment model.
The leaf age emission activity factor is calculated as outlined in Guenther et al. (2006) using current and previous LAI and average temperature data.The soil moisture emission activity factor is assumed to be unity, as providing data on soil moisture at each trajectory step would require past weather data at that point, although this could be changed when running in box mode.Since water stress has a proven impact on plant emissions (e.g.Ormeno et al., 2007), it is important to consider this limitation in scenarios where a trajectory passes over a surface where soil moisture is likely to be a significant issue, and therefore this factor may be modified off-line.The canopy loss/production factors are also assumed to be unity in the standard implementation.However, they can also be edited as desired.It has been shown that the results from using a variety of simple and complex canopy environment models are within the uncertainty range of the flux calculation (Lamb et al., 1996) and so the lack of an explicit canopy environment model, for which CiTTy-CAT cannot provide the driving variables, is not expected to present a limiting uncertainty.
The emission factor for isoprene is calculated for the current model grid square using global maps of six plant functional types (broadleaf tree, deciduous needleleaf tree, evergreen needleleaf tree, grass, crops, shrubs) and maps of the global variation of emission factors for each of these groups (Guenther et al., 2006).For the monoterpene species the plant functional type maps are used in conjunction with a single global emission factor for each group.Currently datasets of 30 min resolution are used, in line with the resolution of other emissions.Datasets of 150 s resolution are also available.The highest resolution data that might be used to drive CiTTyCAT over rural areas without approaching the limitations of the assumption of instantaneous boundary layer mixing that is implicit within the model is ∼5 km 2 (approximately equivalent to 150 arc second resolution at the equator).Currently only isoprene and monoterpene emissions are fed from MEGAN into CiTTyCAT.Total monoterpene emissions are split between the α-pinene and limonene schemes.The fraction of total monoterpenes emitted as αpinene and limonene must be reviewed for each case study, and should be based on the observed fraction of more reactive (i.e. more limonene like) and less reactive (more α-pinene like) monoterpene species.
This implementation has been tested for isoprene by comparing to measured emissions over the Amazon (Karl et al., 2007) and South-East Asian rainforest (Langford et al., 2010).Figure 3 shows the results of this comparison for SE Asia rainforest.The measured eddy covariance flux of isoprene is shown by the black line.This is several times smaller than the base flux of isoprene generated by MEGAN for the same location (4 • 58 59 N, 117 • 50 39 E ).However the MEGAN input datasets were primarily compiled using data for the Amazon.Using the calculated canopy-scale isoprene basal emission rate for this location of 2.0 mg m −2 h −1 (Langford et al., 2010), yields a result very close to the measurements (red line).A similar result is yielded using the light and temperature algorithm only (green line), i.e. neglecting the effects of the LAI dataset.This shows that, at least for this region, the assumptions made regarding past temperature and PAR using global datasets do not impair the model's ability to generate reasonable isoprene fluxes.
To test the model where fewer constraints are available, a run was carried out over Amazonia using the peak temperature (30 • C), location (−2.612 • N 60.21 • W) and time of year (September) specified in Karl et al. (2007).The model generates peak fluxes of almost 20 mg m −2 h −1 , much more than the measured 7.8 mg m −2 h −1 .However the peak PAR generated using the inbuilt climatological cloud cover is almost 2500 µmol m −2 h −1 , much higher than would be expected for this location.No measurement for PAR is given, but adjusting it to the ∼1500 µmol m −2 h −1 peak seen at a similar latitude in Borneo and applying the measured basal emission factor of 5.8 mg m −2 h −1 , instead of the 8.1 mg m −2 h −1 generated by the model, produces a peak emission of 9.4 mg m −2 h −1 .This analysis highlights two issues.Firstly, the importance of using appropriate cloud cover to generate the correct PAR, and secondly, that global emission factor datasets can currently only supply an approximation to the actual conditions.Nonetheless, the analysis here demonstrates that the CiTTyCAT implementation of MEGAN is able to generate realistic fluxes of isoprene.

MEGAN model off-line
The model has been updated to use data from the 1995-2006 emissions inventory by Müller et al. (2008), which was created using the latest version of MEGAN combined with the detailed canopy environment model MOHYCAN (MOdel for HYdrocarbon emissions by the CANopy, Wallens, 2004) which calculated the leaf temperature and the radiation fluxes within the canopy.The isoprene emissions data are for the monthly mean diurnal cycle (24 times per day, from 00:30 Z to 23:30 Z) on a 0.5 • × 0.5 • grid for each month of 2001.The year 2001 was chosen as this was the only year for which the diurnal cycle was readily available to the authors.However, the interannual variability in this dataset between 1995 and 2006 is a maximum of 20 %.The driving meteorology is from the ECMWF operational analysis.
The MEGAN implementation of Müller et al. (2008) differs from the CiTTyCAT implementation in several ways.Since leaf temperature and the PAR vary with respect to height within the canopy, MOHYCAN calculates these variables for each of eight layers.The canopy model requires direct and diffuse values of PAR and near infra-red radiation at the canopy top, as well as air temperature, relative humidity and wind speed just above the canopy.These are taken from the ECMWF analyses.Soil moisture data is also drawn from ECMWF analyses.An additional distinction between evergreen and deciduous broadleaf trees is made based on the global ecosystem database of Olson et al. (2001).
For the South-East Asian rainforest scenario described above for testing the online version of MEGAN, Müller et al. (2008) estimate a midday isoprene emission of 6.4 mg m −2 h −1 , ∼60 % greater than the value measured by Langford et al. (2010).As described previously, the overestimation of MEGAN in this region is attributable to incorrect estimation of the basal emission factors.For the Karl et al. (2007) measurement site, Müller et al. (2008) calculate an isoprene emission of 7.3 mg m −2 h −1 compared to the average of 7.8 mg m −2 h −1 that was measured, an excellent agreement.However this probably gives a misleading impression of the accuracy, as comparisons to different Amazonian measurements in Müller et al. (2008) showed that the dataset tended to overestimate isoprene fluxes by a factor of 1.7.Nonetheless, using comparisons with observations of the formaldehyde column as measured by GOME (Global Ozone Monitoring Experiment), Müller et al. (2008) show that their results are closer to the observations than those of Guenther et al. (1995).

Other options
Isoprene emissions can also be driven by the MEGAN light and temperature dependent algorithm alone (Guenther et al., 2006).This algorithm is frequently used in analysis of flux datasets and such datasets may be used to optimise the empirical parameters in the algorithm (e.g.Langford et al., 2010;Misztal et al., 2011).Its inclusion in CiTTyCAT allows a more straightforward comparison of the model with such analyses (e.g.Hewitt et al., 2011).A further option that can be used to compare with campaign data is to feed measured fluxes directly into CiTTyCAT.

Dry deposition
Dry deposition is parameterised as described in Eq. ( 6).Dry deposition velocity, V dry , determines the dry deposition flux of a species at the surface according to the concentration of that species at a reference height (usually 10 m or less).In CiTTyCAT, species within the boundary layer are assumed to be well mixed, and thus the surface dry deposition flux is apportioned uniformly across the depth of the boundary layer via V dry / h.Dry deposition velocities may be provided by the user, but CiTTyCAT is pre-programmed with a set of defaults following Derwent and Jenkin (1991), who give the deposition velocities for different atmospheric constituents for summer and winter day-time and night-time, for each of the following land surface categories: water, forest, grass, tundra/desert and ice/snow.The land surface type is categorised using the MEGAN plant functional type dataset (Guenther et al., 2006) at a resolution of 0.5 • × 0.5 • .The value of the dry deposition velocity at a given point is a weighted average of the dry deposition velocities for each land type in the grid cell which contains the current trajectory latitude and longitude.There is a sinusoidal seasonal and a sinusoidal diurnal variation applied to the deposition velocities, based on the summer/winter and day time/night time values.As described in Sect.2, dry depostion is only active when an air parcel is inside the planetary boundary layer, although in trajectory ensemble mode (Sect.4.2) these surface fluxes are propagated vertically.
With the addition of the more complex biogenic chemistry schemes, more deposition velocities have been added from the literature (Appendix C).Literature deposition velocities for isoprene nitrates are particularly uncertain, varying from similar to those for PAN (0.4-0.65 cm s −1 , Shepson et al., 1996;Giacopelli et al., 2005), to those for nitric acid (4-5 cm s −1 Rosen et al., 2004;Horii et al., 2006).In CiTTyCAT, they have been allocated the deposition velocity of nitric acid following the work of Horowitz et al. (2007).Acetone deposition velocities show reasonable agreement for both land and sea in the literature (Jacob et al., 2002;Singh et al., 1994Singh et al., , 2003)), hence a single value is used throughout.Methanol deposition velocities show notable differences between values measured over land (Karl et al., 2004;Millet et al., 2008) and sea (Singh et al., 2003) and that is reflected in the chosen values.In the absence of data for ice and desert, these have been assumed the same as for water.
The CiTTyCAT dry deposition scheme is simple compared to the commonly-used resistance scheme of Wesely ( 1989), but is more appropriate for a box model where micrometerological parameters are not calculated.It also allows very easy user modification to allow the model to be parameterised by field measurements.This is particularly pertinent in the light of new research showing much higher deposition velocities for many organic compounds than is predicted by the Wesely (1989) scheme (Karl et al., 2010;Pugh et al., 2010a).

Wet deposition
The wet deposition scheme used was first implemented into CiTTyCAT by Real et al. (2008), described therein as S-WET2, and then implemented into this version of the code.It is based upon the work of Walton et al. (1988), specifying the wet deposition rate for a species, r * (s −1 ), as where S is a scavenging coefficient, p is the column integrated precipitation rate (cm h −1 ) and α spec is a solubility factor which varies from 0, for insoluble gases, to 1 for very soluble species such that α HNO 3 = 1.The factor α spec is defined in Real et al. (2008) following the work of Crutzen and Lawrence (2000) and based upon the Henry's Law coefficient for each species.Values used for S are 2.4 cm −1 for stratiform precipitation and 4.7 cm −1 for convective precipitation, following the analysis of nitrate scavenging using six years of precipitation data in Penner et al. (1991).
In order to account for the sub-grid scale nature of precipitation, r * is modified to produce an effective wet deposition rate, r * eff , according to, where, t, is the model timestep in seconds and f is the fractional area of the grid cell over which precipitation is occurring.For convective precipitation f = 0.3 and for stratiform precipitation f = 1.0 following Walton et al. (1988).
No more recent values appear to be available in the literature, however appropriate values of f for convective precipitation may vary according to the size of the area the CiTTyCAT box is being used to represent.This scheme is applied under the caveat that the column integrated precipitation rate, p, is supplied as a rate at the surface (usually taken from a numerical weather prediction model or local measurements).The distribution of precipitation in the vertical is unknown.Therefore the assumption is made that if the precipitation rate is non-zero, its effects will be felt by an air parcel anywhere in the column.An exception to this is descending parcels, which are almost invariably sub-saturated, and as a result will only experience precipitation if they happen to run under precipitation falling from a saturated air-mass above.In the absence of information regarding the relative humidity of the column, a condition is applied that the wet deposition rate in descending air parcels is zero.
In order to calculate α spec the solubility of a gas in water must be defined by its Henry's Law coefficient, k H .This is usually defined as, where C a is the concentration of a species in the aqueous phase and P g is the partial pressure of that species in the gas phase.The coefficient is temperature dependent, with greater solubility typically observed at lower temperatures.Sander (1999) defines the temperature dependence of the Henry's Law coefficient as, where T is the temperature in K. Dissolution of acids has the effect of drawing more of that species into the aqueous phase and must be considered when considering the amount of a gas partitioning into the aqueous phase.For a monoacidic species the effective Henry's Law coefficient, k * H accounting for this dissociation can be defined as, where K c is the dissociation constant of the acid and [H+] is the hydrogen ion concentration in solution.CiTTyCAT has been modified to adjust the basic Henry's Law coefficient according to the box temperature at that timestep and a specified [H+] (pH = 5.6 -the pH of rainwater at an ambient CO 2 mixing ratio of 350 ppm).The number of species deposited has been extended from the 10 species considered in Real et al. (2008) to include all the aldehyde, peroxide and peroxyacyl species in the extended organic chemistry schemes (Sect.2.2) which may have a significant solubility.
The species deposited and their Henry's Law coefficients are listed in Table 3.Only a few studies have been carried out to test the solubility of these organic gases and hence the k H for many species has been inferred using measurements for species with similar arrangements of functional groups.These "proxy" species are listed, where appropriate, in Table 3.The wet deposition scheme has been tested by Real et al. (2008) who modelled the photochemical evolution of a plume from the New York region in its transit across the North Atlantic.They found that incorporation of this wet deposition scheme, SWET-2, yielded a significant improvement in the model fit to the observations for HNO 3 and NO y .Much of the trajectory followed in the work of Real et al. (2008) was in the free troposphere, away from surface emission sources.In the first two days of their run, when the trajectory was in the boundary layer, and hence tracers were constantly replenished by surface emissions, there was a substantial difference between the results of the wet deposition scheme described here and a simple constant loss rate approach, with the wet deposition scheme performing better in comparison to the measurements.

Integration of chemical reactions
The chemical integrator is the core of the CiTTyCAT model, and is the part of the model that requires the most computing time.Coupled ordinary differential equations for the evolution of mixing ratios (Eq.6) are integrated using the DVODE (Variable-coefficient Ordinary Differential Equation) solver using variable-coefficient Adams-Moulton and Backward Differentiation Formula methods in Nordsieck form with a variable time step (Brown et al., 1989).For the ensemble mode this timestep is always smaller than the "mixing time step" (typically one hour) which is the frequency at which the background vertical profile is updated by averaging across the ensemble.The DVODE integration is then re-started at every mixing time step.
Importantly, processes with rates proportional to the mixing ratio of the species being updated are treated as "loss terms", −Lc, so that the calculation of the Jacobian of the rate equations by linearization of the ODEs reflects the exponential decay associated with such terms.This avoids the spurious generation of negative mixing ratios.The Jacobian method is used to make the numerical solution of the stiff set of ODEs more efficient in achieving a specified tolerance on both absolute and fractional error in mixing ratios of all species.Chemical production and processes not proportional to c come into the equations through the P term.The Jacobian is regularly updated at the "physics time-step" (typically 300 s) allowing for dependence of the coefficients such as L and P on other species and the changing physical processes in the environment of the airmass (e.g.temperature, pressure, photolysis, emission and deposition rates).The only chemical species not integrated are O 1 D, H 2 O, N 2 and O 2 .O 1 D is calculated instantaneously from its production via the photolysis of O 3 , and its loss via reaction with CH 4 , H 2 O and O 3 , or collision deactivation with N 2 or O 2 .N 2 and O 2 are held fixed as their mixing ratios vastly exceed those of the species of interest in the model.H 2 O is updated by interpolation from meteorological analyses to trajectory points and is not affected by the chemistry scheme.

Description of single box mode
The single box mode can be driven in two ways, specified using the flag, ltraj.The first (ltraj = false) allows for very quick assessments to be made by making a number of assumptions, the second allows the assumptions to be reduced for a more detailed analysis, by supplying additional data.
Simple assessments can be made by specifying the latitude, longitude and pressure altitude of the box, along with a start day and time and a run length.The model then chooses an appropriate temperature and boundary layer height (h) based upon internal datasets (Law and Pyle, 1993).If desired, a diurnal temperature amplitude can also be specified.The model then picks up emissions and deposits species as per the routines outlined in Fig. 1 (model layout) and described in Sect. 2. If MEGAN is called for biogenic emissions, then surface temperature is estimated by scaling according to the dry adiabatic lapse rate.As emissions will only apply in this mode if the air parcel is in the boundary layer, then the dry adiabatic lapse rate is a reasonable assumption.The simple box mode given by ltraj = false is excellent for making initial assessments or for teaching purposes, however for a detailed assessment of a scenario, for instance for comparison with measurements, it is necessary to provide more input data regarding variables such as temperature, boundary layer height, cloud cover, rain rate and specific humidity.To supply these variables the model can be run as a stationary trajectory (ltraj = true).That is, all the required variables are supplied in a file which is prepared offline and read in by the model using the same scheme as for the trajectory mode.Currently variables which can be supplied are, box temperature, surface temperature, photosynthetically active radiation (PAR), convective and stratiform rainrates, specific humidity and low, middle and high cloud cover fraction.These input variables are interpolated to the model timestep.
Box studies are limited to situations where advection is not dominant, or situations where the surface over which the air is advected is relatively homogeneous over the timescale of interest.They are commonly used for analysis of data collected at surface sites (e.g.Evans et al., 2000).In this manner it has been tested extensively by Hewitt et al. (2009) over a tropical rainforest and oil palm plantation in Malaysian Borneo, and was used to study stagnant anticyclonic conditions over the UK west midlands in Donovan et al. (2005).Under stagnant conditions, advection timescales are relatively long, allowing the box model to be run to steady state to investigate effects of changing the emissions on ozone chemistry.Although there are currently problems in the understanding of oxidation processes over high isoprene environments such as tropical rainforest and oil palm (Lelieveld et al., 2008;Butler et al., 2008;Pugh et al., 2010a), it has been demonstrated that models can still generate good predictions of NO x and ozone chemistry in these regions (Pike et al., 2010;Pugh et al., 2010a).Figure 4 shows the change in ozone mixing ratio for a change in NO x mixing ratio, which was driven by a change in NO emissions, for two different VOC emission scenarios; a rainforest environment (green) and an oil palm plantation environment (red).Measured boundary layer O 3 and NO x mixing ratios for these two environments are plotted in black and grey, respectively (Hewitt et al., 2009).These show that the model was able to generate ozone and NO x mixing ratios within error of the measurements for two different VOC emission scenarios.

Description of 2 box mode
When operating in box mode at the surface, the influence of the nocturnal residual layer on morning boundary layer mixing ratios can be important.To account for this the model has where C L and C U are the mixing ratios of species i at time t for the upper and lower box respectively, h is the mixing height (height of the lower box), m is the rate of mixing height rise and t is the model timestep.When in two-box mode the model automatically detects a rise in mixing height specified in the input file, and will initiate mixing with the residual layer according to the rate of this rise, however the time for planetary boundary layer collapse must be set in the model.Although the residual layer box continues to integrate during the day, its results during this time are ignored until its mixing ratios are reset again upon boundary layer collapse.This method is useful for two reasons.Firstly, it allows more meaningful comparison with measurements during the night due to the distinction between the nocturnal boundary layer (NBL) and the residual layer.Care must still be taken here though, as the NBL is typically strongly stratified at night and hence tracer mixing ratios may vary widely through its depth if there is a night-time emission source.Meanwhile, mixing events between the NBL and the residual layer may take place during the night, as suggested by Ganzeveld et al. (2008), based upon the work of Gao and Li (1993) and Turner et al. (1994).Hence the residual layer may not be truly isolated from the surface during the night.Under some conditions, these effects may be compensated for by adjusting the deposition velocities to reflect a stratified regime in the boundary layer (Pugh et al., 2010b), and by extending the mixing scheme to include some exchange during the night.
The effect of the modified night-time scheme on daytime tracer mixing ratios can be substantial.Figure 6 shows an example of the effect of accounting for residual layer mixing ratios as opposed to running a single box during the night (Pugh et al., 2010a).Daytime mixing ratios of NO 2 are lower in two-box mode because the residual layer box is not subject to emissions over the course of the night, so when the two boxes are mixed together in the morning (between 08:00 and 10:00 LT in this scenario) there is a much sharper decline in mixing ratio than that experienced due to photochemistry alone in the single box.This effect leads to less OH in twobox mode, which has a substantial effect on the mixing ratio of isoprene.Ozone has reduced day-time photochemical production, due to less NO 2 , balanced by the mixing-in of high residual layer ozone mixing ratios in the morning which have not undergone substantial overnight deposition.Using the measurements gathered during the Oxidant Particle and Photochemical Processes (OP3) field campaign (Hewitt et al., 2010), it has not been possible to definitively ascertain whether the two-box mode more accurately predicts boundary layer mixing ratios than a single box.However, as a result of the effects described above, if the analysis of Pugh et al. (2010a) had used the single-box model rather than the two box model, the particular problems they encountered in reconciling measured NO x mixing ratios to soil NO x fluxes would have been greatly enhanced and the goodness of fit of species such as isoprene (C 5 H 8 ) during the day would have deteriorated.
As the effect of model mode on daytime mixing ratios can be so substantial, a simple way to avoid having to parametrise the details of mixing events or dry deposition rates under a stratified regime is to fit the height of the NBL according to the model fit to daytime mixing ratios.This is the approach adopted by Strong et al. (2010) and Pugh et al. (2010a).By using the NBL height as a fitting parameter in order to gain the correct morning mixing ratios, modified dry deposition rates and mixing between the NBL and the residual layer are implicitly accounted for.Overall, the two-box mode offers an option for increased physical insight in conditions of regular boundary-layer evolution.

Trajectory mode
In trajectory mode, the notional air parcel travels along the path of a given trajectory, which describes the position and meteorological conditions at any given time.In the examples shown here, the trajectory information is interpolated from the ECMWF analyses, however other sources of trajectory information could be used.The model runs along trajectories pre-calculated using winds and temperature from meteorological analyses.In addition, boundary layer height and precipitation rates, output from the driving forecast model during its analysis cycles, are interpolated to trajectory points and used as inputs to the mixing and various wet deposition schemes.The single trajectory mode is the same as the single box mode (ltraj = true) described above, but using a moving trajectory in place of a stationary trajectory.In the ensemble mode, many trajectories are run simultaneously and communicate through the vertical mixing scheme.

Single trajectory mode
Running CiTTyCAT along a single trajectory is faster than along an ensemble, and therefore useful for preliminary work, experiments that do not require sophisticated mixing, and for teaching illustrations.To set up a single trajectory run, initial conditions and trajectory information are required (as for the box run).Tracers in a single trajectory run may undergo no mixing, as for the single box, or they may be relaxed either towards a user specified constant background mixing ratio, or towards a 3-D tracer field specified by a CTM.The rate of relaxation is prescribed using a diffusion coefficient, κ.Prior to the development of the ensemble, the single trajectory model was used to study long range transport, for example by Wild et al. (1996), Evans et al. (2000) and Real et al. (2007Real et al. ( , 2008)).As the treatment of single trajectory mode has been covered thoroughly in these publications, and the principle of running a single trajectory is the same as for the box mode described in Sect.3, a detailed description is not given here.

Ensemble mode (co-evolution of trajectory and background)
The aim of the ensemble mode is to represent variability within an airmass and its contrast with surroundings so that ) to form the background mixing ratios in that layer.For instance, at t = 1, the mixing ratios in one trajectory define the background in the lowermost model level, however, for the second level mixing ratios from four trajectories are averaged to find the background.Note that the trajectories do not have to start at precisely the same timepoint, i.e. they can be released at time intervals along a flight track.
mixing can be coupled fully with photochemistry.Here it is implemented through the use of a background vertical profile that evolves with the ensemble, taking into account the influence of surface fluxes below all trajectories.Simply relaxing towards static background mixing ratios has limited validity, as in many cases the actual background will change as a trajectory travels thousands of kilometres over many days.Real et al. (2010) recently described a method (ZooM-CiTTy) that used an ensemble of trajectories to represent the contribution of several air masses to a measurement at a particular location.Mixing is performed at the end of the 5-day trajectories for comparison with measurements.The method showed utility in examining the evolution of reactive tracers, and in capturing observed mixing lines between different air masses, but Real et al. (2010) note that a more realistic treatment of mixing "en route" is required to capture the nonlinear coupling between chemistry and mixing.This section details an ensemble mode in which the model is run simultaneously along many trajectories that communicate with each other through an on-line mixing scheme which is active throughout the simulation.There are two components to the mixing scheme: (i) the evolution of a background vertical profile through diffusive mixing, and (ii) a representation of mixing of individual airmasses within a layer towards the background mixing ratio of that layer.This ensemble explicitly represents the variability within an air mass, which is essential for comparing any model with data.First, an ensemble of trajectories is initialised; in the example here aircraft data at ten second intervals are used.Typically greater than 100 trajectories are used and CiTTyCAT is integrated forwards along each trajectory.Each trajectory evolves by photochemistry, emissions, wet and dry deposition (as described in Sect.2) and the mixing scheme described below.Full diagnostics are not output by the model for every trajectory.Instead, one trajectory is selected to be the focus of the study, for which full diagnostics are saved.Henceforth, this is referred to as the "reference trajectory".Typically the reference trajectory lies near the centroid of the trajectory ensemble so that it is always surrounded by others above and below for the purposes of mixing.
At the start of the simulation, a background profile is calculated by dividing the troposphere into layers of equal depth, z, as shown in the schematic in Fig. 7.The ensemble members that lie within a layer are averaged, and their mean is taken to be the background mixing ratio in that layer.As the trajectory ensemble that is initialised from aircraft measurements generally does not sample the entire depth of the troposphere (e.g.Fig. 8) "shadow trajectories" are used to simulate the evolution of the background profile above and below the height range of the ensemble.These shadow trajectories follow the same horizontal path of the reference trajectory and evolve photochemically, but do not move vertically (as shown in Fig. 8).Their purpose is only to provide suitable far-field mixing ratios to the background profile.The assumption of no wind shear in the shadow trajectories introduces an uncertainty in the background composition, as in reality an air mass will come into contact with air masses of different origins.However, the scheme represents an improvement compared to a simple relaxation to a static background.An example of the hourly background profiles for the first day of a simulation is shown in Fig. 9. Initialisation of the background trajectories is as follows: below the ensemble the airmass is assumed to be well mixed and the mixing ratio is set to the value of the lowest layer of the ensemble; above the ensemble, it is either a linear interpolation (as in Fig. 9) or a step-change to the upper tropospheric background mixing ratio obtained from campaign-average data.
Once the background profile is established, non-advective fluxes (F d in Eq. 6) in each layer are calculated, which act to evolve the background profile mixing ratio.This evolution can be seen in Fig. 9, as over the first day, the effects of dry deposition can be seen near the surface as the O 3 mixing ratio decreases with time.The effect of averaging within layers can also be seen, as the initial inhomogeneity in the plume is smoothed with time (e.g.blue profiles to green profiles).
The boundary layer and the free troposphere are treated differently in the new mixing scheme (as described in Sect.2.1), and there is a non-zero flux through the boundary layer top, which communicates surface effects vertically (the single trajectory version of CiTTyCAT has no flux through the boundary layer top).Hence the effects of emission and deposition at the surface can be felt even by air masses running above the boundary-layer.This is vital for considering air mass evolution on timescale greater than one day.
In the examples shown in this section and Sect.4.3, trajectories have been calculated offline from the ECMWF analysed fields using the Reading Offline TRAJectory Model (ROTRAJ) (Methven, 1997).The horizontal resolution of the analyses used in this work is T159 (∼0.75 • ).The vertical resolution in hybrid-pressure co-ordinates retains 60 levels (L60).There is a linear interpolation in time and horizontal space, and a cubic interpolation in the vertical, of the meteorological fields (e.g.wind velocity, humidity, temperature) between analysis times, which are every six hours.The location of the trajectory is calculated by integrating the velocity with respect to time using a 4th order Runge-Kutta method (Methven, 1997).

Evolution through diffusive mixing: theory
Diffusion affects mixing ratio through the non-advective flux divergence (from Eq. 6): where ρ is assumed to fall exponentially with density scale height, H : where ρ s is density at the surface and H is typically 7200 m.
In the free troposphere, the flux-gradient hypothesis is assumed to calculate the diffusive flux: where κ is the free tropospheric diffusion coefficient.The gradients are discretised using the standard centred finite difference scheme with regular level spacing, z, assuming constant diffusivity: where L denotes the level index and half-levels are midway between full-levels.
The finite difference equation above is a second order approximation and is only accurate for sufficiently smooth profiles.Therefore, it is applied directly to the background profile C(z L ) obtained by averaging the ensemble.However, in general the mixing ratio on each trajectory c differs from the background mixing ratio around the same level.Consider a special case when a layer (depth D) of high mixing ratio, c, is flanked above and below by the same background mixing ratio, C. In this case (assuming uniform density) Eq. ( 17) reduces to: where τ is a mixing timescale defined by D 2 /(2κ).Arnold et al. (2007) inferred a value for the dilution rate, K mix = 1/τ , from samples with multi-component hydrocarbon measurements taken in polluted air masses crossing the Atlantic in the five ITCT-Lagrangian case studies.The depth D in these cases was estimated from observations from the aircraft flying a vertical profile.Using these values, κ, can be estimated.Typical values in the free troposphere were κ = 0.5-1.5 m 2 s −1 .Similar values were calculated by Pisso et al. (2009).In other cases, κ may not be so well constrained by observations, and must be estimated, dependent on the meteorological situation.Under stable conditions, κ 0.5 m 2 s −1 would be typical.If the meteorological conditions are more turbulent, 1.5 m 2 s −1 would be more suitable.The authors suggest that κ is estimated based on observations if possible, or a comparison with literature values, and that sensitivity tests are carried out to establish the uncertainty in this parameter.In cases where there are weak gradients in the background profile, varying κ may only have a small effect on the results.However, if there are strong gradients (e.g. a biomass burning plume), the choice of κ may be important.

Evolution through diffusive mixing: model
The basic input parameter is the effective turbulent diffusivity for the free troposphere, κ.In the ITCT-Lagrangian cases this was estimated from hydrocarbon data using Bayesian inference as described above.The effects of diffusive mixing are treated using two terms: where the free tropospheric dilution rate, K mix = K FT = 2κ/ z 2 , represents "sub-grid mixing" within each layer used to define the background profile C L , and in this model equation L is the layer within which the trajectory of interest lies currently.Note that the first term corresponds to diffusion of the background profile and is calculated every "physics time-step".The background profile itself is also evolved using Eq. ( 19) without the sub-grid mixing term, plus the linear flux profile mixing the surface fluxes across the boundary layer (Eq.4).The solution for diffusion of the background profile uses a no flux boundary condition at the ground, since the effects of surface fluxes are already accounted for in the linear flux profile.A fixed value boundary condition is used at the top of the background profile.Each trajectory evolves its own background profile to be consistent with the surface fluxes it passes over.Since the trajectories within the ensemble do not follow exactly the same horizontal paths, they pass over different emissions and environments (varying land surface type, precipitation, temperature and so on).So the background profile is periodically re-defined by averaging across the ensemble every "mixing time-step".The mixing timestep is chosen so that the background profiles carried on each trajectory do not have time to diverge too far from the ensemble average before being re-defined.
As turbulence in the boundary layer tends to be greater than that in the free troposphere, a separate dilution rate is used to represent sub-grid mixing within the boundary layer defined by: where h is the boundary layer depth and the user specifies κ BL (typically set to 10 κ).Importantly, this only affects the sub-grid term and the same diffusion coefficient, κ, is used at all levels for diffusion of the background making it better conditioned numerically.Recall that, in addition, the boundary layer is parameterised using a flux profile varying linearly between the surface fluxes (emissions and deposition) and zero at the top of the boundary layer.The combination of fluxes transports the emissions out of the boundary layer at a rate determined by the free tropospheric diffusion coefficient and influenced by the change in boundary layer height with time.
The mixing scheme defined by Eq. ( 19) can be partitioned into a mixing "loss term", −K mix c, and a "production term" (the remaining terms), which are then added to the photochemical production and loss terms and integrated over the (variable) time step by the DVODE integrator, as described in Sect.2.6.

Implications
The scheme in Sect.4.2 is a reasonable approximation as long as the ensemble of trajectories travel coherently, and do not diverge too greatly in the model run.Care must be taken every time the model is set up to avoid choosing ensembles of trajectories that diverge too much.This is a judgement that must be made based on the case in question.A column of evolving background trajectories follows each trajectory separately.Every mixing time step (typically 1 h), the background profiles are averaged.In some cases, the trajectory ensemble may diverge such that this averaging is inappropriate.For example, if a plume splits with one half passing over ocean and the other half passing over a continent, then it may be better to simulate the two sub-plumes separately.The approach also assumes that the gradients in composition are greater in the vertical than the horizontal.Figures 10 and 11 show an example simulation from the ITCT-Lagrangian 2004 experiment.The aim of this experiment was to study transport across the North Atlantic by taking Lagrangian measurements of plumes.This was a case study of transport of forest fire emissions from Alaska and Canada across the N. Atlantic to Europe.The main plume was low in humidity and travelled at ∼4-7 km altitude over five days.The same plume was intercepted by research aircraft three times during these five days (case 2 in Methven et al., 2006).The top and middle panels in Fig. 10 show results from the single trajectory mode (Real et al., 2007), the top panels with photochemistry only and the middle panels with photochemistry and mixing (relaxation towards a background value).The lower panels show results using the ensemble mode with photochemistry and mixing with an evolving background (dry and wet deposition are not important in this dry, high altitude case).The reference trajectory is shown by the orange arrow.This is similar to, but not the same as, the single trajectory from Real et al. (2007), as a different model was used to calculate the trajectories.
A benefit of running an ensemble is that it gives a spread in the results, and thus provides more information about the variability in composition downstream, compared to just running one single trajectory.The spread of the model results will show whether there is much sensitivity to the initial conditions and trajectory followed, and whether or not the plume becomes more homogeneous with time with regard to each species.In the example in Fig. 10, the variability in the plume of both O 3 and CO reduces with time, for both the model and the observations (note that the model is initialised with observations).
For CO, a comparison of the chemistry only single trajectory simulation, the single trajectory with mixing and the ensemble simulation shows that the introduction of mixing brings the model closer to the observations.This is because the high-CO fire emissions plume is situated at high altitude, close to a low-CO stratospheric intrusion (Methven et al., 2006), thus mixing acts to reduce the CO mixing ratio.The results suggest that the ensemble mixing is too strong over the first two days.This is consistent with the meteorological conditions as described by Real et al. (2007) and analysed in detail by Pisso et al. (2009): the trajectory was in clearsky, quiescent conditions until 20 July (day 202), followed by more turbulent conditions within a warm conveyor belt after this time.In their single-trajectory study, Real et al. (2007) use this additional observational insight to parameterise weak mixing prior to 20 July and stronger mixing after, and hence rationalise the model-observation difference in terms of known processes.For O 3 , the effect of mixing is less obvious, as the surrounding air has a similar composition to that within the plume.
To summarise, the ensemble mixing scheme parameterises mixing through vertical diffusive fluxes, and provides a physically based mechanism of communicating between trajectories.Its advantages over other parameterisations are that it does not confine surface effects (emissions and dry deposition) to the boundary layer, and that it provides a spread of mixing ratios instead of a single value.Other, complementary, ensemble-based approaches are also being developed following Real et al. (2010).

Provision for sensitivity studies
One of the great advantages of Lagrangian box models is that they can be used to make wide-ranging sensitivity studies to an extent that would not be possible with an Eulerian model.(Real et al., 2007).The variability in the air mass captured by the modelled ensemble is clearly illustrated, allowing a fuller evaluation of the simulation as compared to the measurements.
For instance, global modelling studies such as von Kuhlmann et al. ( 2004), Wild and Palmer (2008), Butler et al. (2008) typically select a small number of scenarios to test different concepts, perhaps one run using a best estimate of global isoprene emission, and a second run with that number increased by 50 %.However, using a box or trajectory model it is possible to explore the parameter space around the variables of interest in much greater detail (e.g.Derwent, 1987), which can be valuable for ascertaining uncertainty in decision-making (Committee on Models in the Regulatory Decision Process, 2007).Pugh et al. (2010a) explore a hexa-variate parameter space using a "brute force" sensitivity study consisting of 10 000 runs.CiTTyCAT includes provision for such sensitivity studies by attaching multiplicative factors to parameters of interest.These factors are specified in the run shell script which calls the model.However, unlike single runs, sensitivity studies use loops to vary the magnitude of the factors before each call to the model.Several loops can be nested inside one another to carry out runs covering a multi-parameter space.In order to make the generated output more manageable, output from each run is appended sequentially to the standard output files.
Such multi-variate CiTTyCAT sensitivity studies have been used in several ways.Hewitt et al. (2009) use a trivariate sensitivity study to explore the variation in ozone mixing ratios produced by modifying the emissions of isoprene, monoterpenes and NO into a box model.Pugh et al. (2010a) use a similar tri-variate emission sensitivity study, but compare the output model mixing ratios to measurements using a cost function in order to estimate appropriate

Summary and future applications/developments
The CiTTyCAT model has been updated extensively since its first inception, and represents a flexible tool for studying atmospheric chemistry case studies, particularly for local/regional airmass-following analyses initiated from aircraft or surface data, or for box model studies involving biogenic organic compounds.A comprehensive mixing scheme for trajectories has been added, along with detailed biogenic VOC chemistry and on-line emissions from the MEGAN model.The model has been shown here to perform well for a range of different case studies, including intercontinental plumes and the tropical rainforest.Previous studies have also demonstrated good performance in the middle latitudes.The new mixing scheme offers a novel method to account for the changes in background mixing ratios, which can be difficult to address in Lagrangian studies.
A previous version of CiTTyCAT (Emmerson et al., 2004) incorporated a simple primary and secondary aerosol scheme.The plugs for this scheme still exist in the model code, and it is intended to fully integrate this routine in the future.Likewise, the Fast-J photolysis scheme (Wild et al., 2000), which includes a more detailed treatment of the effect of aerosols on radiative transfer, was also incorporated into a previous version of CiTTyCAT (Real et al., 2007), and it is intended that this will also be included in the core version of CiTTyCAT held in the Subversion repository.This work is on-going, but the individual components are available.The CiTTyCAT model is available for download by contacting the authors.

Model performance
The model was tested by running on a single CPU with a clock speed of 2.2 GHz.In this instance the model was compiled using the Intel Fortran compiler version 10.1 and −O3 optimisation.CiTTyCAT is run using double precision arithmetic on linux/unix and is initialised using the bash shell.It is tested on the Portland group (pgf90), GNU (gfortran), Intel (ifort) and Sun (f95) Fortran 90 compilers.The make utility and the Netcdf Fortran toolbox must be installed.The model is available for download from a Subversion (http://subversion.apache.org)controlled repository (user must have Subversion installed) by contacting the authors.
Model documentation beyond the current paper exists primarily as in-code comment; further descriptions of algorithms and rationale for choices are provided in Wild (1995), Evans (1999) and Cain (2009).Release of a new model version is by consensus of the authors of this paper.Errortrapping is facilitated by code-sharing and identified errors are logged and corrected within the code. 1 Derwent and Jenkin (1991).2As for PAN. 3 As for HNO3 following Horowitz et al. (2007). 4Jacob et al. (2002). 5Singh et al. (2003). 6Karl et al. (2004).
2) yields the classical Lagrangian equation for mixing ratio:

Fig. 2 .
Fig. 2. Comparison of photolysis rates measured at Bukit Atur, Malaysian Borneo during the OP3 campaign (Hewitt et al., 2010) (black line, ±1 standard deviation, dotted lines) with model output (blue line) for j(O 1 D) (left) and j(NO 2 ) (right).Model cloud cover was adjusted to approximate the j(O 1 D) fit.

Fig. 3 .
Fig. 3. Comparison of CiTTyCAT-MEGAN isoprene emission estimates with flux measurements made at Danum Valley, Malaysian Borneo.The black line is the mean flux measurement (dotted lines ±1 S.D.).The blue line is the basic CiTTyCAT-MEGAN estimate using global emission factor datasets (Guenther et al., 2006).The Red line shows CiTTyCAT-MEGAN output when the measured basal emission factor is used, and the green line is the result of using the measured basal emission rate with the MEGAN temperature and light dependent algorithm only (i.e.excluding γ age , γ SM and ζ from Eq. 7).

Fig. 4 .
Fig. 4. Comparison of model NO x and O 3 output with data for the boundary layer above rainforest and oil palm.The green and red lines show the output with rainforest and oil palm VOC emissions respectively under differing NO emissions (shaded area ±50 % VOC emission).The black and grey error bars (rainforest and oil palm respectively) show the interquartile range of the measured NO x and O 3 .

Fig. 5 .
Fig. 5. Conceptual sketch of the two box model.

Fig. 6 .
Fig. 6.Comparison of single box model (red), with output from the two box model.The solid blue line is the lower box and the dashed blue line is the residual layer box (only valid outside the times of the well mixed daytime boundary layer).

Fig. 7 .
Fig. 7. Schematic to illustrate how the background profile is defined.The coloured regions denote different model levels, with a height z.The blue lines represent individual trajectories within the ensemble (black dots are the start of the trajectory).Trajectories within the same level are averaged together at each mixing timestep (t = 1, t = 2, etc.) to form the background mixing ratios in that layer.For instance, at t = 1, the mixing ratios in one trajectory define the background in the lowermost model level, however, for the second level mixing ratios from four trajectories are averaged to find the background.Note that the trajectories do not have to start at precisely the same timepoint, i.e. they can be released at time intervals along a flight track.

Fig. 8 .
Fig. 8. Example of an ensemble of trajectories and the shadow trajectories (horizontal lines).The shadow trajectories provide suitable background mixing ratios outside the height field encompassed by the ensemble.Shadow trajectories evolve photochemically, but do not move vertically.

Fig. 9 .
Fig. 9.Example of the evolution of the background profile of O 3 over a period of 7 days for a simulation of anthropogenic pollution transport across the N. Atlantic (case 3 fromMethven et al., 2006).Profiles every hour are shown, and are coloured by the time through the simulation.The initial background profile is blue, the final profile is red, with each line representing the background O 3 profile at 1 h intervals.In this case there is a general trend for O 3 loss from the background profile over the course of the simulation, driven by deposition to the surface and photochemical loss.The vertical communication of the vertical effects by the mixing scheme is seen in the smooth increase of O 3 mixing ratios with height.

Fig. 10 .
Fig.10.An example of a single trajectory photochemistry only simulation (upper), a photochemistry and mixing (middle), and an ensemble simulation (lower), for a case of transport of emissions from boreal forest fires from North America across the N. Atlantic (ITCT-Lagrangian 2004 case 2 fromMethven et al., 2006).The modelled O 3 and CO evolution is shown for the ensemble of trajectories (coloured by time of initialisation), with the reference trajectory shown by the orange arrow on the y-axis.Aircraft observations downstream are shown by the black crosses on the lower panels, and by red bars (mean and standard deviation) on the upper panels.Single trajectory simulations are after(Real et al., 2007).The variability in the air mass captured by the modelled ensemble is clearly illustrated, allowing a fuller evaluation of the simulation as compared to the measurements.

Fig. 11 .
Fig. 11.Ensemble of trajectories for simulation in Fig. 10.Trajectories are coloured by time of initialisation.

Table 1 .
Available modes in the CiTTyCAT model.

Table A1 .
Example run-times for the model modes (4 day run at 5 minute timestep).Chemistry, wet and dry deposition and surface emissions were enabled.

Table A2 .
Example CPU time usage percentages for the major model components (single trajectory run).s timesteps for the various modes.All chemistry is included in the integrations and MEGAN is run on-line.Table A2 breaks down the typical CPU time usage percentage for the various model components.

Table C1 .
Dry deposition velocities for all chemical species that experience dry deposition, for summertime day and night.SpeciesDeposition velocity (cm s −1 ) by CiTTyCAT land cover type Ref.

Table C2 .
Dry deposition velocities for all chemical species that experience dry deposition, for wintertime day and night.