Tropospheric CO vertical profiles deduced from total columns using data assimilation: methodology and validation

This paper presents a validation of a method to derive the vertical profile of carbon monoxide (CO) from its total column using data assimilation. We choose version 3 of MOPITT CO total columns to validate the proposed method. MOPITT products have the advantage of providing both the vertical profiles and the total columns of CO. Furthermore, this version has been extensively validated by comparison with many independent data sets, and has been used in many scientific studies. The first step of the paper consists in the specification of the observation errors based on the chi-square (χ²) test. The observations have been binned according to three types: over land during daytime, over land during night-time, and over sea. Their respective errors using the χ² metric have been found to be 8, 11 and 7 %. In the second step, the CO total columns, with their specified errors, are used within the assimilation system to estimate the vertical profiles. These are compared to the retrieved profiles of MOPITT V3 at global and regional scales. Generally, the two data sets show similar patterns and good agreement at both scales. Nevertheless, total column analyses slightly overestimate CO concentrations compared to MOPITT observations. The mean bias between both data sets is +15 and +12 % at 700 and 250 hPa, respectively. In the third step, the assimilation of total columns has been compared to the assimilation of MOPITT vertical profiles. The differences between both analyses are very small. In terms of longitude–latitude maps, the mean bias between the two data sets is +6 and +8 % at the pressure levels 700 and 200 hPa, respectively. In terms of zonal means, the CO distribution is similar for both analyses, with a mean bias which does not exceed 12 %. Finally, the two analyses have been validated using independent observations from the aircraft-based MOZAIC program in terms of vertical profiles over eight airports. Over most airports, both analyses agree well with aircraft profiles. For more than 50 % of recorded measurements, the difference between the analyses and MOZAIC does not exceed 5 ppbv (parts per billion by volume).


Introduction
Carbon monoxide (CO) is an important atmospheric species as it influences tropospheric chemistry and climate (Crutzen and Andreae, 1990). The main sources of CO emissions are biomass burning, fossil fuel and the oxidation of methane and non-methane hydrocarbons (Granier et al., 2000). For this reason and because of larger anthropogenic emissions in the Northern Hemisphere (NH) than in the Southern Hemisphere (SH), tropospheric CO background values are much higher in the NH than in the SH. The major global sink of CO in the troposphere is the chemical reaction with the hydroxyl radical (OH). Therefore, CO concentrations are higher in winter than in summer owing to the seasonal variations of OH abundances. Since OH is the only significant tropospheric sink for CO and many other atmospheric trace gases emitted into the troposphere, CO has the potential to indirectly control the oxidation capacity of the troposphere. Therefore, an increase in CO emissions could reduce OH concentrations and, consequently, the oxidation capacity of the troposphere and its ability to remove pollutants (Mahieu et al., 1997).
Most of the CO in the troposphere is found in the lower troposphere or boundary layer. Because its lifetime of several weeks to a few months is relatively long but much shorter than the interhemispheric mixing time of several years, CO is not well mixed in the free troposphere. This makes CO a useful tracer of air pollution, and allows for studies of long-range transport of pollutants in the troposphere.
For more than 10 years, global observations of tropospheric CO have been performed from several satellite instruments, which provide many opportunities to study tropospheric CO on a global scale. The Infrared Atmospheric Sounding Interferometer (IASI) onboard the MetOP-A (Meteorological Operational Programme) satellite launched in October 2006 provides augmented horizontal resolution of the CO total column. Monitoring of this species will continue with the MetOP-B satellite, which carries a suite of sophisticated instruments. These two satellites are polar orbiters and provide global observations. The data they collect on the atmosphere and the environment are complementary and allow for monitoring of the atmospheric composition and its evolution in near-real time.
Most tropospheric sensors operate with a nadir-viewing geometry and typically provide vertically integrated information, implying limited vertical resolution. This could present a limitation for some process studies, such as long-range transport of pollutants, because of missing information on vertical levels. Furthermore, most chemistry and transport models (CTMs) are subject to large uncertainties concerning the distribution of CO concentrations. This is because CO sources are not well known since their estimates are generally derived from inventory-based, bottom-up techniques, which are as a whole highly uncertain (e.g. Jones et al., 2003). Another issue concerns the CO emissions from biomass burning, which have unexpected sources in terms of time, location and magnitude and thus are subject to large uncertainties (Bian et al., 2007).
Chemical data assimilation consists of combining in an optimal way observations provided by instruments with a priori knowledge about a physical system such as model output. It allows for constraints to be put on models using observations, and thus can be used to overcome model deficiencies. It also provides a four-dimensional (time and space) description of the dynamical and chemical state of the atmosphere. Typically, data assimilation systems produce observation-minus-forecast (OMF) statistics that are used for monitoring biases between the observations and the models (e.g. El Amraoui et al., 2010). The specific objective of chemical data assimilation is to produce a self-consistent picture of the atmosphere taking into account both the available chemical observations and our theoretical understanding of the atmospheric system.
Assimilation of CO satellite observations in the troposphere has been performed using different sensors. These include MAPS (Lamarque et al., 1999), IMG (Clerbaux et al., 2001), MOPITT (e.g. Pradier et al., 2006; Claeyman et al., 2010; El Amraoui et al., 2010) and SCIAMACHY (e.g. Tangborn et al., 2009). Most of the CO analyses in these studies have revealed improvements of the CO distribution in comparison to the free model run. However, no assessment of the impact of the assimilation of the total column on the CO vertical profile has been done hitherto.
The main goal of this study is to assess the benefit of the CO total column assimilation on the CO vertical distribution at global and regional scales. The philosophy of this study is the following: the CO total column is generally deduced from the profiles using a simple integration over the vertical levels. The question we pose is, can we derive the CO vertical profile from its total column using the adjoint of the integration operator within an assimilation system?
We choose version 3 (V3) of MOPITT CO measurements to validate the proposed method. The motivation for this choice is presented in Sect. 2.1. The proposed method has the advantage of allowing for fast computation of the vertical profiles and the analyses of CO.
Note that this method will be particularly useful for small centres with limited resources. Nevertheless, some operational meteorological centres will have the necessary resources in terms of storage and computing to assimilate the vertical profiles with all their corresponding characteristics (kernels and covariances). The method presented will be particularly useful in the future, when there will be many missions providing large volumes of data for which level 2 retrievals with their corresponding characteristics (covariance matrices and averaging kernels) could be very expensive in terms of computer resources (e.g. IASI onboard MetOP-A and MetOP-B or future geostationary missions). Furthermore, the assimilation of such data in CTMs taking into account all these characteristics will likely be very costly in terms of computation time and memory. This will be a significant shortcoming regarding the operational use of these data. Thus, the validation of the method proposed in this paper could provide an alternative way to produce CO fields at the global scale with relatively modest resources.
First, we describe the approach, which consists of deducing the vertical distribution of CO in the troposphere from the assimilation of total MOPITT column measurements (hereinafter denoted TOTCOL_ANALYSES). Second, we validate the CO vertical profiles deduced from TOTCOL_ANALYSES with the MOPITT-retrieved vertical profiles. In the third step, we compare TOTCOL_ANALYSES against the assimilation of MOPITT CO vertical profiles taking into account the corresponding error covariance matrices and averaging kernels (hereinafter denoted PROFILE_ANALYSES). Finally, both analyses, from total column and from profiles, have been validated using independent in situ MOZAIC observations.
The paper outline is as follows: Sect. 2 presents the MOPITT CO measurements as well as the corresponding total columns, the data assimilation system used in this study, and the data used for the evaluation of the vertical profiles deduced from the assimilation of CO MOPITT total column: the official vertical profiles of MOPITT V3 measurements. The method used for the assimilation of MOPITT CO total columns, the specification of the errors and the a posteriori diagnostics are presented in Sect. 3. The comparisons of CO vertical profiles deduced from TOTCOL_ANALYSES to those of MOPITT observations are presented in Sect. 4. Section 5 presents a validation of the CO vertical profiles calculated from TOTCOL_ANALYSES against PROFILE_ANALYSES. Conclusions are presented in Sect. 6.

Terra-MOPITT carbon monoxide observations
The MOPITT instrument (Drummond and Mand, 1996) onboard the Terra platform has been monitoring global tropospheric CO from March 2000 to date. The pixel size is 22 km × 22 km and the vertical profiles for MOPITT version 3 are retrieved on seven pressure levels (surface, 850, 700, 500, 350, 250 and 150 hPa) with the maximum likelihood method (Rodgers, 2000). The retrieved profiles are characterized by their error covariance matrices and their averaging kernels, providing information on the vertical sensitivity of the measurements. In particular, the degrees of freedom for signal (DFS) parameter, the trace of the averaging kernel matrix, indicates the number of independent pieces of information in the measurements. It depends, via the surface temperature, on latitude and time of day. The MOPITT V3 CO level 2 product consists of retrieved values and estimated uncertainties of the CO total column and CO profile (see http://www.acd.ucar.edu/mopitt/retrievals.shtml). The retrieved CO total column is obtained as a byproduct of the retrieved profile by integrating the retrieved profile from the surface to the top of the atmosphere (see www.acd.ucar.edu/mopitt/avg_krnls_app.pdf).
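As a small illustration of the DFS definition given above (the trace of the averaging kernel matrix), the following sketch computes it for a seven-level averaging kernel. The function name and the matrix values are hypothetical and chosen only for illustration; they are not MOPITT data.

```python
import numpy as np


def degrees_of_freedom_for_signal(averaging_kernel):
    """Return the DFS, i.e. the trace of a square averaging kernel matrix."""
    a = np.asarray(averaging_kernel, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        raise ValueError("averaging kernel must be a square matrix")
    return float(np.trace(a))


# Hypothetical 7-level averaging kernel (illustrative values, not a real retrieval):
# 0.02 everywhere, plus 0.18 added on the diagonal, so each diagonal entry is 0.20.
n_levels = 7
A_example = np.full((n_levels, n_levels), 0.02) + np.diag(np.full(n_levels, 0.18))
dfs_example = degrees_of_freedom_for_signal(A_example)
```

With these made-up values the trace is 7 × 0.20 = 1.4, of the same order as the typical DFS quoted for MOPITT V3 profiles.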
The main motivation for using MOPITT V3 is that these data have been extensively validated against many independent data sets (e.g. Emmons et al., 2004, 2007, 2009; Deeter et al., 2007; Yurganov et al., 2008). The temporal and spatial behaviour of MOPITT V3 data is well understood (e.g. Emmons et al., 2009).
The DFS parameter for MOPITT V3 is low for vertical profiles as well as for the total columns. Figure 1 shows an example of the spatial variation of the DFS of MOPITT V3 profiles in terms of longitude-latitude averaged over the month of August as well as its frequency distribution over the same period. The typical value of DFS for MOPITT V3 profiles is around 1.5 and is located primarily on the sea over the tropics.
The method presented in this paper considers data with low values of DFS. It considers only the adjoint of the integration operator. For other data with high values of DFS, we have to evaluate the validity conditions of the adjoint operator with respect to different values of the DFS. This is the subject of ongoing work.

MOCAGE CTM and data assimilation system
The assimilation system used in this study is MOCAGE-VALENTINA (e.g. Emili et al., 2014), which is an extension of the MOCAGE-PALM system (e.g. El Amraoui et al., 2008a, b) (Courtier et al., 1991).
The MOCAGE horizontal resolution used for this study is 2° both in latitude and longitude and the model uses a semi-Lagrangian transport scheme. It includes 47 hybrid ($\sigma$, $P$) levels from the surface up to 5 hPa, where $\sigma = P/P_s$; $P$ and $P_s$ are the pressure and the surface pressure, respectively. MOCAGE has a vertical resolution of about 800 m in the vicinity of the tropopause and in the lower stratosphere, whereas in the boundary layer MOCAGE has seven levels with a vertical resolution between 40 and 400 m. In the free troposphere, MOCAGE has a vertical resolution which varies from 400 to 800 m. The technique implemented within VALENTINA and used for the assimilation of MOPITT CO observations is the 3D-FGAT (first guess at appropriate time) method. This method is a compromise between the 3D-Var and 4D-Var techniques (Fisher and Andersson, 2001). It compares the observation and background fields at the correct time and assumes that the increment to be added to the background state is constant over the entire assimilation window. The choice of this assimilation technique limits the size of the assimilation window, since it has to be short enough compared to chemistry and transport timescales. This technique has already produced good-quality results compared to independent data, especially for O$_3$ and CO (e.g. Semane et al., 2007; El Amraoui et al., 2010; Claeyman et al., 2011; Rabier et al., 2010; Bencherif et al., 2011; Lahoz et al., 2012).
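The "first guess at appropriate time" idea can be sketched as follows: the background is propagated through the window and each observation is compared with the background valid at its own time step. This is only a toy illustration under stated assumptions; the function name, the linear decay model and the two-component state are hypothetical and unrelated to MOCAGE-VALENTINA.

```python
import numpy as np


def fgat_innovations(x_b0, step_model, obs_operator, observations):
    """Innovations d_i = y_i - H x_b(t_i), evaluated at each observation time.

    x_b0         : background state at the start of the window (1-D array)
    step_model   : function advancing the state by one time step (hypothetical model)
    obs_operator : matrix H mapping the state to observation space (assumed linear)
    observations : dict {time_step_index: observation value or vector}
    """
    if not observations:
        return {}
    x = np.asarray(x_b0, dtype=float).copy()
    H = np.atleast_2d(np.asarray(obs_operator, dtype=float))
    innovations = {}
    for step in range(max(observations) + 1):
        if step in observations:
            # Compare the observation with the background valid at this time step.
            innovations[step] = np.atleast_1d(observations[step]) - H @ x
        x = step_model(x)  # propagate the background to the next time step
    return innovations


# Toy example: two-component state decaying by 10 % per step, observed as a sum.
toy_innovations = fgat_innovations(
    x_b0=[1.0, 1.0],
    step_model=lambda x: 0.9 * x,
    obs_operator=[1.0, 1.0],
    observations={0: 2.5, 2: 2.0},
)
```

In a 3D-FGAT setting these innovations would then feed a single increment that is held constant over the window, which is the simplification relative to 4D-Var described above.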
3 Assimilation of MOPITT CO total column: methodology and error specification

Assimilation methodology
For variational systems, the assimilation method is based on the minimization of the cost function, $J$. These systems exist in a variety of formulations. We use the notation of Ide et al. (1997):

$$J(x) = \frac{1}{2}\left[x(t_0) - x^b(t_0)\right]^T \mathbf{B}^{-1}\left[x(t_0) - x^b(t_0)\right] + \frac{1}{2}\sum_{i=0}^{N}\left[y^o(t_i) - H_i(x(t_i))\right]^T \mathbf{R}^{-1}\left[y^o(t_i) - H_i(x(t_i))\right]. \qquad (1)$$

The first term on the right-hand side of Eq. (1) is the misfit to the background state, and the second term represents the misfit to the observations. $x^b(t_0)$ and $y^o(t_i)$ are the background state at the initial time and the observation at time $t_i$, respectively. $\mathbf{B}$ and $\mathbf{R}$ are the background and the observation error covariance matrices, respectively. $x(t_i)$ is the model state at the observation time, $t_i$, and represents the propagation of the initial state, $x(t_0)$, by the model operator, $M$:

$$x(t_i) = M(t_i, t_0)\,x(t_0). \qquad (2)$$

$H_i$ is the observation operator, generally non-linear, which maps the model state $x(t_i)$ to the measurement space where $y^o(t_i)$ is located. The subscript $i$ refers to time and $N$ is the number of time steps in the assimilation window $[t_0, t_N]$. For the incremental variational 3D-FGAT method, the cost function, $J$, in Eq. (1) can be expressed as

$$J(\delta x) = \frac{1}{2}\,\delta x^T \mathbf{B}^{-1}\,\delta x + \frac{1}{2}\sum_{i=0}^{N}\left[d_i - \mathbf{H}_i\,\delta x\right]^T \mathbf{R}^{-1}\left[d_i - \mathbf{H}_i\,\delta x\right], \qquad (3)$$

where $\delta x = x(t_0) - x^b(t_0)$ is the increment vector, which represents the difference between the assimilation state $x$ and the background state $x^b$ at time $t_0$. The first term on the right-hand side of Eq. (3) is the background cost function ($J_b$), and the second term represents the observation cost function ($J_o$). $d_i = y^o(t_i) - H_i(x^b(t_i))$ is the departure, at time $t_i$, between the observation vector $y^o(t_i)$ and its model equivalent in the observation space, $H_i(x^b(t_i))$. The $\mathbf{H}_i$ operator is the tangent linear of the $H_i$ operator.
For the assimilation of MOPITT total columns, the observation vector $y^o$ contains the 2-D field of CO total columns, while the model state $x$, and consequently the background state $x^b$, is the 3-D field of CO vertical profiles updated by the model during the forecast step. The observation operator $H$, which maps the model state to the observation space, is then a vertical integration over all model levels taking into account the vertical profile of both the pressure and the density of air.
[Figure caption (fragment): (c and d) at 700 hPa (left panels) and 250 hPa (right panels). The corresponding relative differences between both data sets (TOTCOL_ANALYSES − observations) are indicated in the bottom panels for both pressure levels ((e) for 700 hPa and (f) for 250 hPa). Blue and red colours indicate negative and positive differences, respectively. Note that this figure corresponds to an average over August 2008 for all observations carried out over land and sea during daytime and night-time.]
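A minimal sketch of such a vertical-integration operator and of its adjoint is given below. It assumes hydrostatic balance with constant gravity and dry air, so that each layer contributes in proportion to its pressure thickness; the constants, layer interfaces and function names are illustrative assumptions and are not taken from the MOCAGE-VALENTINA code.

```python
import numpy as np

AVOGADRO = 6.02214076e23    # molecules per mole
GRAVITY = 9.80665           # m s^-2, assumed constant with height
MOLAR_MASS_AIR = 0.0289644  # kg per mole of dry air (assumption: dry air)

# N_A / (g * M_air) converted to molecules cm^-2 per (ppbv * hPa):
# x100 for Pa -> hPa, x1e-9 for ppbv, x1e-4 for m^-2 -> cm^-2.
COLUMN_FACTOR = AVOGADRO / (GRAVITY * MOLAR_MASS_AIR) * 100.0 * 1e-9 * 1e-4


def column_weights(p_interfaces_hpa):
    """Row of the linear operator H: one weight per layer (surface first)."""
    p = np.asarray(p_interfaces_hpa, dtype=float)
    dp = p[:-1] - p[1:]
    if np.any(dp <= 0):
        raise ValueError("interface pressures must decrease from the surface upwards")
    return COLUMN_FACTOR * dp


def total_column(vmr_ppbv, p_interfaces_hpa):
    """Forward operator: layer-mean mixing ratios (ppbv) -> column (molecules cm^-2)."""
    return float(column_weights(p_interfaces_hpa) @ np.asarray(vmr_ppbv, dtype=float))


def adjoint_column(column_perturbation, p_interfaces_hpa):
    """Adjoint H^T: maps a scalar column-space perturbation back to profile space."""
    return column_weights(p_interfaces_hpa) * float(column_perturbation)


# Hypothetical layer interfaces (hPa) and a uniform 100 ppbv profile.
p_edges = [1000.0, 850.0, 700.0, 500.0, 300.0, 100.0, 0.0]
uniform_profile = np.full(6, 100.0)
example_column = total_column(uniform_profile, p_edges)
```

For the uniform 100 ppbv profile this gives a column of roughly 2.1 × 10^18 molecules cm^-2. The adjoint simply spreads a column-space quantity over the layers with the same weights; the vertical structure of the final increment additionally depends on the background error correlations.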
Note that, in this study, although we assimilate the CO total column, the control variable is the 3-D CO field. The assimilation process seeks the optimal 3-D increment $\delta x$ of the CO vertical profiles and thus the observation component of the cost function acts as just one constraint. Another constraint, regularizing the solution by keeping it in the proximity of the background information, is the background cost function ($J_b$) in which we use the background error covariance matrix $\mathbf{B}$. The assimilation increment is therefore a 3-D field and its vertical structure depends on the $\mathbf{H}$ operator through its adjoint, $\mathbf{H}^T$, which maps a variation in the 2-D total column space back to the model 3-D space, and on the vertical correlation coefficients in $\mathbf{B}$.
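More explicitly, the variational 3D-FGAT method consists of minimizing the cost function of Eq. (3). Since the observation operator is linear, the analysis state can be expressed as

$$x^a = x^b + \delta x^a. \qquad (4)$$

The update of $x^a$ after the minimization of the cost function is done by using

$$\delta x^a = \mathbf{B}\mathbf{H}^T\left(\mathbf{H}\mathbf{B}\mathbf{H}^T + \mathbf{R}\right)^{-1} d, \qquad (5)$$

where $d$ is the innovation vector. To obtain the correction $\delta x^a$ (analysis increment) to be added to $x^b$ to give $x^a$, the innovation is first normalized by $\left(\mathbf{H}\mathbf{B}\mathbf{H}^T + \mathbf{R}\right)^{-1}$, then introduced into the model space (here the CO vertical profile) via $\mathbf{H}^T$, and finally multiplied by $\mathbf{B}$.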
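The update described above can be illustrated for a single column with the standard linear analysis formula, $\delta x^a = \mathbf{B}\mathbf{H}^T(\mathbf{H}\mathbf{B}\mathbf{H}^T + \mathbf{R})^{-1} d$, which is assumed here to be the form intended. All numbers below (four levels, weights, error statistics) are hypothetical and serve only to show how one total-column innovation is distributed over the profile.

```python
import numpy as np


def analysis_increment(B, H, R, d):
    """Increment B H^T (H B H^T + R)^{-1} d for a linear observation operator H."""
    B = np.atleast_2d(np.asarray(B, dtype=float))
    H = np.atleast_2d(np.asarray(H, dtype=float))
    R = np.atleast_2d(np.asarray(R, dtype=float))
    d = np.atleast_1d(np.asarray(d, dtype=float))
    innovation_cov = H @ B @ H.T + R           # covariance of d in observation space
    normalized = np.linalg.solve(innovation_cov, d)
    return B @ H.T @ normalized                # back to model space, then weighted by B


# Hypothetical 4-level column (values are illustrative, in arbitrary units).
x_b = np.array([120.0, 100.0, 90.0, 80.0])     # background profile
H_row = np.array([0.3, 0.3, 0.2, 0.2])         # column weights (linear operator)
sigma_b = np.array([20.0, 15.0, 10.0, 8.0])    # background standard deviations
corr = 0.5 ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))
B_toy = np.outer(sigma_b, sigma_b) * corr      # background error covariance
R_toy = np.array([[4.0]])                      # observation error variance
y_obs = 105.0                                  # observed "total column"

d_toy = y_obs - H_row @ x_b                    # innovation
dx_a = analysis_increment(B_toy, H_row, R_toy, d_toy)
x_a = x_b + dx_a                               # analysis profile
```

Levels with larger background variance and larger column weight receive the larger share of the correction, which illustrates how the vertical structure of the increment is set by $\mathbf{H}^T$ and $\mathbf{B}$ rather than by the observation itself.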

Background error covariance matrix
The background error covariance matrix is a key component in data assimilation. It contributes first to filter and propagate spatially the observed information, and second to define the correlations between the control variables of the models during the assimilation process. For the MOCAGE-VALENTINA assimilation system, the background covariance matrix $\mathbf{B}$ is split into a diagonal matrix $\boldsymbol{\Sigma}$ of the forecast error variances of the assimilated species in each grid point of the model and a positive definite symmetric correlation matrix $\mathbf{C}$:

$$\mathbf{B} = \boldsymbol{\Sigma}^{1/2}\,\mathbf{C}\,\boldsymbol{\Sigma}^{1/2}. \qquad (6)$$

The correlation matrix $\mathbf{C}$ contains both horizontal and vertical operators. The horizontal correlation is modelled using a two-dimensional diffusion equation (Weaver and Courtier, 2001) with a homogeneous length scale both in latitude and longitude. The vertical correlation is modelled using a Gaussian function in terms of the logarithm of the pressure. Thus the vertical correlation ($C^v_{i,j}$) between two pressure levels ($p_i$ and $p_j$) is a Gaussian function of $\ln(p_i/p_j)$ controlled by a dimensionless parameter, $k$ (Eq. 7). The parameter $k$ is determined from many validation experiments of the MOPITT V3 vertical profiles assimilation in comparison to other independent data such as AIRS and MOZAIC (e.g. El Amraoui et al., 2010). It was found that $k = 100$ gives better analyses compared to the independent data and consequently better characterizes the vertical correlation of the $\mathbf{B}$ matrix in the troposphere.
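The variance/correlation split described above can be sketched as follows for a single column. The Gaussian form used here, with a hypothetical length scale expressed in units of ln(pressure), is an assumption made for illustration; it is not the exact expression with the parameter $k$ used in the assimilation system, and the variances are made up.

```python
import numpy as np


def vertical_correlation(p_hpa, length_scale_lnp):
    """Gaussian correlation in ln(pressure); the length scale is a hypothetical parameter."""
    lnp = np.log(np.asarray(p_hpa, dtype=float))
    diff = lnp[:, None] - lnp[None, :]          # ln(p_i / p_j) for every pair of levels
    return np.exp(-0.5 * (diff / length_scale_lnp) ** 2)


def background_covariance(variances, correlation):
    """Combine a vector of variances with a correlation matrix into a covariance matrix."""
    std = np.sqrt(np.asarray(variances, dtype=float))
    return std[:, None] * np.asarray(correlation, dtype=float) * std[None, :]


# Seven pressure levels (hPa) and made-up forecast error variances.
levels_hpa = [1000.0, 850.0, 700.0, 500.0, 350.0, 250.0, 150.0]
variances = np.array([400.0, 300.0, 225.0, 150.0, 100.0, 80.0, 60.0])
C_v = vertical_correlation(levels_hpa, length_scale_lnp=0.35)
B_column = background_covariance(variances, C_v)
```

By construction the diagonal of `B_column` reproduces the variances, and neighbouring levels are more strongly correlated than distant ones, which is what lets a column-integrated observation constrain a vertically coherent increment.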
The horizontal correlation ($C^h_{k,l}$) between two points ($k$ and $l$) separated by a distance ($\delta_{k,l}$) depends on $L_x$ and $L_y$, the longitude and latitude length scales in kilometres, respectively.
These are obtained from $R_e$, the Earth's radius (6371.22 km), and from $\alpha_x$ and $\alpha_y$, the longitude and latitude length scales in degrees, respectively.
In this study, both $\alpha_x$ and $\alpha_y$ are constant and fixed to 2°, which corresponds to a length scale of about 220 km.
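The correspondence between 2° and about 220 km can be checked with a simple arc-length conversion. The sketch below assumes that the length scale in kilometres is the great-circle arc subtended by the angle, with no latitude-dependent (cosine) correction; this is an assumption about the conversion, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.22  # Earth's radius quoted in the text


def length_scale_km(alpha_deg):
    """Arc length (km) subtended by an angle alpha_deg on a sphere of Earth's radius."""
    return EARTH_RADIUS_KM * math.radians(alpha_deg)


scale_2deg_km = length_scale_km(2.0)
```

Under this assumption a 2° length scale is about 222 km, consistent with the value of about 220 km quoted above.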

Error specification
The first step of the proposed method consists in specifying the observation error covariance matrices. The assimilation process needs, at least, specification of the error covariance matrices ($\mathbf{R}$ and $\mathbf{B}$ matrices in Eqs. 1 and 3).
To validate the method, we assume in this study that the CO total column from MOPITT has neither error covariance matrix nor averaging kernel information. We specify the corresponding errors of the CO total columns based on the chi-square (χ²) test (e.g. El Amraoui et al., 2010): observation errors of the MOPITT CO total columns are estimated using this test. Different values of the observation error have been selected and several assimilation tests with these values have been conducted over a one-month study, August 2008. The appropriate value of the observation error is that for which the χ² test is the closest to 1. A value of χ² close to 1 indicates consistency between both error-covariance matrices ($\mathbf{R}$ and $\mathbf{B}$), whereas a value of χ² lower (greater) than 1 implies an overestimation (underestimation) of the observation and/or background errors (e.g. Lahoz et al., 2007a).
Since the sensitivity of MOPITT measurements in the thermal infrared (TIR) wavelength depends, via the surface temperature and thermal contrast, on daytime and night-time periods, specification of the measurement error is made by binning the observations according to day, night, land and sea. The specification of the errors will be done for three types of measurements: over land during daytime (LAND_DAY), over land during night-time (LAND_NIGHT) and over sea during daytime and night-time (SEA). For each type of measurement we assume that all observations have the same percentage error and that errors are uncorrelated.
Figure 2 shows an example of the time evolution of the χ² test over the period of study, August 2008 (left-hand side), and the corresponding Gaussian fit of the normalized χ² test (right-hand side) for different observation error values (diagonal of $\mathbf{R}$) corresponding to the measurement type SEA. We note that the χ² test is very sensitive to the observation error value. For low values of $\mathbf{R}$, the χ² test gives high values, and vice versa. The optimal observation error value (diagonal of $\mathbf{R}$) for which χ² is the closest to 1 for SEA measurements is 7 %. Table 1 summarizes the χ² results for all types of measurements. The optimal values of the observation error (diagonal of $\mathbf{R}$) are indicated in boldface. They are 8, 11 and 7 % for LAND_DAY, LAND_NIGHT and SEA measurements, respectively. These values will be used as the observation error values for each corresponding type of measurements in the assimilation of MOPITT CO total column measurements.
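The selection procedure can be illustrated with a deliberately simplified sketch. It assumes scalar, mutually independent relative innovations, so that the χ² diagnostic reduces to the mean squared innovation divided by the sum of the background and observation error variances; the background error (10 %), the candidate list and the synthetic innovations are hypothetical, and the real test is computed within the assimilation system rather than offline like this.

```python
import numpy as np


def chi2_for_error(rel_innovations, background_rel_error, obs_rel_error):
    """Simplified chi-square: mean squared innovation over the expected total variance."""
    total_var = background_rel_error ** 2 + obs_rel_error ** 2
    return float(np.mean(np.asarray(rel_innovations, dtype=float) ** 2) / total_var)


def closest_to_one(rel_innovations, background_rel_error, candidates):
    """Return the candidate observation error whose chi-square is closest to 1."""
    scores = {c: chi2_for_error(rel_innovations, background_rel_error, c) for c in candidates}
    best = min(scores, key=lambda c: abs(scores[c] - 1.0))
    return best, scores


# Synthetic innovations: "true" errors of 10 % (background) and 7 % (observation).
rng = np.random.default_rng(0)
z = rng.standard_normal(5000)
z /= np.sqrt(np.mean(z ** 2))          # rescale so the sample mean square is exactly 1
synthetic_innovations = z * np.sqrt(0.10 ** 2 + 0.07 ** 2)

candidate_errors = [0.05, 0.06, 0.07, 0.08, 0.09]
best_error, chi2_scores = closest_to_one(synthetic_innovations, 0.10, candidate_errors)
```

As in the behaviour described above, candidate errors that are too small give χ² above 1 and candidates that are too large give χ² below 1, so the scan selects the value that makes the assumed statistics consistent with the innovations.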

A posteriori diagnostics
Each of the three types of MOPITT measurements (LAND_DAY, LAND_NIGHT and SEA) has been assimilated, in terms of total column, using the corresponding observation error selected according to the χ² test discussed in Sect. 3.3. Figure 3 shows the OMF and the OMA (observation minus analysis) diagnostics for TOTCOL_ANALYSES for the whole assimilation period (August 2008). Figure 3, left, shows the OMF distributions normalized by the observation errors for the three types of measurements. The OMF histograms are fitted by a Gaussian function. The comparison between the OMF histograms for all types of measurements and the corresponding fitted Gaussian function is very good. This agreement supports the assumption that the specified observations and their corresponding forecasts have Gaussian errors. We note that the mean of all the normalized OMF values is positive but close to zero (lying between 0.4 and 0.7), which suggests that the bias between the model and the observations is very small for all the three types of measurements.
Figure 3, right, shows the OMA and OMF histograms for all MOPITT CO total columns during the whole assimilation period. For the three types of measurements, the OMA histogram is narrower than that for OMF and the bias is reduced. Furthermore, the standard deviation of OMA is smaller than that of OMF: $\sigma_{\mathrm{OMA}}$ = 4.9, 4.8 and 4.0 DU (Dobson units) for LAND_DAY, LAND_NIGHT and SEA measurements, respectively. The corresponding values for $\sigma_{\mathrm{OMF}}$ are 14.1, 13.7 and 7.1 DU, respectively. This indicates that, as expected, the analyses for the different types of measurements are closer to the observations than the forecasts.
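The OMF and OMA statistics discussed above can be computed with a few lines of code. The sketch below uses made-up numbers and a hypothetical function name; it returns the mean and the (population) standard deviation of the departures, optionally normalized by the observation errors.

```python
import numpy as np


def departure_statistics(observations, model_equivalents, obs_errors=None):
    """Mean and standard deviation of observation-minus-model departures."""
    dep = np.asarray(observations, dtype=float) - np.asarray(model_equivalents, dtype=float)
    if obs_errors is not None:
        dep = dep / np.asarray(obs_errors, dtype=float)  # normalize by observation error
    return float(dep.mean()), float(dep.std())


# Made-up total columns (arbitrary units) with their forecast and analysis equivalents.
obs = np.array([100.0, 110.0, 90.0, 105.0])
forecast = np.array([90.0, 100.0, 95.0, 95.0])
analysis = np.array([98.0, 108.0, 91.0, 103.0])

omf_mean, omf_std = departure_statistics(obs, forecast)
oma_mean, oma_std = departure_statistics(obs, analysis)
```

In this toy case both the mean and the spread of OMA are smaller than those of OMF, which is the signature of an analysis that has been drawn towards the observations.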

Comparison in terms of horizontal maps
In this section we validate the vertical profiles deduced from TOTCOL_ANALYSES in comparison to the MOPITT CO observations in terms of vertical profiles at global and regional scales. Figure 4 presents a comparison between both data sets in terms of longitude-latitude maps at 700 and 250 hPa for the three types of observations. Since the sensitivity of MOPITT measurements through the averaging kernels is not vertically uniform, TOTCOL_ANALYSES in terms of vertical profiles have been smoothed by the MOPITT averaging kernels to take into account the vertical resolution as well as the a priori information used in the retrieval process of MOPITT vertical profiles. This is performed through the transformation of the vertical profile issued from TOTCOL_ANALYSES ($x_{\mathrm{assim}}$) using the averaging kernels of the MOPITT CO vertical profiles ($\mathbf{A}$) and the a priori CO profile ($x_{\mathrm{apriori}}$) to create an analysed vertical profile ($x_{\mathrm{comp}}$) appropriate for a quantitative comparison to the MOPITT CO retrievals:

$$x_{\mathrm{comp}} = x_{\mathrm{apriori}} + \mathbf{A}\left(x_{\mathrm{assim}} - x_{\mathrm{apriori}}\right).$$

Note that the two quantities (MOPITT observations and $x_{\mathrm{comp}}$) have been averaged in boxes of 2° × 2° (corresponding to the grid mesh of the MOCAGE model) over the month of comparison, August 2008. Figure 4 shows that the general features of both data sets are consistent over the globe at 700 and 250 hPa and that the CO concentrations in the two fields have the same patterns, particularly over the emission regions over central Africa, southeastern Asia and northern South America. Generally, the fields of TOTCOL_ANALYSES slightly overestimate CO concentrations, especially at 700 hPa. The differences between both data sets range from −10 to 40 % at the 700 hPa pressure level, with a mean bias of 15 %. At the pressure level of 250 hPa, the differences range from −10 to 15 % with a mean bias of 12 %. The mean differences between both data sets are slightly higher at 700 hPa than at 250 hPa. This could be explained by the way the assimilation system redistributes the increment after the minimization of the cost function. The information in terms of CO content given to the system is more important in the lower levels than in the higher levels.
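The smoothing step can be sketched with the usual averaging-kernel transformation, $x_{\mathrm{comp}} = x_{\mathrm{apriori}} + \mathbf{A}(x_{\mathrm{assim}} - x_{\mathrm{apriori}})$, assumed here to be the form applied and to act directly on mixing ratios. The profiles and the kernel below are invented for illustration and the function name is hypothetical.

```python
import numpy as np


def smooth_with_averaging_kernel(x_assim, x_apriori, averaging_kernel):
    """Return x_apriori + A (x_assim - x_apriori) for profiles on the retrieval levels."""
    x_assim = np.asarray(x_assim, dtype=float)
    x_apriori = np.asarray(x_apriori, dtype=float)
    A = np.asarray(averaging_kernel, dtype=float)
    if A.shape != (x_assim.size, x_apriori.size) or x_assim.size != x_apriori.size:
        raise ValueError("averaging kernel and profiles have inconsistent sizes")
    return x_apriori + A @ (x_assim - x_apriori)


# Invented 3-level example (ppbv); rows of A sum to less than 1, as for a
# retrieval that is only partly sensitive to the true profile.
x_assim_toy = np.array([150.0, 110.0, 90.0])
x_apriori_toy = np.array([120.0, 100.0, 80.0])
A_toy = np.array([[0.3, 0.2, 0.1],
                  [0.2, 0.3, 0.2],
                  [0.1, 0.2, 0.3]])
x_comp_toy = smooth_with_averaging_kernel(x_assim_toy, x_apriori_toy, A_toy)
```

Two limiting cases are a useful sanity check: with an identity kernel the smoothed profile equals the analysed profile, and with a zero kernel it collapses to the a priori profile.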

Comparison in terms of zonal means
In this section, we compare the CO vertical profiles calculated from TOTCOL_ANALYSES, in terms of zonal means, to the MOPITT observations. Figure 5 shows, for the three types of measurements, CO monthly zonal means of MOPITT observations and their corresponding collocated TOTCOL_ANALYSES in terms of vertical profiles for August 2008 from the surface up to 150 hPa: the upper level of the MOPITT V3 observations. For the three types of measurements, the two zonal mean distributions (observation and TOTCOL_ANALYSES) show similar patterns. They both show the regions of CO emissions, particularly the biomass burning region in the latitude range between 0 and 20° S as well as the CO emissions in the NH.
For LAND_DAY and LAND_NIGHT measurement types, the CO vertical distribution is similar in both fields: over the emission regions, the maximum of CO extends up to 220 hPa over Africa and up to 150 hPa in the subtropical regions of the NH. These features of upper troposphere CO outflow reflect surface CO emissions lifted by convection. Nevertheless, MOPITT total column analyses slightly overestimate CO concentrations in the NH and in the tropical regions (up to +30 % for LAND_NIGHT and +20 % for LAND_DAY). For the SEA measurement type, both zonal means have generally the same distributions. In the NH, both fields show high CO concentrations corresponding to the anthropogenic emissions over North America, Europe and Asia. However, in the SH the maximum difference between the two zonal means for the SEA type ranges between −10 and +20 %. Generally, in the SH, both fields show very moderate CO concentrations, reflecting very low CO emissions over this region.

Comparison in terms of vertical profiles at regional scales
In this section, we compare the vertical profiles calculated from TOTCOL_ANALYSES at different regional scales to the MOPITT observations. Figure 6 shows the main regional domains for which the evaluation of MOPITT total column analyses is done by comparison to MOPITT observations. These domains are considered as the regions having the bulk of the CO sources which are significantly different. The choice of these domains is consistent with the results of Liu et al. (2006), who state that the most important sources of CO variability in the troposphere are synoptic disturbances which have spatial scales of hundreds to thousands of kilometres and timescales from hours to days. The CO distribution is highly variable over these spatio-temporal scales, reflecting a range of processes such as emissions, transport and chemical transformations. The tropospheric average of CO concentrations can fluctuate considerably from day to day depending on these processes, especially near the sources at synoptic and local scales (e.g. Liang et al., 2004). Liu et al. (2006) also state that large horizontal gradients in the distribution of CO at the synoptic scale have been observed in the MOPITT data. These fluctuations in CO can be as large as 50-100 % and occur over spatial scales of ∼100 km. These variations usually last one to several days, can span horizontal distances of hundreds of kilometres, and can appear over a range of pressure levels from 850 to 150 hPa. Consequently, it is important to have a statistical assessment of the variability of the two fields (MOPITT observations and TOTCOL_ANALYSES) over these regional areas. This will allow us to examine their respective behaviour with respect to different types of emissions at the different regional scales.
Figure 7 presents a comparison between MOPITT CO profiles and their co-located profiles derived from the CO total column analyses over the six regional domains. Both data sets are averaged over each domain for the seven MOPITT levels. Over the six domains, the different mean profiles match very well. Note also that the CO concentrations over sea are generally lower than those over land, especially at lower levels. The vertical profiles from the two data sets are very similar and agree within their standard deviations. Note also that the most significant variabilities of both data sets over all domains, especially domains 5, 6 and 3, are located at the lowermost levels (between the surface and 700 hPa). This reflects the variability of CO sources near the surface in Africa, South America and southeastern Asia.
The mean bias as well as the corresponding RMS (root mean square) between both data sets over the six domains of comparison for the three types of measurements are presented in Fig. 8. The absolute mean bias does not exceed 14 %, and is generally higher at lower levels (from the surface up to 700 hPa). For LAND_NIGHT and SEA types, the mean bias is generally positive for all domains at all pressure levels, reflecting an overestimation of the vertical profile deduced from TOTCOL_ANALYSES in comparison with MOPITT observations. The LAND_DAY type is generally characterized by a large positive bias with a corresponding RMS higher than that of other types, particularly at the lowermost levels. This reflects a higher variability of TOTCOL_ANALYSES for LAND_DAY compared to the other types of measurements.
For all types of measurements over all domains, both the bias and the RMS are large between the surface and 700 hPa (on average around 12 and 35 % for the bias and the RMS, respectively). This is in agreement with the results of Fig. 7 showing high variability in this altitude range. From 500 hPa up to 150 hPa, both quantities have generally small values (on average around 5 and 10 % for the bias and the RMS, respectively). The vertical profile of the correlation coefficient between both data sets over the six domains of comparison is presented in Fig. 9. The correlation coefficient ranges from ∼0.6 to 0.95. The correlation is generally good in the mid-troposphere (500 hPa). This means that the added value to the model from MOPITT total column is more pronounced in the mid-troposphere than in the lower levels. This is due to the redistribution of the CO column information by the assimilation system, which is more important in the lower levels than in the higher levels.
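The three statistics used in this comparison (mean bias, RMS and correlation coefficient per pressure level) can be sketched as below. The definitions are assumptions, since the text does not spell them out: the bias and RMS are taken here as the mean and root mean square of the relative difference (analysis minus observation, in per cent of the observation), and the correlation is the Pearson coefficient. The sample values are invented.

```python
import numpy as np


def level_statistics(analysis, observation):
    """Per-level relative bias (%), relative RMS (%) and Pearson correlation.

    Both inputs have shape (n_samples, n_levels).
    """
    a = np.asarray(analysis, dtype=float)
    o = np.asarray(observation, dtype=float)
    rel_diff = 100.0 * (a - o) / o                      # relative difference in per cent
    bias = rel_diff.mean(axis=0)
    rms = np.sqrt((rel_diff ** 2).mean(axis=0))
    corr = np.array([np.corrcoef(a[:, k], o[:, k])[0, 1] for k in range(a.shape[1])])
    return bias, rms, corr


# Invented example: three collocated profiles on two levels (ppbv),
# with an analysis that is uniformly 10 % higher than the observations.
obs_profiles = np.array([[100.0, 80.0],
                         [120.0, 90.0],
                         [140.0, 70.0]])
ana_profiles = 1.1 * obs_profiles
bias_pct, rms_pct, corr_coef = level_statistics(ana_profiles, obs_profiles)
```

For this uniformly scaled example the bias and RMS are both 10 % at every level and the correlation is 1, which shows that a high correlation can coexist with a systematic overestimation such as the one reported above.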

Comparison of CO deduced from total column assimilation and CO deduced from vertical profile assimilation
In this section, we compare the vertical profiles calculated from MOPITT CO TOTCOL_ANALYSES, for which the observation errors have been specified using the method presented in this paper (see Sect. 3.3), with the vertical profiles obtained from PROFILE_ANALYSES. The objective is to evaluate the differences between both analyses.

Comparison in terms of horizontal maps
In this section, we compare the vertical profiles derived from both analyses in terms of horizontal maps at different pressure levels.
Figure 10 presents a comparison, at 700 hPa, of the vertical profiles calculated from TOTCOL_ANALYSES with those from PROFILE_ANALYSES. The latter are considered as the reference since they are assimilated with all their retrieval characteristics. Consequently, they should represent the most realistic state of the atmosphere. Both fields are presented at the global scale and averaged over the month of August 2008. The CO total column analyses and vertical profile analyses are very similar at 700 hPa. The mean bias between both quantities over the globe is very low (∼ 6 % on average). This mean bias is still in the range of the mean specified observation errors (∼ 7–8 %), except over some local areas where the maximum difference ranges between ∼ −12 and ∼ +14 %. However, both CO fields differ from those of the model free run, highlighting the added value of the assimilation (Fig. 10). For example, over South America, central Africa and Asia, the free-run results differ from the analyses at 700 hPa (the differences can exceed 60 %).
Figure 11 presents the same comparison as Fig. 10 but at 200 hPa. The same conclusion as for 700 hPa can be drawn: the profiles deduced from TOTCOL_ANALYSES are very close to those obtained from PROFILE_ANALYSES, with the same patterns, especially over the emission regions of Africa and southern Asia. The mean bias between both fields ranges between −3 and +10 %. However, the comparison between the model free-run field and the vertical profile analyses shows a bias which exceeds 60 %, even if the general patterns of both fields are almost the same. These results confirm again that the CO fields deduced from PROFILE_ANALYSES and those obtained from TOTCOL_ANALYSES are almost the same, with very small differences. The relative mean bias between the two data sets is very small (on average around +5 %) and is generally within the specified errors.

Comparison in terms of zonal means
In this section, we evaluate the differences between the two analyses in terms of zonal means. Figure 12 presents a comparison of CO zonal mean fields between PROFILE_ANALYSES, TOTCOL_ANALYSES and the MOCAGE free-run model. The CO distribution is similar for both analyses (total column and vertical profiles). Over the SH extratropics, both fields show moderate values of CO from the surface up to the mid-troposphere (∼ 400 hPa). CO concentrations from TOTCOL_ANALYSES are slightly overestimated compared to those from PROFILE_ANALYSES. The mean bias between both analyses (Fig. 12, middle) is positive and does not exceed 12 % over the vertical. In the tropics, both fields show a strong CO signature from emissions over Africa that can reach ∼ 200 hPa. Over this region, the differences between the two fields are very small, ranging from −5 to +9 %. In the NH, the two fields show very high CO concentrations in the mid-troposphere. These high CO concentrations correspond to anthropogenic emissions from North America, Europe and East Asia. The mean bias between both analyses ranges between −12 and +12 %. These values are consistent with other results concerning the validation of MOPITT observations against independent data. In fact, the validation results of Emmons et al. (2004), who compared MOPITT observations to independent in situ aircraft profiles, indicate a good quantitative agreement with an average bias of less than 20 ppbv at all levels. Moreover, regarding the distributions of both zonal means, we can conclude that both fields are very similar over the altitude range from the surface up to 150 hPa. However, the comparison between the zonal means deduced from PROFILE_ANALYSES and those of the MOCAGE model free run (Fig. 12, bottom) shows a bias ranging between −35 and +45 %, particularly in the mid-troposphere of the tropical regions and the lower troposphere of the extratropics (< 40 %). These results show that the information derived from the total columns using data assimilation is capable of modifying the vertical structure of the CO distribution over the whole troposphere, showing features very similar to those obtained from the assimilation of MOPITT CO profiles.
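As a rough illustration of how such zonal-mean differences can be computed, the following Python sketch averages a gridded field over longitude and expresses the difference between two zonal means in percent. It assumes fields stored as `(n_levels, n_lat, n_lon)` arrays and takes the difference of the zonal means rather than the zonal mean of the differences; both choices, and the function names, are assumptions made for illustration.

```python
import numpy as np

def zonal_mean(field):
    """Average a (n_levels, n_lat, n_lon) field over longitude, skipping NaNs."""
    return np.nanmean(np.asarray(field, dtype=float), axis=-1)

def relative_difference(test, reference):
    """Difference of `test` with respect to `reference`, in percent."""
    return 100.0 * (test - reference) / reference

# Synthetic fields (not model output) on a coarse grid:
# 3 levels, 4 latitudes, 8 longitudes.
rng = np.random.default_rng(1)
profile_field = rng.uniform(60.0, 150.0, size=(3, 4, 8))
totcol_field = 1.05 * profile_field   # 5 % higher everywhere

bias_zonal = relative_difference(zonal_mean(totcol_field),
                                 zonal_mean(profile_field))
print(bias_zonal.shape, float(bias_zonal.max()))
```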

Behaviour of the increments
Figure 13 shows the assimilation increments ($\delta x^{a} = x^{a} - x^{b}$) of MOPITT CO PROFILE_ANALYSES and TOTCOL_ANALYSES, both zonally averaged for the month of August 2008. Note that the magnitude of the increment depends on the relative magnitudes of the background error and observation error covariances (e.g. Oke et al., 2008). Moreover, the structure of the increment also depends on the structure of the localized background error covariance, $\mathbf{B}\mathbf{H}^{T}$ (see Eq. …). A positive (negative) increment corresponds to increasing (decreasing) CO in the analysis compared to the model's first guess.
The increments of PROFILE_ANALYSES show maximum values at the pressure levels 500 and 700 hPa. This is consistent with the fact that the highest sensitivity of MOPITT observations, via their averaging kernels, is located at these two levels (see e.g. Deeter et al., 2007). Figure 13b shows that the increment of TOTCOL_ANALYSES decreases with altitude. The total column increments seem to be larger in the lower troposphere than those of the vertical profiles. This is consistent with the fact that the vertical structure of the assimilation increment depends on the $\mathbf{H}$ operator through its adjoint, $\mathbf{H}^{T}$ (see Eq. 5). The assimilation increments for both analyses are generally negative in the latitude range ∼ 30° S–60° N at all pressure levels. The largest difference between both increments is located in the latitude range ∼ 30–75° S between 500 and 200 hPa, where the differences between both analyses are very weak (the largest mean bias between both analyses is ∼ 12 %; see e.g. Fig. 12). Nevertheless, Fig. 12c shows that the largest differences between PROFILE_ANALYSES and TOTCOL_ANALYSES (up to +12 %) are located at high latitudes in the upper troposphere between 350 and 250 hPa. This difference appears to come from the upward extension of the positive increments at high latitudes that PROFILE_ANALYSES would put at lower altitudes. On the one hand, this could be explained by the fact that the total column approach using $\mathbf{H}^{T}$ shifts the increments downwards in the atmosphere and smooths them vertically. This pattern becomes stronger at low altitudes and weaker higher up. On the other hand, the PROFILE_ANALYSES increments reduce CO in the tropics and northern midlatitudes and increase it at higher latitudes.
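The dependence of the increment on $\mathbf{B}\mathbf{H}^{T}$ can be made concrete with a small Python sketch. It assumes the standard linear analysis update, $\delta x = \mathbf{B}\mathbf{H}^{T}(\mathbf{H}\mathbf{B}\mathbf{H}^{T} + \mathbf{R})^{-1}(y - \mathbf{H}x^{b})$, which may differ in detail from the formulation used in the assimilation system. The level grid, background profile, background error model and integration weights below are all hypothetical values chosen for illustration; only the 8 % observation error is taken from the text.

```python
import numpy as np

def analysis_increment(x_b, y, H, B, R):
    """Increment dx = B H^T (H B H^T + R)^-1 (y - H x_b)."""
    H = np.atleast_2d(H)
    innovation = np.atleast_1d(y) - H @ x_b
    S = H @ B @ H.T + np.atleast_2d(R)   # innovation covariance
    return B @ H.T @ np.linalg.solve(S, innovation)

# Hypothetical 7-level grid (hPa) and background profile (ppbv).
p = np.array([1000.0, 850.0, 700.0, 500.0, 350.0, 250.0, 150.0])
x_b = np.array([150.0, 130.0, 110.0, 95.0, 85.0, 80.0, 75.0])

# Assumed background errors: 20 % of the background, with a Gaussian
# vertical correlation in log-pressure.
sigma_b = 0.2 * x_b
dlnp = np.log(p)[:, None] - np.log(p)[None, :]
B = np.outer(sigma_b, sigma_b) * np.exp(-0.5 * (dlnp / 0.5) ** 2)

# Stand-in integration operator: pressure-thickness weights, so that
# h @ x is a column-mean mixing ratio rather than a true total column.
w = np.abs(np.gradient(p))
h = w / w.sum()

y = 0.9 * (h @ x_b)      # "observed" column 10 % below the background
r = (0.08 * y) ** 2      # 8 % observation error (LAND_DAY value)

dx = analysis_increment(x_b, y, h, B, r)
print(np.round(dx, 2))
```

For a single column observation, the increment is proportional to $\mathbf{B}h^{T}$: its vertical shape is set entirely by the background error covariance and the integration weights, not by any vertical information in the observation. With background errors that are larger near the surface, as assumed here, the increment is largest at the lowest levels.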

Comparison in terms of vertical profiles at regional scales
In this section, we evaluate the differences between the two analyses in terms of vertical profiles at regional scales. We compare, over the regional domains shown in Fig. 6, the CO vertical profiles deduced from TOTCOL_ANALYSES and those obtained from PROFILE_ANALYSES. The vertical profiles calculated from both analyses are averaged over the different domains for August 2008. Figure 14 presents the vertical profiles with their associated standard deviations. The latter represent the variability of the CO concentration over each domain for the month of August 2008. The profiles calculated from both analyses, as well as their associated standard deviations, are similar for all domains.
Both analyses show the same behaviour for the CO fields in terms of vertical structure at the regional scales, and have similar variability. The maximum standard deviation is generally found at pressure levels between the surface and 700 hPa for both analyses, especially for domains 5 (Africa), 6 (South America) and 3 (East Asia). This is consistent with the results of Fig. 8, which again illustrates the variability of CO sources over Africa, South America and East Asia. Figure 14 confirms that TOTCOL_ANALYSES and PROFILE_ANALYSES provide almost the same vertical structure over regional scales. This shows again that the assimilation of total columns impacts all the vertical levels of the profile in the same way as the assimilation of the vertical profiles.
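A domain-averaged profile and its standard deviation, as plotted in Fig. 14, could be computed along the following lines. This is a sketch only: the box limits are hypothetical (the domains of Fig. 6 are not reproduced here), the statistics are unweighted (no cos(latitude) area weighting), and the function name `regional_profile` is an illustrative assumption.

```python
import numpy as np

def regional_profile(field, lats, lons, lat_bounds, lon_bounds):
    """Mean and standard deviation per level over a lat-lon box.

    field has shape (n_levels, n_lat, n_lon); bounds are (min, max)
    in degrees.
    """
    field = np.asarray(field, dtype=float)
    lats = np.asarray(lats)
    lons = np.asarray(lons)
    lat_mask = (lats >= lat_bounds[0]) & (lats <= lat_bounds[1])
    lon_mask = (lons >= lon_bounds[0]) & (lons <= lon_bounds[1])
    box = field[:, lat_mask, :][:, :, lon_mask]
    box = box.reshape(field.shape[0], -1)
    # Assumption: unweighted statistics over all grid points in the box.
    return np.nanmean(box, axis=1), np.nanstd(box, axis=1)

# Synthetic monthly-mean field on a 2-degree grid with 7 levels.
lats = np.arange(-89.0, 90.0, 2.0)
lons = np.arange(-179.0, 180.0, 2.0)
rng = np.random.default_rng(2)
field = rng.uniform(60.0, 200.0, size=(7, lats.size, lons.size))

# Hypothetical box, not one of the domains of Fig. 6.
mean_prof, std_prof = regional_profile(field, lats, lons, (-20.0, 10.0), (10.0, 40.0))
print(np.round(mean_prof, 1), np.round(std_prof, 1))
```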

Figure 14. Mean CO vertical profiles and their associated standard deviations in parts per billion by volume (ppbv) deduced from MOPITT CO TOTCOL_ANALYSES (red) compared to MOPITT PROFILE_ANALYSES taking into account the averaging kernels as well as the error covariance matrices (blue). Both data sets are averaged over the month of August 2008 and over all the regional domains defined in Fig. 6.

Figure 15 presents the vertical profiles of the mean bias and the corresponding RMS between the two assimilation set-ups (TOTCOL_ANALYSES and PROFILE_ANALYSES) averaged over each domain for the month of August 2008. For all regional domains, the mean bias is low at all pressure levels and is generally less than 10 %, except for domains 2 (Europe) and 3 (East Asia) at 150 hPa, where it reaches ∼ 13 %. These domains are regions of intercontinental transport of pollution. The higher mean biases found there compared to the other domains are consistent with the findings of Kopacz et al. (2010), who state that MOPITT observational errors are in the 10–30 % range and are highest over pollution outflow regions. Consequently, since the errors of TOTCOL_ANALYSES are fixed to be constant during the assimilation, it is reasonable that the differences between both analyses are larger in these regions than in the other regions. The values of the RMS range between 10 and 15 % for most domains. All these values are smaller than or in the range of the expected errors of the assimilation results, and are generally smaller than the observation error values used in the assimilation process. The only exception concerns domain 6 (South America), for which the corresponding RMS is about 20 % in the altitude range between the surface and 400 hPa. This could be attributed to the large variability of the CO field in this domain (see Fig. 12). The vertical profiles of the correlation coefficient between the two analyses over the six domains of comparison are presented in Fig. 16. For the different domains, the correlation coefficients range between 0.75 and 0.99, with most of the values close to 0.9, which shows a good qualitative agreement between the two assimilation results.
The results shown in this section concern the following statistics calculated between both data sets: bias, RMS and correlation coefficient. They show that the vertical profiles deduced from TOTCOL_ANALYSES and those obtained from PROFILE_ANALYSES are consistent with each other.

Validation of the analyses with MOZAIC independent data
To further evaluate both analyses, we compare them to MOZAIC measurements. The MOZAIC programme was launched in January 1993. The measurements started in August 1994, with the installation of ozone and water vapour sensors aboard five commercial aircraft. In 2001, the instrumentation was upgraded by installing carbon monoxide sensors on all aircraft and a total odd nitrogen (NOy) instrument aboard one aircraft. Ozone is measured by UV absorption (Thermo Instruments, model 49-103). The instruments are calibrated before and after each period of deployment (around every 12 months), and in-flight quality control is achieved, both for bias and calibration factor, with a built-in ozone generator. A comparison of the first 2 years of MOZAIC observations with data of the ozonesonde network showed good agreement (Thouret et al., 1998). For CO measurements, the infrared (IR) gas filter correlation technique is employed (Thermo Environmental Instruments, model 48CTL). This IR instrument provides excellent stability, which is important for continuous operation without frequent maintenance. The sensitivity of the instrument was improved by several modifications (Nédélec et al., 2003), achieving a precision of ±5 ppbv (parts per billion by volume) or ±5 % for a 30 s response time.
The comparison was conducted with collocated vertical profiles for the three data sets (MOZAIC, TOTCOL_ANALYSES and PROFILE_ANALYSES) over eight MOZAIC airports visited during the assimilation period (Atlanta, Caracas, Dallas, Frankfurt, Hyderabad, London, Philadelphia and Windhoek). These airports are located in the domain lat [51.6° N–22.6° S], long [96.8° W–78.4° E].
For the three data sets, collocated observations are selected in a 2° radius area over each of the eight airports. The comparisons of TOTCOL_ANALYSES and PROFILE_ANALYSES to MOZAIC observations at all visited airports are presented in Fig. 17. The two analyses behave similarly over all airports. The general qualitative agreement of both analyses with MOZAIC is very good. We note that the difference between MOZAIC and both analyses exceeds 40 ppbv at only one level (850 hPa) over Caracas. Note also that the difference between the analyses and MOZAIC is in the range of 20–25 ppbv for only 6 % of measurements. This difference does not exceed 5 ppbv for more than 50 % of measurements.
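The collocation and the summary fractions quoted above can be sketched in Python as follows. The sketch interprets the "2° radius" as a plain distance in latitude-longitude space (without cos(latitude) scaling), which is an assumption; the function names, the approximate airport coordinates and the sample values are likewise illustrative and are not MOZAIC data.

```python
import numpy as np

def within_radius(lats, lons, lat0, lon0, radius_deg=2.0):
    """Boolean mask of points within radius_deg of (lat0, lon0).

    Assumption: distance is measured directly in degrees of latitude
    and longitude, with longitude differences wrapped to [-180, 180).
    """
    dlat = np.asarray(lats, dtype=float) - lat0
    dlon = (np.asarray(lons, dtype=float) - lon0 + 180.0) % 360.0 - 180.0
    return np.hypot(dlat, dlon) <= radius_deg

def fraction_within(analysis, mozaic, threshold_ppbv):
    """Fraction of collocated values whose absolute difference is
    at most threshold_ppbv."""
    diff = np.abs(np.asarray(analysis, dtype=float)
                  - np.asarray(mozaic, dtype=float))
    return float(np.mean(diff <= threshold_ppbv))

# Illustrative use around an airport at roughly 50.0 N, 8.6 E
# (approximate coordinates, for demonstration only).
grid_lats = np.array([48.5, 50.0, 51.0, 53.0])
grid_lons = np.array([8.0, 8.5, 9.5, 8.5])
mask = within_radius(grid_lats, grid_lons, 50.0, 8.6)
print(mask)

analysis_ppbv = np.array([100.0, 110.0, 150.0, 95.0])
mozaic_ppbv = np.array([102.0, 104.0, 120.0, 96.0])
print(fraction_within(analysis_ppbv, mozaic_ppbv, 5.0))
```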

Conclusions
The aim of this paper is to describe a method that uses data assimilation to derive the vertical profile of CO from its total column when no error covariances or averaging kernels are provided. We have chosen version 3 of MOPITT CO total columns to validate the proposed method since it has the advantage of providing both the vertical profiles and the total columns of CO.
The method is based on the estimation of the observation error covariance matrices (diagonal of the R matrix), using the $\chi^2$ test to obtain consistency between model and observation errors. This specification has been done by discriminating the observations according to day, night, land and sea. The appropriate observation errors are 8 and 11 % for measurements performed over land during daytime (LAND_DAY) and over land during night-time (LAND_NIGHT), respectively. For measurements performed over sea during daytime and night-time (SEA), the observation error is 7 %. The a posteriori diagnostics of the analyses confirm that, for all observation types, the observation errors specified with the proposed method, as well as the corresponding forecast errors, have a Gaussian structure.
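The principle of this error specification can be illustrated with a simplified Python sketch. It assumes scalar, uncorrelated total column observations, an observation error expressed as a percentage of the observed column, and a normalized $\chi^2$ defined as the squared innovation divided by the sum of the background and observation error variances in observation space. The function names and the synthetic data are illustrative assumptions and do not reproduce the assimilation system's implementation.

```python
import numpy as np

def chi2_per_obs(obs, background_equiv, sigma_b, obs_error_percent):
    """Normalized chi-square of each scalar observation.

    obs               observed total columns
    background_equiv  model background in observation space (H x_b)
    sigma_b           background error std in observation space
    obs_error_percent candidate observation error, in % of obs
    """
    sigma_o = obs_error_percent / 100.0 * obs
    innovation = obs - background_equiv
    return innovation ** 2 / (sigma_b ** 2 + sigma_o ** 2)

def best_observation_error(obs, background_equiv, sigma_b, candidates):
    """Candidate error (%) whose mean chi-square is closest to 1."""
    candidates = np.asarray(candidates, dtype=float)
    means = np.array([
        chi2_per_obs(obs, background_equiv, sigma_b, c).mean()
        for c in candidates
    ])
    return candidates[int(np.argmin(np.abs(means - 1.0)))], means

# Synthetic experiment (arbitrary units): true column of 1, background
# error of 10 % and a "true" observation error of 8 %.
rng = np.random.default_rng(3)
n = 50000
truth = np.ones(n)
sigma_b = np.full(n, 0.10)
background = truth + rng.normal(0.0, 0.10, n)
obs = truth + rng.normal(0.0, 0.08, n)

best, means = best_observation_error(obs, background, sigma_b, np.arange(4.0, 13.0))
print(best, np.round(means, 2))
```

In this sketch the scan over candidate errors plays the role of the different error values tested in Table 1: an error that is too small gives a mean $\chi^2$ above 1, and an error that is too large gives a value below 1.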
In the first comparison, CO profiles from MOPITT total column analyses and MOPITT observations show similar patterns in terms of longitude-latitude maps at 700 and 250 hPa. The mean bias at 700 hPa between the two data sets is 15, 18 and 12 % for the LAND_DAY, LAND_NIGHT and SEA types, respectively. At 250 hPa, the respective mean biases are +12, +8 and +7 %. The comparison of the zonal means shows that the CO vertical distribution is homogeneous in both fields from the surface up to 150 hPa. At regional scales, the comparison of the two data sets in terms of vertical profiles shows that the mean bias is generally large at low levels but does not exceed +10 % in magnitude.
In the second comparison, the results show that, over the globe, the general aspect is consistent between the analyses obtained from the MOPITT vertical profiles and the CO total column analyses. The CO fields present the same features, particularly over the emission regions in central Africa, southeastern Asia and northern South America. The mean bias between both data sets is 6 and 8 % at 700 and 200 hPa, respectively. In terms of zonal means, the CO distribution is similar for the two analyses, with very small differences. The total column analyses tend to slightly overestimate the CO concentrations. The maximum mean bias does not exceed 15 % over all levels.
Over regional scales, the comparison of the vertical profiles calculated from both analyses gives a very small mean bias which generally does not exceed +10 % in magnitude, whereas the vertical profile of the correlation coefficient ranges from 0.75 to 0.99. These results concerning the CO distributions, vertical profiles, mean bias, RMS and correlation coefficient confirm that the analyses of the CO total column assimilation are in very good qualitative agreement with the analyses calculated from the assimilation of the MOPITT CO profiles, both at global and regional scales.
Both analyses have also been validated using independent in situ MOZAIC data. The comparison was conducted with collocated vertical profiles for the three data sets over eight airports visited during the assimilation period and located in the domain lat [51.6° N–22.6° S], long [96.8° W–78.4° E]. The comparisons of the two analyses to MOZAIC data over all the visited airports show a very good agreement. The difference between MOZAIC and the two analyses exceeds 40 ppbv at only one level (850 hPa) over Caracas. Moreover, the difference between the analyses and MOZAIC is in the range of 20–25 ppbv for only 6 % of measurements. This difference does not exceed 5 ppbv for more than 50 % of measurements.
Note finally that the DFS of MOPITT V3 is relatively low for vertical profiles (∼ 1.5) as well as for the total columns (∼ 1). In this paper we have demonstrated that, for this kind of data, the present method consisting of deducing the profiles from the total columns remains valid when only using the adjoint of the integration operator. Note that for other types of data for which the DFS is greater than that of MOPITT V3, the method presented has to be tested and evaluated against independent observations.

Fig. 1. (Top) Longitude-latitude map of the DFS averaged over August 2008 corresponding to the vertical profiles of MOPITT V3. (Bottom) Frequency distribution of the DFS corresponding to all vertical profiles measured during the same period.

Fig. 3. A posteriori verification of the observation error specification for the analyses obtained from the MOPITT CO total column, for which the observation errors are estimated using the proposed method. (Left) Histogram of OMF (Observations Minus Forecast) differences normalized by the specified observation errors. The red line is a Gaussian fit to the histogram. The good agreement between the histogram and the fit function supports the assumption of Gaussian errors in the observations and the forecast. (Right) Histograms of Observations Minus Analysis (OMA: red lines) and OMF (blue lines).

Fig. 4. Comparison of CO analyses obtained by the assimilation of MOPITT V3 CO total column observations with the optimal error estimated by the $\chi^2$ test (a and b) to the operational MOPITT V3 CO retrieved profiles (c and d) at 700 hPa (left panels) and 250 hPa (right panels). The corresponding relative differences between both data sets (TOTCOL_ANALYSES minus observations) are indicated in the bottom panels for both pressure levels (e for 700 hPa and f for 250 hPa). Blue and red colours indicate negative and positive differences, respectively. Note that this figure corresponds to an average over August 2008 for all observations carried out over land and sea during daytime and night-time.

Fig. 5. Zonal mean of MOPITT CO TOTCOL_ANALYSES (left panels) compared to the zonal mean of the MOPITT CO observations (right panels) for August 2008. The comparison is done for observations made over land during daytime (upper panels), over land during night-time (middle panels) and over sea during daytime and night-time (bottom panels).

Fig. 6. Main domains of CO emissions considered for the regional validation of the CO TOTCOL_ANALYSES obtained with the proposed method.

Fig. 7. Mean CO vertical profiles in parts per billion by volume (ppbv) deduced from MOPITT V3 CO TOTCOL_ANALYSES (blue) compared to the operational MOPITT V3 observations (red). Both data sets are averaged over August 2008 and over all the domains defined in Fig. 6, and are shown with their corresponding standard deviations.

Fig. 8. Mean bias and corresponding RMS (root mean square) between CO vertical profiles deduced from the MOPITT V3 CO TOTCOL_ANALYSES and the MOPITT V3 observations. The comparison is made for observations carried out over land during daytime (red), over land during night-time (black) and over sea (blue).

Fig. 9. Same as Fig. 8 but for the correlation coefficient between CO vertical profiles deduced from MOPITT CO TOTCOL_ANALYSES and the MOPITT V3 observations. The comparison is made for each level of the MOPITT V3 retrievals.

Fig. 10. Maps of the CO field at 700 hPa for (a) MOPITT PROFILE_ANALYSES taking into account averaging kernels and observation error covariance matrices, (b) MOPITT TOTCOL_ANALYSES and (c) the MOCAGE free-run field. The bottom panels present the difference (in %) between TOTCOL_ANALYSES and PROFILE_ANALYSES (d), and the difference between the model and PROFILE_ANALYSES (e).

Fig. 12. Zonal means of the CO field for the month of August 2008 as obtained by (a) MOPITT PROFILE_ANALYSES taking into account averaging kernels and observation error covariance matrices, (b) MOPITT TOTCOL_ANALYSES and (d) the MOCAGE free-run model. The panels on the right present the difference (in %) between TOTCOL_ANALYSES and PROFILE_ANALYSES (c) and the difference between the free-run model and PROFILE_ANALYSES (e).

Fig. 13. Zonal means of the CO increment in ppbv for the month of August 2008 as obtained by (a) MOPITT PROFILE_ANALYSES taking into account averaging kernels and observation error covariance matrices and (b) MOPITT TOTCOL_ANALYSES.


Fig. 15. Same as Fig. 14 but for the mean bias and the corresponding RMS (root mean square) between both analyses (TOTCOL_ANALYSES and PROFILE_ANALYSES).

initially developed in the framework of the
www.atmos-meas-tech.net/7/3035/2014/ Atmos. Meas. Tech., 7, 3035-3057, 2014. L. El Amraoui et al.: CO total column assimilation

Table 1. Mean and median values of the chi-square ($\chi^2$) test for different error values of MOPITT V3 total column observations. The error values of the observations for which the $\chi^2$ test is the closest to 1 are indicated in boldface. They are 8 % for LAND_DAY, 11 % for LAND_NIGHT and 7 % for SEA. These error values are fixed within the assimilation system for all experiments concerning MOPITT V3 total columns.