Retrieval algorithm for CO2 and CH4 column abundances from short-wavelength infrared spectral observations by the Greenhouse gases Observing SATellite

The Greenhouse gases Observing SATellite (GOSAT) was launched on 23 January 2009 to monitor the global distributions of carbon dioxide and methane from space. It has operated continuously since then. Here, we describe a retrieval algorithm for column abundances of these gases from the short-wavelength infrared spectra obtained by the Thermal And Near infrared Sensor for carbon Observation-Fourier Transform Spectrometer (TANSO-FTS). The algorithm consists of three steps. First, cloud-free observational scenes are selected by several cloud-detection methods. Then, column abundances of carbon dioxide and methane are retrieved based on the optimal estimation method. Finally, the retrieval quality is examined to exclude low-quality and/or aerosol-contaminated results. Most of the random error in the retrievals comes from instrumental noise. The interferences due to auxiliary parameters retrieved simultaneously with the gas abundances are small. The evaluated precisions of the retrieved column abundances for single observations are less than 1% in most cases. The interhemispherical differences and temporal variation patterns of the retrieved column abundances show features similar to those of an atmospheric transport model.

Correspondence to: Y. Yoshida (yoshida.yukio@nies.go.jp)


Introduction
Atmospheric carbon dioxide (CO2) and methane (CH4) are well-known major greenhouse gases. The concentrations of these gases have increased rapidly over the last 250 yr, most probably due to human activities (IPCC, 2007). However, our current knowledge of the carbon cycle is still insufficient because observations of CO2 and CH4 are spatially and temporally limited. This leads to large uncertainties in future climate predictions. Satellite measurement is one of the most effective approaches to monitoring the global distributions of greenhouse gases at high spatiotemporal resolution and is expected to improve the accuracy of source and sink estimates of these gases (e.g., Rayner and O'Brien, 2001; Houweling et al., 2004; Patra et al., 2003; Chevallier et al., 2007, 2009; Kadygrov et al., 2009; Baker et al., 2010).

Over the past decade, two types of satellite observations of atmospheric CO2 and CH4 have been performed. The first is thermal infrared (TIR) observations, which are typified by the Atmospheric InfraRed Sounder (AIRS) onboard the Aqua satellite (Crevoisier et al., 2004; Chahine et al., 2005; Xiong et al., 2008). The second is short-wavelength infrared (SWIR) observations by the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY) onboard the Envisat satellite (Frankenberg et al., 2005a, b; Barkley et al., 2006; Buchwitz et al., 2006; Schneising et al., 2008, 2009). TIR observations are sensitive to CO2 and CH4 in the middle to upper troposphere, whereas SWIR observations are also sensitive to gas abundances near the surface. Because the major sources and sinks of CO2 and CH4 exist near the surface, SWIR observations are more suitable than TIR observations from the viewpoint of carbon flux estimation (Chevallier et al., 2005).

Published by Copernicus Publications on behalf of the European Geosciences Union.

Table 1. State vector elements (number of elements in parentheses), a priori values, and a priori variances.
Element (number of elements)      A priori          A priori variance
…                                 GPV (SR)          GPV (DB)
AOD (1)                           SPRINTARS (SR)    (1.0)² (fixed)
surface pressure [hPa] (1)        GPV (SR)          (20)² (fixed)
temperature profile bias [K] (1)  0.0 (fixed)       (20)² (fixed)
wavenumber dispersion (3)         0.0 (fixed)       (10⁻⁵)² (fixed)
surface albedo (22)               MODIS (DB)        (1.0)² (fixed)
wind speed [m s⁻¹] (1)            GPV (SR)          GPV (DB)
adjustment factor (2)             1.0 (fixed)       (0.05)² (fixed)

The accuracy of the flux estimation depends significantly on the precision and accuracy of the column abundance retrieved from satellite measurements. Column abundances retrieved at high precision yield a low uncertainty in the estimated flux. For data averaged spatially (over a few hundred to several thousand square kilometers) and temporally (weekly to monthly), a precision on the order of 1% or better in CO2 column abundances is required to improve our current knowledge of surface CO2 fluxes (Rayner and O'Brien, 2001; Houweling et al., 2004; Patra et al., 2003). Random errors in the retrieved column abundance have a smaller impact on the estimated flux than bias in the retrieved value, which can lead to significant biases in the estimated flux (Patra et al., 2003; Houweling et al., 2005). Understanding the precision and accuracy of the retrieved column abundance is important not only for improving the retrieval algorithm itself but also for flux estimation.

SCIAMACHY is the first satellite instrument to show the global distribution of the CO2 column abundance from space, although it was not specialized for CO2 observations. The precision (1 to 2%) and accuracy (−1.5%) of the CO2 column abundances obtained with the current SCIAMACHY retrieval method (Schneising et al., 2008) are still insufficient for flux estimation, and the retrieval algorithm has been continuously improved. Recently, two satellites have been developed to measure CO2 column abundances with higher precision. The Orbiting Carbon Observatory (OCO) was developed for CO2 observations (Crisp et al., 2004). However, the satellite was lost because of a launch failure (Palmer and Rayner, 2009). Currently, the OCO
reflight mission (OCO-2) is aiming to launch no later than February 2013 (http://oco.jpl.nasa.gov/). The Greenhouse gases Observing SATellite (GOSAT), which was launched on 23 January 2009, is designed to monitor the global CO2 and CH4 distributions from space. GOSAT was put in a sun-synchronous orbit at 666-km altitude with 3-day recurrence and the descending node around 12:48 local time. It is equipped with two instruments: the Thermal And Near infrared Sensor for carbon Observation-Fourier Transform Spectrometer (TANSO-FTS) and the Cloud and Aerosol Imager (TANSO-CAI) (Kuze et al., 2009). The TANSO-FTS performs high-spectral-resolution measurements from SWIR to TIR (0.758 to 14.3 µm). Here, we focus on the retrieval algorithms of CO2 and CH4 column abundances from the SWIR spectra. These are used for the operational data processing of GOSAT; the corresponding data product names and versions are Level 2 CO2/CH4 column amount (SWIR) V01.10, V01.20, and V01.30. Note that (1) the different versions of the Level 2 product indicate the different versions of the input TANSO-FTS Level 1B product (measured spectrum), the corresponding Level 1B versions being V050.050, V080.080, and V100.100, respectively, and (2) the major differences in the Level 1B product are the detection criteria for signal saturation and spike noise during acquisition of an interferogram; that information is given as quality flags in the Level 1B product. For the TANSO-FTS TIR band, Saitoh et al. (2009) have described the CO2 retrieval method in detail.
Many algorithms have been developed to retrieve column abundances of trace gases. The differential optical absorption spectroscopy (DOAS) retrieval method has been widely used for this purpose (see Table 1 of Hönninger et al., 2004). For the retrieval of SCIAMACHY data, several DOAS-based algorithms have been developed (Buchwitz et al., 2000; Barkley et al., 2006; Frankenberg et al., 2005a). In those algorithms, however, aerosol scenarios and/or the surface albedo (two key parameters for the optical path length modification) are assumed. When the actual equivalent optical path length differs from the assumed path length, large errors arise in the retrieved results. Oshchepkov et al. (2008, 2009) proposed a new DOAS-based retrieval algorithm that simultaneously retrieves the photon path length probability density function parameters (Bril et al., 2007) to correct the optical path length modification effect. Focusing on methane retrieval, Frankenberg et al. (2005b) proposed a CO2 proxy method that simultaneously retrieves the CO2 and CH4 column abundances using the no-aerosol assumption and uses the CH4 to CO2 column ratio to remove most of the optical path length modification effect. Connor et al. (2008) and Reuter et al. (2010) developed retrieval algorithms for OCO and SCIAMACHY, respectively.
In Sect. 2 of this paper, a brief introduction of GOSAT and the strategy of the SWIR retrieval algorithm are given.
The SWIR retrieval algorithm consists of three parts: data screening suitable for the retrieval analyses, optimal estimation of gaseous column abundances, and quality checking of the retrieved results. Details of each step are described in Sects. 3 to 5. The basic performance of the actual GOSAT observations is briefly introduced in Sect. 6. The summary is given in Sect. 7. The main purpose of this paper is to describe the retrieval algorithm. Detailed discussions of the spatial pattern and seasonal cycle of the retrieval results will be presented in other papers. The detailed mathematical formulations of the GOSAT operational processing, including pre- and post-processing for the retrievals, are summarized in the algorithm theoretical basis document. This is available from the data distribution website of the GOSAT project (https://data.gosat.nies.go.jp).

GOSAT instruments
TANSO-FTS is the main instrument of GOSAT. It has three narrow bands in the SWIR region (0.76, 1.6, and 2.0 µm; TANSO-FTS Bands 1, 2, and 3, respectively) and a wide TIR band (5.5 to 14.3 µm; TANSO-FTS Band 4) at a spectral resolution (interval) of about 0.2 cm⁻¹. The full width at half-maximum (FWHM) of the instrument line shape function (ILSF) is about 0.27 cm⁻¹, which clearly resolves the absorption lines of CO2 and CH4 in the spectral measurement data (Fig. 1). For the SWIR bands, the incident light is divided by a polarization beam splitter. It is then simultaneously recorded as two orthogonal polarization components (hereafter called P/S components). The TANSO-FTS instantaneous field of view (IFOV) is 15.8 mrad, which corresponds to a nadir circular footprint of about 10.5 km in diameter at sea level. The TANSO-FTS has a pointing mechanism that makes it possible to observe in off-nadir directions within pointing mirror driving angles of ±35 degrees in the cross-track direction and ±20 degrees in the along-track direction.
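The relation between the IFOV and the nadir footprint quoted above can be checked with simple geometry. The following is a minimal sketch assuming a flat surface directly below the satellite and the 666-km orbit altitude given in the Introduction; the function name is illustrative and not part of any GOSAT software.

```python
import math


def nadir_footprint_km(ifov_mrad: float, altitude_km: float) -> float:
    """Diameter of a circular nadir footprint, assuming flat-Earth geometry."""
    half_angle_rad = 0.5 * ifov_mrad * 1.0e-3  # half of the full IFOV angle
    return 2.0 * altitude_km * math.tan(half_angle_rad)


# IFOV of 15.8 mrad viewed from 666 km altitude.
footprint = nadir_footprint_km(15.8, 666.0)
print(f"nadir footprint diameter: {footprint:.2f} km")
```

The result is close to the footprint diameter of about 10.5 km stated in the text; off-nadir pointing would enlarge and distort the footprint, which this sketch does not model.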
Details on TANSO-FTS and TANSO-CAI, along with the conversion method from raw interferograms to spectra (i.e., Level 1 processing) and pre-launch calibration results, have been described by Kuze et al. (2009). Although the initial geometric and radiometric calibrations for TANSO-CAI have been performed, those for TANSO-FTS are still underway. A correction for the radiometric degradation of the TANSO-FTS as a function of time is applied to the measured spectra for V01.10, V01.20, and V01.30 of the SWIR Level 2 products.

Requirements for a forward model to simulate TANSO-FTS SWIR spectra
To interpret the spectral measurement data, a forward model that describes the physics of the measurement process is necessary. Incoming solar irradiance is reflected from the Earth's surface and scattered and absorbed by atmospheric molecules, aerosols, and clouds. Only the light reflected and scattered in the satellite direction is observed by TANSO-FTS SWIR. During reflection and scattering, the polarization state of the light can change. Furthermore, the distributions of atmospheric composition, aerosols, and clouds are highly variable in space and time. In addition, the spectral resolution of a forward model must be finer than the spectral resolution of TANSO-FTS SWIR (0.2 cm⁻¹) to simulate the TANSO-FTS spectrum.
A forward model that can describe the scattering process of polarized light in a three-dimensional space at line-by-line spectral resolution is desirable for accurate retrieval, but it is not suitable for the operational processing of GOSAT data because of the huge computational cost. Because TANSO-FTS measures two orthogonal polarization components, a total intensity spectrum can be constructed from the measured spectra (see Appendix A). Although the structure of the three-dimensional atmosphere is particularly important for cloud identification, those effects are not taken into account in the forward model so as to give priority to computational efficiency. Therefore, we adopted a one-dimensional line-by-line scalar radiative transfer model that can treat the scattering process as our forward model. Spectroscopic databases with finer spectral resolution than that of TANSO-FTS have been prepared. Absorption cross sections were calculated at 0.01 cm⁻¹ spectral intervals and then used to construct a three-dimensional look-up table as a function of wavenumber, pressure, and temperature. The absorption line profile of the O2 A band was calculated based on Tran et al. (2006) and Tran and Hartmann (2008). That of the CO2 1.6-µm band was based on Lamouroux et al. (2010), and the Voigt profile with the HITRAN 2008 database (Rothman et al., 2009) was used for CH4 and H2O. The high-resolution solar irradiance database (0.004 to 0.01 cm⁻¹) reported by Dr. R. Kurucz (http://kurucz.harvard.edu/sun/irradiance2008/) was used as the incident solar spectrum.

Outline of the TANSO-FTS SWIR retrieval processing
Here, we focus only on cloud-free measurement scenes for which column abundances are retrieved. We then evaluate the characteristics of GOSAT retrievals for those scenes. Several cloud-detection methods were applied to identify cloud-contaminated scenes. Further data screenings were conducted to select appropriate measurement scenes and so avoid significant discrepancies between the measurements and the forward model. Column abundances of CO2 and CH4 (hereafter referred to as VCO2 and VCH4) were retrieved based on the optimal estimation method (Rodgers, 2000). The gaseous profiles of CO2 and CH4 were retrieved first, and then VCO2 and VCH4 were obtained as the final outputs (Rodgers and Connor, 2003). The spectral regions of the 1.6-µm CO2 absorption band (spectral wavenumber/wavelength range from 6180 to 6380 cm⁻¹/1.567 to 1.618 µm), the 1.67-µm CH4 absorption band (5900 to 6150 cm⁻¹/1.626 to 1.695 µm), and the 0.76-µm O2 absorption band (12950 to 13200 cm⁻¹/0.7576 to 0.7722 µm) were used for the retrieval. To suppress the computational costs in the operational processing, the number of layers for the forward model was minimized. As described in Sect. 4.2, the number of layers for retrieval is 15. To consider the temperature and pressure dependencies of the gaseous absorption, each layer was divided into 12 sub-layers in the calculation of the optical thickness due to the gaseous absorption. The cumulative optical thickness of each layer was used in the forward model. Furthermore, the P and S polarization components of the observed spectra were synthesized to produce a total intensity spectrum (see Appendix A). This total intensity spectrum was used as the measurement vector and compared with the forward model based on the scalar radiative transfer code.
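The wavenumber and wavelength limits of the retrieval sub-bands are related by λ [µm] = 10⁴ / ν [cm⁻¹]. The sketch below applies this conversion to the three wavenumber ranges given above; the dictionary and function names are illustrative only.

```python
def wavenumber_to_wavelength_um(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    return 1.0e4 / wavenumber_cm1


# Sub-band limits in cm^-1, as stated in the text.
SUB_BANDS = {
    "CO2": (6180.0, 6380.0),
    "CH4": (5900.0, 6150.0),
    "O2": (12950.0, 13200.0),
}


def band_in_micrometres(low_cm1: float, high_cm1: float) -> tuple:
    """Return (shortest, longest) wavelength of a band given in wavenumbers."""
    # The highest wavenumber corresponds to the shortest wavelength.
    return (wavenumber_to_wavelength_um(high_cm1),
            wavenumber_to_wavelength_um(low_cm1))


for name, (low, high) in SUB_BANDS.items():
    short, long_ = band_in_micrometres(low, high)
    print(f"{name}: {short:.4f} to {long_:.4f} um")
```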
The radiometric sensor degradation and dispersion of the wavenumber axis of the observed spectrum were corrected before the polarization synthesis. The channels contaminated with Fraunhofer lines in the solar atmosphere were removed from the retrieval to avoid possible bias errors. After the retrieval, the quality of the retrieved result was evaluated. For the VCO2 and VCH4 that passed the quality control criteria, the column-averaged dry air mole fractions (hereafter referred to as XCO2 and XCH4), which are the ratios of the column abundances to the dry air column abundance, were calculated and also provided as SWIR Level 2 products. Figure 2 depicts the brief flow of the retrieval procedure. A detailed description of each retrieval process is given in Sects. 3, 4, and 5.

Data screening

Cloud detection
The cloud-detection method by TANSO-CAI is based on the algorithm developed by Ishida and Nakajima (2009). First, several threshold tests are applied to each TANSO-CAI pixel: single reflectance tests and reflectance ratio tests. Each test defines a clear confidence level (a floating-point value between 0 and 1), and then the integrated clear confidence level is evaluated in a cloud-conservative approach. In the TANSO-CAI cloud flag test, the integrated clear confidence level is converted to a bit flag (0 or 1) with a given threshold value. Consequently, all TANSO-CAI pixels are categorized as clear or cloudy pixels. The optical path length modification due to clouds within the TANSO-FTS IFOV introduces a bias error in the retrieved column abundances. Currently, if more than one cloudy pixel is found within the TANSO-FTS IFOV, the measurement scene is identified as cloud-contaminated and rejected from the retrieval. The magnitude of the bias error due to the cloud depends on its three-dimensional distribution within and around the TANSO-FTS IFOV, along with the cloud optical properties. The magnitude of this bias error will be evaluated in future work.
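The pixel-level flagging and scene-level rejection described above can be sketched as follows. This is an illustration only: taking the minimum of the individual clear confidence levels is an assumed stand-in for the cloud-conservative integration of Ishida and Nakajima (2009), and the threshold of 0.5 is a hypothetical value, not the operational one.

```python
from typing import Sequence


def integrated_clear_confidence(test_levels: Sequence[float]) -> float:
    """Combine per-test clear confidence levels (each between 0 and 1).

    Assumption: the minimum is used as a simple cloud-conservative rule,
    so a single test that suspects cloud dominates the result.
    """
    return min(test_levels)


def is_cloudy_pixel(test_levels: Sequence[float], threshold: float = 0.5) -> bool:
    """Convert the integrated confidence into a bit flag (True = cloudy)."""
    return integrated_clear_confidence(test_levels) < threshold


def scene_is_cloud_contaminated(pixels: Sequence[Sequence[float]],
                                threshold: float = 0.5) -> bool:
    """Reject the scene if more than one pixel in the IFOV is cloudy."""
    n_cloudy = sum(is_cloudy_pixel(levels, threshold) for levels in pixels)
    return n_cloudy > 1


# Each inner list holds the confidence levels of the tests for one pixel.
ifov_pixels = [[0.9, 0.8], [0.95, 0.2], [0.7, 0.9], [0.1, 0.6]]
print(scene_is_cloud_contaminated(ifov_pixels))
```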
From the actual data processing, we found that the TANSO-CAI cloud flag tended to categorize sub-pixel-sized clouds as clear pixels over the ocean. In addition, Nakajima et al. (2008) noted that TANSO-CAI often fails to detect optically thin cirrus clouds because it does not have any thermal infrared channels sensitive to clouds in the upper troposphere. To mitigate these TANSO-CAI limitations, other cloud-detection methods (a TANSO-CAI spatial coherence test and a TANSO-FTS 2-µm-band test) are applied to select cloud-free scenes.
The TANSO-CAI spatial coherence test looks for the existence of sub-pixel-sized clouds over the ocean. In most cases, the ocean reflectance within the TANSO-FTS IFOV is expected to be almost uniform. However, the magnitude of the perturbation in TANSO-CAI radiance due to sub-pixel-sized clouds changes randomly for each TANSO-CAI pixel. Therefore, if the standard deviation of TANSO-CAI radiance within the TANSO-FTS IFOV exceeds some threshold value, the measurement scene is expected to contain sub-pixel-sized clouds. The threshold value is empirically determined so as to separate clear and cloudy scenes identified by visual inspection of TANSO-CAI images. The TANSO-CAI spatial coherence test is only applied to ocean measurement scenes because the spatial variability of the ground surface albedo within the TANSO-FTS IFOV is relatively high for the land measurement scenes.
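A minimal sketch of the spatial coherence test is given below. The radiance values and the threshold of 0.01 are hypothetical; the operational threshold is determined empirically and is not stated here.

```python
import statistics
from typing import Sequence


def has_subpixel_clouds(cai_radiances: Sequence[float],
                        threshold: float,
                        is_ocean: bool = True) -> bool:
    """Spatial coherence test on TANSO-CAI radiances within one FTS IFOV.

    The test is applied only to ocean scenes; land scenes are never
    flagged by it because their surface albedo is intrinsically variable.
    """
    if not is_ocean:
        return False
    return statistics.pstdev(cai_radiances) > threshold


uniform_ocean = [0.050, 0.050, 0.050, 0.050]
perturbed_ocean = [0.050, 0.090, 0.040, 0.120]
print(has_subpixel_clouds(uniform_ocean, threshold=0.01))
print(has_subpixel_clouds(perturbed_ocean, threshold=0.01))
```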
The TANSO-FTS 2-µm-band test looks for the existence of elevated scattering particles (mainly cirrus cloud) using the measured radiance of the strong water vapor (H2O) absorption band (5150 to 5200 cm⁻¹/1.923 to 1.942 µm) included in the TANSO-FTS 2-µm band (TANSO-FTS Band 3). In this test, the 10 most H2O-absorptive channels are selected to avoid contamination by surface-reflected light. If there is no scattering, almost no radiance in the strong H2O absorption band region is observed by TANSO-FTS. Because the H2O mixing ratio decreases exponentially with height under standard atmospheric conditions, the radiation scattered by elevated particles is not completely absorbed by H2O and reaches the TANSO-FTS, depending on the partial column abundance of H2O above the scattering layer. Therefore, if the average radiance of the selected strong H2O absorption channels is larger than the noise-level radiance, some scattering particles are expected to exist within the TANSO-FTS IFOV. This method has low sensitivity to low-altitude clouds and aerosols but high sensitivity to highly elevated scattering particles.
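The channel selection and threshold comparison of the 2-µm-band test can be sketched as follows. The per-channel absorption strengths used for ranking, the radiances, and the noise level are all hypothetical inputs chosen for illustration.

```python
from typing import Sequence


def has_elevated_scatterers(radiances: Sequence[float],
                            absorption_strengths: Sequence[float],
                            noise_level: float,
                            n_channels: int = 10) -> bool:
    """Flag a scene if the strongest H2O channels show radiance above noise."""
    # Rank channel indices from most to least absorptive.
    ranked = sorted(range(len(radiances)),
                    key=lambda i: absorption_strengths[i],
                    reverse=True)
    selected = ranked[:n_channels]
    mean_radiance = sum(radiances[i] for i in selected) / len(selected)
    return mean_radiance > noise_level


# 20 hypothetical channels; the last 10 are the most absorptive.
strengths = [float(i) for i in range(20)]
clear_scene = [3.0] * 10 + [0.0] * 10    # saturated channels are dark
cirrus_scene = [3.0] * 10 + [5.0] * 10   # light scattered above the H2O
print(has_elevated_scatterers(clear_scene, strengths, noise_level=1.0))
print(has_elevated_scatterers(cirrus_scene, strengths, noise_level=1.0))
```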

Other screening items
Even when TANSO-FTS measurement scenes are identified as cloud free by the cloud-detection methods, some measurement scenes are not suitable for retrieval analysis because the quality of the spectrum is low or the data acquisition conditions are inconsistent with assumptions made in the retrieval analysis.
Some interferograms have signal saturations around the zero-path-difference (ZPD) position at center burst because of strong reflection by clouds or deserts. Signal-saturated interferograms are removed because they give incorrect absorption spectra and cause large retrieval errors.
In general, observations by FTS require that the incident light into the FTS be unchanged during the acquisition of an interferogram. A nominal observation by TANSO-FTS takes about 4 s, and the satellite moves about 28 km during this period. Therefore, TANSO-FTS has an image-motion-compensation (IMC) mechanism to keep pointing at the same IFOV area. However, the IFOV area sometimes fluctuates due to the instability of the pointing-mirror mechanics and of the satellite attitude. Such measurement scenes are also rejected from the retrieval.
In the retrieval analysis, we assumed a plane-parallel atmosphere.
An air-mass factor for a plane-parallel atmosphere shows more than 1% error when the solar zenith angle is greater than 72 degrees. Therefore, the retrievals are limited to TANSO-FTS measurement scenes with solar zenith angles less than 70 degrees. Furthermore, ground surface undulation within and around the TANSO-FTS IFOV conflicts with the plane-parallel assumption. This effect on the retrieval results will be evaluated in a future study. Considering eight lines of sight (LOS) that are shifted by 1.3 mrad around the center LOS of TANSO-FTS, we calculate the mean surface elevation within the TANSO-FTS IFOV for each LOS. TANSO-FTS measurement scenes are removed when the maximum difference between the mean surface elevation for the center LOS and those for the eight shifted LOSs is greater than 25 m. The mean surface elevation in the TANSO-FTS IFOV is estimated from the Global 30 Arc-Second Elevation Data Set (GTOPO30). The measurement scenes removed by this criterion are limited to those near high mountain ranges such as the Rockies, the Andes, and the Himalayas.

Atmos. Meas. Tech., 4, 717-734, 2011 (www.atmos-meas-tech.net/4/717/2011/)
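The two geometric screening criteria above (solar zenith angle and surface undulation) can be expressed compactly. The sketch below uses the limits stated in the text; the function name and the example elevations are illustrative.

```python
from typing import Sequence

SZA_LIMIT_DEG = 70.0       # retrievals limited to solar zenith angles below this
ELEVATION_LIMIT_M = 25.0   # maximum allowed mean-elevation difference


def passes_geometry_screening(solar_zenith_deg: float,
                              center_elevation_m: float,
                              shifted_elevations_m: Sequence[float]) -> bool:
    """Apply the solar zenith angle and surface undulation criteria.

    shifted_elevations_m holds the mean surface elevation within the IFOV
    for each of the eight lines of sight shifted around the center one.
    """
    if solar_zenith_deg >= SZA_LIMIT_DEG:
        return False
    max_difference = max(abs(e - center_elevation_m)
                         for e in shifted_elevations_m)
    return max_difference <= ELEVATION_LIMIT_M


flat = [102.0, 98.0, 101.0, 99.0, 100.5, 99.5, 103.0, 97.0]
mountain = [100.0, 140.0, 90.0, 60.0, 110.0, 95.0, 130.0, 85.0]
print(passes_geometry_screening(45.0, 100.0, flat))
print(passes_geometry_screening(45.0, 100.0, mountain))
print(passes_geometry_screening(75.0, 100.0, flat))
```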

Formulation of the maximum a posteriori (MAP) retrieval
The measurement vector y is expressed by a forward model F(x, b) as

$$ y = F(x, b) + \varepsilon, $$

where x is the state vector to be retrieved, b is a model parameter vector necessary to describe the forward model that is not retrieved, and ε is an error vector comprising the measurement error and the forward model error. The optimal estimate of x for non-linear maximum a posteriori (MAP) retrieval is obtained by minimizing the cost function, defined as (Rodgers, 2000)

$$ J(x) = \left[ y - F(x, b) \right]^{T} S_{\varepsilon}^{-1} \left[ y - F(x, b) \right] + (x - x_a)^{T} S_a^{-1} (x - x_a), $$

where x_a is the a priori state of x, S_a is the a priori variance-covariance matrix (VCM), and S_ε is the error covariance matrix. To obtain the optimal estimate of x stably, the Levenberg-Marquardt method (Levenberg, 1944; Marquardt, 1963) is adopted, and the solution is obtained iteratively. In the iterative update equation, the subscript i denotes the index of the i-th iteration; K is the Jacobian matrix, which is the derivative of the forward model with respect to the state vector x, i.e., K = ∂F(x, b)/∂x; λ is a non-negative factor chosen at each step to minimize the cost function (the initial value of λ is zero); and T is defined by the Cholesky decomposed matrices S_ε = T_ε^T T_ε and S_a^{-1} = T_ainv^T T_ainv. The iteration continues until the changes in the normalized chi-square χ² = J(x)/m and in x between iterations become sufficiently small, where m indicates the number of channels used in the retrieval analysis, or until the number of iterations reaches the predetermined maximum number (currently 20 iterations).
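To illustrate how a Levenberg-Marquardt MAP iteration behaves, the following is a minimal sketch for a scalar state, using the textbook update of Rodgers (2000) rather than the Cholesky-based form of the operational code. The exponential forward model, the absorption coefficients, the λ update rule, and the convergence tolerance are all assumptions made for the example.

```python
import math


def forward(x, coeffs):
    """Hypothetical forward model: transmittance exp(-x * c) per channel."""
    return [math.exp(-x * c) for c in coeffs]


def jacobian(x, coeffs):
    """Derivative of the forward model with respect to the scalar state."""
    return [-c * math.exp(-x * c) for c in coeffs]


def cost(x, y, coeffs, noise_var, x_a, prior_var):
    """MAP cost: measurement misfit plus a priori constraint."""
    misfit = sum((yj - fj) ** 2 for yj, fj in zip(y, forward(x, coeffs)))
    return misfit / noise_var + (x - x_a) ** 2 / prior_var


def retrieve(y, coeffs, noise_var, x_a, prior_var, max_iter=20, tol=1e-10):
    x, lam = x_a, 0.0                      # lambda starts at zero
    j_old = cost(x, y, coeffs, noise_var, x_a, prior_var)
    for _ in range(max_iter):
        f, k = forward(x, coeffs), jacobian(x, coeffs)
        grad = (sum(kj * (yj - fj) for kj, yj, fj in zip(k, y, f)) / noise_var
                - (x - x_a) / prior_var)
        hess = sum(kj * kj for kj in k) / noise_var + 1.0 / prior_var
        step = grad / (hess + lam / prior_var)   # damped Gauss-Newton step
        j_new = cost(x + step, y, coeffs, noise_var, x_a, prior_var)
        if j_new <= j_old:                 # accept the step, relax damping
            small = abs(j_old - j_new) / len(y) < tol and abs(step) < tol
            x, j_old, lam = x + step, j_new, lam / 10.0
            if small:
                break
        else:                              # reject the step, increase damping
            lam = max(10.0 * lam, 1.0)
    return x


coeffs = [0.2, 0.5, 1.0, 1.5]
y_obs = forward(1.2, coeffs)               # noise-free synthetic measurement
x_hat = retrieve(y_obs, coeffs, noise_var=1.0e-6, x_a=1.0, prior_var=1.0)
print(f"retrieved state: {x_hat:.6f}")
```

With a weak a priori constraint and small measurement noise, the retrieved state stays very close to the value of 1.2 used to generate the synthetic measurement.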
The error covariance matrix of the retrieved state, S, is expressed as the sum of the measurement noise S_m, the smoothing error S_s, and the interference error S_i:

$$ S = S_m + S_s + S_i. $$

In the expressions for these terms, S_e is the covariance matrix of the ensemble of states, and the subscripts x and c indicate the submatrices for the target (CO2 or CH4) and auxiliary elements, respectively. The submatrices of the averaging kernel matrix for the target elements and for the interference between the target and auxiliary elements are represented by A_xx and A_xc, respectively, and are expressed using the gain matrix. Using the dry air partial column h, the column abundance V_target, the column-averaged dry-air mole fraction X_target and its error components σ_Y, and the column averaging kernel a_x are calculated. Here, the subscript target indicates the target gaseous species (CO2 or CH4), n is the number of layers of the retrieval vertical grid, w_dry,l is the partial dry air column abundance of the l-th layer, and 1 is the n-element column vector with elements of unity. In each iteration step, w_dry,l is recalculated from the retrieved surface pressure and H2O concentration profile.
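The conversion from a retrieved profile to column quantities can be sketched as below. This assumes that h collects the partial dry air columns w_dry,l, that the retrieved profile is given as layer-mean dry-air mole fractions, and that the standard forms V = hᵀx, X = hᵀx / hᵀ1, and σ = (hᵀ S h)^{1/2} / hᵀ1 (Rodgers and Connor, 2003) apply; the numerical values are hypothetical.

```python
import math
from typing import Sequence


def column_abundance(x_profile: Sequence[float],
                     w_dry: Sequence[float]) -> float:
    """V = h^T x: sum of layer mole fraction times partial dry air column."""
    return sum(w * x for w, x in zip(w_dry, x_profile))


def column_mean_mole_fraction(x_profile: Sequence[float],
                              w_dry: Sequence[float]) -> float:
    """X = h^T x / h^T 1: column abundance over total dry air column."""
    return column_abundance(x_profile, w_dry) / sum(w_dry)


def column_mean_error(cov: Sequence[Sequence[float]],
                      w_dry: Sequence[float]) -> float:
    """sigma = sqrt(h^T S h) / h^T 1 for one error covariance component."""
    n = len(w_dry)
    quad = sum(w_dry[i] * cov[i][j] * w_dry[j]
               for i in range(n) for j in range(n))
    return math.sqrt(quad) / sum(w_dry)


# Four equal layers (molecules cm^-2) and a uniform 390 ppm profile.
w_dry = [5.0e24] * 4
profile = [390.0e-6] * 4
cov = [[(2.0e-6) ** 2 if i == j else 0.0 for j in range(4)] for i in range(4)]
print(column_mean_mole_fraction(profile, w_dry))
print(column_mean_error(cov, w_dry))
```

For uncorrelated layer errors of equal size, the column-mean error is reduced by the square root of the number of layers, which is why the off-diagonal terms of the covariance matrix matter in practice.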

State vector
The current retrieval algorithm uses the TANSO-FTS SWIR spectra within three selected wavenumber regions: 12950 to 13200 cm⁻¹ (hereafter referred to as the O2 sub-band), 6180 to 6380 cm⁻¹ (CO2 sub-band), and 5900 to 6150 cm⁻¹ (CH4 sub-band). In addition to the CO2 and CH4 concentration profiles, the H2O concentration profile, the ground surface albedo for land, the surface wind speed and the radiance adjustment factor for ocean, aerosol optical depth (AOD), surface pressure, temperature profile bias, and the wavenumber dispersion are included in the state vector as auxiliary parameters and retrieved simultaneously. The state vector, its a priori, and the a priori VCM are summarized in Table 1 and are described below.
The atmosphere is divided into 15 layers from the surface to 0.1 hPa with a constant pressure difference, and the average gas concentration for each layer is retrieved. The a priori profiles of CO2 and CH4 are calculated for every observed day by an offline global atmospheric transport model developed by the National Institute for Environmental Studies (NIES; hereafter referred to as NIES TM; Maksyutov et al., 2008). Eguchi et al. (2010) evaluated the realistic uncertainties of CO2 and CH4 concentration profiles using NIES TM data of the past few years and observation-based reference data. These are summarized as VCMs for each month and each grid box (0.5 × 0.5 degrees). The global distribution pattern of the available reference data was also considered in evaluating the uncertainties of those gases. To avoid unexpectedly strong constraints from the a priori values and to gain as much information from the observed spectra as possible, the original CO2 and CH4 VCMs were multiplied by a factor of 10² and used in the retrieval analysis. This factor is a tentative value and should be tuned in the near future. The H2O profile of the grid point value (GPV) objective analysis data (temporal resolution of 6 h, horizontal resolution of 0.5 degrees, and 21 vertical levels) provided by the Japan Meteorological Agency (JMA) was used as the a priori. The corresponding VCM was also prepared for each month and grid box using the GPV data of the past few years.
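The retrieval vertical grid described above (15 layers of equal pressure thickness between the surface and 0.1 hPa) can be constructed as in the following sketch. The surface pressure of 1013.25 hPa is an example value; in the retrieval it varies from scene to scene.

```python
def layer_boundaries_hpa(surface_pressure_hpa: float,
                         top_pressure_hpa: float = 0.1,
                         n_layers: int = 15) -> list:
    """Pressure at the n_layers + 1 boundaries, from the surface upward."""
    thickness = (surface_pressure_hpa - top_pressure_hpa) / n_layers
    return [surface_pressure_hpa - i * thickness for i in range(n_layers + 1)]


boundaries = layer_boundaries_hpa(1013.25)
for lower, upper in zip(boundaries[:-1], boundaries[1:]):
    print(f"{lower:8.2f} hPa to {upper:8.2f} hPa")
```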
The radiance level of the observed spectrum was highly variable according to the target reflectance, the solar zenith angle, and the satellite viewing angle. Although the last two angles are easily determined, the target reflectance varies with surface type and should be retrieved. In the retrieval analysis over land, the ground surface was assumed to be Lambertian, and the surface albedo was adopted as an auxiliary parameter. To represent the spectral structure within the retrieval sub-band, the surface albedo was assumed to be represented by several grid-point values and to vary linearly from one grid value to the next. The distance to the adjacent grid point was set to 25 cm⁻¹ for the CO2 and CH4 sub-bands, and 250 cm⁻¹ for the O2 sub-band. The official Moderate Resolution Imaging Spectroradiometer (MODIS) land surface albedo product (MOD43B3; Schaaf et al., 2002) was used as the a priori data for the surface albedo. The original MODIS product was given as the 5-yr average from 2000 to 2004 at 10 MODIS wavelengths for 23 sixteen-day periods per year, with a 1-minute spatial resolution. Thinned-out MODIS measured surface albedos with 1-degree resolution at wavelengths of 0.858, 1.64, and 2.13 µm were prepared for the a priori surface albedo database for TANSO-FTS Bands 1 (O2 sub-band), 2 (CO2 and CH4 sub-bands), and 3 (not used in this paper), respectively. The a priori variance of surface albedo was set to (1.0)².

The assumption of a Lambertian surface is not adequate for an ocean surface. The bidirectional reflectance distribution function for an ocean surface is calculated based on the slope probability distribution function proposed by Cox and Munk (1954). The Cox-Munk assumption parameterizes the reflectance of the ocean surface over the entire spectral range with a single parameter, the surface wind speed. However, if the relative radiometric calibration accuracy between TANSO-FTS Bands 1 and 2 is not sufficient, one surface wind speed parameter cannot represent the reflectance levels of the entire TANSO-FTS spectral range. We use a factor that adjusts this difference between TANSO-FTS bands. This factor (hereafter referred to as the radiance adjustment factor) and the surface wind speed are adopted as auxiliary parameters for the ocean cases. The surface wind speed of the GPV is used as the a priori, and its variance was calculated for each month and grid box of GPV data. An a priori value of unity and a variance of (0.05)² are used for the radiance adjustment factor.
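The piecewise-linear surface albedo over land can be sketched as follows. The sketch assumes that grid points are placed at both edges of each sub-band and at the stated spacing in between; under that assumption the three sub-bands give 9 + 11 + 2 = 22 grid points, which agrees with the 22 surface albedo elements listed in Table 1. The albedo values are hypothetical.

```python
def albedo_grid(low_cm1: float, high_cm1: float, spacing_cm1: float) -> list:
    """Grid-point wavenumbers from the lower to the upper sub-band edge."""
    n_intervals = round((high_cm1 - low_cm1) / spacing_cm1)
    return [low_cm1 + i * spacing_cm1 for i in range(n_intervals + 1)]


def interpolate_albedo(nu: float, grid: list, values: list) -> float:
    """Linear interpolation of the albedo between adjacent grid points."""
    for i in range(len(grid) - 1):
        if grid[i] <= nu <= grid[i + 1]:
            weight = (nu - grid[i]) / (grid[i + 1] - grid[i])
            return values[i] + weight * (values[i + 1] - values[i])
    raise ValueError("wavenumber outside the sub-band")


co2_grid = albedo_grid(6180.0, 6380.0, 25.0)
ch4_grid = albedo_grid(5900.0, 6150.0, 25.0)
o2_grid = albedo_grid(12950.0, 13200.0, 250.0)
print(len(co2_grid), len(ch4_grid), len(o2_grid))

# Hypothetical albedo values rising linearly across the CO2 sub-band.
co2_values = [0.20 + 0.10 * i for i in range(len(co2_grid))]
print(interpolate_albedo(6192.5, co2_grid, co2_values))
```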
As stated above, aerosol scattering modifies the equivalent optical path length, which may lead to large errors in the retrieved column amounts. To represent the equivalent optical path modification, AOD and the surface pressure are included in the state vector. The model-calculated AOD and the GPV surface pressure are used as a priori values, and the corresponding variances are set to (1.0)² and (20 hPa)², respectively. The aerosol optical properties are calculated for every observed day by an offline three-dimensional aerosol transport model, the Spectral Radiation-Transport Model for Aerosol Species (SPRINTARS; Takemura et al., 2000, 2002). SPRINTARS calculates the mass concentration distribution of soil dust, carbonaceous, sulfate, and sea-salt aerosols. The optical depth, single-scattering albedo, and phase function of aerosols are calculated taking into account the size distribution and composition. The calculated single-scattering albedo and phase function are treated as fixed model parameters. The aerosols are assumed to be uniformly distributed within a 2-km layer from the surface. Takemura et al. (2002) compared the optical depth and single-scattering albedo simulated by SPRINTARS with those obtained by satellite and ground-based observation networks using a sunphotometer. The mean difference between the simulation and observations was less than 30% for the optical depth and less than 0.05 for the single-scattering albedo in most regions. In the future, we plan to use aerosol optical properties derived from TANSO-CAI. However, we do not use them at present because the calibration accuracy of the TANSO-CAI is still insufficient, and the retrieved aerosol properties themselves need to be validated.
It is well known that the temperature dependency of the line parameters of the O2 A band is large. The differences between true and assumed temperature profiles may cause some errors in the retrieved dry-air mole fractions. In our retrieval analysis, temperature profiles from GPV data are used. A constant bias from the GPV temperature profile is retrieved simultaneously with the column amounts. The a priori value and variance are set to 0.0 K and (20 K)², respectively.
The TANSO-FTS wavenumber axis is slightly variable due to self-apodization and the temperature dependency of the sampling laser's wavelength. During the TANSO-FTS SWIR retrieval, this wavenumber dispersion is corrected twice. First, the correction factors for each retrieval sub-band are determined in such a way that the cross-correlation between the observed and reference spectra is maximized. The corrected wavenumber axis ν_cor is expressed in terms of a correction factor ρ times the observed spectral wavenumber ν_obs, as

$$ \nu_{\mathrm{cor}} = \rho \, \nu_{\mathrm{obs}}. $$

The spectral channels used in the retrieval analysis are selected based on this wavenumber axis. Then, a very small dispersion Δρ to the correction factor ρ is retrieved with the column amounts to obtain the optimal solution under the best-fitted spectra, and the wavenumber axis for the retrieval analysis is expressed in terms of ρ, Δρ, and ν_obs. The a priori value and variance for Δρ are set to 0.0 and (10⁻⁵)², respectively.
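The first correction step, in which a multiplicative factor is chosen to maximize the cross-correlation with a reference spectrum, can be illustrated with a simple grid search. Everything in this sketch is hypothetical: the reference spectrum with two Gaussian lines, the channel grid, the search range, and the step size.

```python
import math


def reference_spectrum(nu):
    """Hypothetical reference spectrum with two Gaussian absorption lines."""
    line1 = 0.8 * math.exp(-((nu - 6200.0) / 0.3) ** 2)
    line2 = 0.5 * math.exp(-((nu - 6203.0) / 0.3) ** 2)
    return 1.0 - line1 - line2


def correlation(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def find_correction_factor(nu_obs, observed, half_width=1.0e-4, step=1.0e-6):
    """Grid search for the factor rho that maximizes the correlation."""
    n_steps = round(half_width / step)
    best_rho, best_corr = 1.0, -2.0
    for k in range(-n_steps, n_steps + 1):
        rho = 1.0 + k * step
        model = [reference_spectrum(rho * nu) for nu in nu_obs]
        corr = correlation(observed, model)
        if corr > best_corr:
            best_rho, best_corr = rho, corr
    return best_rho


# Nominal channel grid and a synthetic observation with a known stretch.
nu_obs = [6195.0 + 0.2 * i for i in range(51)]
rho_true = 1.0 + 5.0e-5
observed = [reference_spectrum(rho_true * nu) for nu in nu_obs]
rho = find_correction_factor(nu_obs, observed)
print(f"estimated correction factor: {rho:.6f}")
```

A grid search is used here only for transparency; the residual dispersion that remains after this step is what the retrieval then fits as an element of the state vector.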

Jacobian
Throughout the optimal estimation process, the calculation of the forward model and the Jacobian matrix is the most time-consuming procedure. To suppress the computational cost, the fast radiative transfer model proposed by Duan et al. (2005) is used as the forward model. This model calculates the single-scattering radiance accurately and the multiple-scattering radiance approximately, based on the equivalence theorem with a double-k distribution approach. By utilizing it, we obtain the Jacobian for the absorption optical depth and that for the surface albedo.

First, we briefly introduce this fast radiative transfer model, along with our extensions to take into account the wavenumber dependencies of surface albedo and cloud/aerosol single-scattering properties. In the double-k approach, the multiple-scattering components are parameterized as in Eq. (17) (Duan et al., 2005) by two optical depths: k, the total optical depth due to the gaseous absorption, and the absorption optical depth from the TOA to a layer where substantial scattering occurs. In this parameterization, ξ0(k) is the average value of ξ for each fixed k, and g(k) and β(k) are the interpolated parameters. Values for g(k), β(k), and ξ0(k) are calculated and tabulated from accurately calculated multiple-scattering radiances for a small set of k values in k space.
A scalar radiative transfer code based on the discrete ordinate method (Siewert, 2000) is used to calculate the accurate multiple-scattering radiances.
To include the wavenumber dependency of the single-scattering properties of clouds and aerosols, two sets of g(k), β(k), and ξ0(k) tables are prepared for the single-scattering properties at each end of the wavenumber range of interest. Here, we assume that the single-scattering properties change monotonically with wavenumber. The radiance at arbitrary wavenumbers is interpolated from the radiances calculated by each set of tables.
The wavenumber dependency of the surface albedo is treated as follows. If the ground surface is assumed to be Lambertian, the satellite-observed radiance $I_\alpha$ for a surface albedo of $\alpha$ is (Liou, 2002)

$$I_\alpha = I_0 + \frac{\alpha}{\alpha_m}\,\frac{1 - \alpha_m r}{1 - \alpha r}\,\left(I_{\alpha_m} - I_0\right), \tag{19}$$

where $\alpha_m$ is the mean surface albedo within the wavenumber range of interest, $I_{\alpha_m}$ and $I_0$ are the satellite-observed radiances with surface albedos of $\alpha_m$ and zero, respectively, and $r$ is the spherical albedo. The multiple-scattering component of $I_\alpha$ can be calculated by tabulating the multiple-scattering components of $I_{\alpha_m}$ and $I_0$.
The Jacobians for absorption optical depth and surface albedo are given as the derivatives of Eqs. (17) and (19), respectively. We consider an atmosphere of n layers, and let the gaseous absorption optical depth of the l-th layer equal $\tau_l$. The total absorption optical depth and its derivatives are written as

$$k = \sum_{l=1}^{n} \tau_l, \qquad \frac{\partial k}{\partial \tau_l} = 1,$$

so the derivatives of Eq. (17) with respect to $\tau_l$ follow from the chain rule. Here, we neglect the k-dependency of β. The absorption optical depth from the TOA to the $l_{\mathrm{sca}}$-th layer, where substantial scattering occurs, is likewise determined by the $\tau_l$ of the overlying layers, and its derivative with respect to $\tau_l$ follows accordingly.

The Jacobian for the surface albedo can be obtained from Eq. (19) as

$$\frac{\partial I_\alpha}{\partial \alpha} = \frac{1 - \alpha_m r}{\alpha_m \left(1 - \alpha r\right)^2}\,\left(I_{\alpha_m} - I_0\right).$$

The Jacobians for other retrieval parameters are calculated numerically by taking the difference between the nominal and perturbed radiances. A Jacobian example is shown in Fig. 3.

Fig. 3. Jacobians (scaled to the same amplitude) of common retrieval parameters for both land and ocean (a), only for land (b), and only for ocean (c). Jacobians of CO2, CH4, and H2O profiles are plotted every other layer. The vertical layer is ordered from the top of the atmosphere ("L01") to the surface ("L15"). "AOD", "P_SRF", "ΔT", "Δρ", "Alb.", "Wind", and "Adj." indicate the aerosol optical depth, surface pressure, temperature profile bias, wavenumber dispersion, surface albedo, wind speed, and adjustment factor, respectively. "(O2)", "(CO2)", and "(CH4)" indicate those parameters for O2, CO2, and CH4 sub-bands, and "(B1)" and "(B2)" indicate those parameters for TANSO-FTS Bands 1 and 2, respectively.
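The two ways of obtaining Jacobians described above can be compared in a minimal sketch. It assumes the standard Lambertian-surface relation, in which the surface contribution scales as α/(1 − αr), rescaled so that only the radiances at albedo α_m and at zero albedo are needed. The function names, the relative step size, and the numerical values are hypothetical choices for illustration, not the operational settings.

```python
import numpy as np

def radiance_for_albedo(alpha, alpha_m, i_alpha_m, i_zero, r_sph):
    """Radiance over a Lambertian surface of albedo alpha, rescaled from the
    radiances at albedo alpha_m (i_alpha_m) and at zero albedo (i_zero)."""
    scale = (alpha / alpha_m) * (1.0 - alpha_m * r_sph) / (1.0 - alpha * r_sph)
    return i_zero + scale * (i_alpha_m - i_zero)

def albedo_jacobian(alpha, alpha_m, i_alpha_m, i_zero, r_sph):
    """Analytic derivative of radiance_for_albedo with respect to alpha."""
    return ((i_alpha_m - i_zero) * (1.0 - alpha_m * r_sph)
            / (alpha_m * (1.0 - alpha * r_sph) ** 2))

def numerical_jacobian(forward, state, rel_step=1.0e-6):
    """Forward-difference Jacobian: (perturbed - nominal) / step, one column per
    state element. The step size is an assumed value."""
    state = np.asarray(state, dtype=float)
    nominal = np.atleast_1d(forward(state))
    jac = np.empty((nominal.size, state.size))
    for j in range(state.size):
        step = rel_step * max(abs(state[j]), 1.0)
        perturbed = state.copy()
        perturbed[j] += step
        jac[:, j] = (np.atleast_1d(forward(perturbed)) - nominal) / step
    return jac

# Made-up radiances for two spectral channels.
alpha_m, r_sph = 0.3, 0.1
i_zero = np.array([0.20, 0.05])
i_alpha_m = np.array([1.00, 0.40])
alpha = 0.25

analytic = albedo_jacobian(alpha, alpha_m, i_alpha_m, i_zero, r_sph)
numeric = numerical_jacobian(
    lambda x: radiance_for_albedo(x[0], alpha_m, i_alpha_m, i_zero, r_sph),
    [alpha],
)[:, 0]
print(analytic, numeric)
```

The analytic form avoids an extra forward-model call per parameter, which is presumably why it is preferred where available; the difference form is generic and works for any state-vector element.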

Quality control
Finally, the quality of the retrieved state is checked. The trace of the matrix A_xx gives the degrees of freedom for signal (DFS), which describes the number of independent quantities obtainable from the observation. The retrieved VCO2 or VCH4 is therefore dismissed when the DFS of the target gas is less than unity, because the observed TANSO-FTS spectrum then does not contain enough information to retrieve it, and the retrieved value depends on the a priori value. When the discrepancy between the forward model and the observed spectrum is significant, the mean squares of the residual spectra (MSR) for each retrieval sub-band and the χ² values are large. The retrieved results are removed when the MSR is greater than 3 or the χ² for the retrieved state is greater than 5. As stated above, the equivalent optical path length modification due to the scattering of aerosols is evaluated by simultaneously retrieving the surface pressure and AOD. Because some aerosol parameters are fixed in the retrieval analysis, the evaluated equivalent optical path length cannot always represent a physically meaningful value when the AOD is relatively large. Therefore, the retrieval results of VCO2 and VCH4 are dismissed when the retrieved AOD, defined at a wavelength of 1.6 µm, is larger than 0.5. This threshold value is tentative and should be tuned under further investigation.

Fig. 4. Monthly fraction of the TANSO-FTS scenes suitable for retrieval analysis (i.e., measurement scenes that passed all data-screening items described in Sect. 3) within a 2.5 × 2.5-degree grid box. A blank indicates that no valid observations were available within the grid box.
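The screening criteria of this section can be summarized in a short sketch. The `RetrievalResult` container and its field names are hypothetical, and applying the MSR threshold to every sub-band individually is an assumption; only the threshold values (DFS below 1, MSR above 3, χ² above 5, AOD above 0.5) come from the text.

```python
from dataclasses import dataclass

@dataclass
class RetrievalResult:
    """Hypothetical container for the diagnostics of one retrieval."""
    dfs: float    # degrees of freedom for signal of the target gas
    msr: dict     # mean squares of the residual spectra, keyed by sub-band
    chi2: float   # chi-square of the retrieved state
    aod: float    # retrieved aerosol optical depth at 1.6 um

def failed_checks(result, dfs_min=1.0, msr_max=3.0, chi2_max=5.0, aod_max=0.5):
    """Return the names of the quality checks that the retrieval fails."""
    failed = []
    if result.dfs < dfs_min:
        failed.append("DFS")
    if any(value > msr_max for value in result.msr.values()):
        failed.append("MSR")
    if result.chi2 > chi2_max:
        failed.append("chi2")
    if result.aod > aod_max:
        failed.append("AOD")
    return failed

good = RetrievalResult(dfs=1.4, msr={"O2": 1.1, "CO2": 0.9, "CH4": 1.0}, chi2=1.2, aod=0.1)
hazy = RetrievalResult(dfs=1.3, msr={"O2": 3.6, "CO2": 1.0, "CH4": 1.1}, chi2=2.0, aod=0.8)
print(failed_checks(good))  # no failed checks
print(failed_checks(hazy))  # fails the MSR and AOD checks
```

Returning the list of failed checks rather than a single flag makes it easy to tabulate which criterion removes the most scenes, which is useful when tuning the tentative AOD threshold.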
6 Results and discussion

Availability of measurement scenes for retrieval analysis
Clouds are major sources of disturbance for the TANSO-FTS SWIR measurements. Before the launch of the satellite, the probability of a clear-sky occurrence was evaluated from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) cloud-layer product and MODIS cloud mask data (Eguchi and Yokota, 2008). Due to the high sensitivity of CALIOP to thinner clouds, MODIS tends to overestimate the clear-sky probability. By taking the footprint size of TANSO-FTS (approximately 10.5-km diameter) into account, the annual mean clear-sky probability for 10-km aggregation CALIOP data of the year 2007 was about 11% of the globe.
The monthly cloud-free fraction of the TANSO-FTS SWIR, i.e., the ratio between the number of measurement scenes that passed all cloud-detection algorithms described in Sect. 3.1 and the number of total measurement scenes, was about 7% on average. Part of this discrepancy may have been due to the different footprint shapes, which were 10.5-km-diameter circles and 0.09 × 10-km rectangles for TANSO-FTS and CALIOP, respectively. If the sensitivities of the CALIOP and TANSO-FTS 2-µm band to thin cirrus are comparable, the clear-sky probability for TANSO-FTS is expected to be lower than that for CALIOP due to this footprint-size difference. The occurrence of cloud-free areas at high latitudes in the winter hemisphere is low, mainly due to the TANSO-CAI cloud flag test (figure not shown). Further optimization of the cloud detection methods seems to be required.
Figure 4 shows the integrated data screening results, i.e., the TANSO-FTS measurement scenes that passed all screening items described in Sect. 3 and hence were considered suitable for retrieval analysis. This fraction was almost the same as the cloud-free fraction over land. However, measurement scenes were limited around the sun-glint region over the ocean. As mentioned in Sect. 3.2, the IMC-fluctuated scans should be removed. One way to detect such scans is by checking the relative magnitude of the interferogram to the center burst. For low-reflectance regions, such as those over the ocean, the center burst of the interferogram is hard to detect. Such scans are tagged "IMC-fluctuated" regardless of the true IMC motion and are removed in the data screening process.
The latitudinal range for retrieval moves north or south according to the seasonal change of solar declination. The region where the retrieval results are obtainable throughout the year is restricted to land, within a latitudinal range of 45° S to 45° N. No region over the ocean contains year-round retrievals because of the narrow latitudinal range of the sun-glint region for TANSO-FTS. The number of measurement scenes suitable for retrieval analysis is 9371 of 292 810 for July 2009 and 7839 of 290 369 for January 2010. The monthly mean fraction of the measurement scenes suitable for retrieval analysis is about 3%.
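The quoted fraction can be checked directly from the two scene counts given above. The short calculation below uses only those numbers; averaging the two months is an illustrative simplification, not the way the monthly mean in the text was necessarily obtained.

```python
# Scene counts quoted in the text: (suitable for retrieval, total measured).
scenes = {
    "July 2009": (9371, 292810),
    "January 2010": (7839, 290369),
}

fractions = {month: suitable / total for month, (suitable, total) in scenes.items()}
mean_fraction = sum(fractions.values()) / len(fractions)

for month, fraction in fractions.items():
    print(f"{month}: {100 * fraction:.2f}%")
print(f"mean of the two months: {100 * mean_fraction:.2f}%")
```

The two months give roughly 3.2% and 2.7%, which brackets the stated value of about 3%.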

Information obtainable from measurements
The column averaging kernel functions corresponding to the spectra shown in Fig. 1 are plotted in Fig. 5. From the surface to the 200-hPa level, the column averaging kernel functions were about unity for most cases. This indicates that the a priori values within this vertical range hardly affected the retrieved values. The shaded region in Fig. 5 shows the typical range of the column averaging kernel. The column averaging kernel function depended on many parameters, such as solar zenith angle, satellite viewing angle, surface reflectance, AOD, and VCM. Because the uncertainty in the a priori CO2 concentration in the stratosphere is large, whereas that of CH4 is small (Eguchi et al., 2010), the relatively large sensitivity of CO2 concentration to the measurement noise in the upper atmosphere led to variability in the CO2 column averaging kernel functions that was larger than that of CH4.
As stated above, the DFS indicates the number of independent variables obtainable from the observations. In other words, DFS is an index that identifies whether the observation yields sufficient information. The effectiveness of the retrieval results can be evaluated by the uncertainty reduction

$$\text{uncertainty reduction} = 1 - \frac{\sigma_{\text{a posteriori}}}{\sigma_{\text{a priori}}},$$

where $\sigma_{\text{a priori}}$ denotes the a priori uncertainty calculated from $S_a$ and Eq. (12), and $\sigma_{\text{a posteriori}}$ is the a posteriori uncertainty defined as the root-sum-square of the measurement noise and the smoothing error. Figure 6 shows the global distribution of monthly-averaged SNR, DFS, a priori and a posteriori uncertainties, and the uncertainty reduction for single observations of XCO2 and XCH4. To make effective observations, the measurement noise should be sufficiently smaller than the radiance response corresponding to the a priori uncertainty of the target gas concentration. For XCO2, the SNR of South Africa is higher than that of Siberia, but the DFSs are almost the same for these regions because the a priori uncertainty of South Africa is smaller than that of Siberia. [...] estimation for those gases. Although the data availability is limited due to the small number of cloud-free measurements around such regions, the GOSAT measurement is expected to improve current knowledge and reduce the large uncertainty of carbon sources and sinks.
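The diagnostics discussed above can be sketched for a linear retrieval using the standard optimal-estimation expressions. The Jacobian `K`, the covariances, and the column-weighting vector `h` below are made-up inputs, and the uncertainty reduction is taken as one minus the ratio of a posteriori to a priori uncertainty, which is a common definition assumed here. The sketch also uses the trace of the full averaging kernel, whereas the text uses only the sub-block for the target gas.

```python
import numpy as np

def optimal_estimation_diagnostics(K, S_a, S_e):
    """Averaging kernel and error covariances for a linear optimal-estimation retrieval."""
    S_e_inv = np.linalg.inv(S_e)
    S_a_inv = np.linalg.inv(S_a)
    S_hat = np.linalg.inv(K.T @ S_e_inv @ K + S_a_inv)  # a posteriori covariance
    G = S_hat @ K.T @ S_e_inv                           # gain matrix
    A = G @ K                                           # averaging kernel matrix
    S_m = G @ S_e @ G.T                                 # measurement-noise covariance
    eye = np.eye(A.shape[0])
    S_s = (A - eye) @ S_a @ (A - eye).T                 # smoothing-error covariance
    return A, S_hat, S_m, S_s

def uncertainty_reduction(sigma_prior, sigma_post):
    """Assumed definition: fraction by which the a priori uncertainty is reduced."""
    return 1.0 - sigma_post / sigma_prior

# Made-up example: 5 state elements observed through 40 spectral channels.
rng = np.random.default_rng(0)
n_state, n_chan = 5, 40
K = rng.normal(size=(n_chan, n_state))
S_a = np.diag(np.full(n_state, 4.0))
S_e = np.diag(np.full(n_chan, 1.0))

A, S_hat, S_m, S_s = optimal_estimation_diagnostics(K, S_a, S_e)
dfs = np.trace(A)

h = np.full(n_state, 1.0 / n_state)         # hypothetical column-weighting vector
sigma_prior = np.sqrt(h @ S_a @ h)
sigma_post = np.sqrt(h @ (S_m + S_s) @ h)   # root-sum-square of noise and smoothing
reduction = uncertainty_reduction(sigma_prior, sigma_post)
print(f"DFS = {dfs:.2f}, uncertainty reduction = {reduction:.2f}")
```

In the linear case the sum of the measurement-noise and smoothing-error covariances equals the a posteriori covariance, so the root-sum-square used here is the same quantity as the a posteriori uncertainty.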

Retrieval results
Before discussing the retrieval results, we briefly describe the MAP iteration. The solution converges in fewer than six iterations for more than half of the measurement scenes. About 1.5% of the measurement scenes do not converge. For the converged scenes, the residual spectra are sufficiently small in general (see Figs. 7, 8, and 9), and only 0.6% of the measurement scenes fail the χ² test. These nonconverged or large-residual scenes tend to fail the fit in the O2 sub-band, indicating that they might be contaminated by undetected clouds or high-altitude aerosols that are not taken into account in the current forward model. Focusing on the residual spectra, several spectral points show large residuals, probably due to errors in the spectroscopic databases. Furthermore, systematic residuals remain in the O2 sub-band, which may be attributable to differences in the O2 absorption line shape and/or the ILSF of TANSO-FTS Band 1 (note that the ILSF in the shorter wavelength region is more sensitive to optical alignment and hard to determine accurately).
The global distributions and latitudinal distributions of zonal averages of retrieval results and simulated results using the NIES TM, which is also used for a priori profiles of CO [...] than those over land. Although the elements of the state vector are different for land cases and ocean cases, no clear gaps are found across the coastline. The seasonal variation of XCO2, which is controlled mainly by photosynthesis and respiration in terrestrial ecosystems, is clear. In the northern hemisphere, XCO2 decreases in the northern summer as the northern terrestrial ecosystem becomes active, and it increases after November. Unlike XCO2, the retrieved XCH4 values in the northern hemisphere are higher than those in the southern hemisphere throughout the year because the natural and anthropogenic sources of CH4 are mainly distributed there. The zonal-averaged XCH4 is at a maximum over the latitudinal region from 15 to 30° N from June to October and from 0 to 15° N in other months. This is due to the seasonal variations of the strengths of sources and sinks of CH4.

Table 2 summarizes the retrieval errors of XCO2 and XCH4 calculated from Eqs. (4) to (6). These values correspond to the precision of the retrieval result. The measurement noise is the dominant error component, and the interference error arising from the auxiliary parameters is small. Therefore, the total retrieval errors show patterns similar to the a posteriori uncertainties shown in Fig. 6. The total retrieval errors decrease with SNR. Because the variability of SNR over land is larger than that over the ocean, both maximum and minimum values appear over land, and the standard deviations of each error component over land are larger than those over the ocean. The fraction of the smoothing error to the total retrieval error in XCO2 is larger than that for XCH4 because of the difference in the relative magnitude of the a priori uncertainties in the stratosphere (Eguchi et al., 2010). In most cases, the retrieval precisions are estimated to be smaller than 3.5 ppm for XCO2 and 15 ppb for XCH4.
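As a rough consistency check, the precisions quoted above can be converted to relative values. The background columns of 385 ppm for XCO2 and 1750 ppb for XCH4 are assumed representative values for 2009 to 2010, not numbers taken from this paper, and the quadrature sum in `total_error` is the usual way of combining independent error components, assumed here.

```python
import math

def total_error(measurement_noise, smoothing_error, interference_error):
    """Combine independent error components in quadrature (root-sum-square)."""
    return math.sqrt(measurement_noise ** 2 + smoothing_error ** 2 + interference_error ** 2)

def relative_precision_percent(error, column_mean):
    """Express an absolute retrieval error as a percentage of the column mean."""
    return 100.0 * error / column_mean

XCO2_BACKGROUND_PPM = 385.0    # assumed representative value
XCH4_BACKGROUND_PPB = 1750.0   # assumed representative value

co2_percent = relative_precision_percent(3.5, XCO2_BACKGROUND_PPM)
ch4_percent = relative_precision_percent(15.0, XCH4_BACKGROUND_PPB)
print(f"XCO2: {co2_percent:.2f}%  XCH4: {ch4_percent:.2f}%")
```

Under these assumptions both upper bounds correspond to a little under 1% of the column, in line with the single-observation precision stated in the abstract.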
The retrieval accuracy, i.e., the systematic error of the retrieval, is another issue. It depends on the accuracies of the instrumental calibration, the forward model, and the assumed model parameters. Detailed discussion of the systematic error should be based on validation results, which are beyond the scope of this paper. A preliminary evaluation of the systematic error is presented in another paper (Morino et al., 2010).

Summary
The retrieval algorithm for column abundances of CO2 and CH4 from GOSAT TANSO-FTS SWIR observations in cloud-free scenes and its results have been presented. The cloud-free scenes are identified using the TANSO-CAI cloud flag data and two additional cloud-detection methods that compensate for weaknesses in the TANSO-CAI cloud flag. The retrieval algorithm is based on an optimal estimation method. To decrease the effects of polarization, a total intensity spectrum is generated from two orthogonally polarized measurement spectra. A fast scalar radiative transfer code using a double-k distribution approach is used as a forward radiative transfer model to reduce computational costs. To avoid possible bias errors, spectral channels contaminated with solar Fraunhofer lines are removed from the retrieval analysis. Furthermore, other possible interference parameters are included in the state vector, to be retrieved simultaneously with the gas concentrations. The interhemispherical differences and the temporal variations of retrieved XCO2 and XCH4 show patterns similar to those simulated using the NIES TM, although biases exist, and the amplitudes of retrieval results are larger than those of the NIES TM. The retrieval precision of the column abundances for single observations is estimated to be within about 1%. The major error source for the retrieval precision is instrumental noise, while the interference error due to auxiliary parameters is relatively small. Although the available data over the ocean are limited around the sun-glint region, there is no clear gap in the retrieval results across coastlines, and the expected retrieval precision over the ocean is comparable to that over land. A remarkable reduction in uncertainty is expected over Siberia and South America, where continuous measurements have been limited, and hence uncertainties in sources and sinks are large. However, the retrieval algorithm must be further refined to improve its accuracy. Furthermore, intercomparison of retrieval results based on several kinds of retrieval algorithms with actual measurement data is important to elucidate the characteristics of each algorithm.

www.atmos-meas-tech.net/4/717/2011/ Atmos. Meas. Tech., 4, 717-734, 2011

Fig. 1. Examples of the GOSAT TANSO-FTS SWIR spectra (P-polarization component). Horizontal gray bars indicate the locations of absorption bands of major molecules in the atmosphere.

Fig. 2. Schematic diagram of the retrieval processing flow of TANSO-FTS SWIR. "DFS," "MSR," and "AOD" indicate the degrees of freedom for signal, mean squares of the residual spectra for each retrieval sub-band, and aerosol optical depth, respectively.

Fig. 5. Examples of column averaging kernel functions for CO2 (a) and CH4 (b). The shaded region shows the typical range of the column averaging kernel.

Fig. 6. Global distributions of SNR (a), DFS (b and c), a priori uncertainty (d and g), a posteriori uncertainty (e and h), and uncertainty reduction (f and i) of XCO2 and XCH4 retrievals for July 2009.

Fig. 7. Observed and fitted spectra and residuals for the O2 sub-band (a), the CO2 sub-band (b), and the CH4 sub-band (c). Measurement was conducted over the Pacific Ocean (24.9° N, 135.6° E) on 1 July 2009. Black dots in the residual plots indicate the channels used in the retrieval analysis, i.e., not contaminated by Fraunhofer lines.

Fig. 8. Observed and fitted spectra and residuals for the O2 sub-band (a), the CO2 sub-band (b), and the CH4 sub-band (c). Measurement was conducted over the Sahara desert (21.7° N, 12.1° E) on 16 July 2009. Black dots in the residual plots indicate the channels used in the retrieval analysis, i.e., not contaminated by Fraunhofer lines.

Fig. 9. Observed and fitted spectra and residuals for the O2 sub-band (a), the CO2 sub-band (b), and the CH4 sub-band (c). Measurement was conducted over western Siberia (60.7° N, 54.1° E) on 16 July 2009. Black dots in the residual plots indicate the channels used in the retrieval analysis, i.e., not contaminated by Fraunhofer lines.

Fig. 10. Monthly averages of the XCO2 [ppmv] retrieved by GOSAT (a, c) and simulated by the NIES TM (b, d) within a 2.5 × 2.5-degree grid box. A blank indicates that no valid retrieval result was available within the grid box. Different color scales are used for the GOSAT retrieval and NIES TM simulation.

Fig. 11. Monthly averages of the XCH4 [ppbv] retrieved by GOSAT (a, c) and simulated by the NIES TM (b, d) within a 2.5 × 2.5-degree grid box. A blank indicates that no valid retrieval result was available within the grid box. Different color scales are used for the GOSAT retrieval and NIES TM simulation.

Fig. 12. Latitudinal distributions of the zonal mean of the retrieved and simulated XCO2 [ppmv] (a, b) and XCH4 [ppbv] (c, d). The standard deviations of zonal variation for July 2009 and January 2010 are plotted as error bars.

Table 1. State vector, its a priori, and the VCM for the TANSO-FTS SWIR retrieval. "SR" indicates data obtained by semi-real time processing, and "DB" indicates data prepared as monthly databases.

Table 2. Statistics of the retrieval errors of XCO2 (a) and XCH4 (b). Measurement noise, smoothing error, and interference error are calculated by Eqs. (4) to (6). The total error is the root-sum-square of these error components.

(a) XCO2 error [ppmv]; columns: Land (Av., Std., Max., Min.) and Ocean (Av., Std., Max., Min.).