Assessing temporal clear-sky errors in assimilation of satellite CO2 retrievals using a global transport model

The Orbiting Carbon Observatory (OCO) and the Greenhouse gases Observing SATellite (GOSAT) will make global observations of the total column dry-air mole fraction of atmospheric CO2 (XCO2) starting in 2008. Although satellites have global coverage, XCO2 retrievals will be made only a few times each month over a given location and will only be sampled in clear conditions. Modelers will use XCO2 in atmospheric inversions to estimate carbon sources and sinks; however, if satellite measurements are used to represent temporal averages, modelers may incur temporal sampling errors. We investigate these errors using a global transport model. Temporal sampling errors vary with time and location, exhibit spatially coherent patterns, and are greatest over land and during summer. These errors often exceed 1 ppm and must be addressed in a data assimilation system by correct simulation of synoptic CO2 variations associated with cloud systems.


Introduction
Atmospheric inversions, which use atmospheric CO2 concentrations and a transport model to infer carbon sources and sinks, have provided valuable information regarding large-scale surface carbon fluxes (Gurney et al., 2002; Rödenbeck et al., 2003; Baker et al., 2006b). However, as modelers move to higher-resolution fluxes, the uncertainties increase primarily due to sparse data coverage (Gurney et al., 2003; Dargaville et al., 2005). In addition to the rapidly expanding surface network, CO2 measurements from satellites will be used to quantify regional carbon sources and sinks. Studies indicate that spatially dense, global measurements of the column-integrated dry air mole fraction of atmospheric CO2 (XCO2) with precisions of ~1 ppm are expected to substantially reduce the uncertainties in the CO2 budget (Rayner and O'Brien, 2001; Baker et al., 2006a; Chevallier et al., 2007; Miller et al., 2007).
Correspondence to: K. D. Corbin (kdcorbin@atmos.colostate.edu)
Two satellites designed specifically to measure XCO2 are scheduled to launch in late 2008: the Orbiting Carbon Observatory (OCO) (Crisp et al., 2004) and the Greenhouse gases Observing SATellite (GOSAT) (NIES, 2006). Both satellites will fly in a polar sun-synchronous orbit with an equator crossing time of ~13:00 LST, collecting near-infrared spectra from reflected sunlight. OCO will orbit just ahead of the Earth Observing System (EOS) Aqua platform in the A-train, which has a 16-day repeat cycle. OCO has a 10 km-wide cross-track field of view that is divided into eight 1.25 km-wide samples with a 2.25 km down-track resolution at nadir. GOSAT's orbit is recurrent every 3 days with a varying swath width from 88 to 800 km.
Satellite XCO2 retrievals will be used in synthesis inversion and data assimilation models to quantify carbon flux estimates; however, XCO2 measurements require clear conditions and are sampled at a single instant in time. If satellite data are used to represent temporal averages, variations in atmospheric CO2 on synoptic time-scales may lead to temporal sampling errors. An observational assessment of mid-day CO2 on clear-sky days versus all days, using multiyear continuous data at two towers located in mid-latitude forests, found systematic differences of 1 to 3 ppm in CO2, with lower concentrations on sunny days than average (Corbin and Denning, 2006). The differences at both towers were greatest in the winter and were not attributable to anomalous surface fluxes. Another study used a high-resolution cloud-resolving model to analyze temporal sampling errors by comparing simulated satellite data to mean concentrations over an area equivalent to a global transport model grid column (Corbin et al., 2008). At both a temperate and a tropical site, the differences between satellite measurements and diurnally and bi-monthly averaged transport model grid column concentrations were large (>1 ppm). At the temperate site, the temporal sampling errors were negatively biased because of systematic XCO2 anomalies associated with fronts that were masked by clouds.
While Corbin and Denning (2006) and Corbin et al. (2008) both showed that clear-sky satellite concentrations underestimate the true temporal mean, both studies assessed the differences only under specific conditions. Corbin and Denning (2006) looked at continuous observations from two towers located in mid-latitude forests, and Corbin et al. (2008) focused on two simulations over limited regions for short time periods in August. In this study, we expand on previous research by investigating the clear-sky temporal sampling errors using a global atmospheric transport model. In addition to assessing clear-sky differences globally, we also investigate how these differences vary on seasonal timescales.

Model and methods
We simulated 2003 atmospheric CO2 concentrations using the Goddard Space Flight Center (GSFC) Parameterized Chemical Transport Model (PCTM) (Kawa et al., 2004). The dynamical core of PCTM is a semi-Lagrangian algorithm in flux form from Lin and Rood (1996). PCTM is driven by meteorological fields from NASA's Goddard Earth Observation System version 4 (GEOS-4) data assimilation system (DAS) (Bloom et al., 2005). PCTM was run with 1.25° by 1° horizontal resolution, 26 vertical levels up to 20.5 km, and a 7.5-min time-step with CO2 output every 3 h. For spin-up, PCTM was run for 3 years, from 2000 through 2002.
The surface fluxes of CO2 include biological fluxes, ocean fluxes, and fossil fuel emissions. Surface sources and sinks associated with the terrestrial biosphere are based on computations of hourly net ecosystem exchange from the Simple Biosphere Model version 3 (SiB3) (Sellers et al., 1996a, b; Baker et al., 2007). Ocean fluxes are adopted from Takahashi et al. (2002), and estimates of fossil fuel emissions are from Andres et al. (1996). Comparisons to a network of in-situ continuous analyzers showed that the simulation captures synoptic features well.
To assess temporal sampling differences, for each grid column in the model we compare simulated satellite concentrations to the corresponding concentrations that include all conditions. Differences between the simulated satellite data and the mean modelled concentrations are assessed on both annual and seasonal time-scales. While there are large differences in the size of the model grid cells and the OCO samples, Corbin et al. (2008) found spatial representation errors are less than 0.5 ppm, indicating that it is reasonable to simulate OCO observations from a model of this resolution.
To simulate satellite data, PCTM was sampled using the OCO methodology. First, we created a clear-sky subset of PCTM CO2 concentrations. To determine if the grid cell is clear, we used downwelling solar radiation data from GEOS-4 and created the clear-sky subset using the top-ranked data per month for each grid cell above a specified threshold value.
Simulating OCO orbit and scan geometry, Rayner et al. (2002) calculated a 26% probability that a pixel within a transport model grid cell will be clear. As cloud cover varies with location and time of year, we investigated both 15% and 40% thresholds to assess temporal sampling errors at realistic minimum and maximum coverage. Decreasing the threshold value to 15% produces more random errors with larger differences, while increasing the threshold to 40% decreases the magnitude of the differences but increases the spatial coherency. Since the main conclusions from this analysis are robust among all three thresholds, we will show the results from the 26% threshold value.
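The clear-sky selection described above can be sketched for a single grid cell as follows. This is a minimal illustration, not the code used in the study: it assumes that "top-ranked data per month" means keeping the brightest fraction of samples within each calendar month, and the function name, argument names, and rounding rule are all hypothetical.

```python
import numpy as np

def clear_sky_mask(swdown, month_index, clear_fraction=0.26):
    """Flag the brightest `clear_fraction` of samples in each month as clear.

    swdown      : 1-D array of downwelling solar radiation for one grid cell
    month_index : 1-D integer array (same length) giving the month of each sample
    Assumption: ranking is done independently within each month.
    """
    swdown = np.asarray(swdown, dtype=float)
    month_index = np.asarray(month_index)
    mask = np.zeros(swdown.shape, dtype=bool)
    for m in np.unique(month_index):
        idx = np.flatnonzero(month_index == m)
        # Number of samples to keep this month (hypothetical rounding choice).
        n_clear = int(round(clear_fraction * idx.size))
        if n_clear == 0:
            continue
        # Rank this month's samples from brightest to dimmest.
        order = np.argsort(swdown[idx])[::-1]
        mask[idx[order[:n_clear]]] = True
    return mask
```

Calling the function with clear_fraction set to 0.15 or 0.40 would reproduce the two sensitivity thresholds in the same way.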
Since OCO is not yet in orbit, we used CloudSat tracks to determine the location and timing of satellite overpasses. CloudSat, an existing satellite in the A-train constellation, is flying with a nearly identical orbit only minutes behind the proposed OCO orbit (Stephens et al., 2002). This study used CloudSat tracks from 1 through 16 January 2007, and the tracks are repeated every 16 days for the entire year; however, we only use data from the ascending branch since OCO requires sunlight. The model was sampled at the grid cell that included the satellite retrieval at the closest model hour available, using only the concentrations included in the clear-sky subset. After sampling the data, the concentrations were pressure weighted to create the OCO subset of total column CO2.
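As an illustration of the pressure-weighting step, the sketch below averages a model CO2 profile using the pressure thickness of each layer as the weight. It is a simplified assumption of how such a column mean can be formed; it ignores averaging kernels and water vapour corrections, and the function and argument names are hypothetical.

```python
import numpy as np

def pressure_weighted_column(co2_layers, p_edges):
    """Column-mean CO2 weighted by layer pressure thickness.

    co2_layers : CO2 mole fraction (ppm) in each of N model layers
    p_edges    : pressure at the N+1 layer interfaces (any consistent unit)
    """
    co2_layers = np.asarray(co2_layers, dtype=float)
    p_edges = np.asarray(p_edges, dtype=float)
    # Layer thickness in pressure; abs() makes the ordering of levels irrelevant.
    dp = np.abs(np.diff(p_edges))
    return float(np.sum(co2_layers * dp) / np.sum(dp))
```

Because the weights are pressure thicknesses, each layer contributes in proportion to the mass of air it contains, so thin upper layers count for less than thick lower ones.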
The simulated satellite data are compared to the true annual and seasonal mean total column CO2 concentrations at every grid cell, which are calculated by taking the mean of all time-steps and cloud conditions. By including both diurnal errors resulting from the time of day the satellite samples and clear-sky errors from retrieving data only in clear-sky conditions, the differences shown are directly comparable to errors that will occur in annual and seasonal mean maps produced using satellite data. Sensitivity tests to determine the impact of sampling at a specific time of day reveal that the errors on these time-scales are due primarily to clear-sky sampling. At over 99% of the grid points, the differences in the annual mean between using all time-steps and sampling only one hour per day are <0.1 ppm. On the seasonal timescale, over 98% of the grid cells have seasonal means calculated using only 13:00 LST data within 0.1 ppm of the seasonal mean including all hours, with a maximum difference of 0.3 ppm. Due to the minimal impact of sampling at a specific time of day on seasonal and annual timescales, the results shown in the next section are due primarily to sampling data in clear-sky conditions only.
Fig. 1. Annual mean temporal sampling errors, obtained by subtracting the annual mean at each grid cell from the annual mean in the OCO subset.
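The comparison between the simulated satellite subset and the true temporal mean can be sketched as below. This is an illustrative outline under stated assumptions: the column concentrations are held in an array of shape (time, lat, lon), a boolean array of the same shape marks the time steps retained in the simulated OCO subset, and grid cells with no retained samples are returned as NaN. All names are hypothetical.

```python
import numpy as np

def sampling_error(xco2, sampled):
    """Mean of the sampled subset minus the mean over all time steps.

    xco2    : column CO2 (ppm), shape (time, lat, lon)
    sampled : boolean array of the same shape, True where the simulated
              satellite retrieves (overpass and clear sky)
    """
    xco2 = np.asarray(xco2, dtype=float)
    sampled = np.asarray(sampled, dtype=bool)
    true_mean = xco2.mean(axis=0)
    n = sampled.sum(axis=0)
    subset_sum = np.where(sampled, xco2, 0.0).sum(axis=0)
    # Cells that are never sampled keep NaN instead of dividing by zero.
    subset_mean = np.full(true_mean.shape, np.nan)
    np.divide(subset_sum, n, out=subset_mean, where=n > 0)
    return subset_mean - true_mean
```

Passing a mask that selects one fixed hour per day regardless of cloud cover gives the diurnal-only difference, which is one way to separate the time-of-day effect from the clear-sky effect.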

Results
Annual mean temporal sampling errors are calculated by subtracting the annual mean total-column CO2 concentration from the annual mean concentration in the simulated OCO subset for each grid cell (Figs. 1 and 2). Differences between the satellite-retrieved annual mean and the true annual mean are small in the Southern Hemisphere and increase with latitude. Large differences (>1 ppm) occur over land and in the Northern Hemisphere. The standard deviation is ~0.8 ppm over subtropical land in the Southern Hemisphere, reflecting the large differences seen over South America. In the Northern Hemisphere, zonally averaged standard deviations greater than 1 ppm occur. Spatially coherent negative differences can be seen over southeastern North America, southern South America, the North Atlantic Ocean, and Europe. The zonal average of the annual mean differences is about −0.3 ppm in the Northern Hemisphere mid-latitudes, indicating inversions may incur a negative bias if satellite measurements are used to represent an annual mean.
We calculated seasonal temporal sampling errors incurred from using satellite measurements to represent seasonal averages by subtracting the 3-month seasonal total column CO2 PCTM concentrations for each grid cell from the seasonal mean in the OCO subset at the same grid cell (Figs. 3-5). The magnitude and location of the differences vary by season. Large differences occur during the summer: the greatest standard deviation occurs in DJF in the Southern Hemisphere and in JJA in the Northern Hemisphere. Differences also tend to be larger over land regions, likely due to the larger biospheric fluxes and fossil fuel emissions.
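The zonal means and standard deviations quoted above can be illustrated with a short sketch. It assumes the error map has shape (lat, lon), that statistics are taken across all longitudes in each latitude band, and that unsampled cells are stored as NaN; whether the study computed its zonal statistics in exactly this way is not stated, and the function name is hypothetical.

```python
import numpy as np

def zonal_statistics(error_map):
    """Zonal mean and standard deviation of a (lat, lon) sampling-error map.

    NaN entries (cells with no simulated retrievals) are ignored.
    """
    error_map = np.asarray(error_map, dtype=float)
    zonal_mean = np.nanmean(error_map, axis=1)
    zonal_std = np.nanstd(error_map, axis=1)
    return zonal_mean, zonal_std
```

A zonal mean near zero with a large standard deviation indicates errors that largely cancel around a latitude circle, whereas a nonzero zonal mean indicates a bias that does not cancel.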
The seasonal maps show coherent spatial patterns. In the Northern Hemisphere winter, significant underestimates of the mean are seen in the eastern United States and Europe, while slight overestimates are prevalent near India. The regional underestimates can be seen in the zonal mean of the errors. The transition period during MAM has relatively small errors compared to the other seasons, as the standard deviations are less than 1 ppm; however, over tropical South America the satellite measurements are higher than the seasonal mean, and over higher northern latitudes the concentrations over land are biased lower than average. In JJA, over the Southern Hemisphere and tropical oceans the errors are small and random, while over southern South America the satellite underestimates the seasonal mean in the southern half of the continent and overestimates the mean in the northern portion. Large overestimates can be seen in Asia, while underestimates can be seen over the North Atlantic. SON is also characterized by larger zonally averaged errors, particularly from regional overestimates in Asia and underestimates in South America. Calculating seasonal temporal sampling errors reveals large, spatially coherent differences between satellite measurements and temporal means that vary with time and location.

Conclusions
This study indicates that modelers cannot use satellite measurements sampled only in clear conditions to represent temporal averages. The 2003 annual mean errors calculated using PCTM are relatively small and randomly dispersed; however, the errors introduced into inversions using satellite data to represent smaller timescales such as seasonal means vary with both time and location and exhibit coherent spatial patterns at continental scales. The differences are largest during summer months and tend to be greater over land. In the Northern Hemisphere, relatively large regions in North America and Europe underestimate the temporal mean in the winter and fall, while these regions have large but random differences in the summer. Over South America, satellite measurements overestimate the concentrations in fall and winter but underestimate the concentrations during spring.
Since differences between clear-sky concentrations and total concentrations are spatially coherent on seasonal and annual timescales, we suggest that the main cause of clear-sky errors is synoptic variability and the covariance of clouds and atmospheric CO2 concentrations. A study by Parazoo et al. (2008) used the same model to investigate mechanisms for atmospheric variability. In mid-latitudes, large synoptic variations in atmospheric CO2 are due to weather disturbances that are associated with cloud cover, such as frontal systems. Due to deformational flow, frontal systems create large horizontal gradients in CO2 that are masked by clouds and thus cannot be sampled by satellites. In the tropics, Parazoo et al. (2008) show that a recharge-discharge mechanism controls variations of atmospheric CO2 concentrations. CO2 anomalies created within the boundary layer are transported to the upper troposphere by vertical mixing and convection, which is covariant with cloud cover. Since these anomalies occur under cloudy conditions, they are hidden from satellite observations.
Although these errors should be investigated for various years using different transport models, it is likely that spatially coherent patterns would still exist regardless of model choice due to the covariance between clouds and CO2 concentrations. It is imperative that source/sink estimates from satellite data use model concentrations sampled at the time and location of the satellite observations. Further, transport models will need to capture correct placement and timing of convective events and synoptic weather features, including fronts and clouds.