The carbon sink in China as seen from GOSAT with a regional inversion system based on the Community Multi-scale Air Quality (CMAQ) and ensemble Kalman smoother (EnKS)

. Top-down inversions of China’s terrestrial carbon sink are known to be uncertain because of errors re-lated to the relatively coarse resolution of global transport models and the sparseness of in situ observations. Taking advantage of regional chemistry transport models for mesoscale simulation and spaceborne sensors for spatial coverage, the Greenhouse Gases Observing Satellite (GOSAT) retrievals of column-mean dry mole fraction of carbon dioxide (XCO 2 ) were introduced in the Models-3 (a ﬂexible software framework) Community Multi-scale Air Quality (CMAQ) and ensemble Kalman smoother (EnKS)-based regional inversion system to constrain China’s biosphere sink at a spatiotemporal resolution of 64 km and 1 h. In general, the annual, monthly, and daily variation in biosphere ﬂux was reliably delivered, attributable to the novel ﬂux forecast model, reasonable CMAQ background simulation, well-designed observational operator, and Joint Data Assimilation Scheme (JDAS) of CO 2 concentrations and natural ﬂuxes. The size of the assimilated biosphere sink in China was − 0.47 PgCyr − 1 , which was comparable with most global estimates (i.e., − 0.27 to − 0.68 PgCyr − 1 ). Furthermore, the seasonal patterns were recalibrated well, with a growing season that shifted earlier in the year over central and south China. Moreover, the provincial-scale biosphere ﬂux was re-estimated, and the difference between the a posteriori and a priori ﬂux ranged from − 7.03 TgCyr − 1 in Heilongjiang to 2.95 TgCyr − 1 in Shandong. Additionally, better performance of the a posteriori ﬂux in contrast to the a priori ﬂux was statistically detectable when the simulation was ﬁtted to independent observations, indicating sufﬁcient to robustly constrained state variables and improved ﬂuxes estimation. This study serves as a basis for future ﬁne-scale top-down carbon assimilation.


Introduction
In the context of human-induced climate change, the Paris Agreement charts the course for the world to transition to a green way of development and outlines the minimum steps to be taken to protect the Earth, which requires all countries to make significant commitments to stabilize atmospheric greenhouse gas concentrations and keep the global average temperature to well under a 2 • C rise (UNFCCC, 2015). Therewith, a growing number of countries and regions are pledging to achieve net-zero emissions in the second half of this century, for instance, Austria by 2040, Sweden by 2045, the European Union by 2050, and China by 2060. Hence, there has been an increasing demand from policymakers and the scientific community in general for accurate knowledge of CO 2 emissions from anthropogenic sources (so that the targeted reductions are effective) and from biospheric uptake (so that natural reservoirs remain stable) (Ciais et al., 2015;Pinty et al., 2017;Friedlingstein, et al., 2020;Deng et al., 2022). In 2019, the Intergovernmental Panel on Climate Change (IPCC) published a refined methodology report as an update to its 2006 guidelines with the aim to complement them with a bottom-up, transparent framework and highlight the Monitoring and Verification Support (MVS) capacity using independent atmospheric measurements (IPCC, 2019). A great deal of effort has been devoted in recent decades to developing and applying atmospheric CO 2 inversions to constrain global-and regional-scale CO 2 fluxes (Enting et al., 1995;Thompson and Stohl, 2014;Broquet et al., 2011, Peters, et a., 2007Tian et al., 2014;Kou et al., 2017;Kountouris et al., 2018). Most of these inversions are informed by ground-based observations and global chemistry transport models (CTMs), which is far from sufficient to support the abovementioned needs. Despite the development of surface observation networks with highly accurate continuous data, such as ICOS (the Integrated Carbon Observation System) in Europe, the global distribution of ground-based CO 2 measurements remains rather sparse and inhomogeneous. Consequently, the errors introduced by the incomplete observation network, and the uncertainties of the CTMs, as well as inversion framework, have been proven to be a disadvantage in delivering consistent regional flux estimates obtained using state-of-the-art global inversions from the national up to the continental scales (Monteil et al., 2020;Piao et al., 2022;Schuh et al., 2022).
Spaceborne sensors, designed specifically to retrieve atmospheric concentrations with unprecedented spatial coverage, have in recent years begun to improve the current understanding of greenhouse gases and the associated CO 2 emissions' MVS capacity. At present, there are several operational CO 2 observation satellites in orbit, including Japan's Greenhouse Gases Observing Satellite (GOSAT; Kuze et al., 2009), GOSAT-2 (Glumb et al., 2014), the US Orbiting Carbon Observatory 2 (OCO-2; Eldering et al., 2017a, b), OCO-3 (Eldering et al., 2019), and China's TanSat Yang et al., 2018). It is recognized that satellite retrievals of shortwave infrared radiation, despite their uncertainty, are sufficient to reliably capture the seasonal variability of XCO 2 (column-mean dry mole fraction of carbon dioxide), as a first-order question in constraining inversion models (Lindqvist et al., 2015;Li et al., 2017). Furthermore, several centers and universities routinely assimilate GOSAT XCO 2 data into models to estimate terrestrial ecosystem carbon exchange, including Japan's National Institute for Environment Studies (NIES), the United States' National Aeronautics and Space Administration (NASA), France's Laboratoire des Sciences du Climat et de I'Environnement, the Netherland's Institute for Space Research, the UK's University of Edinburgh, Canada's University of Toronto, and China's Nanjing University. As an example, the NIES GOSAT Project provides a Level 4 CO 2 data product, and the monthly regional CO 2 flux estimates for the period 2009-2013, based on XCO 2 retrievals and NIES' global atmospheric tracer transport model with Bayes' theorem, are publicly available (Maksyutov et al., 2013;Takagi et al., 2014). Moreover, NASA's Carbon Flux Monitoring System is another recent top-down global inversion system configured with 4DVar and GEOS-Chem (Goddard Earth Observing System with Chemistry) and concurrently assimilates XCO 2 from GOSAT and OCO-2. It has released the longest available terrestrial flux estimates (from 2010-2018) on selfconsistent global and regional scales and has planned future updates of the dataset on an annual basis . In addition, the University of Edinburgh has simultaneously produced a 5-year CH 4 and CO 2 flux estimate for 2010-2014 directly from GOSAT retrievals of XCH 4 : XCO 2 by using GEOS-Chem and an ensemble Kalman filter (EnKF) . Moreover, the Global Carbon Assimilation System has been upgraded by Nanjing University to assimilate the GOSAT XCO 2 retrievals from 2010-2015 with the ensemble square root filter (EnSRF) algorithm and the Model for Ozone and Related Chemical Tracers version 4 (Jiang et al., 2021(Jiang et al., , 2022. Overall, the top-down CO 2 biosphere flux datasets inverted from satellite data suggest an improved flux estimation compared with the large uncertainties in processbased terrestrial biosphere model estimates Chevallier et al., 2019;Chen et al., 2021). Deng et al. (2016) and Wang et al. (2018) further highlighted the importance of improved observational coverage to better quantify the latitudinal distribution of terrestrial fluxes by combining GOSAT observations over land and ocean. Also, the sensitivity of observations from GOSAT and OCO-2 to optimized CO 2 fluxes has been examined using GEOS-Chem, indicating that GOSAT offers greater sensitivity in Northern Hemisphere spring and summer (Byrne et al., 2017;Wang et al., 2019).
Nevertheless, the inversions primarily involved uncertainties in global CTMs, satellite retrievals, a priori fluxes, and inversion frameworks. A GOSAT CO 2 global inversion intercomparison experiment involving eight research groups found that, as expected, the most robust flux estimates were obtained at large scales and quickly diverged at subcontinental scales (Chevallier, 2015;Houweling et al., 2015;Fu et al., 2021). Generally, the assimilated CO 2 flux (i.e., the analytical field) is a weighted average of background information and observations, which depends on the correlation coefficient between simulated concentrations of the observation and the state variable (i.e., CO 2 flux). In particular, considering the transport errors introduced by global CTMs, the reliability of the regional fluxes inferred from GOSAT retrievals remains a topic of ongoing discussion (Reuter, et al., 2017;. Consequently, if we can configure a reasonable simulation of the background CO 2 concentration compared with the coarse spatiotemporal resolution of the global scale, then the flux constrained by observations can be estimated more precisely at national and city scales. The step up in inversion resolution and accuracy calls for new developments in shifting from global to regional inversions.
The use of regional CTMs in CO 2 research is more recent. For instance, Huang et al. (2014) demonstrated the importance of regional CTM performance to assimilation and suggested it is possible to improve the CO 2 concentration accuracy of the synoptic-scale variation by utilizing EnKF and CMAQ (Multi-scale Air Quality Modeling System). Zhang et al. (2021) assimilated OCO-2 retrievals with WRF-Chem/DART (Weather Research and Forecasting model coupled with Chemistry/Data Assimilation Research Testbed) to improve the estimation of CO 2 concentrations. In recent years, several studies have relied on regional CTMs in CO 2 flux inversions inferred from surface stations, towers, and aircraft flights, including CMAQ, WRF-Chem, CHIMERE, and the FLEXPART Lagrangian model. Not only terrestrial ecosystem exchange (e.g., Europe, North America, East Asia) but also urban CO 2 emissions (e.g., Los Angeles, Paris, Indianapolis) have been estimated, and the importance of regional CTM is increasingly recognized with their advantages in resolving fine-scale CO 2 concentrations (Brioude et al., 2013;Staufer et al., 2016;Lauvaux et al., 2016;Thompson et al., 2016;Kou et al., 2017;Zheng et al., 2018b;Monteil and Scholze, 2021). Moreover, the potential use of regional CTM in CO 2 inversions with satellites has been explored with artificial retrievals by Observing System Simulation Experiments . Pillai et al. (2016) further concluded that satellite missions such as CarbonSat (Carbon Monitoring Satellite) have high potential to obtain high-resolution CO 2 fluxes in Germany. However, regional CTMs are rarely used in satellite carbon data inversion in estimating China's terrestrial carbon sink, even though multimodel comparisons have reported large uncertainties introduced by global CTMs in China's top-down inversion (Wang et al., 2020;Piao et al., 2022;Schuh et al., 2022;. Previous studies have highlighted that the simultaneous assimilation of concentrations and fluxes as state variables can help reduce the uncertainty of both the initial CO 2 fields and the fluxes (Tian et al., 2014;Peng et al., 2015;Kou et al., 2017). Recently, Peng et al. (2017Peng et al. ( , 2018Peng et al. ( , 2020 improved air quality forecasts and emission estimates over China by developing a novel flux forecast model with the ensemblebased Joint Data Assimilation Framework (JDAS), so that the extended model can construct ensembles of both concentration and flux at the hourly scale. As an extension to this work, JDAS was further developed towards a high-resolution inversion of CO 2 fluxes based on the CMAQ and ensemble Kalman smoother (EnKS) with historical GOSAT observations over China, which holds an advantage over global models in terms of the CO 2 background information and inversion scheme. To the best of our knowledge, this is the most up-to-date estimate of China's biosphere flux informed by a regional CTM and satellite observations. It should prove to be of considerable value, particularly under the framework of the Paris Agreement, which requires high-spatiotemporalresolution inversions of CO 2 flux for carbon accounting at national scales.
In this paper, we focus on the development of top-down estimates constrained by GOSAT retrievals and CMAQ. Using this unique regional inversion technique, we address the following questions: 1. On what scales can regional CTMs and GOSAT observations facilitate the inversion of China's carbon sink?
2. What is the difference between posterior flux inferred from spaceborne retrievals and prior flux?
2 Methods and data 2.1 CMAQ regional transport model The atmospheric transport and the signature of sources and sinks in CO 2 concentrations were simulated using a regional CTM, i.e., CMAQ, which was originally developed by the US Environmental Protection Agency to model multiple air quality issues over a variety of scales and has been updated for passive tracers, as in Kou et al. (2013) with a 1-64 km horizontal resolution capability. The CMAQ regional modeling system has already been used in several regional studies and has shown promising performance in capturing the fine-scale spatiotemporal variability of CO 2 mixing ratios (e.g., Kou et al., 2013;2015;Liu et al., 2013;Huang et al., 2014;Li et al., 2017). The CMAQ configuration used here was a domain of 6720 km × 5504 km with 64 km × 64 km fixed grid cells centered at 35 • N and 116 • E in a rotated polar stereographic map projection. This domain, having 105 (west-east) × 86 (south-north) grid points, covered the whole of mainland China and its surrounding regions ( Fig. 2). The model has 15 vertical layers unequally spaced from the ground to approximately 23 km, half of which are concentrated in the lowest 2 km to improve the simulation of the atmospheric boundary layer.
In this study, the initial fields and boundary conditions of atmospheric CO 2 volume fraction were obtained by interpolation of NOAA's CT2019B, which is a widely recognized estimate of the global distribution of atmospheric CO 2 . CT2019B CO 2 concentration were created using the optimized surface fluxes, with a spatial resolution of 3 • × 2 • , 25 vertical levels, and a temporal resolution of 3 h (Jacobson et al., 2020). In addition, the a priori biosphere and ocean fluxes used for simulations within the CMAQ domain were also derived from the CT2019B optimized fluxes at 3 h intervals but with a spatial resolution of 1 • × 1 • . The anthropogenic CO 2 emission fluxes were based on the Multiresolution Emissions Inventory for China, version 1.3, and the Regional Emissions Inventory in Asia, version 3.2, with monthly gridded data at a resolution of 0.25 • × 0.25 • (Zheng et al., 2018a;Kurokawa and Ohara, 2020). The Global Fire Emissions Database, version 4.1s, with monthly gridded data at a resolution of 0.25 • × 0.25 • , was applied to provide the biomass-burning emissions (van der Werf et al., 2017). The abovementioned four individual CO 2 fluxes (i.e., biosphere, fossil fuels, fire, and ocean) were spatially interpolated to the CMAQ grid, conserving the total mass of emissions. CMAQ integrated and generated a 3D CO 2 concentration ensemble derived by the N ensemble fluxes with perturbed CO 2 initial and boundary conditions. The time step of the CMAQ output is 1 h.
In addition, RAMS (Regional Atmospheric Modeling System) provides the three-dimensional meteorological fields, with the lowest seven layers being the same as those in CMAQ. The initial and lateral boundary meteorological fields, sea surface temperatures, and initial soil conditions were prescribed by the European Centre for Medium-Range Weather Forecasts reanalysis data with a spatial resolution of 1 • × 1 • and 6-hourly temporal intervals (Zhang et al., 2002).

JDAS CO 2 assimilation framework
In the joint assimilation framework, besides the application of CMAQ to generate ensemble CO 2 concentrations, a flux forecast model was also designed to represents natural flux variations on account of fluxes acting as model forcing. The EnKS was further designed to joint assimilate CO 2 concentrations and fluxes. A brief description of the flux forecast model as well as the ensemble assimilation scheme is presented below.

Flux forecast model
CO 2 flux was treated as the model input, with the result that ensemble samples of fluxes could not be prepared by the CMAQ's forward forecasting. Consequently, a novel flux forecast model was designed to generate the background CO 2 flux ensembles E f i,t+1 , where i = 1, . . ., N refers to the ith ensemble member at time t (Eq. 1). The superscripts a, f , and p denote "assimilation", "forecast", and "a priori", respectively. (1) First, the a priori flux ensemble E p i,t+1 is created by using the ensemble CMAQ forecast CO 2 concentration C f refers to the a priori flux. The covariance inflation factor β is further used to keep the ensemble spread of the CO 2 concentration scaling factor κ i,t . The ensemble mean of κ i,t can be expressed as ) is determined by the assimilated CO 2 flux at the same time on each day from the previous assimilation cycles among . ., 1), and the a priori CO 2 flux E p t+1 . M refers to the length of the smoothing window, which was chosen as 4 d.
This design follows Peters et al. (2007), in which the useful observational information from the previous assimilation cycle was made beneficial to the next assimilation cycle via a smoothing operator but was further modified to cooperate with the diurnal variation in CO 2 biosphere flux. Then, E f t+1 was used to re-center E p t+1 . In contrast to previous flux models without diurnal variation, this new flux model is advantageous insofar as it facilitates the development of assimilation between regional CTM forecasts and observations at the hourly scale, so as to achieve high-resolution inversion. On the other hand, negative flux in carbon assimilation is realistic and reasonable, so it is not excluded. In this way, the Gaussian assumption is satisfied in JDAS carbon assimilation.

EnKS assimilation scheme
The regional assimilation system used in this study, JDAS, was developed based on EnSRF originated from NOAA's operational EnKF system (https://dtcenter.ucar.edu/com-GSI/ users/docs/users_guide/GSIUserGuide_v3.7.pdf, last access: 15 June 2023). The EnSRF algorithm has been modified with the EnKS feature and further extended to simultaneously assimilate multiple chemical initial conditions and emissions with the in situ measurements of their atmospheric observations (Peng et al. , 2018(Peng et al. , 2020Kou et al., 2021).
In the present study, the GOSAT observations were introduced in the EnKS-based JDAS framework to constrain China's biosphere sink and CO 2 concentrations, and natural fluxes were designed to be concurrently assimilated. Hence, both the CO 2 concentrations (C) and natural fluxes (E) were regarded as state variables (i.e., x = [C, E] T ), and helpful observational information employed in the current assimilation cycle could be efficiently capitalized upon in the next assimilation cycle with reduced uncertainty in the initial CO 2 conditions. Accordingly, the background of the state variables, can be prepared by CMAQ and flux forecast model.
The observation operator has been designed to convert the background forecast to observation space. To obtain the simulated observations H (C f ), the observation operator H performs the necessary interpolation from CMAQ forecasts to observation space XCO 2 . The simulated CO 2 concentration profiles were mapped into the GOSAT satellite retrieval levels and then vertically integrated based on the satellite averaging kernel according to the following equation: where the subscript k represents the retrieval level, XCO p 2 denotes the a priori XCO 2 for retrieval, y p k is the a priori CO 2 profile for retrieval, A k stands for the satellite column-averaged kernel, h k is a pressure weighting function, and y f k denotes the CMAQ-simulated CO 2 profile interpolated into the corresponding retrieval levels. As in Eq. (1), the superscripts f and p also refer to "forecast" and "a priori" in Eq. (2). Moreover, w denotes the RAMS water mole fraction, which was used to map from the CO 2 concentrations to the dry mole fraction, as suggested by Feng et al. (2009). In addition, for the H (E f ), it should be noted that H includes not only interpolation (i.e., Eq. 2) but also CMAQ to convert from flux to simulated XCO 2 .
The observation-minus-background, OMB (i.e., y − H (C f )), is denoted as "observational increments" or "innovations", where y refers to GOSAT XCO 2 . The analysis x a is obtained by adding the innovations to the model forecast with weights K (i.e., Kalman gain matrix), which are determined based on the estimated statistical error covariance of the forecast and the observations based on Eq. (3).
Consequently, after completing the "forecast step", K is obtained by minimizing the analysis error covariance with evolved forecast error covariance over time. Then, the associated analyzed state variables, x a = [C a , E a ] T , can be updated by applying the EnKS constrained by GOSAT retrievals in the "analysis step". Hereafter, AN denotes the analysis fields x a , and BG denotes the model's first guess background fields x f . The basic configuration of the JDAS CO 2 inversion settings followed previous studies. For instance, the ensemble size N was set to 50 to sustain the balance between computational cost and ensemble performance. The horizontal covariance localization radius was chosen as 1280 km to localize the observation's impact and ameliorate the spurious long-range correlations between state variables and observations caused by the limited number of ensemble members (Peng et al., 2023;Houtekamer and Mitchell, 2001;Gaspari and Cohn, 1999). Moreover, the covariance inflation factor β was set to 80 to preserve the ensemble spread. In this study, the assimilation window of EnKS was set to 24 h, and hourby-hour assimilation was adopted in the novel flux forecast model and fine-scale CMAQ background simulation. In an assimilation cycle, the fluxes for the 24 h smoothing window were designed to be optimized hour by hour successively. The distribution of ensemble spread of CO 2 flux in January 2016 is provided in Fig. 1. It shows that the values of the ensemble spread range from 0.2 to 0.8 in most areas, which is consistent with our previous studies (Peng et al., 2015 in Fig. 11c andPeng et al., 2023).

GOSAT XCO 2 retrievals
GOSAT, launched by the Japan Aerospace Exploration Agency in January 2009, was designed to make near-global greenhouse gas measurements in a sun-synchronous orbit. It covers the whole globe in 3 d and has a sounding footprint of approximately 10.5 km. In this study, we assimilated GOSAT XCO 2 retrievals from NASA's Atmospheric CO 2 Observations from Space Level 2 standard data products (version ACOS_L2_Lite_FP.9r; data available at https://oco2.gesdisc. eosdis.nasa.gov/data/GOSAT_TANSO_Level2/, last access: 15 June 2023). This version of processing supports both nadir and glint soundings. In the case of soundings over water, a check was made to ensure the observation was made in glint mode. The XCO 2 data from Lite products were biascorrected (Wunch et al. 2017;O'Dell et al. 2018). Typically, Level 2 Lite products contain 10-200 useful soundings per orbit, noting that more than 50 % of the spectral data were not processed during retrieval because they did not pass the first cloud screening preprocessing step.
The update for CO 2 flux is given by the observation innovation and the correlations between CO 2 concentrations and emissions, while the correlations are naturally provided by the physics-and dynamics-based numerical model. Although there are limited observation numbers, observations of 1 h are available. Thus through hourly updates, along with hourly model advances, the spatially sparse observations can sufficiently constrain the CO 2 flux, which can be demonstrated by the results. Given the EnKF algorithm, the posterior uncertainty is proportional to the prior uncertainty but with a smaller magnitude. Based on hourly update, the posterior uncertainty contains the same flow-dependent information as the prior uncertainty. For both chemistry assimilation and numerical weather prediction, it is common that the dimension of observation is much smaller than the dimension of state vector. Thus data assimilation helps to use the limited observations to constrain the state vector. Before being applied in assimilation, the GOSAT retrievals were operated in three steps. First, only the data retrievals tagged with "RetrievalResults/outcome_flag = 1" were selected, which indicates the retrieval quality. Second, to achieve the most extensive spatial coverage with the assurance of using the best-quality data available, a thinning strategy was used when multiple observations appeared in the same model grid point at the same hour on each day after interpolation of the model's horizontal coordinates. Only retrievals with a minimum value of uncertainty, i.e., "RetrievalResults/xco 2 _uncert", were selected. Third, the OMB quality control method is used to check the background fields to maintain stability in the assimilation. The records with absolute biases (i.e., |o − b|) greater than 5 ppm were removed, which are considered to have a lack of regional representativeness, and were mostly found near the boundary of the model domain. Moreover, the retrievals for the glint soundings over oceans have relatively larger uncertainty, and thus many data over oceans were excluded in our inversions in terms of the data-screening strategy (Fig. 2).
Non-assimilated XCO 2 observations were used for verification purposes after another process of repeated sifting, whose steps were as follows: (1) observations were marked with "outcome_flag = 1"; (2) XCO 2 values with the mini-mum "xco 2 _uncert" in the same model grid point and at the same hour were excluded, which filtered out all of the assimilated XCO 2 ; and (3) outliers were precluded if the |o − b| was larger than 5.00 ppm.

Experimental design and evaluation method
Following previous GOSAT inversion work (Maksyutov et al., 2013;Feng et al., 2017;Wang et al., 2019;Liu et al., 2021;Jiang et al., 2022), in this study, the natural flux (i.e., biosphere-atmosphere exchange and ocean-atmosphere exchange) were optimized, while the fossil-fuel and biomassburning fluxes were kept unchanged. This design, in which the natural fluxes were a subset of the state variables, further allowed us to focus on investigating the uncertainty of China's carbon sink, since the uncertainty in prescribed biomass-burning and fossil-fuel emissions is minor compared to that of the biosphere fluxes in the model domain (van der Werf et al., 2017;Zheng et al., 2018a;Kurokawa and Ohara, 2020). Fully reconciling the differences between bottom-up and inversion-estimated fossil-fuel emissions is outside the scope of this work and is therefore not discussed any further in this study. Consequently, the selected XCO 2 observations were assimilated hourly to adjust the CO 2 concentrations and fluxes. The assimilation was performed for the period 00:00 UTC 25 December 2015 to 23:00 UTC 31 December 2016, using the perturbed initial conditions and boundary conditions by adding Gaussian random noise with a standard deviation of 5 %. The first 7 d was set as the spin-up, which has been tested by Peng et al. (2015) with pseudo-satellite observation and CMAQ assimilation. Results for the period 1 January to 31 December 2016 are discussed and validated in detail in this paper.
Then, additionally, to assess the quality of the inversion results, two sets of forward simulations were performed throughout the year of 2016. One set of experiments was forced by the optimized a posteriori fluxes (denoted as FC), and the other was forced by the prescribed a priori fluxes as a control experiment (denoted as CTRL). Both forward runs used the same initial and boundary concentrations from the CT2019B product. Generally, it is hard to validate the op-timized flux because comparison with in situ flux measurements is difficult on account of the discrepancy in scales between fluxes assimilated in the model grid point and eddycovariance measurements over a very large uniform underlying surface. Therefore, this traditional approach was adopted as a compromise to assess whether the a posteriori fluxes would enable improvements in the fit to observed CO 2 concentrations, including non-assimilated GOSAT and surface observations from 14 sites. "XCO 2 (validation)" denotes the independent GOSAT XCO 2 retrievals for validation. "XCO 2 (assimilation)" represents the observations used for assimilation, and the corresponding model results come from BG (JDAS background fields). RMSE refers to the root-mean-square error, CORR refers to the correlation coefficient, MAE refers to the mean absolute bias, and NUM refers to the number of XCO 2 data. The monthly and annual averages were calculated from the hourly outputs.

Performance of observational and analysis increments
We begin by analyzing the observational and analysis increment performance of JDAS. According to the statistics listed in Table 1, the total number of assimilated XCO 2 values in 2016 reached 15264 (i.e., 79.22 % of the thinned amount), with the monthly ratio of "assimilated-to-thinned" ranging from 74.19 % (in August) to 98.91 % (in July). The available XCO 2 data amount for JDAS decreases from 1788 in January to 1870 in February, to 734 in June, and to 728 in July, which represents an approximate 61 % reduction in the year-round monthly comparison. Also, it should be noted that the maximum median XCO 2 uncertainty occurred in July (0.99 ppm) and the minimum in December (0.64 ppm), indicating a better quality of XCO 2 retrievals in winter and less stable retrievals in summer. As shown in Table 1, both the mean absolute error (MAE) and root-mean-square error (RMSE) exhibit a maximum in July (1.99 and 2.41 ppm, respectively) and a minimum in April and September (MAE: 1.76 and 1.76 ppm; RMSE: 2.18 and 2.15 ppm), indicating that the point-by-point uncertainty is larger in summer and lower in spring and autumn, which is consistent with previous model studies (Li et al., 2017). The difference in seasonal performance could be partly due to the uncertainties in the spatial and temporal variations of the biosphere flux estimation and fossil-fuel inventories. Figure 2 demonstrates the distribution of XCO 2 observation increments and CO 2 flux analysis increments (i.e., the analysis-minus-background E a −E b ) over the model do-main, including January ( Fig. 2a and b), July ( Fig. 2c and d) and the whole year ( Fig. 2e and f). In particular, most of the available XCO 2 in July appears in the north and central region of China, but the south and northwest tend to be blank. The XCO 2 innovation range is usually between −3 and 3 ppm in the corresponding model grid point, with a monthly mean value between −0.12 and −0.96 ppm over the model domain. Moreover, the pattern of CO 2 flux analysis increments (i.e., AN-FC flux) preserves features from innovations and certifies that GOSAT XCO 2 is effectively absorbed in JDAS. GOSAT retrievals were found to display impacts within a certain range near the observation points after entering the assimilation system. The higher variation in monthly flux analysis increments for July than those for January indicates that the uncertainties of forecast flux in summer are larger than those of the variation in winter. Considering the peculiarities of atmospheric CO 2 , such as its long atmospheric lifetime, long-range transport, high background concentrations, and strong biosphere-atmosphere exchanges, there are both wide-ranging overall increases (e.g., −0.01 to 0.1 over central China) and decreases (e.g., −0.2 to −0.01 over South China) and small-scale adjustment taking place in 2016 (Fig. 2f).

Size of the annual carbon sink in China
Before presenting a posteriori biosphere fluxes in China from JDAS, Table 2 provides an overview of most of the wellknown inversion modeling systems, configurations of inversions, atmospheric transport models, spatiotemporal resolutions, and observations. The inversion systems differ by the transport model, the inversion approach, the choice of observation, and prior constraints, enabling us to facilitate international comparison and mutual recognition. For example, either in situ CO 2 or GOSAT XCO 2 -constrained flux (i.e., −1.11 and −0.83 Pg C yr −1 ) demonstrates much higher sink estimates from GEOS-Chem-based inversion with a 4 • × 5 • horizontal resolution. Excluding the outliers, most global inversions report a carbon sink in China of −0.27 to −0.56 Pg C yr −1 from in situ CO 2 and −0.34 to −0.68 Pg C yr −1 from satellite retrievals. In contrast, our estimates constrained by GOSAT observation (−0.47 Pg C yr −1 ) agree reasonably well with the previous estimates mentioned above.

Regional characteristics of posterior fluxes
As can be seen in Fig. 3a, the annual horizontal distribution patterns of biosphere flux show significant spatial heterogeneity and fairly large gradients in most areas. Fig. 3b further illustrates annual differences between a priori and a posteriori fluxes over the model domain. Although China's total carbon sink of a posteriori fluxes (−0.47 Pg C yr −1 ) is approximately equal to the a priori fluxes (−0.43 Pg C yr −1 ), the spatial distribution has been modified through assimilation. Compared to the prescribed a priori biosphere flux, not only large-scale vegetation adjustments but also smallscale conditions can be detected throughout the year after assimilating atmospheric observations (Fig. 3b). Generally, the a priori biosphere fluxes are overestimated (∼ 0.1-0.3 µmol m −2 s −1 ) in the north (dominated by forest, grassland and cropland) and south (dominated by forest and grassland) of China, while they are underestimated (∼ 0.1-0.5 µmol m −2 s −1 ) primarily in central China where there is a large area of cropland. Figure 3c-f show the seasonal spatial differences before and after assimilation, taking January, April, July, and October as representatives of winter, spring, summer, and autumn. The monthly averages were calculated from the daily averages based on hourly outputs. The difference between the analysis and prior flux tends to be larger in July, lower in April and October, and lowest in January, which indicates a larger uncertainty in biosphere flux estimates in the growing season. This is consistent with the findings of previous studies (Jiang et al., 2016;Chen et al., 2021;Fu et al., 2022). Moreover, it should be noted that an obvious underestimation of a priori flux (approximately 0.1-0.5 µmol m −2 s −1 ) occurs in the northern, central, and southern vegetation growth regions. On the other hand, the central part of China, dominated by cropland, shows relatively larger a posteriori flux in winter and smaller a posteriori flux in summer and autumn, in contrast with the a priori flux constrained by the limited background observation sites Jacobson et al., 2020). Additionally, compared with the weekly temporal resolution of global inversion, the hourly observational increments as well as the hourly first-guess fields in this study hold some advantage in evaluating the monthly variations of fluxes. As expected, some distinguishing features are thus demonstrated in the assimilated fluxes, such as the carbon sources in parts of central, eastern, and southwest China, which is more consistent with the underlying surface situation. In this way, the JDAS inversion system has the potential to depict the fine-scale characteristics of biosphere flux.
Next, we analyze the monthly and annual fluxes in five large regions -west, north, central, south, and mainland China (denoted by the red frame in Fig. 3a) -to analyze the regional inversion in subcontinental-scale flux variation as well as to contrast results with the previous inversion analysis (Fig. 4). Given the representative background and observation information, the seasonality patterns were modified by JDAS assimilation, with larger annual sinks relative to the a priori ones and a growing season that is shifted earlier in the year over central and south China. As shown in Fig. 4, there is an evident difference in the a posteriori annual carbon sink magnitude in these regions, gradually decreasing in the north (e.g., forest, grassland and cropland), south (e.g., forest and grassland), west (e.g., grassland and tundra), and central region (e.g., cropland) in turn, which is consistent with the primary corresponding ecosystem types, while the a priori sink of the west tends to be larger than that of the south. Using the north as a reference, the annual carbon sinks of the a priori estimates for the north, south, west, and central regions are 1.00, 0.57, 0.62, and 0.44, respectively, while those of the a posteriori estimates are 1.00, 0.62, 0.56 and 0.38. On the other hand, the a priori and a posteriori amplitudes of the seasonal variation (i.e., the difference between the maximum and minimum monthly estimates, as defined in Crowell et al., 2019) range from 374.33/333.74, 87.01/80.41, 120.33/113.98, and 82.34/88.00 to 413.17/389.48 Tg C per month in north, south, west, central, and mainland China, respectively. Moreover, the drastic fluctuation in the daily variation of prior fluxes has been modified by observational constraints in JDAS (sub-graph in the left-hand panel of Fig. 4). Therefore, this implies the potential for regional inversion in interpreting underlying processes in large regions such as China where the ecosystems and climate are quite varied.
Nevertheless, achieving robust and reliable flux signals at smaller regional scales is quite demanding and rather challenging because of the limited observations and low accuracy of transport models as well as the a priori information. In this paper, we further try to investigate the condition of the regional biosphere carbon sink over several of China's key ecological areas (denoted by the blue frame in Fig. 3a) -for example, Daxing'anling (DX), the Loess Plateau (HT), the Qinling Mountains (QL), the rocky desert in Guangxi (SM), Mount Wuyi (WY), and Xishuangbanna (XS). These regions are characterized by their unique vegetation and climatic conditions. In particular, the a priori and a posteriori seasonal amplitudes amount to 43.64/39.56, 24.03/23.39, 35.73/37.96, 29.36/31.80, 2.70/3.64, and 7.93/7.04 Tg C per month in DX, HT, QL, SM, WY, and XS, respectively. The DX region is characterized by abundant forest and far more satellite retrievals to constrain fluxes, with annual a priori and a posteriori carbon sinks of −25.13/−29.64 Tg C yr −1 . Com-pared to a priori fluxes, relatively stronger a posteriori sinks are also found in QL (−60.05/−62.53 Tg C yr −1 ), SM (−62.10/−71.27 Tg C yr −1 ), WY (0.36/−2.19 Tg C yr −1 ), and XS (−10.12/−10.79 Tg C yr −1 ), which is consistent with the improved ecological conditions due to ecological engineering construction as well as generally favorable climatic conditions. As can be seen in Fig. 5, the XS region is unique and worthy of attention in contrast to the other regions not only because it shows different seasonality in its release of CO 2 to the atmosphere in summer and removal of CO 2 from the atmosphere in other seasons but also because of the large transport model errors that are included in the modeldata mismatch error involved in previous inversion studies (J. Wang et al., 2020;Schuh et al., 2022;Y. L. Wang et al., 2020). Thus, the abovementioned spatial variations of a posteriori fluxes might unlock some of the potential local signals in areas where regional transport models are more reliable and observations are plentiful.

Provincial patterns of optimized fluxes
In this section, we investigate the provincial patterns of biosphere flux (Fig. 6). Based on the gridded a posterior flux dataset, we first assess the annual CO 2 biosphere sink levels in 31 provinces in mainland China (Taiwan, Hong Kong SAR, Macao SAR, and Shanghai are not discussed because of the insufficient grid resolution). At this scale, both the a priori and a posteriori fluxes indicate the strongest carbon sink intensity per unit area being in Shaanxi, Guangxi, and Guizhou, but the a priori fluxes produce an underestimation in Shanxi and overestimations in Guangxi and Guizhou, respectively. Next, the second strongest carbon sink intensity is commonly seen in Shaanxi, Sichuan, Chongqing, and Hubei, whereas a comparatively low level of carbon sink intensity appears in Xinjiang, Liaoning, Anhui, and Yunnan as well as in Tibet and Fujian. Furthermore, some provinces with neutral (i.e., close to 0) source or sink statuses are re-evaluated by the GOSAT-constrained fluxes ( Fig. 6a  and b). For instance, the a posteriori flux in Ningxia is −0.01-0.01 µmol m −2 s −1 , while the a priori flux displays a weak carbon sink of −0.01 to −0.05 µmol m −2 s −1 , due to the complexity in the estimation related to the grassland and cropland land surfaces in this province. On the contrary, the a priori fluxes in Fujian and Jiangsu are close to 0, but we find a carbon sink ranging from approximately −0.01 to −0.05 µmol m −2 s −1 and a carbon source from 0.05 to 0.1 µmol m −2 s −1 , respectively. For Liaoning, the a priori fluxes are characterized by CO 2 sources (0.01-0.05 µmol m −2 s −1 ), while the assimilated fluxes with satellite measurements are slightly adjusted to a carbon sink (−0.05-0.1 µmol m −2 s −1 ).
Lastly, the sizes of the provincial biosphere fluxes are summarized and sorted quantitatively in Fig. 7. The maximum and minimum provincial biosphere flux sizes  (FC, blue). The bars on the right-hand side represent the 12-month average (unit: Tg C per month). The boxes on the left-hand side denote the daily flux (unit: Tg C d −1 ), with the whiskers indicating the minimum and maximum and the horizontal lines across the box indicating the 25th percentile, the median, and the 75th percentile, respectively. are in Inner Mongolia (a posteriori: −53.65 Tg C yr −1 ; a priori: −53.41 Tg C yr −1 ) and Shandong (a posteriori: 5.99 Tg C yr −1 ; a priori: 3.05 Tg C yr −1 ), respectively. Moreover, the difference between the a posteriori and a priori provincial flux ranges from −7.03 Tg C yr −1 in Heilongjiang to 2.95 Tg C yr −1 in Shandong, with an underestimation greater than 2.00 Tg C yr −1 appearing in Shandong (2.95), Jiangsu (2.31), and Hebei (2.25) and an overestimation greater than 5.00 Tg C yr −1 appearing in Heilongjiang (7.03), Liaoning (5.68), Yunnan (5.59) and Guangxi (5.10). On the other hand, a smaller percentage of modification between the a posteriori and a priori flux (i.e., (a posteriori − a priori)/a priori × 100 % in absolute value) arises in Xinjiang (0.28 %), Inner Mongolia (0.46 %), Tibet (1.10 %), Qinghai (2.45 %), Gansu (3.21 %), Shaanxi (3.50 %), Sichuan (4.34 %) and Shanxi (4.65 %), indicating a lower level of uncertainty in these larger carbon-sink provinces. Nevertheless, an increased percentage of modification in provincial flux appears in Jiangsu (a posteriori: 2.29 Tg C yr −1 ; a priori: −0.02 Tg C yr −1 ), Liaoning (a posteriori: −4.27 Tg C yr −1 ; a priori: 1.40 Tg C yr −1 ), Fujian (a posteriori: −1.15 Tg C yr −1 ; a priori: 0.29 Tg C yr −1 ), and Shandong (already listed above).

Evaluation against observations
We further assess the performance of the a posteriori CO 2 fluxes by comparing the CTRL, FC, and AN results  (AN, red bars). The bars on the right-hand side represent the 12-month average (unit: Tg C per month). The subfigures at the bottom denote the daily temperature (blue lines; unit: • ; left-hand y axis), total solar radiation (red stars; unit: MJ d −1 ; left-hand y axis), and precipitation (grey bars; unit: mm d −1 ; right-hand y axis), with the right-hand bars representing the annual average. in the fit to non-assimilated GOSAT as well as surface observations. The monthly and annual statistics were computed from the hourly outputs from the assimilation, simulation and observations. Table 1 demonstrates (as expected) that the concentration from the analysis fields (AN) performs best when fitted to the non-assimilated XCO 2 observations. It is notable that the column-averaged satellite signals have limited capacity in facilitating the tropospheric variation in CO 2 concentration compared to surface observations. Thus the response to changes in the simulated XCO 2 signal is weak, and improvement is rather moderate. For instance, the annual RMSE, MAE, and correlation coefficient for AN are 2.34 ppm, 1.93 ppm, and 0.73; for FC, they are 2.63 ppm, 2.02 ppm, and 0.66; and for CTRL, they are 2.65 ppm, 2.03 ppm, and 0.66, respectively. Additionally, the AN, FC, and CTRL biases from non-assimilated XCO 2 observations are further calculated (Table 3), and the outliers in CTRL have been effectively amended. When FC is compared with the CTRL results, the frequency of bias in [−4, 4] increases by 0.25 %, in [−3, 3] by 0.36 %, in [−2, 2] by 0.32 %, and in [−1, 1] by 0.14 %. The error standard deviation decreases from 2.63 ppm in CTRL to 2.61 ppm in FC and to 2.27 ppm in AN. Furthermore, surface in situ observations from 14 sites are further used as independent observations to evaluate the inversion results. The modeled CO 2 concentrations were ex-    Table 4 are better than FC and CTRL, including RMSE and MAE, which gives a direct indication that the assimilation performs well. Taking improvement rate as example, the RMSE improvement rate between the FC and CTRL mostly ranges from −2.13 % to 12.34 % with an average of 2.48 %, and the MAE improvement rate ranges from 0.08 % to 9.73 % with an average of 2.37 %. Although the RMSE and MAE of AN are lower than CTRL and FC, those of FC are higher than CTRL in Lin'an (in Wuhan, Hubei) and Jinsha (in Yangtze River Delta), which are in the vicinity of urban clusters with increased human activity (Liang et al., 2023). Thus, this helps to check that the inversions actually improve the model fits to the observations but also to determine whether some sites are particularly problematic for natural flux inversions. Inversions actually improve the model fits to the surface observations in forest areas (in northeast, east, and southeast China), cropland areas (in north China), grassland areas (in Mongolia), ocean (in the Korean Peninsula and Japan), and coastal areas (in the Korean Peninsula). The annual-averaged horizontal distributions of CO 2 concentration near the surface in 2016 are presented (Fig. 8). Figure 8a displays the surface CO 2 concentration analysis fields (AN), and the much-refined description in the AN allows for a more detailed characterization of the spatiotemporal distribution of CO 2 concentration and can further facilitate an interpretation of satellite data in a regional context over China. Thus, the AN can be used as a closer representation of the real condition. As shown in Fig. 8b and c, compared to the CTRL fields, the FC fields tend to be con-siderably closer to the AN fields, suggesting that a posteriori fluxes are calibrated acceptably. Furthermore, Fig. 8d shows the year-round statistic of XCO 2 error reduction (defined as (1 − δ FC /δ CTRL ) × 100 %)), as well as the amounts of non-assimilated XCO 2 observations, where δ FC represents the FC XCO 2 error standard deviation and δ CTRL the CTRL XCO 2 error standard deviation. The region of 8-57 • N and 105 • −120 • E is used as a reference because there is a relatively larger difference between the a priori and a posteriori fields, including the concentration as well as flux. In general, the error reduction is primarily found to be positive and ranges from approximately 0.80 % to 32.13 % with a median of 5.65 % and mean of 7.23 %. This zonal evaluation further verifies the improvement in the a posteriori flux compared to the a priori flux.

China's carbon sink international comparability among different methods
The total annual carbon sink in previous research along with our study is summarized ( Table 2). The aim is mainly to check that different methods -for instance, inventories, ecosystem process models, and atmospheric inversions -actually improve the carbon sink international comparability and mutual recognition. Based on national ecosystem inventory data, China's terrestrial carbon sink increased from −0.18 Pg C yr −1 in the 1980s to −0.33 Pg C yr −1 in the 2000s owing to forest area expansion and afforestation during recent years (Piao et al., 2009;Jiang et al., 2016;. Meanwhile, the results from several ecosystem process-based models display a carbon sink ranging from −0.13 to −0.22 Pg C yr −1 during 1980-2010, achieved by assessing the effect of changes in climate and CO 2 (Piao et al., 2009;He et al., 2019). In addition, according to the flux gap between top-down and bottom-up estimations mentioned above, a recent estimate of the lateral flux for China is −0.14 Pg C yr −1 , which includes the carbon exchange between the land and atmosphere in non-CO 2 forms as well as imported wood and crop products . The terrestrial carbon sink in 2016 with lateral fluxes adjustment amounts to approximately −0.33 Pg C yr −1 , constrained by the GOSAT XCO 2 in JDAS (−0.47 Pg C yr −1 ). Correspondingly, we also provide a corrected carbon sink estimate of −0.54 Pg C yr −1 (i.e., −0.68 + 0.14 = −0.54) inferred from in situ CO 2 data provided by JDAS (Peng et al., 2023), which is the optimal mathematical solution under the current sparse observational coverage with daytime photosynthetic uptake and likely leads to a slight overestimation to some extent. Hence, our estimates (−0.68 and −0.47 Pg C yr −1 from in situ CO 2 and GOSAT, respectively) agree reasonably well with the previous estimates mentioned above.

4.2
To what extent is the JDAS's posterior flux different from the prior flux?
In general, most research into the inversion of China's carbon sink has commonly used global transport models. The limited resolution and distribution of observations are deemed to lead to large uncertainties in inversion in small regions, especially at national scales (Crowell et al., 2019;Monteil et al., 2020;Piao et al., 2022). The resolution-related performance of transport models tends to magnify the uncertainty in China's carbon sink estimates. For instance, Fu et al. (2021) found that the results of a global model (i.e., GEOS-Chem) tended to be generally lower than GOSAT's XCO 2 in China from the various terrestrial models with a mean bias of about 2 ppm in winter, while Lei et al. (2014) found GEOS-Chem simulations tended to produce higher values than GOSAT (by 5.8 ppm) in China during summer. In contrast, the observational increments of JDAS show an ability to depict the fine-scale features with strong spatial heterogeneity whilst in general retaining the large-scale spatial patterns, which can be attributed to the CMAQ simulation performance in differentiating the nuances of anthropogenic and natural conditions. On the other hand, the analysis increments depend not only on the innovations, but also on how well the Kalman gain matrix computes the contribution weighting factors based on the time-dependent forecast error covariance. The biosphere flux first-guess fields were derived from the novel flux forecast model by taking the a priori flux, the analysis flux from the previous assimilation cycle, and the forecast concentration (Eq. 1), which is a great help in assisting with improving the background information and initial perturbation for ensemble forecasting. The good response of the vegetation condition to the a posteriori results provides a strong foundation for a meaningful interpretation of biosphere fluxes. Satellites, with their better spatial coverage, as well as regional transport models, with their improved stability, can help in assessing the real conditions of local terrestrial ecosystems with complex conditions, such as over central China. The decreased annual sink and increased seasonal variability in central China deduced by the a posteriori flux with satellites may in fact reflect the atmospheric CO 2 fixed by cropland vegetation, where ∼ 60 % of the area is cropland with relative few in situ observations used for constraining the a priori flux (Piao et al., 2009(Piao et al., , 2022. Actually, downward correction over forest and grassland and upward correction for cropland areas have been validated against independent data. Inversions actually improve the model fits to the surface observations in cropland, forest, and grassland areas. In general, (1) widespread underestimation of the a priori flux (0.01-0.1 µmol m −2 s −1 ) is found in central China, which is dominated by cropland and where dense satellite retrievals are accordingly available; (2) overestimates are distributed in the northeast and south of China over a considerable spatial extent; and (3) smaller changes between a posteriori and a priori estimates are primarily located in the west of China, which tends to agree with the XCO 2 OMB pattern. Nevertheless, summer is the season with the largest percentage of satellite data rejection and retrieval uncertainty, still making it a tough test for inversion systems.
At the provincial scale, the provinces in China differ in both terrestrial vegetation and anthropogenic activity. As discussed earlier, the difference between a posteriori and a priori estimates is closely related to the degree of human activity intervention. Several factors could account for the provincial spatial distribution constrained by GOSAT, for instance, the increased precipitation along with the strong El Niño in 2016, the levels of reforestation and afforestation, and the reductions in biofuels in rural areas bringing about a shrubland carbon sink.

How well can the JDAS inversion constrain the carbon sink of China?
Quantitative information on the extent of which the posterior fluxes are constrained by observations has been checked further. The prior information has been embodied in a prioriflux-simulated concentrations, and observation information has been embodied in the a posteriori flux simulation, whose fluxes are constrained by observations. By evaluating the differences between these two sets of simulation results, the prior information and observation information are now accessible in order to be assessed quantitatively. At the site scale (Table 4), some sites tend to systematically be poorly fitted by the inversions, in particular those in the vicinity of large urban areas with large anthropogenic emissions, such as Jinsha and Lin'an. Besides these two sites, the difference between CTRL and FC is affected by the observation information through assimilation ranges from 0.25 % to 12.34 % (i.e., RMSE decreasing rates), with an average of 2.48 % among all surface observation sites. According to the statistics, the observations have played a positive role in improving carbon sink over the model domain. The non-assimilated GOSAT XCO 2 has also been used to assess the difference between prior and posterior flux simulation. The decrease in the misfits is rather moderate (Table 1). In addition, the smaller correlation coefficient improvement in the contrast of CTRL and FC implies that prior flux patterns play an important role in posterior flux. On the other hand, favorable meteorological conditions (e.g., precipitation in the growing season being 20 % higher than that in 2015; National Climate Center, 2016) have also been reported, which further supports the improved ecological quality, indicating JDAS's potential in tracking biosphere CO 2 fluxes from space.

Summary and outlook
Top-down estimations of carbon budgets have been included in the UNFCCC's MVS framework. At present, most carbon sink inversions in China utilize a global transport model with relatively coarse resolution. Characterized by large heterogeneity in its biospheric spatiotemporal distribution, the transport model error, as well as the sparseness of in situ observations, leads to large uncertainties in the assimilation of carbon flux in China. In this study, a regional high-resolution inversion model (JDAS) was used, which has been extended to incorporate GOSAT constraints, along with a joint assimilation of CO 2 flux and concentration at high spatial (64 km) and temporal (1 h) resolution. The annual, monthly, and daily variation in biosphere flux was reproduced reasonably well, which was attributable to the novel flux forecast model with diurnal variation, the reliable CMAQ background simulation, carefully chosen XCO 2 retrievals, and the well-designed EnKS assimilation configuration.
The size of the biosphere carbon sink in China amounted to −0.47 Pg C yr −1 with JDAS by GOSAT constraints, which is comparable with previous global estimates (i.e., −0.27 to −0.56 Pg C yr −1 from in situ observations and −0.34 to −0.68 Pg C yr −1 from satellite retrievals). Next, the muchrefined CMAQ resolution in JDAS inversion was found to allow for a more detailed characterization of the spatiotemporal distribution of CO 2 and to further facilitate an interpre-tation of carbon flux in a regional context over China. The a priori and a posteriori seasonal amplitudes ranged from 374. 33/333.74, 87.01/80.41, 120.33/113.98, and 82.34/88.00 to 413.17/389.48 Tg C per month in north, south, west, central, and mainland China, respectively. Also, the drastic fluctuation in the daily variation of a priori fluxes was modified by observational constraints, which appeared more realistic than that of the a priori estimates. Moreover, we further investigated the condition of the biosphere carbon sink in several of China's key ecological areas. Using XS as an example, the large transport model errors that were included in the model-data mismatch error involved in previous global inversion studies were effectively reduced by JDAS, and XS was reported to be a relatively stronger sink in contrast to prior estimates (−10.12/−10.79 Tg C yr −1 ). Furthermore, the provincial patterns of biosphere flux were investigated and re-estimated. As seen from GOSAT, the difference between the a posteriori and a priori provincial flux ranged from −7.03 Tg C yr −1 in Heilongjiang to 2.95 Tg C yr −1 in Shandong. Finally, an evaluation against non-assimilated XCO 2 and surface observations demonstrated better performance of the a posteriori flux when fitted to the observations, indicating improved results in the regional inversion. Considering our prior estimates from CT2019B, the discrepancy could be because our study (a) relied on a fine-scale regional transport model; (b) was constrained by GOSAT XCO 2 retrievals with better spatial coverage rather than sparse and inhomogeneous in situ observations; (c) performed a joint assimilation of CO 2 flux and concentration, which helped reduce the uncertainty in both the initial CO 2 fields and the fluxes; and (d) carried out hourly assimilation based on hourly simulation and observation, which was more realistic.
The regional inversion methodology and results presented here prove the feasibility and superiority of regional CTMs and satellite observations in investigating China's carbon sink. On account of the obvious interannual variation in the biosphere sink, this work also serves as a foundation for future multi-year retrospective analyses of biosphereatmosphere exchanges under different meteorological conditions. On the one hand, although the ACOS retrieval technology has been substantially improved and provides unprecedented spatial coverage, more XCO 2 retrievals with better quality and lower retrieval uncertainty are still needed, especially during summertime as well as over ocean and west China. On the other hand, a knowledge gap also exists in inversion-based estimates, in which fossil-fuel emissions are generally assumed to be accurate. Besides uncertainties in natural flux, our current knowledge of urban emissions is far from adequate. Around 70 % of fossil-fuel emissions are derived from cities in combination with considerable uncertainties. Within the framework of the Paris Agreement, inversions at higher spatial resolution are an increasing demand, making it crucial to develop the capacity for inversions to quantify urban emissions and assess the effectiveness of emission mitigation strategies, alongside calls for improve-ments in observations, a priori information, anthropogenic emission inventories, transport models, and inversion technology.
Author contributions. ZP and MZ designed the research; XK and ZP prepared the calculations, analyzed the experiment, wrote of the original draft, and reviewed and edited the manuscript; and XH and ZL prepared the basic data, with comments from FH and LL.
Competing interests. The contact author has declared that none of the authors has any competing interests.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements. The authors are grateful to the three anonymous reviewers for their precious suggestions. The GOSAT retrievals were provided by the ACOS/OCO-2 project at the Jet Propulsion Laboratory, California Institute of Technology. This study used ground-based in situ CO 2 concentration observations from Ryori, Mt. Dodaira, Kisai, Anmyeon-do, Jeju Gosan, King's Park, Yonagunijima (WDCGG), Shangdianzi, Longfengshan, Jinsha, Lin'an, Shangri-La (CMA), Mt. Waliguan, and Ulaan Uul stations (NOAA ObsPack). We would like to express deep gratitude to the dedicated principal investigators, research teams, and support staff of the stations for providing their CO 2 observation records.
Financial support. This work was supported by the National Key Scientific and Technological Infrastructure project "Earth System Science Numerical Simulator Facility" (EarthLab). This work was also sponsored by the National Natural Science Foundation of China (grant nos. 42141017 and 42275153). In addition, this work was supported by State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry (grant no. LAPC-KF-2022-01) and Key Laboratory of South China Sea Meteorological Disaster Prevention and Mitigation of Hainan Province (grant no. SCSF202208). Review statement. This paper was edited by Xiaohong Liu and reviewed by three anonymous referees.