Biogeosciences
Assimilation of Soil Wetness Index and Leaf Area Index into the ISBA-A-gs land surface model: grassland case study

The performance of the joint assimilation in a land surface model of a Soil Wetness Index (SWI) product provided by an exponential filter together with Leaf Area Index (LAI) is investigated. The data assimilation is evaluated with different setups using the SURFEX modeling platform, for a period of seven years (2001–2007), at the SMOSREX grassland site in southwestern France. The results obtained with a Simplified Extended Kalman Filter demonstrate the effectiveness of a joint data assimilation scheme when both SWI and LAI are merged into the ISBA-A-gs land surface model. The assimilation of a retrieved Soil Wetness Index product presents several challenges that are investigated in this study. A significant improvement of around 13 % of the root-zone soil water content is obtained by assimilating dimensionless root-zone SWI data. For comparison, the assimilation of in situ surface soil moisture is considered as well; it has a lower impact on the root zone. Under specific conditions, the transfer of information from the surface to the root zone was found to be inaccurate. Also, our results indicate that the assimilation of in situ LAI data may correct a number of deficiencies in the model, such as low LAI values in the senescence phase, by using a seasonal-dependent error definition for background and observations. In order to verify the specification of the errors for SWI and LAI products, a posteriori diagnostics are employed. This approach highlights the importance of the assimilation design for the quality of the analysis. The impact of the data assimilation scheme on CO2 fluxes is also quantified by using measurements of net CO2 fluxes gathered at the SMOSREX site from 2005 to 2007. An improvement of about 5 % in terms of rms error is obtained.
Correspondence to: J.-C. Calvet (calvet@meteo.fr)


Introduction
The objective of data assimilation is to optimally combine data from different sources that bring complementary information on a geophysical system. The development of Land Surface Models (LSM) able to simulate photosynthesis processes, surface carbon fluxes and vegetation biomass allows the joint assimilation of soil moisture data together with Leaf Area Index (LAI) estimates.
The Leaf Area Index is an important factor controlling surface evapotranspiration, as it impacts the exchange of water vapor and CO2 between the vegetation canopy and the atmosphere. Several studies (Jarlan et al., 2008; Sabater et al., 2008) have shown the potential of assimilating LAI to estimate the vegetation characteristics and to reduce model uncertainties.
Soil moisture is a key variable to be initialized in meteorological models, since the partition between sensible and latent heat fluxes depends on the quantity of water available in the root zone. The characterization of soil moisture in deep layers is more important than that of the surface soil moisture, since the superficial reservoir has a small capacity and almost no memory. As the near-surface soil moisture (w_g) is reasonably well correlated with the profile soil moisture content under specific circumstances, the retrieval of root-zone soil moisture (w_2) using surface observations is possible (Calvet and Noilhan, 2000).
The simulated w_2 may be improved by ingesting remotely sensed surface soil moisture data into an LSM through data assimilation techniques. A number of studies (Entekhabi et al., 1994; Houser et al., 1998; Walker et al., 2001; Draper et al., 2009) have shown that data assimilation techniques make it possible to reconstruct w_2 from observed w_g.
A. L. Barbu et al.: Assimilation of Soil Wetness Index and Leaf Area Index
The main problem to be tackled in using an advanced land data assimilation system (LDAS) from a Numerical Weather Prediction (NWP) perspective is the additional computational cost of model integration. By assimilating the data into an off-line version of the land surface model, this burden becomes affordable. A study concerning the surface analysis for the ALADIN NWP model was performed by Mahfouf et al. (2009). A simplified version of the Extended Kalman Filter (EKF) was developed in order to assimilate screen-level air temperature and air humidity into the off-line ISBA (Interaction between Soil Biosphere and Atmosphere) land surface model (Noilhan and Mahfouf, 1996).
Accurate estimates of w_2 are also important for many applications in hydrology, agriculture and climate studies where the uncoupled mode can be used. Within the Global Monitoring for Environment and Security (GMES) initiative, coordinated efforts are made to produce global biophysical variables that describe the continental vegetation state, radiation budget and water cycle, with the objective of developing and validating pre-operational land information services. In particular, new satellite-derived products of soil moisture (Soil Wetness Index) and LAI are being produced. Including this new information in the LDAS and assessing its impact should contribute to a better characterization of the vegetation state, the surface fluxes (carbon and water) and the associated soil moisture, at both global and regional scales.
In a number of previous studies (Pauwels et al., 2007; Sabater et al., 2008; Albergel et al., 2010), the joint assimilation of near-surface soil moisture and LAI was considered in order to assess to what extent the use of both sources of information leads to an improvement of model results. They underlined the positive impact of assimilation on the simulated soil moisture, LAI and/or biomass. The latter two studies were conducted with the ISBA-A-gs model (Calvet et al., 1998), the CO2-responsive version of ISBA, by using simplified 2D-variational or filtering methods. They used the two-layer version of the model to represent soil processes.
This study is a preliminary evaluation at a local scale of the use of a retrieved soil moisture product based on ground observations, namely SWI, together with LAI in an LDAS. We use a Simplified Extended Kalman Filter (SEKF) scheme to incorporate both SWI and LAI data into the ISBA-A-gs model at the SMOSREX grassland site, in southwestern France. The period under investigation extends over seven years from 2001 to 2007, including a large range of climatic conditions. In contrast to previous similar studies, the three-layer version of the model is used (Boone et al., 1999).
The aim of this work is twofold. First, the use of a root-zone soil moisture product derived from the exponential filter method proposed by Wagner et al. (1999) and modified by Albergel et al. (2008), using in situ near-surface soil moisture data, is assessed. This product is expressed in terms of Soil Wetness Index (SWI) and defined as the profile soil moisture content. On the one hand, this new product may play the role of an "observed" root-zone product that can be assimilated directly into a model in order to improve the simulated soil moisture. It may overcome the modeling uncertainties related to the coupling mechanism between the surface and deep soil moisture reservoirs in land surface models for data assimilation (Kumar et al., 2009). On the other hand, the use of SWI provided by an exponential filter raises several questions that need to be addressed. One of them is associated with the difficulty of specifying the time length parameter of the exponential filter for producing SWI since, in theory, it should depend upon soil characteristics (sand and clay content, soil depth, etc.), whereas in practice it is set to a constant value. One should also be aware that the data produced by an exponential filter may have auto-correlated errors. The assimilation of in situ superficial soil moisture is considered as well. In this case, the assimilation exploits the connection between the surface and the root zone as described by the force-restore dynamics of ISBA-A-gs. In order to compare the performance of the LDAS when SWI and w_g data are used, the impact on the root-zone soil water content is evaluated against in situ w_2 measurements for both types of assimilated products.
Second, as the description of background and observation uncertainties is of high importance for an optimal data assimilation scheme, several choices of error definition for LAI were tested. Pauwels et al. (2007) used synthetic observations with different degrees of uncertainty in order to assess whether observations with a high error are still useful for assimilation. Their conclusion is that even with large uncertainties (1 m2 m−2 for LAI), observations are beneficial for the model simulation. Our approach was to gain insight into the uncertainty settings under realistic conditions. The objective was to calibrate these parameters to achieve the best possible filter performance. As a result, magnitude-dependent errors are proposed. The accurate description of the errors for both the background model and the data is hampered by many factors, such as deficiencies in the model representation of physical processes and uncertainties in retrieval procedures for measurements. The interest of using a posteriori diagnostics that may correct the misspecification of background and observation errors was underlined in the literature, e.g. Talagrand (1999), Desroziers and Ivanov (2001). Therefore, an a posteriori investigation of the analysis quality is performed in this study.
Observation data sets, the ISBA-A-gs land surface model and the data assimilation scheme are described in Sect. 2. In Sect. 3 the results are presented for a 7-year period and discussed for a number of configurations of the LAI assimilation. Several diagnostics are calculated in order to choose the background and observation errors to be used in the LDAS. Section 4 describes the assimilation of in situ superficial soil moisture. In Sect. 5 the impact of the joint assimilation of SWI and LAI on the carbon flux is presented. Finally, Sect. 6 discusses and summarizes the main conclusions of the study.

Data set
In this study, soil moisture data were obtained from the instrumentation installed at the Surface Monitoring Of the Soil Reservoir EXperiment (SMOSREX) site near Toulouse in southwestern France (De Rosnay et al., 2006) for a period of seven years from 2001 to 2007. Ground-based measurements of soil moisture were gathered with a half-hourly time step by using impedance sensors installed at different soil depths, from the soil surface (0–6 cm) to 90 cm. The observed w_g values were calculated by averaging surface soil moisture between 0 and 6 cm from four devices placed at four different locations of the SMOSREX site. The root-zone soil moisture observations were estimated by integrating the soil water content over a profile of 0.95 m. Measurements of root-zone soil moisture are not assimilated in the model, but are used for validation purposes.
Soil moisture values from surface measurements were converted into a Soil Wetness Index through the recursive exponential filter procedure described by Albergel et al. (2008). This approach was calibrated over the SMOSREX site, by scaling the near-surface soil moisture measurements with the minimum and maximum values of the w_g time series. These normalized values of near-surface soil moisture (SWI_g^o) were used to calculate the SWI product over a period of seven years. The exponential filter method converts the volumetric water content in the surface layer into SWI values using a tunable time scale parameter T. This parameter accounts for the most relevant processes that may affect the temporal variations of soil moisture. A time scale of T = 11 days was found suitable for the SMOSREX site (see Albergel et al., 2008, for a detailed description).
The recursive exponential algorithm takes into account a gain factor G that relates the past SWI estimates to the current observation for the superficial layer at time t in such a way that the influence of past measurements decreases:
$$\mathrm{SWI}^o(t) = \mathrm{SWI}^o(t_0) + G(t)\,\left[\mathrm{SWI}_g^o(t) - \mathrm{SWI}^o(t_0)\right] \tag{1}$$
$$G(t) = \frac{G(t_0)}{G(t_0) + e^{-(t - t_0)/T}} \tag{2}$$
where SWI^o represents the soil wetness index estimates and t_0 is the previous time. The result is a dimensionless value scaled between 0 (dry) and 1 (wet). As the exponential filter product may have time-correlated errors, the retrieved SWI is incorporated into the model once every three days, which reduces the temporal correlation of the data.
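The recursion can be illustrated with a short Python sketch. It assumes the gain form of the recursive filter of Albergel et al. (2008), an initial gain of 1 and an initialization of the filtered value with the first observation; the function and argument names are illustrative and are not taken from the study.

```python
import math


def exponential_filter(times, swi_surface, T=11.0, gain0=1.0):
    """Recursive exponential filter (illustrative sketch).

    times       : observation times in days, strictly increasing
    swi_surface : normalized near-surface soil moisture, in [0, 1]
    T           : characteristic time length in days
    gain0       : initial gain (ASSUMPTION: 1.0)
    Returns one filtered SWI value per observation.
    """
    if len(times) != len(swi_surface):
        raise ValueError("times and swi_surface must have the same length")
    if not times:
        return []
    # ASSUMPTION: the filter starts from the first surface observation.
    swi = [swi_surface[0]]
    gain = gain0
    for n in range(1, len(times)):
        dt = times[n] - times[n - 1]
        # Gain update: older observations lose weight as exp(-dt / T).
        gain = gain / (gain + math.exp(-dt / T))
        # Relax the previous estimate towards the new surface observation.
        swi.append(swi[-1] + gain * (swi_surface[n] - swi[-1]))
    return swi


if __name__ == "__main__":
    # Hypothetical daily series: a dry soil that is wetted on day 2.
    print(exponential_filter([0, 1, 2, 3, 4], [0.0, 0.0, 1.0, 1.0, 1.0]))
```

With T = 11 days, a step change in the surface wetness is followed only gradually by the filtered index, which is the intended smoothing behavior. Thinning the output in time (for instance to one value every three days) would be a separate step applied afterwards.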
The LAI of the SMOSREX grassland was measured frequently from spring to summer, but rather rarely during cold periods. A large dispersion of the observations was noticed for 2001-2002. Therefore, from January 2001 to July 2003, the LAI values were obtained from these measurements by using an interpolation method as in a number of previous studies, e.g. Sabater et al. (2008); Rüdiger et al. (2010); Albergel et al. (2010). For the remaining period until December 2007, the LAI data were retrieved from surface reflectance measurements following a method proposed by Roujean and Lacaze (2002). In order to be consistent with the sampling time of satellite data, the LAI measurements were assimilated every ten days.

Land surface model
In this study the experiments were conducted with the SURFEX modeling platform (Le Moigne et al., 2009) developed at Météo France. The simulations were performed in the off-line mode (no atmospheric coupling was used). The system was forced by the surface atmospheric variables provided by the SAFRAN (Système d'analyse fournissant des renseignements atmosphériques à la neige) mesoscale analysis system. The SAFRAN analysis provides hourly atmospheric forcing variables (precipitation, air temperature, air humidity, wind direction and speed, incident radiation) using information from more than 1000 meteorological stations and more than 3500 daily rain gauges throughout France. An optimal interpolation method is used to assign values for each analyzed variable on an 8 km grid over France.
SURFEX contains the land surface model ISBA-A-gs (Calvet et al., 1998; Gibelin et al., 2006), which was developed to allow the simulation of photosynthesis and the growth of vegetation with different biomass reservoirs. The vegetation biomass and LAI variables are governed by photosynthesis and evolve dynamically in response to weather and climate conditions. Namely, photosynthesis permits plant growth through the net assimilation of CO2, and a deficit of photosynthesis triggers higher mortality rates. A linear relationship between the active biomass (B_a) and Leaf Area Index is expressed as:
$$B_a = \alpha\,\mathrm{LAI} \tag{3}$$
where α may depend upon vegetation type, nitrogen supply and climate.
The three soil layer version of ISBA is used in this study (Boone et al., 1999). By including a third soil water reservoir in standard ISBA, it is possible to distinguish between the root zone and a base-flow layer. Soil moisture is represented by the near-surface soil moisture w_g (representative of the first soil centimeter), the root-zone soil moisture w_2 (over a soil depth of 0.95 m) and a soil moisture value w_3 in the recharge zone (0.5 m). The total soil depth is set to 1.45 m. Soil and vegetation parameters for the SMOSREX grassland site were taken from the ECOCLIMAP global database of soils and ecosystems (Masson et al., 2003), except for the soil depth in the root zone. Its value of 0.95 m was chosen in order to compare the observed and simulated soil moisture over the same soil depth. The values of the soil parameters used in this study, together with the maximum and minimum of the modeled soil moisture content in the root zone, are listed in Table 1.
Generally, the Soil Wetness Index is defined by a linear relation accounting for the limit conditions, namely the minimum and maximum volumetric soil moisture contents, denoted by w_min and w_max:
$$\mathrm{SWI} = \frac{w - w_{\min}}{w_{\max} - w_{\min}} \tag{4}$$
In the model, w_min and w_max are set to the wilting point (w_wilt) and to the volumetric field capacity (w_fc), respectively (see Table 1). Therefore, the standard definition of the Soil Wetness Index is:
$$\mathrm{SWI} = \frac{w - w_{\mathrm{wilt}}}{w_{\mathrm{fc}} - w_{\mathrm{wilt}}} \tag{5}$$
On the one hand, the SWI values computed using Eq. (5) can fall below 0 or exceed 1. Negative values represent soil water content below the wilting point (meaning that the plant roots cannot extract water from the soil). Values larger than 1 indicate wet soils (soil water content above the field capacity). On the other hand, the result of an exponential filter applied to superficial measurements is expressed in terms of a soil wetness fraction that ranges between 0 and 1 only. Therefore, for our data assimilation experiments, the background counterparts SWI^b are calculated by normalizing the root-zone soil moisture time series (as resulting from the model free run) with their maximum and minimum values over the whole period of seven years, denoted here by w_2^max and w_2^min:
$$\mathrm{SWI}^b = \frac{w_2 - w_2^{\min}}{w_2^{\max} - w_2^{\min}} \tag{6}$$
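The difference between the standard index and the min-max normalized background counterpart can be made concrete with a small Python sketch. The limit values and the soil moisture series below are hypothetical placeholders (they are not the values of Table 1), and the function names are illustrative.

```python
def soil_wetness_index(w, w_min, w_max):
    """Linear SWI between two limit moisture contents (m3 m-3)."""
    if w_max <= w_min:
        raise ValueError("w_max must be larger than w_min")
    return (w - w_min) / (w_max - w_min)


def normalize_series(series):
    """Min-max normalization of a time series to the range [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series is constant, cannot normalize")
    return [(value - lo) / (hi - lo) for value in series]


# ASSUMPTION: hypothetical wilting point and field capacity (m3 m-3).
W_WILT, W_FC = 0.17, 0.30

# ASSUMPTION: hypothetical modeled root-zone soil moisture (m3 m-3).
w2_model = [0.15, 0.20, 0.26, 0.32, 0.28]

# Standard definition: may leave the [0, 1] interval.
swi_standard = [soil_wetness_index(w, W_WILT, W_FC) for w in w2_model]

# Background counterpart: bounded by construction, like the filter output.
swi_background = normalize_series(w2_model)

if __name__ == "__main__":
    print(swi_standard)
    print(swi_background)
```

In this toy series the first value lies below the wilting point and the fourth above the field capacity, so the standard index leaves the unit interval while the normalized series does not. That is the reason for comparing the filter output with a normalized model series rather than with the standard index.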

Data assimilation scheme
In sequential data assimilation the system state estimate, given by a solution of the model equations, is updated at each time when measurements are available. This update is usually referred to as the analysis. The Extended Kalman Filter (EKF) is a sequential data assimilation method that has been used in a number of papers for land data assimilation applications (Walker and Houser, 2001; Sabater et al., 2007; Draper et al., 2009; Seuffert et al., 2004; Drusch et al., 2009; Albergel et al., 2010). They show that this filter can produce satisfactory estimates of soil moisture. The model equations are discretized according to:
$$\mathbf{x}_t = \mathcal{M}(\mathbf{x}_{t_0}) \tag{7}$$
Here, the forward operator is the land surface scheme ISBA-A-gs, denoted by $\mathcal{M}$. This operator computes the time evolution of the control vector x = (w_2, B_a), which contains the root-zone soil moisture and the active biomass at time t given their values at the previous time. An observation operator $\mathcal{H}$ maps the state vector x into the observation space y^o. Equations (3) and (6) provide the link between the simulated observations and control variables:
$$\mathbf{y}_t = \mathcal{H}(\mathbf{x}_t) \tag{8}$$
The Extended Kalman Filter (EKF) uses the full nonlinear model to propagate the state estimate, but uses a local linearization of the dynamics to propagate the state uncertainty, that is, the error covariance matrix. A finite difference method is used to linearize the forecast model, as well as the observation operator, by performing model integrations with perturbed initial values of the state vector. The EKF scheme was described by Mahfouf et al. (2009) and used for the assimilation of near-surface soil moisture by several authors, e.g. Draper et al. (2009), Albergel et al. (2010).
The EKF calculation of the analysis increment (Δx_t) at time t when an observation is available is given by:
$$\Delta\mathbf{x}_t = \mathbf{K}\,\left[\mathbf{y}^o_t - \mathcal{H}(\mathbf{x}_t)\right] \tag{9}$$
where K represents the Kalman gain calculated by using the assumed diagonal covariance matrices of the background (B) and observation (R) errors as in the following expression:
$$\mathbf{K} = \mathbf{B}\mathbf{H}^T\left(\mathbf{H}\mathbf{B}\mathbf{H}^T + \mathbf{R}\right)^{-1} \tag{10}$$
Here $\mathbf{H}$ is the Jacobian matrix of the linearized observation operator $\mathcal{H}$. In the EKF formulation B_t is obtained by propagating the error covariance matrix from the previous time t_0 to the observation time t through the Jacobian matrix $\mathbf{M}$ of the forward model:
$$\mathbf{B}_t = \mathbf{M}\,\mathbf{B}_{t_0}\,\mathbf{M}^T \tag{11}$$
In this study, we assume a static behavior of the background error matrix B, which is considered constant at the beginning of each analysis step. This assumption is based on the fact that the increase in the background error during each forward propagation step is balanced by the decrease of the error through the previous analysis step. Moreover, the results obtained by Sabater et al. (2007) suggest that the analysis [...]
Several configurations of LAI background (and observation) error were tested. Figure 2 summarizes the setups of five experiments where the value of σ_LAI is defined as a function of LAI. In the first experiment (option 1), the background (observation) error is set to 20 % of the LAI value. This rather empirical option was used by Jarlan et al. (2008) and Rüdiger et al. (2010), as they underlined the need for a variable error definition. The next three options are represented by a constant error for LAI values less than 1, 2 and 3 m2 m−2, respectively. For values larger than these quantities, σ_LAI is proportional to the modeled (observed) LAI, as in option 1. The last experiment takes into account the configuration proposed by Sabater et al. (2008) with an overall constant std error of 1 m2 m−2. Also, it was assumed that the LAI observation and background errors are equal.
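A minimal Python sketch of one analysis step with magnitude-dependent LAI errors is given below. It assumes the standard Kalman gain K = B H^T (H B H^T + R)^-1 for a two-variable problem and, for options 2 to 4, a constant error level equal to 20 % of the threshold so that σ_LAI is continuous (the constant levels themselves are only shown in Fig. 2 and are an assumption here). All names and numerical values are hypothetical.

```python
def sigma_lai(lai, option):
    """LAI error standard deviation (m2 m-2) for the five setups."""
    if option not in (1, 2, 3, 4, 5):
        raise ValueError("option must be an integer from 1 to 5")
    if option == 5:
        return 1.0                     # overall constant error
    sigma = 0.2 * lai                  # option 1: 20 % of the LAI value
    if option > 1:
        threshold = float(option - 1)  # 1, 2 or 3 m2 m-2
        # ASSUMPTION: the constant level below the threshold is 20 % of it.
        sigma = max(sigma, 0.2 * threshold)
    return sigma


def matmul(a, b):
    """Product of two small matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def kalman_gain(B, H, R):
    """K = B H^T (H B H^T + R)^-1 for 2x2 matrices."""
    Ht = transpose(H)
    S = matmul(matmul(H, B), Ht)
    S = [[S[i][j] + R[i][j] for j in range(2)] for i in range(2)]
    return matmul(matmul(B, Ht), inverse_2x2(S))


def analysis_increment(K, y_obs, y_model):
    """Increment K (y_obs - y_model) added to the background state."""
    d = [o - m for o, m in zip(y_obs, y_model)]
    return [sum(K[i][j] * d[j] for j in range(len(d))) for i in range(len(K))]


if __name__ == "__main__":
    # Hypothetical values; the state is written as (SWI, LAI) for simplicity.
    swi_model, lai_model = 0.60, 1.5
    swi_obs, lai_obs = 0.45, 2.4
    B = [[0.05 ** 2, 0.0], [0.0, sigma_lai(lai_model, 3) ** 2]]
    R = [[0.05 ** 2, 0.0], [0.0, sigma_lai(lai_obs, 3) ** 2]]
    H = [[1.0, -0.01], [0.3, 1.0]]  # hypothetical Jacobian
    K = kalman_gain(B, H, R)
    print(analysis_increment(K, [swi_obs, lai_obs], [swi_model, lai_model]))
```

With equal background and observation errors and an identity Jacobian, the gain reduces to one half for each variable, so the analysis lies midway between model and observation. The error options therefore matter mainly through the relative size of the two errors at low LAI.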
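In order to quantify the assimilation performance, the root-mean-square (rms) error is computed using all available data (daily LAI and SWI observations). The impact I of the assimilation with respect to the model is calculated as:
$$I = \frac{\mathrm{rms}_{\mathrm{model}} - \mathrm{rms}_{\mathrm{analysis}}}{\mathrm{rms}_{\mathrm{model}}} \times 100 \tag{12}$$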
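As a minimal sketch, assuming that the impact I is the relative reduction of the rms error expressed in percent (an assumption that reproduces the LAI scores reported in this study), the score can be computed as follows. The function names are illustrative.

```python
import math


def rms_error(simulated, observed):
    """Root-mean-square difference between two equally long series."""
    if len(simulated) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squares = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squares) / len(squares))


def assimilation_impact(rms_model, rms_analysis):
    """Relative rms error reduction in percent (positive = improvement)."""
    if rms_model <= 0.0:
        raise ValueError("rms_model must be positive")
    return 100.0 * (rms_model - rms_analysis) / rms_model


if __name__ == "__main__":
    # LAI rms errors quoted in the text: 0.98 (model) and 0.40 (option 3).
    print(round(assimilation_impact(0.98, 0.40), 1))
```

With the LAI rms errors of 0.98 and 0.40 m2 m−2 quoted for the model and for option 3, this gives about 59 %, in line with the value listed for option 3 in Table 2.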

Diagnostic on background and observation errors
The performance of an analysis scheme depends on appropriate statistics for background and observation errors. Wrongly specified error parameters may negatively affect the analysis.
One source of information relies on the statistics of the innovations (observations-minus-background) and can be viewed as an a priori diagnostic. This approach was extensively investigated in the literature (Hollingsworth and Lönnberg, 1986; Andersson, 2003; Mahfouf et al., 2007). Several authors have proposed a posteriori verification based on statistics of observations-minus-analysis (Talagrand, 1999; Desroziers and Ivanov, 2001) that potentially provide an additional consistency test of an assimilation scheme.
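For diagnosis purposes the following quantities are computed: 1. the differences [...]
The diagnosed values of the background (σ_i^f) and observation (σ_i^o) error variances may be computed a posteriori as in the following formulas:
$$(\sigma_i^f)^2 = \frac{1}{n_i}\sum_{i=1}^{n_i}\left(y_i^a - y_i^f\right)\left(y_i^o - y_i^f\right), \qquad (\sigma_i^o)^2 = \frac{1}{n_i}\sum_{i=1}^{n_i}\left(y_i^o - y_i^a\right)\left(y_i^o - y_i^f\right) \tag{13}$$
where n_i is the number of measurements, y_i^o is the value of the i-th observation, and y_i^f, y_i^a represent their forecast and analysis counterparts, respectively.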
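A hedged Python sketch of such an a posteriori diagnostic is given below. It assumes the widely used observation-space consistency estimates, in which the background error variance is the mean product of analysis increments and innovations and the observation error variance is the mean product of residuals and innovations. The function name and the test values are hypothetical.

```python
def diagnose_error_variances(y_obs, y_forecast, y_analysis):
    """Diagnosed background and observation error variances.

    y_obs      : observations
    y_forecast : model forecast (background) in observation space
    y_analysis : analysis in observation space
    Returns (background variance, observation variance).
    """
    n = len(y_obs)
    if n == 0 or len(y_forecast) != n or len(y_analysis) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    triples = list(zip(y_obs, y_forecast, y_analysis))
    # (analysis - forecast) * (observation - forecast)
    var_bkg = sum((a - f) * (o - f) for o, f, a in triples) / n
    # (observation - analysis) * (observation - forecast)
    var_obs = sum((o - a) * (o - f) for o, f, a in triples) / n
    return var_bkg, var_obs


if __name__ == "__main__":
    # Hypothetical scalar case with a gain of 0.5 (equal specified errors).
    forecast = [0.5, 0.5, 0.5]
    obs = [0.7, 0.3, 0.9]
    analysis = [f + 0.5 * (o - f) for o, f in zip(obs, forecast)]
    print(diagnose_error_variances(obs, forecast, analysis))
```

By construction the two diagnosed variances add up to the mean squared innovation. Comparing each of them with the specified variance indicates whether the corresponding error was set too large or too small; the estimates can be negative for small samples, so the square root should only be taken after checking the sign.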

Modeled soil moisture and Leaf Area Index
The temporal behavior of the modeled root-zone soil moisture w_2 illustrated in Fig. 3, bottom panel (blue line), for the 7-year period at SMOSREX, shows that the inter-annual cycles of w_2 are reasonably well reproduced. However, the model slightly underestimates the soil moisture data during winter and spring, and largely overestimates the observed values of w_2 in summer and autumn. There are significant differences between the magnitude of observed and simulated soil moisture from 2003 to 2007.
The model is able to simulate the vegetation growth and senescence in response to meteorological conditions (Fig. 4, blue line). In summer, low soil water contents are well correlated with reduced active biomass. In the ISBA-A-gs simulations, the start of the growing season tends to occur later than in the observations (as was noticed by Brut et al., 2009, comparing the model to satellite data), with a lag of about one month. Similarly, the summertime senescence phase may be delayed, especially in the first three years.
In 2001, the majority of precipitation occurred in spring, whereas in 2002 large amounts of rainfall were observed later during the summer (humid and cool summer). Also, the spring of 2007 was characterized by unusually high precipitation in southern Europe. In relation to these wet conditions, the LAI maximum is highly overestimated by the model for these three years. For the remaining periods, despite the temporal shift, the magnitude of the modeled LAI is consistent with the observed LAI values. In contrast, the years 2003 and 2004 were very dry, accelerating the vegetation mortality during summertime. In particular, the unusual lack of precipitation in spring 2003 caused an early stress of the vegetation. The senescence occurred early (in June), resulting in the smallest LAI amplitude cycle over the 7-year period.
A second yearly LAI maximum, caused by a re-growth of the vegetation, was observed for several years, with rather high values in 2003 and 2005. In 2003 the model is not able to reproduce the vegetation re-growth. In contrast, in the autumns of 2005 and 2006 the model captures the re-growth of the vegetation in response to rainfall events which occurred at the end of the summer.

Jacobian estimates
The examination of the Jacobian matrices is important for understanding the data assimilation performance. The evolution of the background error covariance matrix by the forward model is performed through its Jacobian matrix (Eq. 11), while the Jacobian of the observation operator (H) is required to calculate the Kalman gain (Eq. 10).
For the soil moisture component of the state vector, perturbations of a 10−4 × (w_fc − w_wilt) magnitude were used to estimate the tangent linear model M as well as the Jacobian H. Several studies have shown that these very small perturbations lead to good approximations of the linear behavior. The dynamics of the model as captured by the term ∂w_2(t)/∂w_2(t_0) of the tangent linear model were analyzed extensively by Draper et al. (2009).
For LAI, values of 10−3, corresponding to LAI perturbations of about 0.003 m2 m−2, were used to compute the Jacobians following the sensitivity study performed by Rüdiger et al. (2010). In the latter study the structure of the ∂LAI(t) [...]
Figure 5 shows that, generally, the ∂LAI(t)/∂w_2(t_0) Jacobian term has positive values. However, zero values and slightly negative values are also found. Very small negative Jacobian values (10−3) have a relative frequency of 30.2 %. Generally, they occur during the winter season. The soil water content exceeds the field capacity in about 80 % of these cases. Two types of positive values can be distinguished: positive and strongly positive. For w_2 values above the wilting point, the water perturbations directly impact photosynthesis and plant growth, and an increase in soil moisture triggers an increase in biomass production. Large Jacobian values (larger than 5), which represent 0.38 % of the population, correspond to periods of water stress. Under the limit condition when w_2 approaches the wilting point, small increases in w_2 may cause a large increase in biomass production. When the Jacobian values are strictly zero (occurrence of 14.8 %), there is no sensitivity of LAI to soil moisture. The histogram of w_2 corresponding to zero Jacobian values (not shown) presents a bimodal probability density function. The two modes correspond to periods of severe drought (when w_2 < w_wilt) or water excess (when w_2 > w_fc). These periods coincide with the senescence phase or with low vegetation growth at wintertime, respectively. Zero Jacobian values also occur when the LAI reaches its prescribed minimum threshold value of 0.30 m2 m−2.
The term ∂SWI(t)/∂LAI(t_0) is dominated by plant-transpiration processes. Positive LAI perturbations during either growing or re-growing vegetation phases cause enhanced plant transpiration and a higher water extraction rate. This results in a reduction of soil water content, and negative values of this Jacobian term are found (not shown).
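The finite-difference estimation of the Jacobians discussed in this section can be sketched in Python as follows. The toy model is a hypothetical linear stand-in for one integration of the land surface model between t_0 and t, and the wilting point and field capacity are placeholder values; only the perturbation sizes follow the text.

```python
def finite_difference_jacobian(model, x0, perturbations):
    """One-sided finite-difference Jacobian, J[i][j] = d model_i / d x_j.

    model         : function mapping a state list to an output list
    x0            : reference state
    perturbations : one perturbation size per state variable
    """
    base = model(x0)
    jac = [[0.0] * len(x0) for _ in base]
    for j, dx in enumerate(perturbations):
        x_pert = list(x0)
        x_pert[j] += dx                 # perturb one variable at a time
        out = model(x_pert)             # one extra model run per variable
        for i in range(len(base)):
            jac[i][j] = (out[i] - base[i]) / dx
    return jac


def toy_model(x):
    """HYPOTHETICAL linear stand-in for the land surface model."""
    w2, lai = x
    # More leaves dry the soil slightly; wetter soil favors growth.
    return [0.98 * w2 - 0.002 * lai, 1.5 * w2 + 0.95 * lai]


# ASSUMPTION: placeholder wilting point and field capacity (m3 m-3).
W_WILT, W_FC = 0.17, 0.30

x_ref = [0.25, 3.0]                         # (w_2, LAI), hypothetical
dx_w2 = 1e-4 * (W_FC - W_WILT)              # soil moisture perturbation
dx_lai = 1e-3 * x_ref[1]                    # about 0.003 m2 m-2 for LAI = 3
jacobian = finite_difference_jacobian(toy_model, x_ref, [dx_w2, dx_lai])

if __name__ == "__main__":
    for row in jacobian:
        print(row)
```

Each column of the Jacobian costs one additional model integration, which is why a two-variable control vector keeps the scheme affordable. For a nonlinear model the result depends on the reference state, which is what produces the distribution of Jacobian values described above.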

Joint assimilation of LAI and SWI
In order to illustrate how the assimilation procedure performs, time series of modeled, observed and assimilated LAI are depicted in Fig. 4 for several error specifications used in this study. The main differences between the options are observed for the years 2003 and 2004, in the period of vegetation re-growth (September-October), when the model tends to largely underestimate the observed LAI. Using a std error proportional to LAI values (option 1), the filter is able to reduce the difference between the model and the measurements (Fig. 4, top panel). When the other options are used (for example, option 3, middle panel, and option 5, bottom panel in Fig. 4), the filter becomes less confident in the model simulation when the modeled LAI is low. Consequently, measured LAI values higher than simulated LAI values have more weight in calculating the Kalman gain and the analysis is closer to the observations. Between two assimilation cycles, when no observation is available, the plant growth cannot be maintained. The trajectory is systematically drawn back towards low model values, even though there is no strong soil water constraint in the root zone. This suggests that other mechanisms (such as the response to light or to temperature) play a role in the vegetation re-growth. This should be taken into account in order to improve model results persistently. Moreover, conflicting information coming from the LAI and soil moisture data streams (e.g. an increase in LAI while the model has reached a completely dry state) may occur. The filter can balance the influence of the opposing tendencies according to the assumed errors of each component of the assimilation, but cannot correct a systematic bias.
At summertime, a decrease of the updated SWI component corresponding to a reduction of soil moisture (Fig. 3, top panel) accelerates the vegetation mortality (Fig. 4, top panel). For example, from June to August 2003, the positive bias in the modeled SWI is reduced by half, on average, by the assimilation. For the same period, the bias in the LAI values is significantly reduced, as the increased water stress enhances the vegetation mortality. Also, a significantly lower updated SWI in June 2004 causes a higher rate of vegetation mortality during the following months of July and August. This is beneficial to the analyzed LAI, which is now closer to the observations. Hence the assimilation acts in a coherent manner by reducing the LAI towards the low observations. The senescence season is not the only period that benefits from the assimilation. The delay at the start of the growing season is corrected by the filter from 2004 to 2007. In 2003, the measured LAI peak of about 3 m2 m−2 occurs in May, while the model predicts a lower peak value in June. Though the filter is not effective in increasing the LAI maximum, the delay is slightly reduced (Fig. 4). The same behavior is noticed in 2007. The simulated LAI maximum occurs in July when the modeled water stress becomes important. After the assimilation, the peak is shifted one month back.
The convergence of the algorithm with different choices of the error std was investigated. The daily background and analysis departures were used in order to calculate the rms error. Figure 6 shows the rms error averaged over the 7-year period. The model LAI rms error is 0.98 m2 m−2. Much lower values are achieved with all the analyses, and the lowest rms error (0.40 m2 m−2) is obtained with option 3.
Table 2 gives the assimilation impact I on the LAI component in percent (see Eq. 12) for each year as well as for the whole period. For the first two years the annual performance of the assimilation is larger when using option 1, possibly due to the different treatment in processing the observed data. For the remaining period a constant improvement is observed when moving from option 1 to the other options. A LAI improvement of I = 53.8 % over the whole period is obtained by using option 1, while choosing either option 3 or 4 gives a larger improvement of 59.1 % and 54.5 %, respectively.
Regarding the soil moisture scores, the root-mean-square error and bias computed for SWI and root-zone soil moisture are listed in Table 4 for each year and for the whole period 2001-2007. The assimilation of SWI significantly reduces the bias between the model and the retrieved SWI (Fig. 3, top panel) as well as the rms error, from 0.091 to 0.023. The rms error calculated for the root-zone soil moisture before and after assimilation of SWI decreases from 0.042 to 0.036 m3 m−3. This results in a substantial correction of around I = 13.4 % of the root-zone soil moisture towards the measurements when compared to the model simulations over the 7-year period. The annual bias in the root zone is also reduced, except for the first two years. In autumn 2001, an important increase of the wet bias is noticed (Fig. 3, bottom panel). During this period, very low LAI values (less than 0.5 m2 m−2) were assimilated and the updated LAI was close to these observations. This causes lower plant transpiration, which results in an increase of the soil water content in the root zone.
These results are obtained when using option 3 for the LAI error specification. No significant sensitivity of soil moisture to the different choices of LAI error was found.

Diagnostic results
Figure 7 shows the histograms of the innovation and residual distributions for SWI and LAI. For SWI, a Gaussian least-squares estimate of the innovation mean and variance from a sample of 847 members gives a wet bias (µ = −0.012) with a std of σ = 0.09. If the background and observation errors are uncorrelated and normally distributed, the variance of the innovations equals the sum of the observation and background variances (Andersson, 2003). Here, one can notice that the errors chosen for the observations and for the background are not consistent with the statistics of the innovations. The LAI innovations are left-tailed and flatter than a normal distribution (Fig. 7, bottom). As expected, the std is reduced from 0.96 for the innovations to 0.29 for the residuals.
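The consistency check described above can be sketched as follows, assuming uncorrelated errors so that the expected innovation variance is the sum of the observation and background variances (the latter expressed in observation space). The innovation std of 0.09 is taken from the text; the two specified error values and the function names are hypothetical placeholders.

```python
import math


def expected_innovation_std(sigma_obs, sigma_bkg_obs_space):
    """Innovation std implied by uncorrelated observation and background errors."""
    return math.hypot(sigma_obs, sigma_bkg_obs_space)


def consistency_ratio(innovation_std, sigma_obs, sigma_bkg_obs_space):
    """Ratio of sampled to expected innovation std (1 means consistent)."""
    return innovation_std / expected_innovation_std(sigma_obs, sigma_bkg_obs_space)


# 0.09 is the sampled SWI innovation std; 0.12 and 0.09 are hypothetical
# specified errors used only to illustrate the check.
ratio = consistency_ratio(0.09, sigma_obs=0.12, sigma_bkg_obs_space=0.09)
print(f"sampled / expected innovation std = {ratio:.2f}")
```

A ratio well below 1 suggests that the specified errors are too large overall, while a ratio above 1 suggests that they are too small.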
A posteriori diagnostics (see Eq. 13) were computed for LAI by using the analysis outputs corresponding to each choice of the error. Seasonal diagnostics were produced for both background and observation errors in all cases (see Figs. 8 and 9). The background error is overestimated for all options and for all seasons, except during wintertime for the first two options when the specified error is larger than the diagnosed error. Among the other options, option 3 shows a smaller mismatch. A large discrepancy between the specified and the estimated observation error is noticed, for example, in winter and spring for the first two options. This shows that too much confidence is given to the observations at the start of the growing period. In option 3, these differences are reduced, showing a better agreement between specified and estimated observation errors.
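A minimal scalar sketch of this kind of a posteriori diagnostic is given below. Eq. (13) is not reproduced in this section, so the sketch assumes the common form in which the background error variance is estimated from the mean product of analysis increments and innovations, and the observation error variance from the mean product of analysis residuals and innovations. The synthetic data, the error values and the function name `diagnose_errors` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
sigma_b_true, sigma_o_true = 0.2, 0.4            # hypothetical true error stds
background = rng.normal(0.0, sigma_b_true, n)    # truth is 0 everywhere
obs = rng.normal(0.0, sigma_o_true, n)


def diagnose_errors(background, obs, sigma_b, sigma_o):
    """Diagnose error stds from analyses built with the *specified* errors."""
    gain = sigma_b**2 / (sigma_b**2 + sigma_o**2)   # scalar Kalman gain, H = 1
    innovation = obs - background                    # observation minus background
    analysis = background + gain * innovation
    increment = analysis - background                # analysis minus background
    residual = obs - analysis                        # observation minus analysis
    diag_b = np.sqrt(np.mean(increment * innovation))
    diag_o = np.sqrt(np.mean(residual * innovation))
    return diag_b, diag_o


# Correctly specified errors: the diagnostics return them (up to sampling noise).
diag_b, diag_o = diagnose_errors(background, obs, sigma_b_true, sigma_o_true)

# Observation error specified twice too large: the diagnosed value is lower.
_, diag_o_mis = diagnose_errors(background, obs, sigma_b_true, 2 * sigma_o_true)
print(diag_b, diag_o, diag_o_mis)
```

In this toy setting, a diagnosed value below the specified one indicates an overestimated error, which is the type of mismatch discussed above.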
As a retrieved soil moisture product may be subject to poorly known errors, the same diagnostics were calculated for the SWI observations and the soil moisture state variable. The diagnosed values show that the SWI observation error is strongly overestimated (by around 68 %), while the background error of w2 is overestimated by 25 % (Fig. 10).
The new diagnosed values of the error std are 0.03 for SWI and 0.015 m3 m−3 for w2. They lead to a better match with the innovation statistics (not shown).
Next, a new joint data assimilation experiment, called the diagnostic experiment, was performed by replacing the initial background error of soil moisture and the SWI observation error with their diagnosed values. For LAI, the model and observation errors were kept as in option 3. Table 3 compares the performance of the two experiments, initial and diagnostic. The impact of the new experiment on the LAI variable is almost the same. The assimilation impact for SWI increases from 63.8 % to 72.3 %, as the SWI observations are now assumed to be more accurate. However, using the diagnosed background and observation errors results in the same impact of 13.4 % on the soil water content. Indeed, improving the performance of the system with respect to the SWI component does not necessarily provide a better result in terms of w2. The explanation lies in the definition of the observation operator. Because the minimum and maximum values of soil water simulated by the model differ from those observed, a systematic bias may exist between the model and the observations that is not corrected through data assimilation.

Assimilation of superficial soil moisture
As mentioned in the introduction, the assimilation of superficial soil moisture data has already been extensively discussed in the literature. In contrast to the assimilation of SWI, the near-surface soil moisture increments are propagated to the deeper layers by the model. The performance of the assimilation depends on how the model transfers the information from the surface to the root zone.
In this study, in situ superficial soil moisture data were assimilated with a frequency of one observation every three days at 06:00 UTC. Automatic measurements are provided with a mean volumetric error std of 0.03 m3 m−3. In order to take into account the representativeness error, a larger error std of 0.04 m3 m−3 was used in this experiment. The state vector consists of root-zone soil moisture and LAI, as in the previous experiments. Together with superficial soil moisture, LAI data are assimilated using option 3 for the error specification.
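The analysis step for this two-variable state can be sketched with the standard Kalman update, assuming a linearized observation operator H that is simply prescribed here (in practice it has to be estimated from the model). Only the surface soil moisture observation error std of 0.04 m3 m−3 comes from the text; the Jacobian, the background errors, the LAI observation error and all state values are hypothetical.

```python
import numpy as np


def kalman_update(xb, y, hxb, H, B, R):
    """Analysis xa = xb + K (y - h(xb)) with K = B H^T (H B H^T + R)^-1."""
    S = H @ B @ H.T + R                  # innovation covariance
    K = B @ H.T @ np.linalg.inv(S)       # gain
    return xb + K @ (y - hxb)


xb = np.array([0.25, 2.0])               # background state: w2 (m3 m-3), LAI (m2 m-2)
hxb = np.array([0.24, 2.0])              # model equivalents of wg and LAI (hypothetical)
y = np.array([0.22, 1.2])                # observed wg and LAI (hypothetical)
H = np.array([[0.5, 0.0],                # hypothetical sensitivity of wg to w2
              [0.0, 1.0]])               # LAI is observed directly
B = np.diag([0.02**2, 1.0**2])           # hypothetical background error variances
R = np.diag([0.04**2, 0.5**2])           # 0.04 from the text; 0.5 is hypothetical

xa = kalman_update(xb, y, hxb, H, B, R)
print(xa)
```

With these numbers the surface observation only weakly corrects w2, because the assumed sensitivity of wg to w2 is small compared with the observation error. This illustrates why the transfer of information from the surface to the root zone depends on the model coupling.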
Data assimilation techniques are designed to correct random errors in the model and rely on the assumption of unbiased background and observations. However, model simulations and data typically differ, which may cause large systematic discrepancies in soil moisture climatologies. Several authors have pointed to the need to rescale the information before assimilation (Reichle and Koster, 2004; Drusch et al., 2005; Crow et al., 2005). In this study, the bias between the wg data and the model output was removed by using Cumulative Distribution Function (CDF) matching, as proposed by Reichle and Koster (2004), over the 7-year period. The cumulative distribution of the difference between the model and the observations is plotted against the observations in Fig. 11 where, for example, very wet wg observations induce a negative bias. A 7th-order polynomial is used to calibrate this ranked distribution.
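The rescaling step can be sketched as follows, assuming paired samples of equal length: both series are ranked, the ranked model-minus-observation differences are fitted against the ranked observations with a 7th-order polynomial, and the fitted correction is added to each observation. The synthetic data and the function names are hypothetical, and the actual processing used in the study may differ in its details.

```python
import numpy as np
from numpy.polynomial import Polynomial


def fit_cdf_matching(obs, model, degree=7):
    """Fit a polynomial to the ranked model-minus-observation differences."""
    obs_sorted = np.sort(obs)
    model_sorted = np.sort(model)
    # Polynomial.fit rescales the abscissa internally, which keeps a
    # 7th-order fit well conditioned for values in a narrow range.
    return Polynomial.fit(obs_sorted, model_sorted - obs_sorted, degree)


def rescale(obs, correction):
    """Map observations into the model climatology."""
    return obs + correction(obs)


# Synthetic surface soil moisture samples (m3 m-3) with different climatologies.
rng = np.random.default_rng(42)
obs = 0.05 + 0.30 * rng.beta(2.0, 3.0, size=2000)
model = 0.10 + 0.25 * rng.beta(3.0, 2.0, size=2000)

correction = fit_cdf_matching(obs, model)
obs_rescaled = rescale(obs, correction)
print(obs_rescaled.mean() - model.mean(), obs_rescaled.std() - model.std())
```

After rescaling, the mean and spread of the observations are close to those of the model, so that the assimilation acts on the anomalies rather than on the climatological bias.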
Similar to Table 4, Table 5 shows the annual statistical scores in terms of rms error and bias computed for the soil moisture components in the surface layer and in the root zone. Three cases may be distinguished: (1) a reduction of the negative bias in wg causing an increase of the positive bias in w2 (2001-2002), (2) a reduction of the positive bias in wg together with a decrease of the positive bias in w2 (2003, 2004 and 2007) and (3) an increase of the negative bias in wg together with a decrease of the positive bias in w2 (2005 and 2006). The assimilation of wg improves the root-zone soil moisture by about 7.9 % over the whole period, which is lower than the improvement obtained by assimilating SWI (13.4 %). The two analyzed w2 show a comparable behavior in spring, summer and autumn, with a wet bias during the latter season for both estimates. In response to a significant correction of the large negative (dry) bias in wg in October (see Fig. 13 in conjunction with Table 5), the existing positive (wet) bias in w2 is increased in November. Consequently, a larger w2 value is estimated and the w2 updated through the assimilation of surface observations diverges from the much lower model trajectory. This divergence has an overall detrimental impact on the statistics for 2001 and negatively influences the soil moisture evolution at the beginning of 2002. Sabater et al. (2008) noticed a similar degradation of the w2 analysis for this period. Under unusual conditions (such as the long dry period from September to December 2001), the assimilation of surface soil moisture may be problematic. This reveals the weakness of using a limited number of soil layers with large differences in layer thickness. When SWI observations are assimilated, the analyzed w2 does not diverge, although the bias and the rms error also increase (see Table 4 for the year 2001).
In 2003, the analyzed w2 derived from the assimilation of wg is generally closer to the observations than the analyzed w2 derived from the assimilation of SWI (Fig. 12, bottom right panel). During the unusually dry summer, very low volumetric w2 values are observed and the assimilation of SWI is not able to represent this phenomenon. During June and July, when w2 is constantly below the wilting point, the assimilation presents a saturation regime due to the imposed minimum threshold in the definition of SWI. After the severe drought period at the beginning of the summer, precipitation occurs in August. By assimilating SWI, the soil water content is rapidly shifted to rather wet conditions, which tends to degrade the simulation of w2. It seems that in such extremely dry conditions, the exponential filter is quite sensitive to changes in superficial soil moisture. On the other hand, the assimilation of wg data does not cause a large discrepancy in w2. The very poor statistical scores (Table 5) for wg, in contrast to the better scores for w2, may be explained by the weak vertical coupling of the model during marked drought periods (Kumar et al., 2009). Albergel et al. (2010) have assimilated LAI and wg in ISBA-A-gs for the SMOSREX grassland. Although they used a different soil model (2 layers instead of 3) and different background and observation errors, they obtained similar scores on average over the 2001-2007 period.
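To illustrate how an exponential filter responds to a change in superficial soil moisture, the sketch below uses a commonly used recursive formulation in which the filtered value is relaxed towards each new surface observation with a gain that depends on the time since the previous observation. The characteristic time length `T`, the input series and the function name are hypothetical, and the exact filter configuration used in this study is not reproduced here.

```python
import math


def exponential_filter(times, surface_sm, T=10.0):
    """Recursive exponential filter; times in days, T is a characteristic time (days)."""
    filtered = [surface_sm[0]]           # initialise with the first observation
    gain = 1.0
    for n in range(1, len(surface_sm)):
        dt = times[n] - times[n - 1]
        gain = gain / (gain + math.exp(-dt / T))
        filtered.append(filtered[-1] + gain * (surface_sm[n] - filtered[-1]))
    return filtered


# Hypothetical normalized surface soil moisture: dry spell followed by rain.
times = [0, 3, 6, 9, 12, 15, 18]
surface = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
root_zone_swi = exponential_filter(times, surface)
print([round(v, 2) for v in root_zone_swi])
```

In this toy example the filtered index starts rising as soon as the surface becomes wet, which is the kind of rapid shift towards wet conditions described above for August 2003.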

Effect of data assimilation on modeled carbon dioxide fluxes
The evolution of LAI is based on the biomass production due to the photosynthetic process. The photosynthesis module of ISBA-A-gs estimates the vegetation net CO2 assimilation, from which the biomass and LAI are predicted. Figure 14 illustrates the coherent impact of LAI updates on the carbon flux for the year 2007. Increased LAI values in the growing season (March-April) due to data assimilation corrections (top panel) trigger an increased photosynthetic activity (bottom panel). In the same manner, lower LAI values corresponding to the mortality phase (July-September) cause a decrease in the CO2 uptake when compared to the model simulations.
In order to quantify the contribution of the data assimilation to the fluxes, measurements of net CO2 flux, or Net Ecosystem Exchange (NEE), and of latent and sensible heat fluxes were gathered at the SMOSREX site for three years, from 2005 to 2007. The CO2 flux data were filtered using three criteria: wind direction (between 225 and 315°), absence of water deposition and a site-dependent threshold of friction velocity (larger than 0.16 m s−1) that accounts for a sufficient turbulent exchange (Albergel et al., 2010). The flux observations are averaged over 30 min, corresponding to the interval of model outputs. A total of 1609, 1790 and 2469 half-hourly observations are used for 2005, 2006 and 2007, respectively. In terms of rms error, an improvement of around 5 % is noticed for each year. For example, the rms error decreases from 4.25 to 4.01 µmol CO2 m−2 s−1 for 2006, while the correlation remains high and the bias is reduced (as listed in Table 6). For 2005 and 2007, the assimilation improves the rms and correlation scores, but not the bias. The soil moisture and LAI analysis has a limited impact on surface energy fluxes (sensible and latent heat fluxes) (not shown).
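The filtering and scoring steps described above can be sketched as follows. The thresholds (225 to 315° for wind direction, 0.16 m s−1 for friction velocity) come from the text; the array names, the sample values, the boolean flag for water deposition and the sign convention for the bias (model minus observation) are assumptions.

```python
import numpy as np


def filter_and_score(obs, sim, wind_dir, ustar, wet):
    """Apply the three quality criteria, then compute rms error, bias and correlation."""
    keep = (wind_dir >= 225.0) & (wind_dir <= 315.0) & (ustar > 0.16) & ~wet
    o, s = obs[keep], sim[keep]
    err = s - o
    return {
        "n": int(keep.sum()),
        "rmse": float(np.sqrt(np.mean(err**2))),
        "bias": float(np.mean(err)),
        "r": float(np.corrcoef(o, s)[0, 1]),
    }


# Hypothetical half-hourly samples (fluxes in umol CO2 m-2 s-1).
obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sim = np.array([1.5, 2.5, 2.5, 9.0, 4.0])
wind_dir = np.array([230.0, 300.0, 250.0, 100.0, 310.0])   # degrees
ustar = np.array([0.2, 0.3, 0.1, 0.5, 0.2])                 # m s-1
wet = np.array([False, False, False, False, True])          # water deposition flag

scores = filter_and_score(obs, sim, wind_dir, ustar, wet)
print(scores)
```

Applying the same function to the open-loop and to the analyzed fluxes gives the pairs of scores from which a relative rms improvement can be derived.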

Conclusions
This work is a first attempt to assimilate a SWI derived from the exponential filter method in a LSM. A posteriori diagnostics are also employed for the first time in order to verify the specification of the errors for SWI and LAI. This study applies the Simplified Extended Kalman Filter procedure in different setups within the SURFEX modeling platform for a period of seven years with contrasting meteorological conditions. The results demonstrate the effectiveness of a joint data assimilation scheme when both SWI and LAI are merged into the ISBA-A-gs land surface model. The impact of the assimilation on the root-zone soil moisture was verified using ground-based observations. The SWI product has advantages that can be exploited for successful data assimilation in a LSM. The rationale for using a SWI product instead of a volumetric surface soil moisture is that the propagation of information from the surface layer to the root zone may not be completely accurate, due to a weak coupling between the two quantities for certain areas or for specific time periods. This can explain why the assimilation of SWI outperforms the assimilation of wg in this study.
At the same time, one should be aware that the use of SWI poses a set of challenges related to the theoretical properties of the data assimilation components, namely measurements, modeling and assimilation algorithms. Errors that may affect the analysis can be introduced at each level of the data assimilation procedure. The uncertainties in the observations derived from the exponential filter are difficult to estimate. Therefore, in this study, a posteriori diagnostics were used in order to verify the error specifications. In theory, the presence of autocorrelated observation errors is not compatible with the filter assumptions. The lack of serially independent errors may be overcome by using more robust methods (Crow and van den Berg, 2010). For example, a colored noise process with a given time correlation length may be envisaged for a stochastic representation of the observations. Significant improvements were obtained for LAI. Extensive simulations with the Simplified Extended Kalman Filter show that the choice of background and observation errors used in the assimilation is a key issue. By using different options, large LAI corrections are obtained during the senescence periods, when the model tends to overestimate the LAI values. Our results indicate that the assimilation of LAI may correct another deficiency in the model, namely a delay in the start of the growing period. The results of statistical investigations support a variable error definition that takes into account the seasonal characteristics of LAI. The LDAS is shown to improve the carbon flux simulations.
Many studies involving LSM evaluations indicate the presence of systematic biases between the observations and the model outputs for soil moisture (Walker et al., 2003; Walker and Houser, 2004; De Lanoy et al., 2007) and LAI (Jarlan et al., 2008; Brut et al., 2009; Lafont et al., 2010). Even after quality control and calibration, and assuming a bias-free observational system, incorrect model parameterization and uncertain model inputs cause a systematic bias in the model forecast for both soil moisture and LAI. For example, in this study, it was noticed that after the assimilation of an LAI observation, the model tends to drift back to a biased state. When the observed LAI value is large and the model shows a dry state, the LAI increments could be positive, but the model is not able to maintain a high LAI value. If the conflicting information provided by the observations is reliable, it points towards an error in the LSM and/or its parameters (e.g. too shallow a root zone) that the assimilation cannot correct. On the one hand, this suggests that the model itself should be improved through enhanced parameterizations or parameter tuning. On the other hand, this is an indication that the bias should be included in the analysis system, as demonstrated by Drécourt et al. (2006) and De Lanoy et al. (2006).
The computational effort of a filter is an important aspect for operational applications and monitoring activities. The computational cost of the EKF is generally low. The LDAS should be able to incorporate near-real-time satellite data at large scale. Therefore, the methodology demonstrated in this study has been implemented in the SURFEX platform and can be used as a guideline in more comprehensive experiments for regional applications. The next step is to extend these results over the France domain by using a mosaic version of the ISBA-A-gs model, instead of the single cover (grassland) option considered in this study at local scale. This approach will make it possible to aggregate the information from different ecosystem types in several covers in order to describe the regional vegetation state. Satellite SWI (e.g. the Advanced Scatterometer (ASCAT) instrument provides a normalized soil moisture product) and LAI will be ingested in the LDAS, which is of high interest for land carbon monitoring.

Fig. 1. The SEKF data assimilation design for LAI and SWI components. SWI^o observations in the root zone are derived from normalized surface soil moisture SWI^o_g using the exponential filter.

Fig. 2. The LAI error standard deviation as a function of LAI values for ISBA-A-gs for the SMOSREX grassland.

Fig. 3. Time series of observed, modeled and assimilated SWI (top) and soil moisture w2 (m3 m−3) measured and modeled before and after assimilation (bottom) of SWI from 2001 to 2007 for the SMOSREX grassland.

Fig. 8. Seasonal LAI diagnostics of background errors for all five options used in this study calculated over the 7-year period (2001-2007). The estimated (diagnosed) values are in black, the specified values in gray.

Fig. 10. Seasonal soil moisture diagnostics of background (left) and observation errors (right) used in this study calculated over the 7-year period at the SMOSREX location. The estimated (diagnosed) values are in black, the specified values in gray.

Fig. 11. Calibration of the cumulative distribution function of in situ data and simulated superficial soil moisture (m3 m−3) by a 7th-order polynomial fit over the 7-year period (2001-2007) at the SMOSREX location.

Fig. 12. Time series of root-zone soil moisture before and after assimilation of SWI and wg against w2 measurements (m3 m−3) for 2001 (top) and 2003 (bottom), respectively.

Fig. 13. Time series of observed, modeled and assimilated surface soil moisture (m3 m−3) for 2001. The observations were rescaled in order to match their statistical distribution to that of ISBA-A-gs.

Fig. 14. Time series of observed, simulated and assimilated LAI (m2 m−2) (top) and corresponding daily evolution of simulated and updated CO2 fluxes (µmol CO2 m−2 s−1) (bottom) for the year 2007 for the SMOSREX grassland site.

Table 1. Soil parameters used for ISBA-A-gs at the SMOSREX location. The last two lines represent the threshold values for w2 used to define the simulated Soil Wetness Index.

Table 6. Statistics of simulated and updated CO2 fluxes (µmol m−2 s−1) after assimilation of LAI and SWI for ISBA-A-gs from 2005 to 2007, as well as for the 3-year period (2005-2007), for the SMOSREX grassland.