A Monte Carlo approach to estimate the uncertainty in soil CO2 emissions caused by spatial and sample size variability

Abstract The soil CO2 emission is recognized as one of the largest fluxes in the global carbon cycle. Small errors in its estimation can result in large uncertainties and have important consequences for climate model predictions. Monte Carlo approach is efficient for estimating and reducing spatial scale sampling errors. However, that has not been used in soil CO2 emission studies. Here, soil respiration data from 51 PVC collars were measured within farmland cultivated by maize covering 25 km2 during the growing season. Based on Monte Carlo approach, optimal sample sizes of soil temperature, soil moisture, and soil CO2 emission were determined. And models of soil respiration can be effectively assessed: Soil temperature model is the most effective model to increasing accuracy among three models. The study demonstrated that Monte Carlo approach may improve soil respiration accuracy with limited sample size. That will be valuable for reducing uncertainties of global carbon cycle.


Introduction
The total global emission of CO 2 from soils is recognized as one of the largest fluxes in the global carbon cycle (Schlesinger and Andrews 2000; Piao et al. 2009;Bond-Lamberty and Thomson 2010) and plays a major role in determining the atmospheric greenhouse effect (Field et al. 2007). This large annual emission dwarfs anthropogenic CO 2 production from fossil fuel and implies that any small error in its estimation would result in large uncertainties related to the effects of CO 2 build-up in the atmosphere. There is a need, therefore, to improve the accuracy of soil CO 2 emission estimates (Shi et al. 2012;Gomez-Casanovas et al. 2013).
Actually, numerous studies regarding the uncertainties of carbon flux estimates using eddy covariance (EC) have been reported. Elbers et al. (2011) presented a method for evaluating the factors of total uncertainty for estimating net ecosystem productivity (NEP) without considering spatial variability. Richardson and Hollinger (2007) used synthetic data sets, developed by assimilating data from a range of FLUXNET sites, into a simple ecosystem model to evaluate the relationship between gap length and uncertainty in the net ecosystem exchange (NEE) of CO 2 . Oren et al. (2006) assessed spatial variability estimates in the context of uncertainty in the annual NEE and combined uncertainty from gap filling and instrument error with the uncertainty caused by spatial variability. Hou et al. (2013) evaluated the effects of the spatial heterogeneity of reservoir permeability on CO 2 migration, applying an uncertainty quantification framework. However, all above studies focus mainly on eddy covariance (EC) methods (e.g., NEE, NEP, and ecosystem respiration), and uncertainty in soil respiration estimation remains far greater than that in other components of the carbon cycle (Bond-Lamberty et al. 2004;Trumbore 2006;Zhang et al. 2013). As with the ecosystem carbon cycle, the uncertainty of soil respiration also contains measurement error and flux calculation uncertainty, spatial variability uncertainty, statistical selection uncertainty, and gap-filling uncertainty.
Methods for measuring soil CO 2 efflux have undergone a considerable evolution over the past 30 years, giving rise to what is today considered state-of-the-art measuring systems that consist of automated chambers that use an infrared gas analyzer (Pumpanen et al. 2004;Subke et al. 2006;Vargas et al. 2011;Jassal et al. 2012;Koskinen et al. 2014;Maier and Schack-Kirchner 2014;Riederer et al. 2014). These systems are often deployed as nonsteady state or steady state. Static chamber systems also continue to play an important role in assessing soil CO 2 emissions because they are relatively inexpensive and easy to deploy. These measurement systems are generally considered to provide the most reliable estimates we have of soil respiration. (Shi et al. 2011;Maier and Schack-Kirchner 2014). Sources of uncertainties in soil respiration stem from site characterization, site carbon capacity, injection rate, CO 2 trapping mechanisms, mineral precipitation dissolution kinetics, and so on (Hou et al. 2013). In general, all sources of uncertainty can be divided into two dimensions: time and space. It is well recognized that static chambers have poor temporal resolution and automated chambers have poor spatial resolution in these dimensions.
To address the problem of temporal uncertainty, gapfilling strategies have been applied effectively to estimate the soil CO 2 efflux, and the soil respiration estimating model has been assessed according to time (Gomez-Casanovas et al. 2013); however, the study only focused on time-series sampling, and spatial uncertainty has not been investigated. For spatial uncertainty, the inventory method is limited by the quality and the spatiotemporal representativeness of measured Rs data. Poor data can result in infinite uncertainty on Rs estimates on a regional scale (Yu et al. 2010). Furthermore, the process-based soil respiration model has always been considered as the universal method for both temporal and spatial estimation of soil respiration. This method can simulate the spatial patterns and also predict the long-term dynamics of ecosystem respiration (Cramer et al. 2001). However, the process-based soil respiration model has a complicated structure when connecting soil-plant-atmosphere processes. It is thus difficult to evaluate the rationality of the estimated results when considerably large uncertainty exists in the spatial representativeness of model parameters (Yu et al. 2010). Nevertheless, compared with the above methods, the geostatistical model of soil respiration could be a good method due to simple structure, sound parameterization method, and reasonable results (Raich and Potter 1995;Reichstein et al. 2003), but the application of this method is built on the premise that relationships exist between in situ soil respiration and environmental variables. However, in natural conditions, randomness is universal and authentic, especially for temporal variation.
Monte Carlo method complies with this natural randomness and only relies on sufficient data and repeated random sampling, without considering any premise. These methods are most suited to calculation by a computer and tend to be used when it is infeasible to compute an exact result with a deterministic algorithm. Additionally, the method is used to complement derivations (Doucet et al. 2000). Currently, the Monte Carlo sampling technique is an efficient method for estimating and then reducing spatial-scale sampling error. It has been applied for estimating transpiration (E) of forest stands (Kumagai et al. 2005a) and for examining how errors in E would be generated from different parameter values acquired with an equation regressed with limited data (Kumagai et al. 2005b;Kume et al. 2010). However, the Monte Carlo sampling technique has not been used in other similar fields, for example, soil CO 2 emission estimation. The technique may eventually play an irreplaceable role in this estimation.
In this study, we aimed to define an optimal and effective sampling design to determine 10s km-scale soil CO 2 emission estimates calculated from soil respiration rate measurements, examine how sample sizes for soil temperature and soil moisture impact these 10s km-scale soil CO 2 emission estimates, determine whether the estimation errors due to sample sizes change with the variations in region area, and then build a standard Monte Carlo sampling procedure for producing defensible estimates of soil CO 2 emission. Based on the assumption that the 10s kmscale soil CO 2 emission was accurately determined from point measurements, the impact of point-to-point variations in soil temperature and soil moisture on the 10s kmscale soil CO 2 emission will also be determined using a Monte Carlo analysis of the original data sets. This analysis predicted how many samples are required to account for point-to-point variations and evaluated the applicability of three general soil CO 2 emission estimating models.

Study site
The study site is located in the Zhangye oasis (1,400-1,600 m a.s.l.), Gansu Province, China, which is the core part of the middle reaches of the Heihe River. The climate is temperate, with a mean annual temperature of 7.6°C, mean annual precipitation of 117 mm, and mean annual potential evaporation of 2390 mm (Wang et al. 2013). The main crops cultivated in this area are maize and wheat. Almost all farmland in this area is irrigated with the water diverted from the Heihe River. The field observations used in this study were derived from the Heihe Watershed Allied Telemetry Experimental Research (HiWATER) project. HiWATER is a comprehensive ecohydrological experiment under the framework of the Heihe Plan and is based on the diverse needs of interdisciplinary research and existing observational infrastructures in the basin .
A permanent observation plot was set up in the farmland, which is located within 38.8369°-38.9055°N, 100.332°-100.410°E, covering an area of 5.0 9 5.0 km 2 and is cultivated with maize (Zea mays L.) during the experimental period. Fifty-one custom-designed polyvinyl chloride (PVC) collars were placed in the observation plot for measurements of CO 2 efflux from the soil. The plot was equally divided into four 2.5 9 2.5 km 2 subplots. Different numbers of PVC collars (observation points) were evenly established in each subplot. Detailed information of the different plot layouts is shown in Table S1.
Measurement of soil CO 2 efflux, soil temperature, and moisture Soil respiration was measured using an automated soil CO 2 flux system (LI-8100; LI-COR, Lincoln, NE, USA) equipped with a portable chamber (Model 8100-103). A PVC collar (20.3 cm in diameter and 10 cm in height) was inserted into the soil among the maize seedlings to a depth of 2.5 cm at each sampling point approximately 2 weeks before the first measurement. Small litter was left in the collar, and large items were removed. All collars were left at the site for the entire study period.
The soil respiration data from the 51 PVC collars were measured once every 6 days over the whole period of maize growth from 6 June to 19 September 2012. The maize in the study site was harvested at approximately this time. Based on the preliminary experiment in 2011 (continuous measurement of soil respiration), the suitable diurnal measurement time was determined: Measurements were taken between 8:30 and 12:00 local time on each sampling day. Preliminary experiment in detail was described in Appendix S1.
Temporal soil temperature and moisture near each collar were measured at the same time as soil respiration measurements. Soil temperature was measured at a depth of 10 cm using a handle thermocouple probe, while the soil volumetric water content was measured at 0-10 cm depth, using a time-domain reflectometry moisture meter (TDR200; Spectrum, Aurora, IL, USA).
Similarly, continuous soil temperature and moisture near each collar were also measured throughout the entire study period. Soil temperature was measured at 10 cm depth by thermorecorders (TR-52; T&D, Matsumoto, Japan), and soil moisture was measured at 10 cm by a soil moisture sensor (SMB-M005; Decagon Devices, Pullman, WA, USA). The continuous measurements were performed at 30-s intervals, and 30-min averages were recorded.

Scaling for growing seasonal soil CO 2 efflux
Soil respiration data from the complete growing season were fitted to soil temperature and water content with exponential and power functions given in equations (1) and (2) to describe the dependence of soil respiration on soil temperature and soil water content.
where R, T, and W are soil respiration, soil temperature, and soil volumetric water content, respectively, and a and b are constant coefficients. An equation with two variables was established to describe the interactive effects of soil temperature and water content on soil respiration (Li et al. 2008a): where a, b, and c are constant coefficients. A soil CO 2 emission of 51 points over the growing season was calculated by integrating the CO 2 efflux for the period from 16 June to 19 September 2012 using the observed ecosystem-specific response equations: (1), (2), and (3). Applicability of this method has been demonstrated by some studies (Wang et al. 2010b;Shi et al. 2014). Equations (1), (2), and (3) were abbreviated as R: T, R:W, and R:T&W, respectively.
Furthermore, the coefficients of determination (R 2 ) of different models on soil respiration (R) against soil temperature (T) and/or soil moisture (W) were calculated by nonlinear least-squares method. The results were from average coefficients of determination (R 2 ) value of 51 points.

Method of analysis
The soil CO 2 emission of every point was calculated by, respectively using equations (1), (2), and (3). Besides, W and T of the every point were represented by their average value. The estimation errors of R, W, and T caused by spatial variations were calculated using various sample size based on Monte Carlo sampling. Figure 1 shows the flow diagram of the Monte Carlo sampling computer program, and the specific steps are shown as follows: (1) The size of a sample A is N (N = 51; respectively for R, W and T), and the arithmetic mean value E N of this sample was calculated.
(2) The subsamples B i with the sample size k = N -1 were chosen from the sample about M = 10,000 times, randomly. The arithmetic mean values E i of each subsample B i were calculated and i = 1, 2, . . ., M. The sampling for subsamples B i is a random process, and it can be thought that the subsamples are different, because it was decided by the sampling method. We assume the sample A = [A 1 , A 2 ,. . ., A N ], and the method is shown as follows: a N is the size of a sample A, and the numbers from 1 to N were rearranged randomly. The rearranging result C 1 was a vector, for example, C 1 = a 1 , a 2 ,. . ., a N (a i = 1, 2,. . ., N, a i 6 ¼ a j ). b The other vector D was obtained by choosing numbers from C 1 before the number k. Thus, D 1 = a 1 , a 2 ,. . ., a k ; c The subsamples B 1 were obtained by choosing the data The steps a-c were repeated 10000 times, and the vector groups C i , D i and B i ði ¼ 1; 2; . . .; 10; 000Þ were obtained. Because the rearranging process for C i was random, . . .; 10; 000; i 6 ¼ jÞ was a little probability event. There were few of D i = D j or B i ¼ B j ði; j ¼ 1; 2; . . .; 10; 000; i 6 ¼ jÞ, but compared with 10000 groups data, the effect on the results can be ignored.
(3) E N was used as the arithmetic mean value of the subsample B i mean, E i . Thus, the coefficients of variation CV k for E i with the sample size k can be calculated by the following formula: When M is large enough, based on the law of large numbers, CV k means the degree of variation for the subsample with measure times k is compared with the sample with measure times N.
Additionally, In this study, the equations (1), (2), and (3), respectively, assume that the variance of R estimates associated with sample sizes was given by variance of T estimates, variance of W estimates and combined variance of T&W estimates associated with sample size. Also,   (3) were determined by the relation of R and T and/or W to potential estimation errors. The total derivative of the equations (1), (2), and (3) is as follows: The form of the equations (1), (2), and (3) can be transformed to the following equation: Equations (7), (8), and (9), respectively, mathematically indicated sources of errors from the three estimation models: the equation (7) means the errors in R caused by the potential estimation variation in T associated with the sample size; the equation (8) means the errors in R caused by the potential estimation errors in W associated with the sample size; and the equation (9) means the errors in R caused by the potential estimation errors in T and W associated with the sample size; The analyses were performed using data sets collected from continuous measurements for T and W. The similar mathematical deduction method has been effectively applied for estimation of tree stand-scale transpiration (Kumagai et al. 2005a,b;Kume et al. 2010).
To examine whether the potential errors due to sample sizes change in different plot conditions, Monte Carlo analyses were performed for subplots with different point densities (i.e., four 2.5 9 2.5 km 2 subplots and four 5 9 2.5 km 2 subplots)

Variation in soil respiration
The general pattern of the change in soil respiration during every day of the preliminary experimental period was similar (preliminary experimental design in detail see appendix S1). There was a strong diurnal pattern with a peak in the period. Figure 2 shows the typical diurnal pattern of soil respiration in representative day of this period. In this period, the diurnal average value is approximately 4.5 lmolÁm À2 Ás À1 , and the diurnal peak value is 7.0 lmolÁm À2 Ás À1 . Nevertheless, the time of the peak value occurring was not steady, but the diurnal average value AE error was located between approximately 7:30 and 12:30. Additionally, the maximum error is defined as 10% of the average value. Figure 3 shows the relation between the sample size and the CV of T & W in the 5 9 5 km 2 plot. Two-dimensional analytic geometry has demonstrated that if the slope of the curve is less than À1.0, the value of the vertical axis will change more slowly compared with abscissa axis; if more than À1.0, the situation will be reversed. Therefore, À1.0 for dCV/dn is deemed as the threshold of significant changes of CV (dCV/dn) for estimating the optimal sample size. A dCV/dn < À1.0 indicates that the CV significantly decreased and greatly improved the precision of the estimation with an increase in the number (n) of PVC collars. In contrast, a dCV/dn > À1.0 suggests a slight decrease with increasing n, and the increase of n cannot effectively improve the estimation precision. In this study, the minimum n at dCV/dn > À1.0 is defined as the optimal sample size. According to this threshold value, n = 3 for the soil temperature was the optimal sample size, and the CV was 5.4%. When n was less than 3, the dCV/dn for the soil temperature was smaller than À1.0. Conversely, when n was more than 3, the dCV/ dn for soil temperature was greater than À1.0 (Fig. 3A). Similarly, the dCV/dn for the soil moisture was larger than À1.0 at n > 4, and the dCV/dn for the soil moisture was smaller than À1.0 at n < 4 (Fig. 3B). The optimal sample size for the soil moisture was n = 4 and CV was 9.5.

Sample size
The optimal sample size of T, W, and E estimates with R:T, R:W, and R:T&W was analyzed by the Monte Carlo method (Table 1). Depending on the threshold value (dCV/dn = À1.0), different optimal sample sizes were indicated for T, W, and E in one 5.0 9 5.0 km 2 plot and four 5.0 9 2.5 km 2 plot. Furthermore, in four 2.5 9 2.5 km 2 plots, the optimal sample size was not obtained because all dCV/dn were smaller than À1.0. Figure 4 shows the variation of errors in E estimated by R:T, R:W, and R:T&W, respectively, associated with the sample size in different plots. There are no significant differences between R:T, R:W, and R:T&W for 51 points in the 5.0 9 5.0 km 2 plot and for 30 points in the 5.0 9 2.5 km 2 plot. The dCV/dn of E estimates by R:

Variation of errors in a different method for estimating E
T&W was larger than with the other two models, with n increasing for 27, 24, and 21 points in the 5.0 9 2.5 km 2 plot and for 15 and 12 points in the 2.5 9 2.5 km 2 plot. However, for 9 points in the 2.5 9 2.5 km 2 , the dCV/dn of the E estimates by R:T was larger than for R:W and R: T&W, with n increasing.

Constant coefficient of errors in different estimate methods
Constant coefficients of three different methods were obtained by the total derivative of equations (1), (2), and (3). The equations (7) influence of T and/or W on the error in R.

Discussion
In this study, the measurements were taken between 8:30 and 12:00 local time on each sampling day. More and more studies are reporting soil respiration over a time range during the day to better represent daily average values (Huang et al. 2007;Li et al. 2008b;Wang et al. 2010a). However, because environments are not different, the typical time ranges are also different. In a preliminary experiment, we confirmed that the diurnal average value AE error (10%) was about located between 7:30 and 12:30 (Fig. 2). In this time range and 5 9 5 km 2 area, observation points maximally reached 51. Furthermore, number and space representations are key for Monte Carlo spatial sampling. These points were evenly distributed in consideration of the representative and experimental situation (Table S1). To a large extent, soil CO 2 emission at this site was accurately measured, and choosing new points did not result in significant changes. Therefore, the number and location of these points are reasonable for the new sampling technique promotion.
There are mainly two sources of the error resulting from the estimation of the model parameter(s) and the integrating of the model prediction for R. In fact, this kind of error was from fitting degree of the three models on soil respiration (R) against soil temperature (T) and/ or soil moisture (W). Table 3 indicated the uncertainty from the two sources by analyzing coefficients of determination (R 2 ). In our original application of the Monte Carlo method, we analyzed the optimal sample sizes and potential errors (CV) by evaluating the dCV/dn value, based on a data set with sample size changes and with variation in stand condition. In the 2.5 9 2.5 km 2 subplot, the optimal sample sizes for R, T, and W were not obtained; this suggests that the sample sizes in the area were too small to significantly reduce the error (Fig. 4). Nevertheless, the ranges of the optimal samples sizes and the potential errors in the 5 9 2.5 km 2 subplots were similar to those in the 5 9 5 km 2 plot (Table 1). The results indicate that a small plot with a large enough sample size also can introduce a similar optimal error based on optimal sample size.
T and W are two dominant factors for soil respiration from homogenous soil and vegetation (Xu and Qi 2001;Shi et al. 2012;Dore et al. 2014). Table 1 shows that the optimal sample sizes were smaller for T than for W, and potential errors for T at the optimal sample sizes were also smaller than those for W. This finding indicated that W variation could be a greater source of variability when increasing the scale, which was observed because the experimental site is arid land, and farmland in this area depends on irrigation from the Heihe River. During the growing season, T is basically stable. But, W is closely related to irrigation, and the irrigation would result in spatial heterogeneity of W (Ge et al. 2013;Liu et al. 2015). So in the scaling-up process, spatial heterogeneity of soil moisture is higher than that of soil temperature. However, the optimal sample sizes were larger for R than for T and W, and the potential errors for R at the optimal sample sizes were larger than those for T and similar with those for W (Table 1). This suggested that the relationship error of R with W and/or T is not a simple linear relation when increasing the scale. The R was based on three models, including R:T, R:W, and R:T&W. Although some studies have suggested that biophysiological process (the hysteresis effect, root exudates, photosynthesis and so on) plays an important role in soil respiration, these belong to factors of the process model for soil respiration (Griffis et al. 2003;Gaumont-Guay et al. 2006;Kuzyakov and Gavrichkova 2010). Actually, the process model is indeed more reliable and rational. Nevertheless, the process model has not been effectively developed and widely used until now (Zhou et al. 2010). In this study, the three functions are classic empirical models and have been confirmed and applied widely. Therefore, the three empirical models were selected for this study.
First, the R:T model is a more universal T-dependent equation for soil respiration estimation (Lloyd and Taylor 1994). Equation (7) is total derivative form of the R:T model and indicated that the error of R (CV) is linear with variation in T (STD). In the scaling-up process, the constant coefficients (b) are stable on the whole when n > 24 (Table 2). Similarly, the R:W model is also an important equation and, in soil moisture, is a single factor. Equation (8) is a total derivative form of the R:W model. Equation (8) indicated that the error of R is linear with the error of W. When n > 24, the constant coefficient (b) is stable on the whole in the scaling-up process. Third, the equation (9) is from the total derivative of R: T&W. This equation suggested that the error of R is the multiple of the potential errors in T and W. Nevertheless, the constant coefficients (b and c) are also shown in Table 2. When n > 24, the constant coefficients (b and c) are stable on the whole in the scaling-up process. All above constant coefficients indicate that when the sample size is not large enough, the effect of T or W on R is not steady due to spatial heterogeneity. This demonstrated that if the optimal sample size was obtained from enough sample size in the Monte Carlo method, the error could be significantly reduced. Furthermore, by comparing b and c in equation (9), we were able to find that c is more stable than b with a sample size change. This also indicated that soil moisture is the dominant factor for soil respiration; variation of steady soil temperature is dependent on strong fluctuation of b. In contrast, W changes dramatically compared with the variation in c. Besides, Table 1. Optimal sample sizes for estimating T, W, and R, based on three models in the 5.0   Table 3 showed the coefficients of determination (R 2 ) of three models on R against T and/or W. The results indicated W could explain 55.6 AE 13.2% variation of R, but T only could explain 28.3 AE 11.6%. Although R:T&W performs the best fitness among three models due to interaction between T and W, the best fitness of R:T&W was mainly based on W factor. That also demonstrated soil respiration is dominated by soil moisture in the field. For estimating soil CO 2 emission, choosing an appropriate model is always a difficult problem (Shi et al. 2012); even though Gomez-Casanovas et al. (2013) evaluated applicability of different models, the evaluation depended on a time series. However, the problem of estimating soil CO 2 based on spatial heterogeneity has not been solved. The results shown in Figure 4, from use of the Monte Carlo method, show the variation of errors (CV) for R based on three models with sample size changes. Apparently, in 51 points in the 5.0 9 5.0 km 2 plot and 30 points in the 5.0 9 2.5 km 2 plot, the errors with sample size changes are not different among the three models (Fig. 4A,B). These results indicate that the responses of the three models to sample size change are not different for these point settings. This may because a larger sample is enough to reduce the spatial heterogeneity. However, for 27, 24, and 21 points in the 5.0 9 2.5 km 2 plot and 15 and 12 points in the 2.5 9 2.5 km 2 , the error of R:T&W is more significantly reduced with sample size change. This means that the R: T&W is more appropriate than other models for these point settings. At these settings, the sample size is not effective for decreasing the spatial heterogeneity from the interaction between T and W. However, for 9 points in the 2.5 9 2.5 km 2 , R:T is the best model for reducing error with the scaling-up process. According to the above analysis, it could be concluded that when the sample size is large enough, the performance of the three models is fine. Nevertheless, a consideration of convenient and traditional applications suggests that R:T could be an appropriate model. When the sample size is not more than enough, R:T&W would be a better choice. Finally, when the sample size is less for the experimental area, R:T is also an effective model for increasing accuracy. Furthermore, Table 2 and Fig. 4 indicate that point density (point number/area) may not be an effective proxy for error analysis in comparison with point number.
However, these conclusions are only for this experiment, but it can be demonstrated the spatial Monte Carlo sampling is an effective method or technique for optimizing sample size and filtering model in future studies.
These analyses would open a new way to effectively decrease error and shed light on the mechanisms driving soil respiration. All regressions were significant at P < 0.001. Table 3. Comparison of coefficients of determination (R 2 ) of three models on soil respiration (R) against soil temperature (T) and/or soil moisture (W