Matching soil grid unit resolutions with polygon unit scales for DNDC modelling of regional SOC pool

Introduction
Soil organic carbon (SOC) is the largest terrestrial carbon pool (Schlesinger, 1997), with stocks about four times the biotic (trees, etc.) pool and about three times the atmospheric pool (Lal, 2004). Relatively modest changes in SOC storage can significantly alter the atmospheric CO2 concentration (Davidson and Janssens, 2006). Accurate estimation of the SOC pool has therefore become an important requirement for assessing the global carbon balance and global climate change.
Agricultural soils are a highly sensitive part of the global carbon cycle (Shi et al., 2010; Wang et al., 2011), and carbon sequestration by agricultural soils presents an immediately viable option for increasing the soil carbon pool, reducing atmospheric CO2 and mitigating global warming (Sun et al., 2010). Because human activities and tillage practices affect agricultural soils in complex ways, SOC dynamics are increasingly simulated over broad spatial and temporal scales with process-based models (Giltrap et al., 2010; Xu et al., 2012a), such as DeNitrification-DeComposition (DNDC) (Li et al., 2003).
The DNDC model developed by Li et al. (1992a, b) simulates C and N biogeochemical cycles in agricultural systems, driven by both environmental factors (e.g. soil organic matter, texture, pH, bulk density, hydraulic properties, daily temperature and precipitation) and management practices (e.g. crops, tillage, fertilization, manure application, grazing). It has been validated through long-term applications at the plot scale at many sites in North America, Europe, Asia and elsewhere (Pathak et al., 2005; Li et al., 2006; Tonitto et al., 2007), and is one of the most widely accepted biogeochemical models in the world (Li, 2007; Tang et al., 2006; Li et al., 2010).
The DNDC model has also been used to upscale SOC estimates from the plot to the regional scale. Regional DNDC modelling initially used counties as the basic simulation units, with minimum and maximum soil parameter values for each county derived from soil maps to simulate an upper and a lower estimate of several C and N pools (Cai et al., 2003; Li et al., 2004). However, county-scale simulations are subject to great uncertainty because soil properties are averaged for each county, largely ignoring the nonlinear impacts of soil heterogeneity within it (Rüth and Lennartz, 2008; L. M. Zhang et al., 2014).
More recently, for upscaled applications of DNDC, a region has been partitioned into many simulation units, within which all soil properties are assumed to be as homogeneous as they are at the plot scale (Li et al., 2005; Zhang et al., 2012). This homogeneity assumption is a potentially major source of error when extending DNDC modelling from the plot to the regional scale (Li et al., 2002, 2004). As the area of the basic simulation unit increases, so does the variability or heterogeneity of soil properties, calling into question how accurately it is captured (Smith and Dobbie, 2001; Bouwman et al., 2002). One way to reduce the effects of soil heterogeneity on DNDC modelling as far as possible is to use soil polygons derived from soil vector maps as the basic simulation units (Xu et al., 2012b; Yu et al., 2013; Zhang et al., 2012). Even so, soil heterogeneity still exists within a soil polygon unit and depends on the scale of the soil vector map, with a smaller map scale resulting in higher heterogeneity (Yu et al., 2013). Across different regions, the map scales of polygon units simulated with DNDC have ranged widely, from 1 : 50 000 to 1 : 14 000 000, with strong effects on the accuracy and uncertainty of the modelling (Xu et al., 2011, 2012b; Yu et al., 2013; L. M. Zhang et al., 2014).
Another way to reduce the effects of soil heterogeneity on DNDC modelling is to use soil grid cells as the basic simulation units (Huang et al., 2004; Y. Q. Yu et al., 2007; Shi et al., 2010; Yu et al., 2011). The cell size, or resolution, of the soil grid units is one measure of the soil heterogeneity within them: likewise, a lower resolution (larger cell size) results in higher soil heterogeneity. The cell size also strongly affects the accuracy and uncertainty of grid unit simulations with DNDC (Yu et al., 2011).
Soil grid units are more often applied to the simulation of the SOC pool (Qiu et al., 2005; Tang et al., 2006; Yu et al., 2011; Liu et al., 2011), as they are more easily manipulated for spatial model simulation, geostatistics and spatial analysis than soil polygon units (Huang et al., 2004; Li et al., 2005). They are often derived by data conversion from soil polygon units, but the choice of grid resolution varies by researcher, even when the soil polygon units are at the same map scale and in the same region (Y. Q. Yu et al., 2007; Shi et al., 2010; Yu et al., 2012). For example, the soil polygon units compiled in the Soil Database of China (Yu et al., 2007a) at the map scale of 1 : 1 000 000 have been converted to grid units at resolutions of 1 km × 1 km (Yu et al., 2007b) and 10 km × 10 km (Y. Q. Yu et al., 2007, 2012) to simulate and estimate agricultural SOC pools in China. Soil grid units at resolutions of 2 km × 2 km (Shen et al., 2003) and 50 km × 50 km (Wan et al., 2011), converted from original soil polygon units at the map scale of 1 : 4 000 000, were used for grid simulation of SOC dynamics in different regions. Original soil polygon units at the map scale of 1 : 50 000 were converted to grid units at resolutions of 100 m × 100 m (Shi et al., 2010) and 30 m × 30 m (Su et al., 2012) for grid simulation of SOC dynamics in agro-ecosystems. Our concern is whether these soil grid units at different cell sizes are equivalent in accuracy or granularity to their parent soil polygon units at the corresponding map scale for DNDC modelling; in other words, whether these soil grid unit datasets carry coarser or redundant soil property data compared with their parent soil polygon unit dataset at a given map scale. A coarser or redundant dataset affects the internal homogeneity of soil properties within the simulation unit and, in turn, the simulation outcome, since modelling error is lower when all features within the simulation unit are more homogeneous (Cai et al., 2003; Yu et al., 2011, 2013).
In fact, accuracy and redundancy are two important issues in converting soil simulation unit datasets from polygon to grid format, and both are often neglected in regional-scale modelling. The accuracy of the grid unit dataset determines the reliability and uncertainty of SOC grid simulation (Batjes, 2000; Ni, 2001), while redundancy in the dataset leads to a mistaken understanding of data accuracy and to unnecessary simulation workload and cost (Yu et al., 2011, 2013). Some studies focus on data accuracy but neglect data redundancy (Yu et al., 2007b; Shi et al., 2010), while others neglect data accuracy (Batjes, 2000; Y. Q. Yu et al., 2007); when conducting data conversion, they always search for an individual solution in each case.
Given the variety of datasets and the number of simulations, together with data accuracy, redundancy and computational cost (Schmidt et al., 2008), important questions arise. How sensitive is DNDC modelling to simulation units at different vector map scales or raster grid resolutions? Which raster resolution is optimal for DNDC grid simulation at a fixed soil map scale, in terms of error and cost control? Matching the soil grid unit resolution with the polygon unit map scale is an essential issue for DNDC modelling.
In the present study, paddy soil polygon simulation units at six vector map scales from 1 : 50 000 to 1 : 14 000 000 in the Tai Lake region of China were each converted to grid simulation units at a range of raster resolutions. SOC pools were simulated with the DNDC model using the polygon units and the grid units at the various vector map scales and raster resolutions. The objectives of the study were to (1) reveal the impact of the vector map scale and raster resolution of soil simulation units on DNDC modelling, (2) determine an optimal raster resolution of grid simulation units at a fixed soil vector map scale, based on an assessment of the data accuracy and redundancy metrics of the simulation units, and (3) construct a relationship between the soil vector map scale of polygon units and the optimal raster resolution of grid units for DNDC modelling at the regional scale. The results will serve as a reference for converting soil simulation units from polygon to grid format in support of regional-scale soil carbon cycle modelling.
2 Materials and methods

Study area
The Tai Lake region (118 … (Xu et al., 1980). The soil types in the region are mainly Paddy, Fluvo-aquic and Red soils, which cover 90 % of the total area. Paddy soils, the largest single soil type in the Tai Lake region, occupy 23 200 km2, approximately 66 % of the total area (Yu et al., 2014). Derived from loess, alluvium and lacustrine deposits, the Paddy soils of the Tai Lake region are recognized as the most typical of their type in China (Yu et al., 2013), with a history of rice cultivation spanning several centuries. A double-crop rotation of summer rice (planted in June and harvested in October) and winter wheat (planted in November and harvested in May) has been intensively cultivated in this region (L. M. Zhang et al., 2012, 2014). The Paddy soils include six subgroups: Bleached, Gleyed, Percogenic, Degleyed, Submergenic and Hydromorphic. They are cross-referenced in US Soil Taxonomy (ST) as Typic Epiaquepts (Bleached, Percogenic, Hydromorphic) and Typic Endoaquepts (Gleyed, Degleyed, Submergenic) (Shi et al., 2006; Soil Survey Staff, 1994).

Development of polygon and grid simulation unit datasets at different map scales
First, paddy polygon unit datasets for DNDC simulation were developed for the Tai Lake region at six soil vector map scales: 1 : 50 000 (C5), 1 : 200 000 (D2), 1 : 500 000 (P5), 1 : 1 000 000 (N1), 1 : 4 000 000 (N4) and 1 : 14 000 000 (N14). Each was generated by a vector overlay of the paddy polygons at that map scale with polygons depicting county boundaries at a scale of 1 : 50 000, using the Union function of the ESRI ArcGIS 9.0 software (ESRI, Redlands, CA). All paddy polygon simulation units at a given map scale within one county have the same DNDC input values for features such as crops, agricultural management and climate, but differ in soil features such as soil type, soil organic matter content, clay content, bulk density, rock fragment content, soil layer thickness, pH and hydraulic properties (Yu et al., 2013). The paddy polygon unit datasets at the six map scales were developed from soil vector maps at the corresponding map scales by a GIS linkage technique based on soil type (Yu et al., 2005, 2007a, b), namely the PKB (Pedological Knowledge Based) method (Zhao et al., 2006). The soil vector maps were compiled using a standard soil mapping system formulated as part of the Second National Soil Survey of China, conducted in the 1980s (Office for the Second National Soil Survey of China, 1994). Among the six soil maps, the basic mapping unit is the soil species for C5 and D2, the soil family for P5 and N1, and the soil subgroup for N4 and N14 (Yu et al., 2014). The soil properties attributed to all paddy polygons were derived from soil profiles that were surveyed, compiled and authorized in the Second National Soil Survey of China in the 1980s (Shi et al., 2006). The numbers of representative soil profiles whose measured data were used to attribute paddy polygons at the C5, D2 and P5 scales were 1107, 136 and 127, respectively; these data were taken from three books: Soils of County, Soils of District and Soils of Province, respectively. The paddy polygons at the national map scales (N1, N4 and N14) were derived from 49 soil profiles described in the book "Soils of China" (Shi et al., 2006; Yu et al., 2014).
Secondly, paddy grid unit datasets for DNDC simulation were developed from the above paddy polygon unit datasets at the six map scales. Each vector paddy polygon unit dataset was converted to a series of paddy grid unit datasets of differing grid cell sizes. The grid cell size ranged from a default size to a maximum, with the size increment set to approximately 10 % of the default. The default was determined by the soil vector map scale and the smallest mapping unit (2 mm × 2 mm) that can be depicted on a hard copy of the map (Yu et al., 2014). For conversions of the six paddy polygon unit datasets (C5, D2, P5, N1, N4 and N14), the default grid cell sizes are 100 m, 400 m, 1 km, 2 km, 8 km and 28 km, respectively. In addition, the paddy polygon unit dataset at the N14 scale was also converted to grid unit datasets at cell sizes ranging from the default down to a minimum size, with a decrement of approximately 10 % of the default cell size. The minimum and maximum grid cell sizes were those at which the paddy soil SOC pool simulated by DNDC with the grid unit dataset differed from that simulated with its parent polygon unit dataset by more than 30 %. All data conversions were conducted using the Polygons to Raster Conversion Tools (PRCT), a component of the ArcGIS 9.0 software, with the Maximum-Area option for grid cell value assignment.
Finally, all simulation units, rendered as vector (polygon unit) and raster (grid unit) datasets describing the soil properties, daily weather, cropping systems and agricultural management practices of rice paddy fields, are required to initialize and run the DNDC model at the regional scale (Yu et al., 2011, 2013). Each simulation unit has its own data records, which were used as input for DNDC modelling of SOC dynamics (L. Zhang et al., 2009; L. M. Zhang et al., 2009, 2012, 2014).
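The default cell sizes listed above follow from multiplying the smallest mappable unit (2 mm on paper) by the map scale denominator. A minimal Python sketch of that arithmetic, and of a candidate cell-size series stepped by 10 % of the default, is given here for illustration. The function names and the fixed number of steps are assumptions: in the study the series ends where the simulated SOC pool departs from the polygon-based result by more than 30 %, which requires DNDC runs and is not reproduced.

```python
# Illustrative helpers; names are hypothetical, not taken from the study's tooling.
MIN_MAP_UNIT_MM = 2.0  # smallest unit that can be drawn on the hard-copy map

# Scale denominators of the six polygon unit datasets named in the text.
SCALES = {"C5": 50_000, "D2": 200_000, "P5": 500_000,
          "N1": 1_000_000, "N4": 4_000_000, "N14": 14_000_000}


def default_cell_size_m(scale_denominator):
    """Ground length (m) of the 2 mm map unit at map scale 1 : scale_denominator."""
    return scale_denominator * MIN_MAP_UNIT_MM / 1000.0


def candidate_cell_sizes_m(scale_denominator, n_steps, direction=1):
    """Cell sizes starting at the default and stepped by 10 % of the default.

    direction=+1 steps towards coarser cells, direction=-1 towards finer cells.
    n_steps is an assumed stand-in for the study's 30 % stopping rule.
    """
    default = default_cell_size_m(scale_denominator)
    step = default / 10.0
    return [default + direction * k * step for k in range(n_steps + 1)]


if __name__ == "__main__":
    for name, denominator in SCALES.items():
        print(name, default_cell_size_m(denominator), "m")
```

For the N14 dataset, calling `candidate_cell_sizes_m(14_000_000, n, direction=-1)` would generate the finer-than-default series described in the text.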

DNDC modelling and validation
The DNDC (DeNitrification-DeComposition) model is a process-based model of carbon (C) and nitrogen (N) biogeochemistry in agroecosystems (Li et al., 1992a, b). It can simulate soil C and N biogeochemical cycles in paddy rice ecosystems thanks to a series of anaerobic processes that have been added to the model (Li et al., 2002, 2004; Li, 2007).
For DNDC modelling of SOC dynamics, farming management scenarios were compiled based on five assumptions from L. Zhang et al. (2009) and L. M. Zhang et al. (2009, 2012, 2014), and did not vary among the soil simulation units within a county. The DNDC runs span the period 1982 to 2000, a duration of 19 years. A total of 65 340 paddy polygon unit simulations were executed, as well as roughly half a million paddy grid unit simulations. DNDC version 9.1 was used in the present study.
To validate and assess the performance of DNDC modelling, observed SOC contents acquired in 2000 from 1033 soil sampling sites within paddy polygon units at the C5 map scale were compared against modelled values (L. M. Zhang et al., 2014). The observed SOC content of the top layer (0-15 cm) varied from 1.9 to 36 g kg−1, and the simulated SOC ranged from 5.1 to 34 g kg−1 in 2000; 99.6 % of the simulated polygon units in C5 fell within the range of the observed values. Four statistical criteria, the correlation coefficient (r), the relative error (E), the mean absolute error (MAE) and the root mean square error (RMSE), were employed to evaluate model performance. An r of 0.5 (significant at p < 0.01), an E of 6.4 %, an MAE of 4.0 g kg−1 and an RMSE of 5.0 g kg−1 all indicated that the modelled results were encouragingly consistent with the observations and that the DNDC model was acceptable for SOC modelling of paddy soils in the Tai Lake region (L. M. Zhang et al., 2014). For a more complete discussion of DNDC model validation and error assessment for the region, see L. M. Zhang et al. (2012, 2014).
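The four validation criteria can be computed from paired observed and simulated SOC contents as in the following sketch. The exact formulas used by L. M. Zhang et al. (2014) are not given in the text, so the definitions here are assumptions: r is the Pearson correlation coefficient, and E is taken as the percentage difference between the simulated and observed means. The function name is likewise illustrative.

```python
import math


def validation_stats(observed, simulated):
    """Return r, E (%), MAE and RMSE for paired observed/simulated values.

    Assumed definitions (not confirmed by the source):
      r    Pearson correlation coefficient
      E    100 * (mean simulated - mean observed) / mean observed
      MAE  mean of |simulated - observed|
      RMSE square root of the mean squared difference
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return {
        "r": cov / math.sqrt(var_o * var_s),
        "E": 100.0 * (mean_s - mean_o) / mean_o,
        "MAE": sum(abs(s - o) for o, s in zip(observed, simulated)) / n,
        "RMSE": math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n),
    }


if __name__ == "__main__":
    # Made-up SOC contents (g kg-1), for illustration only.
    print(validation_stats([10.0, 20.0, 30.0], [12.0, 18.0, 33.0]))
```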

Data calculation and analysis
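Simulated SOC density (SOCD, kg C m−2) of a paddy polygon or grid unit is calculated according to the following equation (Yu et al., 2014):
$$\mathrm{SOCD} = \sum_{i=1}^{n} (1 - \delta_i\%) \times \rho_i \times C_i \times T_i / 100 \qquad (1)$$
where n is the number of soil pedogenic layers, δ_i % represents the volumetric percentage of the fraction > 2 mm (rock fragments), ρ_i is the bulk density (g cm−3), C_i is the simulated soil organic C content (g kg−1) and T_i is the thickness (cm) of layer i.
The index value (IV) of STN was obtained by counting; the IVs of AREA, SOCS and ASOCD were calculated as follows, respectively:
$$\mathrm{AREA} = \sum_{j} \mathrm{AREA}_j \qquad (2)$$
$$\mathrm{SOCS} = \sum_{j} \mathrm{AREA}_j \times \mathrm{SOCD}_j \qquad (3)$$
$$\mathrm{ASOCD} = \mathrm{SOCS} / \mathrm{AREA} \qquad (4)$$
where AREA_j is the area of the j-th paddy polygon or grid unit, SOCD_j is its simulated SOC density, and j indexes the paddy polygon or grid units (Yu et al., 2014).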
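As a worked illustration of these quantities, the sketch below assumes the usual layer-wise form of the density calculation, in which each layer contributes (1 − δ_i %) × ρ_i × C_i × T_i / 100 with T_i the layer thickness in cm; the thickness term and the factor 100 are assumptions made where the source text is incomplete, although they are what the stated units require. The function names, the choice of km2 for unit areas and Tg C for the regional total are also assumptions of the sketch.

```python
def socd_kg_m2(layers):
    """SOC density (kg C m-2) of one simulation unit.

    layers: sequence of (rock_pct, bulk_density_g_cm3, soc_g_kg, thickness_cm).
    Assumed form: sum over layers of (1 - rock%) * rho * C * T / 100.
    """
    return sum((1.0 - rock / 100.0) * rho * c * t / 100.0
               for rock, rho, c, t in layers)


def aggregate_indices(units):
    """Regional AREA, SOCS and ASOCD from a list of (area_km2, socd_kg_m2).

    Units are an assumption of this sketch: AREA in km2, SOCS in Tg C,
    ASOCD in kg C m-2 (area-weighted mean density).
    """
    units = list(units)
    area_km2 = sum(a for a, _ in units)
    socs_kg = sum(a * 1e6 * d for a, d in units)  # 1 km2 = 1e6 m2
    return {"AREA": area_km2,
            "SOCS": socs_kg / 1e9,                # 1 Tg = 1e9 kg
            "ASOCD": socs_kg / (area_km2 * 1e6)}


if __name__ == "__main__":
    # One 15 cm layer, no rock fragments, 1.2 g cm-3, 15 g C kg-1 (made-up values).
    print(socd_kg_m2([(0.0, 1.2, 15.0, 15.0)]))
    print(aggregate_indices([(100.0, 2.0), (300.0, 4.0)]))
```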
The variation of an index value (VIV, %) between a grid unit dataset (IVraster) and its parent polygon unit dataset (IVvector) is used as a measure of the consistency between the two datasets. The two datasets are regarded as consistent, or identical, in accuracy only if the VIVs of all indices are less than 1 % (Yu et al., 2014). The VIV is calculated as follows:
$$\mathrm{VIV}(\%) = \mathrm{ABS}\left(100 \times (\mathrm{IV}_{\mathrm{vector}} - \mathrm{IV}_{\mathrm{raster}}) / \mathrm{IV}_{\mathrm{vector}}\right) \qquad (5)$$
where ABS is the absolute value function, IVvector is an index value obtained from a polygon unit dataset, and IVraster is the same index value obtained from its affiliated grid unit dataset.
The optimal soil grid unit size for converting a polygon unit dataset to a grid unit dataset is the maximum grid cell size at which the two datasets are still scaled identically. Statistical analyses were conducted using the Excel and Origin 10 software.
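The selection rule can be sketched in a few lines of Python. The VIV function follows the formula given in the text; the selection function is one possible reading of the rule, in which cell sizes are scanned from fine to coarse and the search stops at the first size where any index has a VIV of 1 % or more. The function names and the example index values are hypothetical.

```python
def viv(iv_vector, iv_raster):
    """Variation of an index value (%), as defined in the text."""
    return abs(100.0 * (iv_vector - iv_raster) / iv_vector)


def optimal_cell_size(vector_ivs, raster_ivs_by_size, threshold=1.0):
    """Largest cell size up to which every index keeps VIV < threshold.

    vector_ivs: {index_name: value} for the parent polygon dataset.
    raster_ivs_by_size: {cell_size: {index_name: value}} for the grid datasets.
    Returns None if even the finest grid fails the criterion.
    """
    best = None
    for size in sorted(raster_ivs_by_size):
        raster_ivs = raster_ivs_by_size[size]
        if all(viv(vector_ivs[k], raster_ivs[k]) < threshold for k in vector_ivs):
            best = size
        else:
            break  # coarser sizes are ignored once the criterion has failed
    return best


if __name__ == "__main__":
    # Made-up index values, for illustration only.
    polygon = {"STN": 10, "AREA": 1000.0, "SOCS": 50.0, "ASOCD": 5.0}
    grids = {
        0.1: {"STN": 10, "AREA": 1000.0, "SOCS": 50.0, "ASOCD": 5.0},
        0.2: {"STN": 10, "AREA": 1005.0, "SOCS": 50.2, "ASOCD": 5.0},
        0.3: {"STN": 10, "AREA": 1020.0, "SOCS": 50.2, "ASOCD": 5.0},
        0.4: {"STN": 10, "AREA": 1000.0, "SOCS": 50.0, "ASOCD": 5.0},
    }
    print(optimal_cell_size(polygon, grids))
```

Stopping at the first failure, rather than taking the largest passing size overall, guards against a coarse grid that happens to pass by chance when VIVs scatter irregularly with cell size.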

[…] 2012, 2014). The spatial distribution characteristics of these soil properties depicted by the various simulation unit datasets differ from each other. Differences in input parameter values affect the uncertainty of the modelling (Valade et al., 2014; Zhu and Zhuang, 2014). A decrease in map scale or raster resolution changed the estimated content of these properties (Tables 1-6) and, correspondingly, the simulated SOC (Table 7).
Variability in weather data (precipitation, maximum and minimum air temperature) and farming management scenarios (sowing method, nitrogen fertilizer application rates, livestock, planting and harvest dates, etc.) among these simulation unit datasets can be neglected for the purposes of this analysis, because they came from the same county-scale weather and farming management database (Yu et al., 2011, 2013) overlain with these soil polygon datasets. Changes in soil types, their attributes and their areas are the main source of the variability in SOC simulated by DNDC associated with simulation unit scale and resolution (Yu et al., 2011, 2013).

Index values determined from simulation polygon units at different map scales
The type of basic mapping unit, the numbers of paddy soil types (STN) and polygon units (SPN), and the soil area (AREA) determined from the six paddy polygon unit datasets at different map scales, which describe the physical characteristics of these soil datasets, differ from each other (Table 7). For instance, four of the six paddy soil subgroups (Bleached, Percogenic, Degleyed and Submergenic) are not described in the N14 polygon unit dataset but are described in the other five datasets. Such data scarcity is likely one of the substantial causes of uncertainty in regional-scale modelling, as noted by W. Zhang et al. (2014). Understandably, the C5 paddy polygon unit dataset, which contains the largest numbers of soil polygon units, soil families and soil species (Table 7), is the most detailed and accurate database for the Tai Lake region (L. M. Zhang et al., 2009, 2012; Yu et al., 2011, 2013). The IVs of STN, AREA, SOCS and ASOCD obtained from the C5 dataset are therefore considered the most reliable in the region (Yu et al., 2011, 2013). The IVs of SOCS and ASOCD for surface paddy soils simulated by DNDC with the six polygon unit datasets also differ markedly from each other (Table 7). In general, the IV of SOCS increased as the map scale of the polygon unit dataset decreased. The highest IV of SOCS was simulated with the N14 polygon unit dataset, owing to the largest mapped area of Hydromorphic paddy soils, which have the highest simulated SOCD (> 8 kg C m−2). The area of Hydromorphic paddy soils mapped in the N14 dataset, shown by the darkest polygons of DNDC-simulated SOCD (Fig. 2f), is roughly 7 times that in the C5 dataset (Fig. 2a). The spatial distribution maps of SOCD simulated with these polygon unit datasets also differ from each other (Fig. 2). Because they are highly generalized, the SOCD maps of surface paddy soil simulated by DNDC with the N4 and N14 polygon unit datasets differ distinctly from the others. Clearly, the map scale of the soil polygon unit dataset significantly influences the results of regional SOC pool simulation (Zhao et al., 2006; Xu et al., 2011, 2012b; L. M. Zhang et al., 2014).

Optimal soil grid unit resolutions for SOC modelling at regional map scales
The three paddy polygon unit datasets C5, D2 and P5 are representative of regional-scale digital maps, describing soil features at the county, district and province levels, respectively (Yu et al., 2013). The VIVs of the four assessment indices (STN, AREA, SOCS and ASOCD) between the grid unit datasets and their parent polygon unit dataset increase with increasing grid cell size (Fig. 3a-c). The magnitude and trend of VIV vary with grid cell size and by dataset and index. For instance, the VIV of STN from the C5 or D2 datasets varies with grid cell size along an exponential curve (Yu et al., 2011), while that from P5 follows a logarithmic curve (Yu et al., 2014).
For the C5 polygon unit dataset and its affiliated grid unit datasets, the VIVs of the four indices are all > 1 % when the grid cell size is > 0.5 km, and only the VIV of ASOCD is < 1 % when the grid unit resolution ranges from 0.3 to 0.5 km. With the grid cell size decreasing to 0.3 km, three of the four VIVs are < 1 %, the exception being the SOCS index. Only when the grid cell size is ≤ 0.2 km (Fig. 3a), with the STN index based on soil species (Table 7), are the VIVs of all four indices < 1 %. The 0.2 km × 0.2 km resolution is therefore optimal for converting the C5 dataset from polygon to grid units, as it is at this cell dimension that the grid and parent polygon unit datasets are roughly equivalent in information content and data redundancy is at a minimum (Fig. 3a) when simulating the regional SOC pool with DNDC.
Similarly, for the D2 and P5 dataset conversions, only the VIV of ASOCD is < 1 % when the raster unit resolution is > 1 km and > 2 km, respectively. When the grid cell size for the D2 conversion decreases to the range of 0.8-1 km, all index VIVs are < 1 % except that of the STN index based on soil species, whereas for the P5 conversion all VIVs are > 1 % when the grid cell size exceeds 2 km, with the STN index based on soil family (Table 7). The VIVs of the four indices derived from the D2 and P5 dataset conversions are all < 1 % only when the grid cell sizes are ≤ 0.7 km and ≤ 1 km, respectively. It is at those cell dimensions that the grid and parent polygon unit datasets are nearly identical and the cell size is maximized, which minimizes the time and cost of the simulation process (Fig. 3b and c). The optimal grid unit resolutions for the D2 and P5 conversions when simulating the regional SOC pool with DNDC are therefore 0.7 km × 0.7 km and 1 km × 1 km, respectively.

Optimal soil grid unit resolutions for SOC modelling at national map scales
The three paddy polygon unit datasets N1, N4 and N14 describe soil features at the national scale (Yu et al., 2013). In general, almost all VIVs of the four assessment indices between these grid unit datasets and their parent polygon unit datasets increase with increasing grid cell size, except for N14 (Fig. 3d-f).
For example, when the grid cell size ranges from 18 to 36 km, which brackets the default grid cell size (28 km), the VIVs of three indices (SOCS, AREA and ASOCD) from the N14 dataset conversion scatter randomly with grid cell size; the exception is the STN index based on soil subgroup. This random scatter complicates the selection of an optimal grid unit resolution, as the VIVs of the four indices do not vary consistently with grid cell size. For simulating the regional SOC pool with DNDC, the optimal grid resolution for the N14 dataset conversion was determined to be 17 km × 17 km, as all VIVs are < 1 % when the grid cell size is ≤ 17 km (Fig. 3f).
The results for the N1 and N4 dataset conversions show that the VIVs of ASOCD and STN are < 1 % while the VIVs of SOCS and AREA are > 1 % when the grid cell size is > 2 km and > 8 km, with the STN index based on soil family and soil subgroup (Table 7), respectively. The VIVs of all four indices from their grid unit datasets meet the < 1 % criterion only when the grid cell size is ≤ 2 km and ≤ 8 km, respectively. Accordingly, grid resolutions of 2 km × 2 km for N1 and 8 km × 8 km for N4 are optimal for converting paddy polygon units to the grid units used as simulation units for DNDC modelling of the regional SOC pool (Fig. 3d and e).

Relationship between polygon unit map scale and matched optimal grid unit resolution for the simulation of regional SOC pool
Correlation analysis indicated a statistically significant relationship between the paddy polygon unit map scale (1 : x) and the matched optimal grid unit resolution (y, km), which can be described by a quadratic regression curve (Fig. 4). The quadratic regression deviates from the standard linear regression that describes the relationship between soil polygon unit map scales and their default grid cell sizes. The quadratic model implies that when the map scale used for regional SOC simulation with DNDC is smaller than 1 : 4 000 000, the optimal grid cell size is smaller than the default, and the deviation increases as the map scale decreases (Fig. 4).
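To make the shape of this relationship concrete, the sketch below fits a quadratic by ordinary least squares to the six optimal resolutions reported in the text (0.2, 0.7, 1, 2, 8 and 17 km for C5, D2, P5, N1, N4 and N14) and compares the fitted value with the default cell size (2 mm × scale). The fitting procedure, the use of scale denominators in millions and all function names are assumptions of this sketch; the coefficients it produces are not claimed to reproduce the regression equation of the paper.

```python
# Scale denominators in millions and the optimal resolutions (km) reported in the text.
SCALE_MILLIONS = [0.05, 0.2, 0.5, 1.0, 4.0, 14.0]
OPTIMAL_KM = [0.2, 0.7, 1.0, 2.0, 8.0, 17.0]


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the 3x3 normal equations."""
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    # Augmented matrix [A | rhs], with A[i][j] = sum of x**(i + j).
    m = [[power_sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, 3):
            factor = m[k][col] / m[col][col]
            for j in range(col, 4):
                m[k][j] -= factor * m[col][j]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = m[i][3] - sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = s / m[i][i]
    return tuple(coef)


def predict(coef, x):
    a, b, c = coef
    return a + b * x + c * x * x


def default_km(scale_millions):
    """Default cell size (km): 2 mm on the map times the scale denominator."""
    return 2.0 * scale_millions


if __name__ == "__main__":
    coef = fit_quadratic(SCALE_MILLIONS, OPTIMAL_KM)
    for s in SCALE_MILLIONS:
        print(s, round(predict(coef, s), 2), default_km(s))
```

A negative quadratic coefficient in such a fit corresponds to the behaviour described above: at the smallest map scale the fitted optimal cell size falls below the default.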


Comparison of simulation grid unit resolutions at different map scales in referenced studies
At the C5 map scale (1 : 50 000), original soil polygon units have been converted to grid cells of 100 m × 100 m (Yang et al., 2009; Shi et al., 2010) and 30 m × 30 m (Su et al., 2012) as basic assessment units to simulate SOC dynamics in agroecosystems. Compared with the optimal resolution (200 m × 200 m), these soil grid units are redundant by the standards suggested here. Similarly, the grid unit datasets at a cell size of 1 km × 1 km converted from the 1 : 1 000 000 (N1) soil polygon unit dataset (Yu et al., 2007b) and of 2 km × 2 km converted from the 1 : 4 000 000 (N4) dataset (Shen et al., 2003) contain considerable redundancy compared with the optimal resolutions of 2 km × 2 km and 8 km × 8 km, respectively, obtained in this study. Although the grid unit datasets used by these researchers retained the same data content as their parent polygon unit datasets, their grid cell size does not represent a real resolution matching the map scale, owing to the data redundancy. If the grid cell were used as the soil sampling and simulation unit, the workload and cost of regional SOC investigation and simulation would be tripled by the increased number of grid cells. By contrast, grid units at cell sizes of 50 km × 50 km converted from the 1 : 4 000 000 (N4) soil polygon unit dataset (Wan et al., 2011) and of 10 km × 10 km converted from the 1 : 1 000 000 (N1) dataset (Y. Q. Yu et al., 2007) were used as assessment units for modelling SOC dynamics in different regions. The simulated results can be expected to have higher uncertainty than simulations with the parent polygon units, because these grid unit datasets are coarser than their parent polygon unit datasets. If such a grid cell is used as the soil sampling and simulation unit, the regional SOC investigation and simulation will not match the map scale in accuracy.
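A rough sense of the redundancy involved can be obtained from cell counts alone. The sketch below assumes that, over a fixed area, the number of grid cells scales with the inverse square of the cell size, ignoring edge effects and cells that contain no paddy soil. Cell count is only a proxy for sampling and simulation workload, so these ratios need not equal the factor quoted in the text; the function name is illustrative.

```python
def cell_count_ratio(used_cell_m, optimal_cell_m):
    """Cells needed at the cell size used, relative to the optimal cell size.

    Assumes a fixed area tiled by square cells: count is proportional to 1 / size**2.
    Values above 1 indicate redundancy, values below 1 a coarser-than-optimal grid.
    """
    return (optimal_cell_m / used_cell_m) ** 2


if __name__ == "__main__":
    # (cell size used, optimal cell size) in metres, taken from the comparisons in the text.
    cases = {
        "C5, 100 m vs 200 m": (100.0, 200.0),
        "C5, 30 m vs 200 m": (30.0, 200.0),
        "N1, 1 km vs 2 km": (1000.0, 2000.0),
        "N4, 2 km vs 8 km": (2000.0, 8000.0),
        "N1, 10 km vs 2 km": (10000.0, 2000.0),
        "N4, 50 km vs 8 km": (50000.0, 8000.0),
    }
    for label, (used, optimal) in cases.items():
        print(label, round(cell_count_ratio(used, optimal), 4))
```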
The Harmonized World Soil Database (HWSD), completed by FAO/IIASA/ISRIC/ISSCAS/JRC in 2009, was produced at a cell size of about 1 km × 1 km from an original polygon unit dataset that contains over 16 000 different soil mapping units and was derived from the Soil Map of the World (1 : 5 000 000), regional Soils and Terrain Digital Databases (SOTER) (1 : 1 000 000 to 1 : 5 000 000), the European Soil Map and the Soil Map of China (1 : 1 000 000, digitized and compiled by D. S. Yu et al.) (FAO/IIASA/ISRIC/ISSCAS/JRC, 2009). Based on the relationship developed here and assuming a scale of 1 : 5 000 000, the effective resolution of the grid unit dataset would be roughly 10 km × 10 km rather than 1 km × 1 km. Use of the HWSD at a grid cell size of 1 km × 1 km for global SOC pool research would therefore be subject to elevated data redundancy and uncertainty at that map scale (Yu et al., 2011, 2013, 2014), although it is an excellent global soil database at present.

Comparison of optimal soil raster unit resolutions between calculation and simulation of regional SOC pool
Yu et al. (2014) conducted a similar study using a similar method and the same basic data in the same region as this study. The methodological difference is that in Yu et al. (2014) the SOC content (C_i, g kg−1) in Eq. (1) was the data observed in 1982, which is one of the input parameters for DNDC modelling in this study, whereas here C_i was the value simulated for 2000 by DNDC. This leads to slight differences between the two sets of results. For example, the optimal grid sizes matching the 1 : 4 000 000 (N4) and 1 : 14 000 000 (N14) map scales were 9 and 20 km, respectively, when C_i was observed (Yu et al., 2014), and 8 and 17 km when C_i was simulated. The optimal grid sizes matching the other four map scales (C5, D2, P5, N1), however, showed no difference whether C_i was observed or simulated. Accordingly, the relationships between optimal grid size (y, km) and map scale (1 : x) also differ slightly in their regression parameters between the model fitted with simulated C_i (Eq. 8) and that fitted with observed C_i (Eq. 7; Yu et al., 2014).
The reason for the slight difference is that more soil feature data were involved when C_i was simulated than when it was observed; for example, soil clay content and pH are two input parameters for DNDC modelling. Involving more soil features implies more rigorous criteria for assessing data consistency between grid unit datasets and their parent polygon unit datasets, and leads to a finer optimal raster resolution, even though the same indices and criteria were applied as in Yu et al. (2014). Fortunately, the slight difference occurred only in the polygon unit dataset conversions at the small map scales of N4 and N14, and the relationships between optimal grid cell size (y, km) and map scale (1 : x) revealed in the two studies are both described by a quadratic regression model (Eqs. 7 and 8).
The quadratic regression model (Eq. 8) found in this study, like that of Yu et al. (2014), differs from the standard linear regression that describes the relationship between soil polygon unit map scales and the matched default grid cell sizes (Fig. 4). The quadratic model implies that when the map scale is larger than 1 : 4 000 000 (N4), the optimal grid cell size may be larger than the default. Converting polygon units at these map scales to grid units at the default cell size will therefore result in data redundancy. When the map scale is smaller than 1 : 4 000 000 (N4), the optimal grid cell size is smaller than the default, and the deviation increases as the map scale decreases (Fig. 4). Converting soil polygon units at these map scales to grid units at the default cell size will result in a drop in data accuracy and an increase in simulation uncertainty. Thus, the quadratic model matters more for soil polygon unit dataset conversion at map scales smaller than N4 than at the other map scales. The quadratic model (Eq. 8) can also substitute for Eq. (7) when C_i is observed for the calculation of the regional SOC pool, because the optimal grid unit resolution determined from Eq. (8) may be higher than that from Eq. (7) at a given map scale, and the accuracy of the soil assessment unit dataset and the certainty of the results are more critical than dataset redundancy.
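How a quadratic relationship of this kind can be fitted, and then used to judge a default cell size, is sketched below. The coefficients of Eqs. (7) and (8) are not reproduced in this section, so the example fits synthetic points generated from made-up coefficients; these are not results of this study. The function names, the rescaling of x to millions and the 5 km default in the last line are assumptions for illustration.

```python
def predict(coeffs, x):
    """Evaluate y = a*x**2 + b*x + c."""
    a, b, c = coeffs
    return a * x * x + b * x + c


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    # Normal equations (A^T A) p = A^T y for design columns [x^2, x, 1].
    rows = [[x * x, x, 1.0] for x in xs]
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting on the augmented 3x4 matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    p = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][k] * p[k] for k in range(i + 1, 3))
        p[i] = (m[i][3] - tail) / m[i][i]
    return tuple(p)


def default_cell_effect(optimal_km, default_km):
    """Consequence of gridding at the default size instead of the optimum."""
    if default_km < optimal_km:
        return "redundancy"     # finer than needed: extra cells, no accuracy loss
    if default_km > optimal_km:
        return "accuracy loss"  # coarser than needed: detail is lost
    return "matched"


# Denominators of the six map scales (C5, D2, P5, P1, N4, N14), in millions.
# Rescaling keeps the normal equations well conditioned (an assumption here).
xs = [0.05, 0.2, 0.5, 1.0, 4.0, 14.0]
# SYNTHETIC grid sizes from made-up coefficients, NOT values from the paper.
made_up = (-0.05, 2.0, 0.1)
ys = [predict(made_up, x) for x in xs]

coeffs = fit_quadratic(xs, ys)
print("recovered coefficients:", [round(v, 6) for v in coeffs])
# Hypothetical default of 5 km compared with the 8 km optimum quoted for N4.
print(default_cell_effect(optimal_km=8.0, default_km=5.0))
```

With real data, xs and ys would be the map scale denominators and the optimal grid sizes determined for each scale; the classification then follows the reasoning above, where a default finer than the optimum only adds redundancy and a coarser one costs accuracy.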

Application of the quadratic curve regression model for DNDC modelling at different map scales
Almost all map scales of the soil polygon unit datasets frequently used for China, all generated from the Second National Soil Survey of China, are covered in this study. The six soil map scales were designed for soil mapping at different administrative levels, including county, district, province and the whole country (Shi et al., 2006).
The Tai Lake region is a typical area in China where paddy soil prevails. Although it is located in the Yangtze Delta plain in East China, where rice fields are interspersed with a high density of rivers, ponds, gardens and urban land, the spatial pattern of rice field distribution is similar to that of hilly or mountainous regions, where rice fields coexist with other crops, grass, shrubs, forest and urban land (Yu et al., 2011, 2013, 2014). We may therefore assert with some confidence that the knowledge obtained in this study can be applied elsewhere in East and South China, where 95 % of the rice fields in China are distributed (Li, 1992c).
In North and West China, by contrast, soil vector mapping units are larger than those of East and South China at the various map scales, because natural conditions are simpler and spatial variability is lower. We may conclude that the optimal grid cell size determined from the quadratic model (Eq. 8) can be smaller than the real optimal size in those regions (Yu et al., 2011, 2013, 2014). Applying this optimal grid cell size will cause a little redundancy in the grid unit dataset but will not affect its accuracy in matching the map scale of the soil polygon units. Although the quadratic model was obtained from a specific case study and would vary with the research region, the knowledge can be used as a guideline for soil unit conversion from polygon to grid, and for optimizing field sampling strategies, to support the regional simulation of SOC pool dynamics in China.
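The effect of cell size on such a conversion can be illustrated with a toy example. The sketch below rasterizes a single polygon with a cell-centre rule and compares the gridded area with the polygon area; the conversion rule, the function names and the rectangular test polygon are all assumptions for illustration, since the conversion procedure used in this study is not described in this section.

```python
import math


def polygon_area(vertices):
    """Shoelace area of a simple polygon given as [(x, y), ...]."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def contains(vertices, px, py):
    """Ray-casting point-in-polygon test."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > py) != (y2 > py):  # edge straddles the horizontal ray
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def gridded_area(vertices, cell):
    """Total area of the cells whose centres fall inside the polygon."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    x0, y0 = min(xs), min(ys)
    ncols = math.ceil((max(xs) - x0) / cell)
    nrows = math.ceil((max(ys) - y0) / cell)
    count = 0
    for r in range(nrows):
        for c in range(ncols):
            cx = x0 + (c + 0.5) * cell
            cy = y0 + (r + 0.5) * cell
            if contains(vertices, cx, cy):
                count += 1
    return count * cell * cell


# Hypothetical 10 km x 6 km soil polygon (coordinates in km).
rect = [(0.0, 0.0), (10.0, 0.0), (10.0, 6.0), (0.0, 6.0)]
reference = polygon_area(rect)
for cell_km in (1.0, 2.0, 4.0):
    area = gridded_area(rect, cell_km)
    deviation = (area - reference) / reference * 100.0
    print(f"cell {cell_km:.0f} km: area {area:.0f} km2, deviation {deviation:+.1f} %")
```

For this rectangle the 1 km grid reproduces the polygon area exactly, whereas the 4 km grid retains only two cells (32 km² against 60 km²), which is the kind of accuracy loss that an overly coarse resolution introduces.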
Within China, the extents of a few administrative regions differ from those used here because of their history, human geography and physical geography, resulting in additional soil datasets with non-traditional map scales, such as soil polygon maps at 1 : 75 000, 1 : 100 000 and 1 : 150 000 for the county level, 1 : 250 000 or 1 : 350 000 for the district level, and 1 : 750 000 or 1 : 1 500 000 for the province level (Shi et al., 2006). For soil polygon unit conversion for DNDC modelling at these map scales, the optimal grid resolutions can also be informed by the guidelines presented here.

Conclusion
The DNDC model has been utilized to upscale estimates of SOC from the plot to the region scale. For up-scaled use of DNDC, a region is partitioned into many simulation units, e.g. soil vector polygon units or raster grid units, within which all properties are assumed to be as homogeneous as they are at the plot scale. This homogeneity assumption is a potentially major source of error when extending DNDC modelling from the plot to the region scale. The homogeneity of simulation units is linked to the map scale of the soil polygon units and the resolution of the grid units, both of which strongly influence the results of SOC pool simulation.
Soil grid units are more often applied to SOC pool simulation, as they are more easily manipulated for spatial model simulation, geostatistics and spatial analysis than soil polygon units. Most grid units are derived by data conversion from soil polygon units, but the choice of grid unit resolution varies by researcher even when the units are derived from the same vector polygon unit dataset. An optimal raster resolution matched with a given map scale, for converting soil polygon units to grid units, was put forward in this study.
For the investigation and simulation of the regional SOC pool, the quadratic curve model matters more for soil polygon unit conversion at map scales smaller than N4 than at the other map scales. Although the quadratic curve model was derived from a specific case study and would vary with the investigated region, the knowledge can be used as a guideline for converting soil assessment units from vector polygons to raster grids, for optimizing field sampling strategies, and for further minimizing the uncertainty of the investigation and simulation of the regional SOC pool at different map scales.

… (g kg⁻¹) in 2000, and T_i represents the thickness (cm) of layer i. The simulated SOCD of surface paddy soil is calculated to a depth of 20 cm. Four indices of surface paddy soil, namely paddy soil area (AREA, M ha), number of paddy soil types (STN), simulated SOC stocks (SOCS, Tg) and average SOCD (ASOCD, kg C m⁻²), were selected to assess data accuracy between a paddy grid unit dataset and its parent polygon unit dataset. The four index values (IVs) determined from each polygon unit dataset are taken as the benchmark for comparison with the values from their affiliated grid unit datasets. Except for the index value (IV) of STN obtained …
Author contributions. D. S. Yu and H. D. Zhang pondered the rationale of the method. X. Z. Shi collected the observed and simulated datasets. Y. L. Ni and L. M. Zhang performed the DNDC model simulation. H. D. Zhang and D. S. Yu prepared the manuscript with contributions from all coauthors.

Figure 1. The location of the Tai Lake region.

Figure 4. Relationship between paddy polygon unit map scale and matched optimal grid unit resolution for the SOC simulation with DNDC in the Tai Lake region of China.

Table 1. Statistics of soil parameters input from different resolution units at the map scale of 1 : 50 000 in the Tai Lake region of China *.

Table 2. Statistics of soil parameters input from different resolution units at the map scale of 1 : 200 000 in the Tai Lake region of China *.

Table 3. Statistics of soil parameters input from different resolution units at the map scale of 1 : 500 000 in the Tai Lake region of China *.

Table 4. Statistics of soil parameters input from different resolution units at the map scale of 1 : 1 000 000 in the Tai Lake region of China *.

Table 5. Statistics of soil parameters input from different resolution units at the map scale of 1 : 4 000 000 in the Tai Lake region of China *.