MesoSoil v2.0: An updated soil physical property database for the Oklahoma Mesonet

Soil moisture data from the Oklahoma Mesonet have been used in numerous fields within the Earth sciences, including agriculture, hydrology, and meteorology. Soil matric potentials measured by heat dissipation sensors at Oklahoma Mesonet stations have been converted to soil volumetric water content estimates using soil water retention curve parameters estimated by the Rosetta pedotransfer function. Recently, an improved version of Rosetta, Rosetta3, was released. Informed by this new pedotransfer function and soil sampling at additional locations, an improved version of the Oklahoma Mesonet soil physical property database, MesoSoil v2.0, has been created. This article presents this new soil database, compares soil water retention parameters estimated using Rosetta3 with those derived from the original Rosetta model, and describes the effects of changes in those parameters on estimated volumetric water content. Using the Rosetta3 model led to changes in the estimated water retention parameters, most notably decreases in the parameter α, which is related to the inverse of the air‐entry potential. These changes resulted in volumetric water content estimates that differed from those based on Rosetta1, with mean absolute differences averaging 0.02 cm3 cm–3 across all site‐years and the greatest differences occurring during wet periods. The mean volumetric water content estimated using the Rosetta3 parameters across more than 100 sites was not significantly different from that determined by soil sampling. The updated database is publicly available and may be found here: http://soilphysics.okstate.edu/data.


INTRODUCTION
The Oklahoma Mesonet is a statewide environmental monitoring network that has been continuously collecting meteorological data since 1994 and soil moisture data since 1996 at more than 100 locations across the state (Figure 1), making it one of the longest-running automated soil moisture monitoring networks in the world (McPherson et al., 2007). Since 2013, a soil physical property database (MesoSoil) developed by Scott et al. (2013) using the original Rosetta pedotransfer function (Schaap et al., 2001), or Rosetta1, has been used to convert soil matric potentials measured by heat dissipation sensors into soil volumetric water content estimates, commonly referred to as soil moisture. Prior research has used these volumetric water content data for a number of applications, including developing soil moisture-based drought
Recently, an updated version of the Rosetta pedotransfer function, Rosetta3, was developed (Zhang & Schaap, 2017). Based on this new model, an updated version of the Oklahoma Mesonet's soil physical property database, named MesoSoil v2.0, has been created and incorporated into the publicly available soil characteristic metadata file. The updated database includes additional soil physical property data for more recently installed monitoring sites, as well as depth-interpolated soil physical property estimates for most sites at the 10-cm depth, for which soil samples were previously not collected and analyzed. This paper presents this new, more complete database for use in place of the previous soil physical property database, compares estimated soil water retention parameters resulting from the Rosetta3 model with those from the original Rosetta model, quantifies the error associated with using interpolated 10-cm soil physical properties as inputs to the Rosetta3 model, and describes the effects of changes in the water retention parameters on estimated volumetric water content data. To our knowledge, this paper is the first after Zhang and Schaap (2017) to apply both the Rosetta1 and Rosetta3 pedotransfer functions to a large soil physical property dataset and quantify the resulting changes in hydraulic parameters.

Rosetta3 model
Similar to Rosetta1, the Rosetta3 model is a hierarchical artificial neural network, which is suited for different levels of input data availability. The same dataset that was used to train Rosetta1 was also used to train Rosetta3 (Schaap et al., 2001; Zhang & Schaap, 2017). However, Rosetta3 has several advantages over Rosetta1 in that it unifies the water retention and saturated hydraulic conductivity (Ks) prediction models into one model, relies upon more bootstrapping replications (1,000 in Rosetta3 vs. 60 or 100 in Rosetta1), and is implemented in Python (Python Software Foundation, Python Language Reference, version 2.7) rather than the Windows 98/XP graphical user interface used for Rosetta1, which requires the use of a 32-bit operating system and has become unsupported for regular use. On the other hand, a major drawback in using the Rosetta3 model is that the unification of the water retention and Ks estimation models prevents the prediction of the matching point conductivity (Ko) and L parameters (Zhang & Schaap, 2017). This limitation forces users to forgo the use of the Ko parameter and to use a constant value of 0.5 for the L parameter when estimating unsaturated hydraulic conductivity values (e.g., Wyatt et al., 2017). Another key difference between Rosetta1 and Rosetta3 is that the option presented in Rosetta1 for using the "best possible" hierarchical model is not available for Rosetta3. This option allowed soil physical properties to be estimated using different hierarchical models (i.e., variable levels of input data) in a single model run. Without this option in Rosetta3, either all sites/depths must be run using the same hierarchical model, resulting in sites/depths with missing parameters being excluded from analysis, or multiple input datasets must be analyzed using multiple model runs to account for varying levels of input data availability.

Core Ideas
• Rosetta3 pedotransfer function was used to develop an updated Oklahoma Mesonet soil property database.
• Along with recent recalibration, the new database improves soil moisture estimates.
• The greatest changes in volumetric water content estimates were observed during wet conditions.
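The multi-run workaround described for Rosetta3 amounts to sorting samples by the most comprehensive hierarchical model their available inputs support, then running each group separately. The following is a minimal Python sketch of that grouping step only; it does not call Rosetta3 itself. The field names (`sand`, `bulk_density`, `theta_33`, and so on), the sample records, and the helper functions are hypothetical, and the input lists for H3 and H4 follow the usual description of the Rosetta hierarchy rather than anything stated in this article.

```python
from collections import defaultdict

# Inputs assumed for each numeric hierarchical model (H1, which uses only
# the textural class, is left out). Field names are hypothetical.
MODEL_INPUTS = {
    "H2": ("sand", "silt", "clay"),
    "H3": ("sand", "silt", "clay", "bulk_density"),
    "H4": ("sand", "silt", "clay", "bulk_density", "theta_33"),
    "H5": ("sand", "silt", "clay", "bulk_density", "theta_33", "theta_1500"),
}


def best_model(sample):
    """Return the most comprehensive model whose inputs are all present."""
    for name in ("H5", "H4", "H3", "H2"):
        if all(sample.get(key) is not None for key in MODEL_INPUTS[name]):
            return name
    return None  # not even texture is available


def group_by_model(samples):
    """Group samples so that each group can be passed to one model run."""
    groups = defaultdict(list)
    for sample in samples:
        groups[best_model(sample)].append(sample)
    return dict(groups)


# Hypothetical records: one complete sample and one lacking retention data.
samples = [
    {"site": "A", "depth_cm": 5, "sand": 40.0, "silt": 40.0, "clay": 20.0,
     "bulk_density": 1.45, "theta_33": 0.28, "theta_1500": 0.12},
    {"site": "B", "depth_cm": 25, "sand": 20.0, "silt": 50.0, "clay": 30.0,
     "bulk_density": 1.50, "theta_33": None, "theta_1500": None},
]
groups = group_by_model(samples)
```

Each resulting group would then be submitted as its own input dataset, which reproduces the effect of the "best possible" option at the cost of several runs.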

Rosetta3 model input data
The most comprehensive of the five hierarchical models within Rosetta (H5) uses measurements of six soil physical properties: sand, silt, and clay percentages, bulk density, and volumetric water content at matric potentials of −33 and −1,500 kPa. The Rosetta model uses these input data to estimate the van Genuchten water retention parameters: the residual water content (θr), saturated water content (θs), α, n, and saturated hydraulic conductivity (Ks) (and the matching point conductivity [Ko] and L, if using Rosetta1) (van Genuchten, 1980).

Soil sampling and laboratory measurement procedures were described by Scott et al. (2013), and a summary of soil properties for samples included in the present analysis is shown in Table 1. For a small subset of sites, soil samples were collected by hand in 5-cm diameter rings as opposed to the hydraulic soil sampler used by Scott et al. (2013). All the measured soil property data were used as input data in the Rosetta3 model. Thirteen sites also have measured soil physical property data available for the 10-cm depth (see Figure 1).

In MesoSoil v1.0 and in subsequent subversions of that database (MesoSoil v1.3 being the most recent), soil physical properties at the 10-cm depth were not available. After the creation of MesoSoil v1.0, heat dissipation sensors were installed at the 10-cm depth at most Oklahoma Mesonet locations, creating a need for estimated soil water retention parameters at that depth. To address this gap in data availability, a linear interpolation was used for MesoSoil v2.0 to estimate the required Rosetta inputs at the 10-cm depth for sites without measured soil characteristic data at that depth (i.e., all but 13 sites). The interpolation was based on measured values from the 5- and 25-cm depths. If measured soil properties were not available at both the 5- and 25-cm depths, then 10-cm water retention parameters for that site remained unavailable.
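The 10-cm interpolation can be illustrated with a short Python sketch. It assumes a straight-line interpolation in depth between the 5- and 25-cm measurements, applied property by property; the function name and the example values are hypothetical, and the exact procedure used for MesoSoil v2.0 may differ in detail.

```python
def interpolate_to_depth(value_shallow, value_deep,
                         depth_shallow=5.0, depth_deep=25.0, target=10.0):
    """Linearly interpolate one soil property to the target depth (cm).

    Returns None when either measurement is missing, mirroring the rule
    that 10-cm parameters stay unavailable without both depths.
    """
    if value_shallow is None or value_deep is None:
        return None
    weight = (target - depth_shallow) / (depth_deep - depth_shallow)
    return value_shallow + weight * (value_deep - value_shallow)


# Hypothetical sand percentages measured at 5 and 25 cm.
sand_10 = interpolate_to_depth(40.0, 20.0)
# A missing 25-cm value leaves the 10-cm estimate unavailable.
clay_10 = interpolate_to_depth(18.0, None)
```

With the default depths, the 10-cm estimate is weighted 75% toward the 5-cm value and 25% toward the 25-cm value.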
The soil physical property input data used in the Rosetta3 model to develop MesoSoil v2.0 were the same data used by Scott et al. (2013) to create the original MesoSoil database, plus additional data for newer sites that have been sampled since 2013, and measured or interpolated data for the 10-cm depth where available.
After the estimation of updated soil water retention parameters using Rosetta3, parameters from the MesoSoil v1.3 and MesoSoil v2.0 databases were compared. Changes in parameters at each Oklahoma Mesonet station location and depth were quantified and visualized using boxplots. The use of boxplots provides both numerical and visual information regarding the magnitude of parameter values, median parameter values, as well as the range and distribution of estimated parameter values across all sites and depths in the network. Additionally, we used pairwise t tests to determine whether statistically significant changes in Rosetta parameters were found by depth.
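As an illustration of the by-depth comparison, the sketch below computes a paired t statistic from parameter values estimated at the same sites with the two models. It assumes that the tests pair Rosetta1 and Rosetta3 estimates site by site within a depth, which the text does not spell out; the function name and the θr values are hypothetical. Only the statistic and its degrees of freedom are computed here; a p value could be obtained from a t distribution, for example with `scipy.stats.ttest_rel`.

```python
import math
import statistics


def paired_t_statistic(values_old, values_new):
    """Return the paired t statistic and degrees of freedom.

    Differences are taken as new minus old, site by site.
    """
    diffs = [new - old for old, new in zip(values_old, values_new)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical residual water contents at one depth for four sites.
theta_r_rosetta1 = [0.05, 0.06, 0.07, 0.08]
theta_r_rosetta3 = [0.07, 0.09, 0.08, 0.11]
t_stat, dof = paired_t_statistic(theta_r_rosetta1, theta_r_rosetta3)
```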

Parameter influence on estimated volumetric water content data
In addition to quantifying changes in water retention parameters between MesoSoil v1.3 and MesoSoil v2.0, we studied the effects of changes in these parameters on estimated volumetric water content data. Volumetric water content data are calculated by first converting temperature differential measurements from heat dissipation sensors (CS-229, Campbell Scientific) to matric potential estimates according to

\[ \psi_m = \frac{\psi_{\min}}{1 + \exp\left[-a\left(\Delta T_{ref} - \Delta T_{inf}\right)\right]} \qquad (1) \]

where ψm is the matric potential (kPa), ψmin = −2,083 kPa and represents the minimum matric potential that can be measured with this equation, a = 3.35 °C−1 and represents the steepness of the curve, ΔTinf = 3.17 °C and represents the ΔTref value at which the curve reaches its inflection point (Zhang et al., 2019), and ΔTref is the normalized temperature rise, which ranges from 1.38 to 3.96 °C (Illston et al., 2008).
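A minimal Python sketch of this conversion follows. It assumes the logistic form ψm = ψmin / (1 + exp[−a(ΔTref − ΔTinf)]), which matches the parameter descriptions (a lower bound ψmin, a steepness a, and an inflection point at ΔTinf); the function and constant names are illustrative and are not taken from any Mesonet software.

```python
import math

PSI_MIN = -2083.0  # kPa, minimum matric potential of the curve
A = 3.35           # 1/degC, steepness of the curve
DT_INF = 3.17      # degC, normalized temperature rise at the inflection point


def matric_potential(dt_ref):
    """Convert normalized temperature rise (degC) to matric potential (kPa).

    Assumes a logistic sensor response; see the stated assumptions above.
    """
    return PSI_MIN / (1.0 + math.exp(-A * (dt_ref - DT_INF)))


# Endpoints of the reported range of the normalized temperature rise.
psi_wet = matric_potential(1.38)  # wettest condition, closest to 0 kPa
psi_dry = matric_potential(3.96)  # driest condition, approaches PSI_MIN
```

Under this form the matric potential equals half of ψmin at the inflection point and can never fall below ψmin.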
Matric potential values are then converted to a volumetric water content using Rosetta-estimated water retention parameters according to

\[ \theta = \theta_r + \frac{\theta_s - \theta_r}{\left[1 + \left(\alpha\,|\psi_m|\right)^{n}\right]^{m}} \qquad (2) \]

where θ is the soil volumetric water content (m3 m−3), θr is the residual water content (m3 m−3), θs is the saturated volumetric water content (m3 m−3), α is a fitting parameter related to the inverse of the soil's air-entry potential (kPa−1), n is an additional fitting parameter, and m = 1 − 1/n (van Genuchten, 1980). To summarize changes in calculated soil volumetric water content values across a wide range of climatological, geographical, and soil property conditions, we compared soil volumetric water contents estimated using soil water retention parameters from both the MesoSoil v1.3 and MesoSoil v2.0 databases, and differences in volumetric water content between the two were quantified using mean absolute difference (MAD) and bias. Additionally, to determine which MesoSoil database led to the most accurate estimates of soil moisture, we compared estimated volumetric water contents found using parameters from both the Rosetta1 and Rosetta3 models with measured volumetric water content found using intact core samples for a subset of sites.
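The retention-curve step can be sketched in Python as follows, using the van Genuchten (1980) form θ = θr + (θs − θr) / [1 + (α|ψm|)^n]^m with m = 1 − 1/n. The function name and the parameter values in the example are hypothetical and are not taken from the MesoSoil database.

```python
def volumetric_water_content(psi_m, theta_r, theta_s, alpha, n):
    """Van Genuchten (1980) water content for a matric potential in kPa.

    alpha is in 1/kPa; the absolute value is used because matric
    potential is negative in unsaturated soil.
    """
    m = 1.0 - 1.0 / n
    suction = abs(psi_m)
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * suction) ** n) ** m


# Hypothetical parameters for one site and depth.
params = {"theta_r": 0.05, "theta_s": 0.40, "alpha": 0.30, "n": 1.5}
theta_wet = volumetric_water_content(-5.0, **params)
theta_dry = volumetric_water_content(-1500.0, **params)
```

The result is bounded by θs at zero suction and approaches θr as the soil dries.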

Changes in estimated soil hydraulic properties
Using Rosetta3 resulted in changes to the estimated water retention parameters (Figure 2), and when summarized by depth, more than half of those changes were statistically significant (Table 2). Median residual volumetric water content (θr) values increased at all sensor depths, with a mean increase across sensor depths of 0.021 cm3 cm−3. This change in residual water content values is smaller than that reported by Zhang and Schaap (2017), who found that θr values increased by ∼30% when using the second least comprehensive hierarchical model within Rosetta3 (H2), which considers only sand, silt, and clay percentages as model inputs. Further, mean θr values found using Rosetta1 and Rosetta3 were significantly different (p < .01) at all depths, indicating that the use of the Rosetta3 model significantly changes this parameter. Median saturated volumetric water content (θs) values decreased at all sensor depths, with a mean decrease of 0.011 cm3 cm−3. This decrease is similar to that observed by Zhang and Schaap (2017), who noted that θs values from Rosetta3 were, on average, 0.018 cm3 cm−3 lower than those estimated by the Rosetta1 model using soil textural classes alone as model inputs. Although decreases in mean θs values were observed at each depth, those decreases were only statistically significant (p < .05) at the 25-, 60-, and 75-cm depths (Table 2).
Median alpha (α) values decreased at all depths, with an average decrease of 0.048 kPa−1. There was also a notable decrease in maximum α values between the two databases, from a value of 0.76 kPa−1 using Rosetta1 to a value of 0.44 kPa−1 using Rosetta3 (Figure 2c). Additionally, mean α values across sites showed a statistically significant decrease (p < .01) at all depths when using the Rosetta3 predictive model. Again, this trend was also observed by Zhang and Schaap (2017), who found that Rosetta3-estimated α values were, on average, ∼68% lower than those estimated by the Rosetta1 model. Further, Zhang and Schaap (2017) note that the changes in the α parameter values between the two Rosetta models are a result of the imposition of bounds on the parameter values in the Rosetta3 model, which were not present in Rosetta1. Because the α parameter is related to the inverse of the air-entry potential of the soil (van Genuchten, 1980), the observed decrease in α values has the potential to lead to increases in calculated soil volumetric water contents and, in turn, estimated soil hydraulic conductivity values, especially when the soil is wet (i.e., near the air-entry potential). In fact, this phenomenon of increased volumetric water content during wet periods when using Rosetta3 parameters was observed at several sites, as illustrated by data from the Butler site (Figure 3).
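The sensitivity to α can be illustrated by holding the other retention parameters fixed and varying α alone. In the sketch below, the two α values (0.57 and 0.27 kPa−1) are those reported in this article for the Butler site at the 5-cm depth, whereas θr, θs, n, and the matric potentials are hypothetical values chosen only for illustration. In the actual databases the other parameters also differ between Rosetta1 and Rosetta3, so this isolates a single effect rather than reproducing Figure 3.

```python
def van_genuchten(psi_m, theta_r, theta_s, alpha, n):
    """Van Genuchten (1980) water content; psi_m in kPa, alpha in 1/kPa."""
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * abs(psi_m)) ** n) ** m


# Hypothetical fixed parameters; only alpha differs between the two cases.
THETA_R, THETA_S, N = 0.05, 0.40, 1.5
ALPHA_ROSETTA1 = 0.57  # 1/kPa, Butler 5-cm value reported for Rosetta1
ALPHA_ROSETTA3 = 0.27  # 1/kPa, Butler 5-cm value reported for Rosetta3


def alpha_effect(psi_m):
    """Water content difference (lower alpha minus higher alpha)."""
    high = van_genuchten(psi_m, THETA_R, THETA_S, ALPHA_ROSETTA1, N)
    low = van_genuchten(psi_m, THETA_R, THETA_S, ALPHA_ROSETTA3, N)
    return low - high


diff_wet = alpha_effect(-10.0)    # hypothetical wet condition
diff_dry = alpha_effect(-1500.0)  # hypothetical dry condition
```

For these assumed values the lower α gives a higher water content at both potentials, and the difference is larger in the wet case than in the dry case.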

FIGURE 2 Boxplots of (a) residual water content (θr), (b) saturated water content (θs), (c) α, (d) n, and (e) saturated hydraulic conductivity (Ks) parameters for all Oklahoma Mesonet sites included in the MesoSoil v1.3 database using both the Rosetta1 model and the updated Rosetta3 model. The 10-cm depth is not shown because parameters were not estimated for that depth for most sites in the MesoSoil v1.3 database. The red horizontal line indicates the median value, lower and upper edges of the blue box indicate the 25th and 75th percentiles, whiskers indicate the extent of data not considered outliers, and red crosses indicate values considered outliers. Saturated hydraulic conductivity (Ks) data are shown on a logarithmic scale for clarity.

TABLE 2 Mean estimated soil hydraulic parameter values (residual water content [θr], saturated water content [θs], α, n, and saturated hydraulic conductivity [Ks]) by depth across all sites using Rosetta1 and Rosetta3, results of t tests comparing parameter means by depth, and average mean absolute difference (MAD) and bias values for volumetric water content data estimated using Rosetta3 vs. Rosetta1 across all sites with available data.

Mean n values decreased at all depths, but none of the decreases were statistically significant (Table 2). Maximum n values decreased somewhat, from 2.35 using Rosetta1 to 2.09 using Rosetta3. These findings are generally consistent with those of Zhang and Schaap (2017), who noted little change, on average, in n values across soil classes. Mean Ks values increased at all but the 45-cm depth, where they decreased by 1.8 cm d−1, though no statistically significant differences were observed at any depth (Table 2). The range of estimated Ks values increased, with maximum Ks values increasing from 594 cm d−1 using the Rosetta1 model to nearly 765 cm d−1 using the Rosetta3 model. Despite the notable change in maximum Ks values between databases, most Ks values remained within the

FIGURE 3 Time series of volumetric water content (θv) at 5, 25, and 60 cm calculated using soil physical parameters estimated using both Rosetta1 and Rosetta3 during the year 2019 for the (a) Cherokee and (b) Butler sites. When comparing the volumetric water contents from Rosetta1 vs. Rosetta3, the Cherokee site is an example of a site with minimal differences, whereas the Butler site is an example of a site with relatively large differences, particularly at the 5-cm depth.

Impact of linear interpolation on 10-cm soil physical properties
Comparing linearly interpolated 10-cm soil property values with measured values at 13 sites where soil samples have been collected at 10 cm (Figure 1), we were able to quantify the amount of error associated with the linear interpolation method for estimating soil physical properties at that depth, as well as the amount of error in Rosetta parameter estimates resulting from using interpolated rather than measured model input data (Table 3). Sand percentages found using linear interpolation were estimated within 3.2% of measured values, and silt and clay percentages were estimated within 2% of measured values based on the mean absolute error (MAE). The absolute value of the bias was ≤0.10% in all cases. Bulk density estimates had a MAE value of 0.087 g cm−3 and bias of 0.001 g cm−3, indicating good agreement between measured and interpolated bulk densities. Interpolated soil volumetric water contents at −33 kPa and −1,500 kPa were overestimated, with biases of 0.026 and 0.009 cm3 cm−3, respectively. Further, MAE values show that, on average, θ−33 values were estimated within 0.028 cm3 cm−3 and θ−1,500 values were estimated within 0.013 cm3 cm−3.
Differences in Rosetta-estimated parameters (θr, θs, α, n, and Ks) found using linearly interpolated input data vs. measured input data were generally small, with the exception of Ks (Table 3). Residual water contents had a bias of 0.004 cm3 cm−3 and saturated water contents had a bias of 0.005 cm3 cm−3, indicating that linearly interpolated soil physical properties led to greater Rosetta-estimated values for those parameters, on average. Mean absolute error for θr and θs values was low, further demonstrating that, on average, linear interpolation of 10-cm data has little impact on these Rosetta-estimated water retention parameters. Values of α were underestimated using linearly interpolated soil physical property input data, with a bias of −0.026 kPa−1 and a MAE of 0.029 kPa−1. Values of the n parameter were overestimated, as indicated by a bias of 0.032. On average, however, error in n values due to interpolation of soil physical properties was low (MAE = 0.078). The largest parameter changes occurred when considering Ks values, which were underestimated by 31.4 cm d−1, on average, when using linearly interpolated soil physical property data and which had a MAE value of 35.0 cm d−1. These changes in Ks values are large in comparison with the observed changes in the other four Rosetta-estimated parameters because the Ks parameter itself spans several orders of magnitude and the Ks estimates from the Rosetta model have an inherent uncertainty of approximately one order of magnitude (Schaap et al., 2001).

FIGURE 4 Soil volumetric water content measured from soil cores vs. soil volumetric water content (θv) estimated using the daily mean normalized temperature rise (ΔTref) value on the day of sampling along with (a) Rosetta1 and (b) Rosetta3 soil water retention parameters. The shaded error region around the 1:1 line shows the scale of ±MAD (mean absolute difference) for volumetric water content data estimated using Rosetta1 and Rosetta3, which is slightly wider using Rosetta1 (±0.045 cm3 cm−3) than when using Rosetta3 (±0.044 cm3 cm−3).
Though the measured 10-cm soil property data considered here are available for only 13 sites, these sites represent a wide range of geographical and soil physical conditions (Figure 1). Further, the small MAE and bias values shown in Table 3 indicate that the use of linear interpolation to estimate soil physical properties at the 10-cm depth likely does not introduce a large amount of error and is a sufficient method of estimating these missing parameters for sites where 10-cm samples are lacking.

Impact of Rosetta3 soil water retention parameters on estimated soil volumetric water content
The impact of changes in soil water retention parameters on estimated soil volumetric water content was quantified using data from all Oklahoma Mesonet sites with available data over the entire period of record. Soil volumetric water content was estimated at depths of 5, 25, and 60 cm using the matric potential equation (Equation 1) recently developed by Zhang et al. (2019) and compared at all sites. Differences between water content estimates made using Rosetta1 and Rosetta3 water retention parameters were quantified using metrics of mean absolute difference (MAD) and bias (Table 2). Bias was calculated as the volumetric water contents estimated using Rosetta3 parameters minus the water contents estimated using the Rosetta1 parameters.
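The two comparison metrics can be written out as a short Python sketch. It follows the sign convention stated above (Rosetta3 minus Rosetta1) and assumes that both series are aligned in time for a single site and depth; the function name and the water content values are hypothetical.

```python
def mad_and_bias(theta_rosetta3, theta_rosetta1):
    """Return (MAD, bias) for two aligned water content series.

    Differences are Rosetta3 minus Rosetta1, so a positive bias means
    that Rosetta3 parameters give wetter estimates on average.
    """
    diffs = [new - old for new, old in zip(theta_rosetta3, theta_rosetta1)]
    mad = sum(abs(d) for d in diffs) / len(diffs)
    bias = sum(diffs) / len(diffs)
    return mad, bias


# Hypothetical daily water contents (cm3 cm-3) at one site and depth.
theta_r3 = [0.30, 0.25, 0.20]
theta_r1 = [0.28, 0.26, 0.20]
mad, bias = mad_and_bias(theta_r3, theta_r1)
```

Because positive and negative differences cancel in the bias but not in the MAD, a small bias can coexist with a noticeably larger MAD.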
At the 5-cm depth, the average MAD value across sites was 0.022 cm3 cm−3. Smaller average MAD values were found for the deeper depths, with values of 0.020 cm3 cm−3 at 25 cm and 0.018 cm3 cm−3 at 60 cm. In several cases, the largest differences in volumetric water content occurred during wet periods, such as in the months of March through June of 2019 at the Butler site, particularly at the 5-cm depth (Figure 3b). This discrepancy is likely caused by the decreases in the α parameter as discussed in Section 3.1, which decreased from a value of 0.57 kPa−1 at the 5-cm depth using Rosetta1 to 0.27 kPa−1 using Rosetta3.
Mean bias values were positive, indicating that estimates of volumetric water content using the Rosetta3 parameters were greater than those estimated using Rosetta1 parameters on average (Table 2). However, bias values were generally small, suggesting that changes in the site- and depth-average volumetric water contents due to the use of Rosetta3 soil water retention parameters were minimal. At the 5-cm depth, the bias averaged 0.001 cm3 cm−3; at the 25-cm depth, the bias averaged 0.005 cm3 cm−3; and at the 60-cm depth, the bias averaged 0.004 cm3 cm−3. These bias values indicate that the soil water retention parameters estimated using Rosetta3 lead to negligibly increased average volumetric water contents across sites and depths. However, differences between Rosetta1- and Rosetta3-based volumetric water content data may be appreciable at certain sites and times, especially near saturation, as in the case of the Butler location.

Comparing estimated and measured soil volumetric water content
Measured soil volumetric water content values at the time of soil sampling are available for at least one depth at more than 100 sites, and these data have been used in the past to determine the accuracy of estimated volumetric water content data using the original MesoSoil database (Scott et al., 2013) and using various sensor calibrations (Zhang et al., 2019). Using these data, we were able to compare both the MesoSoil v1.3 database and the updated MesoSoil v2.0 database with measured volumetric water content data in order to determine which dataset leads to more accurate estimates of soil volumetric water content (Figure 4). Our results show that the use of water retention parameters from the Rosetta3 model improves estimates of soil volumetric water content as compared to values of soil volumetric water content derived from the Rosetta1 model parameters.
Overall, soil water retention parameters from the MesoSoil v2.0 database resulted in a greater R2 value (.685 vs. .666) and a lower MAD (0.044 vs. 0.045 cm3 cm−3) than those from the MesoSoil v1.3 database. Additionally, the mean measured soil moisture value at the time of sampling (0.234 cm3 cm−3) was significantly different from the mean value estimated using soil hydraulic properties from the Rosetta1 model (0.214 cm3 cm−3, p < .001) but not significantly different from the mean value estimated using the Rosetta3 model (0.233 cm3 cm−3, p = .302). Each of these factors indicates that the newly updated MesoSoil v2.0 database leads to more accurate estimates of soil volumetric water content than parameters from the previous database.

CONCLUSION
The updated Oklahoma Mesonet soil physical property database presented here, based on the new Rosetta3 pedotransfer function model, includes statistically significant changes in the estimated water retention parameters relative to those based on Rosetta1. These changes resulted in improved estimates of soil volumetric water content as compared to measured values at over 100 locations. The greatest differences in estimated volumetric water content between the previous version of the MesoSoil database and the present version were found during wet periods and were likely caused by decreases in the α parameter. In addition to increasing the accuracy of the soil volumetric water content estimates, this new database improves upon the previous version in that it provides additional data for newly installed monitoring sites as well as previously unavailable soil physical and hydraulic property estimates at the 10-cm depth. The updated Oklahoma Mesonet soil physical property database is freely available for download here: http://soilphysics.okstate.edu/data.

CONFLICT OF INTEREST
The authors declare no conflict of interest.