WEPPcloud hydrologic and erosion simulation datasets from 28 watersheds in US Pacific Northwest and calibrating model parameters for undisturbed and disturbed forest management conditions

The WEPPcloud interface is a new online decision-support tool for the Water Erosion Prediction Project (WEPP) model that facilitates data preparation and model runs, and summarizes model outputs into tables and maps that are easily interpretable by users. The interface can be used by land and water managers in the United States, Europe, and Australia who are interested in simulating streamflow, sediment, and pollutant loads from both undisturbed and disturbed (e.g., post-wildfire or post-treatment such as thinning or prescribed fire) forested watersheds. This article contains full hydrologic model runs for 28 forested watersheds in the U.S. Pacific Northwest, performed with the WEPPcloud online interface. It also includes links to repositories with the individual model runs, a table containing default model parameters for disturbed conditions, and figures comparing model outputs to observed data. The data in the repositories include all the raw input and output data from the model as well as the processed data, which can be accessed through tables and shapefiles to provide additional insights into the model outputs. Lastly, the article describes how the data are organized and the content of each folder containing the data. These model runs are useful for anyone interested in modeling forested watersheds with the WEPPcloud interface.

Table
Subject: Hydrology and Water quality
Specific subject area: Decision-support tools in hydrology, soil erosion, and water quality
Type of data

Value of the Data
• These datasets contain: 1) model simulation data from the WEPPcloud online interface, specifically simulated daily streamflow and annual sediment and phosphorus yield for undisturbed forested conditions; 2) graphs comparing model data to United States Geological Survey (USGS) data observed at the watershed outlets; and 3) a table with default model parameters.
• These datasets offer insight into WEPPcloud's capability to simulate daily streamflow and annual sediment and phosphorus yield from undisturbed forests with minimal calibration.
• The main beneficiaries of these resources are land and water managers and researchers interested in the accuracy of the WEPPcloud interface, as well as anyone learning about the WEPP model and the WEPPcloud interface.
• Users can either recreate and run the watersheds in WEPPcloud or run the model with the provided files.

Data Description
These data were used in a WEPPcloud model assessment study, "WEPPcloud: An online watershed-scale hydrologic modeling tool. Part II. Model performance assessment and applications to forest management and wildfires" [1], and are also part of an additional study on the impacts of future forest management options on water quality in the Lake Tahoe basin, California/Nevada [2].
- Fig. 1 shows the location of the modeled watersheds in the Western U.S.
- Table 1 contains information on the modeled watersheds, including watershed name, USGS watershed name and station, and web links to model runs in WEPPcloud. The model runs are also archived in the HydroShare repository and contain both the input and the output data from the model, among other useful information. The watershed names match those used in other studies, which provided the observed water quality data for model assessment [3]. Streamflow for the Mica Creek watersheds, MC3 and MC6, was recorded with flumes. Details regarding data collection can be found in [4].
- Table 2 contains key soil and management parameters used to parameterize WEPPcloud, by management and by three soil types (i.e., granitic, volcanic, alluvial), for the modeled watersheds. These values were summarized from various field studies conducted by the United States Department of Agriculture (USDA) Forest Service, Rocky Mountain Research Station, and from published research papers.
- Figs. 2-10 show daily streamflow and annual sediment and phosphorus yield model outputs as compared to observed data. Modeled streamflow was compared to data from the USGS gauging stations for watersheds in the Lake Tahoe basin, Bull Run, and Cedar River watersheds, and to data measured with flumes in the Mica Creek Experimental Watersheds, Idaho. Modeled sediment and phosphorus yield was compared to flow-weighted annual observations processed by [3].
- Figs. 11-13 show interpolated estimates of baseflow and deep seepage recession coefficients, critical shear, and phosphorus concentrations in runoff, lateral flow, and baseflow for Lake Tahoe basin watersheds in California/Nevada. These values were manually interpolated based on the calibrated values at the 17 watersheds in Lake Tahoe with long-term USGS streamflow data.
- All the model runs, including all input and output data, can be accessed from the web links provided in Table 1 and are also stored in public repositories (see Data Accessibility).
- Model runs folder contains a list and description of all the folders in these model runs, which are archived as .zip files. The data structure in these folders is similar for all WEPPcloud model runs.

climate (folder) contains:
- the climate files generated by hillslope in .prn and .cli formats
- the watershed climate file
- the original Daymet/gridMET data that were used to generate the .cli files

dem (folder) contains:
- the 10- or 30-m Digital Elevation Model (DEM) derived from the National Elevation Dataset
- topaz folder containing the watershed delineation and all the maps created during the watershed delineation

export (folder) contains channels and subcatchments files in GIS format containing topographic characteristics (such as slope, aspect, or length), input data (soil and management), and output information (runoff, lateral flow, baseflow, sediment, pollutant, etc.). The folder also contains several GeoTIFF maps used in the model run.

landuse (folder) contains the land use map (e.g., an ASCII map with the 2016 National Land Cover Database (NLCD) for the US locale). The NLCD codes are translated into WEPP-equivalent management files based on the mapping for the configuration.

observed (folder) contains observed data (if provided by the user)

soils (folder) contains the soil files in WEPP format by mapunit key (mukey) and a SSURGO soils map in ASCII format

watershed (folder) contains files with slope information for each channel and hillslope

- wepp/output contains the main model outputs for each hillslope and for the watershed. Most of these files are self-explanatory; however, we encourage users to check the WEPP user manual [5] for additional information.
- wepp/plots contains maps of gridded soil loss following a flowpath run [6]
- wepp/runs contains all the main WEPP input files
- nodb files, which are JSON-serialized instances of wepppy.nodb classes used by WEPPcloud. These contain metadata related to the project and can be viewed in Firefox, Notepad++, etc.
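Because the nodb files are described above as JSON, they can also be inspected programmatically. The following is a minimal sketch, assuming the files end in `.nodb`, sit at the top level of an unzipped model run, and parse with a standard JSON reader. The function names are hypothetical and not part of wepppy, and the keys found inside each file will depend on the WEPPcloud version that produced the run.

```python
import json
from pathlib import Path


def summarize_nodb(path):
    """Return the top-level keys of one nodb file with the type name of each value."""
    with open(path, "r", encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        # Unexpected layout: report only the root type.
        return {"<root>": type(data).__name__}
    return {key: type(value).__name__ for key, value in data.items()}


def summarize_run(run_dir):
    """Map each *.nodb file name found directly in run_dir to its key summary."""
    return {
        p.name: summarize_nodb(p)
        for p in sorted(Path(run_dir).glob("*.nodb"))
    }
```

For example, `summarize_run("path/to/unzipped_run")` returns one entry per nodb file, which gives a quick overview of the stored project metadata without opening each file by hand.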

Experimental Design, Materials and Methods
The hydrologic simulations were performed with the WEPPcloud interface [7,8] for 28 relatively undisturbed watersheds in the U.S. Pacific Northwest (Lake Tahoe basin, CA/NV; Bull Run Watershed, OR; Cedar River and Taylor Creek, WA; and two watersheds in Mica Creek Experimental Watershed, ID). We compared model outputs such as streamflow and sediment and phosphorus yield to observed data recorded at USGS gaging stations and with flumes (Table 1; [1]). Each model run (including input and output data) can be viewed either online by accessing the web links in Table 1 or by accessing the zipped folders stored in the HydroShare repository. WEPPcloud allows users to view most of the model input selections directly on the main page of the model run or in the PowerUser Panel (Fig. 14). The NoDbs folders contain model selections, while the wepp/runs and wepp/output folders contain all the raw input and output data files. The HydroShare repositories contain the same data in similar folders.
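A downloaded model run can be checked against the folder layout described in this article without unzipping it. The sketch below counts the files under each top-level folder of an archive using only the Python standard library. The function name and the archive file name are illustrative assumptions, and some archives may wrap all folders in a single parent directory, in which case that parent appears as the only top-level entry.

```python
import zipfile
from collections import Counter


def top_level_counts(zip_path):
    """Count the files under each top-level folder of a zipped model run."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            parts = info.filename.split("/")
            # Files stored directly at the archive root have no folder part.
            top = parts[0] if len(parts) > 1 else "(root)"
            counts[top] += 1
    return dict(counts)
```

Calling `top_level_counts("model_run.zip")` (a hypothetical file name) would be expected to list folders such as climate, dem, export, landuse, soils, watershed, and wepp if the archive follows the structure described above.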

Model calibration
All model runs were initially performed with the WEPPcloud default parameters. We then minimally calibrated the model by downloading all the model input data, manually changing key calibration parameters, and rerunning the models with wepppy-win-bootstrap [9], a free Python package developed to facilitate model runs on Windows computers. Lastly, we reran the models on the WEPPcloud interface with the calibrated parameters. For streamflow, the calibration involved altering the linear baseflow recession coefficient (k_b in /wepp/runs/gwecoeff.txt files), the saturated hydraulic conductivity of the underlying geology (K_sub in /wepp/runs/[_].sol files), and the rain/snow temperature threshold (T_rain/snow in the /wepp/runs/snow.txt file). For sediment yield, we altered the channel bed critical shear stress (τ_c in the /wepp/runs/pw0.chn file). For phosphorus yield, we altered the phosphorus concentrations in surface runoff, lateral flow, baseflow, and attached to sediment (in the /wepp/runs/phosphorus.txt file). Minimal calibration was preferred to limit potential issues with equifinality and to demonstrate the model's predictive capabilities. Daily modeled streamflow at all watersheds and annual sediment and phosphorus yield at watersheds in the Lake Tahoe basin were compared to observed data (Figs. 2-10). Goodness-of-fit statistics (Nash-Sutcliffe efficiency, Kling-Gupta efficiency, and percent bias) and additional graphs can be found in [1].
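For readers who wish to recompute goodness-of-fit statistics from the archived outputs, the three statistics named above can be sketched in a few lines of Python. This is a minimal illustration using the commonly cited textbook definitions, not necessarily the exact implementation used in [1]. In particular, the sign convention for percent bias differs between references. Here a positive value means the model overestimates, which is an assumption of this sketch. Inputs are assumed to be paired series of equal length with no missing values.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    mo, ms = _mean(obs), _mean(sim)
    so = math.sqrt(sum((o - mo) ** 2 for o in obs) / len(obs))
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim) / len(sim))
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / len(obs)
    r = cov / (so * ss)   # linear correlation
    alpha = ss / so       # variability ratio
    beta = ms / mo        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def pbias(obs, sim):
    """Percent bias; positive means overestimation (assumed sign convention)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)
```

A simulated series identical to the observations gives an NSE and KGE of 1 and a percent bias of 0, which is a convenient sanity check before applying the functions to real streamflow series.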

Basin-scale model runs
In the Lake Tahoe Basin, we were interested in applying the WEPPcloud interface to all 63 watersheds that flow into the lake and in further running the models for disturbed conditions (thinning, prescribed fire, wildfire, simulated fire) [1,2]. However, model calibration was performed only for the 17 watersheds with long-term USGS data. Therefore, we manually distributed the calibrated parameters to the remaining watersheds based on watershed similarity, parent material, and proximity (Figs. 11-13).
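The distribution of parameters described here was done manually. For readers who want a reproducible starting point, the sketch below shows one simple automated analogue: each uncalibrated watershed borrows parameters from the nearest calibrated watershed with the same parent material, falling back to the nearest calibrated watershed overall. This is a nearest-neighbor stand-in and not the authors' procedure. The dictionary fields (`name`, `x`, `y`, `parent_material`, `params`) and the use of projected centroid coordinates are hypothetical assumptions.

```python
import math


def nearest_donor(target, calibrated):
    """Pick the calibrated watershed whose parameters the target will borrow.

    Donors with the same parent material are preferred; if none exist,
    all calibrated watersheds are considered.
    """
    same = [c for c in calibrated
            if c["parent_material"] == target["parent_material"]]
    pool = same or calibrated
    return min(
        pool,
        key=lambda c: math.hypot(c["x"] - target["x"], c["y"] - target["y"]),
    )


def transfer_parameters(uncalibrated, calibrated):
    """Return {watershed name: copy of the donor's parameter dict}."""
    return {
        t["name"]: dict(nearest_donor(t, calibrated)["params"])
        for t in uncalibrated
    }
```

Any assignment produced this way should still be reviewed against Figs. 11-13, since the manual interpolation also accounted for watershed similarities that a distance rule does not capture.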

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.