Dataset of monthly downscaled future vapor pressure projections for the conterminous USA for RCP 4.5 and RCP 8.5 compatible with NEX-DCP30

Models that simulate ecosystems at local to regional scales require relatively fine resolution climate data. Many methods exist to downscale the native resolution output of global climate models (GCMs) to finer resolutions. NASA NEX-DCP30 is a statistically downscaled 30 arcsecond resolution climate dataset widely used for climate change impact studies in the conterminous USA (CONUS), but it does not include vapor pressure data, which are essential for many types of models. We downscaled vapor pressure data from 28 global climate models included in the Coupled Model Intercomparison Project Phase 5 (CMIP5) to 30 arcsecond resolution for CONUS to augment the NEX-DCP30 dataset. Monthly vapor pressure values were calculated from raw GCM output for the conterminous USA from 1950 to 2100, representing the RCP4.5 and RCP8.5 climate change scenarios. Vapor pressure data were then downscaled from the GCMs' native spatial resolutions to 30 arcseconds using the Bias Correction-Spatial Disaggregation (BCSD) statistical downscaling method, which had been used to create the original NEX-DCP30 dataset. PRISM LT71m gridded climate data for 1970-1999 served as the reference data. The newly created downscaled vapor pressure dataset may be used in conjunction with the existing NEX-DCP30 data as input for vegetation, fire, drought, or earth system models. The data are available at the Forest Service Research Data Archive.



Value of the Data
• Vapor pressure data are a necessary input for many vegetation, fire, drought, or earth system models.
• This dataset is relatively high resolution at 30 arcseconds, approximately 800 m.

Objective
Vapor pressure is a measure of the amount of water vapor held in the air, and is used in simulations that support climate change impact studies [1], including vegetation modeling [2, 3] and wildfire modeling [4, 5]. Spatially explicit vegetation or fire models require vapor pressure data in a gridded format, along with other climate variables. When such models are applied at local or regional scales to explore climate change impacts, they require climate data at a finer resolution than the native global climate model (GCM) resolution. NASA NEX-DCP30 is a dataset comprising gridded future climate projections covering the conterminous USA from 1950 to 2100 at a monthly time step and 30 arcsecond resolution [6]. NEX-DCP30 was created by applying the Bias Correction Spatial Disaggregation method [7] to GCM output. The original NEX-DCP30 dataset includes climate projections from 33 GCMs published by the Coupled Model Intercomparison Project Phase 5 (CMIP5; [8]) for two RCP scenarios [9], but does not include vapor pressure. We downscaled vapor pressure data for 28 of the 33 GCMs in the NEX-DCP30 dataset, so that the NEX-DCP30 dataset may be used to drive vegetation models that require vapor pressure data.

Data Description
The dataset described herein represents vapor pressure at spatial and temporal resolutions and extents identical to those of NASA NEX-DCP30. It includes 28 of the 33 GCMs included in the original NASA NEX-DCP30 dataset (Table 1). It represents vapor pressure for the conterminous USA at a 30 arcsecond spatial resolution (approximately 800 m at this latitude). The values are given at a monthly time step, from 1950 to 2099 or 2100, for each of the 28 GCMs (Table 1). For five of the 33 GCMs in NEX-DCP30 it was not possible to downscale vapor pressure because the necessary GCM data were not available. As with NEX-DCP30, data are available for the RCP4.5 and RCP8.5 climate change scenarios. The data are in annual 12-month files, so that there are 255 files per GCM. The total dataset requires approximately 2.5 TB of storage. Files are named in the following format: "BCSD_0.008deg_vpr_" + [GCM] + "_" + [scenario] + "_" + YYYYMM + "-" + YYYYMM + ".nc", where GCM refers to one of the 28 global climate models, and scenario refers to the climate change scenario. The first set of YYYY and MM represents the beginning year and month, respectively, and the second set represents the ending year and month. Because RCP4.5 and RCP8.5 share the same data for the period 1950 to 2005, the files representing that period include the phrase "historical". This is not to be confused with vapor pressure data associated with the PRISM reference climate data [10]. For example, the file named BCSD_0.008deg_vpr_ACCESS1-0_rcp85_210001-210012.nc contains vapor pressure data downscaled from ACCESS1-0 GCM output for January 2100 to December 2100, for the RCP8.5 climate change scenario.
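The naming convention above can be illustrated with a short Python sketch that builds and parses annual file names. This is only an illustration under stated assumptions: the helper names `build_filename` and `parse_filename` are hypothetical, and the scenario label "rcp45" is assumed by analogy with the "rcp85" label in the example file name and is not confirmed by the text.

```python
import re

# Pattern for the documented naming convention. "rcp85" and "historical"
# appear in the text; "rcp45" is an ASSUMED label, by analogy with "rcp85".
FILE_PATTERN = re.compile(
    r"^BCSD_0\.008deg_vpr_(?P<gcm>.+)_(?P<scenario>historical|rcp45|rcp85)"
    r"_(?P<start>\d{6})-(?P<end>\d{6})\.nc$"
)


def build_filename(gcm, scenario, year):
    """Build the name of the annual (January to December) file for one year."""
    return f"BCSD_0.008deg_vpr_{gcm}_{scenario}_{year:04d}01-{year:04d}12.nc"


def parse_filename(name):
    """Split a file name into GCM, scenario, and start/end (year, month)."""
    match = FILE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized vapor pressure file name: {name}")
    start, end = match.group("start"), match.group("end")
    return {
        "gcm": match.group("gcm"),
        "scenario": match.group("scenario"),
        "start": (int(start[:4]), int(start[4:])),
        "end": (int(end[:4]), int(end[4:])),
    }


if __name__ == "__main__":
    name = build_filename("ACCESS1-0", "rcp85", 2100)
    print(name)
    print(parse_filename(name))
```

Under these assumptions, the file for ACCESS1-0, RCP8.5, and the year 2100 resolves to the example name given above.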
The downscaled vapor pressure dataset is available at the Forest Service Research Data Archive [11]. Vapor pressure values can be readily converted to vapor pressure deficit by subtracting them from saturation vapor pressure, which can be calculated from temperature. There are many empirically derived equations for calculating saturation vapor pressure from temperature [12].
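As an illustration of this conversion, the sketch below computes vapor pressure deficit using the Tetens equation, which is one of the many available empirical formulas and is not necessarily the one a given model uses. The function names are hypothetical, and the sketch assumes vapor pressure in Pa and air temperature in degrees Celsius; users should confirm the units recorded in the file metadata.

```python
import numpy as np


def saturation_vapor_pressure(temp_c):
    """Saturation vapor pressure (Pa) over water from the Tetens equation.

    temp_c: air temperature in degrees Celsius (ASSUMED unit).
    """
    temp_c = np.asarray(temp_c, dtype=float)
    return 610.8 * np.exp(17.27 * temp_c / (temp_c + 237.3))


def vapor_pressure_deficit(vpr_pa, temp_c):
    """Vapor pressure deficit (Pa) = saturation minus actual vapor pressure.

    vpr_pa: actual vapor pressure in Pa (ASSUMED unit of the dataset).
    """
    return saturation_vapor_pressure(temp_c) - np.asarray(vpr_pa, dtype=float)


if __name__ == "__main__":
    # Illustrative values only: 20 degrees C air with 1500 Pa of vapor pressure.
    print(vapor_pressure_deficit(1500.0, 20.0))
```

Because monthly mean temperature is used with a nonlinear formula, the result approximates rather than reproduces the monthly mean of daily vapor pressure deficit.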
Average vapor pressure values were mapped to examine continental scale patterns. Downscaled vapor pressure values exhibit a longitudinal spatial pattern at the continental scale, with the lowest vapor pressure averages in the interior West, and the highest vapor pressure in the Southeast (Fig. 1a). Projected changes into the future are moderate (1.0-1.2 times historical) under the RCP4.5 climate change scenario, and higher (1.2-1.4 times historical) under RCP8.5 (Fig. 1b, c). Seasonal average vapor pressure time series for a sampling of EPA Level III Ecoregions [13] exhibit reasonable patterns. For example, in the California Central Valley ecoregion, seasonal patterns of vapor pressure remain consistent into the future, with higher values in late summer (Fig. 2).

Experimental Design, Materials, and Methods
Global climate model (GCM) data in native resolution were downloaded from the Coupled Model Intercomparison Project Phase 5 (CMIP5) data portal on the Lawrence Livermore National Laboratory (LLNL) node of the Earth System Grid Federation (ESGF) [14]. GCM data were downloaded to the NASA High-End Computing Capability (HECC) platform and all subsequent downscaling was performed on that platform. Since vapor pressure (vpr) values are not published in the ESGF data portal, vpr was calculated at each GCM's native resolution from two published variables, surface specific humidity (huss) and surface air pressure (ps), using the relationship vpr = (huss / (huss + 0.622)) * ps.
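The relationship above can be written as a short NumPy function. This is a minimal sketch rather than the NCL code used for the dataset; the function name is hypothetical, and it assumes huss is dimensionless (kg/kg) and ps is in Pa, as is standard for CMIP5 output, so that vpr is returned in Pa.

```python
import numpy as np

# Ratio of the molar mass of water vapor to that of dry air.
EPSILON = 0.622


def vapor_pressure(huss, ps):
    """Vapor pressure from humidity and surface pressure, as stated in the text.

    huss: surface specific humidity, dimensionless (kg/kg), scalar or array.
    ps:   surface air pressure; the result has the same unit as ps.
    """
    huss = np.asarray(huss, dtype=float)
    ps = np.asarray(ps, dtype=float)
    return huss / (huss + EPSILON) * ps


if __name__ == "__main__":
    # Illustrative values only: 10 g/kg of moisture at standard sea-level pressure.
    print(vapor_pressure(0.01, 101325.0))
```

The function broadcasts over arrays, so it can be applied to whole monthly grids of huss and ps at once. The expression has the form of the exact relation for the water vapor mixing ratio; since specific humidity and mixing ratio differ only slightly at atmospheric moisture levels, it is a close approximation when applied to huss.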
The Bias Correction Spatial Disaggregation (BCSD) method requires a reference dataset at the target resolution [7]. The Parameter-elevation Relationships on Independent Slopes Model (PRISM) dataset [10], version LT71m, was used as the reference dataset. PRISM covers the conterminous USA at 30 arcsecond spatial resolution. Version LT71m spans 1950 to 2005 at a monthly time step, and was selected to best match the temporal span and resolution of the downscaling performed to create the original NEX-DCP30 dataset [6].
BCSD comprises two steps: bias correction and spatial disaggregation [7]. We performed both steps using the NCAR Command Language (NCL) [15]. For bias correction, both the reference data and the raw GCM data were resampled to a common 1 degree resolution grid. Quantile mapping was then applied to each cell in the GCM data to correct bias, using PRISM as the reference data. For the spatial disaggregation step, bias corrected monthly ratio anomalies of GCM output were spatially interpolated to the downscaled resolution, and multiplied by mean historical vapor pressure to obtain the downscaled data.
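The two BCSD steps can be sketched for a single grid cell and a small grid as follows. This is an illustrative NumPy sketch and not the NCL implementation used for the dataset: the function names are hypothetical, empirical quantile mapping with linear interpolation is assumed, and simple bilinear interpolation stands in for whatever interpolation scheme was actually applied.

```python
import numpy as np


def quantile_map(gcm_hist, ref_hist, gcm_values):
    """Empirical quantile mapping for one grid cell.

    Each GCM value is assigned its non-exceedance probability in the GCM
    historical distribution, then replaced by the reference value at the
    same probability. Values outside the historical range are clamped.
    """
    gcm_sorted = np.sort(np.asarray(gcm_hist, dtype=float))
    ref_sorted = np.sort(np.asarray(ref_hist, dtype=float))
    p_gcm = (np.arange(1, gcm_sorted.size + 1) - 0.5) / gcm_sorted.size
    p_ref = (np.arange(1, ref_sorted.size + 1) - 0.5) / ref_sorted.size
    probs = np.interp(gcm_values, gcm_sorted, p_gcm)
    return np.interp(probs, p_ref, ref_sorted)


def disaggregate(bc_coarse, clim_coarse, clim_fine,
                 coarse_lat, coarse_lon, fine_lat, fine_lon):
    """Spatial disaggregation of one bias-corrected monthly field.

    bc_coarse, clim_coarse: (n_coarse_lat, n_coarse_lon) arrays holding the
    bias-corrected field and the coarse historical mean for that month.
    clim_fine: (n_fine_lat, n_fine_lon) fine-resolution historical mean.
    Coordinate vectors are ASSUMED to be in increasing order.
    """
    ratio = np.asarray(bc_coarse, dtype=float) / np.asarray(clim_coarse, dtype=float)
    # Bilinear interpolation: first along longitude, then along latitude.
    along_lon = np.array([np.interp(fine_lon, coarse_lon, row) for row in ratio])
    ratio_fine = np.array(
        [np.interp(fine_lat, coarse_lat, col) for col in along_lon.T]
    ).T
    return ratio_fine * np.asarray(clim_fine, dtype=float)


if __name__ == "__main__":
    ref = np.arange(1.0, 11.0)   # illustrative reference values
    gcm = ref + 2.0              # GCM with a constant bias of +2
    print(quantile_map(gcm, ref, gcm))  # the bias is removed
```

In this sketch the fine-scale spatial detail comes entirely from the reference mean field, while the GCM contributes only the smoothly interpolated ratio anomaly.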
Quality assurance checks were performed across space and time. First, the average downscaled vapor pressure for the historical period (1980-2009) was compared to corresponding averages derived from the reference data to check for spatial errors (Fig. 3). The maps of the ratio of values, one for each of the 28 GCMs, demonstrate that the ratios are close to 1, ranging only

Ethics Statements
The work in this project did not involve human subjects, animal experiments, or data collected from social media platforms.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability
Dataset of monthly downscaled vapor pressure projections, CMIP5 (Original data) (Forest Service Research Data Archive).