High Resolution Forecasting of Summer Drought in the Western United States

Drought monitoring and forecasting systems are used in the United States (U.S.) to inform drought management decisions. Drought forecasting efforts have often been conducted and evaluated at coarse spatial resolutions (i.e., >10‐km), which miss key local drought information at higher resolutions. Addressing the importance of forecasting drought at high resolutions, this study develops statistical models to evaluate 1‐ to 3‐month lead time predictability of meteorological and agricultural summer drought across the western U.S. at a 4‐km resolution. Our high‐resolution drought predictions have statistically significant skill (p ≤ 0.05) across 70%–100% of the western U.S., varying by evaluation metric and lead time. Drought forecasts at 1‐ to 3‐month lead times accurately represent monitored summer drought spatial patterns during major drought events, the interannual variability of drought area from 1982 to 2020 (r = 0.84–0.93), and drought trends (r = 0.94–0.97). Cold‐season (November–February) climate conditions alone explain 71% of the interannual variability of western U.S. summer drought area, allowing skillful 3‐month lead time predictions. Pre‐summer drought conditions (represented by drought indices) are the most important predictors for summer drought. Thus, the statistical models developed in this study rely heavily on the autocorrelation of the chosen agricultural and meteorological drought indices, which estimate land surface moisture memory. Indeed, prediction skill strongly correlates with persistence of drought conditions (r ≥ 0.73). This study is intended to support future development of operational drought early warning systems that inform drought management.

the driest 22-year period since at least 800 (Williams et al., 2022). WUS drought frequency is projected to increase through 2050 with corresponding projections of increased forest-fire activity (Abatzoglou et al., 2021; Strzepek et al., 2010). Given the profound historical consequences, contemporary severity, and future projections of droughts in the WUS, it is essential to continually improve the drought monitoring and forecasting systems that inform policy and management designed to mitigate the consequences of droughts. Therefore, this study aims to improve summer drought forecasting in the WUS by predicting drought at a much higher spatial resolution (4-km) than existing seasonal drought forecasting frameworks (typically >10-km).
Several drought monitoring systems are used in the U.S. to inform drought response. The National Centers for Environmental Information (NCEI) and the Climate Prediction Center (CPC) of the National Oceanic and Atmospheric Administration (NOAA) publish weekly Palmer Drought Severity Index (PDSI) maps for the conterminous U.S. (CONUS) by climate division (https://www.ncei.noaa.gov/access/monitoring/weekly-palmers/; https://www.cpc.ncep.noaa.gov/products/monitoring_and_data/drought.shtml). NCEI also publishes monthly drought maps, referred to as the North American Drought Monitor (NADM), which present drought conditions throughout the continent (https://www.ncdc.noaa.gov/temp-and-precip/drought/nadm/maps). NASA's North American Land Data Assimilation System (NLDAS) is used to create the NLDAS Drought Monitor, which publishes weekly agricultural drought maps based on soil moisture simulations from the Mosaic, Noah, and Variable Infiltration Capacity (VIC) land surface models (LSMs) on a 1/8° (∼12-km) spatial grid (https://ldas.gsfc.nasa.gov/nldas/drought-monitor; Xia et al., 2014). Perhaps the most popular means of monitoring drought in the U.S. is the U.S. Drought Monitor (USDM), which combines expert opinion with meteorological and hydrological data to produce weekly maps of categorical drought severity for the U.S., ranging from abnormally dry (i.e., D0) to exceptional drought (i.e., D4), at the county level (Svoboda et al., 2002; https://droughtmonitor.unl.edu/). The West Wide Drought Tracker (WWDT) uses 4-km observationally based data to monitor drought conditions across the WUS dating back to 1895 (Abatzoglou et al., 2017). These operational monitoring systems are essential to inform drought management and related decision making; however, they do not explicitly provide information about future drought conditions, which is essential to early drought warning and proactive planning (Fontaine et al., 2014).
Drought forecasts are generally made with dynamical (e.g., AghaKouchak, 2014; Shukla et al., 2014; Yoon et al., 2012) or statistical (e.g., Brust et al., 2021; Hao et al., 2016; Madadgar & Moradkhani, 2013, 2014; Park et al., 2016) model simulations of drought-related variables. Many of these approaches have been reviewed in recent literature (Fung et al., 2020; Hao, Yuan, et al., 2017; Prodhan et al., 2022). Dynamical approaches often rely on global or regional climate models based on physical processes of the atmosphere, ocean, cryosphere and land surface. The dynamical approach is the most advanced tool for drought forecasting. For instance, dynamical forecasts from the North American Multimodel Ensemble (NMME) (Kirtman et al., 2014) play an important role in probabilistic prediction of drought. However, dynamical modeling methods are computationally expensive, are often limited by coarse resolution simulations that require bias correction and downscaling, and provide uncertain estimates of precipitation (AghaKouchak, 2014; Hao, Yuan, et al., 2017).
On the other hand, relationships between contemporary and future drought have been leveraged to inform statistical drought forecasts, which are operationally useful but provide relatively little information about physical mechanisms of drought evolution. For instance, streamflow anomalies (used to quantify the severity of hydrologic drought), soil moisture percentiles (used to quantify the severity of agricultural drought) and the widely used PDSI (Palmer, 1965) (used to quantify the severity of meteorological drought) tend to have strong seasonal persistence (Dai, 2011; Lakshmi et al., 2004; Madadgar & Moradkhani, 2013, 2014; Szép et al., 2005), allowing past and present drought to be used as a predictor of future drought. Winter and spring snowpack, commonly referred to as a natural water tower for the western U.S., also plays a critical role in modulating summer drought and aridity conditions and provides additional predictive information for summer drought (Abolafia-Rosenzweig et al., 2022a; Huning & AghaKouchak, 2020; Van Loon, 2015). Statistical approaches, such as regression or machine learning models, are based on empirical relationships depicted in historical climate and drought records and are useful due to their ease of implementation, relatively high computational efficiency, and established success in providing useful drought forecasts in the U.S. (Brust et al., 2021; Hao et al., 2016) and other regions around the globe (AghaKouchak, 2015; Mathivha et al., 2020; Rhee & Im, 2017). For instance, statistical models are capable of forecasting regions where USDM drought classifications are likely to intensify or improve (Lorenz et al., 2017; Otkin et al., 2013; https://www.drought.gov/) and USDM classifications up to 12 weeks in advance (Brust et al., 2021; Hao et al., 2016). Statistical models were also capable of forecasting the extreme 2012 U.S. drought several months in advance (AghaKouchak, 2014).
Although these statistical drought forecasting frameworks can be valuable for informing drought management, these analyses have been conducted at coarse resolutions (>10-km) and thus provide limited information for forecasting the spatial heterogeneity of drought evolution at finer scales, which is valuable for end-users (Samaniego et al., 2019).
Because summer WUS drought varies at finer spatial scales than those resolved by current forecasting systems (typically >10-km) and is closely related to food security and wildfire in the WUS, it is imperative to develop seasonal drought forecasting systems at high spatial resolutions (e.g., <5-km; Samaniego et al., 2019). This study uses high resolution (4-km) observation-based (whenever available) spatially and temporally continuous hydroclimate data to develop statistical models for predicting summer meteorological and agricultural drought at a higher spatial resolution than previous U.S. drought forecasting analyses and existing products (AghaKouchak, 2014; Brust et al., 2021; Hao et al., 2016). Importantly, we quantify the predictability of drought at this higher spatial resolution, which is akin to that of current convection-permitting modeling efforts (Chen et al., 2020; Liu et al., 2017; Prein et al., 2015). In this study, we evaluate the ability of statistical models to provide summer drought forecasts across the WUS on a 4-km grid at 1-, 2-, and 3-month lead times. This analysis also assesses the relative importance of a suite of pre-summer drought predictors and provides insights relating the spatial heterogeneity of drought predictability to drought persistence characteristics. The results of this research are intended to inform development of drought forecasting systems that leverage high-resolution (≤4-km) hydrometeorological observations and simulations.

Study Domain
The WUS domain considered in this study comprises the portions of the five western regions of the National Integrated Drought Information System (NIDIS) drought early warning system (DEWS; https://www.drought.gov/dews), namely the Pacific Northwest, California-Nevada, Missouri River Basin, Intermountain West, and Southern Plains regions, that are north of 31°N latitude and west of 96°W longitude. This domain is spatially heterogeneous, including a spectrum of arid to wet regions, deserts to forests, large irrigated agricultural areas, and snow-dominated mountains. The domain's hydrologic cycle is temporally variable, with distinct wet and dry seasons, particularly in the California-Nevada region. This domain is prone to crippling drought events, including the contemporary WUS megadrought (Williams et al., 2020, 2022). Droughts in the WUS can take many different forms, including meteorological, agricultural, hydrological, socioeconomic and snow drought (Huning & AghaKouchak, 2020; Littell et al., 2016; Livneh & Hoerling, 2016; Luo & Wood, 2007; Ryu et al., 2010; Westerling et al., 2006; Wlostowski et al., 2022). In this study we focus on meteorological and agricultural drought given their established relationships with wildfire, water security and food security in this domain, as well as the persistence of drought indices related to these categories (Abatzoglou & Kolden, 2013; Lakshmi et al., 2004; Littell et al., 2009; Lobell et al., 2013, 2014; Scanlon et al., 2012).

Seasonal Drought Forecasting Framework
The proposed drought forecasting framework (Figure 1) predicts seasonally averaged meteorological and agricultural drought indices (PDSI and soil moisture percentile, respectively) during summer months (June-August) at 1-, 2-, and 3-month lead times. PDSI and soil moisture percentiles are widely used drought indices in research applications and by the operational USDM because of their interpretable representation of land surface and meteorological drought conditions as spatiotemporally continuous fields (Abatzoglou & Kolden, 2013; Dai, 2011; T. W. Ford & Quiring, 2019; Kumar et al., 2014; Luo & Wood, 2007; Palmer, 1965; Shukla et al., 2011). PDSI is a commonly used meteorological drought index. It implicitly accounts for soil moisture conditions, which makes it a less explicit representation of ongoing meteorological conditions relative to other indices (e.g., SPI; T. Ford & Labosier, 2014) but results in relatively longer persistence, making it more predictable by statistical models (Dai, 2011; Lakshmi et al., 2004; Szép et al., 2005). Figure S1 in Supporting Information S1 shows moderate correlations between summer PDSI and soil moisture percentiles across the western U.S. (mean r = 0.63; interquartile range (IQR) = 0.57-0.74), indicating these indices are related but mostly independent.
The forecasting framework used herein adopts generalized additive models (GAMs; see Section 2.3 for details) using climate predictors (Table 2) averaged across winter and spring months of the same water year. The 1-month lead time forecasts use predictors averaged from November to April so predictions can be made by the end of April. The 2-month lead time forecasts use predictors averaged from November to March so predictions can be made by the end of March. The 3-month lead time forecasts use predictors averaged from November to February so predictions can be made by the end of February. Preliminary tests were conducted to compare GAMs using predictors averaged beginning in November versus predictors averaged over the two most recent months available for the forecast (e.g., 1-month lead time forecasts using predictors averaged over March-April). This preliminary analysis concluded that predictions tend to be more accurate in the former case, using predictors starting in November. One reason for this is that accumulated drought conditions from winter through spring, rather than just spring, tend to provide a stronger land surface moisture memory that modulates summer drought conditions. This is supported by PDSI and soil moisture percentiles averaged over November-April tending to have higher correlations with summer (JJA) drought indices, relative to JJA comparisons with MA (March-April), FMA (February-April) and NDJ (November-January) averaged drought indices (Figure S2 in Supporting Information S1).
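The averaging windows described above can be written down compactly. The following is a minimal sketch in Python (the study itself uses R); the names `PREDICTOR_MONTHS` and `predictor_window` are hypothetical and introduced only for illustration.

```python
# Pre-summer averaging windows by forecast lead time, as described in the text.
# Month numbers: 11 = November, 12 = December, 1 = January, ..., 4 = April.
PREDICTOR_MONTHS = {
    1: [11, 12, 1, 2, 3, 4],  # November-April, forecast issued by end of April
    2: [11, 12, 1, 2, 3],     # November-March, forecast issued by end of March
    3: [11, 12, 1, 2],        # November-February, forecast issued by end of February
}
TARGET_MONTHS = [6, 7, 8]     # June-August (JJA) summer target


def predictor_window(summer_year, lead):
    """Return the (year, month) pairs averaged to forecast JJA of `summer_year`.

    November and December fall in the preceding calendar year.
    """
    months = PREDICTOR_MONTHS[lead]
    return [(summer_year - 1 if m >= 11 else summer_year, m) for m in months]
```

For example, `predictor_window(2012, 3)` returns the four months from November 2011 through February 2012.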

Generalized Additive Models (GAMs)
All models in this study are trained or evaluated over a 39-year period (1982-2020) when climate and drought data are available (Table 2). We employ the widely used generalized additive models (GAMs) (Hastie & Tibshirani, 1986) using a Gaussian distribution to predict meteorological and agricultural summer drought with pre-summer climate and drought conditions on 4-km grids across the WUS. GAMs are trained in the statistical software R (R Core Team, 2020) using the mgcv package (Wood, 2011). GAMs are regression models similar to ordinary linear regression models, but they replace the linear form $\sum_i \beta_i x_i$ with a sum of nonlinear smooth functions $\sum_i s(x_i)$. Initial testing was performed using a more complex machine learning algorithm (Artificial Neural Networks; ANNs); however, ANNs were found to be too computationally expensive to comprehensively train across the domain (comprising 257,216 4-km pixels), which makes them unsuitable for potential operational applications. Thus, this study uses GAMs for prediction and analyses due to their success in previous drought and fire prediction research (Abolafia-Rosenzweig et al., 2022a, 2022b; Davenport et al., 2015) and their high computational efficiency, which is essential for operational systems.

Figure 1. Climate predictors (Table 2) are temporally averaged during pre-summer months from November through April, March and February for 1-, 2-, and 3-month lead time forecasts, respectively. Statistical models predict summer drought indices (averaged from June through August). The Palmer Drought Severity Index is used to quantify the severity of meteorological drought and soil moisture percentiles are used to quantify the severity of agricultural drought.
Climate predictors are pre-treated with a principal component analysis (PCA) to remove linear correlations among each predictor set (Dorman et al., 2013). For GAMs using fewer than five predictors, all PCs are used as predictors. For GAMs using five or more predictors, the first four PCs are used as predictors, which explain an average of 93% of the variance of all considered PCs. We limit the number of predictors to four to reduce model overfitting while allowing the large majority of climate predictor information to be explained by PC model inputs. GAMs used in this study can be written as:

$$g\left(E(\text{drought index})\right) = s(PC_1) + s(PC_2) + \dots + s(PC_n) \tag{1}$$

where g(·) is the link function for the Gaussian family, E(drought index) is the expected value of a drought index given the set of predictors, s(·) is a nonlinear smooth function of climate predictors, $PC_i$ is the ith principal component for a set of climate predictors, and n is the number of climate predictors used in each GAM. Hence in this study, n never exceeds four. Drought indices predicted in this analysis are soil moisture percentiles (for agricultural drought) and PDSI (for meteorological drought). We generate an ensemble of GAMs by using unique combinations of predictors to ensure that results are insensitive to model-specific characteristics. Additionally, the ensemble of GAMs allows for probabilistic drought forecasting (AghaKouchak, 2014). The 20 best-performing ensemble members are chosen on the basis of a minimized Generalized Cross Validation (GCV) score, evaluated across all unique combinations of the pre-summer predictors described in Table 2 (a total of 511 or 1,023 candidate members tested for non-snowy and snowy pixels, respectively). GCV is used for smoothness selection in the mgcv package, where smoothing parameters are chosen to minimize prediction error.
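The PCA pre-treatment and the size of the candidate ensemble can be illustrated with a short Python sketch. It is not the authors' R implementation. Standardizing the predictors before the PCA is an assumption, and the function names are hypothetical. The counts 511 and 1,023 match the number of non-empty subsets of 9 and 10 predictors, respectively; that the candidate sets contain 9 and 10 predictors is inferred from those counts rather than stated in the text.

```python
from itertools import combinations

import numpy as np


def count_predictor_subsets(n_predictors):
    """Count all non-empty combinations of `n_predictors` candidate predictors."""
    return sum(
        1
        for k in range(1, n_predictors + 1)
        for _ in combinations(range(n_predictors), k)
    )


def principal_components(X, max_pcs=4):
    """Return PC scores for predictors X with shape (n_years, n_predictors).

    Assumption: predictors are standardized before the PCA.
    Fewer than five predictors -> keep all PCs; otherwise keep the first four.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    n_keep = X.shape[1] if X.shape[1] < 5 else max_pcs
    scores = U[:, :n_keep] * s[:n_keep]            # mutually uncorrelated columns
    explained = (s ** 2 / np.sum(s ** 2))[:n_keep]  # variance fraction per kept PC
    return scores, explained
```

The retained score columns would then serve as the inputs $PC_i$ to the smooth terms of each GAM.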
Ensembles chosen on the basis of GCV slightly outperform ensembles chosen on the basis of the Akaike information criterion (AIC) (Table S1 in Supporting Information S1), so we exclusively consider ensembles chosen on the basis of GCV in this study. Each pixel is modeled independently of surrounding pixels; thus, ensembles are trained and evaluated at each 4-km pixel. GAM simulations of soil moisture percentiles are post-processed to maintain the soil moisture percentile space, constrained between 0 and 1, using Equation 2, where SM rank is the original 39-value time series expressed as ranks.
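A rank-based post-processing step of this kind could look like the Python sketch below. The exact form of Equation 2 is not shown in this text, so the mapping used here, (rank − 1)/(N − 1), is an assumption chosen only because it spans 0 to 1; the function name is hypothetical and ties receive arbitrary consecutive ranks.

```python
import numpy as np


def rank_to_percentile(predicted):
    """Map a predicted time series onto [0, 1] through its ranks.

    Assumption: percentile = (rank - 1) / (N - 1), with rank 1 for the
    smallest value. This is an illustrative stand-in for Equation 2.
    """
    predicted = np.asarray(predicted, dtype=float)
    n = predicted.size
    order = np.argsort(predicted, kind="stable")
    ranks = np.empty(n, dtype=float)
    ranks[order] = np.arange(1, n + 1)  # rank 1 = driest (smallest) value
    return (ranks - 1.0) / (n - 1.0)
```

Applied to a 39-value series, the output is evenly spaced between 0 and 1 regardless of the raw GAM output range.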

Model Evaluation
Because GAMs are prone to overfitting, we exclusively evaluate model performance using out-of-bag comparisons (i.e., leave-one-out cross validations and retroactive forecasting) as done in previous evaluations of drought predictability (Bachmair et al., 2017; Davenport et al., 2015). In leave-one-year-out cross validations, models are trained with data from all years except the target predicted year. Predictions are made for each year in the study period following this procedure, resulting in a complete out-of-bag predicted summer drought time series. Retroactive forecasting mimics an operational forecasting system where models are trained with records from years prior to a target year. Retroactive forecasts in this analysis were made at 1-month lead times from 2010 to 2020 to support the leave-one-year-out cross validation. We assess the ability of GAMs to predict summer drought using three evaluation metrics: drought prediction accuracy (A), probability of detection (POD) and Pearson's correlation coefficient (r), as done in previous drought-related studies (Bhardwaj & Mishra, 2021; Fankhauser et al., 2022; T. W. Ford & Quiring, 2019; Zhu et al., 2016). Evaluation metrics compare GAM ensemble means with monitored drought. A represents the fraction of total predictions that are correct, ranging from 0 to 1, with 1 being a perfect score:

$$A = \frac{\text{hits} + \text{correct negatives}}{\text{total}} \tag{3}$$

POD quantifies the fraction of monitored droughts (Section 3) that are correctly predicted as drought by GAMs, ranging from 0 to 1, with 1 being a perfect score:

$$\mathrm{POD} = \frac{\text{hits}}{\text{hits} + \text{misses}} \tag{4}$$

Hits represent summers when drought (D1-D4) was monitored and correctly predicted by GAMs. Correct negatives represent summers when abnormally dry (D0) to wet conditions are monitored and correctly predicted. Total is the total number of summers evaluated (1982 through 2020 = 39 summers). Misses represent summers when drought is monitored but GAMs predict abnormally dry to wet conditions.
Pearson's correlation coefficient (r) (Pearson & Henrici, 1896), ranging from −1 to 1, is widely used to quantify linear relationships between two variables, and thus serves as a well-known benchmark for performance:

$$r = \frac{\sum_{t}\left(M_t - \bar{M}\right)\left(P_t - \bar{P}\right)}{\sqrt{\sum_{t}\left(M_t - \bar{M}\right)^2}\,\sqrt{\sum_{t}\left(P_t - \bar{P}\right)^2}} \tag{5}$$

where $M_t$ and $P_t$ represent monitored and GAM predicted drought indices at timestep t, respectively, and $\bar{M}$ and $\bar{P}$ represent the time series means of monitored and GAM predicted drought indices, respectively. For calculation of the A and POD metrics, droughts are considered as a binary event that occurs when drought indices fall into the USDM moderate to exceptional drought classification range (D1-D4) (Svoboda et al., 2002; Table 1). Statistical significance of the A and POD metrics is computed from nonparametric bootstrap hypothesis tests, based on comparisons of random samples with monitored drought conditions at 10,000 randomly selected pixels.
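As an illustration of how the three metrics are computed for one pixel, the following Python sketch classifies each summer as drought or no drought and then evaluates A, POD and r. The D1 thresholds (PDSI of −2.0 or lower; soil moisture at or below the 20th percentile) are taken from the published USDM classification scheme and should be checked against Table 1; the function and constant names are hypothetical.

```python
import numpy as np

# Assumed D1 (moderate drought) upper bounds from the USDM scheme.
D1_PDSI = -2.0
D1_SM_PERCENTILE = 0.20  # percentiles expressed on a 0-1 scale


def drought_metrics(monitored, predicted, threshold):
    """Return (A, POD, r) for one pixel's monitored and predicted time series."""
    monitored = np.asarray(monitored, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mon_drought = monitored <= threshold
    pred_drought = predicted <= threshold

    hits = np.sum(mon_drought & pred_drought)
    misses = np.sum(mon_drought & ~pred_drought)
    correct_negatives = np.sum(~mon_drought & ~pred_drought)
    total = monitored.size

    accuracy = (hits + correct_negatives) / total
    # POD is undefined when no drought was monitored at this pixel.
    pod = hits / (hits + misses) if (hits + misses) > 0 else np.nan
    r = np.corrcoef(monitored, predicted)[0, 1]
    return accuracy, pod, r
```

In the study these metrics compare the GAM ensemble mean with monitored drought over the 39 evaluated summers.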
Table 1. USDM drought classification (Svoboda et al., 2002). Columns: Category; Description; Palmer Drought Severity Index (PDSI); Soil moisture model (percentiles).

Probabilistic drought forecasts can improve drought early warning systems. In this study, the forecasted drought probability is computed as the fraction of ensemble members that predict a drought event based on a specified threshold. We evaluate the ability of our ensemble modeling framework to provide useful probabilistic drought forecasts through comparisons of probabilistic drought maps from GAMs with monitored agricultural and meteorological drought maps for several significant summer drought events in 2007, 2011, 2012, and 2014. These evaluations consider drought thresholds of D1 and D3.
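The probability calculation described above amounts to counting ensemble members at or below a drought threshold. A minimal Python sketch follows; the function name and array layout are assumptions made for illustration.

```python
import numpy as np


def drought_probability(ensemble, threshold):
    """Fraction of ensemble members predicting drought at each pixel.

    `ensemble` has shape (n_members, ...), holding each member's predicted
    drought index. A member predicts drought when its value is at or below
    `threshold`.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    return np.mean(ensemble <= threshold, axis=0)
```

With a 20-member ensemble, the resulting probabilities are multiples of 0.05.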

Predictor Importance
Leave-one-column-out (LOCO) analyses are used to quantify relative predictor importance at each pixel (Abolafia-Rosenzweig et al., 2022b; Kuhn-Régnier et al., 2021). LOCO importance is measured by repeatedly training GAMs, each time without one particular predictor, and computing the ratio of a skill metric (Taylor Skill Score (S); Taylor, 2001) for the re-trained model to that of the original model including all of its original predictors. Ratios less than unity indicate degraded performance from removing a predictor (i.e., reduced S), where S summarizes root mean square error (RMSE), ratio of variance and r as a single metric.
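The LOCO ratio can be sketched as follows in Python. Several parts are assumptions rather than the study's method: ordinary least squares stands in for the GAMs, skill is computed in-sample for brevity (the study evaluates out-of-bag), and S uses one common form of the Taylor (2001) skill score with a maximum attainable correlation of 1. All function names are hypothetical, and at least two predictors are required.

```python
import numpy as np


def taylor_skill(obs, pred, r0=1.0):
    """One common form of the Taylor (2001) skill score (assumed here)."""
    r = np.corrcoef(obs, pred)[0, 1]
    sigma = np.std(pred) / np.std(obs)  # ratio of standard deviations
    return 4.0 * (1.0 + r) / ((sigma + 1.0 / sigma) ** 2 * (1.0 + r0))


def fit_predict(X, y):
    """Least-squares fit with intercept; a stand-in for a trained GAM."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef


def loco_ratios(X, y):
    """Skill of each model retrained without one column, relative to the full model."""
    full_skill = taylor_skill(y, fit_predict(X, y))
    ratios = []
    for j in range(X.shape[1]):
        reduced = np.delete(X, j, axis=1)  # drop predictor j and retrain
        ratios.append(taylor_skill(y, fit_predict(reduced, y)) / full_skill)
    return np.array(ratios)
```

A ratio well below 1 for a given column marks that predictor as important; a ratio near 1 marks it as largely redundant.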

Data Used for Climate Predictors and Drought Indices
The climate predictors (Table 2) selected in this study are based on previous research that quantified and discussed relationships between these factors and drought in the U.S. (AghaKouchak, 2014; Brust et al., 2021; Hao et al., 2018; Lakshmi et al., 2004; McCabe et al., 2004; Otkin et al., 2018; Rajagopalan et al., 2000; Ryu et al., 2010). Data sources for these selected climate predictors and drought conditions come from long-term spatially and temporally continuous high-resolution data products, which are widely used and observation-based (whenever available) (Table 2). Specifically, precipitation, temperature and vapor pressure deficit (VPD) are obtained from the daily 4-km observation-based PRISM (Parameter-elevation Relationships on Independent Slopes Model) data set (Daly et al., 2008, 2015). PDSI is derived from PRISM precipitation and temperature, and snow water equivalent (SWE) is from the 4-km observation-based University of Arizona (UA) data set (Broxton et al., 2019; Zeng et al., 2018). Soil moisture, evapotranspiration (ET) and potential ET (PET) are from a new NCAR/USGS 4-km 40-year-plus (1980-2021) CONUS hydroclimate reanalysis data set (hereafter CONUS404; Rasmussen et al., 2023). The Atlantic Multidecadal Oscillation (AMO) index is published by NOAA's Physical Sciences Laboratory (Enfield et al., 2001), and the Pacific Decadal Oscillation (PDO) index is published by NCEI (Mantua, 1999).
PDSI and root-zone (top 1 m) soil moisture percentiles are used to quantify meteorological and agricultural drought conditions, respectively. We use PRISM precipitation and temperature data to compute monitored PDSI because PRISM is an observationally based data set which is widely used and has been formally validated (Abatzoglou, 2013; Buban et al., 2020; Currier et al., 2017; Daly et al., 2008, 2015). PDSI is calculated using the pdsi MATLAB package (King, 2022; https://github.com/JonKing93/pdsi/releases/tag/v1.0.0). Given the lack of long-term spatiotemporally continuous observational data products for soil moisture, evapotranspiration (ET) and potential ET (PET) at a high resolution (≤4-km) during the period of this study, we use the CONUS404 data product for these three variables. The CONUS404 data product is an extension of a widely used convection-permitting Weather Research & Forecasting (WRF) CONUS climate modeling product with improved model physics and configurations (C. He et al., 2021; Liu et al., 2017). Many studies have shown that the convection-permitting WRF climate modeling product adequately captures key observed meteorological fields (e.g., precipitation, temperature, snowpack, and land surface states) (C. He et al., 2019; Ikeda et al., 2021; Liu et al., 2017; Scaff et al., 2020). In this study, we further evaluate the CONUS404 soil moisture, PET and ET data against in situ observations and satellite retrievals (see below for details) before using them in GAM training and forecasting. Figure S3 in Supporting Information S1 and Figure 2 present comparisons of CONUS404 soil moisture percentiles with 509 in situ monitoring stations across the WUS and operationally used drought monitoring products, respectively.
CONUS404 drought predictability is benchmarked against widely used LSMs from Phase 2 of the North American Land Data Assimilation System (NLDAS-2; Xia, Mitchell, et al., 2012), which is also currently used for operationally monitoring U.S. drought. In situ soil moisture data from 1997 to 2019 and nearest-neighbor matched NLDAS-2 Noah soil moisture data are from T. W. Ford and Quiring (2019). CONUS404 has similar agreement with in situ observations relative to the NLDAS-2 Noah LSM, with a mean prediction accuracy (A) of 0.65-0.75 and a mean POD of 0.8-0.9 (Figure S3 in Supporting Information S1). Furthermore, CONUS404 monitored agricultural drought conditions during four major summer drought events in the study record (2007, 2011, 2012, and 2014) closely match operationally monitored NLDAS-2 ensemble mean drought conditions (Figure 2). Similarly, PRISM-derived PDSI compares favorably to operationally monitored PDSI conditions from CPC and NCEI archives (https://www.cpc.ncep.noaa.gov/products/monitoring_and_data/drought.shtml; https://www.ncdc.noaa.gov/temp-and-precip/drought/historical-palmers/maps). The CONUS404 agricultural drought index and PRISM PDSI also compare favorably with the integrated drought maps and characteristics from USDM during these summer drought events (Figure 2). Thus, given the adequate accuracy of CONUS404 in simulating soil moisture percentiles and agricultural drought and its high spatial resolution, we consider the CONUS404 soil moisture data a valuable agricultural drought monitoring product to use in this analysis. The wide operational use and previous evaluations of PRISM combined with Figure 2 further support the use of PRISM PDSI as a meteorological drought monitoring product.
We also evaluate CONUS404 ET and PET, which are exclusively used as predictors. CONUS404 ET is obtained directly from the CONUS404 outputs and PET is derived from CONUS404 outputs following the Penman-Monteith equation (Figure S4 in Supporting Information S1). Areas of low correlation between CONUS404 ET and MOD16A2 (MODIS) ET correspond with arid areas characterized by heavily water-limited ET (Figure S4 in Supporting Information S1). Overall, we consider CONUS404 ET and PET reasonable estimates over most of the western CONUS that can provide valuable predictive information in GAMs.
As a pre-processing step to statistical modeling, all variables are re-mapped to the CONUS404 4-km grid using bilinear interpolation. Bilinear interpolation is adequate because all data have the same native spatial resolution (4-km), excluding the Atlantic Multidecadal Oscillation (AMO) and the Pacific Decadal Oscillation (PDO) indices, which are not spatially distributed and thus are considered uniform across the domain. Predictors and drought conditions are first averaged to a monthly time scale, then predictors are temporally averaged over pre-summer months and monitored drought indices are averaged over summer months (Figure 1). SWE is only used as a predictor for snowy pixels that may be affected by snow processes: pixels that recorded fewer than 10 years of 0 mm pre-summer SWE and at least 1 year of at least 0.1 mm of mean pre-summer SWE. This results in 83% of pixels considering SWE as a predictor and 17% of pixels not considering SWE as a predictor in the study domain.
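The snowy-pixel rule stated above can be expressed as a boolean mask. The Python sketch below assumes an array of mean pre-summer SWE (in mm) with one row per year and one column per pixel; the function name is hypothetical.

```python
import numpy as np


def snowy_pixel_mask(swe):
    """Flag pixels where SWE is used as a predictor.

    `swe` has shape (n_years, n_pixels) and holds mean pre-summer SWE in mm.
    A pixel is snowy when it has fewer than 10 years of 0 mm SWE and at
    least 1 year with mean pre-summer SWE of at least 0.1 mm.
    """
    swe = np.asarray(swe, dtype=float)
    zero_years = np.sum(swe == 0.0, axis=0)
    has_snow_year = np.any(swe >= 0.1, axis=0)
    return (zero_years < 10) & has_snow_year
```

Pixels flagged `True` would draw on the larger candidate predictor set that includes SWE.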

Evaluating GAM 1-Month Lead Time Predictions
GAMs provide skillful 1-month lead time predictions of meteorological and agricultural summer drought across most of the WUS at a 4-km resolution while maintaining accurate representations of the total WUS area affected by various drought severities and long-term trends (Figures 3-5). In leave-one-year-out cross validations, A, POD and r for meteorological drought predictions are statistically significant (p ≤ 0.05) across 93%, 85%, and 100% of the domain with median values of 0.90, 0.67, and 0.86, respectively (Figures 3a, 3c, and 3e). For agricultural drought predictions, A, POD and r are statistically significant (p ≤ 0.05) across 77%, 94%, and 100% of the domain with median values of 0.85, 0.63, and 0.78, respectively (Figures 3b, 3d, and 3f). Retroactive forecasting also provides statistically significant skill of drought predictions across the large majority of the western U.S. (95% and 89% of the domain for meteorological and agricultural drought, respectively), where median correlations for predicted PDSI and soil moisture percentiles are 0.84 and 0.77, respectively (Figure S5 in Supporting Information S1). Median and interquartile ranges (IQR) of evaluation metrics for drought predictions are summarized in Table 3. The 39-year linear trends of predicted drought index time series closely match monitored trends with high correlations (r = 0.94-0.97) and acceptable biases (−18% to 3%) across all pixels in the WUS (Figure 4). Spatially averaged monitored and predicted drought indices each show negative trends for PDSI (−0.0214/year and −0.0207/year, respectively) and soil moisture percentiles (−0.0028/year and −0.0024/year, respectively), indicating that GAMs accurately captured the direction of broadscale drought intensification from 1982 to 2020.
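A per-year linear trend of the kind reported above can be estimated with an ordinary least-squares fit. The Python sketch below is illustrative only: the text specifies "linear trends" but not the estimator, so least squares is an assumption, and the function name is hypothetical.

```python
import numpy as np


def linear_trend_per_year(values, years):
    """Least-squares slope of a drought index time series, in index units per year."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    # Centering the years keeps the fit well conditioned without changing the slope.
    slope, _intercept = np.polyfit(years - years.mean(), values, 1)
    return slope
```

Applying the function to the monitored and predicted series at each pixel gives the two trend maps whose agreement is summarized by the correlations above.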
Predicted drought area (D1 or more severe) is highly correlated with monitored agricultural and meteorological drought area (r = 0.90-0.98; p ≤ 0.01), with low to moderate biases (−23% to 0%), in both leave-one-year-out cross validations (Figure 5) and retroactive forecasting analyses (Figure S6 in Supporting Information S1). The leave-one-year-out correlation is degraded to 0.82 for meteorological drought area when the threshold is increased to extreme drought (D3), and further degraded to 0.68 when only considering exceptional drought (D4). The correlation between predicted and monitored agricultural drought area is mostly insensitive to drought severity (ranging from 0.89 to 0.92). One reason for this insensitivity for agricultural drought area, compared to the relatively higher sensitivity for meteorological drought area, is that soil moisture percentile predictions are post-processed to maintain the known percentile space (Equation 2), whereas predicted PDSI is not post-processed because PDSI has temporally and spatially variable ranges. Hence, uncertainties in predicted drought area can be ameliorated by understanding and accounting for the observed drought index space. As shown in Text S1 and Figure S13 of the Supporting Information S1, converting monitored and predicted PDSI to the percentile space reduces GAM-predicted PDSI errors, particularly during abnormally wet or dry (i.e., drought) times and places.
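Drought area time series such as those compared above can be derived by counting pixels at or below a severity threshold in each summer. The Python sketch below assumes equal-area pixels on the 4-km grid and a simple percent-bias definition based on mean drought area; both are assumptions, as are the function names.

```python
import numpy as np


def drought_area_fraction(index, threshold):
    """Fraction of pixels in drought for each year.

    `index` has shape (n_years, n_pixels); a pixel is in drought when its
    index is at or below `threshold`. Pixels are assumed to have equal area.
    """
    index = np.asarray(index, dtype=float)
    return np.mean(index <= threshold, axis=1)


def percent_bias(predicted_area, monitored_area):
    """Percent difference in mean drought area (an assumed bias definition)."""
    predicted_area = np.asarray(predicted_area, dtype=float)
    monitored_area = np.asarray(monitored_area, dtype=float)
    return 100.0 * (predicted_area.mean() - monitored_area.mean()) / monitored_area.mean()
```

The correlation between the predicted and monitored area series can then be computed with `np.corrcoef`.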
Predicted drought has similar spatial distributions relative to monitored meteorological and agricultural drought during noteworthy drought events in both leave-one-year-out (Figure 6) and retroactive forecasting (Figure S7 in Supporting Information S1) evaluations. Hence, accurate drought area predictions from GAMs shown in Figure 5 and Figure S6 in Supporting Information S1 correspond with accurate spatial distributions during noteworthy drought events. The higher accuracy in the leave-one-year-out 2011 summer drought prediction (Figure 6), relative to the retroactive forecast (Figure S7 in Supporting Information S1), emphasizes that using a relatively longer statistical model training period can yield more skillful drought predictions, and thus emphasizes the importance of maintaining historical drought records. During the same four drought events shown in Figure 6 (2007, 2011, 2012, and 2014 summers), Figure 7 compares probabilistic drought predictions (using a D1 threshold; see Section 2.2) with monitored drought, revealing that GAM ensembles predict relatively high probabilities of drought over areas and times where drought is monitored. Specifically, the median and IQR of predicted meteorological drought probability for pixels where drought was monitored during these events are 0.80 and 0.55-0.95, respectively, whereas the median and IQR for pixels with no meteorological drought monitored are 0.0 and 0.0-0.05, respectively. Similarly, the median and IQR of predicted agricultural drought probability for pixels where drought was monitored during these events are significantly higher (0.35 and 0.15-0.60, respectively; p < 0.01) than for pixels where no drought is monitored (0.0 and 0.0-0.05). This result is similar when drought is considered as D3 or more severe (Figure S8 in Supporting Information S1). Thus, GAMs are capable of accurately forecasting spatial distributions of major drought events (including exceptional drought) at a 1-month lead time.
There is substantial spatial heterogeneity in high resolution drought prediction skill (Figure 3). Summer drought predictions tend to be relatively inaccurate along the northern west coast from Northern California through Washington, the Sierra Nevada, Northern Idaho, Northern Texas and Oklahoma. Some regions where summer drought is particularly predictable are Southern California and the Central Valley, agricultural plains of Southern Idaho along the Snake River, Nebraskan agricultural plains along the Platte River, and North Dakota and South Dakota along the Missouri River. One factor that affects drought predictability is the persistence of drought, where drought predictability is expected to decrease in areas with relatively weak drought persistence (Hao et al., 2018). We quantify drought persistence as the correlation between mean pre-summer and mean summer drought indices (Figures 8a and 8b). Drought persistence maps show similar features as GAM skill, particularly for r (Figures 3e and 3f), where regions with low drought persistence generally have low prediction skill. Indeed, when evaluation metrics are binned by drought persistence, there is a tendency for areas with greater drought persistence to correspond with better skill scores (Figures 8d-8f). For instance, there are high correlations between drought persistence (i.e., Figures 8a and 8b) and GAM leave-one-year-out r (i.e., Figures 3e and 3f), with correlations of 0.82 and 0.73 for PDSI and soil moisture percentiles, respectively (p < 0.01). Many factors modulate land surface moisture memory (i.e., drought persistence) (Koster & Suarez, 2001; Mahanama & Koster, 2003), with a noticeable influence from precipitation (Rahman et al., 2015) where drier areas tend to have greater persistence and more skill (Figure 8). However, there are many exceptions to this tendency, as indicated by only moderate anti-correlations between mean summer precipitation and drought persistence: −0.40 and −0.39 (p < 0.01) for soil moisture percentiles and PDSI, respectively. One such exception is that drought tends to be relatively persistent and predictable along surface water channels in relatively rainy regions (e.g., Nebraska) (Figures 3f and 8b), indicating that land surface characteristics also play an important role in modulating drought persistence and predictability. Overall, drought predictability tends to be higher where drought is more persistent, which has significant overlap with areas that have relatively dry summers.
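The persistence metric defined above (the correlation between mean pre-summer and mean summer drought indices) and its spatial relationship with skill can be sketched as follows. This is a minimal illustration assuming index arrays of shape (years, ny, nx); the function names are hypothetical and the study's actual implementation is not shown in the text.

```python
import numpy as np

def pixelwise_correlation(x, y):
    """Pearson correlation across years at each pixel.

    x, y: arrays of shape (years, ny, nx), e.g., mean pre-summer and
    mean summer values of a drought index.
    """
    xa = x - x.mean(axis=0)
    ya = y - y.mean(axis=0)
    cov = (xa * ya).sum(axis=0)
    denom = np.sqrt((xa ** 2).sum(axis=0) * (ya ** 2).sum(axis=0))
    return cov / denom  # NaN where either series has zero variance

def pattern_correlation(map_a, map_b):
    """Pearson correlation between two maps, using pixels finite in both."""
    ok = np.isfinite(map_a) & np.isfinite(map_b)
    return np.corrcoef(map_a[ok], map_b[ok])[0, 1]
```

Under these assumptions, `pixelwise_correlation(pre_summer, summer)` would give a persistence map, and `pattern_correlation(persistence, skill_r)` would give a single spatial correlation between persistence and a skill map, analogous to the 0.82 and 0.73 values reported above.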

Relative Importance of Predictors
Pre-summer PDSI is the most important predictor for summer PDSI, and pre-summer soil moisture percentile is the most important predictor for summer soil moisture percentile (Figure 9), as expected from the key role of drought persistence in controlling prediction skill discussed in Section 4.1. Thus, the GAMs used in this study heavily rely on the autocorrelation attribute of drought indices. Specifically, pre-summer PDSI is the most important predictor for summer PDSI over 97% of the WUS (Figures 9a and 9e), and pre-summer soil moisture percentile is the most important predictor for summer soil moisture percentile over 48% of the WUS (Figures 9c and 9g). Pre-summer PDSI is also an important predictor of summer soil moisture percentile, showing the greatest importance over 16% of the WUS. Given the dominant importance of the pre-summer drought indices, we also present the second most important predictors for GAMs over the domain (Figures 9b, 9d, 9f, and 9h). This reveals that meteorological drought predictability in the WUS also relies heavily on pre-summer VPD, temperature, soil moisture percentiles, PET, ET, and precipitation, with their importance varying by region. Each of these is the second most important predictor for summer PDSI over at least 11% of the WUS, whereas each of the remaining predictors is the second most important predictor over less than 6.5% of the WUS. Nine of the 10 predictors are the second most important for soil moisture percentile predictability over 8%-16% of the WUS, whereas SWE is the second most important predictor over only 3% of the WUS.
The importance of SWE as a predictor of agricultural drought in the Sierra Nevada is also supported by Figure S2 in Supporting Information S1 which reveals soil moisture percentiles averaged over early winter months (November-January) have higher correlations with summer (JJA) soil moisture percentiles, relative to soil moisture percentiles averaged over late-winter and spring months (February-April) in this area.
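The area percentages reported in this section (e.g., the share of the WUS where a given predictor ranks first or second) can be tabulated from per-pixel importance scores as in the sketch below. The importance metric used by the GAMs is not specified in this section, so the `importance` array, the convention that larger values mean greater importance, and the function name are all assumptions for illustration.

```python
import numpy as np

def top_predictor_area_share(importance, names, rank=1):
    """Percent of valid pixels where each predictor holds the given importance rank.

    importance: array of shape (predictors, ny, nx); larger means more
    important (an assumed convention). Pixels with any NaN are excluded.
    names: predictor names in the same order as the first axis.
    rank: 1 for the most important predictor, 2 for the second, and so on.
    """
    valid = np.isfinite(importance).all(axis=0)
    # Order predictors from most to least important at every valid pixel.
    order = np.argsort(-importance[:, valid], axis=0)
    winners = order[rank - 1]
    counts = np.bincount(winners, minlength=len(names))
    return {name: 100.0 * c / valid.sum() for name, c in zip(names, counts)}
```

Calling the function with `rank=2` would give the kind of "second most important predictor" shares discussed above, again assuming equal-area pixels.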

Lead Time Sensitivity: Evaluating 2- and 3-Month Lead Time Predictions
2- and 3-month lead time predictions of meteorological and agricultural summer drought are skillful across most of the WUS at a 4-km resolution while maintaining accurate representations of long-term trends and the total WUS area affected by various drought classifications (Figures 10 and 11; Figures S9 and S10 in Supporting Information S1). A, POD, and r from 1-, 2-, and 3-month lead time predictions are similar (Table 3). GAM-predicted drought area (D1 or more severe) has high correlations (0.84-0.89; p ≤ 0.01) and low to moderate biases (0%-26%) with monitored agricultural and meteorological drought area (Figure 12). This correlation is degraded to 0.75 for meteorological drought area when the threshold is increased to extreme drought (D3), and further degraded to 0.63-0.65 when only considering exceptional drought (D4). The correlation between predicted and monitored agricultural drought is mostly insensitive to drought classification (ranging from 0.83 to 0.86). The high correlation between 3-month lead time drought area predictions (using winter predictors) and monitored summer drought area indicates that the majority of the summer drought area's interannual variability can be explained by winter conditions alone. Predicted summer drought from 2- and 3-month lead time simulations shows similar spatial distributions relative to monitored meteorological and agricultural drought during the significant summer drought events of 2007, 2011, 2012, and 2014 (Figure 13). Thus, our GAM forecasting system can provide useful 2- and 3-month lead time predictions of summer drought at a high resolution.
The relative importance of predictors for models providing 2- and 3-month lead time predictions is similar to that for 1-month lead time predictions (Figures S11 and S12 in Supporting Information S1). For 2- and 3-month lead time predictions, pre-summer PDSI is the most important predictor for summer PDSI over 97% of the WUS (Figures S11e and S12e in Supporting Information S1), and pre-summer soil moisture percentile is the most important predictor for summer soil moisture percentile over 52%-54% of the WUS (Figures S11g and S12g in Supporting Information S1), because of the dominant role of drought persistence. Pre-summer PDSI is also an important predictor of summer soil moisture at 2- and 3-month lead times, showing the greatest importance over 21% and 22% of the WUS, respectively. Meteorological drought predictability in the WUS also relies heavily on pre-summer soil moisture percentiles, temperature, VPD, PET, ET, and precipitation (Figures S11b, S11f, S12b, and S12f in Supporting Information S1). Pre-summer PDSI is most frequently (17%-18%) the second most important predictor for soil moisture percentiles, with eight of the remaining predictors being the second most important over 8%-12% of the WUS (Figures S11d, S11h, S12d, and S12h in Supporting Information S1).

Discussion
Stakeholders in agriculture, energy, wildfire, and water resources sectors require accurate and detailed local seasonal drought forecasts so that proactive decisions can be based on the best available data (Abatzoglou et al., 2017). For instance, drought data inform crop water requirements and, in turn, surface water and groundwater demands and water allocations for irrigation organizations, which play a critical role in water distribution and groundwater management decision making. Long-range weather forecasts have been identified among the most commonly used data sources for long-range planning and management by irrigation organizations (Wallander et al., 2022). Wildland fire managers have an increasing need for drought data to be incorporated into fire-related forecasts to improve firefighter safety and response, public health safety, and long-term fuel treatment strategies. In response, improving S2S forecasting of drought and fire hazard is a priority of NOAA's Weather Program Office. There are multiple validated observational and model-based hydrometeorological data sets at higher resolutions (e.g., ≤4-km) than drought forecasting analyses in the WUS (Daly et al., 2008, 2015; Liu et al., 2017), motivating the research conducted in this analysis, which leverages data sets with a 4-km spatial resolution to predict drought at S2S scales while maintaining the native spatial resolution of these data.
This study finds that widely used meteorological and agricultural drought indices are predictable at a 4-km resolution at S2S time-scales across most of the WUS using a relatively simple statistical modeling methodology. However, drought predictability in this study is heavily degraded in areas where drought has low persistence. Future analyses may attempt to overcome this shortcoming by using spatially variable pre-summer temporal averaging to maximize the correlation between pre-summer and summer drought (i.e., Figure S2 in Supporting Information S1). Furthermore, future analyses may consider using other predictors that were not considered in this study, such as the Oceanic Niño Index, which tracks the ocean part of the El Niño-Southern Oscillation climate pattern that has important relationships with drought in the WUS (Mo & Schemm, 2008); more detailed snow indices (e.g., peak SWE, ablation rate, day of snow disappearance, snow drought, etc.) (M. He et al., 2016; Huning & AghaKouchak, 2020); S2S forecasts of North American Monsoon rainfall (Prein et al., 2022); and other information from long-range dynamical modeling forecasts using combined dynamical and statistical approaches (e.g., Yan et al., 2017). Future analyses that use statistical models to forecast drought should also explicitly account for the ranges of drought indices (as discussed in Section 4.1 and Text S1 in Supporting Information S1). Another limitation is that although the WUS drought predictions made in this study are at higher resolutions than previous research, there is still a great deal of spatial heterogeneity of drought at even higher resolutions (e.g., sub-kilometer scales) (Vergopolan et al., 2022). The methodology used in this analysis may be applicable at the sub-kilometer scale as more spatially and temporally continuous data sets become available at higher resolutions (e.g., Anderson et al., 2012; Burek et al., 2020; Vergopolan et al., 2021).

Conclusions
In this study we evaluated the ability of computationally efficient statistical models to forecast summer drought across the western U.S. from 1982 to 2020 at a much higher spatial resolution (4-km) than previous drought forecasting efforts. Summer drought predictors are pre-summer drought and climate conditions from long-term spatially and temporally continuous 4-km observation-based (whenever available) or reanalysis data sets. Our statistical forecasting models provide skillful predictions of meteorological and agricultural drought across the western U.S. at 1- to 3-month lead times, and maintain accurate spatial and temporal variability of summer drought area. Our drought predictions provide statistically significant (p ≤ 0.05) skill scores across 70%-100% of the domain, while explaining at least 71% of the monitored interannual variability of drought area for D1 or more severe events (USDM drought categorization). Our predictions provide similar drought spatial distributions relative to monitored drought conditions during four representative major summer drought events in 2007, 2011, 2012, and 2014. Drought predictability is spatially variable, with drought tending to be more predictable in drier areas where drought is more persistent.

Figure 13. Monitored meteorological and agricultural drought classifications from PRISM and CONUS404 in rows one and four, respectively. 2-month lead time predicted meteorological and agricultural drought classifications from GAMs in leave-one-year-out cross validations in rows two and five, respectively. 3-month lead time predicted meteorological and agricultural drought classifications from GAMs in leave-one-year-out cross validations in rows three and six, respectively.
As expected, pre-summer PDSI is the most important predictor for summer PDSI (representing meteorological drought condition), and pre-summer soil moisture percentile is the most important predictor for summer soil moisture percentile (representing agricultural drought condition), emphasizing that the statistical models used in this study rely heavily on the autocorrelation attribute of drought indices/conditions, consistent with previous studies. This work presents a new capability to predict seasonal drought conditions at a high spatial resolution across the western U.S. Future research should expand on this study by evaluating drought forecasting at higher temporal resolutions (e.g., monthly or weekly instead of seasonal scale) while maintaining a high spatial resolution. This study is intended to support the development of future operational drought early warning systems that provide high resolution drought forecasts.

Data Availability Statement
The data that support the findings of this study are openly available: https://data.mendeley.com/datasets/gw7c3yjhyp/2 (Abolafia-Rosenzweig et al., 2022c). The CONUS404 data are archived and accessible on the United States Geological Survey Black Pearl Tape system and on the NCAR supercomputer Campaign storage system.