Understanding spatial variability of forage production in California grasslands: delineating climate, topography and soil controls

Rangelands are a key global resource, providing a broad range of ecological services and economic benefits. California’s predominantly annual rangelands cover ∼12% of the state’s land area, and the forage production is highly heterogeneous, making balancing economic (grazing), conservation (habitat) and environmental (erosion/water quality) objectives a big challenge. Herein, we examined how climate and environmental factors regulate annual grassland forage production spatially across the state and among four ecoregions using machine learning models. We estimated annual forage production at 30 m resolution over a 14 year period (2004–2017) using satellite images and data fusion techniques. Our satellite-based estimation agreed well with independent field measurements, with a R 2 of 0.83 and RMSE of 682 kg ha−1. Forage production (14 year average) showed large spatial variability (2940 ± 934 kg ha-1 yr-1; CV = 35%) across the study area. The gradient boosted regression tree with 11 feature variables explained 67% of the variability in forage production across the state. Precipitation amount, especially in November (germination) and April (rapid growth), was found as the dominant driver for spatial variation in forage production, especially in drier ecoregions and during drier years. Seasonal distribution of precipitation and minimum air temperature showed a relatively stronger control on forage production in wetter regions and during wet years. Additionally, solar energy became more important in wetter ecoregions. Drought reduced forage production from the long-term mean, i.e. a 33% ± 19% decrease in production (2397 ± 926 kg ha-1yr-1; CV = 38%) resulting from a 29% ± 5% decrease in precipitation. The machine learning based spatial analysis using ‘big data’ provided insights on impacts of climate and environmental factors on forage production variation at various scales. This study demonstrates a cost-effective approach for rapid mapping and assessment of annual forage production with the potential for near real-time application.


Background
Rangelands comprise more than 30% of global land area and provide a broad range of ecosystem services-including food and forage production, soil and water resource protection, biodiversity and wildlife habitat (Parton et al 2012, Schohr 2014. California has more than 4-6 million hectares of annual rangelands, approximately 10-14% of the state's land area, ranging from the North Coast to southern interior. Nearly 70% of the forage production in provide. Most notably, severe drought led to ecological shifts that significantly impacted forage production (Macon et al 2016). Due to its diverse climate and topography, California's forage production varies by as much as fourfold, both interannually at a given location and within a given year at different locations (Larsen et al 2014. This spatial heterogeneity and temporal dynamics pose challenges for rangeland managers to balance economic (grazing), conservation (habitat) and environmental (erosion/water quality) objectives. Therefore, it is critical to quantify and understand the drivers of spatial/temporal variability in forage production across long-and short-term (e.g. extreme wet/dry) conditions.
Growing-season precipitation is generally recognized as the primary driver of annual grassland forage production in the Mediterranean climate of California (Le Houerou 1984). Several studies developed regression equations linking precipitation and/or air temperature with peak annual forage production (Murphy 1970, Pitt and Heady 1978, George et al 1988, 1989, Chaplin-Kramer and George 2013. While, these studies found positive relationships between peak production and precipitation and air temperature, the relationship was not always simple and displayed considerable site-specific variability. Additionally, an analysis of long-term, peak forage production across California documented abovemean production during years with low but welldistributed precipitation , indicating that temporal variability of precipitation within a given year is also an important factor. At a local scale, topography and soil characteristics affect forage production by regulating solar energy and moisture availability. For example, south versus north slopes have contrasting microclimates that affect the length of the growing season, especially during years with limited rainfall (Hufstader 1978. A recent microclimate-forage growth study on a California Central Coast grassland found that wetter topographic locations were more productive in a dry year (water limitation) while warmer topographic locations were more productive in a wet year (energy limitation) . Soil properties, such as texture, fertility and water holding capacity, also affect vegetation composition and growth characteristics , but are much less studied.
Although previous studies examined relationships between forage production and its potential drivers, they were limited in spatial and temporal scale relative to the extent of California rangelands. Satellite remote sensing offers a powerful and cost-effective approach for rangeland monitoring at large spatial and temporal scales (Reeves et al 2015, Jones et al 2018. Landsat and MODIS satellites, for example, have collected land observations for resource management at 30-500 m resolution for several decades. Empirical relationships between the normalized difference vegetation index (NDVI) and green plant photosynthetic activity during the active growing season have been developed and applied to estimate grassland productivity in Europe (Boschetti et al 2007), Asia (Xu et al 2008), and North America (Gaffney et al 2018). A vegetation index-based approach was also adopted in a global study to assess the effect of climate change on global grassland productivity from 1982 to 2011 (Gao et al 2016). More recently, small unmanned aerial systems (sUASs) have been used to estimate finer resolutions of forage production at a catchment scale . Despite their advantages, satellite remote sensing images are limited by frequent cloud cover during rainy seasons when grasslands are actively growing. For example, considering the 8-16 day Landsat revisit cycle, an overcast day on the overpass date may lead to no useable images for as long as 32 days. However, data fusion techniques make it possible to obtain daily 30 m resolution observations, by combining Landsat with other more frequent but coarser resolution satellite observations such as MODIS (Gao et al 2006, Zhu et al 2010, Chen et al 2015. Previous studies reported that grassland forage production is highly dependent on water availability, which is often a function of a suite of climatic, topographic and edaphic factors (Fuhlendorf et al 2017). However, most studies of grassland productivity tend to focus on the impact of one set of environmental drivers, either climate, topography or soil properties (Willms 1988, Jin and Goulden 2014, Gao et al 2016, whereas a comprehensive analysis of how these drivers work together to shape production is still lacking. The lack of sufficient field measurements is often an impediment for using traditional statistical approaches. However, recent advances in machine learning methods provide a powerful tool for knowledge discovery from large datasets, by disentangling and visualizing complex relationships (Molnar 2019).
In this study, we mapped grassland forage production at 30 m resolution on California rangelands by fusing Landsat and MODIS satellite data from 2004 to 2017, and investigated how a suite of climate, topography and soil factors determined the spatial variability of the derived 14 year mean forage production using gradient boosted regression trees (GBRT) (Friedman 2001). Specifically, we sought to answer the following questions: (1) What are the main drivers shaping the spatial patterns of forage production of California rangelands? (2) How do these environmental factors affect forage production across the state and within each ecoregion? and (3) How do drought versus wet conditions alter these relationships? The results of this study will inform rangeland management and policy decisions and provide insights on productivity of California's diverse rangelands and how their response to future climate change may vary depending on spatial locations within the state.

Study area
This study focused on California rangelands, grasslands in particular, characterized by a Mediterranean climate with hot, dry summers and mild, moderately wet winters . Our effective study area is ∼24 000 km 2 , spanning from 33 • N to 41 • N and 118 • W to 124 • W. It represents a mean precipitation gradient of 160-2290 mm yr −1 and elevations from sea level to ∼1200 m (figures 1(a) and (b)). The area was divided into four ecoregions, based on ECOMAP (Cleland et al 2007): Northern California Coast Ranges (NCCR), Northern California Interior Coast Ranges (NCICR), Central Coast Ranges (CCR) and Sierra Nevada Foothills (SNF) (See table S1 for land cover percentages). Precipitation fluctuates widely from year-to-year across the study area with 397 ± 271 mm (mean ± std dev) in the driest year (2014) and 1158 ± 654 mm in the wettest year (2017) of the study period. Annual grasses and forbs typically germinate with onset of the rainy season in late fall (November) and reach peak biomass in late spring as the rainy season ceases, typically in April (Larsen et al 2014). Compositionally, forbs tend to be more prevalent in dry years while grasses are more dominant in wet years (Pitt and Heady 1978).

Data
To calibrate and validate satellite-based forage estimates, we measured monthly dry forage biomass at seven grazing exclosures (>60 × 60 m) during two contrasting growing seasons (November to May in 2017 and 2018) (figure 1(c)). Water year 2017 (>1000 mm yr −1 ) was a wet year while 2018 (<500 mm yr −1 ) was a relatively drier year. Our sites covered a large environmental gradient, from dry sites in the CCR to northern wetter sites. We clipped 15-36, 30 × 30 cm quadrats, within each exclosure, depending on exclosure size, and oven-dried the clippings at 60 • C for at least 48 h. Site level biomass measurements were calculated by averaging the quadrat measurements. Sites were mowed or grazed to remove excess dry residual matter at the end of each growing season. In total, we collected 61 site-month measurements across all sites and time (table S2).
To estimate daily and annual forage production at 30 m across the entire study area, we combined remote sensing observations of surface reflectance in the red and near infrared bands from Landsat and moderate resolution imaging spectroradiometer (MODIS) satellite instruments for the 2004-2017 period. We downloaded the surface reflectance and the associated quality assessment (PIXELQA) layers of Landsat Analysis Ready Data (ARD, series 5-8) from USGS earth explorer (https://earthexplorer.usgs.gov/). Landsat ARD dataset includes Landsat (4-8 series) observations at 30 m acquired from 1982 to present available every 8-16 days. The ARD datasets are geometrically and radiometrically consistent and have flagged non-target features (e.g. clouds and cloud shadows) and poor quality observations, to allow analysis with minimum user preprocessing (Dwyer et al 2018). At a higher temporal frequency, MODIS sensors have been collecting global multispectral images every day at moderate spatial resolutions since 2000. We obtained the MODIS collection 6 daily red and near infrared surface reflectance products at 250 m since 2003 from both Terra (MOD09GA) and Aqua (MYD09GA) instruments from NASA Earthdata (https:// earthdata.nasa.gov/). Each MODIS product included the corresponding quality assurance (QA) layers.
NDVI (Deering et al 1975) is a widely used proxy for photosynthetic activity and primary production (Carlson and Ripley 1997). We calculated NDVI from both Landsat and MODIS surface reflectance. Initial filtering removed low quality observations, i.e. those affected by clouds and cloud shadows, based on the corresponding QA layers. Invalid observations in MODIS time series were temporally gap filled using a cubic interpolation method.
We used Daymet climate data from the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) to assess drivers of spatial variation in forage production. The gridded daily surface weather variables at 1 km spatial resolution were interpolated from more than 8000 meteorological stations, based on a digital elevation model (Thornton et al 2016). We downloaded Daymet climate data during the same time period as the MODIS data (2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017). Compared to a longer historical time period , annual precipitation during 2004-2017 was 18 mm yr −1 lower and mean air temperature was 0.2 • C higher. We aggregated daily precipitation, min/max/mean air temperatures, and solar radiation to monthly averages. Seasonal averages that were potentially related to forage production were further derived as detailed in table 1. We calculated the precipitation concentration index (PCI) (Oliver 1980) to depict precipitation variation between the driest and wettest month during the growing season (November to May) (Sloat et al 2018) and growing degree days using a baseline temperature of 4 • C. We chose to use 4 • C as the baseline temperature because young annual plants in California grow very slowly at temperatures between 4.4 • C and 10.5 • C . For spatial analysis, we derived long-term means (LTMs) for 29 variables from water year (Oct-Sept) 2004-2017 (table 1).
Shuttle Radar Topography Mission (SRTM) 1 Arc-second (30 m) global elevation data were used for topographic characterization. We derived aspect, slope and water flow direction (flow accumulation  Table 1. List of variables and feature groups used as predictors for the forage production models. Seasonal variables include averages during growing season (GS, November-June), early-, mid-and late-seasons (November-January, January-March, March/April-June), winter refers to December to February, which was further split into early and late winter (November-January and January-March).

Variable names and Group
Feature type (unit) abbreviations Data source/Reference a Growing degree day variables were removed due to high correlation with mean air temperature. b Aspect was converted to a continuous variable by cosine(aspect) where −1 represents south and +1 represents north. and curvature) using ArcMap 10.6.1. We also calculated illumination condition (IC), a widely used measure for topography-modulated lighting conditions: −1 (less light) to +1 (more light) (Tan et al 2013). Grasslands mostly exist in well-drained areas below 1200 m elevation and occur on diverse soil types (Jackson and Bartolome 2007). We obtained the 800 m resolution aggregated and rasterized SSURGO soil property dataset from California Soil Resource Lab (https://soilmap2-1.lawr.ucdavis.edu/mike/ soilweb/soil-properties/download.php). Rasterized SSURGO datasets were derived based on thicknessweighted average values to aggregate horizons within soil profiles and spatially-weighted averages of map units within grids (Beaudette et al 2013, O'Geen et al 2017. Selected soil properties for the 0-25 cm layer (dominant rooting zone for grasses/forbs) that were expected to most strongly affect forage production were downloaded: clay, sand, silt, water holding capacity, bulk density and soil organic matter. The Daymet climate data and rasterized SSURGO data sets, originally at 1000 m and 800 m spatial resolution, were resampled to 30 m using a cubic resampling method to match satellite and topographic data grids.

Forage production estimation with satellite data fusion
We estimated forage production as a function of absorbed photosynthetically active radiation (APAR) accumulated during the growing season (cAPAR) (equation 1, figure 2).
PAR was estimated from daily shortwave incoming solar radiation at 2 km resolution (Hart et al 2009), assuming a constant ratio of 0.5 (Blackburn and Proctor 1983, Papaioannou et al 1993, Li et al 2010, Akitsu et al 2015. The CIMIS program generates all sky shortwave incoming solar radiation for California, based on GOES satellite observations (Hay 1993).
We followed an approach introduced by Sellers et al (1996) to calculate fPAR as a function of NDVI and simple ratio (SR) of NIR reflectance over red reflectance: (2) where the minimum and maximum fPAR (fPAR min = 0.001, fPAR max = 0.95) correspond to the lower and upper 2% of the NDVI histogram (NDVI min and NDVI max ). NDVI min and NDVI max were set to 0.015 and 0.760, respectively, based on values derived in Los et al (2000). We applied the same procedure for SR min and SR max generating values of 1.030 and 7.333, respectively. California rangelands grow rapidly during the rainy season, when clear sky satellite imagery is sometimes limited, especially for Landsat with a 16 day revisiting interval. We, therefore, implemented a spatial and temporal adaptive reflectance fusion model (STARFM) (Gao et al 2006 to fully exploit the complementary resolutions of Landsat and MODIS data. The STARFM approach combines the spatial resolution of Landsat (30 m) with the temporal resolution of MODIS (daily at 250 m). The approach uses one or more pairs of concurrent Landsat and MODIS images to predict Landsat resolution images on days when Landsat data were unavailable. We used a 48 day time window when searching for image pairs, prioritizing dates with clean pixels and dates that were closer to the date of prediction. This fusion approach generated continuous daily NDVI images at 30 m resolution (figures S1(a)-(c) available online at stacks.iop.org/ERL/16/014043/mmedia). We applied an enhanced Savitzky-Golay filter (Chen et al 2004) to the fused NDVI time series to remove abnormal values. The filtered time series followed similar temporal patterns with those from MODIS data while keeping the spatial details from Landsat observations (figure S1(d)). To mask trees and other perennial vegetation, we only kept pixels where NDVI was smaller than 0.3 in mid-August (figure 1(c)), because annual herbaceous plants are senescent in August.
Timing for the beginning and end of each growing season were estimated by fitting two sigmoidal curves stitching the fused daily NDVI time series (Zhang et al 2003) (See appendix). The green-up and senescence dates were defined as the dates with the largest increase/decrease of NDVI based on the curvature of the simulated curves (figure S2). We then estimated the annual cumulative APAR (cAPAR) as the sum of daily APAR over the identified growing seasons for each 30 m pixel and for each year during 2004-2017 (figure 2). We found a strong linear relationship between cAPAR and measured monthly dry biomass for all seven exclosures (R 2 = 0.83) (figure 3(a)). A linear regression was developed to relate forage production with cAPAR, using 70% of the data for training and the remaining 30% for testing. The slope showed an average light use efficiency (LUE) of 0.55 g MJ −1 APAR. The estimated forage production explained 83% of variation with a RMSE of 681 kg ha −1 , when validated using the testing dataset (figures 3(b) and (c)).

Figure 2.
Flowchart for forage production estimation with data fusion. We first generated daily NDVI images at 30 m resolution by fusing Landsat and MODIS data. The fused NDVI was then converted to fPAR. We estimated forage production as a function of absorbed photosynthetically active radiation (APAR) accumulated during the growing season (cAPAR). We estimated annual forage production from the calibrated LUE over each 30 m pixel for the 14 growing seasons in this study (2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017). LTM forage production was generated for the entire 14 year record, drought years (water years 2007-2009 and 2012-2015) (Asner et al, 2016, Williams et al 2015, Ghosh 2019 and extremely wet years (water years 2005, 2006 and 2017) with > 1000 mm yr −1 annual precipitation. We quantified the relative departure from the long-term production during the extreme dry and wet periods (departure normalized by LTM). It is important to note that California grasslands are predominantly a grazed landscape. Since our forage estimation is based on daily cumulative APAR throughout the active growing season, 'forage production' carries different meanings in grazed vs. ungrazed areas. In ungrazed areas, forage production is equivalent to the peak forage production, whereas in grazed areas, it is the sum of standing crop and livestock consumption.

Gradient boosted regression trees and feature selection
To understand the complex relationships between forage production and potential environmental drivers, we used the GBRT machine learning model (Friedman 2001) available at the 'scikit-learn' library in Python (Pedregosa et al 2011). GBRT fits regression trees sequentially, with each tree aimed at minimizing the prediction residuals of its predecessors. To reduce correlations between trees, it uses a random subsample of the training data for fitting each tree, and only considers a portion of the input features at a time for each node. The final prediction is the sum of all regression trees multiplied by a learning rate (lr).
We randomly sampled 10% of the pixels (n = 2 600 000, figure S3) in our study area, which were further randomly split (70 vs. 30%) for model training and testing. We tuned the model hyperparameter by grid search, and set n = 100, subsample = 0.6, max_depth = 6, max_features = 1.0, lr = 0.1 (see SI-section 1). We started with 41 independent variables and found a large degree of inherent autocorrelation ( figure S4). Therefore, we grouped these variables into 11 features based on correlations, i.e. variables with r > 0.7 were grouped (table 1). Hereafter, 'feature' refers to the 11 independent variable sets, and 'variable' refers to the 41individual climatic, topographic and soil metrics (see SI-section 2 for more information).
We built ensemble models, by allowing only one variable in each feature group to be included in a GBRT model each time to reduce autocorrelation between variables. We first built a baseline model using the first variable within each feature as listed in table 1, and then another 10 models were built by replacing the 'GS_ppt' variable with the other ten variables in feature group 1, one at a time. Finally, we looped through all feature groups that had more than one variable and built 30 additional models.

Determinants of spatial variation in forage production
We examined the relative contribution of each predictor to spatial patterns of forage production using feature importance, quantified by the mean decrease in the impurity by each input variable based on the GBRT models (Breiman et al 1984). Impurity represents how poorly the observations at a given node fit the model, measured by the residual sum of squares within that node in a regression tree. For reduced models, we averaged feature importance by feature group and calculated the standard deviation within each group.
Additionally, we used partial dependence plots (PDPs) to quantify how forage production varies spatially with each independent variable. Partial dependence marginalizes the machine learning model over the distribution of other independent variables, so that the remaining function shows the relationship between the targeting variable and dependent variable (Molnar 2019). The y-axis of PDPs represents the difference between the marginalized prediction and the mean prediction. For example, a wide range on the y-axis indicates a strong sensitivity of forage production to the target predictor, suggesting a strong controlling effect with confounding factors excluded. A partial dependence curve with positive slope suggests a positive relationship and vice versa.

Results
We first report the overall spatial patterns of forage production (3.1) and performance of the GBRT models in capturing the spatial variability (3.2). The determinants of spatial variation in LTM production are presented in section 3.3. In Section 3.4, we further examined drought vs. wet year production. Within each section, we report results for the whole study area first, followed by results for each ecoregion.

Spatial patterns in long-term forage production
Estimated mean annual forage production (2940 ± 934 kg ha −1 yr −1 ) from satellite observations over the 2004-2017 study period showed large spatial variability, with a CV of 35% across the study area ( figure 4(a)). The CCR had much lower production (2610 ± 890 kg ha −1 , table 2), where the driest areas of the ecoregion with precipitation less than 200 mm yr −1 (figure 1(b)) had a mean annual production of less than 2000 kg ha −1 (figure S5). In contrast, 11% of the study area had very high forage production (>4000 kg ha −1 yr −1 ), including the northern tip of CCR, central SNF, a large portion of NCCR and several valleys in NCICR. We found similar spatial patterns for production averaged over dry and wet years ( figure 4(b) and (c)). However, drought-year production (2397 ± 926 kg ha −1 yr −1 ; CV = 38%) was lower and had a higher spatial variability than wet-year production (3722 ± 1080 kg ha −1 yr −1 ; CV = 29%).

GBRT model performance
The GBRT model effectively captured the spatial variability of satellite-based LTM forage production at 30 m resolution across the entire study area. Predicted forage production was in good agreement with the estimated forage production in the randomly selected testing data (R 2 = 0.70 and RMSE = 507 kg ha −1 ), when using all 41 variables as input (table 2, full model). Reduced models with 11 predictors, one variable from each of the 11 feature groups, had slightly reduced performance with a RMSE = 535 ± 4 kg ha −1 , explaining 67% (±0.5%) of spatial variability (tables 2 and S3). The predicted forage production map (figure 5) showed very similar spatial patterns as those from the satellite-based estimates, even though the model was trained with only 10% of the pixels ( figure 4(a)). More than 84% of the area had an absolute percent error between modeled vs. satellite estimates lower than 20% (figure S6) and the mean absolute percent error was 11.8% N represents the number of pixels for the domain. b Standard deviations associated with the reduced study area models were calculated from the ensemble models.
R 2 and RMSE were based on comparison with the corresponding independent test datasets. Mean percent error is the average of percent errors over all pixels, where percent error is actual error divided by the remote sensing estimation. Relative difference (absolute percent-errors) were calculated as the mean absolute percent error over all pixels, i.e. dividing the absolute difference between the GBRT predicted and remote sensing estimated forage production by remote sensing estimation. averaged over the study area. Among ecoregions, the CCR ecoregion showed the highest prediction accuracy (table 2). Models built specifically for mean production during drought years performed better than the LTM models (R 2 = 0.74 ± 0.004 and RMSE = 425 ± 4 kg ha −1 ) (table 2). In contrast, wetyear production models displayed larger uncertainty in capturing the spatial variability (R 2 = 0.52 ± 0.009 and RMSE = 675 ± 6 kg ha −1 ). However, the mean and absolute percentage errors were similar for drought vs. wet years (table 2).

Determinants of spatial variation in forage production
Over the entire study area, total amount and monthly distribution of precipitation were the two most influential variables controlling the spatial variation in long-term forage production (figure 6). Together, these two features reduced more than 40% of impurity across all splits within each regression tree. PDPs showed that forage production increased rapidly with an increase in growing season precipitation before reaching a plateau at ∼450 mm yr −1 (figure 7(a)). PDP distributions were skewed toward negative values, indicating that low precipitation amount has a stronger negative impact on forage production than a corresponding positive impact from high precipitation. For example, in the more arid regions, forage production was enhanced by 1000 kg ha −1 when precipitation increased from 300 mm yr −1 to 450 mm yr −1 .
November precipitation, in particular, had the strongest impact on forage variability (figure S7), with a ∼ 1750 kg ha −1 difference in forage production when precipitation varied from the driest (15 mm) to the wettest regions (90 mm, figure S8(a)). In contrast, January precipitation had the weakest impact, with an 850 kg ha −1 difference in forage production across the driest (50 mm) to wettest regions (140 mm). Notably, April precipitation (figure S8(b)) showed the strongest per-unit precipitation influence on forage production. Every 1 mm increase in April precipitation resulted in an average 29 kg ha −1 increase in annual forage production.
Areas with more evenly distributed precipitation (i.e. lower PCI, concentrated in the SNF and NCICR, figure S9) had higher forage production than those with less evenly distributed precipitation (i.e. higher PCI, spreading across the southwestern CCR, figure  S9), given the same amount of precipitation ( figure 7(b)).The impact from precipitation distribution was more pronounced when PCI varied from 12 to 13.5 than at higher values. The difference in forage production was approximately 500 kg ha −1 between locations with the lowest and highest PCI values.
Minimum temperature was the third most influential feature in regulating the spatial variation of forage production (figure 6(a)). For example, a warmer minimum temperature during the growing season increased the production by 200 kg ha −1 beyond 4.3 • C (figure 7(c)). We observed slightly higher forage production in colder areas (<3 • C), mostly at higher elevations in the NCCR region where precipitation is high (figures 7(c) and S10(a)).
Elevation, mean air temperature and solar energy had intermediate impacts on forage production (∼10% importance). Forage production steadily decreased as elevation increased from 150 m to 900 m ( figure 7(d)). Compared to the mean for the entire study area, forage production at 900 m elevation was 150 kg ha −1 lower but 100 kg ha −1 higher at 150 m elevation. When everything else was held at their mean values, areas with cooler growing season mean air temperature were more productive than warmer areas (figure 7(e)). The relationship between solar radiation and forage production showed an asymmetrical bell shape (figure 7(f)), peaking at 250 W m −2 (figure S10(b)). However, the lowering effect was more pronounced at higher values of solar radiation.
The next most important feature was soil properties ( figure 6(a)). Higher plant available water holding capacity enhanced forage production (figure 7(g)). As the 0-25 cm available water holding capacity increased from 2.5-4.5 cm, forage production increased from 100 kg ha −1 below to 75 kg ha −1 above the overall mean forage production, when all other predictors were held at their mean values. Slope contributed another ∼5% of importance with steep slopes generally less productive (e.g. 300 kg ha −1 lower on a 25 • slope) than those on flat areas (figure 7(h)). Variables related to insolation affected forage production slightly. For example, south-facing areas were generally less productive than north-facing areas with an average difference of 140 kg ha −1 (figure 7(i)). Curvature and flow accumulation had very limited impact on forage production at the statewide level ( figure 6).
At the sub-regional scale, the ecoregion-specific models showed different dominant controls for spatial variation in LTM forage production (figure 8). Overall, precipitation amount played a more important and dominant role in drier ecoregions (CCR & SNF, table S4), whereas energy and temperature became more important in wetter ecoregions (NCCR & NCICR). In the wettest region (NCCR), for example, minimum air temperature (see PDP in figure S11) was the most important (15%), while all other non-topography related features plus elevation had a similar contribution (10%) (figure 8(c)).
Similarly, in the NCICR ecoregion (second wettest ecoregion), forage production variation was a mixed effect of climate, solar energy, elevation and soil properties, although precipitation amount became the most influential feature ( figure 8(d)).

Forage production under drought versus wet conditions
Drought reduced forage production relative to the LTM, but not by the same magnitude across the state. For example, production was reduced by 612 ± 315 kg ha −1 yr −1 (23% ± 12%) across the study area, when the precipitation decreased from the LTM by 140 ± 46 mm yr −1 (31% ± 5%) during drought years ( figure 9(a)). Larger departures occurred in less productive areas, such as a decrease by more than 40% on the west side of the Central Coast and ∼20% in lower foothills in dry years ( figure  9(a)). During wet years, similar patterns were found for increased production, but with a higher departure, i.e. 882 ± 521 kg ha/ −1 yr −1 (33% ± 20%) due to a MAP increase of 245 ± 89 mm yr −1 (56% ± 12%) ( figure 10(b)).
We also found that areas that experienced a smaller forage reduction during drought years were likely to experience a smaller production increase during wet years (figure S12(c)). At a given location, every 1% increase in forage production during wet years corresponded to a 0.55% production reduction during drought years, although the percent increase in precipitation was usually higher in wet years than the percent decrease in drought years ( figure S12(d)). The GBRT models, built separately for wet and dry years, showed different dominant factors controlling the spatial variation of forage productivity. Compared to drought years, the relative importance of precipitation amount decreased from 24%-19% during wet conditions, while the distribution of precipitation became equally important with precipitation amount (figures 6(b) and (c)). Soil properties became relatively more important during wet years, i.e. 9.5% as compared to 7.0% in dry years.
Elevation showed a negative impact on forage production under drought conditions, at a rate of −0.36 kg ha −1 m −1 ( figure 10(c)). Under wet conditions, the reduction in forage production became greater with increasing elevation: −0.33 kg ha −1 m −1 at 150-450 m to −0.67 kg ha −1 m −1 at 450-900 m ( figure 10(d)). Similar to the analysis for the entire time period, lower growing season mean temperature was associated with higher forage production, under dry or wet conditions; however, the influence of growing season mean temperature was stronger for wet years (−336 kg ha −1 • C −1 , figure 10(f)) than for drought years (−146 kg ha −1 yr −1• C −1 , figure 10(e)). The negative influence was most pronounced in cooler regions (<16 • C, mostly higher elevations, figure S13) during wet years (figure 10(f)).

Spatial controls on forage production
California grassland forage production showed a large degree of spatial variability associated with heterogeneity in environmental conditions. Across the state, climatic factors, especially the amount of precipitation received at the start and end of the growing season, exerted a stronger influence on spatial variability in forage production compared to topographyand soil-related factors. However, there was a strong interplay among factors, with the relative importance of each factor being dependent on local microclimate and climate conditions (figure 8).

Climate
Precipitation in California exhibits large interannual, latitudinal and topographic variation. Consistent with previous studies, our results showed that forage production was a strong function of the total amount and temporal distribution of annual precipitation (Duncan and Woodmansee 1975, Sala et al 1988, Bai et al 2008. Previous studies also suggested that precipitation before November 20th was an important driver of peak forage production at a site in NCCR (Murphy 1970), as well as demonstrating a significant correlation between peak forage production and November and April precipitation in SNF (Duncan and Woodmansee 1975).
Our results provide robust support for these previous site-level investigations based on 14 years of forage production at 30 m resolution across four ecoregions spanning 24 000 km 2 . At the statewide scale, our results indicated that November and April precipitation had strong positive impacts on peak forage production. Annual species in California typically germinate in October-November, depending on timing of the germinating rainfall event (Murphy 1970). Therefore, sufficient precipitation during this period promotes germination and early-season biomass accumulation as residual soil heat and moderate air temperatures allow slow growth. This early season growth increases root and leaf area development that allows for more efficient capture of solar radiation and soil water once temperature and moisture conditions become more favorable later in the growing season . Additionally, establishment of early season soil cover may enhance rainfall infiltration into the soil (versus surface runoff), thereby increasing soil water storage for subsequent plant use. In April, as temperatures and solar radiation become more favorable, plants can rapidly accumulate biomass if soil water is available. Therefore, precipitation amounts in November and April appear to contribute a disproportionate increase to annual forage production.
Ecoregion-specific modeling revealed precipitation distribution within the growing season as an important factor controlling peak forage production (figure 8). Mediterranean annual grasses senesce quickly when soil moisture depletes. Therefore, small but consistent precipitation events throughout the growing season are favorable to higher forage production (Duncan and Woodmansee 1975). Forage production models for wet-dry endmember conditions revealed that when water is limited during drought years, a more uniformly distributed precipitation regime favors higher forage production (figure 10(a)). In contrast, when soil water is ample during wet years, regions with more uniformly distributed precipitation (PCI < 10.8) produced somewhat less forage than regions with more variability in rainfall distribution ( figure 10(b)). This observation is likely associated with plants having the highest growth potential when favorable conditions for temperature, solar radiation and precipitation coincide later in the growing season. Therefore, when overall precipitation is higher, greater forage production is achieved when more precipitation occurs during the rapid growing stage, namely April.
Temperature affects plant growth through its influence on evapotranspiration and respiration/photosynthetic rates (Vermeire et al 2009). We found that given the same amount of precipitation, regions with lower daily mean air temperature (T mean ) during the growing season had higher forage production ( figure 7(e)). This may result in part from warming-caused higher water loss through evaporation, reducing the proportion of 'effective precipitation' available for photosynthesis. De Boeck et al (2006) reported decreased production under warmer situations when comparing grass growth in climate-controlled chambers with ambient and warmer (+3 • C) temperatures. However, the daily minimum (mostly nighttime) air temperature (T min ) showed an opposite effect from T mean (figure 7(c) and (e)). At nighttime, most plants pause photosynthetic activity and only conduct respiratory activities, consuming carbohydrates. Thus, warmer night temperature increases respiratory C losses (Peraudeau et al 2015) and photosynthetic capacity during the following day (Turnbull et al 2002). Our results suggest that when 4.3 • C < T min < 5.4 • C, photosynthetic capacity is more sensitive to changes in nighttime temperature than respiratory capacity, compared with areas with 2.5 • C < T min < 4.3 • C. Our results highlight the different relationships between forage production and air temperature metrics (T min vs. T mean ).

Topography
Landscape positions experiencing less environmental stress (e.g. soil water deficit) were found to be associated with higher forage production at a ranch with a complex topography . Our large-scale study here showed that topography contributed only moderately to spatial variability in forage production. Elevation (150-900 m) was the most influential topography-related factor, with more pronounced impact in the drier CCR and SNF ecoregions. Soil moisture is typically lower in higher elevations than lower elevation regions at 5-10 cm (Porazinska et al 2002) and at multiple depths (10-90 cm) (Bales et al 2018) due to less soil development at higher elevations. Higher elevation regions also have higher solar radiation. Together, lower soil moisture and higher solar radiation may invoke additional water stress on forage growth in high elevation regions, and eventually lead to reduced forage production.
Aspect and slope showed little impact on forage production, probably because their impacts on insolation are relatively muted prior to the spring solstice and only become more apparent in April, which may enhance the importance of April rainfall on forage growth. In addition to the role in regulating insolation, greater slopes are conducive to greater soil runoff, thereby decreasing water availability for plant use.

Soil properties
Few studies have considered the role of soil physical properties on California rangeland forage production. Strong impacts from soil pore size distribution and available water holding capacity have been reported for grasslands in the Great Plains (Reyes et al 2017). However, these grasslands are dominated by perennial species with deep root systems. In parts of California, site-level soil texture and available water holding capacity were reported to be not significantly related to forage production based on an experiment conducted in the NCCR ecoregion (Jones et al 1983). In our regional analysis, available water holding capacity showed a positive impact on forage production when other variables are held constant. However, spatial variation in soil properties showed a relatively weaker effect on the statewide variation in LTM forage production, compared to climatic factors, solar energy and elevation. On the other hand, we did find soil properties became more important during wetter years (figures 6(c) and 8(b)-(d)). Finer textured soils that have higher available water holding capacity are able to retain more rainfall for plant growth (Weng and Luo 2008). Our findings may be limited by the small variability of plant-available water holding capacity in the upper 25 cm of soils across the study area (e.g. interquartile range = 2.7-3.8 cm; CV = 27%). Further, the rooting distribution of annual grasses/forbs is concentrated in the upper 15 cm with few roots extending below a depth of 30 cm, thereby limiting the importance of soil water storage in controlling overall annual forage production relative to more deeply rooted perennials (Gordon andRice 1992, Holmes andRice 1996).
The weak relationship between soil properties and forage production is not equivalent to low importance of soil moisture. Soil moisture not only controls onset of the growing season (Bart et al 2017), but also is positively related to annual forage production . Additionally, soil water storage serves as a water buffer to support plant growth between rainfall events; therefore, periodic rainfall events that are well distributed throughout the season are important for sustaining soil water and forage growth over longer time periods.

Implications for land managers and policymakers
Sustainable management of forage resources is a critical goal for rangeland livestock producers (Kachergis et al 2013, Roche et al 2015) and a key factor in adaptive rangeland decision-making (Roche 2016). Stakeholders make decisions at different time scales, depending on their management goals (Brown et al 2017). For example, strategic planning for ranch sustainability may require LTM and wet-dry year variability information, while tactical planning, such as rapid drought response, depends on near real time information. Annual forage forecasts could help land managers make decisions about grazing and stocking rate strategies, such as taking proactive actions before dry or drought conditions fully emerge. Our results suggest spatial variability in forage production is more responsive to precipitation received in November and April. Consequently, November precipitation could be used as an indicator to estimate potential annual forage production using our machine learning model, assuming other factors are at multiyear mean levels. Noting that for ranches in mesic regions (e.g. NCCR), accuracy of this method could be lower than in arid regions because precipitation amount is not the only factor influencing forage production.
Annual forage prediction maps and identified key drivers from this study also inform policy decisionmaking. For example, the estimated forage production maps and modeling results at 30 m could inform drought-monitoring efforts in near realtime, providing a low cost, objective and largescale assessment of California rangeland forage conditions. The US Drought Monitor (USDM, https://droughtmonitor.unl.edu/About/WhatistheUS DM.aspx) is a multi-institutional effort that provides drought classification maps to assess and identify drought conditions each week across the country (Svoboda et al 2002). Because of their coarse spatial resolution, USDM maps have limited utility for drought mitigation at more localized scales (Brown et al 2008). The USDM approach is mostly based on weather data coupled with expert inputs from resource agencies (Svoboda et al 2002), and does not explicitly consider differences in plant response to drought. Our study demonstrated large variations in forage production departure during drought years, suggesting the importance of including perspectives of plant sensitivity to drought when mapping drought intensity. The vegetation drought response (VegDRI) tool provides a satellite-based indication of drought effects on vegetation health at 1-km resolution (J. F. Brown et al 2008). VegDRI maps are limited by a lack of extensive ground truthing (e.g. green biomass), which could be remedied using our estimated forage production maps.
Finally, this work can inform implementation of drought relief and disaster payment programs, such as the Livestock Forage Program (LFP) and Noninsured Crop Disaster Assistance Program (NAP) by USDA. The LFP is currently triggered by USDM ratings and duration of forage production loss at the county level. However, our results demonstrate that reductions in forage production vary significantly during drought within an individual county ( figure  9). Further, our calibrated cumulative APAR based LUE method could generate near real-time forage production maps at 30 m resolution. These maps provide finer spatial resolution information directly related to forage productivity. With this higher resolution information, it is possible to declare sub-county production-based drought, which would allow disaster relief programs to better distribute resources (George et al 2010). Our results also complement existing rangeland production monitoring tools, such as Grass-Cast (https://grasscast.unl.edu/), that are focused on the Great Plains and Southwest regions. Our quantified relationships between forage production and its key drivers may provide opportunities to expand such tools to larger geographic areas.

Implications for assessing climate change impacts
Temperature is projected to warm 1.7 • C-5.8 • C over the next century in California (Cayan et al 2012), with much of the increase resulting from rising minimum temperatures (mostly nighttime) (Cordero et al 2011). Our study indicated forage production responds differently to changes in minimum temperature (positive when T min > 4.3 • C) versus average temperature (negative) (figures 7(c) and (d)). Therefore, an increase in minimum temperature and average temperature may produce contrasting sitespecific effects on forage production. Previous studies forecasting climate change impacts on forage production did not account for different responses to rising minimum versus average temperatures (Zavaleta et al 2003). A California annual grassland field study found experimental warming reduced canopy greenness (Zavaleta et al 2003); however, these results may not reflect actual warming effects because the same intensity of warming was applied during both day and nighttime periods. A climate change impact modeling study for the San Francisco Bay area suggested that rangeland forage production may be enhanced by future temperature and precipitation conditions (Chaplin-Kramer and George 2013). Yet, this conclusion was based on a growing degree-day model to forecast future forage production and did not account for differential rising rates between maximum and minimum temperatures.
Precipitation varies greatly from year-to-year in California and is expected to become more variable and unpredictable in the future (Cayan et al 2012). Extremely wet and dry years are likely to appear more frequently in the future. Forage production showed different resilience towards extreme precipitation conditions across the study area. Mesic regions exhibited a smaller departure from the LTM than drier regions under drought or wet conditions (figures 10(a) and (b)). This spatially varying response to changes in precipitation suggest that forage production in more arid regions may get hit harder by climate change than more mesic regions. Our findings are in general agreement with previous climate change sensitivity analyses examining forage production under a Mediterranean climate (Huxman et al 2004, Golodets et al 2013. In the meantime, impacts from more frequent drought may be more severe than predicted because of the predicted rising temperature (Polley et al 2017). Some ranchers may be forced to switch to livestock that are more adapted to warm temperatures and drought, such as sheep or goats (Kay 1997).

Conclusion
We investigated relationships between LTM forage production and key factors driving forage production across four California ecoregions (∼24 000 km 2 ). We explored 41 potential variables derived from climate, topography and soil properties that were reduced to 11 feature groups to avoid collinearity. Together, these 11 features explained 67% of the variation in annual forage production across the study area. Precipitation was the dominant driver for forage production with high sensitivity to November (germination) and April (rapid growth period) precipitation amounts. The influence of precipitation amount was most pronounced during drought years and in drier ecoregions (CCR&SNF); in wet years and wetter ecoregions (NCCR&NCICR) other variables such as precipitation distribution and solar energy showed a relatively stronger control. Minimum air temperature showed an increased effect in wetter ecoregions, but not during wet years. Elevation showed the strongest impact among topography factors with higher elevations having lower forage production. Soil properties showed the least influence among predictor variables, which is possibly due to their small degree of variability across the study area. The quantified linkages and forage production maps provide important decision-making information for rangeland managers and state-wide drought-disaster relief programs. Our approach has the ability to forecast forage production in near real-time to provide rangeland managers with information to make proactive grazing decisions, especially with regard to droughtinduced risks. We anticipate that future research will build upon this approach as new remote sensing platforms become available to monitor abiotic (e.g. soil moisture) and biotic controls of forage production in near real-time. This work also provides a foundation for predicting rangeland forage response to future climate change scenarios at regional scales. especially thank Royce Larsen for logistical support and guidance in selecting and maintain field sites.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.