A global-scale relationship between crop yield anomaly and multiscalar drought index based on multiple precipitation data

Drought impact on crop production is well known as crop yield is strongly controlled by climate variation. Previous studies assessed the drought impact using a drought index based on a single input data set, while the variability of the drought index to the input data choice is notable. In this study, a drought index based on the standardized precipitation index with multiple timescales using several global precipitation datasets was compared with the detrended anomaly based on the global dataset of historical yield for major crops over 1981–2016. Results show that the drought index based on the ensemble precipitation dataset correlates better with the crop yield anomaly than a single dataset. Based on the drought index using ensemble datasets, global crop areas significantly affected by drought during the study period were around 23%, 8%, 30%, and 29% for maize, rice, soybean, and wheat, respectively, induced mainly by medium to longer drought timescale (5–12 months). This study indicates that most crops cultivated in dry regions were affected by droughts worldwide, while rice shows less correlation to drought as it is generally irrigated and cultivated in humid regions with less drought exposure. This study provides a valuable framework for data choices in drought index development and a better knowledge of the drought impact on agriculture using different timescales on a global scale towards understanding crop vulnerability to climate disruptions.


Introduction
Drought is often referred to as a creeping phenomenon (Wilhite and Glantz 1985), considered the most complex natural hazards affecting ecosystems (e.g. wildfire), agriculture (e.g. crop losses), and socioeconomic sectors (e.g. water supply, industry) (Heim 2002). Agriculture poses the most significant losses; drought-induced crop yield loss is considered a major loss among agricultural sub-sectors (Daryanto et al 2016, FAO 2018. Droughts can reduce soil-water availability, affect water and soil quality, and crop failures, which can threaten food security and livelihood (Lesk et al 2016, Lu et al 2017. Despite tremendous improvements in management and technology, crop production remains highly dependent on weather and climate variation (Rosenzweig et al 2001, Lobell and Field 2007, Lesk et al 2016.
Drought can be identified as meteorological, agricultural, hydrological, and socioeconomic drought (Dracup et al 1980, Wilhite and Glantz 1985, Keyantash 2002. Among the drought types, meteorological drought triggered by the precipitation deficit is considered the primary cause of the later drought types. Precipitation deficit then can have significant consequences on soil moisture shortage triggering lack of water supply to crop (agricultural drought), flow and storage deficit (hydrological drought), and water scarcity eventually affecting society (socioeconomic drought) (Heim 2002, Zargar et al 2011, Bachmair et al 2018. Understanding a multiple timescales drought representation and its lagged impacts is essential to characterize the subsequent implications of different drought types to different terrestrial systems (Vicente-Serrano et al 2012, Liu et al 2019.
Numerous drought indices have been developed to characterize the several types of drought, each with its strengths and weaknesses (Mishra andSingh 2010, Zargar et al 2011). Among them, the three drought indices, which are mainly based on meteorological indicators, have been widely used: standardized precipitation index (SPI) (Mckee et al 1993), palmer drought severity index (PDSI) (Palmer 1965), and standardized precipitation evapotranspiration index (SPEI) (Vicente-Serrano et al 2010). SPI was proposed as a meteorological drought index to identify drought frequency and duration represented in different timescales based on precipitation. The SPI is recommended by the World Meteorological Organization as the main drought index for a country's drought monitoring (Hayes et al 2011). PDSI and SPEI employ additional variables than precipitation (e.g. water holding capacity, temperature, evapotranspiration) (Paulo et al 2012, Liu et al 2018, Hoffmann et al 2020. These variables are often not well measured and are not widely available, although current progress on remote sensing approach can address such gap estimating the variables in a more consistent spatio-temporal scale , Liu et al 2020.
Numerous studies have compared the commonly used drought index applicability for various systems (Paulo et al 2012, Nam et al 2015, Spinoni et al 2015, Peña-Gallardo et al 2019a. More recently, Hoffmann et al (2020) compared SPI, SPEI, and PDSI with simulated soil moisture on a global scale and assessed the uncertainties of the input data from which they are developed based on the correlation. They suggested that the uncertainties in the variations of the drought indices to input data choice are more considerable than the differences between the indices themselves. Since the precipitation is the only input in SPI, this allows us to limit the input data uncertainties, representing a more robust drought characterization (Guttman 1998, Hayes et al 2011, Spinoni et al 2014, Hoffmann et al 2020.
Previous studies have demonstrated the suitability of the meteorological drought based on SPI for agricultural drought indicators and crop losses assessment alongside the other common drought indices (Yamoah et al 2000, Raja et al 2014, Kamali et al 2018a, Kim et al 2019, Leng and Hall 2019. Moreover, there are also numerous agricultural drought indices based on land surface parameters estimated by remote sensing approach (e.g. soil moisture, surface temperature, crop phenology) (Orth et al 2020, Ghazaryan et al 2020, such as moisture index (Thornthwaite and Mather 1955), vegetation health index (Kogan 1990), normalized difference vegetation index (Peters et al 2002), and various composite indices (Liu et al 2019(Liu et al , 2020. In the purpose of capturing the diverse response of crop to primary water deficit (i.e. precipitation), the multiscalar drought indices, SPI offers flexibility to represent the timescale-dependent drought (Guttman 1998, Zargar et al 2011 (2019). However, the contribution of this uncertainty is less understood when the drought index is linked with crop response. On the other hand, most previous studies used crop data on a national or sub-national level. The grid-scale approach was mainly conducted using global crop growth modeling due to limited historical gridded crop yield data globally with a few exemptions, such as Monfreda et al (2008) and Kim et al (2019). The information of crop response variations in a finer spatial scale (i.e. grid-scale) is necessary as the mitigation effort against extreme climate impact should be addressed in the country's specific vulnerable regions (Fraser et al 2013, Kamali et al 2018b. Furthermore, understanding the crop sensitivity of past trends can be valuable information for future efforts. To our knowledge, a global-scale evaluation of the relationship between drought using a drought index and agricultural yield anomaly has not been done by employing a multiscalar and multisource drought index on a grid-scale. This study aims to develop a global multiscalar drought index using multisource precipitation and to assess the relationship with crop yield anomaly as an indicator of agricultural impact to drought. Further, the objectives of this study include: (a) a comparison of SPI using different global precipitation datasets as inputs, (b) comparison of correlation skill between drought index and the global major crops yield anomaly based on different input data and timescales, (c) uncertainty of the crop yield correlated with drought based on different input data and timescale, and (d) spatial identification of drought impact to the crop yield using the most correlated input data and timescale. The drought index here was developed based on SPI with 1-12 month timescales based on the several global gridded precipitation datasets. We assessed the drought index skill by determining its relationship with the global major crop yield anomaly (maize, rice, soybean, and wheat) in a 0.5 • grid resolution during 1981-2016. This study enriches existing global drought-related studies for the specific agricultural system (i.e. crop yield). Nonetheless, this study gives a valuable reference for data selection in drought index development and a better understanding of the drought impact on agriculture using different timescales on a global scale.

Precipitation
Precipitation is the input data for deriving the drought index in this study, based on SPI (Mckee et al 1993). Therefore, the quantity and quality of the precipitation data are sensitive to the model fitting in deriving the SPI-based drought index. Recently, numerous global precipitation datasets are available, including gauge-, satellite-, merging-, and reanalysis-based datasets. Sun et al (2018) compared available global precipitation datasets and found discrepancies in the magnitude and variability of the precipitation datasets. This implies that there can be substantial differences between the drought index derived from different products (Golian et al 2019, Hoffmann et al 2020, Turco et al 2020).
For developing the drought index, we used several precipitation datasets that provide: (a) period over 1981-2016, which corresponds to the global historical yield dataset (see figure S1 available online at stacks.iop.org/ERL/17/014037/mmedia), (b) global land extent (90 • S-90 • N and 180 • W-180 • E), (c) grid in ∼0.5 • spatial resolution, and (d) daily or monthly temporal resolution. Finally, we selected nine precipitation datasets that meet the above requirements to estimate the drought index (DI) (table 1). The selected precipitation datasets were then resampled to a standard 0.5 • × 0.5 • resolution by bilinear interpolation. The temporal scale used in this study is a monthly scale over the study period.

Crop dataset
The global dataset of historical yields for major crops (GDHY) was used (Iizumi and Sakai 2020).
This study analyzes the four major crops: maize, rice, wheat, and soybean with a spatial resolution of 0.5 • spanning 1981-2016. The grid-cell estimation of crop yield in GDHY was calculated from agricultural census statistics from the Food and Agriculture Organization of the United Nations (FAO), satellite remote sensing, and various crop-related datasets.
We used crop calendar data for around the year 2000 (Sacks et al 2010) to estimate the annual drought index associated with the crop growing season for each crop. This data was also used for developing GDHY (Iizumi and Sakai 2020); thus, the window period between these major datasets used in this study can be consistent (i.e. using the same growing season during which yield data is obtained) (Iizumi et al 2018). This dataset was developed based on reported crop calendar (sub-national level) and global climate data. It has been widely used for numerous global crop studies as a primary crop calendar data source ( irrigated and rainfed crops including the major crops used in this study. The sums of irrigated and rainfed crop areas were used in this study, as GDHY does not specify irrigated and rainfed crop systems separately. The crop calendar dataset and MIRCA2000 are provided in the same grid resolution used in this study.

SPI calculation
The SPI is an accumulated precipitation value within a specific timescale (e.g. monthly). It is fitted to a probability distribution (i.e. gamma distribution), which is then transformed into a normal distribution with zero mean and standard deviation of one. We calculated SPI for different timescales (i.e. 1-12 months) given the typical timescale for agricultural application (Zargar et al 2011, Peña-Gallardo et al 2019b, Hoffmann et al 2020. The utilization of multiple timescales in SPI calculation for agriculture was used to reflect the response of crops to drought over a different region characteristic. The SPI calculation was done using each precipitation dataset, resulting in nine multiscalar SPI on a monthly basis for the period 1981-2016 (432 months).
In addition, to provide a combined drought index, we also obtained a mean of SPIs (hereafter ENSEMBLE) based on multiple datasets shown in table 1. We assigned equal weights to each dataset due to no prior knowledge of their skill to estimate the 'true' precipitation value. The averaged value was restandardized to a mean value of zero and standard deviation of one, adopted from Turco et al (2020). The ensemble approach has been used to provide input and associated uncertainty for forcing data in land surface models under historical and future climate conditions (Wang et al 2009, Lilhare et al 2019, Turco et al 2020. Moreover, this approach enables us to obtain the best estimate from the recently available datasets and preserve their unique characteristics. In this study, ENSEMBLE was used as a reference to quantify the agreement among SPIs developed using a different dataset.

Drought index
Drought index (DI) was defined by equation (1) (Kim et al 2019); a larger DI represents a larger deficit of precipitation (i.e. more severe drought conditions). Since we used annual crop yield data, monthly SPI was converted to an annual basis using equation (1), based on SPI value in certain month associated with the crop season period. We used harvested month obtained from Sacks et al (2010) dataset to select which SPI (January-December) was used to represent DI in a year. We used the constant MIRCA2000 harvested month throughout the years analyzed in this study: where SPI n,j,t is the n month SPI timescale in harvested month j and year t. For example, the harvested month of a particular crop is in February; thus, 6 month SPI (SPI 6 ) at year t is calculated from accumulated precipitation from September to December at year t − 1 and from January to February at year t. This calculation was done for each grid-cell, crop, and input dataset. Moreover, we used SPI less than zero (indicating the dry conditions) for the DI development in equation (1), since comprising wet conditions (i.e. SPI > 0) in the analysis may also introduce the impact of extreme wet conditions (e.g. flooding, submergence) to crop yield. Furthermore, here, we included −1 < SPI < 0, defined as near normal SPI (Mckee et al 1993, Guttman 1999, Svoboda et al 2012, to capture the lower drought magnitude with higher frequency that may control yield anomaly. Here, we also revealed that using the threshold of zero results in the highest global crop areas with a strong negative correlation (−1 ⩽ r < −0.5) compared to other thresholds (see figure S2).

Crop yield anomaly
Crop yield anomaly (θ t ) was used to represent crop yield variations in each grid point. The anomalies are obtained through a detrending method based on the long-term trend defined in equation (2). The longterm trend was estimated using a locally weighted regression model (LOWESS) (Cleveland and Devlin 1988). LOWESS method can account for possible non-linearity of the trend, and it is suitable for shorter time series (i.e. in this study) compared to other methods (e.g. linear regression model, moving average models, empirical mode decomposition model, smoothing spline models). Moreover, LOWESS is considered one of the well-performed detrending methods for crop yield (Ye et al 2015, Lu et al 2017. This trend removal allows us to exclude the overall yield increase due to recent technological advances in each grid point. This anomaly or residual is assumed to be induced mainly by climate disruptions: where Y t is crop yield (tons per hectare, t ha −1 ) and Y t is the long-term trend obtained by LOWESS (t ha −1 ). We applied the same neighborhood size (f parameter) in LOWESS as 0.25 for all grid points. This value was determined as it results in maximum significant areas of the correlation between drought magnitude and crop yield anomaly.

Correlation
We used the Pearson correlation coefficient to assess the relationship between crop yield anomaly and each DI calculated from nine precipitation data and 12 timescales for the four crop types. We examined the skill of each DI by comparing the percentage of drought-correlated global crop areas with a significant correlation (p-value < 0.05); the negative correlation indicates that crop yield loss is associated with drought. The global crop area percentage was calculated by weighting each valid grid point, where crop yield data is available, based on the crop area fraction of MIRCA2000.

SPI intercomparison
Here we compare SPI obtained from different datasets using both negative and positive SPI indicating dry and wet conditions, respectively. Our results in figure 1 show the comparison using the 6 month SPI (SPI 6 ). It presents the comparison between SPIs averaged in selected areas of interest (Africa, China, India, Europe, US, Argentina) with a high fraction of cropland obtained from MIRCA2000. These regions also represent the variation of precipitation observations density based on GPCC, from sparse to dense stations located in different continents. For example, the region of interest in Europe and the US defined in figure 1 can be classified as a dense observation region (1826 and 154 gauges, respectively). On the other hand, the region in Africa and Argentina represents scarce observations with only 10 and 19 precipitation gauges, respectively, within ∼780 km side of the square. Results show higher variability among SPI values developed by different datasets in such datascarce regions. Similar results were obtained using the other representative timescales (i.e. 3, 9, 12) (see figures S3-S5). Furthermore, figure 2 shows the spatial distribution of standard deviation from all SPI 6 among datasets averaged over the study period. SPI 6 was used to demonstrate the result; similar results were obtained using the other timescales. The higher spread (i.e. standard deviation) of SPIs can be prominent in almost the whole of Africa (except Southern Africa), higher latitude (e.g. subarctic regions), higher elevation and complex terrain (e.g. Tibetan plateau, West Coast South America), arid and semi-arid regions (e.g. West and Central Asia), and Amazon region. Notably, a higher spread is also found across Asian monsoon regions (i.e. around the Indian subcontinent to Southeast Asia) whose rainfall intensity is relatively higher than other regions.
Various factors may affect the discrepancy among SPIs calculated from the different datasets, for example, the accuracy of each measurement, orographic complexity (Sun et al 2018), and SPI means in extremely dry and cold regions (Spinoni et al 2014). As there is a larger range of precipitation values among different datasets, the higher uncertainty of SPI values can be expected as precipitation becomes only input (Golian et al 2019, Hoffmann et al 2020, Turco et al 2020). This indicates that using a single precipitation product to develop a precipitationbased drought index (i.e. SPI) may lead to uncertain results. Then, as the drought index has been commonly used in many drought impact assessments, this uncertainty related to precipitation data can affect the outcome. Therefore, the ensemble approach can be considered a strategy for dealing with these issues and getting the best estimate among datasets with associated uncertainty information (Wang et al 2009, Turco et al 2020. Figures 3 and 4 demonstrate the agreement among SPI with different timescales calculated from different datasets along the study period. We assessed the agreement based on the degree of the root mean square error (RMSE) of each SPI relative to ENSEMBLE. Figures 3(a) and (b) show respectively the global land grid percentage in which the highest and lowest RMSE is found; the lowest RMSE represents a high agreement among SPIs and vice versa. Whereas figure 4 shows the spatial distribution of figure 3 but only for SPI 6 (as results show a similar pattern using other timescales). Results indicate SPI based on JRA-55 generally tends to be distant to other datasets almost across all continents with 34% of the total global land grids average (except Antarctica), followed by CPC-based SPI with 23% global land. CPC likely differs from other datasets, particularly in higher latitudes. Whereas MSWEP yields SPI value closest to the ENSEMBLE with coverage of almost over 52% global land, followed by GPCC and MERRA-2 with respectively 13% global land with smallest RMSE to the    ENSEMBLE. The notable agreement of MSWEPbased SPI to ENSEMBLE was expected as this dataset was obtained by merging various precipitation datasets. Moreover, the gauge-based dataset (i.e. GPCC, CRU) shows a moderate agreement with the other datasets since this dataset has been used primarily for assimilation and adjustment for most precipitation products.

Correlation intercomparison 4.2.1. Timescale
The correlation between DI based on multiple timescales (1-12 months) and input datasets and crop yield anomaly were obtained. Here, we used ENSEMBLE as DI inputs to demonstrate the pattern for each crop linked with different DI timescales. Figure 5 shows significant global crop areas obtained from the correlation between crop yield anomaly and multiple timescales DI. It offers a similar response pattern for all crops; higher timescales significantly correlate with crop yield anomaly. In maize, the significantly correlated global crop areas increase dramatically from a timescale of 5 month and reach the maximum at 8 month with 23% (see table S2). Similarly, the significantly correlated crop areas for soybean get the most at 5 month DI with 30% global crop areas and are kept higher until at least 12 month timescale. For wheat, the significantly correlated crop area pattern shows is linearly related to the timescales. The higher correlated crop area is achieved gradually for medium to longer timescales, particularly from 5 month until 11 month by 29%. The significantly correlated grids for maize, soybean, and wheat are dominated by negative correlation, reflecting higher drought affecting crop yield loss (see figure S6). However, significantly correlated crop areas for rice are very small, with only ∼8% obtained at a 5 month timescale.
We revealed that the most correlated timescale is 9 month DI with 24% significant global crop areas (median across different crops), based on ENSEMBLE. Higher correlated timescales are particularly from medium to long timescales, which is apparently from 5 month. This suggests that crop yield anomaly tends to be more sensitive to mid-longer-term precipitation deficit, indicating the delayed drought processes that induce the plant stresses, which then affect yield reduction. This timescale is also associated with the growing season period for all crops, generally from 3 to 12 months from sowing to harvesting, depending on the crop type and local to regional practice. For a shorter crop growing season (e.g. <9 month), this longer timescale can start affecting the crop even several months before the cropping season. For a longer crop growing season, lack of accumulated precipitation during the whole or part of the season may significantly affect the crop yield. Moreover, this significance of a longer timescale of drought to crop yield anomaly exhibits interseasonal impact with different precipitation patterns and climate conditions involved (Matiu et al 2017). For instance, in spring wheat case, a lack of soil moisture recharge during winter may significantly affect the crop yield harvested in the later growing season (Peña-Gallardo et al 2019b). Figure 6 presents the correlation between crop yield anomaly and DI based on different input datasets in a fixed 9 month timescale as representative most correlated timescale. The horizontal axis in figure 6 shows ordered (left-to-right) from less to higher correlation between crop yield anomaly and each dataset from which DI was developed. The result indicates that ENSEMBLE-based DI has a higher significantly correlated global crop areas compared to the others, with around 24% (median among crops), followed by MERRA-2, GPCC, CRU, CPC, UDEL, PRECL, ERA-5, MSWEP, and JRA-55 with 23%-8%, respectively (see table S3). ENSEMBLE-based DI outperforms those based on other datasets, particularly for maize, soybean, and wheat, while for rice, GPCC reveals the highest significant global crop area. The mean difference between ENSEMBLE-based DI and other datasets for maize, rice, soybean, and wheat is 4% of the significant global crop area. Moreover, these results signify the performance of observation-based datasets (i.e. GPCC, CRU, CPC, UDEL, PRECL) that show a relatively higher correlation with crop yield anomaly. On the other hand, the reanalysis data (i.e. JRA-55, ERA-5) show relatively less correlation with crop yield anomaly, except MERRA-2.

Uncertainty based on input datasets and timescales
We assessed the variability of significantly correlated global crop areas for each crop based on different input datasets and timescales. We excluded DI with short timescales of 1-4 month and JRA-55 as input precipitation datasets, given less correlation with crop yield anomaly and a relatively distinct performance than others. Thus, there are 72 DI models with different timescales and input datasets (8 timescales × 9 input datasets). We quantified the uncertainty represented by the 95% confidence interval (CI) of the combinations based on 1000 bootstrapped samples (figure S8). We found that significantly correlated global crop areas for maize, rice, soybean, and wheat, respectively, on average, are 18.2% (CI 17.8%-18.7%), 6.5% (6.3%-6.7%), 22.4% (21.4%-23.2%), and 22.3% (21.6%-23.1%). Wheat and soybean exhibit the highest impacted crop area with a similar extent of global crop areas significantly correlated by drought (∼22% respectively), followed by maize (∼18%) and rice (∼6%).
The average value of significant global crop areas affected by drought based on the combined timescales and input datasets exhibit a lower value than that using best-correlated timescales and ENSEMBLE as input. Therefore, the operation of drought assessment to crop yield impact may consider this particular uncertainty for selecting drought parameterization and input datasets. For instance, using ENSEMBLE and MSWEP based 9 month timescales, the significant correlation between DI and crop yield anomaly may differ ∼7% of global crop area (for maize). This indicates a notable variability of using different input datasets and timescales to assess crop yield impact to drought, which should be considered. Nevertheless, this uncertainty does not yet account for the different choices of drought index and the crop yield data. Thus, the uncertainty obtained in this study may be only valid under this study's framework. Figure 7 demonstrates a general pattern of regions where crop yield is significantly correlated with drought based on ENSEMBLE based on the most correlated timescales calculated independently for each crop type (see section 4.2.1). For maize, the drought-induced crop yield loss can be profound in the Great Plains of the US, the east coast of Argentina and Paraguay (the Pampas), Mediterranean Europe, Southern and Southeast Africa (i.e. Mozambique, Zimbabwe, and Zambia), and Borneo Indonesia.

Drought impacted regions
There are a few regions with a strong negative correlation for rice. For soybean, drought likely affects crop yield loss in the Great Plains of the US, the Pampas, and South Africa. While for wheat, the droughtinduced crop yield loss can be significant in the Great Plains of North America, Spain, the northern coast of Morocco, East Africa (i.e. Ethiopia, Kenya), Turkey, Northern Kazakhstan, and Australia. It is worth noting that the high proportions of cropland with high negative correlation can be due to higher weight assigned in the grids (especially in the Great Plains and the Pampas owing extensive fraction of cropland). The drought-sensitive regions revealed in this study agree with Kim et al (2019), as this study used a similar crop yield dataset (Iizumi and Sakai 2020).
Moreover, in rice, the relationship between drought and crop yield anomaly is not evident across all global regions, as indicated by previous studies (Matiu et al 2017, Kim et al 2019). This may be due to irrigation support during the drought period that can alleviate the crop stress (Murthy et al 2009, Carrão et al 2016, Hayashi et al 2018, Fitriyah et al 2019, Lu et al 2020, given the highest global percentage of irrigated rice cropland by 62% global harvested area (Portmann et al 2010). Rice is typically cultivated in humid regions with larger annual precipitation, such as Asian monsoon regions. Moreover, the annual rice crop yield likely affected by flood in the rainy season may cause the yield variations to be less sensitive to a drought period in another season (Chau et al 2015, Chen et al 2018, Hendrawan and Komori 2021. This emphasizes rice crop yield insignificancy to drought worldwide. Figure 7. Correlation between crop yield anomaly and DI based on ENSEMBLE using the most correlated timescales (8, 5, 5, and 11 month) for maize, rice, soybean, and wheat, respectively. Pie charts in the lower left of each panel express the total global crop areas (%) based on MIRCA2000 for each range shown in the lower color bar. Non-significant grids are denoted by gray color.

Discussion
This study calculated the drought index (DI) based on SPI, which relies on precipitation data as input. Comparison among SPI developed from different datasets shows a discrepancy of SPI value, especially in datascarce regions where precipitation observations are also often sparse (e.g. Africa) (Sun et al 2018, Kamali et al 2018b, Golian et al 2019. Since the valid precipitation dataset cannot be determined, this comparison can be used only as a benchmark. The comparison among precipitation itself can be referred to, such as in Gehne et al (2016), Herold et al (2017), and Sun et al (2018, which also provide useful information regarding precipitation data comparison and the associated uncertainty. ENSEMBLE data obtained in this study are determined by the selected specific precipitation datasets and may change when other sources and resolutions are used. In addition, there might be dependency among the datasets themselves; thus, the ensemble mean obtained using a simple average (with equal weight) may not be used. Moreover, the development of SPI relied on the use of standardization based on the gamma distribution applied for all grids (Vicente-Serrano et al 2010, Spinoni et al 2014, Stagge et al 2015. A nonparametric standardization approach can also be employed to give no prior assumption on a specific distribution function of precipitation data , Madadgar et al 2017, Golian et al 2019. Our results reveal that mostly global major crops have a more robust response to medium to more prolonged drought precipitation deficit represented by SPI. Previous studies also indicate that medium to longer precipitation deficit timescales also have a higher correlation with soil moisture globally (Zampieri et al 2017, Hoffmann et al 2020, Turco et al 2020. Peña-Gallardo et al (2019b) also indicated crop response (barley, wheat, soybean, maize, and cotton) to mediumlong timescale in several US counties. A significant response to longer drought timescale has also been indicated in Vicente-Serrano et al (2013) by linking SPEI and some vegetation growth parameters (e.g. vegetation indices).
Nevertheless, these results might differ from several previous studies using county-level yield data in the US (Zipper et al 2016, Peña-Gallardo et al 2018. In Zipper et al (2016), most maize and soybean crops show a more significant response to a shorter SPEI timescale (1-3 months). Similarly, (Peña-Gallardo et al 2018) also emphasized the significance of a shorter drought timescale (1-3 months) for most of the response of the crops (barley, maize, cotton, soybean), while wheat distinctively showed varying timescale response for most of US counties. The underlying methodological difference between our study and the previous studies that may cause this disagreement are as follows. First, we used a fixed drought period associated with harvesting months based on Sacks et al (2010), while the previous studies used the most correlated month in the year to represent the annual drought events. Second, we used GDHY (Iizumi and Sakai 2020) that integrated country-level yield data and a remote sensing approach to obtain yield in each grid point. Some discrepancies might occur when GDHY is compared to the aggregated data which come from local observation. However, since our drought index here was developed based on the crop growing season, our study may provide a more inherent definition of agricultural drought. It should be noted that the crop yield data used in this study average major and minor crops for multiple crop seasons (maize, rice, wheat); therefore, the crop growing season used here is assumed to correspond to the major growing season in which the yield has a more dominant share.
Comparing the response of each crop to the drought, our results show a distinct pattern of two major groups of crops: maize, soybean, and wheat considered dry farming crop and rice crop as a wet farming crop. The more vigorous response to drought is likely indicated in dry farming crops, shown by a mostly negative correlation as drought affects crop yield loss. It implies the higher vulnerability of dry farming crops to climate variability, which is primarily rainfed and cultivated in regions with lesser precipitation than wet farming crops (Portmann et al 2010, Matiu et al 2017, Kim et al 2019. It also emphasizes different resistance and resilience mechanisms between those main groups of crops as the response of crops to drought relies on drought timescales characteristic.
Nevertheless, since this study only considers precipitation deficits exclusively as a climate impact on crops, other climatic factors can also be considered to understand unexplained variations (e.g. temperature, evaporative demand) ( . Moreover, considering the characteristics of crop vulnerability and adaptive capacity to drought on a broader view, other various factors may also play an important role, for example, agronomic factors (e.g. irrigation fraction, fertilizer application, management practices, pests), environmental (e.g. soil, water holding capacity), and socioeconomic factors (i.e. gross domestic product GDP, human development index, population, agriculture share in GDP) (Antwi-Agyei et al 2012, Simelton et al 2012, Fraser et al 2013, Wurster et al 2020. Thus, further studies can address the dominant mechanism of drought impact to crop globally by assessing multivariate factors affecting yield variations using an integrated approach combining physical and socioeconomic factors. Finally, our study relies on the limitation of the study framework and data used. Along with the uncertainties from the choices of input datasets and timescales, uncertainties are also obviously present in the other dataset inputs (i.e. precipitation, crop yield, and crop calendar data). Notably, as an important independent variable in this study, GDHY was developed based on remote-sensed net primary production (NPP), country statistics from FAO, and crop calendar (Sacks et al 2010). The uncertainty and usage considerations have been pointed out in Iizumi and Sakai (2020), including, for example, the reliability of the country yield dataset and the use of a constant crop calendar as a basis to calculate accumulated NPP for yield share estimation in a gridscale. Moreover, the spatial distribution of uncertainty from yield and drought index used in this study may propagate to our finding, notably, in the drought-prone area and in regions where agronomic and precipitation data are lacking.
Furthermore, regarding crop calendar and crop area, a change of crop growing season and crop area year to year may not be captured as the static data were applied in this study (Portmann et al 2010, Sacks et al 2010. The use of a dynamic crop calendar and crop areas data, combining observed data with the aid of satellite remote sensing data (i.e. crop phenology), is one of the possible approaches for further studies, such as in van Hoolst et al (2016) and Mathison et al (2018). In addition, it is worth mentioning that the GDHY crop yield data only includes around 77%, 69%, 90%, and 70% of the MIRCA2000 global harvested area for maize, rice, soybean, and wheat, respectively. Therefore, percentage values obtained in this study may not be translated to the actual estimated global crop areas (i.e. hectares) according to MIRCA2000. Rather, the results in this study may represent only samples of global crop areas where yield data is available.
Spatial explicit information crop sensitivity to drought is essential for understanding the vulnerability variation since agricultural vulnerability may vary between smaller regions within the country (Antwi-Agyei et al 2012). This gridded analysis of crop sensitivity to drought events using historical data can help us indicate the vulnerable hotspot of crop areas as a key to increasing the agricultural system's resilience to disaster events , Fraser et al 2013. Moreover, this agricultural impact on drought can be referred to as a conceptual framework of disaster risk; risk as a function of hazard, exposure, vulnerability, and resiliency (Simelton et al 2009, Aitsi-Selmi et al 2015, UNISDR 2015. Assessing drought impact using a robust drought index is needed to understand the climatic hazard component to agriculture, while the exposure is the component of crop growing season in which crops are mostly exposed by climate disruptions. Estimating hazard and exposure by considering coincidence between crop growing season and drought episode can be an approach to assess disaster impact to agricultural risk (i.e. crop yield anomaly). Lastly, this study provides a simple framework to evaluate drought impact on agriculture. We can extend our knowledge of the crop-disaster vulnerability towards global change using this past understanding.

Conclusion
This study developed a global multiscalar drought index (DI) using different precipitation datasets to assess the relationship with crop yield anomaly over 1981-2016. Results show the discrepancy among SPIs developed by different input datasets is profound in data-scarce regions, which is also likely vulnerable to drought impact. The drought index based on ensemble mean (ENSEMBLE) provides a higher correlation with the crop yield anomaly. Based on ENSEMBLE, this study reveals that 23%, 8%, 30%, 29% of global crop areas for maize, rice, soybean, and wheat, were significantly affected by drought at different timescales from medium to longer timescales (5-12 months). This study suggests that drought estimation using combined several precipitation datasets may increase the representation of crop yield anomaly associated with drought. However, rice shows less correlation with drought among all crops as it is generally irrigated and cultivated in humid regions with less drought exposure. The significantly affected regions include the Great Plains, the Pampas, Mediterranean Europe, Morocco, Southern and Southeast Africa, Borneo Indonesia, and particularly for wheat, including East Africa, Turkey, Northern Kazakhstan, and Australia. This study provides a valuable reference for data selection in drought index development and a better understanding of the drought impact on crops using different timescales on a global scale.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://doi.org/10.6084/m9.figshare.14880396. The source codes of the study analysis are available from the corresponding author upon request.