A global drought monitoring system and dataset based on ERA5 reanalysis: A focus on crop‐growing regions

Drought monitoring systems are real‐time information systems focused on drought severity data. They are useful for determining the drought onset and development and defining the spatial extent of drought at any time. Effective drought monitoring requires databases with high spatial and temporal resolution and large spatial and temporal coverage. Recent reanalysis datasets meet these requirements and offer an excellent alternative to observational data. In addition, reanalysis data allow better quantification of some variables that affect drought severity and are more seldom observed. This study presents a global drought dataset and a monitoring system based on the Standardized Precipitation Evapotranspiration Index (SPEI) and ERA5 reanalysis data. Computation of the atmospheric evaporative demand for the SPEI follows the FAO‐56 Penman‐Monteith equation. The system is updated weekly, providing near real‐time information at a 0.5° spatial resolution and global coverage. It also contains a historical dataset with the values of the SPEI at different time scales since January 1979. The drought monitoring system includes the assessment of drought severity for dominant crop‐growing areas. A comparison between SPEI computed from the ERA5 and CRU datasets shows generally good spatial and temporal agreement, albeit with some important differences originating mainly from the different spatial patterns of SPEI anomalies, as well as from employing long‐term climate trends for different regions worldwide. The results show that the ERA5 dataset offers robust results and supports its use for drought monitoring. The new system and dataset are publicly available at the link https://global‐drought‐crops.csic.es/.


| INTRODUCTION
Drought is one of the most damaging hydrometeorological hazards, affecting environmental and socio-economic systems (Wilhite & Pulwarty, 2017). Quantifying drought is very difficult since it depends on multiple physical factors (e.g., precipitation, soil moisture, streamflow, groundwater). Also, drought-related impacts depend strongly on the vulnerability of the affected environmental systems and socio-economic sectors (Anderegg et al., 2019; Flo et al., 2021; Kim et al., 2019). Impact data are rarely available, so drought quantification is usually based on physical variables obtained from direct observations or modelling approaches. These physical variables are transformed into indices to establish spatial and temporal comparisons of drought severity, irrespective of the magnitude and seasonality of the variable of interest, and to assess spatial drought extent.
Drought indices are nowadays the most useful, comprehensible and widely used way of quantifying and characterizing drought severity (Heim, 2002; Mukherjee et al., 2018). There is a wide variety of drought indices, each with its advantages and shortcomings, but all allow determining the severity and duration of droughts (Mishra & Singh, 2010; Mukherjee et al., 2018). Some of these indices are designed with a particular focus (e.g. determining hydrological droughts, soil moisture deficits or precipitation anomalies), while others have more general use. Examples of the latter are the Palmer Drought Severity Index (PDSI; Palmer, 1965) and the Standardized Precipitation Evapotranspiration Index (SPEI; Vicente-Serrano, Beguería, & López-Moreno, 2010). Both indices have shown a close relationship with hydrological deficits, plant stress and crop yields (Bachmair et al., 2016, 2018; De Keersmaecker et al., 2017; Potopová et al., 2015; Scaini et al., 2015; Vicente-Serrano et al., 2014; Yuan et al., 2020).
Drought forecasting is nowadays very uncertain. Despite recent advances (AghaKouchak, 2014; Sheffield et al., 2014; Trnka et al., 2020), the skill of current forecasting systems is disappointingly low (Wood et al., 2015). Although drought monitoring does not allow determining the possible temporal evolution of a drought, it does allow defining drought onset and development and determining drought spatial extent and severity (Bokal et al., 2014; Dracup, 1991). Drought monitoring is, therefore, the primary tool to assess current drought conditions and implement drought mitigation measures and management plans (Bokal et al., 2014; Dracup, 1991).
The recent development of climate reanalysis datasets, updated with high frequency and a suitable spatial resolution for regional studies, allows implementing climate monitoring systems. Reanalysis datasets have several advantages over observational ones. They provide climate variables that are not commonly available through observations or are difficult to spatialize (e.g. wind speed). Moreover, they typically provide updated and detailed spatial information at high frequency, which is especially relevant during drought development and intensification. During these critical periods, determining the precise spatial extent and severity of drought conditions has the highest applied interest. Moreover, as reanalysis datasets show physical consistency between the different variables, they provide spatial and temporal coherence for many climatic variables. Some of these variables are highly relevant for drought monitoring and are seldom registered in observational networks. In particular, global warming has increased the role of atmospheric evaporative demand (AED) in the onset and evolution of droughts and their impact on ecological and agricultural systems (Allen et al., 2010, 2015; Asseng et al., 2015; Lobell et al., 2011). Thus, some studies suggest that increased AED plays a pivotal role in intensifying droughts worldwide (Dai et al., 2018; Dai & Zhao, 2017). With the available real-time observational meteorological datasets, it is impossible to accurately quantify the AED because temperature is the only variable available in most observatories. Temperature data alone allow only rough and uncertain estimates of the AED (Vicente-Serrano et al., 2020). Physically based methods that allow for an accurate estimation of the AED, such as the FAO-56 Penman-Monteith equation, consider both radiative and aerodynamic components (Pereira et al., 2015). In addition to temperature, these methods require information about air humidity, solar radiation and wind speed. Unlike observational datasets, reanalysis products provide these variables routinely.
Although drought management is usually performed on a local to national scale, different drought monitoring systems are available on a global scale, allowing the assessment of general dry or humid conditions over large areas (Beguería et al., 2014; Hao et al., 2014; Turco et al., 2020; Wood et al., 2015). These systems can also determine drought severity in regions where operative drought monitoring is non-existent, which is the most common situation. Even with their inherent uncertainties, mainly for precipitation (Alexander et al., 2020), reanalysis products could be a landmark in developing global drought monitoring systems based on meteorological drought indices. This is mainly because they allow better characterization of the AED while offering near real-time updates.
This study describes a global drought monitoring system based on the SPEI calculated from the ERA5 dataset. The information is provided at a 0.5° longitude and latitude spatial resolution and is updated weekly. In addition, this work assesses drought severity for specific crop-growing regions, illustrating the potential applicability of the system. In order to accomplish this task, crop yield masks for the world's most common crops have been added to the system. Importantly, this system provides information in near real time while also providing a weekly dataset dating back to 1979.
ERA5 is a new reanalysis product developed by the European Centre for Medium-range Weather Forecasts (Hersbach et al., 2020). It provides hourly precipitation information from 1950 and is updated regularly, providing preliminary daily precipitation data with a four-day delay. ERA5 also generates a potential evaporation (Epot) product based on a surface energy balance that could be assimilated to the atmospheric evaporative demand since it uses a constant vegetation parameter corresponding to crops/mixed farming and assumes non-limited water availability (ECMWF, 2019). Nevertheless, the ERA5 Epot data have shown problems, as they exhibit a general underestimation and regional artefacts (ECMWF, 2020). For this reason, we decided to calculate the AED using the FAO-56 Penman-Monteith reference evapotranspiration equation based on daily inputs from different ERA5 variables. For this purpose, we used daily data of 2-m maximum and minimum air temperature, downward surface solar radiation, 10-m wind speed and 2-m dewpoint temperature. Precipitation and all variables required to calculate the AED were extracted from the ERA5 dataset at a spatial resolution of 0.5°. This resolution provides sufficient spatial detail for a global drought monitoring system, as it permits the identification of regional patterns of drought severity at continental and national spatial scales. Although the ERA5 dataset contains information since 1950, some problems have been identified before 1970 (Simmons et al., 2021). For this reason, we decided to focus exclusively on the period from 1979 to the present, which corresponds to the ERA5 first release. This, of course, does not impede updating the dataset to include earlier dates once less uncertain records are released.
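As an illustration of the AED calculation described above, the following is a minimal sketch of the daily FAO-56 Penman-Monteith reference evapotranspiration, written from the published FAO-56 equations rather than from the authors' code. The function name, argument names and units are assumptions; it presumes that solar radiation has already been converted to MJ m-2 day-1, that soil heat flux is negligible at the daily step, and it needs elevation, latitude and day of year in addition to the ERA5 variables listed. The fallback used when clear-sky radiation is zero (polar night) is an arbitrary choice for the sketch.

```python
import math


def fao56_reference_et(tmax, tmin, tdew, wind10, rs, elevation, lat_deg, doy):
    """Daily FAO-56 Penman-Monteith reference evapotranspiration (mm/day).

    tmax, tmin, tdew: 2-m temperatures (deg C); wind10: 10-m wind speed (m/s);
    rs: downward surface solar radiation (MJ m-2 day-1); elevation (m);
    lat_deg: latitude (degrees); doy: day of year (1-366).
    """
    def svp(t):
        # Saturation vapour pressure (kPa) at temperature t (deg C)
        return 0.6108 * math.exp(17.27 * t / (t + 237.3))

    tmean = (tmax + tmin) / 2.0
    es = (svp(tmax) + svp(tmin)) / 2.0             # mean saturation vapour pressure
    ea = svp(tdew)                                 # actual vapour pressure from dewpoint
    delta = 4098.0 * svp(tmean) / (tmean + 237.3) ** 2
    pressure = 101.3 * ((293.0 - 0.0065 * elevation) / 293.0) ** 5.26
    gamma = 0.000665 * pressure                    # psychrometric constant (kPa/deg C)
    u2 = wind10 * 4.87 / math.log(67.8 * 10.0 - 5.42)  # 10-m wind reduced to 2 m

    # Extraterrestrial radiation (MJ m-2 day-1)
    phi = math.radians(lat_deg)
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * doy / 365.0)
    decl = 0.409 * math.sin(2.0 * math.pi * doy / 365.0 - 1.39)
    ws = math.acos(max(-1.0, min(1.0, -math.tan(phi) * math.tan(decl))))
    ra = (24.0 * 60.0 / math.pi) * 0.0820 * dr * (
        ws * math.sin(phi) * math.sin(decl)
        + math.cos(phi) * math.cos(decl) * math.sin(ws)
    )

    # Net radiation: net shortwave (albedo 0.23) minus net longwave
    rso = (0.75 + 2e-5 * elevation) * ra
    # Assumption: relative shortwave radiation bounded to [0.3, 1.0];
    # 0.3 is used when there is no clear-sky radiation at all.
    rel = min(max(rs / rso, 0.3), 1.0) if rso > 0 else 0.3
    rns = (1.0 - 0.23) * rs
    sigma = 4.903e-9                               # Stefan-Boltzmann (MJ K-4 m-2 day-1)
    rnl = (sigma * ((tmax + 273.16) ** 4 + (tmin + 273.16) ** 4) / 2.0
           * (0.34 - 0.14 * math.sqrt(ea)) * (1.35 * rel - 0.35))
    rn = rns - rnl

    numerator = (0.408 * delta * rn
                 + gamma * 900.0 / (tmean + 273.0) * u2 * (es - ea))
    return numerator / (delta + gamma * (1.0 + 0.34 * u2))
```

Daily values computed this way could then be summed over each analysis period to obtain the AED term of the climatic water balance.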
The available daily information from 1979 onwards was grouped on a weekly temporal scale. Nevertheless, as the SPEI is a relative metric that requires the same periods each year, it was impossible to use 'week' as the reference time step for calculations. This is so because the first day of each year can fall on different weekdays, and this temporal offset propagates throughout the entire year. The occurrence of leap years also potentially increases the different periods to be compared. For this reason, each month was divided into four artificial 'weekly' periods: the first from the 1st to the 8th day; the second from the 9th to the 15th day; the third from the 16th to the 22nd day; and the fourth from the 23rd day to the end of the month. This approach ensures that the analysis periods correspond to the same Julian days year after year, except for leap years, which contain a one-day offset after February 28th. In addition to the meteorological variables listed above, we also used elevation, latitude and the Julian day to calculate the weekly AED. To match the spatial resolution of the ERA5 dataset, the elevation extracted from the Global 30 Arc-Second Elevation (GTOPO30) dataset was resampled to a common grid resolution of 0.5° using the bilinear interpolation method.
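The division of each month into four 'weekly' periods can be sketched as follows. This is a minimal illustration of the day ranges given in the text; the function names and the 1 to 48 numbering of periods within the year are assumptions made for the example.

```python
import calendar
from datetime import date


def weekly_period(d: date) -> int:
    """Return the within-month period (1-4) for a calendar date."""
    if d.day <= 8:
        return 1
    if d.day <= 15:
        return 2
    if d.day <= 22:
        return 3
    return 4  # 23rd to the end of the month


def period_of_year(d: date) -> int:
    """Return the period index within the year (1-48), four per month."""
    return (d.month - 1) * 4 + weekly_period(d)


def period_bounds(year: int, month: int, period: int) -> tuple:
    """Return the first and last date of a given period of a month."""
    starts = {1: 1, 2: 9, 3: 16, 4: 23}
    ends = {1: 8, 2: 15, 3: 22, 4: calendar.monthrange(year, month)[1]}
    return date(year, month, starts[period]), date(year, month, ends[period])
```

With this scheme every year has exactly 48 periods, so a given period index always refers to the same calendar days, and only the last period of February changes length in leap years.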

| Crop masks
To facilitate the assessment of drought conditions over major global crop areas, we implemented spatial masks corresponding to the following crops: wheat, barley, corn, soybeans and cotton. The layers of the different crops were obtained from the HarvestChoice project (https://www.ifpri.org/project/harvestchoice), which provides gridded information on the spatial coverage of individual crops at very fine spatial detail (0.08°). To match the spatial resolution of the ERA5 data, crop data were resampled to a grid resolution of 0.5° by means of a majority filter.
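A majority filter of this kind can be sketched as below. This is only an illustration under simplifying assumptions: the fine mask is a binary presence/absence grid stored as nested lists, and the coarse cell size is an integer multiple of the fine cell size (which is not the case for 0.08° and 0.5°, so a real implementation would need an intermediate regridding step). The function name is hypothetical.

```python
def majority_resample(mask, factor):
    """Aggregate a binary mask to a coarser grid with a majority rule.

    mask: list of rows of 0/1 values on the fine grid.
    factor: number of fine cells per coarse cell along each axis.
    A coarse cell is True when more than half of its fine cells are 1.
    """
    n_rows, n_cols = len(mask), len(mask[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid size must be a multiple of the factor")
    coarse = []
    for r in range(0, n_rows, factor):
        row = []
        for c in range(0, n_cols, factor):
            count = sum(mask[i][j]
                        for i in range(r, r + factor)
                        for j in range(c, c + factor))
            # Strict majority: ties are treated as 'crop not dominant'
            row.append(2 * count > factor * factor)
        coarse.append(row)
    return coarse
```

The treatment of ties is a design choice of this sketch; the source does not state how ties were resolved.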

| DATA CALCULATION AND UPDATING
Using the weekly precipitation and AED, we calculated the SPEI at different time scales (0.5, 1, 3, 6, 9, 12, 24, 36 and 48 months) according to the methodology detailed in Vicente-Serrano, Beguería, and López-Moreno (2010). As the primary purpose is to use this information for real-time monitoring, it is necessary to generate an operative update of the SPEI as soon as new data are available. Recalculating the entire SPEI dataset every week is not feasible because the necessary computing time would delay the information update. Moreover, if the whole dataset were recalculated every time new data arrive, the previous values would also change, limiting the comparability of the new data with the earlier records. To avoid this, we initially calculated the parameters of the log-logistic distribution used to obtain the SPEI for 1979-2020. The parameters corresponding to each week of the year and each time scale were pre-calculated and stored. They are used afterwards for calculating the SPEI without recalculating the entire dataset each time. This also allows the dataset to be updated quickly and efficiently as soon as new weekly data are available.
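The update step described above can be sketched as follows: the climatic water balance (precipitation minus AED) is accumulated over the chosen time scale, and the accumulated value is converted to a SPEI value with stored parameters of the three-parameter log-logistic distribution. The function names, the parameter names (alpha for scale, beta for shape, gamma for origin) and the probability clipping are assumptions of this sketch. It uses the exact inverse normal distribution from the Python standard library in place of the rational approximation given in the original SPEI formulation.

```python
from statistics import NormalDist


def accumulate_balance(precip, aed, n_periods):
    """Rolling sum of D = P - AED over n_periods consecutive time steps.

    Returns None for the first steps, where the window is incomplete.
    """
    balance = [p - e for p, e in zip(precip, aed)]
    return [
        sum(balance[i - n_periods + 1:i + 1]) if i >= n_periods - 1 else None
        for i in range(len(balance))
    ]


def loglogistic_cdf(d, alpha, beta, gamma):
    """Three-parameter log-logistic CDF: F(d) = [1 + (alpha/(d-gamma))^beta]^-1."""
    if d <= gamma:
        return 0.0
    return 1.0 / (1.0 + (alpha / (d - gamma)) ** beta)


def spei_from_params(d, alpha, beta, gamma, eps=1e-6):
    """Convert an accumulated balance to SPEI using pre-computed parameters."""
    # Clip the probability so the normal quantile stays finite (assumption)
    p = min(max(loglogistic_cdf(d, alpha, beta, gamma), eps), 1.0 - eps)
    return NormalDist().inv_cdf(p)
```

In such a scheme the parameters would be looked up by period of the year and time scale, so a new weekly value only requires one accumulation and one transformation per grid cell. With four periods per month, a 0.5-month scale would correspond to two periods and a 3-month scale to twelve; this mapping is an inference from the text, not a documented detail.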

SPEIBASE
The SPEIbase was generated in 2010 (Beguería et al., 2010; Vicente-Serrano, Beguería, López-Moreno, Angulo, et al., 2010) and updated with different versions of the Climatic Research Unit (CRU) global gridded dataset (Harris et al., 2020). The SPEIbase uses precipitation and potential evapotranspiration data as a surrogate of the AED to calculate the SPEI at different time scales. This dataset is available at https://spei.csic.es/spei_database, and it has been widely used in drought studies from different points of view, from the analysis of physical processes (e.g. Das et al., 2016; Yao et al., 2018; Zhao et al., 2017) to the study of drought impacts (e.g. Bachmair et al., 2018; Feldpausch et al., 2016; Wang et al., 2014). The SPEIbase is not used for real-time monitoring in any system because it is only updated annually. Nevertheless, it provides useful long-term information for comparison with the long-term ERA5 dataset. Here, we compare the spatial and temporal agreement between the new ERA5 SPEI dataset and the SPEIbase. As the CRU data have a monthly temporal resolution, we also calculated the SPEI from monthly ERA5 data. To make both datasets comparable, we calculated the SPEI using the same reference period of 1979-2020.
The SPEI sensitivity to AED is strongly dependent on the average precipitation and AED magnitude (Tomas-Burguera et al., 2020). Nevertheless, we do not expect a bias related to this issue in the comparability between the SPEI from CRU and ERA5 data as both datasets are similar in terms of spatial patterns and magnitude of both variables (Figures 1 and 2).
In general, there is a high correlation (Pearson's r > 0.9) between the 3-month and 12-month SPEI time series calculated from the CRU and ERA5 datasets in vast regions of the world (Figure 3). There are, however, some differences. The temporal correlation between the datasets is higher in Europe, most of Asia, Australia, North America (except for high latitudes), East and Southern South America and Southern Africa. This means that in these regions, the identification of drought conditions is independent of the dataset used. In some other areas, however, the correlation is much lower. This is particularly the case in the Amazon basin and western North America, most of central Africa and hyper-arid areas of the Sahara and the Arabian Peninsula.
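For reference, the temporal correlation at a single grid cell can be computed as in the minimal sketch below, which takes the two SPEI series for that cell and returns Pearson's r. The function name and the handling of missing values (pairs containing None are skipped) are assumptions of the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long series, skipping None pairs."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation undefined for a constant series
    return cov / math.sqrt(var_x * var_y)
```

Applying the function to the time series of each grid cell gives a correlation map, while applying it across all grid cells for one month gives a spatial correlation for that month.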
The factors that explain these differences can be diverse. On the one hand, the spatial distribution of meteorological stations may play a role. In most regions that show a high correlation between the datasets, there is good coverage of meteorological stations. At the same time, the assessment of convection processes in humid equatorial regions in the ERA5 dataset is also affected by uncertainties (Taszarek et al., 2021), which may affect the comparability between the datasets in these regions. In hyper-arid regions such as the Sahara or the Arabian Peninsula, the SPEI is driven mainly by changes in the AED. In these areas, land-atmosphere feedbacks related to the extremely warm land strongly control air temperature and vapour pressure deficit (Brutsaert, 1986; Brutsaert & Stricker, 1979). It is possible, too, that the reanalysis is affected by uncertainties in assessing these processes. Also, wind speed assessment could play a role because the CRU dataset employs the 1961-1990 average monthly wind speed when calculating AED. On the other hand, as a result of the complexity of the interaction between the relief and the dominant airflow direction, modelling ERA5 wind speed is significantly more uncertain than modelling other variables (Deng et al., 2021; Minola et al., 2020).
We assessed the degree of agreement in drought conditions between the two datasets. For this purpose, we compared the percentage of cases in which dry (SPEI < 0), mild (SPEI < −0.84) and extreme drought (SPEI < −1.65) occurred concurrently. To accomplish this, we calculated the proportion of cases in which drought conditions recorded in the CRU SPEI are well reproduced by the ERA5 dataset (Figure 4a). Also, we looked for cases in which no drought conditions were recorded in the CRU dataset, while drought conditions were present in the ERA5 dataset (i.e. false positives).
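The two percentages can be sketched as follows for a single grid cell, taking the CRU SPEI as the reference. The thresholds are those given in the text; the function name and the choice of denominators (CRU drought cases for the agreement, CRU non-drought cases for the false positives) are assumptions, since the text does not spell them out.

```python
THRESHOLDS = {"dry": 0.0, "mild": -0.84, "extreme": -1.65}


def agreement_and_false_positives(cru, era5, threshold):
    """Return (agreement %, false positive %) for one pair of SPEI series.

    agreement: share of CRU drought cases (SPEI < threshold) in which the
    ERA5 SPEI is also below the threshold.
    false positives: share of CRU non-drought cases in which the ERA5 SPEI
    is below the threshold.
    """
    hits = misses = n_drought = n_no_drought = 0
    for c, e in zip(cru, era5):
        if c < threshold:
            n_drought += 1
            hits += e < threshold
        else:
            n_no_drought += 1
            misses += e < threshold
    agreement = 100.0 * hits / n_drought if n_drought else float("nan")
    false_pos = 100.0 * misses / n_no_drought if n_no_drought else float("nan")
    return agreement, false_pos
```

For example, `agreement_and_false_positives(cru, era5, THRESHOLDS["mild"])` would give the two percentages for mild drought at one grid cell.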
Considering dry conditions (SPEI < 0), the majority of the world regions exhibit high agreement between the two datasets, indicating that the occurrence of dry conditions with the CRU dataset is generally well reproduced by the ERA5 dataset. When focusing on mild and extreme drought events, the agreement decreases, but large areas of the world agree on the occurrence of drought conditions. Moreover, there is a small percentage of false positives corresponding mainly to extreme and mild drought conditions, demonstrating that when CRU SPEI is greater than 0, the ERA5 SPEI showed similar patterns during the most anomalous drought conditions. Also, the percentage of false positives increases for dry conditions (SPEI < 0). Overall, it can be concluded that the ERA5 SPEI data are capable of identifying the dry conditions identified by the CRU dataset, especially in regions with dense meteorological records. In contrast, more uncertainty is introduced in regions characterized by a sparse network of observations.
A careful look into the SPEI under specific periods shows that in areas with low meteorological station density, the CRU SPEI seems to be strongly determined by the stations' spatial distribution (Figure 5), which shows important spatial and temporal differences (see Figure 1 in Harris et al. (2020)). The CRU SPEI maps present circular shapes (around available observations) that do not correspond to the expected smooth spatial variation of wet and dry conditions. Representative examples are areas of South America (Brazil) and central Africa. The ERA5 dataset, on the other hand, offers a much more coherent spatial distribution of the SPEI, with smooth, gradual transitions and no strange shapes.
In general, there is good agreement in the spatial distribution of the SPEI from both datasets, and they record similar large-scale drought periods. Nevertheless, the spatial correlation is not very high (Figure 6). Correlations vary between 0.5 and 0.6, which indicates general agreement between the two datasets. Still, there are noticeable differences on the regional and local scales.
Considering only the spatial structure of the SPEI values in particular months (see Figure 5), it seems that the ERA5 dataset is more coherent with the spatial variations in drought. Nevertheless, modelling precipitation and other variables involved in calculating the AED in the ERA5 dataset, particularly in areas with a low density of meteorological stations, still introduces a degree of uncertainty to this dataset.
As stressed above, the assessment of drought conditions seems to show agreement between both datasets, so for drought monitoring, particularly in areas characterized by high data availability, the use of the ERA5 SPEI appears to be highly recommendable. Nevertheless, for the long-term assessment of drought conditions, further research is required. Some studies have found divergent trends in annual precipitation between ERA5 data and observational datasets (Shobeiri et al., 2021; Tarek et al., 2020). In particular, Nogueira (2020) compared the annual precipitation trends from the GPCC and ERA5 datasets and found that ERA5 shows ample declining trends in precipitation in several regions of Africa and South America that are not observed with the GPCC. We found a stronger increase in the surface area affected by drought using the ERA5 than the CRU dataset (Figure 7). The difference is likely due to the stronger decline in precipitation in ERA5 than in CRU (results not shown). However, further research is needed to determine the reliability of ERA5 data to assess long-term drought severity trends.
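The percentage of surface area affected by drought for one time step can be sketched as below. The sketch assumes a regular latitude-longitude grid, on which the area of a cell is proportional to the cosine of its latitude; this weighting, the function name and the use of None for cells without data are assumptions, as the text does not describe how the area was computed.

```python
import math


def drought_area_percent(spei_grid, lats, threshold):
    """Percentage of the valid area with SPEI below a threshold.

    spei_grid: list of rows (one per latitude), each a list of SPEI values
    or None where there is no data (e.g. ocean cells).
    lats: latitude (degrees) of each row.
    """
    total = affected = 0.0
    for row, lat in zip(spei_grid, lats):
        weight = math.cos(math.radians(lat))  # relative cell area
        for value in row:
            if value is None:
                continue
            total += weight
            if value < threshold:
                affected += weight
    return 100.0 * affected / total if total > 0 else float("nan")
```

Computing this value for every time step yields a series of the kind whose trend is compared between the two datasets.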
Here, it should be noted that data temporal homogeneity is not a major concern for the real-time monitoring application, which is more concerned with establishing spatial differences between areas affected by dry or humid conditions. It is evident that the metrics used to quantify the severity of drought could be affected by the lack of temporal homogeneity in the input data. Nonetheless, this level of uncertainty can be found in any precipitation dataset (Hassler & Lauer, 2021). In addition, not all parts of the world are equally affected by this uncertainty. Though global SPEI trends in the CRU and the ERA5 datasets appear to be largely in agreement (Figure 8), there are notable differences between the two datasets in the regions where there is a limited number of meteorological stations and where any precipitation database suffers from significant limitations. In contrast, SPEI trends in Western Europe, North America and Australia are more closely aligned with each other, which strengthens the new system's ability to monitor drought in these regions. The ERA5 reanalysis is expected to improve in the future in terms of data homogeneity, and the improved data will then be used in the drought monitoring system.

| DATASET LOCATION AND FORMAT
The drought monitoring dataset is available at https://global-drought-crops.csic.es/. Figure 9 shows a screenshot of the system. In the top-right of the screen, there are two menus. The first one is a timeline bar that allows selecting any weekly timeframe between 1979 and the present. The second one is a menu organized by different SPEI time scales (0.5, 1, 3, 6, 9, 12, 24, 36 and 48 months). Under each time scale, a menu allows selecting the spatial mask of choice. The default option is the entire world (i.e. no mask applied), and then there are six options corresponding to the major global crop-growing areas. When a crop-growing area mask is selected, only the grid cells in which the particular crop is cultivated are shown. The map is fully navigable, so the user can zoom in and out and change the map centre. Figure 10 shows a zoom-in over the barley-cultivated areas in North America.
Three different options for downloading data are available and appear in the lower-left corner. The first one allows downloading data at a particular grid cell. If a specific point is selected on the map, the system shows the grid cell coordinates. The SPEI data for that grid cell and the chosen time scale can then be downloaded in plain-text, comma-separated format (Figure 11). In addition, it is also possible to display a time plot showing the temporal evolution of the SPEI at that particular point (Figure 12). The figure is interactive, as it allows zooming in to a specific period, and it also shows the value at individual time steps when the cursor is moved over the plot.
Finally, the entire dataset can be downloaded in netCDF v4 format. There are two downloading options for that. The first one is to download the whole dataset from 1979 to the present, while the second one only downloads the last layer of the dataset. This last option is intended to suit the needs of users who already have the entire dataset and need to update it with the latest weekly data. The technical specifications of the netCDF format used are shown in Figure 13. It maintains the coordinate system (geographic) and the number of latitudes (361) and longitudes (720) of the original ERA5 dataset. The number of time steps is not fixed, as it increases each time the dataset is updated with new values.
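To locate a point of interest in a 361 by 720 grid at 0.5° resolution, the nearest grid cell can be found as in the sketch below. The grid layout is an assumption and not taken from the dataset specification: latitudes are taken to run from 90° to −90° and longitudes from 0° to 359.5°, as in the native ERA5 convention. The coordinate variables of the downloaded file should be checked before relying on these indices.

```python
RESOLUTION = 0.5
N_LAT, N_LON = 361, 720


def nearest_cell(lat, lon):
    """Return (row, column) of the grid cell nearest to a point.

    Assumed layout: row 0 is 90 N and row 360 is 90 S; column 0 is 0 E
    and column 719 is 359.5 E. Longitudes may be given in -180..180.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude out of range")
    row = round((90.0 - lat) / RESOLUTION)
    col = round((lon % 360.0) / RESOLUTION) % N_LON
    return row, col


def cell_centre(row, col):
    """Return (lat, lon) of a cell under the same assumed layout."""
    return 90.0 - row * RESOLUTION, col * RESOLUTION
```

If the file instead stores longitudes from −180° to 180° or latitudes in ascending order, only the two index formulas need to change.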
In comparison with the SPEI global drought monitor (https://spei.csic.es/map) developed in 2011, the inclusion of the ERA5-based dataset has improved the SPEI monitoring system. Since 2011, the system has been maintained live based on GPCC precipitation and CPC mean temperature data. The main improvements to the current system are the following: (a) the new dataset multiplies by four the spatial resolution, as it goes from 1° to 0.5°; (b) the temporal frequency of the new dataset increases from the monthly time scale available in the SPEI global drought monitor to the weekly frequency of the new SPEI monitor based on ERA5 data; (c) the new system has also been enriched with more updated information, since every week new SPEI information is generated; and (d) the quality and reliability of the SPEI data included in the system have been improved significantly when considering the AED role, since the new system uses the physically based Penman-Monteith equation to calculate AED. In the SPEI global drought monitor, the AED was calculated using the empirically based Thornthwaite equation (Thornthwaite, 1948), which uses mean temperature data. This simple method was chosen because the Climate Prediction Center (CPC) mean temperature (Fan & Dool, 2008) was the only possible real-time input that allowed estimating the AED in quasi real time. Nevertheless, it is widely known that empirical AED approximations based on temperature data alone show limitations under global warming, which is not the case with physically based approaches (Sheffield et al., 2012; Vicente-Serrano et al., 2020). Although the SPEI global drought monitor will be maintained given its high accessibility (more than 3,000 visits per month), we recommend that current and future users of real-time SPEI data migrate to the new system based on ERA5 input given its higher temporal frequency, spatial resolution and physical consistency in AED calculation.

| CONCLUSIONS
This study describes the generation of a global drought dataset based on the Standardized Precipitation Evapotranspiration Index (SPEI) using the ERA5 reanalysis dataset. The dataset is maintained in near real time and updated weekly and is integrated into a user-friendly interface with several functionalities. These characteristics make it an effective global drought monitoring system. The system provides information on a global scale, but users interested in crop-growing areas may apply spatial masks to show only the areas where specific crops are grown.
The associated long-term dataset, covering the period between 1979 and the present, is potentially helpful in assessing the possible effects of global warming on drought given the role of the AED. Although this information is helpful in determining the severity of particular drought events, further research is necessary to

FIGURE 1 Spatial distribution of average annual atmospheric evaporative demand and precipitation from 1979 to 2020 based on the CRU and ERA5 datasets. The difference (CRU minus ERA5) between the average values is also presented.

FIGURE 2 Scatterplots of the average annual, January and July atmospheric evaporative demand (AED) (left) and precipitation (right), using the ERA5 and CRU datasets. The scale represents the density of points.

FIGURE 3 Spatial distribution of the Pearson's r correlation between the 3-month and 12-month SPEI between 1979 and 2020 using CRU and ERA5 datasets. Statistically significant trends are set at Pearson's r = 0.11.

FIGURE 4 (a) Percentage of cases in which there is an agreement between the occurrence of droughts identified with the CRU SPEI, (b) percentage of false positives: cases in which the CRU SPEI does not show drought conditions and the ERA5 SPEI suggests drought of different severity.

FIGURE 5 Spatial distribution of the 3-month SPEI from CRU and ERA5 datasets for June 2006 (upper panels) and September 2019 (lower panels).

FIGURE 6 Evolution of Pearson's r spatial correlations at the global scale between the CRU and ERA5 SPEI at the 3-month (blue) and 12-month (red) timescales.

FIGURE 7 Evolution of the percentage of surface areas affected by mild (<−0.84; yellow), moderate (<−1.28; red) and extreme (<−1.65; dark red) drought conditions on the global scale. Results are presented for the 3- and 12-month SPEI based on the CRU and ERA5 datasets. To avoid the role of the strong temporal autocorrelation in the trend analysis, trends were calculated from the December SPEI at the time scale of 12 months and the average of March, June, September and December SPEI at the time scale of 3 months. Trend analysis was based on the Mann-Kendall test, while the magnitude of change was calculated by means of linear regression. The correlations between the CRU and ERA5 series are shown at the bottom of the figure.

FIGURE 8 Three-month SPEI trends from 1980 to 2020 using the SPEIbase and the new ERA5 dataset and the differences between the two datasets. The units are SPEI z-units per decade.
FIGURE 9 View of the drought monitoring system based on the ERA5 data and the SPEI.