AsiaRiceYield4km: seasonal rice yield in Asia from 1995 to 2015

. Rice is the most important staple food in Asia. However, high-spatiotemporal-resolution rice yield datasets are limited over this large region. The lack of such products greatly hinders studies that are aimed at accurately assessing the impacts of climate change and simulating agricultural production. Based on annual rice maps in Asia, we incorporated multisource predictors into three machine learning (ML) models to generate a high-spatial-resolution (4 km) seasonal rice yield dataset (AsiaRiceYield4km) for the 1995–2015 period. Predictors were divided into four categories that considered the most comprehensive rice growth conditions, and the optimal ML


Introduction
As one major staple crop, rice (Oryza sativa L.) provides more than a quarter of calories for approximately half of the population but accounts for only 11 % of the arable land on Earth (Maclean et al., 2002;Alexandratos and Bruinsma, 2012;Birla et al., 2017;Qian et al., 2020).Asia produces and consumes more than 90 % of the global rice (Bandumula, 2018).The rice production is dominated by poor smallholder farmers. .Therefore, information on the rice yield in Asia is essential to maintain food security and farmers' livelihoods (Laborte et al., 2017).In the last half-century, the growth of rice yields has contributed more to an increase in production than the expansion of cultivation areas (Blomqvist et al., 2020), and yield will remain a dominant factor considering land-use policies aimed at reducing environmental pressure (Lambin and Meyfroidt, 2011;Kim et al., 2021).In addition, Asia has complex rice-cropping systems, and rice may be cultivated multiple times within 1 year (G.Zhang et al., 2020).It is critically necessary to identify the long-term and seasonal Asian rice yields -at high spatial resolution -to monitor and guide agricultural production.
Previous global-scale crop yield datasets, including the Harvester Area and Yields of 175 crops (M3-Crops) (Monfreda et al., 2008), the Spatial Production Allocation Model (SPAM) (You and Wood, 2006;Yu et al., 2020), the Global Dataset of Historical Yields of Major Crops (GDHY) (Iizumi et al., 2014;Iizumi and Sakai, 2020) and the Global Gridded Crop Model Intercomparison (GGCMI) phase 1 (Müller et al., 2019), have been produced and widely employed in many studies (Folberth et al., 2020;Kaltenegger and Winiwarter, 2020;Iizumi et al., 2021;Lin et al., 2021;W. Liu et al., 2021).However, due to the different research goals and technical restrictions, their spatial resolutions are relatively coarser (e.g., ∼ 10 km for M3-Crops and SPAM; ∼ 55 km for GDHY and GGCMI phase 1), and their temporal resolutions are mostly annual (Laborte et al., 2017).Only a few datasets have seasonal temporal information (e.g., GDHY), but they still cannot cover all rice seasons (Kim et al., 2021).In addition, the time spans are limited (e.g., only 1 year for M3-Crops; every 5 years for SPAM).For the long-term rice yield dataset (GDHY), the authors used a fixed rice area base map that did not provide the interannual spatial dynamics of the rice yield.To the best of our knowledge, a long-term seasonal rice yield dataset with higher spatial resolution and a dynamic spatial distribution is currently unavailable for the major rice cultivation regions in the world.
To address the above issues, there is a significant need to acquire multisource data and wiser technologies for rice yield estimation (Chlingaryan et al., 2018;Cao et al., 2020;van Klompenburg et al., 2020;Z. Zhang et al., 2020;Chen et al., 2022).With the rapid development of remote sensing technology in recent years, large-scale and long-term high-spatiotemporal-resolution observations provide ample and timely phenological and growing information for rice.
Ground-based data, such as climate and soil, also provide more key environmental information (Folberth et al., 2016;Zhang et al., 2021).Many publications that have successfully combined satellite-derived data and ground environmental information for yield estimation have expanded our knowledge (Huang et al., 2013;Mosleh et al., 2015;Cao et al., 2021;Fernandez-Beltran et al., 2021).Nevertheless, to date, few studies have employed annual rice paddy areas for yield estimation.Moreover, machine learning (ML) models, such as random forest (RF), extreme gradient boosting (XGBoost) and long short-term memory (LSTM), have been increasingly and successfully used in crop yield estimation (Cai et al., 2019;van Klompenburg et al., 2020;Sakamoto, 2020;Luo et al., 2022).Such ML models can overcome the drawbacks of two traditional estimation methods: processbased crop models (PCMs) and statistical regression methods (SRMs).Compared with PCMs, ML can wisely select input variables according to the actual requirements and local geographical environmental conditions without complicated parameters (Jeong et al., 2022).Due to the complex functions with higher efficiency and flexibility, the yield estimation results of ML are always better than those of SRMs (Chlingaryan et al., 2018).In addition, ML has a good spatial generalization.Therefore, ML models combined with multisource data potentially provide a good chance for large-scale gridded yield estimation and associated accuracy improvement.
In this work, we integrate multisource data and annual rice maps into ML models in order to generate a seasonal rice yield dataset at a 4 km resolution across Asia (AsiaRiceYield4km) for the period from 1995 to 2015.AsiaRiceYield4km will better support agricultural monitoring systems and related research over a large scale because of its higher spatiotemporal resolution and longer time span.

Study area
Asia is the most important rice-producing area, accounting for 89 % of the cultivation area and 91 % of global production (FAO, 2022).Considering the accessibility of local censusbased rice yield data, 14 main rice-producing countries in Asia were selected and then divided into 27 cases (one case refers to one specific rice-cropping period in a country) based on different rice-cropping systems (single, double or triple rice), as shown in Fig. 1.

Data
Multisource data were collected for rice yield estimation, including annual rice area maps, the rice yield of 1400 administrative units (minimum administrative-division-scale units for each country with available rice yield), leaf area index (LAI) information (from remote sensing products) and rice  (Han et al., 2021(Han et al., , 2022)).The pie chart represents the area proportion of different rice-cropping systems.Panel (b) presents the case numbers and cropping system for each country.Double rice follows the order of early before late (e.g., 12 and 13 represent the early-season rice and late-season rice in China, respectively), and triple rice follows the order of spring, autumn and winter (e.g., 25, 26 and 27 represent the spring season rice, autumn season rice and winter season rice in Vietnam, respectively).
growth environmental conditions (location, time, soil and climate).In addition, considering the necessity for phenological information, we also produced gridded key phenological dates from LAI data based on inflection-based and thresholdbased methods (Sect.2.3.1).Except for yield records at the administrative unit scale from official statistics (Table S1), the other data were resampled to a 4 km × 4 km resolution using the nearest-neighbor resampling method in ArcMap 10.2 (the original spatial information is listed in Table S2).

Rice area maps
We selected the latest public rice distribution map dataset, APRA500 (an annual dataset of rice paddy area at a 500 m resolution from 2000 to 2020), in this study (Han et al., 2021(Han et al., , 2022)).APRA500 provides annual rice distribution information, which can reduce the influence of other land-cover types.Due to the topographic conditions, cloud contamination and the mixed-pixel effects owing to fragmented cropland fields, the rice area in APRA500 was somehow underestimated (Han et al., 2022).To reduce this effect, we used the combined rice area of 3 years (the current year, the previous year and the following year) to represent the rice area of the current year (e.g., the area of 2005 was the union of 2004, 2005 and 2006).Specifically, the union of area information from 2000, 2001 and 2002 was also applied to the years from 1995 to 2000 due to unavailable area maps.

Seasonal rice yield
Rice seasons were mainly determined based on RiceAtlas (Laborte et al., 2017).RiceAtlas is the most comprehensive and detailed database with respect to the rice season and has been widely used in many studies (van Oort and Zwart, 2018;Muehe et al., 2019;Fritz et al., 2019).The United States Department of Agriculture (USDA, https://ipad.fas.usda.gov/ogamaps/cropcalendar.aspx, last access: 7 April 2022) and the national statistics of each country were also referenced for rice season determination.The rice seasons have various names in different countries, such as "Aman", "Aus" and "Boro" for triple rice in Bangladesh and "Rabi" and "Kharif" for double rice in India.To make the data more readable and consistent, we used single rice (single season), double rice (early and late seasons) and triple rice (spring, autumn and winter seasons) to refer to the three rice-cropping systems in our study, as shown in Fig. 1b.A few rice seasons (e.g., the early season in Cambodia, Malaysia, Myanmar and Indonesia as well as the winter season in India) were not considered due to the lack of yield records. https://doi.org/10.5194/essd-15-791-2023 Earth Syst.Sci.Data, 15, 791-808, 2023 We collected seasonal rice yield data from the Food and Agriculture Organization (FAO) of the United Nations and other governmental websites (Table S1).Over 45 000 rice yield records from 1400 administrative units were collected for the period from 1995 to 2015.The quality of these data has been checked, and some yield outliers were filtered out according to the following rules: (a) they exceeded the actual biophysically attainable yields, and (b) they were beyond the average ± 2 times the variance during the 1995-2015 period (Zhang et al., 2014;Cao et al., 2020Cao et al., , 2021)).

Key phenological dates
The transplanting, heading and maturity dates are the three most important phenological dates during the rice growing period.The whole growing period (WGP) is divided into two subperiods, according to the three key phenological dates, as follows: the vegetative period (VEP; from transplanting to heading) and the reproductive period (REP; from heading to maturity).However, most rice phenology datasets are provided at the administrative scale and lack information on interannual variation.For example, the United States Department of Agriculture (USDA) provides countryscale growing phenological information, and RiceAtlas provides subnational phenology information but disregards the annual dynamics (Laborte et al., 2017).In addition, these datasets lack heading date information for rice.Here, we retrieved the three dynamic key rice phenological dates from remote sensing data in Asia during the 1995-2015 period at a 4 km × 4 km grid scale using inflection-based and threshold-based methods (Sect.2.3.1).The USDA and RiceAtlas datasets provided a threshold range for phenology and were used to validate our extracted phenological dates.

Location and time
Location information includes latitude (lat), longitude (long) and elevation (ele).The Global 30 arcsec (1 km) gridded digital elevation model (DEM) dataset (GLOBE Task Team et al., 1999) from the National Oceanic and Atmospheric Administration (NOAA) was employed in this study.The latitude and longitude information was collected from the centroid of each resampled 4 km pixel by ArcMap 10.2.The temporal information is represented by the year .

LAI
Remote sensing indices have been widely used in rice yield estimation (Son et al., 2020;Arumugam et al., 2021), but few studies were conducted before 2000 (C.Liu et al., 2021).To extend the period of the gridded yield dataset from 1995 in this study, we adopted Global Land Surface Satellite (GLASS) Advanced Very High Resolution Radiometer (AVHRR) LAI data (http://glass.umd.edu/Download.html, last access: 7 April 2022; Xiao et al., 2013Xiao et al., , 2016Xiao et al., , 2017)), which begun from 1981 with a fine spatial resolution of 4 km and a temporal resolution of 8 d.Compared with other similar products, GLASS AVHRR LAI has the highest accuracy and lowest uncertainty (Liang et al., 2021).The GLASS AVHRR LAI was used for rice phenological information extraction and yield estimation.

Methods
We applied three steps to generate AsiaRiceYield4km by incorporating multisource data into three ML methods: determining phenological dates, categorizing and selecting predictors, and developing the optimal models and generating the gridded rice yield (Fig. 2).Details of each step are provided in the following sections.

Determining phenological dates
Inflection-based (Chen et al., 2016;Luo et al., 2020a) and threshold-based (Manfron et al., 2017) methods were employed to detect rice phenological dates (Fig. 2, Step 1) according to the following rules: 1.For transplanting dates, the LAI always maintains a low value for a period before the transplanting date and dramatically increases after this date (Sakamoto et al., 2005;Chen et al., 2018).Therefore, if there is one point in the LAI curve where the following first derivative is >0 or its second derivative is equal to 0, this point is defined as the transplanting date.2. For heading dates, the inflection point from VEP to REP (Wang et al., 2018) is characterized by the maximum value of the LAI between the transplanting date and the maturity date (Son et al., 2013).
3. For maturity dates, the physiological activity of rice will sharply drop during the harvesting period.The first inflection point at the LAI curve where its first derivative becomes negative is considered the maturity date.
In addition, LAI values of pixels beyond the average ± 2 times the standard deviation (SD) were filtered (Zhang et al., 2022).
If the phenological dates in some grids cannot be detected nor filtered using the above rules, the average value of the administrative unit where the grids are located is applied.

Categorizing and selecting predictors
To provide comprehensive rice growth information for the ML models, we divided the multisource data into four categories including 50 predictors (Table S3): cumulative growing predictors (CGPs) of different growing periods, extreme growing predictors (EGPs), constant environmental conditions (CECs) and temporal information (TI) (Fig. 2 scale.The predictor values of grids located in one administrative unit were averaged to this administrative unit.High-dimensional predictors often affect the accuracy and computational efficiency of ML methods (LeCun et al., 2015;Zhang et al., 2019).To reduce this effect, Pearson correlation analysis was employed to estimate the relationship between yield and other variables for each case.The variables with a significant correlation (p<0.05) were selected as predictors (Cao et al., 2021).The yield and selected predictors of one case were input into one model.Specifically, the four predictors (long, lat, ele and year) were considered to have a stable impact on rice yield and were included in all 27 estimation models for the 27 cases (Ray et al., 2019;Huntington et al., 2020).Considering the covariate relation of CGPs for the WGP and the remaining two periods, the predictors of the WGP would be selected if its Pearson R was higher than that in the remaining two periods, or vice versa.

Developing the optimal models and generating gridded rice yield
The optimal yield estimation models were developed and used for gridded rice yield dataset generation according to the following process: 1. Dataset division rules.To effectively reduce overfitting effects (Dinh and Aires, 2022), we divided all of the data into three sets (training, validation and testing) that were used to optimize the ML parameters, select the optimal model and evaluate its generalization ability, respectively (Ripley, 2007).A diagram of the database division process is shown in Fig. 2 (Step 3).For each case, the whole database contained the selected predictors from all administrative-scale units during 1995-2010.
The database was randomly divided into two subsets by the administrative unit: 20 % of the samples were used for testing, and the remaining 80 % of the samples were randomly re-split into 70 % for training and 30 % for validation without considering the administrative units.Thus, the training, validation and testing sets contained 56 % (80 % × 70 %), 24 % (80 % × 30 %) and 20 % (20 % × 100 %) of the dataset, respectively.Such rules with respect to division avoid information leaking from the testing set to the training set (Meroni et al., 2021) and enhance the robustness of the model.
2. ML models.ML can develop transfer functions based on the relationships between predictors and target variables for rice yield estimation (Chlingaryan et al., 2018;Shahhosseini et al., 2020).Three widely employed ML models, RF, XGBoost and LSTM, were selected for rice yield estimation.The RF model is based on the bagging ensemble model, which generates multiple decision trees and obtains predictions by voting on all individual trees (Breiman, 1996(Breiman, , 2001)).In addition, extra randomness is introduced to the RF when generat-ing trees and searching for the best tree stages (Shahhosseini et al., 2020).It provides more diversity for trees and can generate overall better model performance (Zhang et al., 2019).XGBoost uses optimized gradient boosting for decision trees, which tries to make weak learners strong (Chen and Guestrin, 2016).This method adopts an updated strategy to train the estimated model, and the updated model minimizes loss by reducing errors from previous models (Obsie et al., 2020).LSTM is a special recurrent neural network (RNN) that was proposed to overcome the vanishing and exploding gradient problems of RNNs (Hochreiter and Schmidhuber, 1997;Sak et al., 2014;Tian et al., 2021).LSTM contains input, hidden and output layers, and the hidden layers consist of memory cells (He et al., 2019;Zhang et al., 2019).Tuning hyperparameters can effectively improve the accuracy for rice yield estimation (Shahhosseini et al., 2021).The hyperparameter tuning details and Python library information for the ML algorithm are given in the Supplement.
3. Model evaluation.The coefficient of determination (R 2 ) and root-mean-square error (RMSE) were adopted to evaluate the performance of each model for each case.
Here, i is the number of administrative units, n is the total number of administrative units and j is the year.Y ob i,j is the observed rice yield from governmental websites or the FAO website in the ith administrative unit of year j , Y ob i,j is the average of the observed rice yield in the ith administrative unit of year j and Y es i,j is the AsiaRiceYield4km yield in the ith administrative unit of year j .accuracy was selected as the optimal ML model.and lowest RMSE ad is regarded as the optimal model for each season in Fig. 1b.
5. Gridded rice yield generation.For each case, predictors of gridded scale consistent with administrative-scale selected predictors (Sect.2.3.2) were input into the optimal model, and the gridded rice yield was generated for the period from 1995 to 2015.All 27 cases followed this process and were combined to generated the AsiaRiceYield4km dataset.
6. Uncertainty spatialization.To provide the spatial uncertainty, the relative RMSE (RRMSE; Eq. 8) of AsiaRiceYield4km was calculated according to Luo et al. (2020b).The RRMSE of each administrative unit was allocated to the centroid of the unit, and the kriging interpolation method was used to spatialize the distribution of uncertainty.
where m is the total number of years.

Performance of the estimated models
After selecting the optimal ML model for each case, we plotted the scatter of the seasonal training, validation, testing and adjusted accuracy (Fig. 3).The training R 2 is higher than 0.9 for all cases; validation and testing R 2 values average 0.78 and 0.69, respectively.The R 2 ad ranges from 0.60 to 0.90 (average of 0.77), with the lowest R 2 ad found for the single season in Malaysia and the highest R 2 ad found for the winter season in Bangladesh (Fig. 3c).As for the RMSE, the average values for training, validation and testing are 105, 408 and 489 kg ha −1 , respectively.The RMSE ad ranges from 162 to 817 kg ha −1 , and its average is 396 kg ha −1 .The highest RMSE ad is for single rice in China (Fig. 3d).The rice yields in China are mostly higher than those of other countries, which might cause more modeling uncertainty.For doublerice systems (Fig. 3b, e), no significant difference is found between their modeling accuracies, with values of approximately 0.77 for the R 2 ad and 410 kg ha −1 for the RMSE ad .For triple rice, the winter season in Bangladesh has the highest R 2 ad (0.90; dot no.24 in Fig. 3c), and the spring season in Vietnam has the lowest RMSE ad (327 kg ha −1 ; dot no. 25 in Fig. 3c).Additionally, the 27 optimal models consist of two types of ML models -XGBoost for 15 seasons and RF for 12 seasons -but no LSTM models.The 27 optimal models and their hyperparameters are listed in Table S5.

Comparing AsiaRiceYield4km products with the observations
After aggregating AsiaRiceYield4km into administrative units, we compared them with the observed yield at administrative and annual scales.At the administrative scale, comparisons were separately conducted for single, double and triple rice, as shown in Fig. 4. The estimated and observed yields are close to the 1 : 1 line.The overall R 2 is higher than 0.87, and the RMSE is lower than 921 kg ha −1 , suggesting that AsiaRiceYield4km is mostly identical to the observations.The accuracy of single rice (R 2 = 0.88, RMSE = 920 kg ha −1 ) is slightly lower than that of double rice (R 2 = 0.91, RMSE = 554 kg ha −1 ) and triple rice (R 2 = 0.93, RMSE = 494 kg ha −1 ), mainly because some high-yielding units are not well estimated for single rice (Fig. 4a).Moreover, late rice shows higher accuracy than early rice (R 2 = 0.92>0.89,RMSE = 553 kg ha −1 <556 kg ha −1 ), which is consistent with previous work (Cao et al., 2021).As for triple rice, winter rice has higher accuracy than spring and autumn rice, even though its yield range was the greatest.
At the interannual scale, the annual average yield from AsiaRiceYield4km and the observed yields for each case are presented.All seasons are statistically highly significant (p<0.001), and the R 2 value of all of the results is higher than 0.8.In addition, the differences of the SD are also presented in Fig. 5.The largest difference is the early season for double rice in Vietnam and is mainly attributed to the underestimation in AsiaRiceYield4km after 2006.All SD differences are lower than 200 kg ha −1 , indicating that AsiaRiceYield4km can estimate and capture the interannual variations in observed yields well.https://doi.org/10.5194/essd-15-791-2023 Earth Syst.Sci.Data, 15, 791-808, 2023    2005 and 2010 than those of SPAM, respectively, and the corresponding RMSE values are 570, 692 and 592 kg ha −1 lower than those of SPAM.Moreover, AsiaRiceYield4km shows better spatial consistency with the observed yield across the whole area.The spatial variation in the yield in AsiaRiceYield4km and the observed yield are identical in the IGP, whereas some administrative unit yields are overestimated in SPAM (Fig. 6a1-c1).

Spatiotemporal characterizations of AsiaRiceYield4km
Based on the estimated seasonal yields from optimal ML models, we characterized the spatiotemporal patterns of rice yields during the 1995-2015 period.Spatially, single rice is widely distributed in 11 countries across the whole area, where its yield varies greatly from 400 to 10 000 kg ha −1 with an average of 5428 kg ha −1 .Specifically, the highest average yield is in China (7384 kg ha −1 ), and the lowest yield is in India (1889 kg ha −1 ).Such a large difference might be ascribed to better irrigation in China (  al., 2010) and relatively low-level soil fertility, investment and technology in India (Srivastava and Mahapatra, 2012).Double rice is mostly distributed between 30 • N and the Equator.Double rice shows insignificant differences between the early yield and late yield: the early-season rice yield ranges from 1041 to 8347 kg ha −1 with an average yield of 4598 kg ha −1 , whereas the late-season rice yield ranges from 666 to 7977 kg ha −1 with an average yield of 4539 kg ha −1 .Three rice seasons (triple rice) exist in Bangladesh and Vietnam.The rice yield for spring, autumn and winter ranges from 3034 to 6249, from 2690 to 6986 and from 2514 to 10 870 kg ha −1 , respectively, with corresponding averages of 4153, 4716 and 7794 kg ha −1 .Notably, the highest average yield is 8597 kg ha −1 for winter rice in Bangladesh, due to high-yielding hybrid rice varieties and well-managed fieldwork (e.g., fully irrigated and increasing fertilizer, pesticide and herbicide applications) (Meroni et al., 2021).
Temporally, the interannual rate of yield change (defined as the yield difference between the previous year and the current year divided by the yield of the previous year) from 1995 to 2015 for each case is shown in Fig. 8.The annual rate ranges from −18.55 % to 25.57 %.The average interannual rate during the 1995-2015 period increases in most cases, with the exception of single rice in Japan (−0.01 %) and the early season of double-rice systems in Thailand (−0.11 %).Among all cases, the greatest average rate is 2.65 % in Cambodia.

The frequency and importance of the predictors in ML models
In this study, 50 predictors were used in ML models, but their contributions greatly varied.First, only predictors with a sig-nificant correlation with yields were selected for ML models, with the exception of temporal and spatial predictors (year, long, lat and ele) (see Sect. 2.3.2 for details).As a result, the selection frequency of temporal and spatial predictors was 27 times, and the selection frequency of other predictors ranged from 2 to 25 times (Fig. 9a).Using the selected predictors, ML models then estimated rice yields and ranked the importance of each predictor (Fig. 9a).importance (>0.05) and that the importance of the remaining predictors was lower than 0.03 (Fig. 9a).
For different growing periods, the REP predictors had greater average importance (0.010) in ML models, followed by the WGP and VEP predictors (0.007 and 0.005, respectively).The average selection frequency for the WGP and VEP predictors (8.4 and 10.9 times, respectively) was much lower than that of REP (14.5 times).Therefore, REP predictors contributed the most to yield estimation, which was also consistent with previous studies (Chang et al., 2005;Nazir et al., 2021).In addition, we also found that EGPs had greater average importance and selection frequency (0.014 and 21.3 times, respectively) than CGPs (0.007 and 11.3 times, respectively), indicating the stronger response of rice yields to extreme growth conditions.
Figure 9b further shows the proportioned importance of the four predictor categories for each rice season.Although the proportioned importance varied for different rice seasons, the overall contribution was highest for CECs (45 %), followed by EGPs (21 %), TI (18 %) and CGPs (16 %).CECs had the greatest proportioned importance for most countries, which suggested the great importance of the geographical environment for rice yield estimation.More interestingly, the

Improvements in AsiaRiceYield4km
AsiaRiceYield4km is a seasonal rice yield product with a high spatiotemporal resolution and a long time span across the dynamic rice cultivation areas in the main rice-producing countries of Asia.Compared with SPAM, the spatial resolution of AsiaRiceYield4km is 4 km, which is the current highest resolution among all rice yield datasets.Additionally, the product period covers the years from 1995 to 2015, includes multi-seasonal rice yields within 1 year and incorporates more information than most other rice yield datasets.Similarly, AsiaRiceYield4km considers both the annual dynamic change in rice cultivation areas and phenological information at the grid scale, rather than a constant cultivation area map and a fixed growing period.Such dynamic information assisted us in capturing better spatial and temporal variations in rice yields and, consequently, greatly improved the accuracy of our product.Moreover, we applied four predictor categories and the optimal ML models to estimate seasonal yields.Four predictor categories provided comprehensive rice growth information to ensure the accuracy of yield estimations.The optimal models for each rice season are determined by the IPW method.As it is a weighted ensemble assessment that fully considers training, validation and testing accuracy, we are certain that the IPW method is more robust and reasonable with respect to selecting the optimal model for seasonal rice yield in Asia.

Uncertainty analysis
With respect to the spatial uncertainty, the RRMSE values in most areas were below 30 %, indicating the low uncertainty in AsiaRiceYield4km.High uncertainty in the RRMSE (above 50 %) was distributed in northeastern China and western India for single rice and in central Bangladesh for the winter season of triple rice (Fig. 10).
In this study, we have improved the yield prediction processes in order to ensure the accuracy of the AsiaRiceYield4km product as much as possible; however, several factors might still negatively impact its accuracy.Due to the limitations of remote sensing techniques (e.g., clouds and topography), some rice paddy areas cannot be recognized, leading to map errors (Han et al., 2022).Moreover, the use of rice areas before 2000, based on the combined rice area from 2000 to 2002, also introduces some uncertainty due to the unavailability of specific information on these rice areas.The spatial resolutions of multisource data also cause uncertainty.For example, given that the rice cultivation areas in Asia are always fragmented (Lowder et al., 2016)   the LAI resolution in this study is somewhat coarser (0.05 • ), mixed-pixel effects will inevitably influence the accuracy of AsiaRiceYield4km in small rice cultivation areas.Although the GLASS LAI has highest accuracy and lowest uncertainty, and we have made several efforts to mitigate the uncertainty, there is still uncertainty and inevitable effects on the rice yield estimation (Liu et al., 2018;Li et al., 2018;Fang et al., 2019;Chen et al., 2020).In addition, the crop intensity used in this study is an administrative-scale value.The annual crop intensity variation in rice still alters the yield estimation results.Finally, due to the lack of a process-based mechanism, ML is weakly traceable and interpretable for rice yield vari-ability (Muruganantham et al., 2022), especially for extreme rice yields.Nevertheless, compared with other public products (Fig. 6), our methods still generated better seasonal rice yield predictions at a higher spatiotemporal resolution for a longer period.
We encourage users to independently verify data products before using them.

Conclusions
We produced a long-term seasonal rice yield dataset with a high spatiotemporal resolution on dynamic rice paddy areas in Asia by using multisource data and ML models.Our AsiaRiceYield4km dataset has higher accuracy than other public datasets and shows more spatial consistency with the observed yield.We attributed such improvements to more dynamic information (e.g., rice area and phenological dates), full consideration of rice growth conditions and the novel IPW method to select the optimal ML model.Moreover, we discovered that constant environmental conditions contributed the most (∼ 45 %) to rice yield prediction compared with other growing conditions.REP predictors had a higher impact on yield predictions than those in the WGP and VEP.Our dataset can address the lack of seasonal rice yield datasets and support studies related to agricultural production and development.

Figure 1 .
Figure 1.(a)Rice cultivation areas with different cropping systems in the main rice-producing countries of Asia.The green area represents the maximum rice paddy area where paddy rice grew for at least 1 year during the 1995-2015 period(Han et al., 2021(Han et al., , 2022)).The pie chart represents the area proportion of different rice-cropping systems.Panel (b) presents the case numbers and cropping system for each country.Double rice follows the order of early before late (e.g., 12 and 13 represent the early-season rice and late-season rice in China, respectively), and triple rice follows the order of spring, autumn and winter (e.g., 25, 26 and 27 represent the spring season rice, autumn season rice and winter season rice in Vietnam, respectively).

Figure 2 .
Figure2.Flowchart for generating long-term and high-resolution gridded rice yields by incorporating multisource data into ML models for one case.All 27 cases followed these steps and were combined to obtain the AsiaRiceYield4km dataset.
, Step 2).A CGP includes the sum of each LAI and climate variable in different growing periods (the VEP, REP and WGP), reflecting the overall growing and weather difference among the three continuous growing periods.An EGP consists of the maximum and minimum of each climate and LAI variable, considering the impact of extreme events.CEC predictors reflect the influence of the geographical environment on rice growth.TI reflects long-term agronomic technological improvements and variety renewal(Huntington et al., 2020).All of these predictors were aggregated to the administrative https://doi.org/10.5194/essd-15-791-2023EarthSyst.Sci.Data, 15, 791-808, 2023

Figure 3 .
Figure 3. Accuracy, with respect to the R 2 (a-c) and RMSE (d-f) of the estimated yields for seasonal rice in each region.The color of the dots indicates different training accuracy ranks.The testing accuracy is given on the x axis, and the validation accuracy is given on the y axis.The size of dots represents the adjusted accuracy.Note that the numbers beside each dot represent each case shown in Fig. 1b.

Figure 4 .
Figure 4. Comparison of AsiaRiceYield4km with observed yields at the administrative-unit scale for (a) single rice, (b) double rice and (c) triple rice.

3. 3
Comparing AsiaRiceYield4km products with SPAM Due to the limited temporal coverage and rice season information in SPAM, only single-rice systems in 2000, 2005 and 2010 were compared between AsiaRiceYield4km and SPAM.The spatial distribution of the rice yield in AsiaRiceYield4km and SPAM as well as the observed yield in 2005 are presented in Fig. 6a-c, with zoomed in views of the Indo-Gangetic Plain (IGP) in Pakistan and India (Fig. 6a1-c1).After aggregating AsiaRiceYield4km and SPAM data to administrative units, both products were also quantitatively compared with the observed yield for 2005 (Fig. 6d).Similar comparisons for 2000 and 2010 are shown in Fig. S1.Compared with SPAM, AsiaRiceYield4km has a higher R 2 and a lower RMSE.Specifically, the R 2 values of AsiaRiceYield4km are 0.18, 0.23 and 0.20 higher in 2000,

Figure 5 .
Figure 5. Interannual comparison of AsiaRiceYield4km with the observed yield from 1995 to 2015.

Figure 6 .
Figure 6.Yield distribution of (a) AsiaRiceYield4km, (b) SPAM and (c) observed yields in 2005 as well as (d) quantitative comparisons with the observed yields in 2005.Panels (a1) to (c1) are the zoomed in views of the IGP in Pakistan and India for (a1) AsiaRiceYield4km, (b1) SPAM and (c1) the observed yields.

Figure 7 .
Figure 7. Spatial patterns of the estimated rice yields (averages for the 1995-2015 period) for different seasons.

Figure 8 .
Figure 8. Temporal variation in the estimated rice yield change for different seasons from 1995 to 2015.

Figure 9 .
Figure 9. Panel (a) shows the average frequency and importance of each predictor, and panel (b) presents the proportioned importance of each predictor category for seasonal rice.