Utilizing rainy season onset predictions to enhance maize yields in Ethiopia

For countries dependent on rainfed agriculture, failure of the rainy season can have serious consequences for the broader economy. Maize, a common staple crop in these countries, often exhibits significant interannual yield variability, given its high sensitivity to water stress. It is traditionally planted at rainy season onset to maximize the growing season and potential yield; however, this risks planting during a 'false onset' that can damage the crop or require replanting. Rainy season onset forecasts offer some promise in reducing this risk; however, the potential for increasing yield has not been explicitly quantified. This study quantifies the yield gap associated with suboptimal maize planting times using a process-based crop model over a 36-year historical period across Ethiopia. Onset-informed and forecast-informed approaches are compared with a baseline approach, and results indicate a strong potential for yield gains in drier regions as well as reductions in interannual variance countrywide. In contrast, regions with reliably sufficient precipitation show only minimal gains. In general, integration of onset forecasts into agricultural decision-making warrants inclusion in agricultural extension efforts.


Introduction
Agriculture forms the backbone of the Ethiopian economy, employing three-quarters of the labor force and accounting for nearly 40% of the country's gross domestic product (Dorosh and Minten 2020). In most of the country, the growing season revolves around the summer Kiremt rains, which account for 65%-95% of total annual rainfall (Segele et al 2015). A failure of these rains can have catastrophic consequences not only for rainfed farmers, but for the broader economy as well (Segele and Lamb 2005). While much literature has been devoted to the intensity of the Kiremt rains and its relationship with agricultural production, relatively little has focused on the timing of the rains, despite its importance in determining farm practices, particularly in the context of increasing variability under climate change (Darabant et al 2020). Indeed, as recently as 2015, delayed onset of rains contributed to widespread drought in Ethiopia, a trend that is likely to increase over time (Philip et al 2018). This is particularly relevant for early season crops such as maize, which are planted soon after the rains start to maximize the length of the growing season (Liben et al 2015). The benefits of early planting, however, can be negated in years of 'false onset,' in which a wet period that prompts planting is followed by a dry spell that can reduce seedling density or require replanting (Kipkorir et al 2007, Tadross et al 2009). Soil moisture in the early stages of crop growth is critical to a good harvest, particularly in dry years (Yang et al 2021). The intense drought of 2015, for example, was especially acute in that both the timing and intensity of rainfall (i.e. a late start followed by only sporadic rainfall) led to failed harvests (Philip et al 2018). Climate change is likely to increase the frequency of such dry seasons (Philip et al 2018), prompting a need for both quality forecasts and adaptive management practices.
Farmers have already begun to adapt to climate change through shifting of cropping seasons, adoption of new cultivars, and more flexible timing of planting dates to align with precipitation (Darabant et al 2020), but the need for clear, useful forecasts remains.
Studies detailing forecasts for seasonal total rainfall and corresponding agro-economic impacts are well-developed (Block 2006, Alexander et al 2019, Zhang et al 2020); however, forecasts for rainy season timing have only recently been developed, and thus minimal investigation into the impacts of false onsets has been conducted. Yang et al (2020) adapted the Decision Support System for Agrotechnology Transfer model (DSSAT; Hoogenboom et al 2012) to Ethiopia and found that precipitation is significantly correlated with maize germination, motivating evaluation of rainy season onset forecast impacts. MacLeod (2018) utilizes dynamic models from the European Centre for Medium-Range Weather Forecasts to predict onset and cessation of rainfall over most of East Africa, while Lala et al (2020) apply a statistical approach for localized onset forecasts specifically within Ethiopia. These forecasts have shown promise in predicting false onsets; however, their utility has not been quantified in terms of potential yield gains. This study quantifies yield gaps associated with suboptimal maize planting times by combining rainy season onset predictions with a dynamic crop model to highlight the value of pre-season information for agricultural decision-making.
Ethiopia's heterogeneous geography and climate contribute to a wide range of cropping seasons and farm practices, with eight distinct agro-ecological zones (AEZs) delineated by elevation, hydroclimatic regimes, and moisture zones (FAO 2010; figure 1). This study considers yield gaps from maize planting only in the summer Kiremt rainy season, during which ∼90% of maize production in Ethiopia occurs (Taffesse et al 2012). The spring Belg rains, although occurring in much of the country, are generally not drivers of maize production, except in the far south, and are not included in this study. Even within the Kiremt-dominated zones, however, precipitation regimes can range from dry and variable to wet and reliable, suggesting that forecast value may vary depending on location. This research aims to quantify this value, allowing for the targeted use of forecasts to increase user benefits (Alexander et al 2019).
Farmers typically aim to time maize planting with rainy season onset, but clear demarcation of the onset can be difficult. Definitions often include a precipitation threshold (e.g. 25 mm of rain over three days, and no dry spells of eight days thereafter; Segele and Lamb 2005) to represent optimal soil moisture for planting. This methodology, however, may be hampered by limited long-term station data and by remotely sensed gridded precipitation datasets that are not well suited for daily-scale thresholds, particularly in their representation of dry days (MacLeod 2018). As an alternative approach, the cumulative rainfall anomaly relative to some long-term mean is calculated and onset is defined as the minimum of this anomaly over a given year (see Liebmann and Marengo 2001, Dunning et al 2016, MacLeod 2018). These methods also have the advantage of avoiding 'false onsets' by considering dry spells throughout the year or season of interest (MacLeod 2018), and they often differ little from threshold-based methods in practice (Dunning et al 2016, Lala et al 2020). Furthermore, although planting at onset is assumed to be optimal for maximizing yields by ensuring a long growing season, the true onset date is unknown prior to the season. Onset predictions, however, have demonstrated reasonable skill, with mean errors ranging from 9-12 days based on localized empirical studies (Lala et al 2020) or a wider 0-28 days using regional dynamic model studies (MacLeod 2018). Defining onset in a way that is useful for adaptive planting approaches, as well as for prediction, therefore constitutes another goal of this research. Finally, although forecast-informed outcomes may represent improvements over the baseline scenario, there is long-standing evidence that farmers, as a whole, are risk averse (Lins et al 1981, Bar-Shira et al 1997) and may be reluctant to adopt forecast information.
Given this, agricultural extension efforts may be hampered if interventions include losses as well as gains (Yesuf and Bluffstone 2007). Minimal year-to-year variance in production, therefore, is also assumed valuable to farmers. Accordingly, this study incorporates variance and risk aversion to more accurately represent farmer preferences.

Defining onset and planting dates
Remotely sensed gridded precipitation datasets, which provide for larger, country-wide studies, are best paired with anomaly-based onsets (MacLeod 2018); however, the exact period for which to calculate anomalies is not necessarily clear. This study therefore investigates two variants of an anomaly-based onset: (a) a yearly method, in which the cumulative anomaly is calculated for each year relative to the long-term daily average (Dunning et al 2016; supplementary figure A1 (available online at stacks.iop.org/ERL/16/054035/mmedia)), and (b) a window method, in which anomalies and daily averages are computed over a window of interest (in this case, June-December to avoid the earlier Belg rains that occur in some parts of Ethiopia; MacLeod 2018). Precipitation data are taken from the Multi-Source Weighted-Ensemble Precipitation dataset (MSWEP; Beck et al 2017a) from 1979 to 2014 at three-hourly temporal and 0.25° spatial resolutions. MSWEP has been shown to accurately capture precipitation relative to other gauge-corrected precipitation datasets globally (Beck et al 2017b, 2019). Although the exact planting date clearly relies on a variety of factors, the UN Food and Agriculture Organization maintains data on the typical planting month for many crops, including maize, for the eight AEZs across Ethiopia (FAO 2010). We assume, as a baseline, that farmers will plant on the first day of the typical planting month in which soil moisture and temperature are optimal; planting is triggered when surface soil moisture is above 40% of saturation and temperature is above 10 °C (Yang et al 2020).
As the temperature threshold is nearly always reached by the FAO planting month, the effective trigger is soil moisture alone. The window definition of onset is not used to inform planting dates directly; rather, it is used to delineate areas for which the Kiremt season is the dominant season for maize planting. In areas where the median window onset precedes 1 August, the Kiremt season dominates; Belg-dominated areas in the southwestern part of the country are not evaluated in this study (supplementary figure A2). Mean onset dates from the yearly definition approximately match those of the FAO planting months (supplementary figure A3) and those of the threshold-based definition of Segele and Lamb (2005); the yearly definition is therefore taken here as the preferred onset-based planting date. As a further refinement, the onset date is limited to no more than one month before or after the typical calendar month for planting (e.g. for a default planting month of May, onset-based planting is limited to 1 April-30 June). This ensures reasonable planting dates even in extreme years.
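The yearly anomaly-based onset and the one-month limit described above can be illustrated with a minimal Python sketch. It assumes the long-term daily average is a single scalar (mean daily rainfall over the climatology) and that dates are handled as day-of-year integers; the function names and the example bounds (days 91 and 181, i.e. 1 April and 30 June in a non-leap year) are hypothetical and are not taken from the study's code.

```python
from itertools import accumulate


def cumulative_anomaly(daily_rain_mm, long_term_daily_mean_mm):
    """Running sum of (daily rainfall - long-term mean daily rainfall)."""
    anomalies = (rain - long_term_daily_mean_mm for rain in daily_rain_mm)
    return list(accumulate(anomalies))


def yearly_onset_index(daily_rain_mm, long_term_daily_mean_mm):
    """Index (0-based) of the minimum of the cumulative anomaly curve.

    The curve falls while rainfall is below the long-term mean and rises
    once it exceeds the mean, so the minimum marks the turning point that
    the anomaly-based definition treats as onset.
    """
    curve = cumulative_anomaly(daily_rain_mm, long_term_daily_mean_mm)
    return min(range(len(curve)), key=curve.__getitem__)


def clamp_onset(onset_day, earliest_day, latest_day):
    """Limit an onset day to an allowed planting window (assumed inclusive)."""
    return max(earliest_day, min(onset_day, latest_day))
```

For a toy series of five dry days followed by five days of 10 mm, with a long-term mean of 5 mm per day, `yearly_onset_index` returns the index of the last dry day, after which the cumulative anomaly starts to rise. Because the minimum is taken over the whole year, a short wet spell followed by a long dry spell does not register as onset, which is the property the text credits with avoiding false onsets.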
To investigate the value of forecasts for increasing yield, this study compares a realistic forecast-informed planting date with the true onset date. For each grid cell (g) and year (i), we generate synthetic forecasts (j = 1-30) by randomly sampling an error (ε) from a normal distribution with a mean of 0 and standard deviation of 12.5 days, corresponding to a mean absolute error of 10 days (i.e. comparable to average errors from MacLeod 2018, Lala et al 2020), and adding it to the true onset date (equation (1)). If a forecast planting date falls before the baseline date, yet soil conditions are too dry (defined in the following section based on estimates from the crop model), the baseline date is assumed for planting. The forecast-informed planting (FIP) date is thus defined as:

$$\mathrm{FIP}_{g,i,j} = \begin{cases} \mathrm{baseline}_{g,i}, & \text{if } \mathrm{onset}_{g,i} + \varepsilon_{g,i,j} < \mathrm{baseline}_{g,i} \text{ and soil conditions are too dry} \\ \mathrm{onset}_{g,i} + \varepsilon_{g,i,j}, & \text{otherwise} \end{cases} \qquad \varepsilon_{g,i,j} \sim N(0,\, 12.5^2) \tag{1}$$

To test adaptive strategies, this study employs three different sets of planting criteria over the study period of 1979-2014:
(a) A baseline criterion representing historical yields, based on the first instance of optimal soil moisture within a given planting month as set by FAO (2010), and corresponding to the criterion used in the calibrated DSSAT crop model (Yang et al 2020).
(b) Planting during the true onset date (the onset-informed approach), defined using the yearly method (Dunning et al 2016) and limited to one month before or after the FAO planting month.
(c) A forecast-informed planting date, in which the true onset date is perturbed by introducing random error to create a prediction comparable with the onset forecast accuracy of MacLeod (2018) and Lala et al (2020). The forecast is disregarded if it occurs before the baseline planting date, since farmers will not plant in dry soil.
Thirty sets of simulations, each with different sets of forecasts, are evaluated, and resulting yields are averaged for each location.

Yang et al (2020) found the model to demonstrate generally low bias for maize yields across Ethiopia.
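The synthetic forecast procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's implementation: dates are day-of-year integers, sampled errors are rounded to whole days, and `soil_too_dry` is a hypothetical callable standing in for the crop-model soil moisture estimate. The helper `expected_mae` uses the standard result that a zero-mean normal error with standard deviation σ has mean absolute error σ·sqrt(2/π), which for σ = 12.5 days gives roughly 10 days.

```python
import math
import random

SIGMA_DAYS = 12.5   # standard deviation of forecast error, from the text
N_FORECASTS = 30    # synthetic forecasts per grid cell and year, from the text


def expected_mae(sigma):
    """Mean absolute error of a zero-mean normal error: sigma * sqrt(2/pi)."""
    return sigma * math.sqrt(2.0 / math.pi)


def synthetic_forecast_days(true_onset_day, rng, n=N_FORECASTS, sigma=SIGMA_DAYS):
    """Perturb the true onset day with n independent normal errors.

    Rounding to whole days is an assumption made for this sketch.
    """
    return [true_onset_day + round(rng.gauss(0.0, sigma)) for _ in range(n)]


def forecast_informed_planting_day(forecast_day, baseline_day, soil_too_dry):
    """Fall back to the baseline day when an early forecast meets dry soil.

    soil_too_dry is a hypothetical callable: day -> bool.
    """
    if forecast_day < baseline_day and soil_too_dry(forecast_day):
        return baseline_day
    return forecast_day
```

A caller would create a seeded generator, for example `rng = random.Random(0)`, draw `synthetic_forecast_days(150, rng)`, and pass each result through `forecast_informed_planting_day` together with that grid cell's baseline day.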
Yields from individual model simulations based on the three planting criteria (baseline, onset-informed, and forecast-informed) are averaged by woreda (district-level administrative divisions) for comparison. To explore different perspectives on improvement over the baseline scenario, we employ common risk-informed metrics from economics (expected utility; von Neumann and Morgenstern 1944) and finance (average value at risk, a.k.a. expected shortfall; Artzner et al 1998), such that three evaluation metrics are used: mean yield, expected utility, and average value at risk. These metrics are compared across planting scenarios to infer the relative benefit of onset-informed and forecast-informed planting to the baseline planting both spatially and temporally.
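As a rough illustration of how such risk-informed metrics behave, the Python sketch below computes a mean yield, an average value at risk, and a certainty-equivalent yield from a list of annual yields. The utility function is not specified in the text available here, so the exponential (constant absolute risk aversion) form and the `risk_aversion` value are assumptions chosen only because they remain defined when a failed season produces zero yield. The tail-size rule in `average_value_at_risk` (rounding up to a whole number of years) is likewise an assumption.

```python
import math


def mean_yield(yields):
    """Arithmetic mean of annual yields (e.g. kg per hectare)."""
    return sum(yields) / len(yields)


def average_value_at_risk(yields, alpha):
    """Mean of the lowest alpha fraction of yields (expected shortfall).

    The tail size is rounded up to a whole number of years; the small
    epsilon guards against floating-point error in alpha * n.
    """
    ordered = sorted(yields)
    k = max(1, math.ceil(alpha * len(ordered) - 1e-9))
    tail = ordered[:k]
    return sum(tail) / len(tail)


def certainty_equivalent(yields, risk_aversion=0.001):
    """Certainty-equivalent yield under an ASSUMED exponential utility.

    u(y) = 1 - exp(-a * y); the certainty equivalent is the constant yield
    with the same expected utility. The value of a is hypothetical.
    """
    mean_exp = sum(math.exp(-risk_aversion * y) for y in yields) / len(yields)
    return -math.log(mean_exp) / risk_aversion
```

Under this assumed utility, a series that alternates between crop failure and a large harvest has a certainty equivalent well below its mean, whereas a constant series has a certainty equivalent equal to its mean. This is the sense in which reduced interannual variance can offset a lower mean yield.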

Results and discussion
Onset-informed and forecast-informed planting generally return higher yields in the northern, western, and northeastern parts of the country, in comparison to the baseline scenario, while the central highlands show little or no improvement (figures 2, 4, and 5). Overall, the share of area that demonstrates improvement over baseline ranges from 73% to 81%, depending on the planting strategy or performance metric (table 1). The baseline scenario more closely matches observed yields in most areas than do the informed approaches (figure 3), suggesting it more properly represents current planting practices. Across scenarios, bias is most pronounced in the east (semiarid AEZ, generally overestimating yields), whereas much of the rest of the country exhibits low bias, especially for the baseline approach. The informed approaches exhibit lower yields relative to both the baseline and observed data in the central highlands (tropical highland humid and tropical highland subhumid AEZs), suggesting that onset-based planting strategies may not necessarily be ideal or representative of regions with plentiful moisture. On a year-to-year basis, the forecast-informed and onset-informed planting scenarios closely match, irrespective of performance, compared to the baseline (figure 5), suggesting that forecast errors are minimal and/or do not translate into notable differences in yields for most years. Regardless of overall gains or losses in mean yield, the onset- and forecast-informed approaches reduce interannual variance in yields, eliminating the drastically poor outcomes due to a delayed onset. This is particularly notable in the case of the tropical highland example (figure 5(b)), in which the adaptive approaches averted crop failure due to the 1999-2000 and 2008 droughts despite having lower yields in most years relative to the baseline.
Risk-informed metrics, such as expected utility and average value at risk, demonstrate similar spatial patterns to that of mean yield, with improvements in the north, west, and northeastern parts of the country (figures 6 and 7). However, due to the lower overall interannual yield variance of the forecast-informed and onset-informed approaches relative to the baseline, decreases in performance in the central highlands are less stark. In terms of benefits to farmers, therefore, the relative advantages of interannual reduction in yield variance may offset mean yield losses from adaptive approaches, since farmers are likely to weigh severe losses (present in the baseline approach) more highly than overall gains. Regarding average value at risk, α (the bottom percentile of yields from which to average; this study considers values from 5% to 30%, and figure 7 presents results for α = 10%) plays a small role; lower levels of α tend to demonstrate slightly better performance (i.e. yields in the worst years improve the most), but spatial patterns are similar regardless of α level.

Figure 3. Mean observed yield, 2004-2013 (upper left), and relative biases of mean yields relative to observed yields, %, for the baseline (upper right), onset-informed (lower left), and forecast-informed (lower right) scenarios.
In general, onset-informed and forecast-informed planting results in large yield gains in drier areas, including most of the semi-arid, tropical highland semi-arid, and sub-humid AEZs. The central highlands (tropical highland humid and tropical highland sub-humid AEZs), which tend to have more substantial and reliable rainfall, demonstrate minimal gain, or even loss, when adopting these alternative planting approaches. This may be attributable to the nature of the yearly onset definition; for anomaly calculations based on long-term mean precipitation, wet areas by definition must receive more absolute rainfall to trigger onset. Indeed, onset in these wet areas tends to be several weeks later than the typical planting date (supplementary figure A3). Moreover, this difference appears to be increasing due to climate change, which may explain the slight downward trend in yields (figure 5); evidence suggests that farmers have already begun to alter planting times to counter this pattern (Darabant et al 2020). Mirroring this, in the eastern parts of the tropical highland semi-arid AEZ, a drier area where onset is generally several weeks earlier than the typical planting time, the alternate planting approaches also result in lower yields. For most other regions, onset corresponds well to typical planting dates and yield differences are not as stark. Such limitations in the application of anomaly-based onset definitions (as opposed to an absolute threshold of precipitation) must therefore be considered, despite their advantages for large-scale studies using gridded precipitation datasets (MacLeod 2018). Further, in moisture-reliable areas, other factors, such as the expected cessation of the rainy season and amount of solar radiation, may play a larger role in determining both yields and farmers' planting decisions (Yang et al 2020, Alexander and Block 2021).
We note, however, that onset- and forecast-informed planting may still offer benefit by reducing inter-annual variability in yields, as evidenced by the slightly less severe losses as measured by expected utility and average value at risk. In most parts of Ethiopia, the onset- and forecast-informed approaches are comparable, suggesting that modest forecast errors are outweighed by other factors (e.g. solar radiation and temperature; Yang et al 2020). In the eastern parts of the country, however, differences between the two informed approaches tend to be more stark. Mean yields and average value at risk are somewhat higher for the onset-informed approach in the tropical highland semi-arid AEZ, while they are lower in the semi-arid (non-highland) AEZ. These same regions tend to have more extreme precipitation variance across years, along with generally low yields overall, which may indicate a lack of precision in the model rather than true differences in the onset- vs forecast-informed approaches. Indeed, absolute differences in yields across scenarios, rather than relative differences, are rather small in the eastern parts of the country (see appendix).
Although the large improvements seen in the dry and drought-prone north are partially due to consistent yield gains relative to the baseline across years, some gains are also a result of the more flexible nature of the onset- and forecast-informed approaches, which allow for planting outside the default planting month. In the baseline scenario, if soil moisture is never sufficient for planting within the set month, the model assumes a 'failed season' and no production (e.g. 1987 and 1997 in figure 5(a)), even if soil moisture may be sufficient in the following months. Although this is a limitation of the model, it also demonstrates that gains from the climate-informed planting approaches are most pronounced in years of unexpected or false onset, which farmers cannot easily anticipate. Nevertheless, very large gains in the far north of the country should be interpreted with caution.
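The baseline planting rule and its 'failed season' behaviour can be made concrete with a short Python sketch. It assumes daily surface soil moisture (as a fraction of saturation) and temperature are available as mappings keyed by day of year; the function and parameter names are hypothetical, and only the 40% and 10 °C thresholds come from the text.

```python
def baseline_planting_day(planting_month_days, soil_moisture_frac, temperature_c,
                          moisture_threshold=0.40, temp_threshold_c=10.0):
    """First day in the planting month on which both triggers are met.

    planting_month_days: iterable of day-of-year integers, in order.
    soil_moisture_frac:  dict day -> soil moisture as a fraction of saturation.
    temperature_c:       dict day -> temperature in degrees Celsius.
    Returns None when no day qualifies, which this sketch treats as the
    'failed season' case (no planting, hence no production).
    """
    for day in planting_month_days:
        wet_enough = soil_moisture_frac[day] > moisture_threshold
        warm_enough = temperature_c[day] > temp_threshold_c
        if wet_enough and warm_enough:
            return day
    return None
```

In this sketch a year whose soil first exceeds the moisture threshold one day after the planting month ends returns `None`, whereas an onset-informed rule with a wider window would still plant. That difference is one source of the large gains in the north noted above.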

Conclusion
This study considers the role of onset-informed and forecast-informed planting strategies in determining maize yields across Ethiopia. Results indicate (a) a large potential for yield gains using an onset-or forecast-informed planting date, particularly in drier parts of the country. Conclusions in the wet highlands are less clear, showing both benefits and drawbacks of new strategies. On one hand, (b) informed planting dates appear to slightly reduce yields in the wet highlands, suggesting that other factors dominate production. On the other, however, (c) these alternative approaches also tend to reduce variance across years such that risk-informed metrics show a more neutral assessment. Such spatial heterogeneity in yield gaps may be related to the nuances of defining onset; the cumulative precipitation anomaly-based definition used in this paper is advantageous for large-scale studies using gridded precipitation data (MacLeod 2018), but (d) differences in the absolute amount of rainfall required for different climate zones may result in wet areas having onsets that occur well after soil moisture is sufficient for planting, which may in turn lead to an excessively short season by the time of rain cessation. More generally, this study concludes that planting based on reliable rainfall is only part of the optimal solution, and that farmers may improve upon these results by considering forecast information alongside other factors not considered in this paper, such as indigenous knowledge (e.g. the appearance of flora and fauna associated with impending rain) or the adoption of new technologies. Indeed, farmers in Ethiopia have already begun to adapt their cropping seasons to a changing climate by considering onset trends (Darabant et al 2020), making the adaptation efforts described in this study more feasible and possible than ever before.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://doi.org/10.5281/zenodo.4594203. Data will be available from 01 July 2021.

Part 2. Maps of absolute improvements (kg ha⁻¹), rather than relative improvements (%), corresponding to Figures A3-A4 and A6-A7

Figure A4. Mean observed yield, 2004-2013, and relative biases of mean yields relative to observed yields, kg ha⁻¹, for the baseline (upper right), onset-informed (lower left), and forecast-informed (lower right) scenarios.

Figure A5. Improvement in mean yields (kg ha⁻¹) for onset-informed planting over baseline (left), forecast-informed planting over baseline (center), and onset-informed over forecast-informed planting (right).

Figure A6. Improvement in expected utility (kg ha⁻¹ equivalent) for onset-informed planting over baseline (left), forecast-informed planting over baseline (center), and onset-informed over forecast-informed planting (right).

Figure A7. Improvement in average value at risk (kg ha⁻¹), α = 10%, for onset-informed planting over baseline (left), forecast-informed planting over baseline (center), and onset-informed over forecast-informed planting (right).