Seasonal forecasts of the rainy season onset over Africa: Preliminary results from the FOCUS-Africa project

probabilistic fashion by accounting for the frequencies of normal, early and late onsets predicted by the forecast system. We compute the skill using the hindcasts (forecast simulations conducted for the past) starting at the beginning of each month in the period 1993 – 2016. We detect the onset timing of the rainy season using a non-parametric method that accounts for double seasonality and is suitable for the specific time-window of the seasonal forecast simulations. We find positive skills in some key African agricultural regions some months in advance. Overall, the multi-model ensemble outperforms any individual model ensemble. We provide targeted recommendations to develop a useful climate service for the agricultural sector in Africa.


Practical Implications
This research showed the scientific value and technical advances of predicting rain onset over the African continent using one instance of a non-parametric method that is suitable for operational purposes. The robustness of this climate service can be enhanced by increasing the simulation ensemble size (as we demonstrated), involving different methodologies for the onset determination (Bombardi et al., 2020; Dunning et al., 2016; Liebmann et al., 2012; MacLeod, 2018) and different observational references (Bombardi et al., 2019). However, to move from a scientific exercise and a theoretically useful service to an actionable climate service, we need to move away from a supply-driven approach and provide services that are user-driven and context-specific (Findlater et al., 2021). This requires involving stakeholders and future users of this service in the co-exploration and co-design of the products. The process of new knowledge coproduction, rather than just its outcome, is what brings value to both the scientific and user communities (Steynor et al., 2020). Coproduction should hence be an iterative and inclusive process that supports learning, trust and partnership building, and enables the testing and evaluation of the new knowledge and services (Bojovic et al., 2021; Findlater et al., 2021; Steynor et al., 2020). These coproduction partnerships need to be equitable and to address power imbalances between actors (Vincent et al., 2020). Such unequal power dynamics could appear in North-South partnerships, and could be addressed by developing shared understandings and questioning Western standards as the only possible baseline (Daly and Dilling, 2019; Hoppers, 2011).
Information about the rain onset is a typical user requirement in agriculture-based economies, and is particularly emphasised in the regions strongly affected by the changing climate. As expected, information about onset was repeatedly mentioned as key for agricultural planning during the FOCUS-Africa fieldwork in Tanzania and Malawi. From farmers and extension workers, through staff from international organisations such as FAO and WFP, to ministry officials and representatives of the NHMSs, everyone agreed that seasonal forecasts of the rainy season onset could be considered a useful climate service. The co-exploration work done during the fieldwork not only showed a strong interest in the onset, but also confirmed that stakeholders are already familiar with seasonal climate forecasts supplied by NHMSs. Extension workers in Tanzania and Malawi receive training to help them interpret this information and disseminate it to farmers. Certainly, this plethora of users differs in technical knowledge, in the practice of using climate information and in the need for detailed information. Accordingly, while the scientific information presented in this paper could support the current seasonal climate information development by NHMSs and could be of interest to some representatives of ministerial departments, such as agricultural research and services, extension workers could be more interested in the comprehensibility of the final service, and hence in its visualization, so that they can communicate it in a simple way to farmers.
The initial co-exploration of the needs for this type of climate service paved the way for the FOCUS-Africa co-development process, which will further fuel discussion and knowledge exchange between scientists and users. The co-development stage will address the drawbacks that we recognised in this research, such as lack of skill, as well as the visualization of probabilistic forecasts. The benefits of using information from climate forecasts, instead of climatology, in many cases only become obvious when considering results over the long term, which requires inspecting the long-term performance of forecasts with users (Terrado et al., 2019). Correspondingly, to communicate the concept of skill more transparently and demonstrate the usefulness of probabilistic forecasts with relatively low skill, it has been suggested to include a long-term perspective. When it comes to visualisation, some of the aspects of seasonal climate forecasts of the rainy season onset that require particular attention are the representation of uncertainty and the involvement of users in evaluating the visualisation to make it more accessible, as well as finding a shared language and terminology between service providers and users (Terrado et al., 2022).
Since the skill of the seasonal forecasts obtained in this study is not sufficient to meaningfully inform decisions in several key agricultural areas of Africa, these regions require particular attention in the coproduction process. However, we demonstrated that a larger ensemble of simulations could be used in order to increase the overall skill. This could be partly achieved by considering all starting dates rather than just the 1st of the month, which would provide 21 more members from the Met Office model (see Methods) and additional members from additional models that populate the C3S ensemble. Other multi-model ensembles from major seasonal forecast centers, such as the North American Multi-Model Ensemble (NMME), could be included too. These, however, would need a separate skill assessment period (either 1991 to 2020 or 1982 to 2020). Including different definitions for the onset determination (Bombardi et al., 2020; Fitzpatrick et al., 2015) as well as alternative observational datasets to determine the reference onset (Bombardi et al., 2019) should be considered in order to improve the robustness of the results. Finally, using bias-corrected model outputs could further improve the final performance. In fact, the ranked anomalies analysis is expected to partially reduce some of the model biases, which we recommend for further analysis. All these approaches that could improve or address the lack of skill require further discussion and exchange of ideas, particularly with advanced users from the regions with low or no skill in seasonal forecasts. By also working side by side with intermediary users, such as employees from the ministry of agriculture, including extension workers, scientists could more easily understand what is crucial for informing decisions and how the information about the onset complements other information types and sources, and fits into different decision-making contexts.

Introduction
Africa is emerging as a potential hotspot of climate change (Fan et al., 2021; Turco et al., 2015). Following the "wet areas getting wetter and dry areas getting drier" paradigm (Held and Soden, 2006; Toreti et al., 2013), the main climate change signal for precipitation over Africa broadly consists of the wetting of the Tropics and the drying of the Subtropics (Dosio et al., 2019; Spinoni et al., 2020). However, Africa experiences substantial inter-annual and intra-seasonal climate variability that is also increasing (Dosio et al., 2021; Nikulin et al., 2018) and that could affect vegetation seasonality (Peano et al., 2019) and stability (Zampieri et al., 2021).
Agriculture in Africa is heavily affected by climate variability and is vulnerable to climate change (Challinor et al., 2007; Zampieri et al., 2019). Therefore, the provision of climate services is a necessary prerequisite for facilitating climate change adaptation in the agricultural sector over Africa, and seasonal forecasts are among the main tools for that purpose (Hansen et al., 2019; Vaughan et al., 2019; Wanders and Wood, 2017). However, transforming climate science into climate services and turning seasonal climate forecasts into useful and usable information requires an iterative co-production process that brings together academic and domain knowledge (Carter et al., 2019; Bojovic et al., 2022; Bremer et al., 2019; Findlater et al., 2021; Norström et al., 2020).
Seasonal and monthly anomalies of atmospheric circulation, surface temperature and precipitation flux can potentially be predicted at the seasonal time-scale over several African regions, particularly in those characterized by a proven statistical relationship with the El Niño Southern Oscillation (ENSO) state (Joly and Voldoire, 2009; Lenssen et al., 2020; MacLeod, 2019). Dynamical models show strong predictive skill in South Africa for temperature and precipitation anomalies in boreal summer (Ratnam et al., 2014), which translates into skilful forecasts of e.g. temperature extremes and drought indices (Winsemius et al., 2014). Given the association between tropical Pacific Ocean temperature anomalies and the Indian Ocean Dipole, Eastern Africa is also a good candidate for skilful climate forecasts (Hastenrath et al., 2004). For this reason, dynamical seasonal predictions have been able to anticipate both the extremely dry October-December 2010 and March-May 2011 rainy seasons, affected by a strong La Niña event (Dutra et al., 2013). In general, the forecast for the short rainy season has higher skill than that for the long rainy season, due to the high predictability of the Indian Ocean sea surface temperatures (Mwangi et al., 2014). Empirical predictions of the short rains have been successfully tested in the region since the 1990s (Mutai et al., 1998), and recently the statistical and dynamical approaches have been merged to enhance forecast skill (Colman et al., 2020). Dynamical forecasts of precipitation for Western Africa and the Sahel also show positive skill and reliability, although uncertainties are larger in this region (Pirret et al., 2020). Despite the large land-atmosphere coupling exhibited in most parts of the Sahel and the northern coast of Guinea (Koster et al., 2004), soil moisture memory decreases very rapidly during the dry season and does not contribute to the predictability of the all-summer monsoon rainfall (Douville et al., 2007), while the end of the rainy season is generally better predicted.
The onset of the rainy season is an important climate indicator that is frequently requested by farmers (Golding et al., 2019; Mittal et al., 2021). It was a clear requirement that emerged from preliminary discussions with climate information users in Tanzania and Malawi, from farmers and extension workers to people working in agricultural departments, within the EC-funded H2020 project FOCUS-Africa (Boscolo et al., 2021). In fact, with the exception of the most humid regions, precipitation is the main limiting factor for agricultural production in the tropical and subtropical climates of Africa (Iizumi et al., 2019; Sacks et al., 2010). The date of onset often determines the length of the growing season, which is key for high crop yields. A late onset, even if followed by high rainfall, may lead to a short season with crops not reaching full maturity. The onset date is also an indicator of the overall seasonal rainfall amount (Camberlin et al., 2009). Information about the amount of rainfall early in the season and the onset date can also influence decisions on transhumance southward during the very late dry season with the aim of meeting the rain (Rasmussen et al., 2014). Therefore, several studies have already addressed the skill of seasonal and sub-seasonal model simulations in predicting the onset of the rainy season over Western Africa (Kumi et al., 2020; Rauch et al., 2019; Vellinga et al., 2013), Eastern Africa (MacLeod, 2018) and Southern Africa (Ratnam et al., 2018) using different definitions for the onset (Bombardi et al., 2020). Given the importance of and demand for such a service, national weather services also provide operational assessments of the rainy season onset occurrence based on the time when precipitation starts being consistently larger than certain empirical precipitation thresholds that depend on the specific location (see e.g. https://agromet.meteo.go.tz/).
This study aims at providing a first overall picture of the rainy season onset predictability for the entire African continent by implementing an algorithm to detect the onset from seasonal forecast model simulations. We chose an objective algorithm that determines the rainy season onset in relation to the local long-term mean daily precipitation without the need to set arbitrary thresholds (Liebmann et al., 2012). This method was tested on continuous data and does not account for the double seasonality that characterizes in particular the Horn of Africa, a zonal equatorial strip extending from Uganda to Equatorial Guinea, and a small region on the southern West African coastline (Dunning et al., 2016). Therefore, we use a modified version of this algorithm that operates over time-windows of 6 months, similarly to other seasonal forecast applications (MacLeod, 2018). This solution might seem appealing as a way to automatically remove the issue of double seasonality. However, double seasonality still represents an issue, especially if one season carries less water than the other, or in the case of a particularly dry year (Bombardi et al., 2017; MacLeod, 2018, 2019). Several methods have been proposed to address this issue (Bombardi et al., 2020; Liebmann et al., 2012). In order to assess whether the location under scrutiny is characterized by double seasonality, we conduct a preliminary pre-processing of the precipitation data through harmonic analysis, similarly to Dunning et al. (2016). Over such regions, we apply an adaptive threshold that always captures the onset of the season under scrutiny, regardless of whether it is a minor season or a particularly dry year. Our method is suitable for application in an operational context. We test it for all starting months over all of Africa, using a large ensemble of seasonal forecast models provided by the Copernicus Climate Change Service (C3S). We compute the probabilistic skill compared to the past observations. Finally, we discuss how to turn this information into a usable climate service.

Data and methods
This study uses the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) version 2 (Funk et al., 2015) as the observational reference dataset for rainfall. The CHIRPS dataset is obtained by blending surface stations and satellite observations. The accuracy of the onset determination from CHIRPS is discussed by Dunning et al. (2016) in comparison with other observational datasets. CHIRPS data are remapped from the original 0.05° spatial resolution to the 1° grid of the seasonal forecast models using a conservative method (Chen and Knutson, 2008; Jones, 1999), in order to be compatible with the seasonal forecast data. We apply a fast Fourier transform (FFT) algorithm to compute the first and second harmonics of annual precipitation, corresponding to the single and double seasonal cycles (Figure S1). As already noted (Dunning et al., 2016), several regions in Africa are characterized by marked double seasonality of precipitation. We find significant double seasonality signals also over arid regions (Figure S1). As noted before, this signal can be a consequence of the narrow rainfall peak.
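The harmonic pre-processing can be illustrated with a minimal sketch. It assumes a 365-day mean daily climatology at one grid point and evaluates the two harmonics with a direct discrete Fourier sum from the standard library rather than an FFT routine. The function names, the idealized input series and the use of the amplitude ratio as a bimodality indicator are illustrative assumptions, not the code used in this study.

```python
import cmath
import math


def harmonic_amplitude(climatology, k):
    """Amplitude of the k-th annual harmonic of a daily climatology (direct DFT sum)."""
    n = len(climatology)
    coeff = sum(p * cmath.exp(-2j * math.pi * k * t / n)
                for t, p in enumerate(climatology))
    return 2.0 * abs(coeff) / n


def seasonality_ratio(climatology):
    """Ratio of second to first harmonic amplitude.

    Assumption for illustration: a ratio above 1 hints at a bimodal regime.
    """
    c1 = harmonic_amplitude(climatology, 1)
    c2 = harmonic_amplitude(climatology, 2)
    return c2 / c1


# Idealized climatologies in mm/day (hypothetical values, not CHIRPS data).
days = range(365)
unimodal = [3.0 + 2.0 * math.cos(2 * math.pi * t / 365) for t in days]
bimodal = [3.0 + 0.5 * math.cos(2 * math.pi * t / 365)
           + 2.0 * math.cos(4 * math.pi * t / 365) for t in days]

print(seasonality_ratio(unimodal) < 1.0)  # single annual cycle dominates
print(seasonality_ratio(bimodal) > 1.0)   # double cycle dominates
```

In practice an FFT library would be applied to the gridded field; the direct sum is used here only to keep the sketch self-contained.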
The seasonal forecast dataset is composed of a multi-model ensemble of daily precipitation output simulated for the past (hindcasts) that is made available on the Copernicus Data Store (cds.climate.copernicus.eu) at 1-degree horizontal spatial resolution. This ensemble is used to assess the skill on the past onset anomalies estimated at each starting month of the year. It is composed of: -40 simulations by the CMCC-SPS 3. We use the simulations starting on the 1st day of each month from January 1993 to December 2016. Simulations last about 6 months, with small differences in duration among the models. The full dataset therefore comprises more than 36 thousand simulations (127 ensemble members × 12 months × 24 years) of about 57 Megabytes each (about 2 Terabytes in total). The UK Met Office model also provides different starting dates (the 9th, 17th, 25th) that are not used in the present study, for consistency with the other models. The UK Met Office model is therefore less represented in the multi-model ensemble used for this study.
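As a quick check of the dataset size quoted above, the arithmetic can be written out explicitly. The figures are those given in the text; the variable names and the decimal unit conversion (1 Terabyte = 10^6 Megabytes) are assumptions made for this sketch.

```python
members = 127        # ensemble members across all models (from the text)
start_months = 12    # one start date per month
years = 24           # hindcast years 1993-2016
size_mb = 57         # approximate size of one simulation in Megabytes (from the text)

n_simulations = members * start_months * years
total_tb = n_simulations * size_mb / 1e6  # assumes 1 TB = 1e6 MB

print(n_simulations)       # 36576
print(round(total_tb, 2))  # 2.08
```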
For each of the available simulations, the standard Liebmann method involves the following steps:
a) Compute the annual average of daily precipitation from the multiannual precipitation mean (i.e. long-term annual mean/365) for each model. This average is computed considering data simulated for all starting dates and all lead times.
b) For each simulation, compute the difference between daily precipitation and the annual average of daily precipitation.
c) Integrate this difference in time for each seasonal forecast simulation. This results in a new ensemble of the same size as the original, but composed of cumulated daily precipitation anomalies instead of daily absolute values.
d) Compute the onset timing as the timing of the minimum of the cumulated daily precipitation anomalies.
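Steps a) to d) can be sketched as follows for a single simulation. This is a minimal illustration under stated assumptions: the function names and the idealized 180-day series are hypothetical, and the onset is returned as the 0-based index of the minimum (some implementations report the following day instead).

```python
def long_term_daily_mean(precip_by_simulation):
    """Step a): mean daily precipitation over all available simulated days."""
    total = sum(sum(series) for series in precip_by_simulation)
    count = sum(len(series) for series in precip_by_simulation)
    return total / count


def cumulative_anomalies(daily_precip, threshold):
    """Steps b) and c): running sum of (daily precipitation - threshold)."""
    total = 0.0
    cumulated = []
    for p in daily_precip:
        total += p - threshold
        cumulated.append(total)
    return cumulated


def detect_onset(daily_precip, threshold):
    """Step d): 0-based index of the minimum of the cumulated anomalies."""
    cumulated = cumulative_anomalies(daily_precip, threshold)
    return min(range(len(cumulated)), key=cumulated.__getitem__)


# Idealized 180-day simulation (mm/day): a dry period with one isolated
# 10 mm event on day 20, followed by a wet season starting on day 60.
series = [0.0] * 180
series[20] = 10.0
for t in range(60, 180):
    series[t] = 5.0

threshold = 2.0  # hypothetical long-term mean daily precipitation (mm/day)
print(detect_onset(series, threshold))  # 59, the last day before the wet season
```

The isolated event on day 20 raises the cumulated anomaly only temporarily, so it does not become the minimum; the detection falls on the last dry day before sustained rainfall begins.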
Compared to a more intuitive estimation based on the absolute daily data, the minimum of cumulated daily anomalies avoids 'false onsets' that occur when a single precipitation event triggers the onset algorithm but is immediately followed by a dry spell, before the 'real' start of the season (Liebmann et al., 2012). This formulation allows uniform application of the algorithm to heterogeneous locations because it depends on local climatology instead of arbitrarily determined empirical thresholds (Liebmann et al., 2012). Another advantage of this method is that, in principle, it does not require bias correction of different datasets, a significant factor when considering multiple forecast systems, because each model's results are processed on the basis of its own climatological values.
Our implementation includes several differences with respect to the standard Liebmann method:
a) As in Dunning et al. (2016), we conduct a preliminary analysis (FFT) to assess whether the location under scrutiny is characterized by single or double seasonality (see Figure S1).
b) As in MacLeod (2018), we apply the algorithm to the six-month time-windows that typically characterize seasonal forecast models' outputs. This involves the additional issue of handling the detections that occur at the boundaries of the six-month period.
c) As several investigators noted (Bombardi et al., 2017, 2020; Dunning et al., 2016; Liebmann et al., 2012; MacLeod, 2018, 2019), Liebmann's method could fail when it is applied to detect the onset of a minor season in regions characterized by double seasonality, especially in particularly dry years. For operational purposes, we conceived a simple method based on an 'adaptive threshold' that represents a good compromise between accuracy, simplicity and robustness.
The detections that occur within the boundaries of the six-month period, excluding the boundaries, are considered accurate. The cases for which the minimum occurs either at the beginning or at the end of the simulated period can be considered informative as well. In fact, if the minimum is found at the beginning or at the end of this time-window, we can in principle assume that the onset occurred before the start of the simulation, or after the end of it, respectively. Fig. 1 provides a visual representation of the onset estimation applied on six-month time-windows based on idealized data in the case of a single annual precipitation cycle and of different types of double seasonality.
The Liebmann (2012) method effectively captures the onset timings that occur within the forecast period in the case of single seasonality (see Fig. 1a). Moreover, as shown in Fig. 1a, it provides useful indications in the cases where the onset cannot be determined precisely because it occurs outside the time-window covered by the simulations. In the case of symmetrical double seasonality (Fig. 1b), the method produces valid detections only. However, in the case of asymmetrical double seasonality (Fig. 1c), the method can fail to detect the onset for a time-window starting in proximity of the beginning of a rainy season with less rain than the following one (second starting date, orange line). This issue was similarly noticed by Bombardi et al. (2017) and MacLeod (2018) when applying the algorithm to observed and simulated precipitation in the case of very dry seasons. In the graphical example presented here, this issue is visible for the simulation started in February (the second starting date in the plot) because the algorithm captures the onset of the following season, corresponding to the absolute minimum of the accumulated precipitation anomalies. This issue can be aggravated if the difference between the precipitation levels of the two rainy seasons is larger, affecting also other starting dates (not shown). As a matter of fact, the first season onset is never detected if the following season produces around double the amount of rain with respect to the first. A simple way to overcome this issue consists of computing the cumulated precipitation anomalies based on the average precipitation computed over the current 6-month time-window under examination, instead of using the long-term mean. The results of the method implementing such an adaptive threshold are shown in Fig. 1e and 1f. Compared to the corresponding panels 1c and 1d, the new method always provides good detections at the expense of a small uncertainty in the exact date. Such uncertainty is presumably smaller than that associated with the climate simulations, and definitely smaller than that related to missing the onset completely.
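The difference between the long-term and the adaptive threshold can be illustrated on an idealized asymmetrical window. The sketch below is not the study's code: the precipitation values, the long-term mean of 4.5 mm/day and the function name are hypothetical choices made so that the minor season is drier than the long-term mean but wetter than the window mean.

```python
def onset_index(window, threshold):
    """0-based index of the minimum of the cumulated (precipitation - threshold)."""
    total = 0.0
    best_value = float("inf")
    best_index = 0
    for t, p in enumerate(window):
        total += p - threshold
        if total < best_value:
            best_value, best_index = total, t
    return best_index


# Idealized 180-day window (mm/day): 30 dry days, a 90-day minor season,
# 50 dry days, then the first 10 days of a wetter major season.
window = [0.0] * 30 + [4.0] * 90 + [0.0] * 50 + [9.0] * 10

long_term_mean = 4.5                     # hypothetical climatological value
window_mean = sum(window) / len(window)  # adaptive threshold, 2.5 mm/day here

print(onset_index(window, long_term_mean))  # 169: jumps to the major season
print(onset_index(window, window_mean))     # 29: captures the minor season
```

With the long-term mean, every day of the minor season still counts as a negative anomaly, so the minimum slides to the start of the following season. With the window mean, the minor season produces positive anomalies and its onset is retained.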
In order to construct a method that works for all of Africa and surrounding regions, we propose a formulation that gradually shifts between the original Liebmann (2012) method for single seasonality regions and the method based on the adaptive threshold for double seasonality regions. Our method can be described using the following threshold definitions to compute the precipitation anomalies:

$$\theta_L = \frac{1}{n_y \cdot 365} \sum_{y=1}^{n_y} \sum_{t=1}^{365} p_{y,t} \qquad (1)$$

where $\theta_L$ is the Liebmann (2012) precipitation threshold, defined as the long-term mean daily precipitation, $p_{y,t}$ is the daily precipitation at day $t$ and year $y$, and $n_y$ is the number of available years. $d$ is the starting date of the detection window. $\theta_A(y,d)$ is the adaptive threshold, defined as the mean daily precipitation over the six-month window under scrutiny; $\theta_A(y,d)$ therefore depends on the year and on the detection window. $\theta_c(y,d)$ is a weighted mean of the Liebmann and the adaptive thresholds. $c_1$ and $c_2$ are the first and second harmonic coefficients corresponding to unimodal and bimodal precipitation annual cycles (Figure S1), respectively, computed on the observed precipitation climatology. This methodology is reasonably simple to implement operationally and can potentially be extended to tri-modal precipitation annual cycles as well, if needed.
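A minimal sketch of the three thresholds is given below. The long-term and adaptive thresholds follow the definitions in the text. The exact weights of the combined threshold are not reproduced here, so the normalized form (c1·θ_L + c2·θ_A)/(c1 + c2) is an assumption made purely for illustration, and all function names are hypothetical.

```python
def long_term_threshold(precip_by_year):
    """Eq. (1): mean daily precipitation over all years and all days."""
    total = sum(sum(year) for year in precip_by_year)
    count = sum(len(year) for year in precip_by_year)
    return total / count


def adaptive_threshold(window):
    """Mean daily precipitation over the six-month window under scrutiny."""
    return sum(window) / len(window)


def combined_threshold(theta_l, theta_a, c1, c2):
    """Weighted mean of the two thresholds.

    ASSUMPTION: weights proportional to the harmonic coefficients c1 and c2;
    the weighting actually used in the study may differ.
    """
    return (c1 * theta_l + c2 * theta_a) / (c1 + c2)


# Purely unimodal location: the combined threshold reduces to the Liebmann one.
print(combined_threshold(4.0, 2.0, c1=1.0, c2=0.0))  # 4.0
# Purely bimodal location: it reduces to the adaptive one.
print(combined_threshold(4.0, 2.0, c1=0.0, c2=1.0))  # 2.0
```

Whatever the exact weights, the intended behaviour is a gradual transition between the two limiting cases shown in the usage lines.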
The probabilistic skill of the onset determination is assessed for each starting date of the hindcast period by computing the Ranked Probability Skill Score (RPSS). The onset timings are determined for each model's simulations and starting dates. The tercile thresholds (i.e. the 33.3rd and the 66.7th percentiles) are computed for each model and starting date. These thresholds determine the ranges of the three categories considered here, which are early onsets (first tercile, p < 33.3rd percentile), late onsets (third tercile, p greater than the 66.7th percentile) and normal onsets (otherwise). A similar process is used to categorize the onset occurrence in the CHIRPS observational dataset, considering the same time-windows as the seasonal forecast simulations. The 33.3rd and 66.7th percentiles are computed independently for the observations and for each forecast model. In this way each category (early, normal, late onset) is expected to represent the same frequency (33.3%) in the reference period for both the observations and the model simulations.
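The tercile categorization can be sketched with the standard library as follows. The function names and sample onsets are hypothetical, and `statistics.quantiles` uses its own default interpolation rule, which may differ slightly from the percentile definition adopted in the study.

```python
import statistics


def tercile_bounds(onsets):
    """Lower and upper tercile thresholds of a sample of onset timings."""
    lower, upper = statistics.quantiles(onsets, n=3)
    return lower, upper


def categorize(onset, lower, upper):
    """Assign an onset timing to the early, normal or late category."""
    if onset < lower:
        return "early"
    if onset > upper:
        return "late"
    return "normal"


def category_frequencies(member_onsets, lower, upper):
    """Fractions of ensemble members in the (early, normal, late) categories."""
    counts = {"early": 0, "normal": 0, "late": 0}
    for onset in member_onsets:
        counts[categorize(onset, lower, upper)] += 1
    n = len(member_onsets)
    return [counts[name] / n for name in ("early", "normal", "late")]


# Hypothetical hindcast onsets (days since window start) for one model.
reference = [10, 20, 30, 40, 50, 60, 70, 80, 90]
lower, upper = tercile_bounds(reference)

# Hypothetical ensemble of four members for one year and starting date.
print(category_frequencies([10, 20, 50, 80], lower, upper))  # [0.5, 0.25, 0.25]
```

Because the bounds are computed separately for each model and for the observations, each dataset is categorized relative to its own climatology.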
The distribution of the onsets determined over the six-month window is a superposition of two discrete distributions and one continuous distribution. On the one hand, onset estimations occurring within the window can be considered a continuous variable (yet discretised at a daily time scale). On the other hand, onsets occurring at the window edges form two discrete histograms. In such a situation, terciles are well defined if and only if the 33.3rd and 66.7th percentiles do not fall on the edges of the detection window. In this case the tercile-based ranges actually hold 33.3% of the data each. If the 33.3rd or 66.7th percentiles fall on the edges of the detection window, we cannot compute the frequencies reliably. In the latter case, we mask the data for that particular region and starting date and we do not proceed with the skill assessment.
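The masking rule can be expressed as a simple check. The sketch assumes that edge detections are coded as the first (0) and last (window length minus one) day of the window; this coding, the function name and the sample values are illustrative assumptions.

```python
import statistics


def terciles_well_defined(onsets, window_length):
    """True if both tercile thresholds fall strictly inside the detection window."""
    lower, upper = statistics.quantiles(onsets, n=3)
    first_day, last_day = 0, window_length - 1
    return first_day < lower and upper < last_day


# Hypothetical samples of 24 yearly onsets (days since window start).
mostly_inside = [0] * 3 + list(range(40, 61))  # few detections at the start edge
too_many_edge = [0] * 10 + list(range(40, 54))  # more than a third at the edge

print(terciles_well_defined(mostly_inside, 180))  # True: skill can be assessed
print(terciles_well_defined(too_many_edge, 180))  # False: grid point is masked
```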
In case the terciles are well defined, we evaluate the potential usefulness of seasonal forecast models in terms of their ability to produce ensembles of simulations whose departure from the climatological distribution is consistent with the observed anomalies. This would happen if the most frequent tercile category of the model's ensemble (early, normal, late onset) matches the observed category for each year and at each starting date. The overall skill of the models' ensembles can therefore be computed through the Ranked Probability Skill Score (RPSS), which accounts for the times when the most likely category predicted by a model matches the observed anomaly (Weigel et al., 2007). This is defined from the Ranked Probability Score (RPS) given by equation (4), where y and d represent the year and the starting date of the simulations, k represents the categories (early, normal, late onset), Y k are the frequencies of simulations with onset falling in each category, and O k is equal to one if the observed onset falls in category k, zero otherwise. The score is then defined by comparing the simulated skill to the one that would be expected according to the climatology (equation 5), where RPS CL is computed as in equation (4), but imposing Y k = 1/3 for each category. The median is computed from CHIRPS using the same 24 years of the seasonal forecast hindcast period that are available in Copernicus (1993–2016). Regions where more than 33.3% of the onsets occur outside the 6-month time-windows (two-sided) are masked in white.
Positive RPSS implies that the models perform better than the climatology. RPSS equal to one would be the upper limit if the models' simulations were always capturing the observed anomaly. Negative RPSS implies that the models' simulations are worse than the climatology.
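The behaviour of the score can be illustrated with a short sketch that uses the commonly adopted cumulative form of the RPS, i.e. the sum over categories of the squared difference between cumulative forecast and cumulative observed probabilities, with RPSS = 1 - RPS/RPS CL aggregated over all forecasts. This standard form and the function names are assumptions of the sketch; the exact expressions used in the study may differ in detail.

```python
def rps(forecast_probs, observed_category):
    """Ranked Probability Score for one forecast (standard cumulative form)."""
    cum_forecast = 0.0
    cum_observed = 0.0
    score = 0.0
    for k, prob in enumerate(forecast_probs):
        cum_forecast += prob
        cum_observed += 1.0 if k == observed_category else 0.0
        score += (cum_forecast - cum_observed) ** 2
    return score


def rpss(forecasts, observations):
    """Skill score relative to a climatological forecast of equal probabilities."""
    n_categories = len(forecasts[0])
    climatology = [1.0 / n_categories] * n_categories
    rps_forecast = sum(rps(f, o) for f, o in zip(forecasts, observations))
    rps_climatology = sum(rps(climatology, o) for o in observations)
    return 1.0 - rps_forecast / rps_climatology


# Categories: 0 = early, 1 = normal, 2 = late (hypothetical two-year sample).
observed = [0, 2]
print(rpss([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], observed))  # 1.0, perfect forecasts
print(rpss([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]], observed) < 0)  # True, worse than climatology
```

A forecast that always issues 1/3 for each category scores zero by construction, which is the reference against which positive and negative values are read.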

Onset determination
Fig. 2 shows an example of the developed determination algorithm for the onset timing, applied to obtain the start day of the onset from real data (CHIRPS) over a time-window of six months. In the presented example (Fig. 2), two isolated precipitation events occur before the actual onset. In this and other simulated cases (not shown), the algorithm is often successful in finding a consistent onset over the individual events exceeding the threshold that could be followed by a few dry days.
For operational purposes, it is convenient to express the onset timings in days since the beginning of the time-window spanned by the seasonal forecast simulations (Fig. 3), rather than in absolute dates. This might be less intuitive, but it considerably reduces the software complexity, limiting the possible sources of error. Fig. 3 can be taken as a reference for understanding the time-window of seasonal forecasting of the onset timings over Africa.
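Converting such a relative timing back to a calendar date for communication purposes is straightforward. The sketch below assumes a 1-based day count in which day 1 is the first day of the forecast window; this convention, the function name and the example year are assumptions.

```python
from datetime import date, timedelta


def onset_day_to_date(window_start, onset_day):
    """Calendar date of an onset given as a 1-based day count from window_start."""
    return window_start + timedelta(days=onset_day - 1)


# Forecast window starting on 1 September (hypothetical year), onset on day 100.
print(onset_day_to_date(date(2000, 9, 1), 100))  # 2000-12-09
```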
Detected onset timings reflect the expected seasonality in the continent. Looking at the seasonal forecasts starting in January over Western Africa (Fig. 3, window 'Jan-Jun'), the rainy season onset occurs very early in the Gulf of Guinea and progressively later when moving north into the Sahel region. However, the bimodality of precipitation over the Gulf of Guinea is only partially captured. Over the Horn of Africa, two onsets are detected, one starting in April and one in September. Over Southern Africa, valid detections are found for the time-windows starting from July ('Jul-Dec' window) to December ('Dec-May' window). A different seasonality is diagnosed over the Cape Town region and along the Atlantic coast, which is characterized by a precipitation regime typical of the middle latitudes.

Probabilistic skill assessment
The probabilistic skill assessment is computed through the Ranked Probability Skill Score (RPSS, see Methods), which assesses the ability of the simulation ensemble to capture the observed anomalies (Fig. 4). The assessment is performed for the multi-model ensemble, weighting each model run equally.
The skill of the multi-model ensemble is consistently positive over several regions for several key starting dates, especially in South Africa, Eastern Africa and the Middle East (Fig. 4). In most cases, the extent of the areas with positive skill is larger than that of the areas with negative skill. In some cases, positive skill is found several months in advance. Assuming that the 'exploitable value' of the forecasts increases linearly with the time range of the anticipation, it is possible to define a proxy based on a factor that is equal to one if the onset is about to occur in the first month of simulation, equal to two if the onset is expected to occur in month two, etc. According to this idea, a timeliness skill value (TSV) could be simply defined as: where the onset is here expressed in months (or fractions of months) from the beginning of the simulation time-windows (i.e. in days/30.5). From this heuristic conceptual framework, we find that the areas with the most favourable conditions for rain onset forecasts (e.g. TSV greater than 0.3) are limited and scattered over most of the study domain (see Fig. 5). However, spatially and temporally consistent regions with promising results are found in the Fertile Crescent for starting dates ranging from June to October, and in Eastern Africa from May to November. Positive but less consistent applicability conditions are found for Southern Africa. Scattered and mixed results are found in other regions of Africa such as the Sahel (Fig. 5), where 'local' and 'regional' definitions of monsoon onset can be quite inconsistent with each other (Fitzpatrick et al., 2015). This issue might characterize other regions of Africa and suggests that there could also be some inherent bias related to the inconsistent spatial resolution between observations and model simulations, even though they are remapped on common grids, especially in regions with sparse surface stations, and some level of intrinsic predictability at very high resolution. This preliminary analysis suggests that the results of the seasonal forecast should be communicated with caution and always in association with the skill assessment. More extended analyses and improved communication of the results will be implemented after co-exploration with stakeholders, including farmers and extension workers, and co-development with users with a technical background, such as employees from agricultural departments and national meteorological and hydrological services (NHMS; see Discussion section).
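The timeliness skill value introduced above can be illustrated with a short sketch. Its defining expression is not reproduced in the text, so the product of the RPSS and the onset lead time in months is only one reading consistent with the description (a factor of one for an onset in the first month, two for the second, and so on); the function name and the example values are likewise hypothetical.

```python
def timeliness_skill_value(rpss, onset_days):
    """ASSUMED form: skill score multiplied by the onset lead time in months."""
    onset_months = onset_days / 30.5  # days to months, as stated in the text
    return rpss * onset_months


# A modest skill of 0.1 for an onset expected about three months ahead.
tsv = timeliness_skill_value(0.1, 91.5)
print(round(tsv, 3))  # 0.3
```

Under this reading, a low but positive skill becomes more valuable the earlier it is available relative to the onset, which is the intent of the heuristic.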
On a positive note, we found that the C3S multi-model ensemble outperforms any individual model ensemble in terms of skill (see Fig. S2-6). It is therefore advisable to implement a service based on larger model ensembles, possibly coupled with additional sources of information such as predictions of large-scale teleconnection patterns and climate indices (Fitzpatrick et al., 2015; Iizumi et al., 2021).
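The timeliness skill value introduced in this section can be illustrated with a short sketch. It assumes that the TSV is the product of the RPSS and the onset timing expressed in months (days/30.5), and that cells without positive skill are masked, as in Fig. 5. Both the exact functional form and the names used here (`timeliness_skill_value`, `rpss_map`, `median_onset_days`) are assumptions made for illustration, not the routines used in the study.

```python
import numpy as np

DAYS_PER_MONTH = 30.5  # conversion quoted in the text (days / 30.5)

def timeliness_skill_value(rpss, onset_days):
    """Weight the skill by the onset timing expressed in months.

    ASSUMPTION: TSV = RPSS * (onset_days / 30.5). Cells with
    non-positive skill are returned as NaN (masked).
    """
    rpss = np.asarray(rpss, dtype=float)
    lead_months = np.asarray(onset_days, dtype=float) / DAYS_PER_MONTH
    tsv = rpss * lead_months
    return np.where(rpss > 0, tsv, np.nan)

# Hypothetical values for three grid cells
rpss_map = np.array([0.10, 0.20, -0.05])
median_onset_days = np.array([61.0, 30.5, 91.5])
tsv_map = timeliness_skill_value(rpss_map, median_onset_days)
```

Under this assumption, a modest skill at a two-month lead time scores the same as twice that skill at a one-month lead time, which is the trade-off the heuristic is meant to express.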

Possible visualization options
Dealing with large ensembles of seasonal forecast simulations also requires thoughtful strategies for analyzing and visualizing the complexity related to high degrees of freedom in a synthetic and clear fashion, especially for climate service provision purposes. However, we considered it useful and interesting to maintain a strong link with the underlying data, reflecting the probabilistic nature of the prediction problem. To improve comprehension of probabilistic frameworks affected by the different models' biases, we suggest focussing on a single location and a single starting date through a kind of 'abacus plot' (Fig. 6). Such analysis is motivated by our previous experience with users (in the EU H2020 MedGOLD and FOCUS-Africa projects) who are mainly interested in specific locations. Comparing the current forecast with the previous ones was also considered particularly useful, as it relies on the historical memories and personal experiences of the end users. The following example shows, for a location in Tanzania, all the onset forecasts issued in September for the entire historical time series. The observed onset at this specific location occurs on average 100 days (i.e. the 9th of December) after the forecast time and shows considerable variability (SD = 17 days; Min = 76, Max = 149).
A considerable part of the analyses of seasonal forecast data consists of counting events occurring in predetermined frequency ranges defined, for instance, by percentile-based thresholds such as the terciles displayed in Fig. 6. As is evident from that plot, individual simulation results are often widely spread and not easy to interpret. Consensual predictions of a specific category are never found. This is, however, expected. In fact, large simulation ensembles are very useful to sample the probability of occurrence of events, reflecting the chaotic nature of the climate system. It is also worth noting that each simulation is equally probable, but we cannot know a priori which one will be closer to the observation, which is in principle to be considered a single manifestation of many possible realizations as well (Lorenz, 1963).
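The counting step described above can be sketched as follows. This is a minimal illustration, assuming that the tercile thresholds are estimated from each model's own hindcast sample (33rd and 67th percentiles, as in Fig. 6); the function name and the input values are hypothetical.

```python
import numpy as np

def tercile_frequencies(members, hindcast):
    """Fraction of ensemble members in the early/normal/late categories.

    members:  onset values (days since the start date) for one forecast
    hindcast: onset values of the same model over the reference period,
              used to estimate that model's own tercile thresholds
    """
    lower, upper = np.percentile(hindcast, [33, 67])
    members = np.asarray(members, dtype=float)
    early = np.mean(members < lower)
    late = np.mean(members > upper)
    return np.array([early, 1.0 - early - late, late])

# Hypothetical example: five members against a 100-value reference sample
freqs = tercile_frequencies([20, 30, 50, 70, 90], np.arange(1, 101))
```

Because the thresholds come from the model's own climatology, the resulting frequencies are expressed relative to that model and not to the observed onset dates.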
Fig. 6 reveals the different models' biases compared to the observed onset determinations. This is especially evident when comparing the tercile thresholds estimated for the individual models with the observed ones. This issue could be partially corrected by bias-correcting the model daily data before determining the onset. However, the probabilistic analysis we conducted does not strictly require this step, which is a great advantage.
Although agreement between the individual observed events and the simulated values was not common, it was sometimes observed, such as in 2015 and 2016. The level of probabilistic skill in predicting the occurrence of normal, early and late onset is quantified by the RPSS, which is slightly positive in this case. However, it is difficult to efficiently extract useful information from such a plot. For this purpose, and for practical application, it is possible to consider a 'condensed abacus plot' displaying aggregated information of the models' ensemble, such as the most probable tercile (Fig. 7).
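For reference, the RPSS for three tercile categories can be sketched with the standard textbook definition: the ranked probability score (RPS) is the sum of squared differences between cumulative forecast and observed probabilities, and the RPSS compares the mean RPS of the forecast with that of a climatological forecast assigning one third to each category. This is a generic sketch with hypothetical inputs, not necessarily the exact implementation used in this study.

```python
import numpy as np

def rps(prob, obs_cat):
    """Ranked probability score of one forecast.

    prob:    probabilities of (early, normal, late) onset
    obs_cat: index of the observed category (0, 1 or 2)
    """
    prob = np.asarray(prob, dtype=float)
    obs = np.zeros(prob.size)
    obs[obs_cat] = 1.0
    return np.sum((np.cumsum(prob) - np.cumsum(obs)) ** 2)

def rpss(forecast_probs, obs_cats, clim=(1 / 3, 1 / 3, 1 / 3)):
    """Skill relative to climatology; positive values beat climatology."""
    rps_forecast = np.mean([rps(p, o) for p, o in zip(forecast_probs, obs_cats)])
    rps_clim = np.mean([rps(clim, o) for o in obs_cats])
    return 1.0 - rps_forecast / rps_clim

# Hypothetical forecasts for two years with observed categories early and late
score = rpss([[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]], [0, 2])
```

A perfect categorical forecast yields an RPSS of one, and a forecast that always issues climatological probabilities yields zero.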
The 'condensed abacus plot' aims at delivering probabilistic information on the skill of the seasonal forecast models' ensembles for a particular location (Fig. 7). It is worth noting that the skill of the model ensemble median has not yet been strictly tested in our study. Moreover, since the onset predicted by the models could be biased with respect to the observations (see Fig. 6), quantile mapping has been applied to the modelled onset for plotting purposes in Fig. 7. This adds a further source of uncertainty in the information delivery. This visualization strategy will hence be tested and further co-designed with users. Hopefully, visual inspection of this plot allows potential users to assess the performance of the seasonal forecast for their location(s) of interest. In this case, the skill is positive (RPSS = 0.13), meaning that the models are more reliable than the climatology for decision making. There is, however, also a risk, as apparent from Fig. 6, that individual model results sometimes lead to wrong decisions. We will consider dropping these visualization options if they are not considered useful by the end users. In the next paragraph we show an already accepted concept for delivering the probabilistic results of the seasonal forecast multi-model ensemble.
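The quantile mapping step mentioned above can be illustrated with a simple empirical variant. The text does not specify which formulation was used, so the following is only a sketch under that assumption: each modelled onset is assigned its empirical probability within the model's reference sample and replaced by the observed quantile at the same probability. All names and values are hypothetical.

```python
import numpy as np

def quantile_map(model_values, model_ref, obs_ref):
    """Map modelled onset values onto the observed distribution.

    Values outside the range of the model reference sample are clamped
    to the observed minimum or maximum (behaviour of np.interp).
    """
    model_sorted = np.sort(np.asarray(model_ref, dtype=float))
    obs_sorted = np.sort(np.asarray(obs_ref, dtype=float))
    p_model = np.linspace(0.0, 1.0, model_sorted.size)
    p_obs = np.linspace(0.0, 1.0, obs_sorted.size)
    probs = np.interp(model_values, model_sorted, p_model)
    return np.interp(probs, p_obs, obs_sorted)

# Hypothetical case: the model onset is systematically 10 days too late
obs_ref = np.array([80.0, 90.0, 100.0, 110.0, 120.0])
model_ref = obs_ref + 10.0
mapped = quantile_map([110.0, 125.0], model_ref, obs_ref)
```

In this idealized case the mapping simply removes the 10-day offset; with real data it also adjusts the spread of the modelled onsets.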

Operational forecasts
Another interesting perspective, which can also depict the spatial variability of the predicted anomalies, consists in showing the results of a single set of simulations in a probabilistic fashion, i.e. the forecast issued on the most recent date, using the visualization shown in Fig. 8. With this type of visualization, the probability of the most likely onset anomaly is mapped at the grid level (Rembold et al., 2023). Here, Fig. 8 is actually produced using hindcast data, as an example for illustrative purposes. The same kind of visualization is in fact suitable for operational purposes (see e.g. https://mars.jrc.ec.europa.eu/asap/seasonal_forecast.php). In the proposed example, early onsets are predicted in South Africa and part of Madagascar. Late onsets are predicted for eastern Africa, which is the most prominent anomaly signal, and some parts of the Mediterranean coastal regions.
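A map of this kind can be derived from the ensemble as sketched below, assuming the onset values are stored as an array with dimensions (member, lat, lon) and that gridded tercile thresholds are already available from the hindcasts. The array layout and the function name are assumptions for illustration.

```python
import numpy as np

def most_likely_tercile(onset, lower, upper):
    """Most frequent tercile category and its frequency at each grid cell.

    onset:        array (member, lat, lon) of predicted onset timings
    lower, upper: arrays (lat, lon) with the 33rd and 67th percentiles
    Returns the category index (0 early, 1 normal, 2 late) and the
    fraction of members falling in that category.
    """
    early = (onset < lower).mean(axis=0)
    late = (onset > upper).mean(axis=0)
    normal = 1.0 - early - late
    probs = np.stack([early, normal, late])
    # In case of ties, argmax keeps the first category in the order above
    return probs.argmax(axis=0), probs.max(axis=0)

# Hypothetical ensemble: 4 members on a 1 x 2 grid
onset = np.array([[[10.0, 50.0]], [[20.0, 60.0]], [[30.0, 70.0]], [[90.0, 90.0]]])
lower = np.full((1, 2), 40.0)
upper = np.full((1, 2), 80.0)
category, probability = most_likely_tercile(onset, lower, upper)
```

The category would select the colour scheme (blue, grey or yellow-red in Fig. 8) and the probability the shade within it.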
To find an optimal visualization approach and information amount and content, the 'abacus plots' as well as their suggested modifications for visualizing a single location over many starting dates (i.e. Figs. 6 and 7), the proposed visualization of large-scale operational forecasts (Fig. 8) and the skill assessment analyses (Figs. 4 and 5) should be presented to users for their evaluation and inputs for improvements, preferably in a workshop setting. The workshop should include representatives of different stakeholder groups and allow enough time for the participants to understand, test and evaluate the information. The scientists can then collect feedback about how to make the information more intuitive and usable. A structured workshop discussion should allow for the exchange of ideas and for finding compromise solutions, in a participatory and inclusive way. This final stage of the co-development of the product's visualization implies an iterative process of testing and improving the climate service. Hence, those users who express interest and availability to stay in contact with the scientists after the workshop will be contacted again to test the improved visualization, and the process will continue until a usable product with optimal visualization is obtained.

Fig. 6. Rainy season onset timings detected for the observations and model simulations at a particular location in Tanzania and starting date for the period 1993-2016, expressed in days since the starting date of the simulations. Diamonds correspond to observations and circles to model simulations. Circles correspond to the individual model simulations' results. Solid lines represent the 33rd and the 67th percentiles computed for the overall period for models and observations. Onset values occurring between the 33rd and the 67th percentiles are considered normal events, displayed in yellow. Onset values below the 33rd percentile are considered earlier than normal events, displayed in blue. Onset values above the 67th percentile are considered later than normal events, displayed in red. The multi-model RPSS for this location and starting month is reported in the title of the plot.

Conclusions
We have presented a first assessment of the skill of seasonal forecasts over Africa in predicting the onset of the rainy season, a widely requested product for climate services that are being developed in Africa. We use a slightly modified version of a well-established algorithm for the onset determination that we apply to the Copernicus C3S seasonal forecast simulation ensemble. Given the importance of communicating the results and interacting with users in developing seasonal forecast products, which is emerging in recent climate service projects such as the EU H2020 MedGOLD and FOCUS-Africa, we devoted a large part of the paper to discussing a graphical method to deliver the probabilistic information of the forecasts. The envisaged further work will move this discussion from the scientific domain to an inclusive, transdisciplinary setting, allowing for knowledge exchange and co-development of the new service. Not only will the final product benefit from integrating scientific with local knowledge, but the coproduction process itself might build new, equal partnerships and promote testing and evaluation of the new products, adding to continuous improvements of the climate service.
Although the presented results are to be considered preliminary, this paper provides a useful reference for a general interpretation of the results of the onset determination from seasonal forecasts over Africa and for the corresponding computer code and routines. This will be further discussed and co-developed with operational meteorological services in Africa and with other local stakeholders. We identified some possible development lines to improve the quality of the information, which appear clear and promising from the scientific perspective. These suggestions will be further discussed and analyzed with stakeholders from different African regions to co-develop a usable service, a service "that is contextually relevant and suited to users' technical capacity" (Findlater et al., 2021).

Fig. 1. Idealized examples of rainy season onset timing determination for a) a single seasonal cycle and b-c-d-e-f) double seasonal cycles. The black solid line represents the daily precipitation values for one year and a half. The dashed line represents the annual average of the daily precipitation values. The black stars represent the starting dates of the time-windows covered by the seasonal forecasts. The coloured lines represent the cumulated daily precipitation anomalies with respect to the annual mean (right y-axis). The circles represent the onset detections corresponding to the minimum of cumulated anomalies occurring within the time-window of the seasonal forecasts. Triangles represent detections that occur outside the time-window under scrutiny. Panel b represents the special case where the two seasons have the same amount of total water. Panels c and d represent a more generic case where the second season brings more water than the first (i.e. 'major season' followed by 'minor' season). Panels e and f represent the other way round ('minor season' followed by 'major season'). Onsets in panels a, b, c and d are computed with the Liebmann method using the long-term daily precipitation as threshold. Panels e and f use an adaptive threshold defined as the mean daily precipitation over the six-month estimation window.
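The detection principle illustrated in Fig. 1 can be sketched as follows: daily precipitation anomalies with respect to a threshold are accumulated over the time-window, and the onset is located at the minimum of the cumulated anomalies. The sketch assumes a single window of daily values and treats a minimum at either edge of the window as a detection falling outside it; this edge rule and the function name are assumptions made for illustration and do not reproduce the full algorithm used in the study.

```python
import numpy as np

def detect_onset(daily_precip, threshold=None):
    """Day (index since the window start) of the minimum cumulated anomaly.

    threshold: reference daily precipitation. If None, the mean over the
               window is used (adaptive threshold, as in panels e and f).
    Returns None when the minimum falls on the first or last day, taken
    here (ASSUMPTION) as an onset occurring outside the window.
    """
    precip = np.asarray(daily_precip, dtype=float)
    if threshold is None:
        threshold = precip.mean()
    cumulated_anomaly = np.cumsum(precip - threshold)
    onset = int(np.argmin(cumulated_anomaly))
    if onset in (0, precip.size - 1):
        return None
    return onset

# Idealized window: 60 dry days followed by 120 wet days (10 mm/day)
precip = np.concatenate([np.zeros(60), np.full(120, 10.0)])
onset_day = detect_onset(precip)
```

In this idealized case the cumulated anomaly decreases during the dry spell and starts rising on the first wet day, so the minimum marks the end of the dry period.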

Fig. 2. Example of onset timing determination algorithm applied to the observed daily precipitation data starting on 1st September 2000 at 30E, 10S.

Fig. 3. Median observed onset timings over Africa determined over 6-month periods, starting at each month of the year (in days since the start of each month). The median is computed from CHIRPS using the same 24 years of the seasonal forecast hindcast period that are available in Copernicus (1993-2016). Regions where more than 33.3% of the onsets occur outside the 6-month time-windows (two-sided) are masked in white.

Fig. 4. Ranked Probability Skill Score (RPSS) of the rainy season onset timing assessed over the period 1993-2016 for the C3S seasonal forecast models' ensemble. Regions where more than 33.3% of the onsets occur outside the 6-month time-windows (two-sided) are masked in white. The text at the lower left corner of the panels indicates the percentage of area with positive skill and with skill larger than 0.1.

Fig. 5. Timeliness value of the skill computed for the twelve starting dates of the hindcast simulations conducted over the period 1993-2016. Only regions with positive skills are plotted.

Fig. 7. Rainy season onset timing detected for the observations and model simulations at a particular location and starting date for the period 1993-2016, expressed in days since the starting date of the simulations. Diamonds correspond to observations and circles to the model simulation ensemble medians (same as the top panel of Fig. 3). The colour of the circles indicates the most probable tercile predicted by the model ensemble. The size of the circles is related to the size of that particular model ensemble.

Fig. 8. Most frequent tercile predicted by the multi-model ensemble of simulations starting in September 2016. The frequency is indicated by three different colour schemes: shades of blue for earlier onset (onset detections smaller than the 33rd percentile are more probable), shades of grey for normal onset (detections between the 33rd and the 67th percentiles are more probable), shades of yellow-red for later onset (detections larger than the 67th percentile are more probable).