Importance of Madden–Julian oscillation phase to the interannual variability of East African rainfall

Precipitation across East Africa shows marked interannual variability. Seasonal forecast skill for the OND short rains is significantly higher than for the MAM long rains, which also exhibit poorly understood decadal variability. On sub‐seasonal time‐scales rainfall is influenced strongly by the phase of the Madden–Julian Oscillation (MJO); here we investigate whether this influence extends to interannual and decadal scales. We show that the number of days that the MJO is active and in phases 1–3 has a greater influence than the mean amplitude of the MJO on interannual long rains variability (ρ = 0.59 for the count of phases 1–3, compared to ρ = 0.40 for amplitude). The frequency of these days is linked to a newly identified gradient in Pacific sea‐surface temperatures (SSTs), whose influence on long rains variability we show is itself mediated by the MJO. We develop a statistical model estimating East African rainfall from MJO state, and show that the influence of the MJO on seasonal rainfall extends to the short rains, and to a lesser extent also into January and February. Our results show the importance of capturing the SST‐MJO phase relationship in models used for predictions of East African rainfall across time‐scales, and motivate investigating this further.

Predictions of EA seasonal rainfall have enabled early action, saving lives (Funk et al., 2018;Hillbruner & Moloney, 2012;Sheffield et al., 2014). Seasonal predictions, either from dynamical or statistical models, are more skilled for SR than LR rainfall; for example, Walker et al. (2019) reported a correlation of 0.69 for Met Office GloSea5 short rains predictions, but a LR value of 0.07. This discrepancy is largely because of control by the Indian Ocean Dipole (IOD) during the short rains (Black et al., 2003;Hirons & Turner, 2018;Walker et al., 2019). Some dynamical models do have skill for the LR (MacLeod, 2019), while moderately skillful statistical models for the LR have been constructed (Nicholson, 2014), with predictions based on Indo-Pacific SSTs enabling early warnings of EA drought events in all seasons (Funk et al., , 2018. Global models' sub-seasonal skill has recently improved, enabling accurate predictions of EA rainfall on the time-scale of weeks; this is in part due to improved model representation of the Madden-Julian Oscillation (MJO), and also to improved understanding of the impacts of the MJO on local East African weather (Zaitchik, 2017;de Andrade et al., 2021;MacLeod, Dankers, et al., 2021;Vashisht & Zaitchik, 2022). The MJO (Madden & Julian, 1971, 1972) plays a wellestablished role in the intraseasonal variability of rainfall across EA, with phases 1-4 (5-8) generally associated with wet (dry) anomalies through both rainy seasons (Berhane & Zaitchik, 2014;Hogan et al., 2015;Omeny et al., 2008;Pohl & Camberlin, 2006a). These relationships are typically well captured by models, excepting some heterogeneity across the region (Hogan et al., 2015;MacLeod, Dankers, et al., 2021). There is an inherent asymmetry between the rainy season wet and dry phases, with the MJO driving larger wet anomalies than dry (Pohl & Camberlin, 2006a;Vellinga & Milton, 2018), and notable examples exist of large amplitude MJO events contributing to the onset of extreme wet periods over EA (Kilavi et al., 2018;MacLeod, Dankers, et al., 2021;Wainwright et al., 2021).
Intriguingly, it has been found that the amplitude of the MJO plays an important role in LR interannual variability (Pohl & Camberlin, 2006b;Vellinga & Milton, 2018), with ρ = 0.51 during March-April when regressing mean MJO amplitude against the leading principal component of EA rainfall (Vellinga & Milton, 2018). EA rainfall has been linked to anomalous westerlies bringing moist air from the Congo basin (Finney et al., 2020), which are themselves linked to the MJO (Berhane & Zaitchik, 2014). Recently, Walker et al. (2020) showed that variations in the westerly winds have influenced LR variability on both decadal and interannual time-scales, and that a similar, albeit reduced, influence existed for the mean MJO amplitude.
The role of the MJO in notably wet years (Kilavi et al., 2018;Vellinga & Milton, 2018;Wainwright et al., 2021), and the recently identified role of MJO variability in some tropical rainfall trends (Roxy et al., 2019), raises the question to what extent EA interannual, and decadal, rainfall variability can be explained by its state. The MJO is known to be influenced by Indo-Pacific SSTs (Liu et al., 2016;Pang et al., 2016;Pohl & Matthews, 2007;Slingo et al., 1999;Suematsu & Miura, 2018), with substantial control of the MJO life-cycle (Dasgupta et al., 2021;Roxy et al., 2019). Understanding SST controls of the MJO during the long rains, and to what extent SST influences on EA are expressed through the MJO, should provide insight into factors which must be represented to accurately model and predict EA rainfall. This paper therefore, after defining our methods and study region (Section 2), first considers the role of the MJO (especially life-cycle phases) in the LR (Section 3.1), further examining relationships with Pacific SSTs. We then consider the MJO's role across all seasons (Section 3.2), before presenting our conclusions (Section 4).

| DATA AND METHODS
We use daily Climate Hazards Center InfraRed Precipitation with Station data (CHIRPS) rainfall data at 0.05 grid resolution, spanning 1981-2019. CHIRPS combines infrared satellite estimates of tropical precipitation with a global network of tropical ground-based gauges . We have repeated appropriate analyses using daily rainfall from the Integrated Multi-satellitE Retrievals for GPM (IMERG) algorithm, for which data are only available from 2000.
Results shown for Pacific SSTs use the Met Office HadSST dataset (Rayner et al., 2003) of monthly global SSTs at 1 grid resolution; we have repeated our analyses using the National Oceanic and Atmospheric Administration (NOAA) Extended Reconstruction SST and Optimum Interpolation datasets, finding similar conclusions. The latter dataset underpins the monthly NOAA SSTOI anomaly indices computed for all Niño regions, which we use to determine relationships with ENSO. To study the state of the MJO we use the Wheeler-Hendon (WH) indices generated daily by the Bureau of Meteorology. The WH indices RMM1 and RMM2 (Wheeler & Hendon, 2004) are computed as the leading empirical orthogonal functions of combined near-equatorially averaged zonal winds and NOAA satellite observations of outgoing longwave radiation (Liebmann & Smith, 1996).
MJO amplitude and phase are computed as the norm and arctangent of the RMM indices, respectively. An amplitude greater than 1 indicates an active MJO, while the eight phases capture the MJO's life-cycle (Wheeler & Hendon, 2004). Figure 1 shows how MAM precipitation anomalies across EA depend on this cycle, with wet anomalies typically appearing in phases 1-3, when the MJO convective core is located over the Indian Ocean. We measure the frequency of such events in a given period by counting the number of days in which the MJO is both active and in phases 1-3; we label this measure MJO123, and refer to such days as EA wet-phase days.
To capture the effects of the full range of MJO amplitudes and phases on EA rainfall, we generate a model rainfall index dependent solely on MJO state. Our method uses the WH dataset to allocate daily mean percentage rainfall anomalies to phase and amplitude bins. We then compute the mean anomaly for each bin, group bins by phase, and apply weighted linear regression, yielding a fit function b r θ, jRMMj 2 À Á comprising eight monotonic series of percentage rainfall anomalies. The rainfall estimate r est (t) is computed by applying such a fit-function to the mean daily climatological rainfall r d , The index r est forms a daily time-series, since it depends ultimately on the WH indices RMM1 and RMM2. We refer to r est as an RMM estimate (of precipitation). In generating the fit function one may calculate independent fits over intervals of the year; we use seasonal fits, so b r comprises 32 independent linear fits (one for each MJO phase in each season). We also use π/4 phase bins with an amplitude width of 0.5, and require bins to contain at least three data points for inclusion in the regression, which is weighted to the standard error of the bin means.
Our main study region is the bold outlined area in Figure 1, since this region (which we henceforth label East Africa) is associated with distinct bimodal annual rainfall and recent decadal LR drying trends (Liebmann et al., 2017). Unless stated otherwise, the study period for all results is 1981-2019. In keeping with previous To separate interannual and decadal variability, we follow Walker et al. (2020) and apply linear regression to datasets from which long-term trends are removed by fitting, and removing, a cubic polynomial. Hereafter, such filtered series are referred to as detrended. The expected change in mean rainfall between P1 and P2 due to the observed change in the state of an external variable is calculated as the product of the gradient of the regression line and the observed change in variable mean from P1 to P2 (equation 1 of Walker et al., 2020). If the expected change in mean rainfall between P1 and P2 is approximately equal to the observed decadal change, then this supplies evidence that the mechanism linking rainfall and external variables on interannual time-scales also holds on decadal time-scales.
We calculate the Pearson correlation coefficient (ρ) for key indices, with p-values for significance estimated using a two-sided t-test. Full degrees of freedom are assumed, since key indices do not contain significant serial correlation. Correlations taken after first applying the cubic polynomial filter to indices are referred to as detrended correlations (ρ dt ). To test for mediation in the relationships of causal variables we use the regression framework outlined in Kolstad and MacLeod (2022). The statistical significance of composite mean differences is estimated using a Welch t-test without assumption of equal sample variance.
The drying trend in LR precipitation between P1 and P2 is clear (Figure 2a), whereas the decline in MJO123 count is far less striking. Applying a t-test to the mean values in P1 and P2 (indicated with horizontal lines in Figure 2a), precipitation change is highly significant, whereas MJO123 is far from significant.
To provide a quantification of the contribution of the MJO123 count to decadal LR variability, the approach of  The observed decline in mean rainfall from P1 to P2 is Δr obs = À0.43 ± 0.10 mm day À1 , while the change in the mean MJO123 count is À1.95 ± 3.96 days. From the regression equation, this amounts to an expected change in rainfall of Δr exp = À0.06 ± 0.12 mm day À1 ; only $14% of the observed decline in rainfall can thus be attributed to the decadal change in MJO123 count. Repeating for mean MAM MJO amplitude yields a very similar level of variance.
What drives the interannual variability in the MJO123 count? Indo-Pacific warmpool SSTs are known to influence the MJO life-cycle (Dasgupta et al., 2021;Roxy et al., 2019): given the relationship in Figure 2, it is pertinent to understand if any similar mechanism may influence LR variability through the MJO. Figure 3a displays the distribution of correlations between MAM MJO123 counts and mean February-March (FM) SSTs, when the observed correlations are strongest (variations by month discussed later). As is typical for such significance testing, some false rejections of the null hypothesis will occur by chance, hence significance thresholds should be stricter for spatial data (Katz & Brown, 1991;Wilks, 2016)-standard local levels quoted here are for guidance only, while we have also shown contours where p < 0.01. The correlations are positive across the bulk of the Indian Ocean, but rarely significant. Instead, the largest area of significant positive correlation lies in the central Pacific, while the only area of significant negative correlation lies in the southeastern Pacific. To quantify this dipole structure we identify two representative regions in Figure 3a, A1 (160 E to 160 W, 0 N to 12 N) and A2 (110 W to 85 W, 18 S to 6 S). We then define three SST indices: A1, A2, and the gradient A1 À A2. Regressing each index calculated over FM against MAM MJO123 count, for A1 (A2) ρ = 0.42 (À0.40), p < 0.05, while for A1 À A2, ρ = 0.53, p < 0.01 (all ρ dt values similar).
The corresponding values taken against LR precipitation are ρ = 0.37 (ρ dt = 0.41) for A1; ρ = À0.34 (ρ dt = À0.38) for A2; and ρ = 0.45 (ρ dt = 0.49) for A1 À A2. Again p < 0.05 for the indices individually, and p < 0.01 for the gradient. Note all values are lower for rainfall than MJO123 count. The corresponding spatial distribution of SST correlations with LR precipitation is plotted in Figure 3b, and again shows a gradient between significant positive (negative) correlation in the central (southeastern) Pacific. The primary difference from Figure 3a is the V-shaped pattern of negative (often significant) correlation in the western Pacific, especially at the apex of the V. A similar pattern has been found in previous studies examining the role of western Pacific SSTs in the LR drying trend (Funk et al., 2018(Funk et al., , 2019Funk & Hoell, 2015;Lyon & DeWitt, 2012). Comparing the distributions suggests that the relationship of western Pacific SSTs with EA rainfall totals is more independent of the MJO. Indeed, taking the FM means for region V1 shown in Figure 3b (150 E to 155 E, 15 S to 7 S), the correlation against LR rainfall is significant (ρ = À0.47, ρ dt = À0.38), but this is not the case for MJO123, where ρ = À0.26 (ρ dt = À0.28).
The differing relationships of the SST indices A1 À A2 and V1 to the MJO (and LR precipitation) can be further demonstrated by testing for mediation. First, consider the regression The results for this regression (FM A1 À A2, MAM MJO123 count) are reported in Table S1, showing that a 1 is not statistically significant (even though the direct correlation between LR and A1 À A2 is highly significant). Following Kolstad and MacLeod (2022), this result implies that the link between LR and A1 À A2 is mediated through MJO123. In other words, LR precipitation is conditionally independent of A1 À A2, when accounting for the relationship to MJO123. When the analysis is repeated using FM V1 in place of A1 À A2, the coefficient a 1 this time remains significant (Table S2), indicating that the impact of V1 on EA is not mediated through MJO123. Moreover, when regressing the two SST indices against LR (without MJO123; Table S3a), the coefficients for A1 À A2 and V1 are both significant; however as soon as MJO123 is added (Table S3b), the A1 À A2 coefficient is no longer significant. We therefore conclude that both SST indices account for significant independent variability in LR precipitation, but that the MAM MJO123 count contains all (or at least almost all) of the explanatory power of FM A1 À A2.
Considering monthly variations in A1 correlation with MAM MJO123, Figure 3c shows significant correlations starting from the preceding October, beginning to wane in April, and disappearing entirely in June. Similar regions surrounding the Maritime Continent are far less stable. For A2 (not shown) the persistent period is shorter, spanning January to April. Figure 3d quantifies these trends by plotting monthly correlations for the relevant SST indices, showing that correlations peak during FM. The period with strongest correlations is January-April, suggesting seasonal control of the MAM MJO. Similar results hold for correlations against LR precipitation, but with the strongest relationship in April (not shown).
The time-series of the FM SST indices are compared against that for LR precipitation anomalies in Figure 3e. For the SST indices interannual variability is clearly present, but decadal variability less-so. The regression of detrended FM A1 À A2 values against LR rainfall (MAM) is shown in Figure 3f; the regression equation is r = 0.39 s + 0.00, σ slope = 0.11 mm K À1 . The change in the decadal mean index from P1 to P2 is Δs = À0.16 ± 0.19 K; regressing, the SST index explains only $15% of the observed decline in LR precipitation. This differs from the decadal influence on LR rainfall exhibited by western Pacific SSTs; for example, applying the same methodology to region V1, around 80% of the LR drying trend is explained.
Regions A1 and A2 lie close to the ENSO regions, with A1 in particular resembling ENSO4 (5 S to 5 N, 160 E to 150 W). There therefore exist significant correlations between the FM ENSO indices and the indices defined here. However, there is no significant correlation between any mean FM ENSO index and either the MAM MJO123 count or LR precipitation; the strongest such relationship is with ENSO4, where ρ = 0.23, p = 0.16 against both variables. 1 These results agree with previous assessments of the impact of ENSO on LR precipitation (de Andrade et al., 2021), and suggest that the mechanism through which Pacific SSTs influence the MAM MJO and LR precipitation is separate to ENSO.

| Interannual variability of seasonal EA rainfall
The MJO plays an important role in interannual variability of the EA long rains; does this extend to other seasons? Figure 4a plots the correlation and R 2 coefficients between mean EA seasonal rainfall and three measures of the MJO: mean RMM estimate precipitation index for EA, described in Section 2; mean MJO123 count; and mean count of all active days. The first two measures have similar strength relationships in both MAM and OND, suggesting the MJO explains $30% of interannual variability of both rainy seasons; lower correlations in January-February (JF) and, especially, June-September The negligible difference between the RMM estimate, fitted to all MJO phases and amplitudes, and simple MJO123 count emphasises the key role played by the frequency of EA wet-phase days in interannual rainfall variability. The same correlations are plotted for mean seasonal MJO amplitude in Figure 4b. Comparing against the RMM estimate, it is apparent that MJO amplitude has a weaker relationship with EA seasonal rainfall than measures of phase frequency, especially during the short rains. As noted by Kimani et al. (2020), phase 2 magnitude has the strongest relationship, but this is only significant in MAM. We find similar results for monthly means (not shown).
The correlation between the RMM estimated and CHIRPS observed time-series for mean LR precipitation (shown in Figure 4c) is ρ = 0.58 (ρ dt = 0.53); for both results p < 0.01. The decline in correlation after detrending suggests the RMM estimate may have stronger decadal variability than previously considered MJO measures. Figure 4d shows the regression of detrended RMM and CHIRPS LR precipitation anomalies. The regression equation is r = 2.41r est + 0.00, σ slope = 0.63, and the change in the RMM estimate from P1 to P2 is Δr est = À0.05 ± 0.05 mm day À1 . Applying the regression equation yields an expected change in rainfall Δr exp = À0.12 ± 0.13 mm day À1 , or $28% of the observed change. This is higher than equivalent results for MJO123 or mean amplitude, but still relatively small compared to the standard error.

| CONCLUSIONS
We have investigated the role of the phase of the MJO in modulating interannual and decadal variability of the East African long rains. We have shown that the MAM count of days in which the MJO is active and in phases 1-3 (MJO123) has a significant correlation (ρ = 0.59) with LR precipitation, and that this relationship is stronger than the relationship with alternative counts of MJO days, or with measures of mean MJO amplitude. After detrending to isolate interannual from decadal variability, the correlation increases to ρ dt = 0.61.
The influence of MJO phase on LR variability is not substantial for decadal time-scales. We found that there is a decline in the frequency of MAM MJO123 days from 1981-1997 (P1) to 1998-2011 (P2), but that this is not statistically significant, unlike the decline in LR rainfall over the same period. The decadal change in MJO123 count explains only $14% of the decline in LR rainfall. A similar level of variance can be attributed to mean MJO amplitude.
We have demonstrated evidence of a control of MJO123 frequency from an SST gradient spanning the central and eastern Pacific. A dipole between two dominant SST regions of significant correlation with the MAM MJO123 counts was shown to exist, which we have quantified through a new SST index (A1 À A2). The strongest relationship between this dipole and the MJO exists across FM (ρ = 0.53, ρ dt = 0.52), but we have shown that a significant relationship is present between the MAM MJO and SSTs in all boreal winter months preceding the long rains.
To our knowledge, such a relationship with MJO phases 1-3 has not previously been reported. Roxy et al. (2019) have shown that increased Indo-Pacific warm-pool area increases the duration of MJO events in phases 4-6 during boreal winter, at the expense of phases 1-3. Our results suggest a different SST dipole anomaly impacts phase counts in boreal spring. The SST gradient A1 À A2 we have related here to the MJO lies east of the zonal gradient over 'western-V' warmpool SSTs previously linked to the LR drying trend, and, in stark contrast to that region, exhibits little decadal variability, explaining only $15% of the decrease in LR rainfall from P1 to P2.
The FM correlation of the SST gradient A1 À A2 to LR values (ρ = 0.45, ρ dt = 0.49) is lower than that with the MJO123 count, suggesting that the influence of this Pacific SST gradient on LR precipitation is expressed through control of the MJO. This hypothesis is further supported by evidence of mediation of the A1 À A2 influence through the MJO123 count. Again, this is in contrast to western Pacific SSTs (here, the V1 index), for which we find strong correlation with LR rainfall totals, but little correlation with MJO123, and no evidence of mediation. We have not investigated the detailed mechanisms underpinning the relationships shown here, and this should be a priority for future work, especially in light of recent advances for the OND short rains (Vashisht & Zaitchik, 2022).
The influence of the MJO phase also holds during the SR, with $30% of interannual variability in both seasons explained by the corresponding MJO123 phase count. Near identical seasonal correlations were found from a novel daily estimate of EA precipitation which accounts for MJO phase variability and linear relationships of rainfall with MJO amplitude; this similarity of correlation highlights the importance of phases 1-3. The modelestimate does however capture a higher proportion of the decadal variability of EA rainfall, explaining $28% of the observed LR decline between P1 and P2.
The MJO amplitude is typically used to measure the MJO when considering teleconnections to interannual LR variability. Our results suggest that emphasis should also be placed on the frequency of MJO events across the Indian Ocean (i.e., phases 1-3). Furthermore, the relationship of this frequency to Pacific SSTs (A1 À A2), as found in this paper, offers a potential source of improved MJO predictability itself (Stan et al., 2022). Accordingly, the representation of the relationships detailed in this paper should be investigated in both forecast and climate models. For example, although modelling of both the MJO and EA rainfall has improved in CMIP6 models (MJO propagation especially), persistent biases such as underestimated MJO contributions to variability, and underestimated LR accumulations, remain (Ahn et al., 2020;Bartana et al., 2022;Le et al., 2021;Lyon, 2022). There are substantial uncertainties in climate change projections of the long rains (Rowell, 2019): ensuring that models represent the relationships observed here is an important step towards further reducing such uncertainties.