Storm surge variability and prediction from ENSO and tropical cyclones

Storm surges are among the deadliest natural hazards, but understanding and prediction of year-to-year variability of storm surges is challenging. Here, we demonstrate that the interannual variability of observed storm surge levels can be explained and further predicted, through a process-based study in Hong Kong. We find that El Niño-Southern Oscillation (ENSO) exerts a compound impact on storm surge levels through modulating tropical cyclones (TCs) and other forcing factors. The occurrence frequencies of local and remote TCs are responsible for the remaining variability in storm surge levels after removing the ENSO effect. Finally, we show that a statistical prediction model formed by ENSO and TC indices has good skill for prediction of extreme storm surge levels. The analysis approach can be applied to other coastal regions where tropical storms and the climate variability are main contributors to storm surges. Our study gives new insight into identifying ‘windows of opportunity’ for successful prediction of storm surges on long-range timescales.


Introduction
Extreme sea levels, mostly owing to high tides and meteorological storminess, pose great risks to coastal regions. The coasts in the western North Pacific (WNP) are extraordinarily exposed to the natural hazards related to tropical cyclones (TCs) Tsimplis 2014, Edmonds et al 2020). Understanding of the variability and predictability of extreme sea levels is crucial for coastal management as it may significantly reduce extreme sea level damage. An extreme sea level event consists of mean sea level, a tidal contribution, and a tidal residual that is usually dominated by storm surge event. On interannual timescales, mean sea level in the WNP is associated with large-scale climate variability in the atmosphere and the ocean Cheng 2017, Cha et al 2021), while the interannual variability in tides can be well predicted using the equilibrium tidal theory (Feng et al 2015b). Identifying drivers influencing storm surge levels on interannual timescales is key to understanding and prediction of the year-to-year variability in extreme sea levels. Apart from affecting mean sea level, the largescale climate variability can also affect the storminess. The El Niño-Southern Oscillation (ENSO) is the most important interannual mode in the WNP (Goh and Chan 2010). ENSO can affect the genesis, track and intensity of WNP TCs by altering the amount of sea surface energy available for TC activity and by modulating the background steering flow in which TCs are embedded Sobel 2005, Feng et al 2020). However, the teleconnection of ENSO to TCs that affect the WNP coasts is complicated and may not be linear (Feng et al 2022). In El Niño years, WNP TCs tend to form in the central tropical Pacific and become more intense when making landfall in the west due to the longer lifetime over the ocean than in La Niña years (Camargo and Sobel 2005). However, these storms are less likely to strike the WNP coasts (e.g. the southeast Asia and south China) in El Niño years due to multiple factors, e.g. a higher possibility of northward recurving tracks and an eastward shift in the average position of TC genesis associated with a weak North Pacific Subtropical High (Chan and Liu 2004, Wu et al 2004, Kim et al 2011, Guo and Tan 2018, Song et al 2020, Zhao et al 2020, Chu and Murakami 2022. On the other hand, in the decaying seasons of El Niño events, associated with the warm sea surface temperature anomalies in the Indian Ocean (Xie et al 2009), the anomalous easterly winds in the South China Sea (SCS) may favour more TCs to generate locally, thus increasing the storm surge risk on the coasts. Furthermore, the multiple types of ENSO events, such as Modoki and mega-ENSO (Ashok et al 2007, Kim et al 2011, Wang et al 2013a, 2013b) may also alter the ENSO effect on WNP TCs. Therefore, the modulation of TCrelated storm surges on the WNP coasts by ENSO is not straightforward due to the complexity of the TC-ENSO relationship.
Leaving ENSO aside, although TCs play an important role in extreme sea level events on the coasts of the tropical and subtropical regions Tsimplis 2014, Zhang andSheng 2015), it remains a question which TC metrics are useful to describe the intensity and occurrence of storm surges on interannual timescales. Storm surges are influenced by multiple TC factors (e.g. intensity, size, translation speed, landfalling angle relative to the coast) and non-TC factors (e.g. local bathymetries). So far, no metrics of regional TCs have been found to be able to explain the variability in storm surges between different events and years (Irish et al 2008, Towey et al 2021. Identifying such TC metrics will help to explain the interannual variation of storm surge and then translate the TC prediction to storm surge prediction. In this paper, we will introduce a new analysis approach to decompose the joint effects of ENSO and TCs on the interannual variability of extreme sea levels on the WNP coasts. In this approach, several independent predicators are identified to interpret storm surge variability. And then we develop an empirical model to predict the storm surge levels for all exceedance probabilities. This framework is demonstrated in Hong Kong because (a) this is one of the most vulnerable areas for storm surge hazards (Hallegatte et al 2013), (b) there are more than 10 TCs affecting Hong Kong each year on average (Feng and Tsimplis 2014), ensuring sufficient samples of TC events in our analysis, and (c) Hong Kong has one of the longest and most reliable sea level records along the WNP coasts. This method can be applied to other coastal regions where TCs and large-scale climate modes play important roles in storm surge variability.

Sea level and TC observations
The study is based on the hourly sea level data in Hong Kong over the period 1962-2020. Data were recorded at North Point station from 1962 to 1986, after which the station moved to Quarry Bay, half a kilometre away. There is an upward offset of 1.02 cm in the data before 1986. After shifting the data by 1.02 cm, a 59 year sea level record was obtained (Ding et al 2001, Feng andTsimplis 2014). Quality control was carried out prior to the analysis, including the removal of duplicate records and obvious outliers caused by earthquakes and tsunamis. Augmented Dickey-Fuller test was performed on the tide gauge records and we confirmed the stationarity of annual and long-term sea level records. The T-tide Matlab box (Pawlowicz et al 2002) was used for tidal harmonic analysis. In the harmonic analysis, 58 tidal constituents (both amplitude and phase) are statistically fitted from the hourly sea level record for each year, based on a least square fit. The seasonal cycle (i.e. annual and semi-annual cycles) is included in the tidal analysis. The nodal correction in lunar tides is not applied, to avoid the discrepancy of the theoretically predicted nodal variations against observations (Feng et al 2015b). Details on the tidal analysis can be found in Pawlowicz (2002).
The predicted tides and annual mean sea level are then subtracted from hourly sea level data, yielding the tidal residuals (TRs), on a yearly basis. TR can be expressed as: TR = Hourly sea level − Mean sea level − Tides.
(1) For 57 out of the 59 years, the complete rate of hourly TR is above 95%, indicating a good quality for serial analysis. The exceptions are 1982 and 1984, with valid data covering 87% and 78% of time. Sensitivity analysis of data integrity has been performed by removing the 2 years from the analysis series.
Six-hourly TC track records were obtained from the Regional Specialized Meteorological Center Tokyo of Japan Meteorological Agency. Over 1962-2020, there are 1558 TCs recorded in the WNP basin (0-40 • N, 100-180 • E). TC track data include position and minimum sea level pressure, with maximum sustained wind speed recorded since 1978.

ENSO index
Niño 3.4 is used as the ENSO index, with monthly data obtained from the National Oceanic and Atmospheric Administration (NOAA). The ENSO index was defined by the standardized sea surface temperature anomalies in the Niño 3.4 region (5 • S-5 • N, 170 • W-120 • W) averaged over July-October when TCs are most active in the WNP. We also tested different averaging windows for the 4 month mean values of Niño 3.4, and found the July-October average has the best correlation with TRs.

Data analysis
The TC-influenced area is defined by a moving circle along the TC track positions with an 8 • radius (Frank 1977, Feng andTsimplis 2014). At every hourly time of the TC lifetime, when Hong Kong is within the TC-influenced area, TRs at this time are then identified as TC-related TRs. To make sure the same temporal resolution in TC and TR data, the 6-hourly TC tracks were linearly interpolated to hourly interval. We also evaluated the sensitivity of our results to the choice of the radius for TC-influenced area by changing the influence radius gradually from 5 • to 9 • , and we found the same conclusions.
We used percentile analysis to assess temporal variations of TRs (Jones et al 1999, Camargo et al 2007, Antunes 2011. For a given year, the hourly values of TR measurements were sorted into an ascending order. The largest sample size of hourly TR values for one year is 8784 in a leap year. The TR value at each percentile from 1 to 99 th was then estimated based on a linear interpolation. Combining the 99 th value in each year, for example, gets a time series of TR at the 99th percentile. Linear correlation coefficient is calculated between TR at each percentile and ENSO (or TC metrics). Pearson coefficient is used to evaluate the correlation. Considering the serial dependence in geophysical data (Lu et al 2020), an adjustment of effective sample size is used here for the evaluation of significant level. Following previous work (Clark and Chu 2002), the effective sample size is calculated by: where N is the original sample size of serial variables x, y (e.g. TR and ENSO indices), and γ ix and γ iy are the autocorrelation coefficients within the serial variables at lag i, respectively. The reduction of effective numbers of degrees of freedom due to the serial dependence may induce an increase in p-value.
We utilized the composite analysis to evaluate the associated wind circulation at the surface when TCs approach or affect Hong Kong, to understand the TC effect on TRs on long-time range. In this analysis, the satellite era 1979-2020 is chosen considering the reliability of TC tracks (Knaff et al 2010, Torn andSnyder 2012). Land and sea surface pressure, and low-level winds (at 10 m and 850 hPa), at 6 hourly intervals, from the ECMWF fifth generation climate reanalysis (ERA5) (Hersbach et al 2020) were used. To eliminate the effect of the longer-term variability, the daily climatology and annual mean were removed from the 6 hourly atmospheric data. The 6-hourly anomalies were finally used as the synoptic-scale fluctuations in the atmosphere. Figure 1(a) (red line and shading) shows the means and interannual variations (standard deviation) of TRs at different percentiles in Hong Kong throughout 1979-2020. From the lowest to the highest percentiles, the TR mean increases steadily from −28 cm to 60 cm, with 0 around the median percentiles. The amplitudes in the year-to-year variation also have larger values (2-18 cm) in low and high percentiles, and smaller in median percentiles. In our study, we include the low percentiles because they are critically important to inform the coastal activities, for which a minimum water level is required, such as vessel passage and water extraction. To understand the role of TCs in TRs, we confine TR values to the time when they are associated with landfalling and approaching TCs (figure 1(a), black line and shading), namely TC-related TRs (see section 2.3 Data analysis for details). The means and variations of TC-related TRs are much higher than the overall TRs at each percentile, by about 8-21 cm and 3-20 cm, respectively. The remaining TRs (non-TC-related) (figure 1(a), blue line and shading) have smaller mean values and variations than the overall TRs for the high percentiles, while for low-to-middle percentiles the difference between the overall and non-TC-related TRs is small. This suggests that TCs are an important factor for high percentiles of TRs. Now, we investigate the role of ENSO in the interannual variability of TRs. ENSO has significantly (at a 95% confidence level, unless stated otherwise) negative correlation with high percentiles of TRs (74-99 th ), with the strongest correlation (r) at the 90 th percentile (r = −0.66) ( figure 1(b), red line), explaining 44% of the total variability. ENSO has significantly positive correlation with the low-to-middle percentiles of TRs (1-54 th ), with r = 0.30-0.60. We find that this ENSO-TR relationship can be partly explained by the ENSO effect on TC-related TRs ( figure 1(b), black line). The high percentiles (85-96 th ) of TC-related TRs have a significant relationship with ENSO, with r = −0.44 to −0.31. This means that in El Niño years water level peaks (high percentiles) associated with TCs have smaller values than in La Niña years, suggesting a reduced impact from TCs.

Interannual TR variability related to ENSO
To test which metrics of local TCs are responsible for the large interannual variability of TC-related TRs ( figure 1(a), black shading), we calculated the annual number, the annual accumulated cyclone energy (Bell et al 2000) and the average intensity of TCs that are within the area with an 8 • radius around Hong Kong. We find that none of these metrics of TCs has significant correlation with TC-related TRs at any percentile higher than 90 th and lower than 20 th at the same time (shown in figure S1 in supplementary). This is consistent with the previous study on the east coast of the United States (Towey et al 2021), which also found that storm surges cannot be straightforward represented by a proxy of regional TC activity. The lack of relationship with direct measures of TC reflects the complexity in the storm surge processes. We also confirm that ENSO has no significant correlation with any of these metrics of TC either. The large interannual variability of TC-related TRs in Hong Kong is likely caused by the nonlinear integration of different aspects of regional TCs (e.g. frequency, intensity, translation speed and landfalling position). We speculate that ENSO is a skilful descriptor capturing the integral effect of TCs on the high percentiles of TRs.
For non-TC-related TRs, ENSO is negatively correlated with high percentiles and positively correlated with low percentiles ( figure 1(b), blue line). The strongest correlation occurs at the 99 th percentile (r = −0.54) and the 36 th percentile (r = 0.63). This confirms that ENSO can also depict other unidentified factors (i.e. non-TC factors) that significantly affect TRs in Hong Kong. These non-TC factors are complicated. They may include weak tropical storms, which have impacts on TRs but are not included in the TC records (Velden et al 2006, Torn andSnyder 2012), and the monsoons, which can affect TRs via the river discharge and the barotropic forcing (Feng et al 2021). The ENSO modulations of intra-seasonal climate variability (e.g. the Madden-Julian Oscillation), which could affect TRs in Hong Kong, may also be relevant.

Interannual TR variability unrelated to ENSO
In this subsection, we focus on the part of interannual variability in TRs that is unrelated to ENSO. To remove the ENSO effect, a linear regression on the yearly ENSO index with a linear least-square method is performed on the timeseries of TR and TC data. The TR and TC values determined by the ENSO index are then removed from the original timeseries, to obtain the timeseries of the non-ENSO-related TR and TC data. We find that after removing the effect of ENSO, again, none of the metrics of TCs affecting Hong Kong can explain the ENSO-filtered TRs (figure S2). In the next, we investigate TR indicators based on the spatial patterns of TC tracks, after the ENSO effect is excluded.

(d) Correlation coefficients between TRs and TCS and TCE, after the ENSO effect is removed. Thick lines indicate the correlations that pass the significance test at a 95% confidence level.
Correlations between the ENSO-filtered TC numbers and TRs are shown in figure 2(d). Without the ENSO effect, the TC E frequency is significantly associated with TRs at the 78-99 th percentiles, with the largest correlation r = −0.56 (explaining 32% of the total variability) at the 90 th percentile. For the low-to-median values of TRs (11-44 th percentiles), they are positively correlated with the TC E frequency (r = 0.31-0.55). The correlation between the TC S frequency and TRs is positive (r = 0.31-0.43) for the upper percentiles (84-95 th ) and negative (r = −0.5 to -0.3) for the low percentiles (1-14 th ). Neither the TC E nor TC S frequencies have a significant effect on the median values of TRs, related to small interannual TR variability ( figure 1(a)). The correlation between the TC E and TC S frequencies is not significant, indicating that they are statistically independent.
To further interpret the relationship between the ENSO-filtered TRs and landfalling TC frequency, we analyse composite anomalies of surface pressure and low-level winds when TCs strike the coasts (figure 3). In figures 3(a) and (b), the composites are based on the TCs in the neutral years (23 years) of ENSO, to eliminate the ENSO effect. For the East China Coasts TCs, an anomalous low-level cyclonic circulation, related to a low pressure (5-10 hPa lower than climatology), appears in the Taiwan Strait. This is accompanied by a weak anti-cyclonic circulation related to a high pressure around Japan (2-4 hPa higher than climatology) (figure 3(a)). The low-level cyclonic circulation means an offshore wind in Hong Kong, which pushes water away from coastline, causing TR values to reduce. The low-pressure anomaly in Hong Kong associated with the East China Coasts TCs may increase TRs by around 2 cm via the inverted barometer effect (Feng et al 2015a). However, the negative effect of the East China Coasts TCs by offshore winds dominates over the positive effect by low-pressure ( figure 2(d)). The positive correlations at low percentiles are expected to relate to the onshore wind anomalies and/or low-pressure before these TCs make landfall.
For the South China Coasts TCs, a low-level cyclonic circulation anomaly appears in the north part of the SCS (figure 3(b)). The low pressure and onshore wind anomalies increase TR values in Hong Kong, leading to the positive TC-TR correlations at high percentiles ( figure 2(d)). Similarly, before landfall and/or in the remnant stage, the South China Coasts TCs have negative correlations with TRs at low percentiles. The cyclonic circulation anomaly is stronger for the East China Coasts TCs than for the South China Coasts TCs. Figures 3(c) and (d) show composite anomalies in the whole period , suggesting that the ENSO phases do not substantially change the relationships between TRs and TCs.

A statistical prediction model from ENSO and TC indices
After identifying the drivers of year-to-year TR variability, a linear prediction model is devised to TR values at various percentiles. The ENSO index (Nino3.4), and the numbers of TCs landfalling on the south coast (TC S ) and on the east coast (TC E ) after removing the ENSO effect, are the three independent predictors. Before feeding the prediction model, the three predictors are normalized and nondimensionalized by their standard deviation. In a prediction year t, the predicted TR (TR p ) at each percentile (p) can be expressed as: where a, b, c are regression coefficients between predictand and predictors, and d is a constant representing the regression intercept. The regression coefficients are estimated individually for each prediction year. We use cross-validation to gain the regression coefficients: for each prediction year, a, b and c are calculated using 37 (N − 5) years of predictand and predictors with a 5 year omitted block centred on the year of prediction. TR p (t) in the prediction year t is then predicted based on the predictors for this year. After predicting TR for each year, we get a timeseries of predicted TR (p) . Correlation coefficient between observed TR and TR (p) for each percentile is then calculated to evaluate the performance of the prediction. Note that our statistical model is only utilized to predict the percentiles of TRs with standard deviation > 1 cm (i.e. this excludes the median percentiles as the variability is very small). Different methods of cross validation are tested in figure S3 and the results are similar. We evaluate the performance of both individual and overall predictors, by calculating the correlations and root mean square error (RMSE) between the observed and predicted TR values during 1981-2018 (figures 4(a)-(d)). As expected, ENSO has a significant skill (r = 0.37-0.64) in predicting the variability in TRs at low (1-42 th ) and high (75-99 th ) percentiles. The non-ENSO-related TRs can be well predicted by the frequencies of non-ENSO-related TC S and TC E , with r = 0.33-0.50. By combing the three predictors, the performance increases up to r = 0.36-0.79 for low and high percentiles. The best performance is achieved at the 90 th percentile, with r = 0.79 (predict 60% of the total variability). RMSE for each individual and all predictors are less than 1.5 cm at most percentiles, but increase rapidly at extreme high percentiles. The largest RMSE occurred at 99 th percentile in all four cases but are still less than 5 cm. Figure 4(e) (solid lines) illustrates the predicted timeseries of the 90 th percentile against the observations during 1981-2018. At the 99 th percentile, the performance of the prediction model is still significant (r = 0.57). We also extend the prediction period to 1962-1978 (figure 4(e), dashed lines), with r = 0.70. The slightly lower performance in the early period is likely due to the large uncertainty in TC track observations in the pre-satellite era (Knapp andKruk 2010, Feng et al 2021), which reduces the reliability of the TC predictors. We tested other cross-validation methods (figure S3) by changing the training and validating periods. We found that the performance of the predictors is not dependent on the validation methods. To investigate the effect of data integrity, we removed TR, ENSO and TC number data in 1982 and 1984, yielding a 40 year timeseries. Model performance is shown in figure S4 and main results are not affected.

Conclusions
In this paper, we first introduce an analysis approach to identify the effects of ENSO and TCs on the interannual variability of extreme sea levels on the WNP coasts. The approach proceeds from the 42 years of sea level records in Hong Kong which is one of the most vulnerable areas for storm surge hazards and has reliable sea level records. We find that ENSO significantly modulates the high (⩾70 th ) and low (⩽50 th ) percentiles of TRs, through altering TC and other forcing factors that affect Hong Kong sea levels. After removing the effect of ENSO, the interannual variability in remaining TRs can be significantly explained by the frequencies of TCs making landfall in south China and east China. We further confirm that this relationship is related to the barotropic effect of TCs on TRs.
In the last part of this paper, based on the significant effects of ENSO and TCs, an empirical prediction model for interannual TR variations is constructed. The model has good performance for TRs at both low and high percentiles, with the best skill for the 90 th percentile (r = 0.79), i.e. predicting up to 60% of the variance. The model is also capable of predicting the most extreme values, such as the 99 th percentile. This indicates the high predictability of annual TRs in Hong Kong and promises potential of the method for operational use.
The analysis approach in this paper can be applied to other coasts where tropical storms and climate variability are main contributors to storm surges, such as the Indian coasts and the east coast of the United States. Useful statistical and dynamical models have been developed recently to predict the ENSO and regional TCs in the WNP (Camp et al 2019, Wang et al 2019, Chu and Murakami 2022, Feng et al 2022. These offer an unprecedented opportunity to predict extreme sea levels months or seasons ahead by using the dynamically produced predictors in the prediction model for storm surges.