Dynamic genesis potential index for diagnosing present-day and future global tropical cyclone genesis

The tropical cyclone (TC) genesis potential index (GPI) has been extensively used to understand the processes governing climate variability and future change of TC genesis (TCG). However, the relative roles of thermodynamic versus dynamic environmental factors in TC genesis remain elusive, especially in a warming world. Here we show that four leading dynamic factors, the 850 hPa absolute vorticity, 500 hPa vertical motion, tropospheric vertical wind shear, and 500 hPa shear vorticity of zonal winds, are objectively identified by logarithmic stepwise regression analysis from 11 dynamic and thermodynamic candidate factors. We further demonstrate that results from a TC-permitting global model confirm the four leading dynamic factors as the most influential in both the present-day simulation and the future projection under global warming. A dynamic GPI, consisting of the four dynamic parameters, provides a diagnostic tool for understanding future change of TC genesis. It also improves skill in representing interannual variations of TCG frequency in the western Pacific and Southern Hemisphere oceans.


Introduction
It has been widely recognized since Gray's (1968) pioneering work that tropical cyclone (TC) genesis requires conducive large-scale environmental conditions, including sea surface temperature (SST), the planetary vorticity (latitude), the low-level relative humidity, and the magnitude of the vertical wind shear. Gray (1979) developed the yearly genesis parameter (YP), which was able to replicate the main features of the seasonal and spatial variability of observed TC genesis. Several follow-up studies have been devoted to improving the quantitative linkage between global TC genesis (TCG) number and environmental conditions (e.g. Royer et al 1998, Emanuel and Nolan 2004, Tang and Emanuel 2010, Tippett et al 2011). The genesis potential index (GPI) has been widely used to diagnose interannual and interdecadal variability of TCG in the North Atlantic Ocean (NA) (Gray 1979, Watterson et al 1995, Royer et al 1998, Goldenberg et al 2001) and to understand the processes by which the El Niño-Southern Oscillation (ENSO) impacts TCG globally (e.g. Camargo et al 2007). GPI values increase when large-scale conditions are favorable for TC genesis. The formulation for the GPI developed by Emanuel and Nolan (2004, ENGPI hereafter) is as follows:

$$\mathrm{ENGPI} = \left|10^{5}\,\zeta_{a850}\right|^{3/2} \left(\frac{\mathrm{RH}_{600}}{50}\right)^{3} \left(\frac{\mathrm{MPI}}{70}\right)^{3} \left(1 + 0.1\,V_{s}\right)^{-2} \qquad (1)$$

where RH 600 denotes the relative humidity (%) at 600 hPa; MPI represents the maximum potential intensity (m s -1 ), which is an empirical value determined by the vertical structure of temperature and moisture and by SST (Bister and Emanuel 1998); V s is the magnitude of the vertical wind shear (m s -1 ) between 200 and 850 hPa; and ζ a850 is the absolute vorticity (s -1 ) at 850 hPa. The definition of MPI is based on Emanuel (1995), as modified by Bister and Emanuel (1998):

$$\mathrm{MPI}^{2} = \frac{C_{k}}{C_{D}}\,\frac{T_{s} - T_{o}}{T_{o}}\,\left(h^{*}_{o} - h^{*}\right) \qquad (2)$$

where C k and C D denote the surface enthalpy and momentum exchange coefficients; T s is the sea surface temperature; T o is the outflow temperature; h * o is the saturation moist static energy of the sea surface; and h * is the saturation moist static energy of the free atmosphere.
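As a rough illustration of how the four ENGPI factors combine, the sketch below evaluates the commonly quoted form of the Emanuel and Nolan (2004) index. The function name `engpi` and its argument names are hypothetical, chosen here for illustration; the inputs are assumed to be in the units listed above.

```python
import numpy as np

def engpi(abs_vort_850, rh_600, mpi, shear):
    """Commonly quoted form of the Emanuel-Nolan (2004) GPI.

    abs_vort_850 : 850 hPa absolute vorticity (s^-1)
    rh_600       : 600 hPa relative humidity (%)
    mpi          : maximum potential intensity (m s^-1)
    shear        : 200-850 hPa vertical wind shear magnitude (m s^-1)
    Names are illustrative assumptions, not taken from the paper.
    """
    abs_vort_850 = np.asarray(abs_vort_850, dtype=float)
    rh_600 = np.asarray(rh_600, dtype=float)
    mpi = np.asarray(mpi, dtype=float)
    shear = np.asarray(shear, dtype=float)
    return (np.abs(1e5 * abs_vort_850) ** 1.5
            * (rh_600 / 50.0) ** 3
            * (mpi / 70.0) ** 3
            * (1.0 + 0.1 * shear) ** -2)
```

Because the index is a product of powers, its logarithm is linear in the logarithms of the individual terms, which is what makes a log-linear regression a natural way to fit or revise such an index.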
Since the GPIs are derived from climatological mean data, the extent to which they can explain interannual variability on the regional or basin scale remains controversial (Watterson et al 1995). By comparing four TCG indices, i.e. ENGPI, Gray's (1979) Yearly Genesis Parameter, the Modified Yearly Convective Genesis Potential Index of Royer et al (1998), and the index of Tippett et al (2011), Menkes et al (2012) found that none of the GPIs can reproduce interannual variations in the observed total TCG frequency, especially in the Indian Ocean and western North Pacific Ocean (WNP).
The large-scale factors controlling TC genesis may vary with time scale (Wang and Moon 2017). Using the ENGPI, Camargo et al (2009) found that vertical wind shear and MPI play a minor role, whereas the midlevel relative humidity makes the most significant contribution to the Madden-Julian Oscillation (MJO)-related genesis potential anomalies, followed by the low-level absolute vorticity. On the other hand, the 850 hPa relative vorticity weighted by the Coriolis parameter and the 500 hPa vertical motion are found to be the most effective factors controlling intraseasonal TC genesis in both boreal winter and summer (Wang and Moon 2017, Moon et al 2018), suggesting a primary role of ambient circulation factors in modulating TCG on the intraseasonal time scale.
The GPIs have also been widely used to explain the physical processes behind the projected future changes of TCG (Yokoi and Takayabu 2009, Yamada et al 2010, Murakami et al 2011, Yokoi et al 2012). However, the relevance of the GPIs' thermodynamic factors in explaining TC changes under global warming has been challenged (Camargo et al 2014). The inconsistency between the TCG number and GPI in the projected future changes (Gualdi et al 2008, Yokoi and Takayabu 2009) may arise because GPI is optimized for the present-day climate, in which MPI is a critical element. The MPI increases with rising sea surface temperatures (SSTs) and always projects a significant increase in genesis potential under global warming. However, the threshold SST value for TC genesis in a warming world will also increase (Wehner et al 2015, Sugi et al 2015), casting doubt on the applicability of the MPI-related GPI to understanding future change of TCG. On the other hand, Murakami et al (2013) found that dynamic variables are of primary importance for separating developing and non-developing disturbances in the present-day climate in the WNP, and that this relationship remains unchanged in a future warmer climate.
An improved understanding of the relationships between TCG and large-scale climate variables is of critical importance for predicting climate variation on various time scales and for understanding the projected future change of TCG. Our objective is to understand the relative roles of the dynamic and thermodynamic factors that affect the TCG potential on regional and global scales. Of particular interest is the applicability of a GPI derived from the present-day climatology to future projections in a warming world and to representing interannual variability of TCG on a basin or regional scale. With the knowledge and understanding gained, a new GPI is proposed and evaluated.

Datasets
The observed TCG is determined from IBTrACS (version v04r00; Knapp et al 2010) between 1979 and 2017. IBTrACS consists of TC data compiled by multiple organizations. In this study, we used the combination of the National Hurricane Center and the Joint Typhoon Warning Center data. We consider only the TCs with tropical storm intensity (i.e. surface wind speeds of 35 kt) or above. TC genesis is defined as the first time a storm's intensity reaches 35 kt. All the genesis positions were counted for each grid box (2.5° × 2.5° or 5° × 5°) within the global domain, and the total count was defined as the TCG frequency.

The successful simulations of TCs were mainly due to incorporating a new deep convection scheme (Yoshimura et al 2015), rather than to the effect of high resolution (Murakami et al 2012). Two 25 yr experiments were conducted. The first is for the present-day period, and the second is for the future warmer climate state. The model simulation is forced by prescribing lower boundary conditions of SST and sea ice concentration (SIC). The observed monthly SST and SIC (HadISST1; Rayner et al 2003) are prescribed for the present-day experiment. The future projection is conducted with prescribed future SST, SIC, and atmospheric concentrations of greenhouse gases (GHG), including CO2, and aerosols, based on the Intergovernmental Panel on Climate Change (IPCC) Special Report on Emission Scenarios (SRES) A1B scenario (Solomon et al 2007). The A1B scenario assumes a future world of rapid economic growth, low population growth, and the rapid introduction of new and more efficient technologies, resulting in a CO2 concentration of about 700 ppmv at the end of this century. The future changes and trends of SST and SIC were estimated from the ensemble mean of 18 models from the World Climate Research Programme's Coupled Model Intercomparison Project phase 3 (CMIP3; Meehl et al 2007) under the IPCC A1B scenario.
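The genesis definition and the gridded counting described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names, the track representation (parallel sequences of latitude, longitude, and wind speed in knots), and the 0-360° longitude convention are all assumptions made here.

```python
import numpy as np

def first_genesis_point(lats, lons, winds_kt, threshold_kt=35.0):
    """Return (lat, lon) of the first track point at or above the threshold.

    Returns None if the storm never reaches tropical storm intensity.
    """
    for lat, lon, wind in zip(lats, lons, winds_kt):
        if wind >= threshold_kt:
            return lat, lon
    return None

def genesis_frequency(points, box_deg=5.0):
    """Count genesis points per lat-lon box.

    Rows run from 90S northward and columns from 0E eastward
    (an assumed layout). box_deg is the box size in degrees.
    """
    lat_edges = np.arange(-90.0, 90.0 + box_deg, box_deg)
    lon_edges = np.arange(0.0, 360.0 + box_deg, box_deg)
    lats = np.array([p[0] for p in points], dtype=float)
    lons = np.array([p[1] for p in points], dtype=float) % 360.0
    counts, _, _ = np.histogram2d(lats, lons, bins=[lat_edges, lon_edges])
    return counts
```

A typical use would apply `first_genesis_point` to every storm track, drop the `None` results, and pass the remaining points to `genesis_frequency` with `box_deg=2.5` or `box_deg=5.0`.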

Method for deriving the TC GPI
Let Y and X i represent the TCG potential index (GPI) and the influential factors for TCG, respectively. We first calculate the correlation coefficients between log Y and log X i using climatological monthly mean data. Stepwise regression and the F-test are then used to select the significant top-ranking factors that yield the best performing, multi-variable, linear-logarithmic regression equation:

$$\log Y = \sum_{i} a_{i} \log X_{i} + b \qquad (3)$$

where the subscript 'i' denotes the selected factors, a i are the regression coefficients, and b is the intercept. Finally, we transform the linear-logarithmic equation into a nonlinear GPI by taking the exponential of both sides of equation (3). The stepwise regression selects the influential factors in sequential order by maximizing the regressed fractional variance at each step (Jennrich and Sampson 1968). Fisher's F-test is used to test the significance of the newly added factor at each step, based on its contribution to the increase of the regressed variance. This process continues until no statistically significant factor can be selected. The climatological monthly mean data from January to December in the climatological TCG domains are used for both X i and Y. The climatological TC domain is defined by all grids where the 39 yr (1979-2017) total TCG number exceeds a threshold N c . Three resolutions were tested: 2.5° × 2.5°, 5° × 5°, and 10° × 10°. A 9-point smoothing was applied to the 2.5° × 2.5° and 5° × 5° grid cells to obtain smooth distribution patterns. The smoothing conserves the total number of TCG. Sensitivity tests indicate that N c = 1.5, 5, and 20 provide optimal results for the 2.5° × 2.5°, 5° × 5°, and 10° × 10° grid cells, respectively. The three optimal domains are similar for the three grid sizes.
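The selection procedure can be sketched as a forward stepwise regression on the log-transformed variables. The version below is a simplified illustration under stated assumptions: it uses a fixed F-to-enter threshold (`f_to_enter`, a hypothetical parameter set to 4.0 here) in place of a formal significance level from the F distribution, it never removes a factor once selected, and the function names are invented for this example.

```python
import numpy as np

def fit_rss(X, y):
    """Least-squares fit with intercept; return (residual sum of squares, coefficients)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid), coef

def forward_stepwise(log_X, log_y, f_to_enter=4.0):
    """Forward selection of columns of log_X that best explain log_y.

    At each step the candidate giving the largest drop in residual
    variance is tested with a partial F statistic; selection stops when
    the statistic falls below f_to_enter (an assumed threshold).
    Returns (selected column indices, [intercept, coefficients...]).
    """
    n, p = log_X.shape
    selected = []
    rss_cur, coef = fit_rss(log_X[:, selected], log_y)
    while len(selected) < p:
        best = None
        for j in range(p):
            if j in selected:
                continue
            rss_new, coef_new = fit_rss(log_X[:, selected + [j]], log_y)
            if best is None or rss_new < best[1]:
                best = (j, rss_new, coef_new)
        j, rss_new, coef_new = best
        dof = n - len(selected) - 2  # residual degrees of freedom of the larger model
        if dof <= 0:
            break
        f_stat = np.inf if rss_new <= 0 else (rss_cur - rss_new) / (rss_new / dof)
        if f_stat < f_to_enter:
            break
        selected.append(j)
        rss_cur, coef = rss_new, coef_new
    return selected, coef
```

Given the returned intercept b and coefficients a i, the nonlinear index follows by exponentiating the fitted relation: Y = exp(b) multiplied by the product of X i raised to the power a i.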

Objective selection of controlling factors for observed TC genesis
TCG locations are strongly constrained by the two factors recognized by Gray (1968), i.e. SST higher than 26.5°C and the pre-TC vorticity seeding being sufficiently far from the equator that the Coriolis force can effectively spin up a TC (figure 1). These two factors can be considered necessary conditions for TCG. As introduced in equation (1), Emanuel and Nolan (2004) used four influential factors to estimate the TCG potential: absolute vorticity at 850 hPa (ζ a ), relative humidity at 600 hPa (R), MPI (V pot ), and vertical wind shear between 200 and 850 hPa (V s ). To explore possible basin dependence of the GPI, we consider seven more potential factors (table 1). All factors are expressed in logarithmic form, and their values range approximately from 0.4 to 2.7 when computed from the climatological monthly mean values. The adjustment of the logarithmic range acts as a normalization so that the ranges of variation are comparable among the candidate factors. The stepwise regression result is not sensitive to the range of the logarithm.
Why do we introduce the seven new factors? The absolute vorticity is a combination of two factors, and it is not clear which individual element is region-dependent, so we test f and ζ r separately. The 500 hPa vertical pressure velocity (ω) has been shown to be an important factor in the NA and for TCG on the intraseasonal time scale (Wang and Moon 2017). The 500 hPa vorticity due to meridional shear of zonal winds (U y ) and the zonal wind convergence at 850 hPa (U x ) were suggested to be important in the WNP and the zonal wind confluence zone. Unlike the total vertical shear, which depends only on the magnitude of the vertical shear, the vertical shear of zonal wind distinguishes easterly from westerly vertical shear. From a dynamic standpoint, easterly vertical shear, in contrast to westerly vertical shear, favors the development of low-level synoptic waves and TCs when its amplitude is not too large. The SST anomaly relative to the tropical (30°S-30°N) mean SST (SST a ) is selected over the NA because previous studies have shown that the total number of TC genesis is substantially correlated with the relative SST anomalies in observations (e.g. Latif et al 2007, Swanson 2008, Vecchi et al 2008, Villarini et al 2010, Villarini and Vecchi 2012) and in dynamical models (Zhao et al 2010, Villarini et al 2011, Ramsay and Sobel 2011, Murakami et al 2012, Knutson et al 2013). The stepwise regression can objectively select factors in sequential order and determine an optimum combination of multiple factors.

Figure 1. The domain where the TCG number exceeds 1.5 (3.0) is outlined by black solid (dotted) lines. The red contour indicates 26.5°C for the climatological mean SST derived from observations. The purple contour indicates the climatological mean SST, i.e. the zero line of the relative SST anomaly (SSTa). Blue text denotes the six ocean basins: North Indian Ocean (NI); western North Pacific Ocean (WNP); eastern North Pacific Ocean (ENP); North Atlantic Ocean (NA); South Indian Ocean (SI); and South Pacific Ocean (SP).

We measure … 1). The ensemble mean reflects the majority of the five reanalysis results. Over the global domain, the first four selected factors are ζ a , ω, V s , and the meridional shear vorticity of the 500 hPa zonal winds (U y ). In the SH, the same four are selected, but U y is selected before V s . In the NH, the first three are selected but U y is replaced by the Coriolis parameter. The fourth factor thus reflects a hemispheric difference. As shown in figure 1, the SH TCG concentrates in a narrow latitudinal zone between 10°S and 20°S and from 50°E to 170°W along the Southern Indian Ocean and Southwest Pacific convergence zones, where U y is more relevant than the earth's rotational effect. On the other hand, the NH TCG tends to cover a significantly larger latitudinal range, especially over the WNP and NA, where the Coriolis parameter is important. Results in table 3 explain how the four factors are selected for the global domain and why the others are not. Among the 11 parameters, three selected dynamic factors have the highest correlation coefficients (r) with the TCG frequency: ζ a (r = 0.42), V s (r = −0.40), and ω (r = 0.39). The fourth factor, U y , is selected not because of its correlation with the observed TCG frequency but because it is complementary to the other three factors. Why are the thermodynamic factors, SST a , R, and V pot , not selected? The relative humidity, R, is highly correlated with ω (r = 0.84). This high correlation reflects a physical linkage between the two: 500 hPa ascent-related moisture convergence tends to moisten the lower troposphere and increase the 600 hPa relative humidity. Thus, the role of R can be well represented by the 500 hPa vertical motion.
The SST a is significantly correlated with TCG frequency (r = 0.33), but it correlates even more strongly with ω (r = 0.65) and V s (r = −0.51); therefore, the stepwise regression considers it redundant after V s and ω are selected. The MPI (V pot ) has an insignificant correlation with TCG frequency (r = 0.25) and is highly correlated with SST a (r = 0.94). Overall, the thermodynamic factors identified in earlier works are represented by the corresponding large-scale environmental dynamic factors once SST exceeds a critical value.

The GPI in a high-resolution GCM's present-day simulation and future projection
For the future projection of TCG, it is important to use large-scale factors that are suitable for both the present-day climate and the future warming climate. For this purpose, we have conducted parallel analyses using the outputs of the MRI 20 km-mesh, TC-permitting model (Murakami et al 2012) for the present-day simulation and future projection experiments. The large-scale climate variables and the model-resolved TCG numbers allow the relationship between the TCG and GPI to be established. Stepwise regressions are performed using the model outputs for each of the three grid resolutions (2.5° × 2.5°, 5° × 5°, and 10° × 10°) and for the global domain and each of the six ocean basins. The selection score for each variable is defined by equation (4), where i denotes each of the three resolutions, j denotes each ocean basin, and x ij denotes the order in which the predictor is selected for the ith resolution and jth ocean basin.
Here we considered seven domains (the six ocean basins defined in figure 1 plus the global domain). The score is counted only when a variable is selected in the first five steps. A higher selection score indicates that the variable tends to be selected in earlier steps, representing its importance in the stepwise regression selection. Figure 2 shows that the dynamical factors are consistently identified as the most influential factors in the present-day simulation and the future projection. For the present-day climate, V s , ω, and ζ a are the most critical factors, followed by f, ζ r , V zs , and U y , while the thermodynamic factors are not prioritized (figure 2(a)). Thus, the model results derived from the present-day simulation are generally consistent with those obtained from the five reanalysis datasets. In the model's projected future warming climate, V s , ω, and U y are the most important factors, followed by f, ζ a , and ζ r (figure 2(b)). Again, the thermodynamic factors are not prioritized. The model simulation results indicate that the large-scale dynamical control of TC genesis tends to be stable from the present-day climate to the future global warming environment, suggesting that a set of dynamic controlling factors may be adequate for understanding the future change in TCG probability under anthropogenically induced warming.

Table 3. Mutual correlation coefficients among the 11 predictors and the predictand TCGF using the ensemble mean of the five reanalysis datasets within the domain of SSTa ≥ 0 (or SST ≥ 26.5°C) on the 5° × 5° grid cells. Boldface indicates a correlation coefficient that is statistically significant at the 99% level by Student's t-test.

A new dynamic GPI for inferring global TC genesis
Using the four selected dynamic factors and the monthly climatological data on the 10° × 10° grids, we established an approximate dynamic GPI (DGPI) formula, equation (5), in which the terms of V s , ζ a , ω, and U y are defined in table 1. In addition, we assume that TC genesis occurs at least 5° away from the equator and where SST is higher than 26°C. Therefore, DGPI is set to zero over the grids where SST < 26°C or where the latitude is within 5° of the equator. Originally, the stepwise regression yields regression coefficients of −1.7, 2.3, 3.3, 2.4, and −11.8 for V s , U y , ω, ζ a , and e, respectively. To build a simple form, we used the lower rounded power values 2, 3, and 2 for the numerator factors U y , ω, and ζ a , and, to balance the decreased powers in the numerator, we used −1 for the denominator factor V s , so that the DGPI computed by equation (5) is a good approximation. We compared the DGPI in the exact form with the DGPI in the approximate form and found that the differences are not statistically significant.
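The exact-form index implied by the quoted regression coefficients, together with the SST and latitude masks, can be sketched as below. This is an illustration only: the per-factor terms are taken as already transformed, strictly positive inputs (their definitions are in table 1 and are not reproduced here), the dictionary keys and the function name `dgpi_exact` are hypothetical, and the strict inequality used for the 5° latitude mask is an assumption.

```python
import numpy as np

# Regression coefficients quoted in the text for the exact form.
POWERS = {"vs": -1.7, "uy": 2.3, "omega": 3.3, "zeta_a": 2.4}
INTERCEPT = -11.8  # exponent of e

def dgpi_exact(terms, sst_c, lat_deg):
    """Exact-form dynamic GPI from pre-computed, positive factor terms.

    terms   : dict with keys 'vs', 'uy', 'omega', 'zeta_a' (assumed names)
              holding the table-1 terms on the same grid as sst_c
    sst_c   : sea surface temperature (deg C)
    lat_deg : latitude (degrees, positive north)
    """
    sst_c = np.asarray(sst_c, dtype=float)
    lat_deg = np.asarray(lat_deg, dtype=float)
    log_gpi = np.full(sst_c.shape, INTERCEPT, dtype=float)
    for name, power in POWERS.items():
        log_gpi = log_gpi + power * np.log(np.asarray(terms[name], dtype=float))
    gpi = np.exp(log_gpi)
    # Zero the index where SST is below 26 C or within 5 degrees of the equator.
    masked = (sst_c < 26.0) | (np.abs(lat_deg) < 5.0)
    return np.where(masked, 0.0, gpi)
```

Replacing the powers with the rounded values (−1, 2, 3, 2) gives the structure of the approximate form; any compensating constant used in equation (5) is not assumed here.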
To evaluate the diagnostic skill of the DGPI, we used the monthly mean large-scale variables derived from the ensemble mean of the five reanalysis datasets on a 2.5° × 2.5° grid for the period from 1980 to 2017. Spatial correlation and root-mean-square error (RMSE) are computed for ENGPI and DGPI with reference to the observed TCG. Figure 3 shows that both GPIs can realistically capture the observed spatial patterns of TCG for the NH during May through October and for the SH during November through the following April, although ENGPI moderately underestimates the maximum values over the North Pacific and South Indian Ocean (SI) (figure 3(b)).
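The two skill measures used here can be computed as in the following minimal sketch. It assumes the diagnosed and observed fields are on the same grid, skips grid points where either field is missing, and applies no area weighting (whether the original analysis weighted by latitude is not stated). The function name is hypothetical.

```python
import numpy as np

def pattern_stats(diagnosed, observed):
    """Return (spatial correlation, RMSE) between two gridded fields.

    Grid points where either field is NaN are ignored. No area
    weighting is applied (an assumption of this sketch).
    """
    d = np.asarray(diagnosed, dtype=float).ravel()
    o = np.asarray(observed, dtype=float).ravel()
    valid = np.isfinite(d) & np.isfinite(o)
    d, o = d[valid], o[valid]
    r = np.corrcoef(d, o)[0, 1]
    rmse = np.sqrt(np.mean((d - o) ** 2))
    return float(r), float(rmse)
```

The correlation measures how well the spatial pattern is reproduced, while the RMSE is also sensitive to amplitude errors such as the underestimated maxima noted for ENGPI.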
Figure 3. Comparison of (a) the observed TCG numbers and the TCG numbers diagnosed by (b) ENGPI and (c) DGPI. For the NH, the climatological mean TCG number from May through October is shown; for the SH, the TCG number from November to the following April is shown. The GPIs are applied to the ensemble mean of the five reanalysis datasets. Spatial correlation (R) and root-mean-square error (RMSE) are shown in the top left box of the corresponding panels for ENGPI and DGPI. The data cover the period 1980-2017.

The spatial distribution of TCG shows large year-to-year variability in association with ENSO. Camargo et al (2007) have shown that the ENSO-related TCG variability (the difference between El Niño and La Niña years) diagnosed by ENGPI is in good agreement with observations. Following Camargo et al (2007), we computed the composite anomalies of DGPI for all El Niño years (figures 4(a)-(c)) and La Niña years (figures 4(d)-(f)) separately. Both ENGPI and DGPI display ENSO-related spatial anomaly patterns similar to those observed. DGPI shows a larger spatial contrast and a slightly higher spatial correlation than ENGPI during El Niño years. Results in figures 3 and 4 indicate that the DGPI and ENGPI have comparable skills in simulating the climatology and the response of TCG to ENSO.

Figure 5. Correlation skills of the GPIs in diagnosing the interannual variation of the basin-total TCG. Shown are the correlation coefficients between the observed and diagnosed interannual variations of the basin-total TCG number. Blue bars denote ENGPI, whereas red bars denote DGPI. The numbers over the bars are the correlation coefficients. An asterisk indicates statistical significance at the 95% level by Student's t-test. The data period is 1980-2017. The ocean basins are defined in figure 1.

Figure 5 shows that the DGPI and ENGPI exhibit somewhat different strengths in diagnosing interannual variations of the basin-total GPI values over individual basins. Because the TC season in the SH starts in October and ends in April, using the calendar year to count the annual total TC number is inadequate. For this reason, we used the TC year, which starts in June and ends in the following May, to describe year-to-year variability for both the NH and SH TCG. Over the NA, both indices achieve their highest correlation skill (r = 0.77), suggesting that the NA basin-scale TCG number can be best diagnosed by using large-scale environmental parameters. In the WNP, the interannual variability of the TCG number is much better represented by DGPI (r = 0.55) than by ENGPI (r = −0.04). In the eastern North Pacific Ocean (ENP), the ENGPI (r = 0.63) is better than the DGPI (r = 0.37). Over the SH, in both the South Pacific Ocean (SP) and SI, the DGPI performs better than ENGPI. The main reason is that in the SH oceans, the 500 hPa U y is the third leading factor for TCG (table 2). In the NI, both indices have their lowest skills, indicating that special effort should be made to represent the TCG variability during the transitional monsoon seasons. Overall, the dynamic GPI shows significant skills in representing TCG variability in the NA (r = 0.77), WNP (r = 0.55), SP (r = 0.48), SI (r = 0.38), and ENP (r = 0.37). These correlations are statistically significant at the 95% confidence level.
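The TC-year bookkeeping and the interannual correlation skill can be sketched as follows. This is a minimal illustration with assumed conventions: a TC year is labeled by the calendar year in which it starts (June), and significance is judged from the standard t statistic for a correlation coefficient with n − 2 degrees of freedom. The function names are hypothetical.

```python
import numpy as np

def tc_year(year, month):
    """Map a calendar (year, month) to the June-May TC year.

    The TC year is labeled by the calendar year of its June start
    (a labeling convention assumed for this sketch).
    """
    return year if month >= 6 else year - 1

def annual_totals(years, months, values):
    """Sum monthly basin totals into TC-year totals, sorted by TC year."""
    totals = {}
    for y, m, v in zip(years, months, values):
        key = tc_year(y, m)
        totals[key] = totals.get(key, 0.0) + v
    return dict(sorted(totals.items()))

def correlation_skill(observed, diagnosed):
    """Return (r, t) for two annual series; t has n - 2 degrees of freedom.

    Assumes |r| < 1 so that the t statistic is finite.
    """
    r = float(np.corrcoef(observed, diagnosed)[0, 1])
    n = len(observed)
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return r, float(t)
```

The t value can then be compared with the critical value of Student's t distribution for the chosen significance level and n − 2 degrees of freedom.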

Conclusion and discussion
We applied logarithmic stepwise regression with F-test to objectively assess the relative prominence of 11 large-scale environmental dynamic and thermodynamic factors that are potentially conducive to TCG in the global ocean and regional ocean basins. The most prominent factors selected for the global ocean by using five different reanalysis datasets are the 850 hPa absolute vorticity ζ a , 500 hPa vertical velocity ω, and vertical wind shear V s . The 500 hPa meridional shear of zonal winds U y is the fourth factor for the global ocean and third in the SH oceans (table 2). The thermodynamic factors (i.e. the 600 hPa relative humidity; SST a ; and maximum potential intensity) are highly correlated with and well represented by the dynamic factors (e.g. ω or V s ) (table 3).
A new dynamic genesis potential index (DGPI) for recognition of TCG potential in global oceans is established by using ζ a , ω, V s , and U y . The DGPI and ENGPI have comparable ability to portray climatological mean distribution and the relationship between TCG and ENSO (figures 3, 4). For representing the year-to-year variations of basin total GPI numbers, the DGPI is better than ENGPI in the WNP, SP, and the South and North Indian Ocean (SI, NI), comparable over NA, but less skillful in the ENP (figure 5).
We demonstrate, using the MRI TC-permitting GCM experiments, that the four dynamical controlling factors are stable and selected for both the present climate and the future warming world (figure 2), suggesting that DGPI can be used to understand the causes of TCG changes under global warming.
A note of caution: although the thermodynamic factors are not selected, this does not mean that they are physically unimportant. The dynamic control of TCG implies that the dynamic factors can well represent the influence of these thermodynamic factors. More importantly, the dynamic factors are shown to be suitable for both the present-day climate and the future climate under significant global warming. The thermodynamic factors that are derived from the present-day climate, on the other hand, might not be fully applicable to the future climate, as they may be sensitive to the effects of sea surface warming.
Several issues deserve further investigation. First, the selected factors show some basin-dependent features, but the first three dynamic factors, ζ a , ω, and V s (or its zonal component V zs ), remain the most common selections (supplementary table S1 (available online at https://stacks.iop.org/ERL/15/114008/mmedia)). The differences are seen in some of the complementary factors selected, including SST a in the WNP and NA, U y in the Southern Oceans, and R in the North Pacific. We have applied the stepwise regression to each basin and built basin-dependent GPIs, which show better skills in some ocean basins (e.g. WNP and SP) but not uniformly. In the vast ocean basins, one may consider deriving a suitable GPI for some sub-regions. For instance, in the southeast quadrant of the WNP (0-17°N, 140-180°E), the correlation coefficient between a regional GPI and the total number of TCG can reach 0.80. Special effort should be devoted to the Indian Ocean, where the current GPIs have poor skills in representing the interannual variability of TCG.
Another issue is whether the GPIs derived from climatology can faithfully reflect the interannual-to-multidecadal variability of TCG. To improve the capability of DGPI in representing the variability of the TCG number and to better elucidate the processes leading to that variability, we speculate that deriving the DGPI with information on interannual variability incorporated may be required. The additional information may help identify the different large-scale factors responsible for the interannual variations of TCG from basin to basin. While the dynamic GPI was shown to apply to a future warming scenario, the relative roles of the thermodynamic versus dynamic factors in TC genesis in a warming world remain unknown. This issue will be addressed in an accompanying work that compares the DGPI's and ENGPI's projected future changes of TC genesis potential against the TC-permitting models' projected TC genesis frequency.