Kernel Density Estimation of Tropical Cyclone Frequencies in the North Atlantic Basin

Previous research has identified specific areas of frequent tropical cyclone activity in the North Atlantic basin. This study examines long-term and decadal spatio-temporal patterns of Atlantic tropical cyclone frequencies from 1944 to 2009, and analyzes categorical and decadal centroid patterns using kernel density estimation (KDE) and centrographic statistics. Results corroborate previous research which has suggested that the Bermuda-Azores anticyclone plays an integral role in the direction of tropical cyclone tracks. Other teleconnections such as the North Atlantic Oscillation (NAO) may also have an impact on tropical cyclone tracks, but at a different temporal resolution. Results expand on existing knowledge of the spatial trends of tropical cyclones based on storm category and time through the use of spatial statistics. Overall, location of peak frequency varies by tropical cyclone category, with stronger storms being more concentrated in narrow regions of the southern Caribbean Sea and Gulf of Mexico, while weaker storms occur in a much larger area that encompasses much of the Caribbean Sea, Gulf of Mexico, and Atlantic Ocean off of the east coast of the United States. Additionally, the decadal centroids of tropical cyclone tracks have oscillated over a large area of the Atlantic Ocean for much of recorded history. Data collected since 1944 can be analyzed confidently to reveal these patterns.


Introduction
Tropical cyclones are a major environmental hazard for the southeastern United States. Approximately 12 percent of the world's tropical cyclones form in the north Atlantic/Caribbean/Gulf of Mexico basin, with about 23 percent of those striking the U.S.A. [1,2]. As coastal development continues to increase in the southeastern U.S.A. and elsewhere, it is increasingly important to improve our understanding of the spatial patterns and temporal trends of tropical cyclone tracks and landfalls to aid environmental planners and risk assessors in minimizing the loss of lives and property.
This study has two main objectives: 1) to identify the spatial distribution of North Atlantic basin tropical cyclones by category based on a kernel density estimation (KDE) approach that utilizes the Fotheringham et al. [3] smoothing algorithm; and 2) to identify the inter-decadal movement of tropical cyclones through the period of record based on decadal centroids found through the use of centrographic statistics.

Background
Keim et al. [4] found that the shortest return periods for tropical cyclones occurred in three specific areas of the U.S.A.: the north central Gulf of Mexico coast, the Florida Atlantic coast, and the North Carolina coast around the Outer Banks. This and other research [5,6] suggests that multiple broad-scale controlling mechanisms dictate the spatio-temporal patterns of tropical cyclone tracks and landfalls at intra-seasonal to millennial time scales. While some studies [4,7-9] attribute variability in the spatial pattern of twentieth century return periods along the Gulf-Atlantic coast to the North Atlantic Oscillation (NAO), other research [8-10] suggests that the Bermuda-Azores high alone plays a stronger role in regulating Atlantic basin tropical cyclone tracks and landfalls at broader time scales.
The NAO indirectly affects the spatial patterns of tropical cyclones based on the positioning of the Bermuda-Azores high [4,11]. A positive NAO index occurs when the Bermuda-Azores high is stronger than normal and displaced to the north and east of the mean position and mid-tropospheric winds over the Atlantic are from generally west to east. A negative NAO index occurs when the Bermuda-Azores high is weaker than normal and displaced to the south and west, with anomalous mid-tropospheric ridging and troughing and weak circulation over the Atlantic. During a negative NAO index period, tropical cyclones track in a more westward direction and usually resist the normal curve to the northwest [11].
The "Bermuda-Azores high hypothesis" [5,12] alludes to an increase in tropical cyclone landfall frequency on the Gulf coast when the Bermuda-Azores anticyclone is in a more southwesterly location than usual and the NAO index is negative. Similarly, an increase in tropical cyclone landfall frequency on the Atlantic coast occurs when the Bermuda-Azores anticyclone is displaced northeastward and the NAO index is positive. During times of a strong Bermuda-Azores anticyclone, tropical cyclones have been shown to be more affected by its steering mechanisms, while the opposite is true of periods with a weak Bermuda-Azores anticyclone [9], with its migration being a strong indicator of track and landfall location at the annual, decadal, and multi-decadal time scales [12].
Two other modulators of Atlantic tropical cyclone frequency have been widely recognized in the literature: the El Niño/Southern Oscillation (ENSO) [11,13] and the Atlantic Multi-decadal Oscillation (AMO) [9,14]. Bove et al. [13] concluded that El Niño (La Niña) events are associated with a reduction (increase) in the probability of U.S.A. landfalling tropical cyclones. The decrease in tropical cyclone frequencies during El Niño events has been attributed to the anomalously strong tropical upper-tropospheric westerlies that tend to shear the tops off of developing tropical cyclones [15]. However, evidence of ENSO's influence on tropical cyclone tracks is weak [11]. The AMO involves long-term variability in the spatial extent of above- and below-normal sea surface temperatures (SSTs) in the north Atlantic Ocean, but its role in affecting the tracking of tropical cyclones is unknown [16]. SSTs alone also play a more critical role in the development rather than the track of tropical cyclones [17,18].
The comparison of current tracks with historic tendencies is often made to address questions on the forcing mechanisms that allow for tropical cyclone formation and development. For example, Knowles and Leitner [12] used visualization techniques to identify the spatial relationship between historic tropical cyclone tracks and intensity of the Bermuda-Azores anticyclone. Bossak and Elsner [19] used historical tropical cyclone documentation from the early nineteenth century in conjunction with the NOAA best-track dataset, which begins in 1851, to estimate the track and intensity of storms from the pre-1851 period [20]. Other research [21] using known historical tracks is being conducted to reanalyze that dataset to reduce the perceived errors noted by numerous manuscripts [22,23].
Because systematic aircraft reconnaissance of tropical cyclones, and of disturbances with the potential to develop into tropical cyclones, began only in 1944, a potential discontinuity exists in the ability to detect and record tropical cyclone frequency, features, and tracks. Therefore, Neumann et al. [24] and Landsea [25] suggested that only data recorded since 1944 should be used for climatological analysis. Conversely, some researchers [23,26] have chosen to use the frequency data but disregard the documented intensities in the pre-1944 period because of perceived inaccuracies. After completion of a reanalysis of the historical hurricane dataset [21], it may be beneficial to reevaluate the pre-1944 record of the best-track dataset, but for the purposes of our study we have chosen to include only data from 1944 onward.
Geographic Information Systems (GIS) have become the common platform for displaying and analyzing tropical cyclone data [11,12]. The use of spatial statistics is also gaining in popularity [12,27]. Centrographic statistics (e.g., mean center, median center, standard distance) estimate basic parameters of the spatial distribution of a set of points in space [28,29]. These indices can be used to describe spatial and temporal patterns in tropical cyclones, but their use in the literature is limited to date. More often, centrographic statistics are used to describe other patterns of spatial distribution, such as the movement of crime [30,31]. McGregor [27] used 20-year standard deviational ellipses to analyze the spatial and temporal characteristics of tropical cyclones in the South China Sea, while Knowles and Leitner [12] used KDE to examine tropical cyclone patterns in the Atlantic basin related to the Bermuda-Azores high. In kernel density estimation, a symmetrical kernel function is placed over each observation point on an underlying smooth, continuous surface. Every point receives an equal weight, and the weight contributed by a point decreases with increasing distance from it. The density distribution is then estimated by summing the kernels at each location, thus producing a smooth density surface [32].
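The kernel summation described here can be sketched in a few lines. This is only an illustration, assuming a bivariate normal kernel, planar coordinates in kilometers, and a hypothetical function name; it is not the CrimeStat III implementation.

```python
import math

def normal_kernel_density(points, grid_x, grid_y, bandwidth):
    """Sum one normal kernel per observation point at location (grid_x, grid_y).

    points    -- iterable of (x, y) coordinates in planar units (assumed km)
    bandwidth -- kernel standard deviation h, in the same units
    """
    # Normalizing constant of a bivariate normal kernel with equal spread in x and y.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        d2 = (grid_x - px) ** 2 + (grid_y - py) ** 2
        # Every point carries the same weight; its contribution decays with distance.
        total += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    # The surface integrates to the number of points (an absolute density).
    return total
```

Evaluating this function at every cell center of a regular grid yields the smooth density surface; a larger bandwidth spreads each point's contribution over a wider area.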
Previous research has noted the inherent problem concerning bandwidth (bin sizes) for KDE [3]. Fotheringham et al. [3] suggested a conservative approach that oversmooths to some extent, but according to the approach any maxima observed in the estimated density curves are more likely to be real rather than a product of undersmoothing. Knowles and Leitner [12] showed that KDE, when performed correctly, can be a useful tool for identifying areas of high tropical cyclone intensity and that KDE output is easily interpreted.

Tropical Cyclone Data
A spatial database was created based on tropical cyclone position records from 1851-2009 obtained from the National Oceanic and Atmospheric Administration (NOAA) Coastal Services Center [33]. The database contains the storm category, latitude, longitude, and wind speed for each six-hour interval that each tropical cyclone existed. From the original database containing records from 1851-2009, a second database was established for this study that excluded all pre-1944 data. Any storm below hurricane strength (following the Saffir-Simpson classification, Table 1) was also excluded, thereby minimizing the problems associated with uneven detection capabilities over the study period.
The dataset presented two different types of shapefiles that could be used for spatial analysis: tropical cyclone tracks (line data) and tropical cyclone 6-hour interval locations (point data). The 6-hour interval locations (also recognized as observation points) were selected, so that a slower-moving tropical cyclone would receive a higher density because of the higher number of observation points recorded, and vice versa for a faster-moving system. This approach is reasonable because the duration of time that the forcing mechanisms favor the tropical cyclone's existence is proportional to the representation in the KDE analysis.
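The selection of observation points can be illustrated with a short sketch. The record keys (`year`, `wind_kt`, `lon`, `lat`) and the function name are hypothetical and do not reflect the actual NOAA file layout; the 64-knot cutoff is the conventional lower bound of Category 1 and is assumed to match Table 1.

```python
def select_hurricane_points(records, first_year=1944, min_wind_kt=64):
    """Keep 6-hour observation points at hurricane strength from first_year onward.

    records -- iterable of dicts with hypothetical keys 'year', 'wind_kt', 'lon', 'lat'
    Returns a list of (lon, lat) tuples, one per retained observation.
    """
    return [
        (r["lon"], r["lat"])
        for r in records
        if r["year"] >= first_year and r["wind_kt"] >= min_wind_kt
    ]
```

Because each retained record is one 6-hour position, a slow-moving hurricane contributes more points to a given area than a fast-moving one, which is the weighting behavior described for the KDE input.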

Methods of Spatial Analysis
ArcMap 9.2 [34] was used to display tropical cyclone locations, centrographic statistics, and KDEs. Tropical cyclones were separated into five categories (as designated by the Saffir-Simpson scale) using the symbology tool. Centrographic statistics and KDEs were calculated using CrimeStat III [32], but the output was also added to ArcMap for visual display purposes. ... 25°W, 62.5°N). A normal method of interpolation with a fixed interval bandwidth was selected. The bandwidth (interval) was created using the following equation described by Fotheringham et al. [3]:
$$h_{\mathrm{opt}} = \left[\frac{2}{3n}\right]^{1/4} \sigma$$
where $h_{\mathrm{opt}}$ is the optimal bandwidth, $n$ is the number of observations (or tropical cyclones), and $\sigma$ is the standard distance deviation for each category (found from centrographic statistics computed earlier). The interval and area units were set to kilometers and the output units are reported in absolute densities.
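The bandwidth calculation can be sketched as follows. The sketch assumes the rule takes the form h_opt = [2/(3n)]^(1/4) σ, as it is commonly attributed to Fotheringham et al. [3]; the function name and the example values are illustrative only and are not taken from the study.

```python
def optimal_bandwidth(n, sigma):
    """Return h_opt = (2 / (3 n)) ** (1/4) * sigma (assumed form of the rule).

    n     -- number of observation points in the category
    sigma -- standard distance deviation of those points, in kilometers
    """
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return (2.0 / (3.0 * n)) ** 0.25 * sigma

# Illustrative values only: 54 points with a 900 km standard distance deviation.
example_bandwidth_km = optimal_bandwidth(54, 900.0)
```

Under this form the bandwidth shrinks slowly as the number of observations grows (in proportion to n^(-1/4)), so categories with few observation points, such as the strongest storms, are smoothed more heavily relative to their spread.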
CrimeStat III was also used to compute the centrographic statistics, including mean center [30,35], median center [36,37], minimum distance [38,39], and standard deviational ellipse [31,40], for each decade from 1944-2009. The mean center is the simplest descriptor of distribution and describes the mean of the X and Y coordinates [41]. The median center is the intersection between the median of the X and Y coordinates. This creates multiple points where lines intersect and thus produces an area of non-uniqueness in which any part of the area between the lines could be considered the median center [32]. The minimum distance defines the point at which the sum of the distance to all other points in a distribution is minimized [41]. The standard deviational ellipse describes dispersion in two dimensions by finding the standard deviations in the X and Y directions that define an ellipse [28]. The standard deviational ellipse shows where the majority of points exist as well as the directional trend of those points. Standard distance deviation was also computed for each category of hurricane for calculating bandwidth for KDEs. Standard distance deviation is the standard deviation of the actual distance of each point from the mean center and it provides a single summary statistic in kilometers [32].
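Several of these statistics can be sketched in plain Python. The sketch assumes planar (x, y) coordinates, uses a divisor of N for the standard distance deviation (software packages may apply a small-sample correction), and approximates the center of minimum distance with a Weiszfeld-type iteration; the function names are illustrative, and the standard deviational ellipse is omitted for brevity.

```python
import math
import statistics

def mean_center(points):
    """Mean of the X coordinates and mean of the Y coordinates."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def median_center(points):
    """Intersection of the X median and the Y median.

    For an even number of points the median is not unique; statistics.median
    returns the midpoint of the two middle values.
    """
    xs, ys = zip(*points)
    return (statistics.median(xs), statistics.median(ys))

def standard_distance(points):
    """Root-mean-square distance of the points from the mean center."""
    mx, my = mean_center(points)
    squared = [(px - mx) ** 2 + (py - my) ** 2 for px, py in points]
    return math.sqrt(sum(squared) / len(squared))

def center_of_minimum_distance(points, iterations=200, tol=1e-9):
    """Approximate the point that minimizes the summed distance to all points."""
    x, y = mean_center(points)  # start the iteration at the mean center
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < tol:
                continue  # skip a coincident point to avoid division by zero
            num_x += px / d
            num_y += py / d
            denom += 1.0 / d
        if denom == 0.0:
            break
        new_x, new_y = num_x / denom, num_y / denom
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break
    return (x, y)
```

When the three centers fall close together, as reported for each decade in this study, the distribution is not dominated by a few outlying points; a large gap between the mean center and the other two would indicate a skewed distribution.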

Results and Discussion
A plot of the longitude and latitude coordinates of all tropical cyclone observation points in the dataset by Saffir-Simpson category reveals the more southerly locations of major (Category 3 or greater) tropical cyclones (Figure 1). Major tropical cyclones tend to be located in a narrow part of the Atlantic Ocean east of the Bahamas and slightly north of Puerto Rico, in the Caribbean Sea, and in the central Gulf of Mexico.
The KDEs of all tropical cyclones are illustrated in Figure 2(a). While tropical cyclones have struck the entire Gulf-Atlantic coast of the U.S.A., the most frequent areas of occurrence were in two zones: the northern Caribbean Sea/southern Gulf of Mexico, and the Atlantic coast from the southern tip of Florida to Cape Hatteras extending out to the Lesser Antilles and Bermuda. Figure 2(b) displays the KDE of Category 1 storms, which tend to occur in two locations of high density in areas of the Atlantic Ocean similar to those of all storms combined in Figure 2(a). However, another area of high density occurs in the western region of the Gulf of Mexico south of Texas. Category 2 storms (Figure 2(c)) show relatively high density stretching from the western tip of Cuba, across south Florida, and in a large section of the Atlantic Ocean along the Atlantic seaboard and out to Bermuda. Category 3 storms show a comparatively large area of high density (Figure 2(d)) in an oval pattern from the southeastern Gulf of Mexico to an area north of the Lesser Antilles. The KDE of Category 4 storms assumes more of a flattened oval pattern and is located slightly farther south than those of Category 2 and 3 storms, but also farther west into the central and northern Gulf of Mexico (Figure 2(e)). Overall, Category 4 storms occur predominantly east of the Yucatan peninsula in the Caribbean Sea and in an area of the Atlantic Ocean south and east of the Bahamas. Category 5 storms (Figure 2(f)) are concentrated in two main areas: the southern Caribbean Sea and adjacent central Gulf of Mexico, and the Atlantic Ocean east of the Bahamas.
The KDEs showed similarities for each category of tropical cyclone, with the highest densities concentrated in the Atlantic Ocean between the eastern U.S.A., Bermuda, Haiti, and parts of the Gulf of Mexico (Figure 2). Higher SSTs are most likely the main factor for the southward shift in tropical cyclone densities for stronger tropical cyclones [17]. The highest densities of category 4 and 5 storms were concentrated in the Caribbean Sea and Gulf of Mexico. These areas retain high SSTs longer in the tropical cyclone season. Some unexpected contrasts in tropical cyclone location by category were observed. A small area in the western Gulf of Mexico along the U.S.A./Mexican border showed a high density of category 1, 4 and 5 storms, but category 2 storms were almost non-existent in this same area. Some category 4 storms showed a propensity to venture northward toward the Outer Banks of North Carolina, but the densities of category 5 storms tapered off dramatically north of approximately 30°N latitude. Most of the observed tropical cyclone densities corroborate previous research [12,16], but the KDE approach helps to highlight often overlooked disparities such as the peculiar absence of category 2 storms in the western Gulf of Mexico.
The ability to define a bandwidth for the KDE approach aided greatly in accurately identifying density centers for each category of tropical cyclone, and the method provided a reasonable measure of comparison across categories. Fotheringham et al. [3] described several methods of determining bandwidth for kernel density procedures. The chosen formula produced an optimum bandwidth from the standard distance deviation of each category, seeking to balance the two primary concerns of oversmoothing and undersmoothing.
Elsner and Jagger [42] examined tropical cyclone frequencies using Bayesian modeling in an effort to predict future tropical cyclone frequencies by using a bandwidth calculation formula from Venables and Ripley [43]. The formula took a similar approach to the bandwidth problem by disregarding insignificant peaks, while still highlighting actual peaks that may help to relate tropical cyclone frequencies with certain teleconnection indices.
All of the centrographic statistics (mean center, median center, and minimum distance) were concentrated in the same general area for each decade (Figure 3), thus increasing our confidence in analyzing the decadal centroid patterns of tropical cyclone tracks. Of the three measures of centrography, the mean center statistic was selected because it calculates the mean distribution, or central tendency, of all X and Y tropical cyclone coordinates occurring in each decade and is the most widely used centrographic statistic [30,35,44]. A temporal analysis of decadal mean centers (centroids) of tropical cyclones since the 1940s reveals a trend toward the east until the 1980s, when the pattern shifted back toward a more westward and southward displacement (Figure 4).
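The decadal grouping behind these mean centers can be sketched as below. The sketch assumes observations are supplied as (year, longitude, latitude) tuples, that ten-year windows start in 1944, and that longitude and latitude are averaged directly without a map projection; the function name and these choices are illustrative and may differ from the procedure used in CrimeStat III.

```python
from collections import defaultdict

def decadal_mean_centers(observations, start_year=1944, span=10):
    """Group observation points into fixed windows and return each window's mean center.

    observations -- iterable of (year, lon, lat) tuples
    Returns {first_year_of_window: (mean_lon, mean_lat)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # running lon sum, lat sum, count
    for year, lon, lat in observations:
        if year < start_year:
            continue  # outside the study period
        first = start_year + ((year - start_year) // span) * span
        window = sums[first]
        window[0] += lon
        window[1] += lat
        window[2] += 1
    return {
        first: (lon_sum / count, lat_sum / count)
        for first, (lon_sum, lat_sum, count) in sorted(sums.items())
    }
```

Changing `start_year` or `span` (for example, five-year windows or windows beginning in 1949) gives a quick way to check how sensitive the centroid path is to the choice of period boundaries.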
Previous research has indicated that shifting tropical cyclone patterns are often the result of influential teleconnections, most specifically the NAO and Bermuda-Azores high [9,16]. The temporal trends of decadal centroids are likely representative of the location and strength of the Bermuda-Azores high. It could be inferred from Figure 4 that the Bermuda-Azores high, on average, shifted to the east in the 1940s, 1950s, 1960s, and 1970s, but then began to shift back to the west in the 1980s, 1990s, and 2000s. This oscillation trend can also often be seen by examining landfall frequencies over time [4]. More research is needed to examine the precise degree of influence of the Bermuda-Azores high on a seasonal/annual basis. This relationship may be unclear, however, because the frequency of tropical cyclones can be influenced by multiple factors. Further research should explore centroids created from different time periods such as five-year periods or different end points such as ten-year periods that begin in 1949 instead of 1944. Such work may highlight inter-decadal trends and confirm that multi-annual spatial patterns may occur most likely in relationship to NAO and the Bermuda-Azores high [11].
Decadal centroids found in this study seem to contradict tropical cyclone landfall patterns identified by Elsner et al. [9], who found that more tropical cyclones made landfall on the Atlantic coast of the U.S.A. in the 1950s, 1980s, and 1990s, while an increased percentage of tropical cyclones made landfall on the Gulf coast of the U.S.A. in the 1960s and 1970s. This seems to suggest that the centroid for 1955-1964 (i.e., 1964 in Figure 4) and for 1965-1974 (i.e., 1974 in Figure 4) should be the westernmost centroids instead of two of the easternmost centroids. The contradiction is likely the result of multiple factors. The main factor is probably related to the longitude where the tropical cyclones formed each decade [9]. Tropical cyclones that formed in the Caribbean Sea and Gulf of Mexico were probably less affected by the position of the Bermuda-Azores high whereas tropical cyclones that formed in the central and eastern Atlantic were probably influenced more by the Bermuda-Azores high. A more southwesterly Bermuda-Azores high may have also caused Gulf tropical cyclones to curve northward near the beginning of their life cycle resulting in a more easterly located decadal centroid. The uncertainty implies that more statistical methods must be employed to confirm current spatial trends and hypotheses as well as to identify previously undetected spatial patterns. This study may lead to more novel applications of spatial statistics to reveal undiscovered tropical cyclone trends.

Summary and Conclusions
Through the use of a smoothing bandwidth proposed by Fotheringham et al. [3], kernel density estimation (KDE) identified areas of highest density of occurrence for each category of tropical cyclone in the north Atlantic basin from 1944 through 2009. Results confirmed previous hurricane density assessments, but also found nuances in location of densities for each storm category. Centrographic statistics provided a new perspective on multiyear spatial shifts of tropical cyclones, but the results should be examined more thoroughly in future research efforts to fully understand the reason(s) for these shifts. The principal findings of this study are as follows:
1) The highest tropical cyclone densities occurred in the Atlantic Ocean between the eastern U.S.A., Bermuda, and Haiti and parts of the Gulf of Mexico, with stronger storms (category 3 and above) concentrated in the Caribbean Sea, Gulf of Mexico, and southern Atlantic Ocean.
2) An area in the Gulf of Mexico near the U.S.A./Mexico border exhibited high densities of most storm categories except category 2 storms.
3) The bandwidth calculations proposed by Fotheringham et al. [3] were useful in highlighting hurricane densities.
4) Centrographic statistics identified decadal movements of the center of gravity of hurricane tracks as well as decadal changes in directional dispersion.

Acknowledgment
The authors would like to thank the reviewers for their excellent editorial suggestions during the review of this manuscript. Dr. Andrew Curtis and Gerardo Boquin were helpful in the design and implementation of this study and Dr. Jason Blackburn assisted in reviewing and making suggestions for the geo-statistical components.