Background & Summary

In tropical and subtropical regions, tropical cyclones (TCs; hurricanes or typhoons) represent the major extreme meteorological events. The strong winds and extreme waves associated with such systems are of critical importance for a range of applications and processes. These include: coastal and offshore engineering design, coastal beach erosion and coastal inundation, the safety of shipping and studies of extreme air-sea interaction. The extreme winds and complex vortex structure of the moving tropical cyclone, are also a significant test for our understanding of wind-wave generation and propagation. As such, tropical cyclones are often regarded as one of the most demanding tests of contemporary wind-wave prediction models.

Despite the importance of tropical cyclones, obtaining consistent, reliable, and extensive datasets of wind and wave conditions in such systems is demanding. The relatively small geographic extent of TCs, their infrequent occurrence and the extreme nature of the meteorological forcing, mean observations are relatively rare. The most obvious source of observations of TC wave fields is the use of in situ buoy data, which has been the subject of numerous studies1,2,3,4,5,6,7,8,9,10,11. These studies have largely used the U.S. National Data Buoy Center (NDBC) dataset12, as it is by far the most comprehensive archive. Buoys provide high quality data, which have been extensively validated and these data often consist of full directional spectra. The limited spatial distribution of buoys, however, which are often relatively close to coastlines, limits the data available.

The advent of remote sensing techniques has significantly increased the potential available data, with both Synthetic Aperture Radar (SAR) and altimeters having been utilized to observe TC wave fields11,13,14,15,16,17,18,19,20,21.

Observations of the TC wind field over the ocean have similarly relied upon buoy measurements. However, the accuracy of such measurements is sometimes questioned due to tilting of the anemometer on the buoy and sheltering by waves11,22,23,24. The routine penetration of North American hurricanes by aircraft has provided the opportunity to obtain TC wind field data from dropwindsondes25,26,27,28,29. As such measurements are, however, aircraft-based, they are obviously limited in terms of the number of TCs penetrated. As with wave measurements, the advent of remote sensing measurements has expanded the combined wind field database. These systems include aircraft-borne Stepped Frequency Microwave Radiometers (SFMR)30,31. Satellite-based instruments include: the Advanced Microwave Sounding Unit (AMSU)32, the L-band microwave radiometer carried on the NASA Soil Moisture Active Passive Satellite (SMAP)33, the CYGNSS constellation34, and scatterometers35,36,37,38.

It should be noted that none of these data sources are homogeneous. The numbers of buoys deployed and satellites in orbit have increased with time, meaning that the observation frequency of TCs has changed over the last 30 years. In addition, the measurement technology has also changed. In recent years more buoys measure full directional spectra, rather than just bulk parameters, such as significant wave height and peak period. Similarly, the satellite technology has also changed with higher resolution and more consistent calibration of platforms.

In a series of papers, Tamizi and Young11 and Tamizi et al.38,39 combined data from the TC best track database IBTrACS40 with buoy, altimeter, scatterometer and radiometer data to provide a large and comprehensive investigation of the wind and wave fields in TCs. In these studies, TC tracks were identified globally from the IBTrACS archive. NDBC buoy data and the various remote sensing products were then extracted when the given TC track passed close (550 km or 5°) to the buoy or satellite ground track. This is a large dataset, encompassing a total of 2927 TCs from all tropical cyclone basins. As data is sourced from a range of public archives, which each need to be separately searched to extract the required observations, this is not a simple task. The present database contains these data, stored by year/location/TC name. As such, it is a valuable archive bringing together TC track and wind field parameters with recorded wind and wave data.

Methods

Each of the data types used in the composite dataset are described below. Note that in the previous applications of these data11,38,39, track positions were interpolated in time to ensure they were consistent with observations of wave and wind quantities. The present dataset does not interpolate any data. Rather, the original track observations are provided at their original resolution (6 hours in most cases).

Tropical cyclone tracks and parameters (IBTrACS)

The International Best Track Archive for Climate Stewardship (IBTrACS) dataset (40) was developed by the NOAA (National Oceanographic and Atmospheric Administration) National Climatic Data Center. The archive synthesizes and merges best-track data from all official Tropical Cyclone Warning Centers and the WMO (World Meteorological Organisation) Regional Specialized Meteorological Centers. The dataset contains data including time, position, maximum sustained winds, minimum central pressure, p0 and storm nature (i.e., tropical cyclone, tropical storm, etc.). In addition, information such as the radius to maximum winds, Rm and the radius to gales R34 is provided for some storms. The data are provided globally at 6-h intervals. Although the archive contains data beginning from 1848, data before the satellite period is obviously of lower quality. For the present database, only data after 1985 was extracted.

Figure 1 shows the global distribution of tropical cyclone tracks extracted from IBTrACS and contained within the dataset. The tracks, as shown in Fig. 1 contain data for storms which are classified as tropical storms, tropical cyclones, and extra tropical cyclones. The present database includes all storm types. If one wished to exclude extratropical cases, as in Tamizi and Young11, one can simply disregard data at latitudes higher than 40°.

Fig. 1
figure 1

Tracks of tropical cyclones extracted from IBTrACS60 and contained within the database. For clarity, only every 4th track is shown. (Figure created with Matlab R2023a –mathworks.com).

NDBC Buoy data

The NDBC operates the longest duration deep water wave buoy network in the world12. Of particular relevance for the present application, this network covers the Atlantic, Pacific, and Gulf of Mexico regions where North American hurricanes occur. The NDBC buoy data typically includes hourly measurements of significant wave height, Hs and other bulk parameters (wave period etc), and the one-dimensional energy density spectrum E(f), where f is wave frequency. In addition, for directional buoys, NDBC data contains the cross-spectral moments a1, a2, b1 and b2. The significant wave height can be related to E(f) and the directional energy density spectrum, E(f, θ), where θ is wave propagation direction, by

$${H}_{s}^{2}/16=\int E(f,\theta )df\,d\theta =\int E(f)df$$
(1)

The directional wave spectrum, E(f, θ) can be represented as E(f, θ) = E(f)D(f, θ)41,42, where D(f, θ) is a directional spreading function defined such that \(\int D(f,\theta )d\theta =1\). In an approach termed the Fourier Expansion Method (FEM), Longuet-Higgins et al.41 proposed that D(f, θ) takes the form

$$D(f,\theta )=A{(f)\cos }^{2s(f)}\left[\frac{\theta -{\theta }_{m}(f)}{2}\right]$$
(2)

The mean direction θm(f) and the spreading parameter s(f) can be determined from the first two spectral moments a1(f) and b1(f) as

$${\theta }_{m}(f){=\tan }^{-1}\left[{b}_{1}(f)/{a}_{1}(f)\right]$$
(3)
$$s(f)=\frac{{r}_{1}(f)}{1-{r}_{1}(f)}$$
(4)

where \({r}_{1}^{2}(f)={a}_{1}^{2}(f)+{b}_{1}^{2}(f)\). The coefficient A(f) is a normalization factor. Although the FEM defined by (2) to (4) is a useful representation of the directional spectrum, the assumed cos2s form is a significant simplification. More general representations such as the Maximum Likelihood Method42,43 and Maximum Entropy Method44 can also be defined in terms of the cross-spectral moments a1, a2, b1, b2.

The NDBC buoys also measure wind speed. As anemometers on these buoys are at a range

of heights, measurements need to be converted to a standard reference height, usually 10 m, assuming a neutral stability logarithmic boundary layer45,46.

The locations of the NDBC buoys from which TC data are included in the database are shown in Fig. 2. Note that tropical cyclone (hurricane) wave data are also available from the Coastal Data Information Program (CDIP). However, as these data are generally in finite depth locations, it was not included in the present database, which is for deep water.

Fig. 2
figure 2

Locations of NDBC buoys for which tropical cyclone in situ wind speed and wave data are included in the database. (Figure created with Matlab R2023a – mathworks.com).

The buoy data for the present database were downloaded from the NDBC47 archive (https://www.ndbc.noaa.gov/). The same data can also be accessed in a more accessible form as described by Hall and Jensen48. In the present archive, the quantities, r1, r2 and θm were calculated as above. All other quantities are as in the NDBC archive.

Altimeter data

Radar altimeters have been in operation since 1985 and measure wind speed (U10) and significant wave height (Hs) globally at an along-track resolution of approximately 10 km. As the radar altimeter is a “nadir-looking” instrument, it senses the ocean surface over a narrow beam directly below the satellite. As such, the cross-track resolution is low, with ground tracks separated by hundreds of kilometres. As the geographical extent of TCs is relatively small, altimeter data are not always available for such meteorological systems. Due to the global coverage and temporal extent (1985 to present-day) of the combined altimeter missions, however, there are extensive observations of TC wind and wave fields. Ribal and Young45, have developed a consistent database of global altimeter data from the 13 altimeters which were operational from 1985–2018. This database49 has now been extended to 15 altimeters and is available at https://portal.aodn.org.au/. Although there is some degradation of altimeter data in high rain regions, a large quality-controlled dataset is available under TC conditions11.

The altimeter data base has been calibrated against buoy data and cross-validated at cross-over points between altimeter missions operational at the same time45.

Scatterometer data

Scatterometers have been in operation globally since 1992 and measure wind speed (U10) and direction (θu) at a resolution of between 12.5 km and 25 km. In contrast to altimeters, scatterometers measure over a broad swath up to 1400 km wide. As such, they image most locations on the Earth’s surface twice per day. Ribal and Young50 calibrated the various scatterometer missions since 1992 against buoy data. However, these calibrations are limited to wind speeds up to approximately 30 ms−1. At higher wind speeds, scatterometers display a low bias due to reduced backscatter signal51,52. Chou et al.53 have proposed a correction to ASCAT scatterometer winds to address this issue at high winds. This approach was extended by Ribal et al.54 who developed specific correction relationships for the MetOp-A, MetOp-B, ERS-2 and OceanSat-2 scatterometers. For QuikSCAT they found no correction is required. These corrections can be applied to the data of Ribal and Young45 for application in TC conditions. The present database uses the calibrations of Ribal and Young45.

Figure 3 shows a contour plot of the relative density of scatterometer observations of winds within the present TC database. As one would expect, it closely follows the distribution of TC tracks shown in Fig. 1.

Fig. 3
figure 3

Contour plot of the relative density (maximum value 1.0) of scatterometer observations within the TC database. (Figure created with Matlab R2023a – mathworks.com).

Radiometer data

In a similar fashion to scatterometer, radiometers measure over a broad ground track swath with a spatial resolution of 25 km. The radiometer dataset55 is extensive, commencing in 1986. However, radiometer returns are significantly degraded by heavy rain. As such, radiometer data can generally only provide reliable data in TC conditions for the periphery of storms.

Composite data records for tropical cyclones

As noted above, these combined datasets provide detailed observations of wind and wave properties under TC conditions. A typical example of the composite data is shown in Fig. 4, with the track and available data for buoys, altimeter and scatterometer for Hurricane Katrina in the Caribbean and Gulf of Mexico during 2005. The data were extracted from database file 2005236N23285_KATRINA.nc. The figure shows the broad distribution of buoy data available from the NDBC archive, the extensive altimeter tracks across the track of the hurricane and the broad swaths of scatterometer data. The in situ data at Buoy 42040 shows Hs peaking at approximately 17 m and U10 at 28 ms−1.

Fig. 4
figure 4

Observations of wind and wave conditions during Hurricane Katrina during 2005. (a) TC track in blue, buoy locations shown with red dots and altimeter observations by linear tracks. The location of NDBC Buoy 42040 is shown. (b) ground track swaths of scatterometer wind data. (c) significant wave height (Hs) as a function of time from buoy 42040. (d) wind speed (U10) as a function of time from buoy 42040. (Figure created with Matlab R2023a – mathworks.com).

Figure 5 shows histograms of wind, wave and TC parameters for data associated with the buoy observations in the database. More details are provided in Tamizi and Young11. The database of buoy observations consists of more than 2900 time series of wind speed/wave height from more than 350 TCs. The histograms show Hs up to 18 m and U10 up to 60 ms−1. These values were recorded during the passage of TCs (hurricanes) with central pressure, p0 down to 880HPa and values of velocity of forward movement, Vfm up to 30 ms−1. The data were taken within 10 times the radius to maximum wind, Rm of the TC centre, with the TCs having values of radius to gales, R34 up to 420 km.

Fig. 5
figure 5

Summary of data from the in situ buoy TC database: (a) Hs for each transect of a TC, (b) U10 for each transect, (c) p0 of each storm in the database, (d) Vfm of each storm in the database, (e) R34 of each storm in the database, and (f) minimum distance between TC eye and buoy for each case in the database.

The corresponding distributions of observed wind and wave data from the altimeter observations are shown in Fig. 6. As can be seen, the number of observations in the satellite observations is far larger than for buoy data. This is due to the high along-track spatial coverage of the altimeter, together with the fact that it is a global dataset compared to the buoy data being confined to North America. The full dataset contains more than 36,000 altimeter passes over TC wind/wave fields from more than 2,730 TCs. The distributions of the various parameters are similar to the buoy dataset. However, it is noticeable that the maximum recorded values of Hs ≈ 16 m and U10 ≈ 50 ms−1. These values are both lower than the corresponding values from the buoy observations, despite the fact that the altimeter dataset contains many more observations from a larger set of TCs. This suggests that the altimeter database misses some of the extreme observations within the storms. This assumption is supported by comparisons of the distributions of the values of observation distance/Rm. For the altimeter dataset, there are few observations near the centres of the TCs. This is because the QA process rejected many observations where the quality of the altimeter returns had been degraded by high rain rates near the centres of storms.

Fig. 6
figure 6

Summary of the altimeter TC database: (a) maximum significant wave height Hs for each pass of an altimeter, (b) maximum wind speed U10 for each pass of an altimeter, (c) central pressure p0 of each storm in the database, (d) velocity of forward movement Vfm of each storm in the database, (e) radius to gales R34 of each storm in the database, and (f) minimum distance between altimeter and storm eye for each altimeter pass.

The corresponding distributions of wind speed from the scatterometer observations are not shown here due to space limitations but are similar to those shown in Figs. 5 and 6. This dataset consists of more than 13,500 scatterometer passes through more than 800 TCs. Due to the broad ground track swaths of scatterometers, there are more than 14,000,000 observations.

Data Records

The dataset is stored as NetCDF files, with one file per TC (a total of 2927 files). The file names largely follows the IBTrACS naming convention, such that it acts as a TC identifier. The general format is: yyyydddHaabbb_TCName.nc, where

  • yyyy- Year of occurrence of TC

  • ddd- day number from start of year to first observation

  • H- N or S for Northern or Southern Hemisphere

  • aa- latitude of first observation (no sign)

  • bbb- longitude (east) of first observation

  • TCName- Name of TC, if this has been allocated

As an example, “2005236N23285_KATRINA.nc” contains the data for Hurricane Katrina with the data commencing on day 236 of year 2005 at latitude 23°N and longitude 285°E. The data stored in each file is described in Tables 1, 2 and 3, with Hurricane Katrina used as an example.

Table 1 Dimensions of variables in NetCDF file.
Table 2 Variables related to IBTrACS and satellite data in NetCDF files.
Table 3 Variables related to buoy data in NetCDF files.

The dataset56 can be downloaded from https://doi.org/10.26188/24471688.

To provide uses an example of how to access and use the NetCDF files, the Matlab script used to produce Fig. 4 can be downloaded from https://doi.org/10.26188/24903117.

Tables 1, 2 and 3 below contain details of the Variable names, definitions of the measured quantities and the dimensions of the storage array within the NetCDF files of the TC database.

Technical Validation

The instruments used to compile the present database are all extensively used for metocean applications, however, the extreme conditions in TCs pose issues for all instrumentation systems. Below, calibration and validation studies of these systems under TC conditions are considered.

Buoys

Surface floating buoys have a history of more than 50 years of usage and form the basis for national wave measuring programs around the world. These systems use either acceleration or GPS principles to measure the time series of surface displacement (Note that NDBC data use acceleration). These systems have been extensively calibrated and validated and represent mature technology.

Anemometers (both cup and sonic) mounted on meteorological buoys also provide the mainstay of metocean wind measurements. Again, at high winds and large sea states, concerns have been raised about tilting of the buoy axis and shadowing by large waves11,22,23,24.

Altimeters

The altimeter wind speed and wave height data used in the database are obtained from the AODN archive45,55. The altimeter data in the archive were calibrated and validated against extensive buoy data45. These data indicate no decrease in the accuracy of measurements of significant wave height up to 10 m. Beyond this value, there is almost not co-located data to make an assessment. Similar calibration of wind speed against buoy data are used for wind speeds up to approximately 25 ms−1. For higher wind speeds, data suggests that the radar cross-section of the altimeter signal is less sensitive to wind speed and a correction to these calibrations is applied at higher wind speeds57. These altimeter calibration relations for wind speed have been subsequently validated against scatterometer and radiometer data58.

Scatterometer

The scatterometer data used in the present database were also sourced from the AODN archive. The scatterometer values of wind speed were calibrated and validated against global datasets50. Ribal et al.54 subsequently proposed a high wind speed correction for TC conditions. The present dataset does not apply this high wind speed correction. However, it is a very simple process to correct the present data before application and this is recommended to users.