Monthly anomaly database of atmospheric and oceanic parameters in the tropical Atlantic ocean

The Tropical Atlantic Ocean Database and Monthly Anomalies of River Discharge on Atlantic Ocean datasets encompass the monthly anomalies of a variety of physical and biogeochemical parameters from the tropical Atlantic Ocean and the monthly anomalies of river runoff in the Atlantic Ocean and its adjacent seas. The parameters used as the base for the computation of anomalies come from the TROPFLUX, GPCP, ASCAT, SODA, GODAS, DASK, SeaWiFS, OAFLUX, WAVEWATCH III, NOAA/ESRL 20th Century Reanalysis, GLOBAL_REANALYSIS_BIO_001_029, GLOBAL_REANALYSIS_BIO_001_033, OCEANCOLOUR_GLO_OPTICS_L4_REP_OBSERVATIONS_009_081, OSCAR, SMOS, MODIS-Aqua, CO2_Flux, and GRDC datasets. Several of the anomaly data are redundant but come from different data sources, which makes comparative studies possible. For ease of use, both datasets are provided in NetCDF format following the CF convention. The datasets include 18 files in NetCDF format, which can be handled with a wide range of freeware tools, and are structured in two-, three- and four-dimensional grids. These anomalies can be useful to oceanographers, meteorologists, ecologists and other researchers for studies of climate variation in the tropical Atlantic Ocean. The datasets are hosted at https://www.seanoe.org/data/00718/82962/ and https://data.mendeley.com/datasets/pn5b35vn6s/1.



Value of the Data
• The main objective of this work was to gather a series of products offering a reliable representation of past reality in the tropical Atlantic, either by choosing gridded products based directly on in-situ and satellite observations, or by choosing products based on numerical simulations and modelling approaches constrained to realism by data assimilation (the so-called reanalyses) or other techniques.
• The data presented here encompass the monthly anomalies of physical, chemical and biological parameters in the tropical Atlantic Ocean. This dataset can be useful to any researcher who needs these data for further analyses or for interpreting physical, biogeochemical or biological patterns or processes of oceanographic and atmospheric parameters in the tropical Atlantic Ocean.
• It is relevant for studying changes in ocean climate through statistical studies. It can also be used as a reference for comparison with fully simulated representations of ocean and atmospheric dynamics during the past decades, like the IPCC and CMIP6 coupled simulations.
• It can also be used for visualization by officials, decision-makers and the general public, and in education and outreach activities.
• This dataset is made up of multiple NetCDF files using the CF convention and sharing similar time coordinates, which makes it easy to share. It is easy to use and does not require any prior processing.

Data Description
These datasets present runoff anomalies at stations on all rivers discharging freshwater into the Atlantic Ocean and adjacent seas (MARDAO dataset) and anomalies of surface fluxes and of physical, chemical and biological parameters at different ocean depths in the Tropical Atlantic Ocean (TAAD dataset). Fig. 1 shows the geographical boundaries of each dataset and the position of all river runoff stations. The TAAD dataset contains redundant parameter anomalies (e.g., water temperature, salinity, ocean currents, winds, chlorophyll concentration, etc.) so that researchers can make comparative studies of monthly climatic variations. The points WPP, SPP, CHLP, CURP and WINP will be used to show such comparisons (Fig. 1, Table 1).
All anomaly data files are in NetCDF format, CF convention. The Monthly Anomalies of River Discharge on Atlantic Ocean dataset (MARDAO) contains only one anomaly data file (located in the https://seanoe.org/ repository), while the Tropical Atlantic Anomaly Database (TAAD) contains 19 zip files (located in the https://data.mendeley.com/ repository), which together contain 20 files in NetCDF format. The 87798.zip file contains 2 NetCDF files because the anomalies of the marine current components are separated from the rest.
Table 2 shows the details of the original datasets used to calculate the monthly anomalies, such as the center that produces each one, the periods, the spatial resolution of each grid and the filename of each file in the repository. Table 3 describes all physical, chemical and biological parameters for which anomalies were calculated. It lists the name of each parameter, to which the suffix _anom was added, with the exception of the parameter runoff_mean of the MARDAO dataset (this parameter contains the original runoff data at all stations of each river). In addition to the name of each parameter, the table gives the unit, the type of grid and the original dataset to which the parameter belongs.
In the TAAD dataset the data are organized in two types of grids. The 3D grids hold the parameters found at the ocean surface or at a fixed depth, which therefore depend on longitude, latitude and time. The 4D grids are organized like the 3D grids but, in addition to longitude, latitude and time, they also depend on depth. In the MARDAO dataset the anomaly data are organized as time series for each station.
Note that the product GLOBAL_REANALYSIS_BIO_001_033 was removed from the CMEMS catalog and replaced in 2021 by the product GLOBAL_MULTIYEAR_BGC_001_033. Both are based on the SEAPODYM ecosystem model, the former at the ¼° resolution with one week frequency estimates. It is forced by weekly means of the Mercator Ocean circulation model (without assimilation), ERA-Interim atmospheric fields, and primary production derived from the CMEMS GLOB-COLOUR surface chlorophyll concentration. Only evaporation has been taken from the OAFLUX dataset because the rest of its parameters coincide with those of the TROPFLUX dataset.
For all parameters, missing data are represented by NaN (Not a Number); in the metadata of each parameter, _FillValue and missing_value are set to NaN. The time reference for the MARDAO dataset is "days since 1700-01-01 00:00" and for the TAAD dataset is "days since 1900-1-1 00:00:00".
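The time references and the NaN convention described above can be illustrated with a minimal Python sketch. It is not part of the published datasets or of the authors' tools: the function names `decode_time` and `valid_values` are hypothetical, and the sketch assumes that the time axes follow the standard (Gregorian) calendar, which the text does not state.

```python
from datetime import datetime, timedelta
import math

# Reference epochs quoted in the text for each dataset.
EPOCHS = {
    "MARDAO": datetime(1700, 1, 1),  # "days since 1700-01-01 00:00"
    "TAAD": datetime(1900, 1, 1),    # "days since 1900-1-1 00:00:00"
}


def decode_time(days, dataset):
    """Convert a 'days since <epoch>' offset into a datetime.

    Assumes a standard (Gregorian) calendar; this is an assumption,
    so check the calendar attribute of the time variable in the files.
    """
    return EPOCHS[dataset] + timedelta(days=days)


def valid_values(values):
    """Drop missing entries, which the files encode as NaN."""
    return [v for v in values if not math.isnan(v)]
```

For example, `decode_time(31, "TAAD")` corresponds to 1 February 1900 under this calendar assumption. Libraries that read NetCDF files usually perform this decoding automatically when the units attribute is present.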
In the MARDAO dataset, in addition to the data file containing the runoff anomalies at all stations of all rivers, there are 3 directories: the figures directory, containing the figures fig_RiverStationsMap.jpeg (map with the representation of all stations) and
As mentioned earlier, the TAAD dataset encompasses redundant parameters to facilitate comparative studies of anomalies from different data sources. To illustrate the value of this, we have chosen several points at which to compare anomalies from different data sources (see Fig. 2 and Table 1 for the locations of these points). The WPP was chosen because of the Warm Pool that appears in that region from February to April or May (Fig. 2a). The SPP was located where a permanent Salty Pool is present (Fig. 2b). At the CHLP, the chlorophyll concentration varies according to the Amazon River plume (Fig. 2c, adapted from [19]). The CURP is where the retroflection of the North Brazil Current (NBC) feeds the North Equatorial Countercurrent (Fig. 2d, adapted from [20]). Finally, the WINP was chosen because it is the place where the Intertropical Convergence Zone (ITCZ) shows maximum variability (Fig. 2e).
Fig. 3 shows comparisons of anomalies between similar parameters from different data sources (sea surface temperature, sea surface salinity, chlorophyll concentration, current velocity and surface wind); the term "similar parameters" means that they are the same parameters but obtained from different datasets (see Table 4). Fig. 3 also shows a comparison between runoff anomalies (MARDAO dataset) at station 3629000 (Amazon River) and station 1147010 (Congo River).
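The comparisons in Fig. 3 are graphical. As a possible numerical complement, the Python sketch below computes the Pearson correlation and the root-mean-square difference between two anomaly series taken at the same point from two data sources. It is an illustration only and is not a method used by the authors: the function name `compare_anomalies` is hypothetical, and the sketch assumes that both series are already aligned on the same months and that missing values are NaN.

```python
import math


def compare_anomalies(a, b):
    """Return (pearson_r, rms_difference) for two aligned anomaly series.

    Months where either series is NaN are skipped. The alignment of the
    two series on a common monthly time axis is assumed, not checked.
    """
    pairs = [(x, y) for x, y in zip(a, b)
             if not (math.isnan(x) or math.isnan(y))]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two overlapping valid months")

    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")

    r = cov / math.sqrt(var_x * var_y)
    rms = math.sqrt(sum((x - y) ** 2 for x, y in pairs) / n)
    return r, rms
```

A high correlation with a large RMS difference would suggest that two sources agree on the timing of anomalies but not on their amplitude.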

Experimental Design, Materials and Methods
The data from the original datasets that were used to calculate the anomalies had different frequencies: every 3 hours, every 6 hours, daily and monthly. The MARDAO and TAAD datasets are presented as monthly anomalies, so monthly averages were first calculated for the datasets sampled more frequently than monthly. In the case of precipitation from the GPCP dataset, the data were organized as daily precipitation, so the accumulated precipitation in each month was calculated instead.
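The aggregation step described above can be sketched in Python as follows. This is a hedged illustration, not the authors' MATLAB code: the function name `monthly_aggregate` and its `how` argument are hypothetical, and skipping NaN samples is an assumption that the text does not specify.

```python
from collections import defaultdict
from datetime import datetime
import math


def monthly_aggregate(times, values, how="mean"):
    """Collapse sub-monthly samples to one value per (year, month).

    how="mean" averages the samples of each month; how="sum" accumulates
    them, as described for daily precipitation. NaN samples are skipped
    (an assumption), so a month with no valid sample is absent from the
    result.
    """
    buckets = defaultdict(list)
    for t, v in zip(times, values):
        if not math.isnan(v):
            buckets[(t.year, t.month)].append(v)

    result = {}
    for key in sorted(buckets):
        total = sum(buckets[key])
        result[key] = total if how == "sum" else total / len(buckets[key])
    return result
```

The same function covers both cases in the text: `how="mean"` for the 3-hourly, 6-hourly and daily fields, and `how="sum"` for the monthly accumulated precipitation.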
Once all the grids (TAAD dataset) and stations (MARDAO dataset) had monthly frequency, the anomalies were calculated using the MATLAB script set called CalcPlotAnomaly, and all the NetCDF files were created using the MATLAB script set called mNC. Once these processes were completed, all metadata were added using the NCO software.
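The text does not give the formula implemented in CalcPlotAnomaly. A common definition of a monthly anomaly, assumed here, is the monthly value minus the long-term mean of the same calendar month over the whole record. The Python sketch below applies that definition to a single time series; the function name `monthly_anomalies` is hypothetical, and the choice of the full record as the reference period is also an assumption.

```python
from collections import defaultdict


def monthly_anomalies(monthly):
    """Compute anomalies from a dict mapping (year, month) -> monthly value.

    Each anomaly is the value minus the mean of all values sharing the
    same calendar month (assumed definition; the reference period is the
    whole record supplied).
    """
    by_month = defaultdict(list)
    for (_, month), value in monthly.items():
        by_month[month].append(value)

    # Long-term mean for each calendar month (the monthly climatology).
    climatology = {m: sum(v) / len(v) for m, v in by_month.items()}

    return {key: value - climatology[key[1]]
            for key, value in monthly.items()}
```

For a gridded field, the same calculation would be repeated at every grid point (and depth level for the 4D grids).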

Ethics Statements
Not applicable.

Declaration of Competing Interest
The authors declare that there is no conflict of interest regarding the publication of this article. The authors also declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.