Datasets for assessing the structure and drivers of biological sounds

Obtaining and analysing sound data can be a tedious and lengthy process. We present sound data consisting of 20,485 1 min sound recordings obtained at three sites within a rainforest landscape in southeast Cameroon. The sites differ in anthropogenic disturbance. We also present metadata for these recordings, identifying all animal vocalisations in each 1 min sound recording. Additionally, we provide a raw database with data on habitat, human activities, remoteness, accessibility, temperature, humidity, rainfall, moon phase, and mammal and bird observations in the area during the recording period. The data were used by Diepstraten & Willie (2021) to investigate the structure and drivers of biological sounds along a disturbance gradient. The data contribute to call libraries of tropical species and can also be used to build classifiers for automatic detection and classification of animal vocalisations.


Specifications
Subject: Ecology
Specific subject area: Soundscape ecology
Type of data: Table; Audio
How data were acquired: Acoustic data were acquired through passive acoustic monitoring using AudioMoth bioacoustics sensors. Local field assistants with expertise in the fauna of the study area detected and identified vocalisations in the recordings. Data on habitat, human activities, and mammal and bird occurrence were collected during transect surveys. Data on temperature, humidity, rainfall, and moon phase were collected in the area during the recording period. Accessibility and remoteness were calculated from coordinates in ArcGIS.
Data format: Raw; Analysed
Parameters for data collection: Data were collected between February and May 2020 in the northern part of the Dja Faunal Reserve's buffer zone in Cameroon, at the start of the wet season. Data were obtained in three study sites that represent a gradient of disturbance. In each of the sites, six transects of 1 km each were opened for data collection. A sound recorder was placed in the middle of each transect.
Description of data collection: All sensors were set to record the first minute of every hour. All recordings were played to local experts who detected and identified all animal vocalisations. Furthermore, all transects were surveyed to obtain data on habitat, human activities, and animal occurrence in each transect. Temperature and humidity were measured hourly at a fixed location in the study area. The amount of rainfall and the moon phase were noted daily. Accessibility and remoteness were calculated for every sensor as the distance to the nearest trail and village, respectively.
Data
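The hourly recording schedule described above makes it possible to estimate how many recordings a sensor should have produced over a deployment, which can help when checking a folder for gaps. The following is a minimal sketch under stated assumptions: the function name `hourly_recording_starts` and the deployment window are hypothetical and are not taken from the dataset.

```python
from datetime import datetime, timedelta


def hourly_recording_starts(first, last):
    """Return the start time of every on-the-hour recording in [first, last]."""
    # Sensors record the first minute of each hour, so round up to the next full hour.
    start = first.replace(minute=0, second=0, microsecond=0)
    if start < first:
        start += timedelta(hours=1)

    starts = []
    current = start
    while current <= last:
        starts.append(current)
        current += timedelta(hours=1)
    return starts


# Hypothetical deployment window for a single sensor (illustrative values only).
deployed = datetime(2020, 2, 10, 14, 30)
retrieved = datetime(2020, 2, 12, 9, 15)

expected = hourly_recording_starts(deployed, retrieved)
print(f"{len(expected)} one-minute recordings expected for this window")
```

Comparing this expected count with the number of files actually present in a sensor's folder gives a quick indication of missed recordings, for example due to battery or memory card problems.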

Value of the Data
• The files provide a substantial amount of acoustic data from an understudied region, with corresponding identifications of vocalising animals and data on ecological and anthropogenic factors.
• Ecologists and conservationists can benefit from these data, which are otherwise difficult and time-consuming to obtain.
• These data can be used for many purposes, including the analysis of the structure and drivers of a soundscape and the calculation of acoustic indices. The data can also be used to build classifiers for automatic detection and classification of animal vocalisations.

FOLDERS
The folders contain audio files recorded from February to June 2020. There are 38,065 audio files recorded by 18 sensors, including 20,413 1 min audio files used in the study by Diepstraten & Willie [1,3]. The name of each folder corresponds to the name of the sensor that recorded the audio files.

FILES
-File: Number of analysed audio files (files included in the study by Diepstraten and Willie [1,3]).
• Transect: Number of the transect where sensor was located.
• BT: Number of files for each sensor in study site La Belgique.
• PT: Number of files for each sensor in study site La Palestine.
• NT: Number of files for each sensor in study site Ngouleminanga.
• Total: Total number of audio files for all study sites together.
-File: Local, English, and scientific names of animal species
• Badjué: Name of species in local language.
• English: English name of species.
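
Because each folder is named after the sensor that recorded its files, the per-sensor file counts can be tallied directly from the directory structure. The sketch below assumes a local copy of the dataset with one subfolder per sensor and audio files carrying a `.wav` extension (in any letter case); the function name `count_audio_files` and the extension are assumptions, not details confirmed by the dataset description.

```python
from pathlib import Path


def count_audio_files(root, suffix=".wav"):
    """Count audio files in each immediate subfolder of root (one subfolder per sensor)."""
    counts = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # ignore loose files next to the sensor folders
        counts[folder.name] = sum(
            1
            for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() == suffix  # case-insensitive match
        )
    return counts
```

Calling `count_audio_files("path/to/dataset")` returns a dictionary mapping each sensor folder name to its number of audio files; summing the values gives a total that can be compared with the figures reported for the dataset.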

Experimental Design, Materials, and Methods
-Data were obtained in three study sites that differ in land-use type and conservation management in the northern part of the Dja Faunal Reserve's buffer zone in Cameroon.
-Data were collected between February and May 2020.
-In every study site, 6 transects of 1 km each were opened.
-Audio data were obtained using the following procedure:
• An AudioMoth bioacoustics sensor was deployed in the middle of every transect.
• The sensors were deployed at a height of 2 m at a 90° orientation.
• The sensors were kept in zip lock bags within a protective case, with a small hole at the location of the sensors' microphone, to protect them from rain and animals.
• All sensors were set to record the first minute of every hour at 48 kHz and 30.6 dB.
• Recordings made during the night were screened beforehand. Only night recordings with vocalisations from species other than easily recognisable insects, amphibians, or western tree hyraxes were played to the local experts.
• Recordings were played to two local experts who identified all audible species.
• Names of the vocalising species were noted down in Badjué (local language).
• Vocalisations of birds and mammals were identified by species. Vocalisations of insects and amphibians were identified by class. Unidentifiable vocalisations were recorded as "Animal unknown" or "Bird unknown".
-Field surveys were conducted to collect data on anthropogenic and ecological factors:
• Vegetation types were described at every 50 m interval in each transect.
• Human activity was described by identifying all human signs within a 2 m range perpendicular to each transect. For each sign, the location along the transect and the vegetation type were recorded.
• Mammal activity was described indirectly by identifying animal signs within 2 m on either side of the transect and recording the corresponding location along the transect, vegetation type, canopy openness, understorey openness, and horizontal visibility. The local guide identified the type of animal sign and the local name of the species. All transects were surveyed twice. Rainfall between the two surveys prevented overlap.
• Presence of great apes, namely central chimpanzees (Pan troglodytes) and western lowland gorillas (Gorilla gorilla), was described by the observation of their nests along the transects. For every nest, age, location along the transect (m), perpendicular distance from the transect (m), and circumference (cm) were recorded. Furthermore, vegetation type, canopy openness, understorey openness, horizontal visibility (m), and coordinates were noted. Additionally, for gorillas, the type of nest was described by the composition of plants used for construction. For central chimpanzee nests, the type of nest was described by its position in the tree, an estimation of the height of the nest, an estimation of the height and circumference of the tree, and the presence or absence of fruits on the tree. These surveys were conducted twice for each transect, with one month between surveys. No nests were counted twice.
• Mammal activity was described directly by slowly walking along each transect (1 km/h). For all observed mammals, number, location along the transect (m), distance between the observer and the animal (m), angle of observation, vegetation type, canopy openness, understorey openness, and horizontal visibility (m) were recorded.
• Bird activity was surveyed using point counts in fixed stations and direct observations. For the point counts, birds were recorded for 8 min at the start, middle, and end of each transect. An initial observation direction was randomly chosen and, after every two minutes of observation, the observers rotated 90° in a clockwise direction. During direct observations, birds were recorded while walking the transect in the same manner as during the direct mammal surveys. For every observation, vegetation type, canopy openness, understorey openness, and horizontal visibility (m) were recorded.
• Data on rainfall, humidity, and temperature were obtained in one site. Rainfall (mm) was measured daily. Temperature (°C) and humidity (RH) were measured hourly.
• To assess additional anthropogenic factors, the shortest straight-line distance (m) between each sound recorder and the closest village and trail was measured using ArcGIS to get proxies for remoteness and accessibility, respectively.
-See Diepstraten et al. [2] for a detailed description of the experimental design for collecting these data.
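
The remoteness and accessibility proxies were measured in ArcGIS. For readers without access to that software, the idea of a shortest straight-line distance can be approximated as in the sketch below, which uses the haversine great-circle formula in place of the ArcGIS tools. It is an illustration under assumptions, not the authors' workflow: the function names and coordinates are hypothetical, and a trail is simplified to a set of points rather than a continuous line.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_distance_m(sensor, features):
    """Shortest distance from a sensor (lat, lon) to any point in features."""
    return min(haversine_m(sensor[0], sensor[1], lat, lon) for lat, lon in features)


# Hypothetical coordinates, for illustration only (not taken from the dataset).
sensor = (3.30, 12.80)
villages = [(3.35, 12.80), (3.30, 12.90)]
trail_points = [(3.31, 12.80), (3.32, 12.81)]

remoteness = nearest_distance_m(sensor, villages)         # distance to nearest village
accessibility = nearest_distance_m(sensor, trail_points)  # distance to nearest trail point
print(f"remoteness: {remoteness:.0f} m, accessibility: {accessibility:.0f} m")
```

Treating a trail as a set of points overestimates the distance whenever the closest point of the trail lies between two of them, so densely sampled trail coordinates or a dedicated GIS tool would be preferable for real analyses.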

Ethics Statement
Our work did not involve the use of human subjects, animal experiments, or data collected from social media platforms.

Declaration of Competing Interest
The research was supported by the Antwerp Zoo Centre for Research and Conservation and the Association de la protection des grands singes. Stichting FONA and Stichting het Kronendak provided personal financial support to Johan Diepstraten. The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.