Utility of acoustic indices for ecological monitoring in complex sonic environments

With the continued adoption of passive acoustic monitoring as a tool for rapid and high-resolution ecosystem monitoring, ecologists are increasingly making use of a suite of acoustic indices to summarise the sonic environment. Though these indices are often reported to represent some aspect of the biology of an ecosystem well, the degree to which they are confounded by various extraneous sonic conditions is largely unknown. We conducted an aural inventory across 23 field sites in Okinawa to identify the number of unique animal sounds present in recordings. Using these values of ‘measured richness’, we then examined how the performance of 11 commonly used acoustic indices varied across a range of sonic conditions (including in the presence and absence of insect stridulations, audible wind or rain, and human-related sounds). Our analysis identified both well- and poor-performing acoustic indices, as well as those that were particularly sensitive to sonic conditions. Only two indices reflected measured richness across the full range of sonic conditions examined. A few indices were relatively insensitive to extraneous sonic conditions, but no index correlated with measured richness when masked by sound from broadband stridulating insects. Our results demonstrate considerable sensitivity of most commonly used acoustic indices to confounding sonic conditions, highlighting the challenges of working with large acoustic datasets collected in the field. We make practical recommendations for acoustic index use based on study design, with the aim of identifying the suite of acoustic indices with greatest utility as indicators for rapid biodiversity monitoring and management of the world’s natural soundscapes.


Introduction
The nascent field of ecoacoustics focuses on the properties and dynamics of biological sound while considering the acoustic environment in which sound is produced (Servick, 2014; Sueur and Farina, 2015). The field extends bioacoustics by shifting away from identifying individual species and their vocal properties (e.g. Aide et al., 2013) towards quantifying the 'soundscape', the suite of all observable sounds produced in an ecosystem (Pijanowski et al., 2011b). Ecoacoustics is rooted in the idea that all vocalising animals compete for niche space in the sensory environment (Krause, 1987). Since acoustic space is limited, animals must partition the soundscape across temporal and frequency domains to be heard (Marín-Gómez et al., 2020; Slabbekoorn, 2018). This, in turn, means that soundscapes can represent ecosystem condition, since more diverse systems should exhibit increasingly structurally complex and diverse use of acoustic space (Dumyahn and Pijanowski, 2011; van der Lee et al., 2020). Accordingly, soundscapes are often considered a proxy for various facets of ecological communities, including their diversity (Gasc et al., 2013; Harris et al., 2016; Mammides et al., 2017), abundance (Boelman et al., 2007; Buxton et al., 2018; Pieretti et al., 2011), and biomass (Elise et al., 2019).
With recent advances in recording, processing and data-storage technology, passive acoustic monitoring is becoming increasingly tractable on even a moderate research budget (Gibb et al., 2019). Acoustic monitoring approaches have several advantages over traditional survey techniques, including their ability to produce data on a fine temporal scale (Ross et al., 2018), capture long-term trends (Sueur et al., 2019), include or target rare species (Znidersic et al., 2020), and facilitate surveys of remote or challenging environments (Burivalova et al., 2019a). Accordingly, there is growing interest in using soundscapes to ask ecological questions (Burivalova et al., 2019; Deichmann et al., 2018; Lomolino et al., 2015; Sugai et al., 2019) and in establishing acoustic recording protocols to monitor changes in the soundscape through time and space (Gasc et al., 2018; Ross et al., 2018; Sethi et al., 2020; Sueur et al., 2019).
Since the development of ecoacoustics as a field, there has been a proliferation of acoustic indices that aim to characterise the ecology of a system (Kasten et al., 2012; Pieretti et al., 2011; Sueur et al., 2014). Most such indices work by summing or contrasting the acoustic power within different frequency ranges. Sueur et al. (2014) reviewed the range of indices for summarising the soundscape, while recognising that there is never likely to be a single value that accurately represents all levels of local or regional biodiversity. Although some acoustic indices may correlate, positively or negatively, with species richness (Depraetere et al., 2012; Jorge et al., 2018), inconsistencies in these relationships (Bradfer-Lawrence et al., 2020) have led to concerns over their broad applicability across taxa and habitats (Gibb et al., 2019; Mammides et al., 2017).
However, soundscapes are rich in information beyond biological signals that indicate the presence of particular animals. For example, soundscapes may provide information on landscape or habitat structure (Burivalova et al., 2019; Fuller et al., 2015), the presence of particular weather conditions or storms (Sánchez-Giraldo et al., 2020), background levels of anthropogenic noise and its change (Gill et al., 2017), or the presence and identity of atypical sounds such as illegal logging activity (Sethi et al., 2020). Moreover, soundscapes include valuable information on the phenology (e.g. Oliver et al., 2018) and temporal dynamics of vocalising animals (Gottesman et al., 2020; Marín-Gómez et al., 2020), including disruptive broadband insect choruses (Hart et al., 2015). Although this study focuses on the ability of acoustic indices to act as proxies for biodiversity, we acknowledge the myriad ways soundscapes and acoustic indices provide information beyond merely acting as biodiversity indicators.
The ability of acoustic indices to reflect some facet of biodiversity (that is, their performance) may be sensitive to a range of conditions. For example, human-related sounds ('anthropophony'; Kasten et al., 2012) interfere with animal communication, particularly in the lower frequency range, and several acoustic indices may consequently perform poorly in urban areas (Fairbrass et al., 2017). Similarly, soniferous insects, particularly stridulating orthopterans and cicadas, can mask other animal sounds with their broadband choruses (Hart et al., 2015). Sonic conditions such as these could, therefore, decouple acoustic indices from the biology they supposedly represent. Consequently, there is a need to examine the change in performance (henceforth sensitivity) of different acoustic indices to a broad range of sonic contexts in order to identify a best performing index or suite of indices across these conditions with utility for biodiversity monitoring (Eldridge et al., 2018; Harris et al., 2016).
Here, we test the performance of 11 commonly used acoustic indices across a range of sonic conditions. Our aim is to identify whether particular indices outperform others as biodiversity indicators across contexts. We use an aural inventory approach to manually count unique biotic sound types (similar to vocalising species richness) from audio samples collected as part of the OKEON Churamori Project, comprising >13,800 min of recordings from across the island of Okinawa ('Okinawajima'), Japan (Ross et al., 2018). We use these data to compare the performance of acoustic indices, measured as the strength of their relationships (positive or negative) with richness, across three sets of sonic conditions, namely in the presence and absence of each of anthropophony, soniferous insects, and geophony (wind, rain etc.). Based on our findings, we then provide guidance on which indices are likely to perform well for a given study, and we make recommendations for future acoustic surveys to make best use of acoustic indices as indicators of biodiversity.

Acoustic monitoring
This study was part of the OKEON Churamori Project (Okinawa Environmental Observation Network; OKEON 美ら森プロジェクト) in Okinawajima, the largest island of the Ryukyu archipelago, Japan. OKEON (www.okeon.unit.oist.jp) uses a suite of complementary monitoring techniques to monitor Okinawa's ecosystems in space and time. At each of the project's 24 field sites, representing the full range of land cover types on the island, a Song Meter SM4 recorder (Wildlife Acoustics Inc., Concord, MA, USA) has been installed on a tree at approximately breast height (~1.3 m) since at least February 2017 (Ross et al., 2018). These devices are used to record sound at default gain settings (+16 dB) with two omnidirectional microphones. Recorders are programmed to record for 10 min at the beginning of each hour and half-hour (i.e. 10 min on, 20 min off). Data are saved in stereo WAVE format at a sampling rate of 48 kHz to an SD card, which is collected on a two-week field rotation schedule when batteries and SD cards are replaced. Data are then archived with the Okinawa Institute of Science and Technology's high-performance computing centre.
Overall, Okinawajima presents a challenging sonic environment in which to monitor biological sound. Okinawa's North-South urbanisation gradient results in considerably more anthropophony in the South than in the forested North of the island (Ross et al., 2018). Geophony is common from late summer through winter, particularly during rainy season and typhoon season, when Okinawajima often experiences heavy winds and rain in short bursts. Summer soundscapes are dominated by cicada choruses during daytime and stridulating orthopterans at night. Anuran choruses are common year-round, but peak during breeding season (usually winter-spring but breeding season varies by species). Okinawajima has a rich avifauna (McWhirter et al., 1996), including several species endemic to the Northern Yanbaru forest (Itô et al., 2000) and migratory species such as the Ruddy Kingfisher (Halcyon coromanda) and Grey-faced Buzzard (Butastur indicus). Nocturnal Ryukyu Flying Foxes (Pteropus dasymallus) are also common across the island, including in urban areas. Most field sites are within audible distance of a road (Ross et al., 2018), and some recordings include daily tests of the typhoon warning system. Commercial aeroplane traffic and military activity (including aircraft noise and occasional audible gunshots) make additional contributions to the anthropogenic component of Okinawa's soundscape (Cox, 2010).

Aural inventory
We conducted an aural inventory by selecting 64 recordings of 10-min duration from each of 23 OKEON field sites (Ross et al., 2018); one site was excluded due to a mismatch in the temporal extent of available data. To ensure recordings were equally representative of both seasons and time-of-day at each site, they were selected from the middle month of each of Okinawa's seasons (January, April, July, and October), from four dates within each season (the 5th, 10th, 15th and 20th of each month), and at each of four times of day (midnight, midday, and two samples that aimed to capture dawn and dusk choruses at sunrise and sunset, respectively). Dawn chorus times were adjusted seasonally using the closest recording before the observed sunrise time for each season, and after observed sunset for the dusk chorus. This resulted in a total of 13,860 min of audio data across all sites.
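The stratified selection described above can be enumerated programmatically. The following is a minimal sketch rather than the authors' own procedure; the season labels attached to each month and all variable names are assumptions introduced for illustration.

```python
from itertools import product

# Middle month of each season as listed in the text; the season labels
# attached to each month are an assumption made for this illustration.
SEASON_MONTHS = {"winter": 1, "spring": 4, "summer": 7, "autumn": 10}
DAYS_OF_MONTH = (5, 10, 15, 20)
TIMES_OF_DAY = ("midnight", "dawn", "midday", "dusk")


def sampling_slots():
    """Return every (season, month, day, time-of-day) slot for one site."""
    return [
        (season, month, day, time_of_day)
        for (season, month), day, time_of_day in product(
            SEASON_MONTHS.items(), DAYS_OF_MONTH, TIMES_OF_DAY
        )
    ]


slots = sampling_slots()
print(len(slots))  # 64 slots per site (4 seasons x 4 dates x 4 times)
```

Each slot would then be matched to the nearest available 10-min recording, with the dawn and dusk slots shifted seasonally as described in the text.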
We counted the number of unique biotic sound types (animal sounds including birds, insects, frogs, geckos etc.) in each 10-minute recording (following Depraetere et al., 2012; Machado et al., 2017). This aural inventory approach comprised listening to recordings while visually inspecting spectrograms produced via Fast Fourier Transformation (window size = 256) in Kaleidoscope Pro (ver 5.1.8; Wildlife Acoustics Inc., Concord, MA, USA). Biotic sound-type richness (henceforth 'measured richness') was quantified as the number of distinct sound-types corresponding to unique species based on their spectral properties and cross-referenced with appropriate databases to prevent oversplitting sound classes (e.g. www.xeno-canto.org).
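For readers unfamiliar with how such spectrograms are built, the sketch below computes a power spectrogram with a 256-sample FFT window using NumPy. It is illustrative only and does not reproduce Kaleidoscope Pro: the Hann window, the absence of overlap between frames, and the synthetic 3 kHz test tone are all assumptions.

```python
import numpy as np


def spectrogram(signal, sample_rate, window_size=256):
    """Power spectrogram from non-overlapping, Hann-windowed FFT frames."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // window_size
    # Drop trailing samples that do not fill a whole frame.
    frames = signal[: n_frames * window_size].reshape(n_frames, window_size)
    spectrum = np.fft.rfft(frames * np.hanning(window_size), axis=1)
    power = np.abs(spectrum) ** 2
    freqs = np.fft.rfftfreq(window_size, d=1.0 / sample_rate)
    times = (np.arange(n_frames) + 0.5) * window_size / sample_rate
    return freqs, times, power.T  # rows are frequency bins, columns are frames


# Synthetic one-second 3 kHz tone at the 48 kHz sampling rate used in the study.
sample_rate = 48_000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 3000 * t)

freqs, times, power = spectrogram(tone, sample_rate)
peak_frequency = freqs[power.mean(axis=1).argmax()]
print(peak_frequency)  # frequency bin holding the most energy
```

At 48 kHz a 256-sample window gives a frequency resolution of 187.5 Hz per bin, which trades frequency detail for fine temporal resolution.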
For each recording, we noted the presence or absence of three distinct sonic conditions (Table A1). If there was audible human-related sound in the recording, we marked the recording as anthropophony present. We also noted the presence or absence of audible geophony (e.g. wind, rain) in each recording. Finally, if the recording contained audibly stridulating soniferous insects, we marked the recording as insects present. Note that this classification includes both broadband sounds produced by nocturnal orthopterans and the sonically disruptive cicada activity characteristic of Okinawa's summertime, though these categories can be separated by time-of-day. The total biodiversity of a system includes stridulating insects, but these insects contribute ambient long-duration broadband sound to the soundscape rather than transient animal signals such as birdsong (Hart et al., 2015), justifying inclusion of insect sounds both as biotic signal and as a potentially confounding sonic condition.

Acoustic index processing
For each 10-minute recording, we calculated a number of commonly used acoustic indices (Table A2) in R (ver 4.0.0; R Core Team, 2020) using the seewave (ver 2.1.6; Sueur et al., 2008a) and soundecology (ver 1.3.3; Villanueva-Rivera and Pijanowski, 2018) packages. These indices were acoustic complexity (ACI, Pieretti et al., 2011); acoustic diversity (ADiv) and acoustic evenness (AEve, Villanueva-Rivera et al., 2011); the bioacoustic index (BioA, Boelman et al., 2007), and the related acoustic entropy (H) and temporal entropy indices (H_t, Sueur et al., 2008b); acoustic richness (ARic) and the median of the amplitude envelope (M, Depraetere et al., 2012); and the normalised difference soundscape index (NDSI), calculated by combining two component indices which we also analysed separately, anthropophony (NDSI_Anthro) and biophony (NDSI_Bio, Kasten et al., 2012). Though various subsets of these indices have been related previously to species richness estimated with point counts in the field or via an aural inventory approach (Table A2), no study we are aware of has examined such a comprehensive set of indices across the broad range of sonic conditions considered here (Bradfer-Lawrence et al., 2020).
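To give a sense of what one of these indices measures, the sketch below computes a temporal entropy value in Python. It is not the seewave implementation used in the study: it assumes that temporal entropy is the Shannon entropy of the normalised amplitude envelope divided by its maximum possible value, with the envelope taken as the magnitude of the analytic signal, and the function names and test signals are hypothetical.

```python
import numpy as np


def amplitude_envelope(signal):
    """Magnitude of the analytic signal, built with an FFT-based Hilbert transform."""
    n = len(signal)
    spectrum = np.fft.fft(signal)
    weights = np.zeros(n)
    if n % 2 == 0:
        weights[0] = weights[n // 2] = 1.0
        weights[1 : n // 2] = 2.0
    else:
        weights[0] = 1.0
        weights[1 : (n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def temporal_entropy(signal):
    """Shannon entropy of the normalised envelope, scaled to lie between 0 and 1."""
    envelope = amplitude_envelope(np.asarray(signal, dtype=float))
    p = envelope / envelope.sum()
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log2(p)).sum() / np.log2(len(envelope)))


sample_rate = 8_000
t = np.arange(sample_rate) / sample_rate
steady_tone = np.sin(2 * np.pi * 440 * t)
# The same tone confined to a short burst centred at 0.5 s.
short_burst = steady_tone * np.exp(-0.5 * ((t - 0.5) / 0.01) ** 2)

print(temporal_entropy(steady_tone))  # close to 1: energy spread evenly in time
print(temporal_entropy(short_burst))  # lower: energy concentrated in time
```

Under this definition, a recording whose energy is spread evenly through time scores near one, while a recording dominated by brief events scores lower.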

Statistical analyses
We first constructed a correlation matrix from all 11 acoustic indices to determine how suites of indices may provide congruent or distinct information about the soundscape. We then tested the performance of the various acoustic indices by relating each index (Table A2) to the measured richness for each sound recording across all sites and sonic conditions (n = 1386). To test the significance of relationships with richness, we fitted generalised linear mixed models (GLMMs) and used a model selection approach to choose from a full model that included the effect of site-specific measured richness on acoustic index values, while accounting for the influence of site and time-of-day as random effects. Models were produced using a beta distribution in the glmmTMB package (ver 1.0.1; Brooks et al., 2017). The beta distribution allows our models to cope with a range of data distributions including zero-inflation (Ferrari and Cribari-Neto, 2004). In all cases, we used the Akaike Information Criterion (AIC) to assess model fit, and model simplification was based on maximum likelihood estimation to allow comparison of models with different fixed-effects structures. We assessed the support for each model component in the best fitting model by comparing AIC and using likelihood ratio tests to indicate the significance of model term removal. We measured performance as the model slope of the acoustic index ~ measured richness relationship, with the t-statistic of the model slope indicating whether the slope differs from zero. Larger absolute model slopes represent stronger relationships with measured richness, but we opted to preserve information on the direction (positive or negative) of this relationship by including the sign of the relationship in index performance values.
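The first step above, a correlation matrix across indices, can be sketched as follows. The text does not state which correlation coefficient was used, so the Pearson coefficient here is an assumption, and the toy index values are synthetic data invented purely for illustration.

```python
import numpy as np


def index_correlation_matrix(index_values):
    """Pairwise Pearson correlations between per-recording index values.

    `index_values` maps an index name to a sequence with one value per recording.
    """
    names = list(index_values)
    data = np.array([index_values[name] for name in names], dtype=float)
    return names, np.corrcoef(data)


# Synthetic values for three indices across 200 hypothetical recordings.
rng = np.random.default_rng(seed=1)
bio = rng.random(200)
toy_indices = {
    "NDSI_Bio": bio,
    "NDSI_Anthro": 1.0 - bio + rng.normal(0.0, 0.05, size=200),
    "ACI": rng.random(200),
}

names, corr = index_correlation_matrix(toy_indices)
for name, row in zip(names, corr):
    print(name, np.round(row, 2))
```

Groups of indices with strong positive or negative off-diagonal values carry overlapping information, whereas indices with values near zero describe largely independent aspects of the soundscape.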
Acoustic index values vary considerably among indices and are often skewed, so we scaled values by dividing by their maximum for each index except NDSI (Bradfer-Lawrence et al., 2020); NDSI is bounded between -1 and 1 and so was scaled instead using (NDSI + 1)/2 (Fairbrass et al., 2017). We treated measured richness as a site-specific property by using the mean of all measured richness values per site in our GLMMs. Hence, richness was a zero-bounded continuous variable rather than a value per recording (Bradfer-Lawrence et al., 2020).
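The two data-preparation steps described above can be written out as a short worked example. This is a sketch of the stated rules only; the function names, site labels, and input values are hypothetical.

```python
from collections import defaultdict


def scale_index(values, index_name):
    """Scale index values to a common range.

    NDSI lies between -1 and 1, so it is mapped with (NDSI + 1) / 2;
    every other index is divided by its maximum value.
    """
    values = list(values)
    if index_name == "NDSI":
        return [(v + 1) / 2 for v in values]
    peak = max(values)  # assumes the maximum is positive
    return [v / peak for v in values]


def site_mean_richness(records):
    """Average measured richness per site from (site, richness) pairs."""
    totals = defaultdict(lambda: [0.0, 0])
    for site, richness in records:
        totals[site][0] += richness
        totals[site][1] += 1
    return {site: total / count for site, (total, count) in totals.items()}


print(scale_index([-1.0, 0.0, 1.0], "NDSI"))  # [0.0, 0.5, 1.0]
print(scale_index([2.0, 4.0, 8.0], "ACI"))    # [0.25, 0.5, 1.0]
print(site_mean_richness([("site_a", 2), ("site_a", 4), ("site_b", 5)]))
# {'site_a': 3.0, 'site_b': 5.0}
```

Both transformations place index values on a comparable scale across indices, and the site-level mean turns richness into a continuous predictor.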
Seasonal patterns of animal activity may obscure the relationship between acoustic indices and measured richness if ephemeral animal sounds do not make the greatest contribution to the soundscape (Hart et al., 2015). We therefore additionally fitted all GLMMs with the interaction between season and measured richness as a fixed effect. To account for site-specific differences in seasonal richness changes, measured richness was taken as the mean richness per site per season. Since the interaction effect of season was always included in the best performing model (Table A3), we assessed performance of acoustic indices for each season separately.
To establish whether there were certain indices that consistently outperformed others across our chosen range of sonic conditions, we fitted GLMMs between acoustic indices and richness as above, again with site and time-of-day as random effects, but with an interaction effect for each of the three sets of focal sonic conditions (Table A1). We measured performance as the model slope for the relationship between each acoustic index (Table A2) and measured richness in the absence of our three potentially confounding sonic conditions (Table A1). To quantify the sensitivity of each index to the various sonic conditions, we took the inverse of the absolute change in model slope in the presence and absence of a sonic condition as a measure of whether the relationship between indices and measured richness improves or declines in its presence. We used the inverse of absolute slope change because doing so produces intuitive sensitivity scores which increase as acoustic indices are more affected by a sonic condition. Using t-statistics, we noted whether model slopes differed from zero in the presence of each sonic condition, regardless of the index's sensitivity to the condition. Finally, to determine whether the sensitivity of acoustic indices varied across seasons, we constructed the same GLMMs but with a three-way interaction between measured richness, each sonic condition in turn, and season (Appendix A1.2).

Results
We found two suites of positively correlated acoustic indices (Fig. 1). One comprised NDSI, NDSI_Bio, BioA, ADiv, and H, which were positively correlated with one another; the other comprised AEve and NDSI_Anthro, which were also positively correlated with each other but negatively related to the indices in the first suite (Fig. 1). The remaining indices (that is, ARic, H_t, ACI, and M) were largely independent of each other, though ARic and H_t correlated positively (Fig. 1).

Acoustic index performance
When considering the full range of sonic conditions included in our study, and ignoring seasonality, most acoustic indices performed poorly and did not reflect measured richness (Fig. 2). In fact, only H_t was associated significantly with richness (t = -2.01, P = 0.044) irrespective of conditions, though the performance of ARic, which had the greatest absolute model slope, was bordering on statistical significance (t = -1.95, P = 0.051).
For all indices, the best performing model always included the interaction effect of season on measured richness (Table A3). For a given season the identity of the best performing index varied, with some indices performing better in particular seasons (Fig. 2B, S1). ARic and H_t still performed best overall, with significant relationships during spring and autumn for both indices, plus for winter in the case of H_t. During winter, NDSI_Anthro and NDSI_Bio were, respectively, positively (t = 2.2, P = 0.028) and negatively (t = -2.45, P = 0.014) related to measured richness, while ACI related positively to measured richness in spring (t = 2.51, P = 0.012). Only one index, H, was significantly related to measured richness during summer (t = 1.97, P = 0.049; Fig. 2B).
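As a worked check on how the test statistics and P values in this section relate to one another, the sketch below converts each reported statistic into a two-sided P value. It assumes that the statistics are Wald-type values referred to a standard normal distribution; the text calls them t-statistics and does not state degrees of freedom, so this is an interpretation rather than a description of the authors' computation.

```python
import math


def two_sided_p(statistic):
    """Two-sided P value for a statistic referred to a standard normal distribution."""
    return math.erfc(abs(statistic) / math.sqrt(2))


# (statistic, reported P) pairs taken from the results reported above.
reported = [
    (-2.01, 0.044),
    (-1.95, 0.051),
    (2.2, 0.028),
    (-2.45, 0.014),
    (2.51, 0.012),
    (1.97, 0.049),
]

for statistic, p_reported in reported:
    print(statistic, round(two_sided_p(statistic), 3), p_reported)
```

Under this assumption, each computed value agrees with the reported P value to three decimal places.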

Acoustic index sensitivity
There were only a few cases for which the underlying index ~ richness relationship was significant in the presence of our potentially confounding sonic conditions. The performance of H_t increased and the index remained significantly related to richness in both the presence of geophony (slope = -0.3, t = -2.12, P = 0.034) and anthropophony (slope = -0.29, t = -2.09, P = 0.037). The performance of ARic also increased in the presence of anthropophony (ARic ~ richness slope = -0.49, t = -2.01, P = 0.044). However, none of our focal acoustic indices were ever related significantly to measured richness in the presence of insect noise (P > 0.05). None of the remaining indices had index ~ richness relationships that differed significantly from zero in the presence of anthropophony, geophony or insect noise, even if their performance increased in the presence of one of these sonic conditions (Fig. 3).
significant decrease in absolute model slope when anthropophony was present, and three (BioA, ARic, M) performed significantly worse in the presence of geophony. Conversely, six indices (ACI, H_t, NDSI_Anthro, AEve, ARic, M) had significantly lower absolute model slopes in the presence of insect noise, while H performed significantly better in the presence than in the absence of insect noise (t = 3.65, P < 0.001, Fig. 3). However, H still did not correlate with measured richness when insect noise was present (slope = 0.18, t = 1.2, P = 0.23).
None of the acoustic indices in this study were affected significantly by all three sonic conditions. The performance of both NDSI and NDSI_Bio was not sensitive to the presence of any of our focal sonic conditions (P > 0.05; Fig. 3). Moreover, we found that season influenced the sensitivity rankings of indices and the identity of the sonic conditions to which many indices were most sensitive (Fig. A2-A4). However, some indices (e.g. ACI) were notably less affected by season than others.

Discussion
Our study addresses the utility of different acoustic indices in reflecting a meaningful facet of biodiversity (that is, 'measured richness') across sonic conditions. We found that acoustic richness (ARic) and the temporal entropy index (H_t) outperformed other indices in terms of their relationship with measured richness. We also identified indices that were relatively insensitive to confounding sonic conditions (H, NDSI, NDSI_Bio). Practical users of these indices should take into consideration their performance across a relevant range of sonic conditions. When designing acoustic studies, researchers should tailor to their study design the selection of biodiversity facet(s) and of sonic conditions to include in models (Gasc et al., 2013, 2018; Harris et al., 2016; Hart et al., 2015). These may differ from our chosen conditions, particularly when considering aquatic soundscapes, for example (Gottesman et al., 2020; van der Lee et al., 2020). Consideration of both performance and sensitivity across sonic conditions when selecting acoustic indices will be particularly important as ecoacoustics shifts increasingly towards answering questions of broad ecological interest (Gasc et al., 2018; Lomolino et al., 2015; Sueur et al., 2019).
We found that H_t was related negatively to richness, consistent with other tests of this index (Buxton et al., 2018; Eldridge et al., 2018). Acoustic richness was also related negatively to measured richness here and elsewhere (Mammides et al., 2017; but see Depraetere et al., 2012). Other studies testing the performance of acoustic indices have not typically considered such a broad range of sonic conditions and land cover types as our study, likely contributing to the inconsistencies among studies testing the performance of particular acoustic indices (Bradfer-Lawrence et al., 2020; Eldridge et al., 2018; Fairbrass et al., 2017; Fuller et al., 2015; Machado et al., 2017; Mammides et al., 2017; Zhao et al., 2019). Our finding of a few well-performing indices is then particularly noteworthy given that our models included a range of confounding sonic conditions most often excluded in studies testing the performance of acoustic indices (Buxton et al., 2018; Harris et al., 2016).
Acoustic indices were related significantly to measured richness in only three cases. Despite its high sensitivity to insect stridulations and to geophony both here and elsewhere (Depraetere et al., 2012), ARic performed well in the presence of anthropogenic sound. This may be a product of ARic's calculation method; acoustic richness is a function of both M and temporal entropy (Depraetere et al., 2012), and anthropophony is typically associated with temporally invariable low-frequency patterns in the soundscape (Pieretti et al., 2011). H_t was significantly related to measured richness under two sonic conditions. H_t correlated with richness in the presence of geophony, but to our knowledge, the performance of H_t has not been tested previously under these conditions; the index was developed and tested by Sueur et al. (2008b) after applying a high-pass filter to remove the sonic effects of wind. H_t also performed well in the presence of anthropophony, and Depraetere et al. (2012) note that H_t was not affected by anthropogenic background noise in their study. Both NDSI and NDSI_Bio were insensitive to the three focal sonic conditions, likely because NDSI compares a ratio of acoustic energy in high frequency to low frequency bands, making it relatively insensitive to broadband sounds of evenly distributed amplitude across frequencies, while NDSI_Bio excludes frequency bands associated with anthropophony (Kasten et al., 2012).
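The band-contrast logic behind NDSI can be illustrated with a simplified sketch. It is not the soundecology implementation: it assumes NDSI takes the form (biophony - anthropophony) / (biophony + anthropophony), sums raw FFT power within each band, and uses band limits of 1-2 kHz and 2-8 kHz that are illustrative assumptions rather than the settings used in the study.

```python
import numpy as np


def band_power(signal, sample_rate, low_hz, high_hz):
    """Total spectral power between low_hz (inclusive) and high_hz (exclusive)."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    in_band = (freqs >= low_hz) & (freqs < high_hz)
    return float(power[in_band].sum())


def ndsi(signal, sample_rate, anthro_band=(1000, 2000), bio_band=(2000, 8000)):
    """Simplified NDSI: contrast between biophony and anthropophony band power."""
    anthro = band_power(signal, sample_rate, *anthro_band)
    bio = band_power(signal, sample_rate, *bio_band)
    total = bio + anthro
    if total == 0:
        return 0.0  # no energy in either band
    return (bio - anthro) / total


sample_rate = 48_000
t = np.arange(sample_rate) / sample_rate
low_tone = np.sin(2 * np.pi * 1500 * t)   # falls in the assumed anthropophony band
high_tone = np.sin(2 * np.pi * 4000 * t)  # falls in the assumed biophony band

print(ndsi(high_tone, sample_rate))             # near +1
print(ndsi(low_tone, sample_rate))              # near -1
print(ndsi(low_tone + high_tone, sample_rate))  # near 0
```

Because the index depends only on the balance of energy between the two bands, sound that adds energy to both in similar proportion shifts the value relatively little.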
Fig. 3. Sensitivity of acoustic indices to extraneous sonic conditions. Sensitivity is the inverse of the absolute difference in model slope (that is, change in performance) of the acoustic index-mean site-level measured richness relationship between models where our three focal sonic conditions are absent (baseline) versus present. Sensitivity values below one indicate an increase in the strength of the model slope (performance improves), while values above one indicate a decrease in model slope (performance declines) under a given sonic condition, that is, in the presence of audible anthropophony (purple), geophony (turquoise) and broadband insects (lime green). Asterisks represent significant differences between model slopes (two-sided test, P = 0.05) based on t-statistics. Indices are ordered top-bottom by sensitivity values.
When conducting an aural inventory, long-duration broadband insect sounds contribute both signal and noise to the soundscape (Hart et al., 2015). Insect noise consistently reduced the performance of all acoustic indices, with only BioA being insensitive to insect noise, though it was not related significantly to richness in the presence of broadband insect stridulations. Eldridge et al. (2018) suggested that BioA is robust to insect noise because it is calculated across a limited range of frequency bands, largely excluding confounding effects of high frequency insect noise and low frequency anthropophony and geophony. We also found seasonal effects on both the performance and sensitivity of many acoustic indices, with H being least sensitive to seasonality (see also Mammides et al., 2017), though the generality of seasonal differences in acoustic index performance is doubtful; seasonal effects likely depend on features of the acoustic environment determined by landscape configuration and vocalising species composition (Burivalova et al., 2019; Deichmann et al., 2018; Fuller et al., 2015; Sethi et al., 2020). Several acoustic indices have been proposed as being particularly capable of accounting for potentially confounding sonic conditions (Eldridge et al., 2018; Fairbrass et al., 2017; Kasten et al., 2012; Pieretti et al., 2011). For example, we found that ACI was insensitive to most background sounds with the exception of insect noise (see also Eldridge et al., 2018; Pieretti et al., 2011). However, its poor performance makes ACI a poor indicator of species richness (Mammides et al., 2017). Of the indices designed to be insensitive to anthropophony (ACI, ADiv, BioA, NDSI; see Fairbrass et al., 2017), biophony (NDSI_Bio) was most robust to anthropogenic noise.
Though many indices did not reflect measured richness in our study, their robustness to sonic conditions (e.g. BioA's robustness to insect noise, or biophony's low sensitivity to anthropophony) makes them candidate ecological indicators should it be demonstrated that such indices reflect well a meaningful facet of biodiversity or habitat quality (Elise et al., 2019; Gasc et al., 2013). Until such time, the burden of proof remains on individual ecoacoustic studies to choose acoustic indices that can be interpreted meaningfully. Taken together, H_t, ARic, and perhaps also NDSI or NDSI_Anthro may be the most suitable indicators of richness under the full range of sonic conditions included in this study (Table 1). In the case of H_t, this index fulfilled both desirable conditions: 1) it correlates with richness, and 2) it performs comparatively well across the range of sonic conditions tested, though it remains sensitive to insect noise. In exceptional circumstances, such as with heavy cicada activity, it may be preferable to use a less correlated but robust index in the face of a particular disturbance (Table 1), but we recommend cautious interpretation of results in such cases.
As the suite of acoustic indices available for ecologists to rapidly summarise audio recordings continues to grow, there will no doubt be continued debate surrounding the existence of a single best index (Sueur et al., 2014). Yet this accumulation of indices may also lead to redundancy. We found that several of the acoustic indices most common in the literature, including our two best performing indices (ARic and H_t), were highly correlated. However, redundancy is unlikely since acoustic indices often relate to different aspects of the overall soundscape (Bradfer-Lawrence et al., 2019). Equally, indices may reflect different features of ecological communities and their dynamics. We have focused on the richness of biological sounds, but acoustic indices may reflect additional dimensions of biodiversity including abundance, evenness, and functional or phylogenetic diversity (Elise et al., 2019; Gasc et al., 2013; Harris et al., 2016; Mammides et al., 2017; Pieretti et al., 2011).
The proliferation of ecoacoustic research brings with it an abundance of methodological refinement (Bradfer-Lawrence et al., 2019) and scores of new acoustic indices (Sueur et al., 2014). To remedy the disparity among previous studies, Bradfer-Lawrence et al. (2020) suggested considering species richness an emergent site-level characteristic in studies of acoustic index performance. In doing so, they revealed consistent soundscape patterns across indices, whereby sites with higher species richness had more uneven soundscapes (Bradfer-Lawrence et al., 2020). We find support for this pattern with two indices not included in their study: H_t and ARic. H_t was related negatively to measured richness, meaning that as site-level species richness increases, the temporal evenness of the soundscape decreases (Sueur et al., 2008b). Further, since acoustic richness values are a ranked function of both temporal entropy and the median of the amplitude envelope (M) (Depraetere et al., 2012), the decrease in acoustic richness we observed with increased site-level richness likely reflects an increase in soundscape amplitude rather than in evenness. Overall, we can thus infer that sites with higher richness of vocalising species exhibit more structurally complex (less even) but more acoustically active (louder) soundscapes (Bradfer-Lawrence et al., 2020; Dumyahn and Pijanowski, 2011). Remaining discrepancies between the results of many previous studies considering acoustic index performance may result from the sensitivity of indices to particular sonic conditions (Eldridge et al., 2018; Harris et al., 2016).
Indeed, our finding that most acoustic indices did not relate significantly to species richness, even when accounting for data distribution and zero-inflation, is likely a consequence of both seasonality and the wider range of sonic conditions considered in this study than in previous tests of acoustic index performance (Bradfer-Lawrence et al., 2020; Buxton et al., 2018; Harris et al., 2016).
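As an illustration of how acoustic richness depends on both temporal entropy and the median of the amplitude envelope, the following is a minimal sketch rather than the implementation used in this study. It assumes the commonly cited formulation in which acoustic richness for each of n recordings is rank(H t) × rank(M) / n², and it approximates the amplitude envelope by the absolute amplitude instead of a Hilbert transform. M is left unscaled by bit depth. All function names are hypothetical.

```python
import numpy as np


def amplitude_envelope(signal):
    # Assumption: absolute amplitude stands in for the envelope;
    # a Hilbert-transform envelope is a common alternative.
    return np.abs(np.asarray(signal, dtype=float))


def temporal_entropy(signal):
    """Shannon entropy of the normalised envelope, scaled to [0, 1]."""
    env = amplitude_envelope(signal)
    if env.size < 2 or env.sum() == 0:
        raise ValueError("need at least two samples and a non-silent signal")
    p = env / env.sum()          # envelope as a probability mass function
    p = p[p > 0]                 # zero-probability bins contribute nothing
    return float(-(p * np.log2(p)).sum() / np.log2(env.size))


def median_envelope(signal):
    """Median of the amplitude envelope (M), without bit-depth scaling."""
    return float(np.median(amplitude_envelope(signal)))


def ranks(values):
    """Ranks starting at 1; ties are broken by input order (a simplification)."""
    order = np.argsort(values, kind="stable")
    out = np.empty(len(values), dtype=float)
    out[order] = np.arange(1, len(values) + 1)
    return out


def acoustic_richness(recordings):
    """Rank-based acoustic richness for a set of recordings."""
    ht = np.array([temporal_entropy(r) for r in recordings])
    m = np.array([median_envelope(r) for r in recordings])
    n = len(recordings)
    return ranks(ht) * ranks(m) / n**2


# Three toy "recordings": even, a single impulse, and an uneven but louder signal.
toy = [[1, 1, 1, 1], [0, 0, 0, 4], [3, 3, 0, 1]]
print(acoustic_richness(toy))
```

Because the value is built from ranks, it is relative to the set of recordings being compared: a recording can change its acoustic richness through a shift in either its entropy rank or its amplitude rank, which is why the two components are worth inspecting separately.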
Though we often lack the basic knowledge of species occurrences needed to interpret ecoacoustic results in a meaningful way (Machado et al., 2017), our methodological framework considering both performance and sensitivity to a range of relevant sonic conditions should facilitate appropriate and targeted use of acoustic indices. We thus recommend ground truthing acoustic indices for a given context to better interpret results, as we have described. We conclude that, with suitable guiding principles such as those outlined here (Table 1), and when ensuring that acoustic indices reflect an ecologically meaningful facet of biodiversity, such indices have the potential to provide a significant contribution to ecological monitoring in complex acoustic environments.

Table 1 Recommendations for acoustic index use under different sonic conditions. Recommended acoustic indices based on the results of this study and others when handling audio data including different conditions: presence of geophony, anthropophony, broadband insect stridulations, or study designs including different seasons. All/unknown is when not specifically considering any of the above conditions, but all may be present in the study design, hence the recommendations in this category are conservative recommendations for where sonic conditions are highly variable.

| Study Conditions | Recommended Indices | Details |
| --- | --- | --- |
| All/unknown | H t , ARic, NDSI Anthro , NDSI | Across all sonic conditions, H t and ARic performed best, followed by NDSI Anthro . NDSI performed less well but was insensitive to all three sonic conditions. Bradfer-Lawrence et al. (2020) found that species rich sites exhibit temporally variable soundscapes, and we observed this pattern in our study. |
| Anthropophony | H t , ARic, NDSI Bio | ARic and H t were related significantly to richness in the presence of anthropophony in our study and in that of Depraetere et al. (2012). NDSI Bio was insensitive to anthropophony here and elsewhere (Fairbrass et al., 2017; Kasten et al., 2012). |
| Geophony | H t , ACI | H t was related significantly to richness in the presence of geophony in our study. ACI was insensitive to geophony here and in Sánchez-Giraldo et al. (2020), but did not correlate with richness. |
| Broadband Insects | BioA | BioA was least sensitive to insect stridulations in our study. Eldridge et al. (2018) found BioA largely ignores high-frequency insect noise. |
| Multiple Seasons | H t , ARic, NDSI, H | H t and ARic did not differ largely between seasons in their performance. NDSI was not significantly affected by any sonic conditions when considering seasons. We found H was fairly robust to seasonality, as did Mammides et al. (2017). NB: seasonal effects likely differ among studies. |

CRediT authorship contribution statement
Samuel R.P-J. Ross

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.