Introduction

Although most acoustic signals have evolved to influence conspecifics (Catchpole and Slater 2003; Owren et al. 2010; Mikula et al. 2021), individuals in natural communities often sing together, in time and space, with members of other species. On occasion, this generates multi-species choruses, especially in birds (Gil and Llusia 2020), mammals (Windfelder 2001), amphibians (Ulloa et al. 2019), and insects (Snedden and Greenfield 2003). Many species-specific mechanisms have been proposed to explain why single individuals signal, but few explanations exist for communal displays. A key question that has not yet been answered is whether multi-species acoustic displays merely reflect the spatial and temporal coincidence of intraspecific influences (for instance the co-occurrence of favorable conditions for several species, see review of Gil and Llusia 2020) or may be triggered by interspecific interactions. This question has not been approached in assemblages involving more than two or three species (e.g., Phelps et al. 2007; Budka et al. 2023), ultimately limiting our knowledge of how species coexist in acoustic spaces filled by the signals of many species, and of the role of sound in biotic interactions. Good study systems to fill this knowledge gap are avian communities, which generally include many species signaling at the same time, and birdsong, a conspicuous behavior that permits experimental manipulation (Kroodsma 1989).

Even if the time at which birds start singing is a species-specific trait, different species often sing simultaneously (Gil and Llusia 2020). Several hypotheses have been put forward to explain why different species sing at the same place and at the same time (Table 1). First, since multi-species choruses produce a noisy background (Brumm and Slabbekoorn 2005), and the greatest interference occurs between conspecifics as well as species with similar songs (Sueur 2002; Schmidt et al. 2013), community members should avoid the co-occurrence of their songs through spatial segregation, divergence of acoustic signals, or change in song timing (Brumm 2006). As a consequence, coexisting species should partition their acoustic space, and communal displays should involve only species with different songs (Budka et al. 2023). However, Tobias et al. (2014), Laiolo (2017), and Gayk et al. (2021) showed that bird species signaling together in time or space use acoustic signals that are more similar in design than expected by chance. Seddon and Tobias (2010) and Laiolo (2012) also experimentally demonstrated aggressiveness toward songs of other species, giving rise to the alternative hypothesis of heterospecific territory defense through the utterance of similar songs, which often involves species pairs with conserved (or potentially convergent) signals (Cody 1969) or resources (Reed 1982). Singing among heterospecifics with similar songs during the same time window may, however, also occur without communication, as habitat or other environmental factors select for species with similar acoustic features, or for other features correlated with songs. Direct selection on song features may occur through the process of acoustic adaptation: since songs with different acoustic properties transmit differentially across habitats or climatic contexts (Morton 1975; Boncoraglio and Saino 2007; Ey and Fischer 2009), species co-occurring in the same habitat may have similar songs.
Indirect selection may also occur if environmental filters act on phenotypic correlates of acoustic parameters (e.g., body or beak size; Podos 2001; Huber and Podos 2006). Finally, species may signal together because they are acoustically facilitated by heterospecifics: the acoustic activity of community members may indicate safe conditions for displaying, because of low predation risk or diluted attacks from predators (Møller 1992; Windfelder 2001).

Table 1 List of hypotheses and associated predictions for this study

The aforementioned hypotheses, namely, acoustic competition, interspecific territoriality, environmental filters (on correlates of sounds or sounds themselves), and acoustic facilitation, predict contrasting relationships between song co-occurrence patterns and the acoustic characteristics of signals, as well as contrasting behaviors after playback emission (Table 1). Here, we tested for these hypotheses by examining the ecological, intrinsic, and social correlates of interspecific song co-occurrence, and by testing whether birds started singing after broadcasting heterospecific songs. We used passive automatic recordings to examine the magnitude of (unprompted) song co-occurrence and the correlates of this behavior (similarity in songs, morphology, ecology, habitat, and phylogenetic relatedness). We then performed playback experiments broadcasting songs, recording the acoustic behavior of community members after playbacks, and testing whether changes in singing behavior depended on song structure. We exclusively focused on songs (signals emitted during the breeding season, with sexual purposes) in a community of temperate woodland birds. We performed playbacks in both spring and autumn to compare singing behavior in reproductive and non-reproductive contexts, respectively. If community members avoid acoustic interference while singing (“acoustic competition hypothesis”) (Brumm and Slabbekoorn 2005), we expect that only species with dissimilar songs sing during the same time window, since species with similar songs should partition their song activity in time (Planque and Slabbekoorn 2008). We also expect spatial avoidance among species with similar songs, as a form of acoustic competitive exclusion, and no change in bird behavior after playbacks are broadcasted, or an interruption of singing activity (especially in species with songs similar to those of the playback). 
If there is communication among species for territorial defense (“interspecific territoriality hypothesis”), as demonstrated between closely related species sharing niche features, we expect that species singing together are phylogenetically related, have similar songs, and exploit similar resources (Reed 1982; Laiolo 2012, 2017). We also expect an active response to heterospecifics and aggressive behaviors to deter an intruder, especially when songs with similar acoustic features are played back. If species sing during the same time window as other species because they are socially facilitated by them (“acoustic facilitation hypothesis”), we expect both a co-occurrence of their songs and an active response to heterospecific songs, irrespective of song similarity, since the acoustic activity of community members may indicate safe conditions independent of the acoustic characteristics of the signal. In this case, we do not expect aggressiveness and approaches toward the loudspeaker. Finally, if the environment acts as a filter (“environmental filtering hypothesis”), we expect birds showing song co-occurrence to have more similar songs or morphology, or to live in similar environments, but we do not expect an active response to playback stimuli of other species (Table 1).

Materials and methods

Passive recordings

Data collection and degree of singing during the same time window

We sampled bird choruses in chestnut and mixed-deciduous forests in March and April (breeding season) of 2020 and 2021 by placing AudioMoth recorders (version 1.1.0) (Hill et al. 2018) in an area of 58,270 ha located in Asturias (Northern Spain) (Fig. S1). We used the AudioMoth Configuration App to set the sample rate to 32 kHz and recording duration to 70 min, with no sleep duration and the gain set to medium; no calibration to specific physical sound level was performed. Recordings started 20 min before sunrise and AudioMoths were placed at chest height on tree trunks. We used two alternative time windows to analyze the co-occurrence of songs of any two species: 10 min and 30 s. The former, longer lag, was established according to Tobias et al. (2014), who considered that two species were singing together when both sang in an interval of 10 min. For this, each recording was divided into five 10-min sound files with a break of 5 min. Altogether, we collected 515 recordings from 103 localities at a > 400 m distance from each other. The shorter lag, of 30 s, was obtained within each 10-min track, from second 300 to 330 (i.e., in the middle of the track), obtaining the same sample size. All analyses were run separately considering the 10-min tracks (as a measure of gross temporal co-occurrence) and 30-s tracks (as a measure of fine temporal co-occurrence).
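The segmentation described above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the first 10-min track starts at the beginning of the 70-min recording and that the 5-min breaks fall between consecutive tracks; the function names are hypothetical and are not part of the original workflow.

```python
# Sketch of the track layout within one 70-min recording (assumed layout:
# track, break, track, ..., with the first track starting at second 0).
TRACK_S = 10 * 60      # length of each track, in seconds
BREAK_S = 5 * 60       # break between consecutive tracks, in seconds
N_TRACKS = 5           # tracks per recording


def track_windows():
    """Return the (start, end) of each 10-min track, in seconds of recording time."""
    windows = []
    start = 0
    for _ in range(N_TRACKS):
        end = start + TRACK_S
        windows.append((start, end))
        start = end + BREAK_S
    return windows


def fine_window(track):
    """Return the 30-s window (second 300 to 330 of a track) in recording time."""
    track_start, _ = track
    return (track_start + 300, track_start + 330)


windows = track_windows()
```

Under these assumptions the last track ends at second 4200 (70 min), and five tracks at each of the 103 localities give the 515 tracks reported.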

The species recorded were first identified by listening and through spectrogram inspection with Avisoft SASLab Pro Software (version 4.2) (Specht 2002), using the authors’ experience in aural bird censuses. To identify uncertain vocalizations, we used the xeno-canto repository (https://xeno-canto.org/) to visualize the spectrograms of the species we know inhabit the study area (Laiolo et al. 2018). We focused exclusively on the vocalizations that were defined as songs, because this is the only avian vocalization that involves marked vocalizing during the same time window among community members during specific hours of the day (e.g., morning) and periods of the year (spring months) at our latitudes (Fig. S2). We then built a matrix in which each 10-min track (or 30-s track) was a row, and each species a column, filling cells with 1 and 0 when the species was, or was not, singing, respectively. We focused on those species that were recorded singing in more than three plots, and the final database was composed of the 13 resident species (Certhia brachydactyla, Columba palumbus, Cyanistes caeruleus, Erithacus rubecula, Fringilla coelebs, Parus major, Periparus ater, Regulus ignicapilla, Sitta europaea, Sylvia atricapilla, Troglodytes troglodytes, Turdus merula, Turdus philomelos). Corvids, raptors, and woodpeckers, which have long-range communication but no clear vocal “song,” were excluded from our sample. We quantified pairwise song co-occurrence among species through the C-score (Stone and Roberts 1990), which is computed from the number of co-occurrences (tracks with the song of both species) for each species pair. This index is calculated from the above matrix as (Si − T) × (Sj − T), where Si is the number of singing events for species i, Sj is the number of singing events for species j, and T is the number of tracks in which both species sing. 
This index was scaled to range from 0 to 1, with zero indicating full song co-occurrence and 1 indicating full song avoidance (Gotelli and Rohde 2002; Laiolo et al. 2017). The C-score was calculated with the R package bipartite (Dormann et al. 2008), building a data matrix for 30-s tracks and another for 10-min tracks. Since the same unmarked bird could be recorded across tracks, we estimated song co-occurrence (C-score) by considering only one track per locality (the third in the series of five, being the track in which more species were singing, see Supplementary Material). These latter C-scores calculated with one track per locality were highly and positively correlated with those calculated with 10-min and 30-s tracks mentioned above (R > 0.5, p value < 0.001) and thus we did not consider them further.
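The pairwise C-score can be illustrated with a short sketch. It is not the bipartite implementation used in the study: the scaling to the 0 to 1 range is assumed here to be a division by Si × Sj (the value the index takes when two species never sing in the same track), and the presence vectors are hypothetical.

```python
def c_score(presence_i, presence_j, normalise=True):
    """Pairwise C-score from two 0/1 vectors (one entry per track).

    Returns (Si - T) * (Sj - T); if normalise is True, the value is divided
    by Si * Sj (assumed scaling) so that 0 means full song co-occurrence
    and 1 means full song avoidance.
    """
    s_i = sum(presence_i)   # tracks in which species i sings
    s_j = sum(presence_j)   # tracks in which species j sings
    t = sum(1 for a, b in zip(presence_i, presence_j) if a and b)  # shared tracks
    units = (s_i - t) * (s_j - t)
    if not normalise:
        return units
    if s_i == 0 or s_j == 0:
        raise ValueError("both species must sing in at least one track")
    return units / (s_i * s_j)


# Hypothetical presence vectors over six tracks.
species_a = [1, 1, 0, 0, 1, 0]
species_b = [1, 0, 1, 0, 1, 1]
raw = c_score(species_a, species_b, normalise=False)
scaled = c_score(species_a, species_b)
```

In this example, Si = 3, Sj = 4, and T = 2, so the raw index is (3 − 2) × (4 − 2) = 2 and the scaled index is 2/12.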

Determinants of singing during the same time window

To examine the relationship of the degree of singing during the same 30 s or 10 min with the characteristics of species in terms of song, morphology, trophic niche, and environment (habitat and climate) similarities, and in terms of phylogenetic relationships, we obtained distance matrices for each of these predictors (therefore, we addressed relationships between song segregation vs dissimilarities).

We estimated differences in species songs using the abovementioned passive recordings and audio files and selecting clear isolated sounds with low background noise for acoustic analyses. For each species, 3 songs were sampled from the recordings of at least 3 individuals, where possible, to obtain a mean value of each parameter for each species (Seddon 2005) (Table S1). We only considered tracks in which vocalizations were rendered in a series of successive songs with similar amplitude, being confident that these close sounds were produced by the same individual, and from different localities to assure individual independence. We measured seven acoustic characteristics defining the acoustic space of each species’ song: maximum frequency (Hz), minimum frequency (Hz), peak frequency (Hz; frequency in the song with the greatest amplitude), bandwidth (Hz; maximum frequency minus minimum frequency), song duration (s), number of notes, and pace (rate of note production, in notes per second). Acoustic characteristics of songs were measured using Avisoft SASLab Pro (sampling rate = 22,050 Hz; FFT length = 512; window = “hamming”; frequency resolution = 43 Hz; intensity = 0). Frequency parameters were measured on the power spectrum while duration and number of notes parameters were measured in the oscillogram (see Fig. S3). The acoustic characteristics from recordings in situ were highly correlated with tracks downloaded from xeno-canto (see xeno-canto codes in Table S2) (R = 0.94, p value < 0.01) and thus could be considered as representative of the species. From species acoustic measurements, we obtained a Euclidean distance matrix of song dissimilarity between species, with the dist function on standardized data in the R stats package (R Core Team 2013).
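The construction of the song dissimilarity matrix can be sketched as follows. The sketch assumes that standardization means centering each acoustic parameter on its mean and dividing by its sample standard deviation before computing Euclidean distances; the function name and the input layout (one list of parameter means per species) are hypothetical.

```python
import math
import statistics


def song_distance_matrix(traits):
    """Euclidean distances between species on standardized acoustic parameters.

    traits maps each species name to a list of parameter means (same order
    for every species). Each parameter is centered and divided by its sample
    standard deviation (assumed standardization) before distances are taken.
    """
    species = list(traits)
    n_params = len(traits[species[0]])
    columns = [[traits[s][k] for s in species] for k in range(n_params)]
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    z = {
        s: [(traits[s][k] - means[k]) / sds[k] for k in range(n_params)]
        for s in species
    }
    # Full symmetric matrix stored as a dict keyed by species pair.
    return {(a, b): math.dist(z[a], z[b]) for a in species for b in species}
```

Standardizing first keeps parameters measured in Hz from dominating those measured in seconds or counts.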

To assess environmental differences between species, we averaged the environmental conditions (percentage of shrub, forest, rock, and meadow; mean annual temperature; and annual precipitation) of the presence plots of each species from the data of Laiolo et al. (2018) and calculated a pairwise distance matrix as for the acoustic parameters. The above data represent a broad regional survey conducted in north-western Spain, in which bird presence was censused in plots with a 100 m radius by means of “area surveys,” in which observers noted down all visual and aural detections, plus the birds flushed out during walks within plots (Laiolo et al. 2018). We also estimated ecological (trophic) dissimilarity among species from a binary matrix of diet type and feeding substrate (insectivorous, granivorous, herbivorous, frugivorous, scavengers, rock, ground, grass, air, water, leaves, or bark), with data again obtained from Laiolo et al. (2018).

As intrinsic determinants of species singing during the same time window, we considered both phylogeny and morphology. To obtain a phylogenetic distance matrix, we used the multigene phylogeny of http://www.bird.tree.org (Jetz et al. 2012) (see details in Laiolo et al. 2017). From the tree, we obtained a cophenetic distance matrix representing the pairwise phylogenetic distance between species with the function cophenetic in the R stats package (R Core Team 2013). The four traits used to characterize morphology were the mean body weight, wing length, bill length, and tarsus length of species. Morphological data were derived from the literature (Laiolo et al. 2015) by considering data from males (since only males sing in the bird species of our community, with the exception of the European robin; Catchpole and Slater 2003). Distance matrices among species morphology were estimated on standardized data as explained above.

Since the probability of song co-occurrence also depends on the probability of spatial co-occurrence of species—birds that do not co-occur will not sing in the same place and time—we estimated the spatial segregation of all target species and included this information as a predictor, in order to control for biases due to species propensity for co-occurrence. Spatial co-occurrence data and their respective C-scores were also obtained from Laiolo et al. (2018). Among the 2345 survey plots of this latter study, we selected those in which at least one target species was present and then estimated the spatial C-score, with 1 and 0 indicating allopatry and sympatry, respectively. This metric permitted the estimation of the co-occurrence of species’ songs independent of the propensity of spatial co-occurrences. At the same time, it also permitted testing whether acoustic dissimilarity decreased with spatial segregation, as a way to test spatial acoustic competition (Table 1).

We combined ecological, acoustic, and intrinsic features to assess the determinants of the co-occurrence of species’ song in a 30-s and 10-min time window. Since all variables were pairwise distance matrices, we used multiple regressions on distance matrices (MRM). Song co-occurrence (song C-score) was the response matrix, while the spatial C-score and phylogenetic, acoustic, environmental, morphological, and trophic distances were the explanatory variables. Multiple regressions on distance matrices were performed with the MRM function in the R package ecodist (Goslee and Urban 2007) with 999 random permutations (Laiolo et al. 2020; He et al. 2021). Prior to analyses, we evaluated correlations between explanatory variables to test for multi-collinearity. Morphological and trophic distance matrices were highly correlated with the phylogenetic matrix, and thus we included only the latter in analyses to avoid overparameterization (Table S3). However, models were also repeated using all explanatory variables, including variables excluded for collinearity (which had no significant effect; Table S4). All distance and C-score matrices are presented in Tables S5–S12.
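The logic of a multiple regression on distance matrices can be sketched as follows. This is a simplified illustration, not the ecodist implementation: it assumes ordinary least squares on the lower triangles of the matrices, a permutation scheme that shuffles the species labels of the response matrix (rows and columns together), and p values based on the absolute size of each coefficient. The function names and the use of NumPy are choices made for this sketch.

```python
import numpy as np


def lower_triangle(matrix):
    """Return the below-diagonal entries of a square matrix as a vector."""
    m = np.asarray(matrix, dtype=float)
    rows, cols = np.tril_indices(m.shape[0], k=-1)
    return m[rows, cols]


def fit(y, xs):
    """Least-squares coefficients (intercept first) of y on the vectors in xs."""
    design = np.column_stack([np.ones(len(y))] + list(xs))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def mrm(response, predictors, n_perm=999, seed=0):
    """Coefficients and permutation p values for a regression on distance matrices."""
    rng = np.random.default_rng(seed)
    response = np.asarray(response, dtype=float)
    xs = [lower_triangle(p) for p in predictors]
    observed = fit(lower_triangle(response), xs)
    n = response.shape[0]
    exceed = np.ones_like(observed)  # the observed fit counts as one permutation
    for _ in range(n_perm):
        order = rng.permutation(n)
        permuted = response[np.ix_(order, order)]  # relabel species, keep pairs intact
        coef = fit(lower_triangle(permuted), xs)
        exceed += np.abs(coef) >= np.abs(observed)
    return observed, exceed / (n_perm + 1)
```

Permuting species labels rather than individual cells preserves the dependence among the pairwise values that involve the same species, which is why a standard regression test is not appropriate for distance matrices.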

Playback experiments and variation in singing behavior

Playback stimuli and experimental procedure

Playback experiments were performed by broadcasting songs of the 13 target species identified in passive acoustic recordings. For each species, we performed from 6 to 10 playback trials in spring for a total of 99 trials (Table S2).

Trials were carried out in the same forest patches and localities as passive recordings from 7:30 a.m. to 2:00 p.m. in good weather conditions, in spring from March to May (breeding season) of 2021, 2022, and 2023 (the 2020 pandemic lockdown prevented the complete overlapping of recordings and playback tests as originally designed). The song of each species broadcasted was obtained from the xeno-canto repository (Table S2). We selected tracks with low background noise, close to our study area, with high quality (graded “A” on xeno-canto) and with no interference from heterospecific sounds. As detailed above, these songs are highly correlated with the ones of our study area, and their reliability is supported by high levels of conspecific responses to playbacks (see “Results”). Tracks were filtered in the bandwidth of the signal to reduce background noise through the band pass filter tool of Avisoft-SASLab Pro, and 2-min audio files were created. Stimuli were broadcasted with “Vieta pro easy” omnidirectional loudspeakers (80 Hz–20 kHz response) positioned at less than 1 m above the ground. Silent intervals were maintained from the original tracks to reflect the range of natural intersong intervals. For each species, we broadcasted two different sets of tracks, one set composed of two individuals and the other set composed of one individual alone. Tracks consisting of two individuals were used to mimic natural conditions, in which more than one individual may sing at the same time. Tracks consisting of a single individual reduced the likelihood of overstimulation, in the event that some species respond more strongly to several individuals.

Furthermore, we performed 35 silent control experiments and 90 trials broadcasting non-bird sounds. We tested three categories of non-bird sounds: modified dolphin (N = 35), modified grasshopper (N = 25), and original anthropogenic (N = 30) sounds. These noisy controls served to test whether birds were also stimulated by other playback sounds, as found in other studies (e.g., McLaughlin and Kunk 2013). Each of the non-bird sound categories was composed of two kinds of tracks. The original dolphin and grasshopper sounds included ultrasounds and were therefore modified to a sampling rate included in the hearing range of the species in our community (Catchpole et al. 2003) (see Table S1 for frequency ranges of modified non-bird sounds). For dolphin sounds, we used one track with more pauses (N = 15 trials) and another with long sounds (N = 20). For grasshopper sounds, we used the stridulations of two different species (Chorthippus parallelus: N = 15 and Chorthippus yersini: N = 10), and for anthropogenic sounds we tested car noises (N = 15) and chainsaw noises (N = 15) (sources quoted in Table S1). The peak sound pressure level of broadcasted sounds was measured by means of a Realistic Sound Level Meter 33–2055 at 1 m distance from the loudspeaker with the time constant “fast” and linear frequency weighting (Table S13). It averaged 82.67 dB SPL and ranged between 80 and 89 dB SPL across playbacks, with considerable overlap in the measurements across sounds. Only one sound fell below this range (70 dB SPL, the modified stridulation of Chorthippus yersini), but we did not raise its level because the location of focal birds, and thus their distance from the sound source and in turn the perceived intensity, could not be controlled for a priori in these community-wide experiments. 
However, both bird and non-bird playback sounds were still clearly distinguishable at a distance of 20 m (our range of observation, see below), and their peak or average sound pressure level had no effect on the behavior of birds (i.e., it was unrelated to the propensity of bird species to sing or stop singing after the playback: all t16 < 0.27; all p values > 0.79).

In the study forests, each playback trial lasted 8 min: 4 min of silence followed by 2 min of playback stimuli (bird song or non-bird sounds) and then 2 min of silence again, while the silence control consisted of 8 min of silence. Trial localities were separated by at least 400 m, and in each locality several trials (with different treatments) were conducted at different times or on different days. The broadcasted stimulus was randomly selected, under the condition that the same stimulus was not repeated in the same locality. By doing so, we could test diverse pools of species in the local community and avoid testing multiple times the same unmarked individuals with the same sounds.

During trials, two observers remained at a distance of approximately 20 m from the loudspeaker, noting down the species singing in a 20 m radius from the loudspeaker during the 8-min trials. During the first year of the study, we also noted the birds that approached the loudspeaker (n = 63 trials). Additionally, one AudioMoth was placed near the loudspeaker (at a distance of approximately 40 cm) in order to confirm identification a posteriori in case a species’ song remained unidentified in the field. From these recordings, we only noted down good quality sounds to identify species, as a signal that birds were close and within 20 m. Observers could hear the selected stimuli, and thus trials were not blind (deaf), as birdsong identification by ear was mandatory during trials. However, observations were blind with respect to the spectrotemporal features of songs, which were all measured after experiments. The observers preferentially placed themselves along forest trails or paths and remained in the trial location only the necessary time to perform the experiment, to avoid disturbing birds in the breeding season more than ordinary people strolling. In the same locations, the same song was never repeated to avoid annoying breeding males more than once with the song of conspecific intruders.

Variation in singing behavior after heterospecific songs

We defined the variation in singing behavior after playback as follows. Species that were singing during the 2 min before playback (from minute 2 to 4 of the experiment) and stopped singing from minute 4 to 8 (i.e., during the 2 min playback/silence/noise or the 2 min after) were considered as species interrupting their songs. On the other hand, species that started singing from minute 4 to 8 (i.e., during the 2 min playback/silence/noise or the 2 min after) were considered as species singing actively after playbacks. All species that already sang before the playback stimuli/silence (from min 0 to 4) were considered as non-responding, even if they kept singing during or after the playback, because we detected neither a significant increase nor decrease in the number of songs uttered during the trial (χ2 = 0.741; p value = 0.389, n = 7 species with discrete songs that could be counted; Table S14). Examples of recordings during playback trials are shown in Figs. S2 and S4.
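These categories can be expressed as a small decision rule. The sketch below is an interpretation under stated assumptions: song detections are given as times in minutes from the start of the 8-min trial, "stopped singing" is taken to mean that no song was detected during the playback period or the 2 min after it, and a species never heard during the trial is labeled "not detected" (a label introduced here for completeness). The function name is hypothetical.

```python
def classify_singing(song_times_min):
    """Classify one species in one 8-min trial from its song detection times.

    song_times_min: times (minutes from trial start) at which the species
    was heard singing. Minutes 0-4 are pre-playback silence, 4-6 the
    playback (or silence/noise), and 6-8 the post-playback silence.
    """
    sang_before = any(0 <= t < 4 for t in song_times_min)       # any song in min 0-4
    sang_just_before = any(2 <= t < 4 for t in song_times_min)  # song in min 2-4
    sang_after = any(4 <= t <= 8 for t in song_times_min)       # song in min 4-8
    if sang_just_before and not sang_after:
        return "interruption"
    if sang_after and not sang_before:
        return "response"
    if sang_before:
        return "non-responding"
    return "not detected"
```

For instance, a species heard only at minutes 4.5 and 6.2 counts as a response, whereas one heard at minutes 1, 3, and 5 counts as non-responding because it was already singing before the stimulus.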

For each species in the local community, we analyzed two categories of variation in singing behavior after playbacks: “response” (when species started singing) and “interruption” (when species stopped singing). For each species, we quantified the overall number of “responses” and “interruptions” over the number of trials with heterospecific songs, conspecific songs, silence controls, and non-bird sounds, the latter both separately for the three categories (modified dolphin sounds, modified grasshopper sounds, and anthropogenic sounds) and for the overall number of non-bird sound trials. With the above percentages of variation in singing behavior per species, we tested by means of paired t-tests at the species level whether percentages of variation in singing behavior (response and interruptions separately) after heterospecific songs, after non-bird sounds, and after conspecific songs were significantly different from the percentage of variation (response and interruptions separately) after silent controls. With the same method we tested whether physical approaches to the loudspeaker (or short flights back and forth; the typical response to conspecific songs; Bastianelli et al. 2017) differed between conspecific and heterospecific playbacks. Assumptions of normality for the differences between pairs were assessed by means of Shapiro–Wilk tests. Additionally, to understand the magnitude of the difference between the treatments, we calculated the effect size through Cohen’s d (“effectsize” R package; Torchiano 2020). The magnitude is defined using the thresholds provided in Cohen (1992), i.e., |d|< 0.2 “negligible,” |d|< 0.5 “small,” |d|< 0.8 “medium,” otherwise “large.” The use of this estimate is preferable to the alternative procedure of correcting the experiment-wise error rate of multiple tests when sample size is very small, as in our sample consisting of 13 species (Nakagawa 2004; Garamszegi 2006).
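The paired comparison and its effect size can be sketched as follows. The sketch assumes that Cohen's d for paired data is the mean of the pairwise differences divided by their standard deviation, so that d = t/√n; the effect sizes reported in the Results are numerically consistent with this definition (for example, 2.406/√13 ≈ 0.67), but the exact options used in the R package are not stated. Function names are hypothetical, and the p value (from a t distribution with n − 1 degrees of freedom) is left out to keep the example dependency-free.

```python
import math
import statistics


def paired_t_and_d(treatment, control):
    """Paired t statistic, degrees of freedom, and Cohen's d (assumed paired form).

    treatment and control hold one percentage per species, in the same order.
    """
    diffs = [a - b for a, b in zip(treatment, control)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)           # sample SD of the differences
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    cohen_d = mean_diff / sd_diff               # equals t_stat / sqrt(n)
    return t_stat, n - 1, cohen_d


def magnitude(d):
    """Label an effect size with the thresholds quoted in the text."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"
```

With 13 species, the test has 12 degrees of freedom, matching the t12 values reported in the Results.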

We performed a number of supplementary analyses and controls to assess the soundness of our results with respect to singing activity. First, we controlled for the number of tests per species performed, since the number of trials with the song of a species was not exactly identical in all species, by estimating the variation in singing behavior by weighting the number of trials per species (Table S15). Results are only presented in the Supplementary Material as they are very similar to the uncorrected estimates (see below). Second, we assessed whether species phylogeny could affect our results, estimating Pagel’s phylogenetic signal λ of the percent differences between treatments, with the R package phytools (Revell 2012), for which we found no signal (λ = 0, p value = 1 in all cases). Third, we regressed species variation in singing behavior after heterospecific songs on the species frequency of occurrence from the abovementioned woodland census (Laiolo et al. 2018) to assess whether the chance of singing depended on the probability of species being present in an experimental plot. Fourth, we tested whether variation in singing behavior after playback was influenced by the hour of the experiment. We examined this by means of a generalized least squares model (gls R function; R Core Team 2013) between the hour in which the experiment was conducted (converted into decimal hour) and the number of species varying their singing behavior after playback. Fifth, in order to clearly differentiate singing from warning behaviors, we analyzed whether singing activity was also associated with alarm calling. Sixth, we compared bird behavior across seasons to assess whether the target species were stimulated to sing outside breeding. For this, playback experiments were performed in autumn from September to November of 2020 and 2022 with the same design as in spring (playback stimuli: N = 91; silent control: N = 34; non-bird sounds: N = 43). 
The propensity to respond in the non-breeding season would support the idea of a non-exclusive sexual function of song. For trials conducted in autumn, we compared the percentage of variation in singing behavior after heterospecific songs and after non-bird sounds with that after silent controls by means of paired t-tests, after assessing the normality of differences as indicated above. Moreover, we analyzed the differences in the variation of singing behavior between spring and autumn with the same test.

Acoustic determinants of singing behavior after playback

To assess whether the observed variation in the singing behavior depended on the acoustic structure of the playback stimuli, we calculated its relationship with song dissimilarity. This served to assess whether the species provoking more variation in singing behavior of the focal species were those with more similar (or dissimilar) songs from the other community members. For each playback species, we calculated the average acoustic differences from the rest of species in the community (from the above song dissimilarity matrix) and the average rate of variation (interruption or response) provoked, expressed as the percentage of species varying their singing behavior over the number of heterospecific trials performed. Generalized least squares models (gls R function; R Core Team 2013) were used to analyze this relationship in the 13 species. As above, we tested whether there was an influence of phylogeny, but we found no phylogenetic signal in the residuals of this relationship (λ = 0, p value = 1).
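The two quantities related in this analysis can be sketched as follows. The sketch assumes that the average acoustic difference of a playback species is the mean of its row in the song dissimilarity matrix, excluding the diagonal, and it uses an ordinary least-squares slope as a stand-in for the generalized least squares fit (the two coincide only when no correlation or variance structure is specified). Function names are hypothetical.

```python
def mean_dissimilarity(matrix):
    """Mean distance of each species from all other species (diagonal excluded)."""
    n = len(matrix)
    return [
        sum(matrix[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def ols_slope(x, y):
    """Least-squares slope of y on x (simple regression with an intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx
```

In use, the output of mean_dissimilarity would be the predictor, and the percentage of species varying their singing behavior after each playback species would be the response.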

Results

Passive recordings

There was a significant negative relationship between song avoidance (higher values of song C-score) and song dissimilarity, showing that species tended to sing during the same time window with species with different songs, with no significant influence of the environment, phylogeny, and the time window considered (30 s or 10 min) (Fig. 1, Table 2). There was also no relationship between species probability of singing during the same time window and species frequency of occurrence (t12 = 1.510, p = 0.159), and thus singing during the same time window did not depend on the chance of a species being present in a plot. The species with more similar songs were not spatially segregated as song dissimilarity was not related to the C-score of spatial co-occurrence (p = 0.10) (Table 2). Spatial segregation was slightly positively associated with song avoidance only by considering 30-s intervals (singing during the same time window was higher in sympatric species, Table 2). Therefore, the results of passive recordings do not support the interspecific territoriality and environmental filter hypotheses, which predict that species with similar songs should sing more frequently during the same time window (Table 1). They support the competitive exclusion hypothesis, but only with respect to the temporal partitioning of sympatric species, while the hypothesis of facilitation among species singing together could not be dismissed.

Fig. 1 Relationship between song avoidance in 30-s tracks (song C-score: higher values indicate song partitioning, lower values song co-occurrence) and song dissimilarity. Song avoidance is represented by the residuals of song avoidance vs phylogeny, spatial co-occurrence (spatial C-score), and environmental dissimilarity estimated with a multiple regression on distance matrices (MRM). Song dissimilarity is represented by the values of the pairwise distance matrix of song characteristics. Each dot is a species pair and the regression trend line is also shown

Table 2 Results of the multiple regression on distance matrices (MRM) testing for the relationship between the dependent variable (song avoidance measured through the C-score, ranging from 0, co-occurrence of species songs, to 1, partitioning of species song) and the predictor dissimilarity matrices: song dissimilarity (pairwise species dissimilarity in song acoustic features), environmental dissimilarity (pairwise species dissimilarity in habitat and climatic preferences), phylogenetic distance, and spatial segregation (C-score ranging from 0, sympatry, to 1, allopatry). We considered the time windows of 10 min and 30 s to estimate song segregation. The significant predictors (at p < 0.05) are depicted in bold
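
To make the 0-to-1 scale of the song C-score concrete, the following minimal sketch computes a standardized checkerboard score for one species pair from detection records. It assumes the standardized form of the score (unshared detections of each species, multiplied and divided by the product of their totals), which is consistent with the range stated in the caption of Table 2 but is not spelled out in this section. The function name `pairwise_c_score` and the toy detection vectors are hypothetical.

```python
def pairwise_c_score(occ_a, occ_b):
    """Standardized checkerboard score for one species pair (assumed formula).

    occ_a, occ_b: equal-length sequences of 0/1 flags, one per time window
    (song C-score) or per plot (spatial C-score).
    Returns 0.0 when every detection of one species coincides with the other
    (co-occurrence) and 1.0 when the two are never detected together
    (partitioning).
    """
    if len(occ_a) != len(occ_b):
        raise ValueError("both species need the same number of sampling units")
    total_a = sum(1 for a in occ_a if a)
    total_b = sum(1 for b in occ_b if b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each species must be detected at least once")
    shared = sum(1 for a, b in zip(occ_a, occ_b) if a and b)
    return (total_a - shared) * (total_b - shared) / (total_a * total_b)


# Toy data: four 30-s windows, hypothetical detections.
print(pairwise_c_score([1, 1, 0, 0], [0, 0, 1, 1]))  # never together -> 1.0
print(pairwise_c_score([1, 0, 1, 0], [1, 0, 1, 0]))  # always together -> 0.0
```

Applying such a function to every species pair would yield the kind of pairwise matrix used as the dependent variable in an MRM, although the exact computation used in the study may differ.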

Playback experiments and active interspecific responses

The percentage of trials in which species stopped singing after playbacks was very low and not significantly different from the percentage of interruptions after silence (t = 0.218; p = 0.831) (Table S16). Therefore, playbacks did not trigger any interruption of song activity and we did not consider this behavioral response further. In spring, species tended to respond more frequently to heterospecific songs than to silent controls (paired t-test: t12 = 2.406, p = 0.033), and the response to conspecifics was higher than to silence (paired t-test: t12 = 3.451, p = 0.005). This latter result also demonstrates that sounds were appropriately recognized as species-specific by local species (Fig. 2, Table S17). In 53% of observations, bird species tended to sing during the playback (Fig. S2) and to continue after it, thus providing evidence of active stimulation and less frequent immediate avoidance. On the other hand, birds were not significantly stimulated by non-bird sounds compared to silent controls (paired t-test: t12 = 2.006, p = 0.068) (Fig. 2), although the effect size of heterospecific sounds and non-bird sounds vs silence was medium in both cases (heterospecifics vs silence: Cohen’s d = 0.67, non-bird sounds vs silence: Cohen’s d = 0.56, compared to Cohen’s d = 0.96 for conspecifics vs silence). This is because birds sang more frequently after modified dolphin sounds than after silence; this was the sole non-bird sound exhibiting a medium effect size (Table 3). The response to heterospecific stimuli was especially high in the Eurasian blackcap (Sylvia atricapilla), Eurasian wren (Troglodytes troglodytes), and great tit (Parus major), while the response to modified dolphin sounds was again high in the Eurasian blackcap and Eurasian wren, but also in the coal tit (Periparus ater) (see Table S18 for the responses of all species).
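
The link between the paired t statistics and the effect sizes reported above can be sketched as follows. The sketch assumes that Cohen’s d was computed on the paired differences (mean difference divided by the standard deviation of the differences), in which case d equals t divided by the square root of the number of pairs. The number of pairs, 13, is inferred from the 12 degrees of freedom. The section does not state the formula, so this is an interpretation; the function names are hypothetical.

```python
import math
import statistics


def paired_t_and_d(treatment, control):
    """Paired t statistic, Cohen's d on the differences, and degrees of freedom."""
    diffs = [t - c for t, c in zip(treatment, control)]
    n_pairs = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n_pairs))
    cohen_d = mean_diff / sd_diff
    return t_stat, cohen_d, n_pairs - 1


def d_from_t(t_stat, n_pairs):
    """Recover the paired-differences effect size from a reported t statistic."""
    return t_stat / math.sqrt(n_pairs)


# Reported t12 values for each playback treatment vs silence (13 species assumed).
for label, t_stat in [("heterospecific", 2.406),
                      ("non-bird", 2.006),
                      ("conspecific", 3.451)]:
    print(label, round(d_from_t(t_stat, 13), 2))
# heterospecific 0.67
# non-bird 0.56
# conspecific 0.96
```

Under this assumption the recovered values match the reported Cohen’s d of 0.67, 0.56, and 0.96, which suggests that each species contributed one paired observation per contrast.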

Fig. 2 Box plots showing the differential responses between silence and sound playback treatments in spring (silence control vs non-bird sounds, heterospecific stimuli, and conspecific stimuli). Box plots show medians, interquartile range (IQR), and extent of data to ± 1.5 × IQR. Each dot is a species. Dots beyond the end of the whiskers indicate outlying points. The red asterisks indicate significant differences as tested by means of paired t-tests

Table 3 Results of the paired t-tests between percentage of response to the three categories of non-bird sounds and to silent controls. Effect size estimated through Cohen’s d. The significant relationship (at p < 0.05) is depicted in bold

Species did not utter more alarm calls after heterospecific songs or non-bird sounds than after silence (paired t-test: heterospecifics vs silence: t = 1.675; p = 0.122; non-bird sounds vs silence: t = 1.467; p = 0.170), and therefore the acoustic response was mainly through songs. Moreover, individuals approaching the stimulus were mainly conspecifics of the broadcast species, showing that the response to heterospecifics was predominantly through songs and that territorial behavior was mainly associated with conspecifics (paired t-test heterospecifics vs conspecifics: t = 2.436, p = 0.033). The hour of the day and the frequency of occurrence of species did not affect the probability of singing (t < 1.723; p > 0.113). In autumn, song responses to playbacks were much reduced, and there was no difference between treatments (heterospecifics vs silent control, paired t-test: t12 = −1.003; p = 0.336; non-bird sounds vs silent control, paired t-test: t12 = −0.677; p = 0.511). The response to non-bird sounds was significantly lower in autumn than in spring (t12 = −4.047; p = 0.002), pointing to seasonal variation in the response elicited by these sounds. The response to heterospecific stimuli remained high in the European robin (Erithacus rubecula) only, but this species did not respond to non-bird sounds (Table S19).

There was a significant positive correlation between the response elicited by a species and its average acoustic dissimilarity from community members (t11 = 2.475, p = 0.031). Thus, the songs that elicited the most frequent responses were those most dissimilar from the songs of other community members, supporting the result obtained from passive recordings (Fig. 3).
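
As a reading aid, the strength of this relationship can be approximated from the reported test statistic. The sketch below assumes that t11 comes from a simple Pearson correlation (or an equivalent single-predictor linear regression), for which r = t / sqrt(t² + df). The coefficient itself is not reported in this section, so the result is an approximation under that assumption, and the function name `r_from_t` is hypothetical.

```python
import math


def r_from_t(t_stat, df):
    """Correlation coefficient implied by a t statistic with df degrees of freedom.

    Valid for a Pearson correlation test or a one-predictor linear regression;
    it would not hold for models with extra terms or phylogenetic corrections.
    """
    return t_stat / math.sqrt(t_stat ** 2 + df)


r = r_from_t(2.475, 11)
print(round(r, 2), round(r * r, 2))  # 0.6 0.36
```

Under this assumption, song dissimilarity would account for roughly a third of the among-species variance in the response elicited by playbacks.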

Fig. 3 Relationship between the response elicited by a playback species and its song dissimilarity from the bird community. Each dot represents a species and the regression trend line is also shown. Song dissimilarity expresses the average acoustic difference of a playback song from the songs of the responding species

Discussion

The mechanisms underlying the synchronization of acoustic signals have been largely examined in intraspecific contexts, where a number of functions have been revealed, from sexual stimulation (Wells 1977) to protection from predators through a dilution effect (Greenfield 2015). At the interspecific level, the traditional view is that competitive effects prevail, with species avoiding interference through divergence in acoustic properties, timing, or space of signaling (Ulloa et al. 2019). This study shows a tendency of bird species with similar songs to avoid singing during the same time window (in line with competition among species with similar songs), but also an active stimulation to sing when hearing heterospecifics and, on some occasions, unfamiliar sounds. The significant difference between responses to heterospecifics and to silence suggests a potential active behavior between acoustically divergent species that is not compatible with competition, but rather with song stimulation or facilitation between species in a community. Further evidence is required to confirm these results, but, if verified by other studies, they may support the intriguing suggestion that interspecific interactions through songs facilitate species coexistence in crowded acoustic spaces (and thus have links with species diversity) and anti-predatory behavioral defense (and thus have links with individual fitness). Our findings also raise a new question about the potential response to sounds that birds have never heard before, which calls for further research into non-species-specific acoustic stimulation.

During the breeding season, we found a significant tendency for birds to sing during the same time window as species with dissimilar songs. Moreover, species responded to playback stimuli of other species, especially those with different songs, and even responded to some unfamiliar non-bird sounds. Playback experiments have been crucial in revealing these behavioral patterns, as passive recordings alone cannot discriminate between the mechanisms determining species’ active singing during the same time window (as expected for the acoustic facilitation and the interspecific territoriality hypotheses), nor identify the causes of song segregation (as expected for the competition hypothesis) or other environmental influences not related to the social environment. Our results did not support the environmental filter hypothesis: there was no relationship between environmental similarity and the natural tendency of interspecific song co-occurrence, and experiments revealed a response to playback unexpected by this hypothesis. Interspecific territoriality also seems unlikely, since there was no active singing after, or singing during the same time window as, closely related or ecologically similar species, and no approach to the loudspeaker as observed for conspecific songs (Cody 1969; Reed 1982). Moreover, we found no more frequent singing during the same time window in species with converging songs or trophic niches, but we did observe a response to unfamiliar modified dolphin sounds. The fact that birds sang when species with dissimilar songs also sang supports the acoustic competition hypothesis, or at least suggests a pattern compatible with limiting similarity (MacArthur and Levins 1967). However, the competition hypothesis was not fully supported by the outcomes of the playback experiments, since birds were stimulated rather than inhibited by heterospecific songs. In addition, there was a lack of evidence of spatial avoidance between species with similar songs.
Conversely, the acoustic facilitation hypothesis was supported by the results of playback experiments, which showed that birds actively responded to stimuli (Table 1). However, while being stimulated, birds tended to avoid acoustic interference (Ficken et al. 1974), a behavior that is not incompatible with facilitation, as most facilitative interactions in ecological communities only occur among phenotypically or phylogenetically distant species (Gross et al. 2009). Individuals might benefit from singing over a heterospecific background if this reduces predation risk (Greenfield 2015), since hearing other species singing might indicate optimal conditions to sing because of low predation risk (Budka et al. 2023), or responding to other species’ songs might promote a dilution effect. This effect would operate between species (Delm 1990; Greenfield 2015) rather than within species (Møller 1992), and through songs rather than calls, for which a facilitative function has already been postulated (Sieving et al. 2004; Gayk et al. 2021). One possible function to be explored in future studies is whether the observed behavior may also imply benefits in a reproductive context, in terms of mate attraction. For instance, there are observations of birds adaptively using external signals to improve their own sexual signaling (Dawkins 2016; Järvinen and Brommer 2020). Additionally, further research should investigate the possibility of birds actively switching their song type in response to the features of other species’ songs, as they do on occasion in response to anthropogenic sounds (Halfwerk et al. 2011).

Future studies should also be directed toward acquiring a larger sample of non-bird sounds for experiments, to increase the power of tests when the effect is not strong, and thus elucidate the nature of the behaviors we observed. We can in fact exclude the idea that the stimulation by heterospecifics was due to birds responding to any kind of playback noise, since only in the case of modified dolphin sounds were birds triggered to sing. This type of response is not completely unusual: noise has been found to stimulate songs in canaries (Goto et al. 2023), and neotropical birds sing over backgrounds of dissimilar insect sounds (Stanley et al. 2016). The observed behavior might reflect some warning strategy against other species or unknown acoustic threats (Losin et al. 2016), although we did not observe enhanced alarm calling paired with songs. Increased levels of sex hormones (Fusani 2008) and the impulse to sing for reproductive purposes (Gahr 2014) may also cause birds to sing following imperfect signals (Önsal et al. 2022). Our results were not due to differences in song intensity as found in other studies (Brumm and Zollinger 2013), but the characteristics of non-bird sounds should be explored with a larger sample of different non-bird sounds to identify the acoustic cues that stimulate birds. We exclude species misidentification as an explanation for this result, as it occurs among species with highly similar, not dissimilar, sounds (Searcy and Brenowitz 1988). We also observed season-dependent reactions to heterospecific stimuli and non-bird sounds, with responses to playbacks occurring in spring only. In autumn, the response to heterospecific songs remained remarkably high in the European robin (Erithacus rubecula) compared to the response of the other species.
The European robin is the sole species in our community with singing activity in autumn and winter (Catchpole and Slater 2003), which is induced by higher levels of testosterone and territorial behavior (Kriner and Schwabl 1991). Despite the common pattern we identified in this study, it is possible that species differ in the intensity and timing of acoustic responses to heterospecifics, and the European robin may be an appropriate target for future studies on this subject.

In birds, positive interactions have already been described through alarm or mobbing calls in anti-predatory contexts (Sieving et al. 2004; Gayk et al. 2021), but facilitation through songs has been poorly explored. This study points to active song stimulation between coexisting species that may represent a case of facilitation, which, however, awaits further testing. Irrespective of the function, clear patterns of song dissimilarity and interactions among non-conspecific individuals emerge, the effect of which is a community-wide maximization of acoustic diversity at the local scale.