Everyday bat vocalizations contain information about emitter, addressee, context, and behavior

Animal vocal communication is often diverse and structured. Yet, the information concealed in animal vocalizations remains elusive. Several studies have shown that animal calls convey information about their emitter and the context. Often, these studies focus on specific types of calls, as it is rarely possible to probe an entire vocal repertoire at once. In this study, we continuously monitored Egyptian fruit bats for months, recording audio and video around the clock. We analyzed almost 15,000 vocalizations that accompanied the everyday interactions of the bats and were all directed toward specific individuals, rather than broadcast. We found that bat vocalizations carry ample information about the identity of the emitter, the context of the call, the behavioral response to the call, and even the call’s addressee. Our results underline the importance of studying the mundane, pairwise, directed, vocal interactions of animals.


Supplementary tables
Number of annotated vocalizations by outcome. The upper section presents vocalizations that were included in the analysis. Individuals with fewer than 15 vocalizations in more than one context (see Table S1) were excluded; the lower section presents the vocalizations of these excluded individuals. Only vocalizations in which the outcome could be clearly defined were included. * Unknown outcomes were not included.

Bat ID  F5  F6  F7  F9  U1  M2  P1  P2  P3

                                 Addressee
Bat ID    F1    F2    F3    F4    F8     M1    P1    P4    P6    P8   F10    P9   Unknown
F1         0   228    69    47   109    321     0     1     0     0     4     0        25
F2        14     0     4     4     2    675     0     1     2     0     1     1        33
F3         3    15     0    12     6   1306    32     1     3     1     2     1        42
F4         7    33    12     0    10    499     2     0     2     0     3     0        21
Total     24   276    85    63   127   2801    34     3     7     1    10     2       121

Only addressees with more than 20 vocalizations addressed to them were included in each classification task. The majority of the vocalizations are produced by females and addressed to males. The table does not include the mating aggression context, in which the addressee is always the male; hence, classifying the addressee may be equivalent to predicting the context in these cases.

for the prediction of the emitting individual (BA=56%, chance=7%, p<0.01). This analysis was performed on all recordings from this study, including recordings conducted previously in the same setup, excluding pups and bats with fewer than 400 recorded vocalizations. This analysis indicates that emitter recognition is possible with a larger number of individuals and probably depends only on the amount of data available for the training procedure. Bat IDs are as described in Table S1; XFx denotes a female and XMx a male from previous recordings.

Supplementary Movies
Video S1. Example of feeding aggression vocalizations. Two bats (hanging from the upper fruit skewer) are interacting during feeding. The emitting individual is marked with a red arrow.
Video S2. Example of mating aggression vocalizations. A female is protesting against a male mating attempt (a male is attempting to mount a female with a pup still attached to her). The emitting individual is marked with a red arrow.
Video S3. Example of perch aggression vocalizations. Two bats are interacting, with limited physical contact relative to other contexts, while perching in their artificial roost. In this aggressive display, the male is the aggressor and the female reacts and retreats. Another female is seen protecting her pup and sidesteps the squabble. The emitting individual is marked with a red arrow.

Video S4. Example of sleep aggression vocalizations. A bat is vocalizing while in the daytime sleeping cluster. Notice the pups held under the wings of both females. The emitting individual is marked with a red arrow.

Segmentation of raw recordings
Raw audio recordings were segmented into syllables and filtered to remove noises using an automated process, as described in (36). Vocalizations are bouts (sequences) of varying numbers of syllables. Sequences separated by a silence of more than 120 ms were considered separate vocalizations. The value of 120 ms was obtained by assuming that the intra-bout silence intervals are normally distributed while the inter-bout silence intervals are exponentially distributed (a consequence of approximating the occurrence of vocalizations as a Poisson process). This two-component distribution was fitted to the entire pool of interval durations, and the threshold was set at the value above which the likelihood under the exponential component exceeded that under the Gaussian component (Fig. S7).
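The threshold-selection step above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the EM routine, the initialization by a median split, and all variable names are assumptions.

```python
import numpy as np

def fit_interval_mixture(x, n_iter=200):
    """EM fit of a Gaussian + exponential mixture to silence intervals.

    Intra-bout gaps are modeled as Gaussian, inter-bout gaps as
    exponential (Poisson bout onsets), as described in the text.
    """
    x = np.asarray(x, float)
    # Crude initialization (an assumption): Gaussian on the short gaps,
    # exponential on the long ones, split at the median.
    med = np.median(x)
    mu, sigma = x[x <= med].mean(), x[x <= med].std() + 1e-6
    lam = 1.0 / x[x > med].mean()
    w = 0.5  # weight of the Gaussian component
    for _ in range(n_iter):
        g = w * np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        e = (1 - w) * lam * np.exp(-lam * x)
        r = g / (g + e)  # responsibility of the Gaussian component
        w = r.mean()
        mu = (r * x).sum() / r.sum()
        sigma = np.sqrt((r * (x - mu) ** 2).sum() / r.sum()) + 1e-6
        lam = (1 - r).sum() / ((1 - r) * x).sum()
    return w, mu, sigma, lam

def bout_threshold(w, mu, sigma, lam):
    """Smallest interval at which the exponential component becomes more
    likely than the Gaussian one -- the bout-separation criterion."""
    grid = np.linspace(mu, mu + 10 * sigma, 10000)
    g = w * np.exp(-0.5 * ((grid - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    e = (1 - w) * lam * np.exp(-lam * grid)
    above = grid[e > g]
    return above[0] if above.size else np.inf
```

Applied to the pooled interval durations, `bout_threshold` yields the single cut-off (120 ms in the paper) used to split syllable sequences into separate vocalizations.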
We used Gaussian mixture models (GMMs) of 16 components in a 64-dimensional feature space.
A universal background model (UBM) was constructed by sampling 3,900 syllables from data that was not used in the analysis. This sample was drawn from a set of vocalizations for which the pair of communicating bats was identified, but it was not clear which bat was the emitter and which the receiver.
The syllables were sampled randomly in a balanced way, such that each individual was involved in the same number of interactions. A GMM was then fitted to this sample to create the UBM.
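As a concrete sketch of the UBM step, the fit might look like the following. The random feature matrix is a stand-in for the real 64-dimensional acoustic descriptors (which are not reproduced here), and the diagonal-covariance choice is an assumption, since the text does not specify the covariance structure.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-in for the real data: one 64-dimensional acoustic descriptor per
# syllable, for the 3,900 balanced background syllables described above.
rng = np.random.default_rng(0)
background_features = rng.standard_normal((3900, 64))

# Universal background model: a single 16-component GMM fitted to the
# pooled background sample.
ubm = GaussianMixture(n_components=16, covariance_type="diag",
                      max_iter=200, random_state=0)
ubm.fit(background_features)
```

The fitted `ubm` then serves as the shared prior from which every class-specific model is derived.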
In the UBM-GMM approach, the training phase fits a GMM to each class (e.g., each emitter or each context). However, instead of estimating the GMM parameters directly from the labeled data, each class model is obtained by adapting the UBM to that class's data (39).
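One common way to realize this adaptation is Reynolds-style relevance MAP. The sketch below adapts only the component means; the mean-only choice and the relevance factor are assumptions, as the text does not give the adaptation details.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def map_adapt_means(ubm, X, relevance=16.0):
    """Mean-only MAP adaptation of a fitted UBM to class data X.

    Returns a copy of the UBM whose component means are shifted toward
    the class statistics; weights and covariances are kept from the UBM.
    The relevance factor (16.0) is an illustrative assumption.
    """
    post = ubm.predict_proba(X)                # (N, K) responsibilities
    n_k = post.sum(axis=0)                     # soft counts per component
    ex = post.T @ X / np.maximum(n_k, 1e-10)[:, None]  # E_k[x]
    alpha = (n_k / (n_k + relevance))[:, None]
    adapted = GaussianMixture(n_components=ubm.n_components,
                              covariance_type=ubm.covariance_type)
    adapted.weights_ = ubm.weights_.copy()
    adapted.covariances_ = ubm.covariances_.copy()
    adapted.precisions_cholesky_ = ubm.precisions_cholesky_.copy()
    adapted.means_ = alpha * ex + (1 - alpha) * ubm.means_
    return adapted
```

Components with little support in the class data (small soft counts) stay close to the UBM prior, which is what makes adaptation practical for classes with few labeled vocalizations.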
Scoring test data. The score of a given test vocalization was computed for each class as the log-likelihood ratio between the class model (GMM) and the UBM. The class that received the highest score was the predicted class for that vocalization.
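A minimal scoring routine under these definitions is sketched below, with a toy two-emitter example. Pooling syllable scores by their mean is an assumption; the paper does not state how per-syllable likelihoods are combined within a vocalization.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def classify(syllables, class_models, ubm):
    """Predict the class of one vocalization (a stack of syllable features).

    The score of each class is the mean per-syllable log-likelihood ratio
    between the class GMM and the UBM; the highest-scoring class wins.
    """
    ubm_ll = ubm.score_samples(syllables).mean()
    scores = {name: gmm.score_samples(syllables).mean() - ubm_ll
              for name, gmm in class_models.items()}
    return max(scores, key=scores.get), scores

# Toy illustration with 2-D features and two synthetic "emitters".
rng = np.random.default_rng(1)
ubm = GaussianMixture(2, random_state=0).fit(rng.standard_normal((500, 2)))
models = {
    "F1": GaussianMixture(2, random_state=0).fit(rng.normal(+1, 1, (200, 2))),
    "F2": GaussianMixture(2, random_state=0).fit(rng.normal(-1, 1, (200, 2))),
}
pred, scores = classify(rng.normal(+1, 1, (10, 2)), models, ubm)
```

The toy vocalization is drawn from the "F1" distribution, so the log-likelihood ratio favors the F1 model.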