Do Dolphins' Whistles Reveal Their Age and Sex?

Bottlenose dolphins (Tursiops truncatus) have a complex acoustic communication system composed of a variety of sounds, including narrow-band, frequency-modulated whistles. Many past studies of dolphin whistles have focused on clarifying how dolphins use a subset of whistles for self-identification, with less attention given to other qualities that whistles may reveal about a vocalizer. Acoustic features of vocalizations provide indicators of the physical characteristics of the caller (e.g., size) for many vertebrate species. To test for similar cues in dolphin whistles, we investigated whether whistles varied systematically according to the sex and age of the vocalizer. Neural networks were created to sort whistles produced by males or females, calves or adults, or from dolphins in four different age groups. Fourteen acoustic parameters of whistles were used as inputs to the networks. Results showed that neural networks were able to learn to classify whistles based on dolphin age or sex; however, networks showed relatively little ability to classify whistles other than those that they were trained to sort. No single class of acoustic cues consistently enabled networks to differentiate either males from females or older dolphins from younger dolphins. Instead, the neural networks used multiple acoustic dimensions to sort whistles. These results suggest that acoustic cues indicative of age and sex are likely present within all whistles produced by dolphins, but that these cues do not correspond to the kinds of global shifts in spectral features that would be expected from systematic age- or sex-related differences in the shape or size of sound-producing membranes or acoustic resonators across individuals.

Past studies describing how acoustic variations reveal qualities of callers have heavily emphasized features that may directly reveal anatomical constraints on sound production (e.g., the distribution of formant frequencies, Efremova et al., 2011; Fitch, 1997; Reby et al., 2005; Rendall et al., 2004) or past learning experiences (e.g., dialects, repertoire content and size, Deecke, Ford, & Spong, 1999; Kremers, Lemasson, Almunia, & Wanker, 2012; Miller, Samarra, & Perthuison, 2007). Less attention has been given to examining how social relationships between individuals may affect the precision with which acoustic variations can be identified and used (Morton, 1996, 2012). For example, people with hearing disorders are better able to recognize and understand speech produced by their spouses than speech produced by strangers (Johnsrude et al., 2013; Souza, Gehani, Wright, & McCloy, 2013), and birds are better able to localize songs produced by familiar neighbors than songs produced by unfamiliar intruders (Morton, Howlett, Kopysh, & Chiver, 2006). Thus, how often listeners experience acoustic variations in specific social contexts can also potentially affect the utility of received variations.
Understanding how vocalizations facilitate interactions between conspecific allies while minimizing any costs associated with eavesdropping antagonists requires identifying how those vocalizations can potentially differentiate callers (Charlton et al., 2009a; Owings & Morton, 1998; Semple, 2001; Tripovich, Charrier, Rogers, Canfield, & Arnould, 2008). Identifying the consistency with which different sounds reflect a caller's qualities is particularly important for evaluating the extent to which species that maintain long-term social relationships monitor and respond to the actions of particular individuals within groups (Sharpe, Hill, & Cherry, 2013).
The vocal behavior of bottlenose dolphins provides a unique context for investigating the flexibility with which mammals can modulate acoustic variations to reveal or obscure physical characteristics. Dreher and Evans (1966; 1964) postulated that variations in the whistles of dolphins could carry information about a variety of social contexts and behavioral states. In contrast to other animals that can be identified from individual acoustic variations of a shared call type (e.g., see Blumstein & Munos, 2005; Charlton et al., 2009b; Matrosova et al., 2011; Semple, 2001; Vannoni & McElligott, 2007), dolphins are thought to use idiosyncratic stereotyped whistles, referred to as signature whistles, to recognize individuals (Harley, 2008; Janik & Sayigh, 2013; Janik, Todt, & Dehnhardt, 1994; Kershenbaum, Sayigh, & Janik, 2013; King, Harley, & Janik, 2014; Sayigh, Esch, Wells, & Janik, 2007; however, see McCowan & Reiss, 2001, for an alternative viewpoint). Some researchers have suggested that the complex social demands of dolphin societies in combination with their acoustic and vocal adaptations to life underwater may have shaped a sophisticated acoustic communication system in order to maintain individual and group relationships (Caldwell, Caldwell, & Tyack, 1990; Herzing, 2000; Marino et al., 2007; Reiss, McCowan, & Marino, 1997).
The "whistles" produced by dolphins actually are the result of pneumatically induced tissue pulsations (Madsen, Jensen, Carder, & Ridgway, 2012; Murray, Mercado, & Roitblat, 1998a), and are generally not thought to be significantly impacted by resonating air cavities (Cranford, Amundin, & Norris, 1996; Madsen, Lammers, Wisniewska, & Beedholm, 2013; see Miller et al., 2007, for a possible exception). Vibrating membranes can also potentially provide information about the physical qualities of a vocalizer (Taylor & Reby, 2009). For instance, adult male humans have pitches about an octave lower than adult females, because vibrating segments of male vocal folds are 60% longer than those in females, and are also much larger (Titze, 1989). Sexual dimorphism in human vocal folds reflects differences in testosterone levels experienced during puberty (Beckford, Rood, Schaid, & Schanbacher, 1985); whether variations in hormones might lead to similar differences in other mammals is unclear (Taylor & Reby, 2009). Pitch differences related to vocal fold length and mass also allow humans to differentiate between child and adult voices, as children will have higher pitched voices than adult women and men (Pisanski et al., 2014; Smith, Patterson, Turner, Kawahara, & Irino, 2005). Similar variations associated with the age and sex of an individual have been observed in the giant panda (Charlton et al., 2009a), baboon (Rendall et al., 2004), and goat (Briefer & McElligott, 2011). Although bottlenose dolphins are moderately dimorphic (Tolley et al., 1995), it is unknown whether their vocal organs or associated control mechanisms vary systematically as a function of age or sex.
The ability to differentiate among callers in bottlenose dolphins would seem to be highly adaptive as dolphins are a social species that frequently encounter and interact with familiar and unfamiliar individuals. Socially, dolphins live in fluid and dynamic fission-fusion societies, similar to those of chimpanzees (Connor, Wells, Mann, & Read, 2000; Reiss et al., 1997; Smolker, Mann, & Smuts, 1993). In some bottlenose dolphin populations, males are known to form long-lasting coalitions of two to three males who work together to gain access to mates, with long-term coalitions occasionally forming super alliances, in which two or more coalitions work together to defend a female against other coalitions (Connor, Heithaus, & Barre, 2001). Additionally, bottlenose dolphin calves and their mothers have been observed to exhibit strong associations with each other during the calf's first few years of life (Cockcroft & Ross, 1990; Connor et al., 2000; Mann & Smuts, 1999; Smolker, Richards, Connor, & Pepper, 1992; Wells, Scott, & Irvine, 1987), which can sometimes continue years after weaning (Wells et al., 1987). These early years are crucial for the social, physical, and vocal development of the calf (Cockcroft & Ross, 1990; Wells et al., 1987); in some cases it has been observed that calves will develop signature whistles very similar to those of their mothers (Sayigh, Tyack, Wells, & Scott, 1990; Sayigh, Tyack, Wells, Scott, & Irvine, 1995). Instances of cooperative feeding behavior (group hunting) have also been documented in certain bottlenose dolphin populations and include such tactics as herding fish into balls, trapping fish between conspecific dolphins, or trapping fish against mudbanks (Leatherwood, 1975; Rossbach, 1999; Würsig, 1986); in one case, individuals showed role specialization in the division of labor during instances of group hunting off Cedar Key, FL (Gazda, Connor, Edgar, & Cox, 2005). Dolphins also sometimes cooperatively forage with groups of human fishermen (Busnel, 1973; Pryor, Lindbergh, Lindbergh, & Milano, 1990). The development and maintenance of long-term social relationships may thus help dolphins to find food, avoid predators, secure mates, and care for young (Connor, 2007; Reiss et al., 1997; Tyack & Miller, 2002).
Whereas primates use mainly visual cues to identify and monitor the actions of group members, dolphins appear to be more reliant on auditory cues because visibility is frequently low. Like primates, dolphins have the sensorimotor capacity to actively scan their environment (potentially using both vision and echolocation; Au, 1993, 1996) to check on the positions and activities of group members (Benoit-Bird & Au, 2009). It is not clear, however, whether any dolphin species ever use echolocation to monitor the actions of particular individuals within groups, or whether echoes from other individuals are reliable indicators of physical qualities such as sex, age, or identity. Passive monitoring of whistles (or other sounds) produced by group members may provide an alternative way for dolphins to track the activities of particular individuals, but again it is unclear whether vocalizations other than signature whistles reveal anything about the individual dolphin producing the sounds other than its presence and possibly the direction it is travelling (Lammers & Au, 2003).
In this study, we investigated whether acoustic variations within whistles produced by a dolphin are indicative of the whistler's age and sex. Past studies have measured and categorized whistles in a variety of qualitative and quantitative ways, often emphasizing visually and aurally salient frequency modulations within whistles (for review, see Harley, 2008; McCowan, 1995). Here, we tested whether a neural network could learn to identify whistles as being from males or females, or from younger or older dolphins, using temporal, spectral, or energetic acoustic cues other than variations in the structure of "frequency contours" (Buck & Tyack, 1993).

Method

Study Site and Population
Data were collected at the Roatan Institute for Marine Sciences (RIMS) at Anthony's Key Resort in Roatan, Honduras, which is home to a captive population of bottlenose dolphins (Tursiops truncatus). The group resides in a 300 m² fenced enclosure within a natural lagoon covered by native coral, sand and sea-grass beds. The enclosure occupies depths ranging from the shallows of the shoreline to about 8 m (Dudzinski, Gregg, Paulos, & Kuczaj, 2010). During the two years included in this study, the group size ranged from 20-24 dolphins (due to new births), and included both wild caught and captive born individuals with ages ranging from neonate to 30+ years. The ages of study subjects were determined either by record of a known birthdate for those individuals born at the facility, or via estimation based on size and girth. Socially, this captive group is comparable to wild bottlenose dolphin populations in terms of group composition and with respect to the environmental conditions within which whistles are being produced (Connor, Smolker, & Bejder, 2006; Kogi, Hishii, Imamura, Iwatani, & Dudzinski, 2004).

Recordings and Selection of Whistles
Dolphin whistles were isolated from archived video/audio footage collected by one of the coauthors (KMD) at RIMS in 2009 and 2010, between May 7-21 (2009) and January 7-24 (2010). Recordings were collected between 06:30 and 15:30 daily (the majority of data collection sessions were before 11:00, as underwater visibility decreased significantly after noon) using a customized underwater mobile video/acoustic system (Dudzinski, Clark, & Würsig, 1995), including a digital video camera (Sony HDR HC5/HC7 in HD format, set to stereo audio, 12 bit amplitude resolution, 32 kHz sampling rate, compressed HDV audio) secured in an underwater housing with two attached omni-directional hydrophones (normal receiving sensitivity of -155 dB re 1 V/µPa). Video/audio footage was converted into digital video files (.dv) with two audio channels using a Sony Video Cassette Recorder (model GV-D1000 NTSC) processed through iMovie software. Subjects were filmed according to a focal follow sampling method, in which one individual within the field of view was randomly chosen and opportunistically observed (and recorded) until they traveled out of frame (Dudzinski et al., 1995).
Recordings were examined aurally and through visual analyses of spectrograms to identify high quality whistles: those that minimally overlapped with other sounds, and those in which the whistling dolphin could be accurately identified. The whistling dolphin was identified via aural detection from the stereo-audio recordings, as hydrophone spacing on the video/acoustic system was specifically designed to accommodate the human interaural distance based upon the speed of sound in water (Dudzinski et al., 1995). The ability to detect directionality (left, right, and center) of sound sources on the stereo-audio recordings and associate those sounds with visual distribution of subjects on screen allows for the identification of a whistling dolphin when in field of view (Dudzinski et al., 1995). Whistling dolphins could only be identified using this method when the vocalizing dolphin was fairly isolated (no more than three dolphins on screen), clearly visible on screen, only a short distance from the camera (~5 m or less), and whistling was clearly audible with little interference from other sound sources. Additionally, identification of the whistling dolphin was facilitated by behavioral and contextual information, such as emission of bubble streams from the blowhole and/or unusual swim patterns and movements. Dolphin subjects were identified in video footage via naturally occurring marks and scars. During video analyses, every whistle that met the criteria mentioned previously was included in the current sample, with no preferential selection of whistles from individuals of a certain age or sex or of whistles of a certain type. Consequently, the number of whistles per dolphin varied across individuals.
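The reasoning behind scaling hydrophone spacing to the speed of sound in water can be illustrated with a short calculation. This is only a sketch: the text does not report the actual spacing used (see Dudzinski et al., 1995), and the interaural distance below is an assumed round figure chosen for illustration. The idea is that a listener localizes sounds partly from arrival-time differences between the two ears, so receivers in water must be farther apart to reproduce the delays that ears experience in air.

```python
# Nominal sound speeds; the interaural distance is an illustrative assumption,
# not a value taken from the study.
SPEED_AIR_M_S = 343.0      # dry air at about 20 degrees C
SPEED_WATER_M_S = 1500.0   # typical seawater value
INTERAURAL_M = 0.18        # assumed human ear-to-ear distance

def max_arrival_delay(spacing_m, speed_m_s):
    """Largest possible arrival-time difference between two receivers (s)."""
    return spacing_m / speed_m_s

def matched_spacing(interaural_m, speed_air, speed_water):
    """Receiver spacing in water that gives the same maximum delay as ears in air."""
    return interaural_m * speed_water / speed_air

spacing = matched_spacing(INTERAURAL_M, SPEED_AIR_M_S, SPEED_WATER_M_S)
delay_air = max_arrival_delay(INTERAURAL_M, SPEED_AIR_M_S)
delay_water = max_arrival_delay(spacing, SPEED_WATER_M_S)

print(f"matched hydrophone spacing: {spacing:.2f} m")
print(f"max delay, ears in air:        {delay_air * 1e3:.3f} ms")
print(f"max delay, hydrophones in sea: {delay_water * 1e3:.3f} ms")
```

Under these assumed values the matched spacing is several times the interaural distance, because sound travels more than four times faster in seawater than in air.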

Sound Analysis
In general, dolphin whistles are tonal sounds in which most acoustic energy is focused within fundamental frequencies that are modulated over time (Figure 1). Whistles can also contain significant energy in harmonic bands (Lammers, Au, & Herzing, 2003), but these were not considered in the current analysis due to limitations in the sampling rate of the recording system. Whistles were saved in AIFF format (32 kHz sampling rate, 16-bit amplitude resolution) and analyzed using the bioacoustics analysis software, RavenPro 1.5. Narrow-band spectrograms (FFT = 512) were used to visualize whistles and to manually create an analysis window that encompassed the target whistle. For each recording, a single audio channel with the least noise was selected as the whistle record to be used during analysis. Fifteen acoustic measures available within Raven were recorded for each whistle, including: bandwidth 90%, average entropy, Inter-quartile Range (IQR) bandwidth, center frequency, frequency 5%, frequency 95%, peak/max frequency, average power, duration 90%, energy, IQR duration, mean frequency, aggregate entropy, begin time, and end time (see Charif, Waack, & Strickman, 2010, for details regarding how these measurements were calculated). The difference between the begin and end times was used as a measure of whistle duration, and this value was combined with the remaining 13 measures to create a 14-element vector describing acoustic properties of each whistle in the sample.
A minority of whistles in recordings overlapped with trains of broadband clicks produced by the same dolphin (see Figure 1b). No attempt was made to filter out the contributions of these clicks to measurements of spectral and energetic acoustic cues within whistles. Current theories of sound production in dolphins suggest that dolphins use similar vibrating membranes to produce both whistles and clicks (Madsen et al., 2012; Madsen et al., 2013). In this case, any effects of production mechanisms on the acoustic qualities of produced sounds within the frequency ranges considered in the current analyses should be highly correlated (see also Murray, Mercado, & Roitblat, 1998a). Because whistle-click combinations are frequently produced by dolphins, including these vocalizations in the sample provides a more ecologically valid assessment of the utility of dolphin sounds as indicators of physical qualities. If clicks provide information about a dolphin's age or sex beyond what is available within whistles, then this should lead to more accurate classification of whistle-click combinations than whistles alone. Conversely, if the presence of clicks degrades the detectability of such information, then this should lead to less accurate classification of whistle-click combinations.
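As a rough illustration of how the 15 recorded measures reduce to a 14-element vector, the sketch below replaces begin and end time with their difference. The dictionary keys and the example values are hypothetical placeholders; they are not Raven's export column names or data from the study.

```python
# The 13 measures kept as-is. Key names are hypothetical labels for the
# measures listed in the text, not Raven's actual column headers.
RETAINED_MEASURES = [
    "bandwidth_90", "average_entropy", "iqr_bandwidth", "center_frequency",
    "frequency_5", "frequency_95", "peak_frequency", "average_power",
    "duration_90", "energy", "iqr_duration", "mean_frequency",
    "aggregate_entropy",
]

def whistle_vector(measures):
    """Return the 14-element description: 13 retained measures plus duration."""
    duration = measures["end_time"] - measures["begin_time"]
    return [measures[name] for name in RETAINED_MEASURES] + [duration]

# Made-up values for a single whistle, for illustration only.
example = {name: 1.0 for name in RETAINED_MEASURES}
example.update({"begin_time": 12.10, "end_time": 12.95})

vector = whistle_vector(example)
print(len(vector), vector[-1])
```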

Statistical Analysis and Data
Information recorded for each whistle included: time of occurrence, vocalizer ID, sex and age class, instances of overlapping click trains, instances of bubble streams produced concurrent with whistles, and any notable behavior observed during whistle production (e.g., parallel swimming, sexual behavior, object play, etc.). Descriptive statistics were also calculated for all acoustic parameters.
Artificial neural network software from the University of Alberta (available for download at http://www.bcp.psych.ualberta.ca/~mike/Software/Rumelhart/index.html) was used for all sound classifications. Specifically, Rumelhart software, a multi-layer perceptron neural network (NN) program used to explore pattern classification, was employed to differentiate dolphin whistles according to age and sex categories. Neural networks have been used to classify many different types of animal vocalizations, such as deer (Dama dama) vocalizations (Reby et al., 1997), black-capped chickadee (Poecile atricapillus) calls (Nickerson, Bloomfield, Dawson, & Sturdy, 2006), bat (Rhinolophus spp.) echolocation (Parsons & Jones, 2000), and tungara frog (Physalaemus pustulosus) mate recognition calls (Phelps & Ryan, 1998). They have also been used in several studies regarding categorization of marine mammal calls, e.g., sperm whale (Physeter macrocephalus) clicks (Schaar, Delory, Català, & André, 2007), false killer whale (Pseudorca crassidens) vocalizations (Murray et al., 1998b), dolphin echolocation (Au, Andersen, Rasmussen, Roitblat & Nachtigall, 1995), and humpback whale (Megaptera novaeangliae) songs (Mercado et al., 2008).
Figure 2 summarizes the basic structure of the NN used in this study. Three separate versions of NN classifiers were used to sort dolphin whistles. The first type was trained to discriminate between males and females (two output units) based on whistle characteristics, and the remaining two were trained to discriminate age differences between whistlers. In order for the NNs to discriminate between younger and older dolphins, age classes were assigned to all study subjects according to four broad categories: calf (3 weeks < 4 years), juvenile (4-7 years), subadult (7-10 years), and adult (11+ years) (Dudzinski et al., 2010; Kogi et al., 2004; Melillo, Dudzinski, & Cornick, 2009, provide details related to age category designation). The first age-related NN classifier was trained to discriminate between calves and adults (2 outputs), while the other classifier was trained to discriminate between the four age classes (4 outputs). All NN architectures consisted of 14 input neurons representing the 14 measured acoustic whistle parameters noted earlier. All input values were converted into z-scores before being used in the NNs. Simulations with varying numbers of hidden units were tested for each NN classifier, and the configurations that yielded the best performance were selected for use in more detailed analyses. Ten hidden units were utilized for the sex classifier, five hidden units for the calf classifier, and 10 units for the age classifier.
Figure 2. Neural networks (Haykin, 1994; Ripley, 2008) are typically composed of three or more units (neurons), organized into separate processing layers: an input layer, an output layer, and at least one hidden layer. Units in each layer send information to other units through weighted connections, which are automatically modified through training to produce desired outputs (Dawson, 2008; Reby et al., 1997), essentially combining input information (i.e., the 14 acoustic parameters) in all possible ways, via hidden units, to most accurately categorize outputs. The weights of these connections determine which category (e.g., male or female) a NN will select for each whistle. AP = acoustic parameter (input layer); HU = hidden unit (hidden layer).
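The layout described here (14 standardized inputs, one hidden layer, and one output unit per category) can be sketched in a few lines. This is an illustrative stand-in, not the Rumelhart program itself: the logistic activation, the use of bias terms, the sample standard deviation in the z-score, and the range of the random starting weights are all assumptions, and the input values are randomly generated placeholders.

```python
import math
import random

N_INPUTS = 14   # acoustic parameters per whistle (from the text)
N_HIDDEN = 10   # hidden units reported for the sex classifier
N_OUTPUTS = 2   # one output unit per category, e.g., male and female

def zscore_columns(rows):
    """Standardize each column to mean 0 and unit sample standard deviation."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    sds = [math.sqrt(sum((x - m) ** 2 for x in col) / (len(col) - 1))
           for col, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def sigmoid(x):
    """Logistic activation (an assumed choice of unit type)."""
    return 1.0 / (1.0 + math.exp(-x))

def random_layer(n_units, n_inputs, rng):
    """One (weights, bias) pair per unit, drawn from a small random range."""
    return [([rng.uniform(-0.1, 0.1) for _ in range(n_inputs)],
             rng.uniform(-0.1, 0.1)) for _ in range(n_units)]

def layer_output(inputs, layer):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    return [sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            for weights, bias in layer]

def forward(inputs, hidden_layer, output_layer):
    """Propagate one whistle vector through the hidden and output layers."""
    return layer_output(layer_output(inputs, hidden_layer), output_layer)

rng = random.Random(0)
# Randomly generated placeholder measurements, not real whistle data.
raw = [[rng.gauss(50.0, 10.0) for _ in range(N_INPUTS)] for _ in range(20)]
standardized = zscore_columns(raw)
hidden_layer = random_layer(N_HIDDEN, N_INPUTS, rng)
output_layer = random_layer(N_OUTPUTS, N_HIDDEN, rng)

activations = forward(standardized[0], hidden_layer, output_layer)
predicted = max(range(N_OUTPUTS), key=lambda i: activations[i])
print(activations, predicted)
```

Before training, the selected category is arbitrary; learning consists of adjusting the weights so that the most active output unit matches the known category of each training whistle.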
In each NN simulation, networks were initialized with random weights and the learning rate was fixed at 0.05. NNs were trained to determine whether they could learn to classify whistles based on the age and sex of vocalizing dolphins, and then tested on their ability to classify novel whistles (those that were not used for training). Half of the whistles from each category (e.g., male, female, calf, juvenile, subadult, and adult) were used during training; the rest were used for generalization tests (i.e., novel whistles). Whistles from all individuals were included in both training and test sets. NNs were trained until all whistles were categorized 100% accurately, or until a maximum of 10,000 trials was reached. The training and testing process was repeated 10 times for each type of NN to calculate average performance and ensure networks were achieving similar classification rates. Percent of whistles correctly classified during training and testing was used as a measure of performance. Trained NNs were analyzed to identify which whistle features determined how whistles were sorted. Binomial probability tests were used to assess the statistical significance of performance in generalization tests. Supplemental t-tests were used to evaluate differences between whistles from competing categories (male/female, calf/adult) for acoustic parameters that NNs weighted heavily when classifying whistles.
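The split-half design and the stopping rule described above might be organized as follows. This is a sketch under stated assumptions: the function names are hypothetical, the split is stratified by category only (the text does not say exactly how whistles were assigned to halves), and a "trial" is treated here as one pass of an unspecified training step, which may not match the software's definition.

```python
import random
from collections import defaultdict

def split_half_by_category(labels, rng):
    """Assign about half of each category's whistles to training, the rest to testing."""
    by_category = defaultdict(list)
    for index, label in enumerate(labels):
        by_category[label].append(index)
    train, test = [], []
    for indices in by_category.values():
        rng.shuffle(indices)
        half = (len(indices) + 1) // 2  # training gets the extra item when odd
        train.extend(indices[:half])
        test.extend(indices[half:])
    return sorted(train), sorted(test)

def train_until(step, accuracy, max_trials=10_000):
    """Call step() until accuracy() reaches 1.0 or max_trials is hit.

    Returns the number of trials actually run.
    """
    for trial in range(1, max_trials + 1):
        step()
        if accuracy() == 1.0:
            return trial
    return max_trials

# Toy labels for illustration only.
labels = ["male"] * 6 + ["female"] * 4
train_idx, test_idx = split_half_by_category(labels, random.Random(1))
print(train_idx, test_idx)
```

Repeating the split and training with different random seeds, as the study did 10 times per classifier, gives a distribution of training and generalization accuracies rather than a single estimate.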

Results
A total of 13 hours of video data (approximately half each from 2009 and 2010) was analyzed in order to collect an opportunistic sample of whistles from as many dolphins as possible. Whistles included in the analysis (n = 398) were recorded from 20 dolphins (11 males/9 females), with 16 individuals recorded in 2009 and 18 in 2010 (Table 1). The number of whistlers per year differs between 2009 and 2010 because three dolphins were recorded whistling during only one year of the study, and three calves were born in 2010.
Inspection of whistle spectrograms revealed that 36% of recorded whistles co-occurred with minimally overlapping click trains (as in Figure 1b). Bubble emissions were associated with 63% of recorded whistles. Behaviors observed during whistling included sexual play (with few erections), parallel swimming with another dolphin, object play (with fence, sea grass, fins of swimmer), and jumping out of water (interspersed with whistling).

Sorting Whistles by Sex
The NNs (SEX-NET) that were trained to classify whistles based on the sex of the whistling dolphin learned to correctly categorize individual whistles (128 male whistles/75 female whistles) with an average accuracy of 99% (chance performance was 50% correct) after ~9000 training trials (Figure 3). The best performing network was 100% accurate after ~4500 trials. Summary statistics from the NNs were calculated for the hidden unit weights that connected the 14 acoustic parameter inputs to the 10 units in the hidden layer of each NN (see Figure 2). Output unit weights were also evaluated for each of the five best performing networks. For these NNs, weights from the three most significant hidden units for each output category were analyzed and used to determine which acoustic features were most important for distinguishing male and female whistles. The largest hidden unit weight means for the male-driven hidden units showed that frequency bandwidth, whistle duration, and entropy measures contributed most to the correct identification of male whistles (Figure 4). Measures of hidden weight values for the female-driven hidden units showed that frequency measures, entropy measures, and whistle duration were most relevant for correctly identifying female whistles. The acoustic parameters with the largest mean weight differences between male-driven units and female-driven units were center frequency and frequency 5%. We performed t-tests for these whistle features using the raw measurements obtained during data collection. These statistical tests showed that frequency 5% (p < 0.05), bandwidth 90% (p < 0.05), and average entropy (p < 0.05) were significantly different between male and female whistles.
The remaining whistle samples (123 male/72 female) were used to test the ability of trained networks to sort whistles that were not presented during training. Average performance accuracy during generalization testing was 55% (Figure 3). Maximum performance accuracy across networks was 60%. Binomial probability tests showed that 7 out of 10 network configurations yielded performance accuracies during generalization testing that were significantly better than chance (Table 2).
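For readers unfamiliar with the binomial probability test, the sketch below computes the probability of getting at least a given number of test whistles right by guessing alone. It assumes a one-tailed test against the 50% chance level, which the text does not specify, and the counts are approximations derived by rounding the reported percentages over the 195 test whistles; they are not per-network counts from Table 2.

```python
from math import comb

def binomial_upper_tail(k, n, p=0.5):
    """Probability of k or more successes in n trials with success rate p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

N_TEST = 195  # 123 male + 72 female test whistles (from the text)

for accuracy in (0.55, 0.60):
    # Approximate count of correct classifications implied by the percentage.
    n_correct = round(accuracy * N_TEST)
    p_value = binomial_upper_tail(n_correct, N_TEST)
    print(f"{accuracy:.0%}: {n_correct}/{N_TEST} correct, one-tailed p = {p_value:.4f}")
```

Because each network's test is run on the same 195 whistles, small differences in the number of correct classifications can move a network across the significance threshold.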

Sorting Whistles by Age
The NNs (CALF-NET) trained to classify whistles as coming from either calves or from adults correctly categorized whistles (22 calf whistles/38 adult whistles) with an average performance accuracy of 99.5% (chance performance was 50%) after an average of ~3500 training trials (Figure 3). The network that performed best was 100% accurate after ~1500 trials. Measures of hidden unit weights for calf-driven hidden units showed that spectral power, IQR duration, entropy, and frequency bandwidth were most relevant for correctly identifying calf whistles (Figure 5). Mean hidden unit weight values for the adult-driven units showed that frequency measures, bandwidth, energy, and entropy were all relevant for correctly identifying adult whistles. The acoustic parameters with the highest mean weight differences between calf-driven units and adult-driven units were frequency 5% and average power (Figure 5); t-tests performed on these acoustic features showed that frequency 5% (p < 0.05), bandwidth 90% (p < 0.05), average entropy (p < 0.01), and IQR duration (p < 0.05) were statistically different between calf and adult whistles. A subset of these features overlapped with those found to be important for sorting whistles by sex (Table 5). The remaining whistle samples (21 calves/37 adults) were used for generalization testing. Average accuracy during testing was 52% (Figure 3). Maximum performance accuracy was 59%. Binomial probability tests showed that only one network performed at levels significantly above chance during testing (see Table 2).
The NNs (AGE-NET) trained to classify whistles as coming from one of four age-classes (calf, juvenile, sub-adult, and adult) correctly categorized whistles (22 calf whistles, 102 juvenile, 38 sub-adult, and 38 adult) with an average accuracy of 95% after ~10,000 training trials (Figure 3). The network that performed best was 99.5% accurate after 10,000 trials. Acoustic parameters with the highest absolute mean values averaged across all 10 networks were frequency 5%, IQR duration, bandwidth 90%, and IQR bandwidth (Figure 6). A subset of these features overlapped with those found to be important for sorting whistles by sex and for distinguishing calves from adults (Table 5).
Networks tested with the remaining whistle samples (21 calves, 102 juveniles, 37 sub-adults, and 37 adults) achieved an average accuracy of 20% (chance performance was 25%) (Figure 3). Maximum performance accuracy across all networks was 34%.
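One way to summarize which inputs carry the largest weights across a set of trained networks is sketched below. It is an assumption that the summary is the mean of absolute input-to-hidden weights; the text's phrase "absolute mean values" could also mean the absolute value of the mean, and the study's exact procedure is not given. The weight values and the three-input layout are toy numbers for illustration.

```python
def rank_inputs_by_weight(networks, input_names):
    """Rank inputs by mean absolute input-to-hidden weight across networks.

    networks: list of weight matrices, each a list of hidden-unit rows,
    where each row holds one weight per input.
    """
    totals = [0.0] * len(input_names)
    n_rows = 0
    for weight_matrix in networks:
        for hidden_row in weight_matrix:
            for j, weight in enumerate(hidden_row):
                totals[j] += abs(weight)
            n_rows += 1
    means = [total / n_rows for total in totals]
    return sorted(zip(input_names, means), key=lambda pair: pair[1], reverse=True)

# Toy weights: two networks, two hidden units each, three inputs.
toy_names = ["frequency_5", "iqr_duration", "energy"]
toy_networks = [
    [[2.0, -1.0, 0.1], [-3.0, 0.5, 0.2]],
    [[1.5, -1.5, 0.0], [-2.5, 1.0, 0.1]],
]
ranking = rank_inputs_by_weight(toy_networks, toy_names)
print(ranking)
```

Taking absolute values before averaging keeps large positive and large negative weights from cancelling, which matters because the sign of a weight depends on arbitrary details of each network's random initialization.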

Discussion
We investigated whether acoustic variations in dolphins' whistles are indicative of the sex or age of the dolphin producing that whistle. NNs proved able to learn to accurately sort whistles either by the age or sex of a whistling dolphin. This finding suggests that dolphins should also be able to learn to recognize acoustic cues that provide either direct or indirect indications of the physical qualities of individuals with which they regularly associate. Previous studies have shown that specific call features sometimes reflect the physical attributes of a vocalizing mammal (Blumstein & Munos, 2005; Briefer & McElligott, 2011; Charlton et al., 2009a, 2009b; Miller et al., 2007; Pfefferle & Fischer, 2006; Reby & McComb, 2003; Sanvito et al., 2007; Vannoni & McElligott, 2008). However, the acoustic cues that NNs used to classify dolphin whistles in the current study do not appear to correspond to either acoustic features that are often correlated with body size in mammals (e.g., differences in fundamental frequency or formant positions, Fitch & Hauser, 2002; Pisanski et al., 2014), or to the kinds of variations in frequency contours that researchers traditionally use to classify dolphin whistles (e.g., Buck & Tyack, 1993).
NNs that were trained to distinguish whistles produced by calves from whistles produced by adults showed perfect accuracy across multiple network configurations. Networks trained to divide whistles into four age classes performed less well, but were still quite accurate at performing this task. Spectral and energetic cues were critical to the successful classification of calf whistles. Average entropy, in particular, was an important distinguishing cue. Variations in the acoustic entropy of a whistle reflect the complexity of a whistle's spectral profile (i.e., a more variable profile yields a higher entropy). Entropy may be a useful indicator of age because calf whistles tend to be more erratic and unstable than those of adults (Caldwell & Caldwell, 1979; Killebrew, Mercado, Herman, & Pack, 2001; McCowan & Reiss, 1995b). Humans can easily discriminate a child's voice from that of an adult (Smith et al., 2005), and likely would be able to subjectively distinguish the whistles of calves from those of adult dolphins (to our knowledge, this ability has never been systematically investigated). It seems likely that dolphins would also be able to recognize whistles produced by calves as being made by calves, although the relevant cues for dolphins may be more closely related to fluctuations in frequency content than to the kinds of global pitch shifts that enable humans to estimate age from speech sounds (Smith et al., 2005). One question not addressed by the current analysis is whether age differences in whistle qualities might be more apparent in males than in females (or vice versa). It is difficult to directly compare the range of vocal variability produced by each sex using NNs, because the networks perform equally well at classifying the ages of both sexes. If the whistles produced by female dolphins are changing more (or less) during development than those of males, then this difference does not appear to significantly affect the ease with which their whistles can be classified.
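The link between entropy and the spread of energy across frequencies can be made concrete with a small sketch. It uses Shannon entropy of a normalized power spectrum, which is an assumed simplification in the spirit of the measures described by Charif et al. (2010), not a reproduction of Raven's implementation; the spectra are toy values.

```python
import math

def spectral_entropy(power_spectrum):
    """Shannon entropy (bits) of a power spectrum treated as a probability distribution."""
    total = sum(power_spectrum)
    proportions = [p / total for p in power_spectrum if p > 0]
    return -sum(q * math.log2(q) for q in proportions)

def mean_frame_entropy(frames):
    """Average of per-frame entropies across the time slices of a spectrogram."""
    return sum(spectral_entropy(frame) for frame in frames) / len(frames)

# Toy spectra with 8 frequency bins each.
pure_tone = [0, 0, 0, 1, 0, 0, 0, 0]   # all energy in one bin
spread = [1, 1, 1, 1, 1, 1, 1, 1]      # energy spread evenly across bins

print(spectral_entropy(pure_tone))
print(spectral_entropy(spread))
print(mean_frame_entropy([pure_tone, spread]))
```

Energy concentrated in a single bin gives the minimum entropy, while energy spread evenly over all bins gives the maximum for that number of bins, so a whistle whose spectral profile is less stable from moment to moment would tend to score higher.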
The greater difficulty networks had learning to sort whistles into age classes may partly reflect the boundaries chosen to divide dolphins into age groups.Specifically, the cutoffs between adjacent categories (e.g., calves versus juveniles or sub adults versus adults) likely do not correspond to ages when systematic shifts in vocal features occur, such that the networks were forced to classify some similar whistles into different age classes, and some less similar whistles into a single age class.For instance, some features of signature whistles may become stable within the first year of development (Janik & Sayigh, 2013).It is also possible that some whistles from different subgroups (e.g., whistles from female adults and from male calves) may be more similar than whistles within age classes, as is the case in human vocalizations where the pitches produced by adult females and children are often more similar than those of adult males and females (Smith et al., 2005).Spectral and energetic cues still contributed the most to the success of age classification networks, suggesting that at least some of the cues that distinguish calf whistles from those of adults may also be correlated with age-related variations in whistle production throughout a dolphin's lifespan.
NNs trained to sort whistles based on the sex of the dolphin were also able to learn this task with perfect or near perfect accuracy. Once again, frequency cues were critical to the success of these networks, suggesting that frequency content may enable listening dolphins to recognize the sex of familiar whistlers, as assumed by proponents of the signature whistle hypothesis (Janik & Sayigh, 2013; Janik et al., 1994), as well as by those who have argued against it (McCowan & Reiss, 1995a, 2001). An earlier study of sex differences in the calls of killer whales (Orcinus orca; the largest dolphin) also reported that spectral cues could be used to distinguish the sex of the caller, but only for a subset of calls (Miller et al., 2007).
Energetic and temporal cues also contributed to the success of sex-based classifiers, and it is possible that, in cases where spectral cues alone were uninformative, the networks either made use of these alternative cues or used a combination of spectral and non-spectral cues to classify certain whistles. Miller and colleagues (2007) speculated that sexual dimorphism in resonating cavities might contribute to the between-sex spectral differences in killer whales that they observed (see also Kremers et al., 2012). Such variations (if they exist) could also potentially affect between-sex differences in the spectral features of whistles produced by bottlenose dolphins. It is unclear, however, why such anatomical differences might lead to differences in the temporal or entropic features of whistles. Variations in the entropy of vocalizations between the sexes have yet to be studied in any mammal, so the possible sources of such differences remain obscure. Possibly, the size or properties of the vibrating membranes used to produce whistles are sexually dimorphic in ways that affect how precisely those vibrations can be controlled. Such differences might also constrain how long a dolphin can sustain vibrations in a particular mode, which could in turn lead to temporal differences between male and female whistles.
Overall, analysis across all the NN classifiers developed in this study highlighted the importance of spectral and energetic cues as indicators of physical characteristics of whistling dolphins. Importantly, these results show that recognition of age and sex of familiar individuals is possible without information about frequency contours, which have been the primary focus of most past analyses of dolphin whistles (e.g., Buck & Tyack, 1993; King et al., 2014; McCowan, 1995). Though frequency modulation has been shown to be salient to dolphins, that in no way implies it is the sole acoustic feature that is relevant for every possible function dolphin whistles may serve. Presumably, signature whistles are not selectively androgynous, in that they contain all the relevant acoustic features found within non-signature whistles, so it follows that there may be other cues present within the entire whistle repertoire (signature and non-signature whistles) of an individual. It is also important to note that if frequency contour features contained the only acoustic information about age and sex in any given dolphin's whistle, then the NNs would have failed. The high success rates of the NNs during training in determining the age and sex of familiar whistlers make this assumption invalid. Bottlenose dolphins demonstrate excellent fine frequency discrimination (Thompson & Herman, 1975), the ability to retain information about particular frequencies over both short and long durations (Thompson & Herman, 1977), as well as the ability to both expand and compress time-varying tonal sounds (including both natural and artificial sounds) along both spectral and temporal dimensions (Mercado et al., 2014). These sophisticated sound perception and production capacities strongly imply that dolphins would be able to process and recognize acoustic cues comparable to the ones used as inputs to neural networks in the current study, and to thereby either directly or indirectly identify physical characteristics of familiar whistlers that are not visibly detectable. Future studies investigating the potential functionality of dolphin whistles should assess variations in acoustic features in addition to frequency contours, as well as the differences in the mechanisms that give rise to these acoustic variations across dolphins of different ages and sexes.
A key constraint faced by whistling dolphins is that they typically are producing whistles in shallow water sound channels that can generate large, distance- and depth-dependent fluctuations in spectral energy (Mercado & Frazer, 1999; Mercado et al., 2007). Consequently, many acoustic cues that might be indicative of physical characteristics in terrestrial mammals may be less reliable cues for listening dolphins. Furthermore, past experimental studies of dolphin auditory perception and vocal imitation abilities have repeatedly shown that bottlenose dolphins readily recognize (and can transpose) whistle-like sounds across octaves (Mercado et al., 2014). Several past analyses have thus suggested that dolphins can flexibly "warp" the spectral and temporal structure of whistles without loss of functionality (Buck & Tyack, 1993; McCowan, 1995). High versatility in sound production combined with large channel-related distortion strongly constrains the potential utility of acoustic cues as indicators of physical characteristics of whistling dolphins.
The NNs used in the current study successfully classified whistles into groups reflecting the characteristics of the dolphin that produced them. However, not all networks used variations in the same acoustic cues to accomplish this task. Although the most successful networks typically gave more weight to spectral cues than other cues, networks that did not rely extensively on spectral cues were still able to classify whistles well. This raises the possibility that the cues within dolphin whistles that are indicative of physical characteristics might vary across whistles. More detailed analyses in which similar whistles recorded from various distances, in different water depths, and from multiple individuals are compared in terms of how reliably they reveal physical qualities of whistlers can be used to test this hypothesis (see also related analyses by Blumstein & Munos, 2005).
The finding that NNs were generally unable to accurately classify whistles other than those that they had been exposed to during training provides further evidence that intrinsic constraints on sound production do not consistently reveal a whistler's characteristics. Only NNs trained to classify whistles by sex proved to be able to classify unfamiliar whistles at levels significantly above chance, and their performance was relatively unimpressive. Miller and colleagues (2007) found that only a small subset of killer whale calls showed sex differences in raw frequency measures and that the intensity ratio of the first two harmonics of calls was more consistently sexually dimorphic. Although bottlenose dolphin whistles differ in many respects from the calls used by killer whales, both species face similar environmental and social coordination challenges. Furthermore, bottlenose dolphins produce whistles using mechanisms comparable to those used by calling orcas. The kinds of acoustic cues that indicate sex might thus be similar across these two dolphin species.
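The comparison of generalization accuracy against chance can be sketched as an exact binomial test. The sketch below is illustrative only: it assumes a one-sided test and chance levels of 0.5 for two-category classifiers and 0.25 for the four age classes (equal prior probability per category), neither of which is specified in this section, and the counts are hypothetical rather than taken from the study.

```python
from math import comb

def binomial_p_value(successes, trials, chance):
    """One-sided exact binomial test: P(X >= successes) when each
    classification is correct with probability `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical counts of correctly classified novel whistles.
p_two_class = binomial_p_value(successes=30, trials=40, chance=0.5)
p_four_class = binomial_p_value(successes=12, trials=40, chance=0.25)
print(p_two_class, p_four_class)
```

A small p-value indicates that the observed number of correct classifications would be unlikely if a network were simply guessing among the available categories.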
The relative inability of NNs to generalize age-based classifications to unfamiliar whistles suggests that identifying physical characteristics of unfamiliar dolphins based on their whistles might be particularly challenging. It is currently unknown what, if any, non-spatial information listening dolphins might be able to glean from hearing the whistles of unfamiliar dolphins. Unknown dolphins may be acoustically androgynous and ambiguously aged until they have been visually or physically encountered. It seems unlikely, however, that a female dolphin would not be able to consistently distinguish the whistles of an unfamiliar calf from those of a potentially aggressive adult male without coming into close proximity. More likely possibilities are that the acoustic features that enable dolphins to make such distinctions were not provided as inputs to the neural networks used in the current study, or that whistles produced in contexts where unfamiliar dolphins are likely to meet differ systematically from those produced when with familiar companions. Experimental tests of dolphins' abilities to classify whistles as coming from individuals of different ages or sexes could clarify the precision with which dolphins can make such distinctions, and how well they can do so for different whistle types from familiar and unfamiliar individuals. An alternative explanation of the poor generalization by networks is that dolphins rely less on acoustic cues from whistles to identify qualities of the individuals producing them and more on the situation within which those whistles are produced (e.g., the speed and direction of movement of the whistler, the rate of whistle production, or the sequencing and selection of whistle types relative to other whistles being received). Knowledge of the current social context and past social encounters may also provide clues to the age and sex of unfamiliar whistlers.
One possibility not directly addressed by the current analyses is whether NNs might learn to sort whistles into different demographic groups by learning to recognize idiosyncratic qualities of whistles produced by particular individuals. It is clear that the NNs did not learn individual-specific "voice prints" during training, because in that case the NNs should have shown near perfect generalization when presented with novel whistles produced by the same individuals. Nevertheless, it remains possible that the NNs learned to recognize specific combinations of whistle features that were only produced by a single individual. If the NNs classified whistles in this way, then this would imply that individual dolphins produce a "signature repertoire" rather than just one stereotyped signature whistle. Because such signatures would have to be based on unique combinations of acoustic cues, they would likely not be subjectively apparent or detectable through analyses of frequency contours. Whether such cue combinations would reliably reveal the characteristics or identity of a whistling dolphin over longer distances remains unclear. What is clear is that none of the individual acoustic cues analyzed in this study provides an unambiguous marker of a dolphin's age or sex across all types of whistles.
In several mammals, the fundamental frequencies of calls are strongly constrained by the length and mass of vocal folds, such that physical differences due to age-related vocal fold growth or sexual dimorphism directly affect characteristics of calls (Taylor & Reby, 2009). However, vocal fold growth is not typically constrained by body size of an individual (Fitch, 1997), and researchers have been unable to reach a consensus regarding whether or not fundamental frequency directly correlates with body size (Fitch, 1997; Pfefferle & Fischer, 2006; Pisanski et al., 2014; Reby et al., 2005; Rendall, Kollias, Ney, & Lloyd, 2005). Past studies of the mechanisms that dolphins use to produce whistles (e.g., Cranford et al., 1996; Madsen et al., 2013) have not assessed how these structures might vary across ages or sexes. Consequently, it is unclear how such variations (assuming they exist) might affect whistle structure. As noted earlier, it is also unknown whether any resonating structures involved in whistle production might vary in size across individual dolphins in ways that impact whistle structure. In general, any such anatomical differences in vocal organs would be expected to have similar effects on all whistles with similar frequency content, and thus should facilitate the ability of NNs to generalize to unfamiliar whistles from familiar individuals. The fact that NNs showed limited abilities to distinguish unfamiliar whistles produced by familiar calves from unfamiliar whistles produced by familiar adults suggests that the cues that networks learned to use to distinguish calf whistles from adult whistles were not cues related to differences in body size (Tolley et al., 1995), or to associated variations in the size or shape of static resonating air cavities. To the extent that dolphin whistles contain cues indicative of physical qualities such as age or sex, those cues either are independent of the anatomical properties of production mechanisms, or are related to those properties in ways that have not been observed in terrestrial mammals.
The current analysis is a first attempt at clarifying what features of dolphin whistles, if any, provide indications of the physical qualities of the dolphin producing those whistles. As such, it is limited in several respects. First, the effects of sound transmission on the recorded whistles included in this analysis were not measured. Whistling dolphins were recorded in water of varying depths, from varying distances, and in situations where dolphins might have been shallower or deeper than the diver collecting recordings. If dolphins of different ages or sexes tend to whistle at particular depths below the surface, then variations in whistles across classes might in part reflect systematic variations in environmental effects on whistles. A second limitation is that the sample of whistles analyzed in this study was opportunistic, and therefore might not be representative of the full range of whistles that these and other dolphins produce in other social contexts. Consequently, we cannot say whether NNs would be as successful at classifying whistles produced in other contexts, by other bottlenose dolphins, or by other species of dolphin. Finally, more biologically based measures of whistle features, for example based on the auditory resolution of dolphins (Branstetter, Mercado, & Au, 2007), or on production-based measures of vocalizations (Favaro, Briefer, & McElligott, 2014; Vannoni & McElligott, 2007), might provide clearer indications of the cues that are available to listening dolphins. Further investigations of age and sex differences in dolphin whistles should include different and/or more extensive acoustic measures to determine whether other cues indicative of the physical qualities of whistling dolphins exist.
Animal communication researchers have long attempted to associate variations in sounds with differences in the information those sounds convey (Owings & Morton, 1998; Seyfarth & Cheney, 2010). In many cases, the focus of these studies has been on characterizing how sounds vary rather than on understanding the mechanisms that give rise to that variability. Past investigations of "honest advertisement" by calling animals have emphasized obligatory variations in calls (Fitch & Hauser, 2002), giving less consideration to how sound selection and modulation might interact with the availability of cues. For animals with highly flexible vocal control systems, such as cetaceans and songbirds, the possibility exists that vocalizers may be able to differentially or selectively reveal their physical characteristics through either call selection or modulation of call structure. For instance, a dolphin might produce whistles that familiar conspecifics would associate with the identity or physical qualities of the whistler, but that might provide no cues to unfamiliar dolphins of the vocalizer's sex or age. A dolphin might also selectively whistle in ways that more clearly reveal its sex or age (e.g., if a whistle was used to solicit or deter approaches by unfamiliar individuals). Whether and when animals make use of vocal communication for such "selective advertisement" is an important question for future research.
Finally, the fact that NNs were able to learn to classify all whistles that they were trained with as coming either from males or females and as being produced by either younger or older dolphins suggests that: (1) dolphins likely can identify familiar individuals based on hearing almost any whistle that those individuals produce, and (2) representations of time-varying patterns of frequency modulation are not necessary to make such distinctions. Experimental studies that test whether dolphins trained to detect whistles produced by a particular individual generalize this ability to other whistle types that the same individual frequently produces can clarify the extent to which dolphins naturally sort whistles based on which dolphins are producing them.

Figure 1. Example spectrograms of whistles used in the current study that were classified as showing either (a) no overlap with other vocalizations, or (b) minimal overlap.

Figure 2. Example of the NN structure used for sex classification. NNs are computer programs that learn to classify various inputs through pattern recognition (Haykin, 1994; Ripley, 2008). They are typically composed of three or more units (neurons), organized into separate processing layers: an input layer, an output layer, and at least one hidden layer. Units in each layer send information to other units through weighted connections, which are automatically modified through training to produce desired outputs (Dawson, 2008; Reby et al., 1997), essentially combining input information (i.e., the 14 acoustic parameters) in all possible ways, via hidden units, to most accurately categorize outputs. The weights of these connections determine which category (e.g., male or female) a NN will select for each whistle. Input layer: AP = acoustic parameter. Hidden layer: HU = hidden unit.
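The layered structure described in this caption can be sketched as a single forward pass. This is a minimal illustration and not the networks used in the study: the number of hidden units, the sigmoid activation, the use of one output unit per category, and the random (untrained) weights are all assumptions, and the input values are made up.

```python
import math
import random

N_INPUTS = 14   # acoustic parameters (from the caption)
N_HIDDEN = 5    # hypothetical; the caption does not state the number of hidden units
LABELS = ("male", "female")

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_out, n_in, rng):
    """Random weights plus one bias per unit, standing in for trained weights."""
    return [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward(layer, inputs):
    """Per unit: weighted sum of inputs plus bias, passed through a sigmoid."""
    return [
        sigmoid(w[-1] + sum(wi * xi for wi, xi in zip(w, inputs)))
        for w in layer
    ]

def classify(whistle, hidden_layer, output_layer):
    """Select the category whose output unit is most active."""
    hidden = forward(hidden_layer, whistle)
    outputs = forward(output_layer, hidden)
    return LABELS[outputs.index(max(outputs))]

rng = random.Random(0)
hidden_layer = make_layer(N_HIDDEN, N_INPUTS, rng)
output_layer = make_layer(len(LABELS), N_HIDDEN, rng)
whistle = [rng.random() for _ in range(N_INPUTS)]  # made-up, normalized values
print(classify(whistle, hidden_layer, output_layer))
```

Training would adjust the weights so that the most active output unit matches the known category of each training whistle; that step is omitted here.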

Figure 3. Accuracy for NN classifiers after training and in generalization tests.

Figure 4. Mean hidden unit values calculated from the five highest performing sex classification networks. Means were calculated using the three hidden units that contributed the most to the success of each neural network. Higher hidden unit values indicate a larger impact on which output unit became active. Bars indicate standard errors of the means.

Figure 5. Mean hidden unit values calculated from the five highest performing NNs trained to distinguish calf whistles from adult whistles. Higher hidden unit values indicate greater relevance to classification of whistles by the network. Bars are standard errors of the means.

Figure 6. Mean hidden unit values calculated for each acoustic parameter for neural networks trained to sort whistles into four age classes. Values were calculated from all networks and all hidden units to indicate overall acoustic parameter significance. Bars are standard errors of the mean.

Table 1
Number of Whistling Dolphins and Whistles Recorded According to Categories of Age Class and Sex
Two adult dolphins were only recorded in 2009 and one was only recorded in 2010.
* One whistle could only be identified according to gender; therefore there are 398 total whistles according to the sex category, and 397 total whistles according to age class category.

Table 2
Binomial test p-values calculated for the performance accuracy of each network during generalization testing with novel inputs

Table 5
Mean Hidden Unit Values Calculated for Acoustic Parameters for each Category Type for all NN Classifiers
Mean hidden unit values for the age class classifier were calculated with all output categories combined, because hidden units could not be isolated according to specific outputs. Bold text indicates the highest mean values for each category.