Computing Sound Space: World-centered Sound Localization in Ferrets

The ability to localize sounds is central to healthy hearing. We can perceive sound location in multiple coordinate systems, including those defined by the observer (e.g. “the phone is on my right”) or by the environment (e.g. “the phone is in the office”). Although we can describe sound locations in multiple spaces, the coordinate frames in which non-human animals can perceive sounds remain unclear. Here, we designed a task that required subjects (ferrets) to report the location of sounds in the world across changes in head pose. We developed simulations of the task using world-centered (allocentric) or head-centered (egocentric) models of spatial processing, and compared model predictions to animal behavior. We found that observed behavior most closely matched the performance of allocentric models, indicating that subjects solved the task using a world-centered strategy. Our findings indicate that ferrets, like humans, can perceive allocentric sound space and thus abstract sound location beyond momentary head-centered acoustic cues.


Introduction
Sound localization is critical to many natural behaviors in both humans and other animals. The ability to localize sounds depends on the use of multiple redundant acoustic cues, including monaural spectral cues introduced by ear shape and binaural cues extracted by comparing the sound signals at the two ears. The auditory brain must use these cues to compute the location of sounds de novo because, in hearing, there is no spatially organized array of sensory receptors as there is on the retina in vision or the skin in touch.
Computing sound location requires the definition of a coordinate system in which to represent space. In hearing, it has largely been assumed that sounds are encoded in the head-centered coordinates in which acoustic signals are natively sampled. However, recent studies have shown that neurons in auditory cortex can be tuned to sound location in other coordinate systems, including those defined by the organism's environment across changes in head pose (Town, Brimijoin, & Bizley, 2017). These findings illustrate the potential for integration of auditory and non-auditory information to remap sounds into behaviorally relevant spaces defined by the world. Such cells might in effect provide an equivalent to place cells in hearing, encoding the location of auditory landmarks.
A critical question in this field, however, is how findings from neurophysiology in passively listening or anesthetized animals translate to sound perception. Human listeners can verbally report world-centered sound location, and neural signatures such as mismatch negativity can be elicited by changes in world-centered sound location (Altmann, Wilczek, & Kaiser, 2009; Schechtman, Shrem, & Deouell, 2012). However, it is unknown whether other species can even perceive sounds in different coordinate systems. Such insights are critical both for developing animal models of sensory processing and for understanding comparative cognition more broadly.
At the behavioral level, studies of sound localization in animals have often used an approach-to-target design in which a subject at the center of a speaker ring initiates presentation of sound from a peripheral target and then approaches that target (Keating, Dahmen, & King, 2013; Malhotra et al., 2008). While this experimental design has enabled critical insights into neural processing during sound localization (Bajo et al., 2010; Keating, Dahmen, & King, 2015), it cannot reveal whether subjects are reporting head-centered or world-centered sound location. This ambiguity arises from the constant head pose of animals during sound presentation, which introduces a strong correlation between sound location in head and world coordinate frames. Here we developed a new behavioral task to test whether animals (ferrets) could report sound location in world-centered coordinates.

Task design
Coordinate frame ambiguity may be resolved by varying head pose across trials. Here we developed a novel task in which freely-moving subjects discriminated the location of two sound sources with head pose varying across trials.

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0
Figure 1 shows the task design in which subjects at a central platform (C) initiate sound presentation from one of two speakers (source A or B) and can visit one of two response ports (X or Y) in order to obtain reward. Here, sounds were broadband noise bursts (250 ms) matched for level across speakers (~60 dB SPL). A reward contingency was imposed to reinforce response X after stimulus A and response Y after stimulus B.
Across trials, the central platform was rotated so that sound angle relative to the platform (and thus the subject's head) varied, while source position in the world remained constant. In animal experiments, the platform was rotated in 30° steps between ±180°, and the angle changed only between sessions (i.e. blocks of 50-250 trials). Our simulations sampled performance at much smaller intervals (3°) and varied platform angle between individual trials.
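The relationship between world-centered and head-centered sound angle under platform rotation can be sketched as follows. This is a minimal illustration, not the code used in the study: the speaker angles for sources A and B are hypothetical, and the sketch assumes that the head is aligned with the platform and that all angles are measured in the same rotational direction.

```python
def wrap_angle(deg):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def head_centered_angle(source_world_deg, platform_deg):
    """Angle of a sound source relative to the platform (and thus the head).

    Assumption: the head points along the platform's reference direction.
    """
    return wrap_angle(source_world_deg - platform_deg)


# Hypothetical world-centered speaker angles for sources A and B.
SOURCES_WORLD = {"A": -90.0, "B": 90.0}

# Platform angles in 30 degree steps, as in the animal experiments.
PLATFORM_ANGLES = [float(a) for a in range(-180, 180, 30)]

for platform in PLATFORM_ANGLES:
    relative = {name: head_centered_angle(angle, platform)
                for name, angle in SOURCES_WORLD.items()}
    # World-centered angles never change; head-centered angles do.
    print(platform, relative)
```

Each source keeps a single world-centered angle, but its head-centered angle takes a different value at every platform rotation, which is what dissociates the two coordinate frames.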
Finally, on a small proportion of trials (≤ 10%) we presented probe sounds pseudo-randomly from separate locations around the world and relative to the head. All responses to probe sounds were rewarded.

Animal behavior
We trained two female ferrets (Mustela putorius furo) to perform the task. Ferrets were selected as a model species for their sensitivity to low sound frequencies, which enables use of both the inter-aural level and timing localization cues available to human listeners. Ferrets are also widely used in approach-to-target sound localization tasks (Keating et al., 2013; Wood et al., 2017).

Simulations
We qualitatively compared animal behavior with the performance of three models that considered: (1) sound and response location in world-centered coordinates (Fig. 2A, fully allocentric model), (2) sound location in head-centered coordinates and response location in world-centered coordinates (Fig. 2B, partial egocentric model), and (3) sound and response location in head-centered coordinates (Fig. 2C, fully egocentric model).
Models 1-2 considered the probability of responding at spout Y as a function of sound location either in the world or relative to the head. In this function, x is the speaker's x-axis position in the relevant coordinate system, α is a constant reflecting the uniform marginal distribution on the y-axis, and β determines the slope of spatial modulation (e.g. β = 4 in Fig. 2).
Model 3 considered sound location relative to the head with an additional 'tilt' parameter (ϴ) that rotated the joint density function in head-centered space, while the function itself described the probability of responding left (0° ≤ atan2(y, x) ≤ 180°) rather than at a particular spout. Here r is the rotation matrix about the z-axis in the head coordinate frame for ϴ (e.g. β = 20, ϴ = 7.5°).
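The three response models can be sketched in code under explicit assumptions. The logistic form below is an assumed stand-in for the response function, chosen only because it has a slope parameter β; the constant α is omitted (equivalent to setting it to 1). The axis convention (head x-axis running from left to right) and all function names are hypothetical and may differ from those in Fig. 2.

```python
import math


def logistic(z):
    """Standard logistic function (assumed functional form)."""
    return 1.0 / (1.0 + math.exp(-z))


def rotate_z(x, y, angle_deg):
    """Rotate the point (x, y) counterclockwise about the z-axis."""
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))


def world_to_head(x, y, platform_deg):
    """Express a world-centered point in platform (head) coordinates."""
    return rotate_z(x, y, -platform_deg)


def p_spout_y_allocentric(x_world, y_world, platform_deg, beta=4.0):
    """Model 1: depends on world-centered x only; rotation is ignored."""
    return logistic(beta * x_world)


def p_spout_y_partial_egocentric(x_world, y_world, platform_deg, beta=4.0):
    """Model 2: same function, evaluated on head-centered x."""
    x_head, _ = world_to_head(x_world, y_world, platform_deg)
    return logistic(beta * x_head)


def p_left_fully_egocentric(x_world, y_world, platform_deg,
                            beta=20.0, tilt_deg=7.5):
    """Model 3: probability of responding to the animal's left.

    Head-centered coordinates are rotated by the tilt before evaluation.
    Assumption: negative head-centered x corresponds to the left.
    """
    x_head, y_head = world_to_head(x_world, y_world, platform_deg)
    x_tilt, _ = rotate_z(x_head, y_head, tilt_deg)
    return logistic(-beta * x_tilt)


# Hypothetical speaker B on the positive world x-axis.
for platform in (0.0, 90.0, 180.0):
    print(platform,
          round(p_spout_y_allocentric(1.0, 0.0, platform), 3),
          round(p_spout_y_partial_egocentric(1.0, 0.0, platform), 3),
          round(p_left_fully_egocentric(1.0, 0.0, platform), 3))
```

Models 1 and 2 share one function and differ only in the coordinate frame of its input, whereas Model 3 also changes the frame of the output from a spout in the world to a side of the head.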

Rotational invariance
Both ferrets discriminated accurately across platform rotation (Fig. 3A), indicating that these animals could generalize world-centered sound location across the head-centered cues available on individual trials.
When comparing simulations, only the fully allocentric model performed above chance at all platform rotations (Fig. 3B). When sound location was represented relative to the head and response relative to the world (partial egocentric model), performance varied as a cosine function of platform angle, such that mean performance across rotations was at chance (Fig. 3C). In contrast, when we adapted an egocentric model to respond in head-centered space, mean performance across rotations returned close to ceiling (Fig. 3D). However, averaging across rotations obscured performance troughs (Fig. 3E) at angles around ϴ ± 90°, with width related to the sharpness of spatial tuning (β). Although such troughs are visible in simulation, they would not be detected at the rotation intervals used in our behavioral experiment, and thus responses to trained sounds alone were insufficient to conclude that animals used a world-centered strategy.
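The dependence of performance on platform angle can be illustrated with a short simulation at 3° steps. This is a sketch under stated assumptions rather than the simulation reported here: the response function is assumed to be logistic, the speaker sits at unit distance on the world x-axis, and the scoring rule for the fully egocentric model (a response is correct when it is directed to the side of the head on which the rewarded port lies) is a hypothetical simplification.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def accuracy_allocentric(platform_deg, beta=4.0):
    # World-centered x of the speaker does not change with rotation.
    return logistic(beta)


def accuracy_partial_egocentric(platform_deg, beta=4.0):
    # Head-centered x of the speaker scales with cos(platform angle).
    return logistic(beta * math.cos(math.radians(platform_deg)))


def accuracy_fully_egocentric(platform_deg, beta=20.0, tilt_deg=7.5):
    # Side of the (tilted) head on which the sound falls.
    sound_side = math.cos(math.radians(tilt_deg - platform_deg))
    # Side of the head on which the rewarded port falls (assumed scoring).
    port_side = math.cos(math.radians(platform_deg))
    p_same_side_as_sound = logistic(beta * sound_side)
    if port_side >= 0:
        return p_same_side_as_sound
    return 1.0 - p_same_side_as_sound


ROTATIONS = list(range(-180, 180, 3))
allocentric = [accuracy_allocentric(r) for r in ROTATIONS]
partial = [accuracy_partial_egocentric(r) for r in ROTATIONS]
full = [accuracy_fully_egocentric(r) for r in ROTATIONS]

mean_partial = sum(partial) / len(partial)
mean_full = sum(full) / len(full)
trough_angle = ROTATIONS[full.index(min(full))]

print("allocentric minimum:", min(allocentric))
print("partial egocentric mean:", mean_partial)
print("fully egocentric mean and minimum:", mean_full, min(full))
print("fully egocentric trough at platform angle:", trough_angle)
```

In this sketch the fully egocentric troughs span only a few degrees, so a 30° sampling grid can pass over most of their extent, in line with the point that trained sounds alone do not separate the two strategies.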

Probe sounds split model predictions
To resolve the ambiguity between fully allocentric and fully egocentric models, we presented probe sounds from additional untrained locations. The two models made divergent predictions for probe sounds: responses were invariant to platform rotation in the allocentric case, whereas they varied systematically with platform angle in the egocentric case (Fig. 4A).
Animals' responses to probe sounds were largely robust to platform rotation (Fig. 4B), as predicted by the allocentric but not the egocentric model. Combining data from both animals, we found that the standard deviation of responses across platform rotations was consistently lower than across speaker angles. This was also the case for allocentric but not egocentric simulations. Thus our ferrets' behavior was best matched by a model that reported world-centered sound location.
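The comparison of response variability across platform rotations and across speaker angles can be sketched as below. The probe speaker angles, the logistic response functions and the egocentric scoring rule are all assumptions made for illustration; the toy egocentric model is not tuned to reproduce the simulations in Fig. 4, so only the qualitative contrast with the allocentric model should be read from it.

```python
import math
from statistics import mean, pstdev


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def p_y_allocentric(speaker_deg, platform_deg, beta=4.0):
    """P(respond at Y) from the world-centered speaker angle only."""
    return logistic(beta * math.cos(math.radians(speaker_deg)))


def p_y_egocentric(speaker_deg, platform_deg, beta=20.0, tilt_deg=7.5):
    """P(respond at Y) when sound and response are both head-centered."""
    head_deg = speaker_deg - platform_deg + tilt_deg
    p_side = logistic(beta * math.cos(math.radians(head_deg)))
    port_y_same_side = math.cos(math.radians(platform_deg)) >= 0
    return p_side if port_y_same_side else 1.0 - p_side


def spread(model, speakers, platforms):
    """Mean SD across rotations (per speaker) and across speakers (per rotation)."""
    table = [[model(s, p) for s in speakers] for p in platforms]
    sd_across_rotations = mean([pstdev(col) for col in zip(*table)])
    sd_across_speakers = mean([pstdev(row) for row in table])
    return sd_across_rotations, sd_across_speakers


# Hypothetical probe speaker angles and platform rotations (degrees).
PROBE_SPEAKERS = list(range(-180, 180, 30))
PLATFORMS = list(range(-180, 180, 30))

for name, model in (("allocentric", p_y_allocentric),
                    ("egocentric", p_y_egocentric)):
    sd_rot, sd_spk = spread(model, PROBE_SPEAKERS, PLATFORMS)
    print(name, "SD across rotations:", sd_rot, "SD across speakers:", sd_spk)
```

For the allocentric model the spread across rotations is zero by construction, so any variability in responses comes from speaker angle alone.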

Generalization vs. relearning
As a final control, we also measured performance as a function of experience after platform rotation. In contrast to simulations, our animal work only rotated the platform between sessions and not across individual trials. This raises the possibility that animals initially localized sounds using a head-centered strategy and then rapidly relearned the stimulus-response mapping within a session.
To exclude this possibility, we plotted performance on only the first trial after platform rotation (i.e. when the animal had no experience within the session). In contrast to the rapid remapping hypothesis, we found that animals performed accurately on the first trial after rotation (Fig. 5) indicating that animals generalized from a rule-based strategy.
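A first-trial analysis of this kind can be sketched as follows. The record layout, field names and example trials are hypothetical placeholders used only to show the selection step; they are not data from this study.

```python
from collections import namedtuple

# Hypothetical trial record: session label, trial order within the session,
# and whether the response was correct.
Trial = namedtuple("Trial", ["session", "trial_index", "correct"])


def first_trial_accuracy(trials):
    """Proportion correct using only the first trial of each session.

    Each session follows a platform rotation, so its first trial is made
    with no within-session experience of the new platform angle.
    """
    first = {}
    for trial in trials:
        seen = first.get(trial.session)
        if seen is None or trial.trial_index < seen.trial_index:
            first[trial.session] = trial
    if not first:
        raise ValueError("no trials supplied")
    return sum(1 for t in first.values() if t.correct) / len(first)


# Made-up example records, for illustration only.
example_trials = [
    Trial("s1", 1, False),
    Trial("s1", 0, True),
    Trial("s2", 0, True),
    Trial("s2", 1, True),
    Trial("s3", 0, False),
    Trial("s3", 1, True),
]

print(first_trial_accuracy(example_trials))
```

Restricting the analysis to one trial per session removes any contribution of within-session relearning, at the cost of a much smaller sample.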

Conclusions
Our results demonstrate that ferrets can report the world-centered location of sounds across variations in head pose, suggesting that humans are not unique in this ability. The behavior we observed cannot be accounted for by models based on head-centered sound location. Instead, our data support the suggestion that auditory information must be combined with non-auditory signals such as balance, vision and proprioception to abstract sound location in the world (Yost, Zhong, & Najam, 2015). How this occurs in the brain is a topic of ongoing investigation, though brain regions such as auditory cortex (from which we are currently recording during task performance) are likely to play a critical role (Malhotra et al., 2008; Wood et al., 2017).
In addition, our simulations emphasize the importance of considering the coordinate systems of action in sound localization. By switching the space in which responses were mapped from world-centered to head-centered, we recovered successful task performance from egocentric models, even though the representations of sound location were the same in both models.