Neurocomputational Mechanisms Contributing to Auditory Perception

Yale E. Cohen 1,2,3, Taku Banno 1, Jaejin Lee 1, Francisco Rodriguez-Campos 1, Matthew Schaff 4, Lalitta Suriya-Arunroj 1, Joji Tsunada 1

1 Department of Otorhinolaryngology, University of Pennsylvania, Philadelphia, PA 19104. ycohen@pennmedicine.upenn.edu
2 Department of Neuroscience, University of Pennsylvania, Philadelphia, PA 19104
3 Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104
4 Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, PA 19104


Introduction
Hearing is fundamental to, and spans, our human existence. We can hear selectively: we can focus on a sound that a banjo player is producing and ignore the music produced by the rest of the band. But, then, we can switch our attention to the voice of the band's singer. We can hear in a variety of environments: in a quiet library or in a busy, loud restaurant. Hearing in this loud restaurant is particularly challenging because our dining partner's voice is embedded in a complicated mixture of stimuli: other patrons talking, waiters describing the specials of the day, dishes clanking, and background music.
This remarkable capacity is associated with a number of computational processes, which act both in parallel and in series, including perceptual grouping, decision-making, and attention. (1) Perceptual grouping is a form of feature-based stimulus segmentation that determines whether acoustic events are grouped into a single sound or are segregated into distinct sounds [1]. (2) Auditory decision-making is a computational process in which the brain interprets sensory information in order to detect, discriminate, and identify the source or content of auditory stimuli [2]. (3) Although attention is not always necessary, our awareness of a sound can be influenced by attention [3]. For example, we can choose whether to listen to or ignore the mandolin, the fiddle, or even the whole band during a concert. Likewise, we can selectively attend to the particular features in a person's voice that allow us to identify the speaker.

Received 19 February 2018; accepted 27 August 2018.
It is widely believed that the neural computations and processes that mediate auditory perception are found in the ventral auditory pathway [4,5,6]. In rhesus monkeys [7,8,4], this pathway begins in the anterolateral (AL) belt region of the auditory cortex, which receives input from core regions of the auditory cortex (specifically, the primary auditory cortex (A1) and field R) and from the mediolateral (ML) belt region. AL, in turn, projects to the ventrolateral prefrontal cortex (vlPFC), both directly and indirectly via the rostral parabelt (rPB) region of the auditory cortex. An analogous pathway has been identified in humans [7].
In this review, we focus on the processing that occurs in different stages of the ventral auditory pathway. We review the mechanisms and representations along the ventral pathway, with an emphasis on those experiments that combined electrophysiology with auditory behavior. We integrate these studies into a model of hierarchical processing in the ventral pathway. We attempt to reconcile different sets of studies and pose challenges to the field as to how to best study processing along this pathway.

Neural correlates of auditory perception in the auditory cortex
Although A1 is not technically a part of the ventral pathway, a large literature has focused on its contribution to perception and behavior. The results of these studies have been, to some degree, controversial. Typifying this controversy is A1's contribution to perceptual decision-making. To study decision-making, humans and non-human animals engage in a behavioral task while neuronal signals are recorded simultaneously. The critical manipulation is to probe behavior when a stimulus is "ambiguous". For example, if a listener is asked to report whether a sequence of tone bursts contained more "low-frequency" or "high-frequency" tone bursts (a "low-high" task), an ambiguous or "noisy" stimulus would be one that contained 50% low-frequency tone bursts and 50% high-frequency tone bursts. Using this strategy, changes in stimulus features can be separated from those related to changes in the behavioral report. In essence, if a neuron codes a stimulus feature, its response should be invariant to the behavioral choice. In contrast, if a neuron reflects the behavioral choice, then its activity should differ as a function of the choice. This is often called "choice-related" activity.
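One common way to quantify choice-related activity at a fixed ambiguous stimulus is the "choice probability": the area under the ROC curve separating a neuron's firing-rate distributions on trials sorted by the behavioral report. The sketch below is purely illustrative and is not drawn from any study reviewed here; the spike counts and choice labels are simulated, and the effect size is an arbitrary assumption.

```python
import numpy as np

def choice_probability(rates_a, rates_b):
    """ROC area separating firing rates on 'choice A' trials from
    'choice B' trials. 0.5 means no choice-related activity; values
    away from 0.5 mean activity covaries with the behavioral report."""
    a = np.asarray(rates_a, dtype=float)[:, None]
    b = np.asarray(rates_b, dtype=float)[None, :]
    # Fraction of trial pairs in which the 'A' rate exceeds the 'B'
    # rate, counting ties as half (Mann-Whitney U / (n_a * n_b)).
    return (a > b).mean() + 0.5 * (a == b).mean()

rng = np.random.default_rng(0)
# Simulated spike counts for one hypothetical neuron at a fixed 50/50
# stimulus: slightly higher rates on trials reported as "high".
high_trials = rng.poisson(12, 200)   # trials reported "high"
low_trials = rng.poisson(10, 200)    # trials reported "low"
cp = choice_probability(high_trials, low_trials)
print(f"choice probability ~ {cp:.2f}")
```

A choice probability reliably above (or below) 0.5 for a physically identical stimulus is the single-neuron signature of "choice-related" activity described above.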
Several studies have noted such choice-related activity in A1 [9,10,11,12,13,14]. For example, in one study in which ferrets participated in a pitch-discrimination task, both local-field potentials and spiking activity were modulated by the ferrets' pitch judgments. Work from the Sutter lab has also identified A1 and belt-region activity that was modulated by choice; in these experiments, rhesus monkeys detected changes in amplitude-modulated noise. In a human neuroimaging study, core auditory cortex was modulated by whether listeners could discriminate between ambiguous speech sounds.
A different set of literature, however, has failed to identify choice-related activity in A1 and several belt regions of the auditory cortex. Work from Romo's group failed to see choice activity in A1 while rhesus monkeys were discriminating between acoustic flutter stimuli but did see elements of choice behavior in the ventral premotor cortex [15,16]. Similarly, several studies from our lab have not identified choice-related activity in the auditory cortex [17,18,19]. In a recent study, we asked monkeys to detect a target stimulus in various signal-to-noise regimes and found that, on average, A1 activity was not modulated by choice. A different finding emerged at the population level: a linear decoder could read out choice, as well as various task parameters (signal-to-noise level and "signal" versus "noise" trials), from these neurons. This implies that information about choice and task is available in the population and could potentially be encoded more robustly in those single neurons that receive input from A1. In another set of studies, in which monkeys were engaged in the low-high task, we found that average AL and ML activity was not modulated by choice. In a final study, monkeys participated in a two-interval discrimination task in which they reported whether two phonemes were the "same" or "different". Once again, average AL activity was not modulated by choice.
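The population-level read-out described above can be illustrated with a minimal linear decoder. This sketch is not the authors' analysis; the "population" is simulated so that each neuron carries a weak, noisy choice signal that is negligible in any single cell yet decodable from the ensemble, and all sizes and weights are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_neurons = 400, 60
choice = rng.integers(0, 2, n_trials)        # 0 = "noise" report, 1 = "signal" report
# Each simulated neuron gets a small random choice weight, swamped by
# trial-to-trial noise, so no single neuron is reliably choice-modulated.
weights = rng.normal(0, 0.3, n_neurons)
rates = rng.normal(10, 1.0, (n_trials, n_neurons)) + np.outer(choice - 0.5, weights)

# Fit a linear decoder (least squares) on half the trials, test on the rest.
train, test = np.arange(0, 200), np.arange(200, 400)
X = np.column_stack([rates, np.ones(n_trials)])   # append an intercept column
beta, *_ = np.linalg.lstsq(X[train], choice[train] - 0.5, rcond=None)
pred = (X[test] @ beta > 0).astype(int)
accuracy = (pred == choice[test]).mean()
print(f"population decode accuracy ~ {accuracy:.2f} (chance = 0.5)")
```

Above-chance decoding from such a population, despite flat single-neuron averages, is exactly the dissociation the recording study reported.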
We cannot totally reconcile these different sets of findings. However, we wonder whether these differences may relate to the specific nature of the different auditory decisions. For example, in those studies in which choice-related activity was observed [11,12,13], the animal listeners were asked to make an auditory decision about a relatively low-level stimulus feature (pitch or amplitude modulation). Because these features may be represented directly in an A1 neuron's firing rate, A1 activity may be able to encode the sensory evidence for these decisions. In contrast, in the sets of studies that did not identify choice-related activity [15,16,17,18,19], rhesus monkeys were required to make a decision about a stimulus attribute that may not be encoded directly in a neuron's firing rate. Indeed, in our signal-to-noise detection task [19], we found that the firing rates of most A1 neurons were not modulated by different signal-to-noise levels. For such decisions, it is feasible that only later regions of the ventral auditory pathway participate in the decision process.
This raises an intriguing idea. Namely, the neurocomputations underlying auditory perception may be multiplexed along the ventral auditory pathway. In other words, depending on the demands of the task and the nature of the auditory decision, a particular brain region may differentially contribute to the different computations underlying auditory perception and decision-making. To test this idea, we would minimally have to record simultaneously in different brain regions and examine how single neurons and neuronal populations are modulated by the same task with different types of stimuli, and also by different tasks with the same stimuli.

The contribution of time to understanding decision-making
We have previously reported the presence of choice-related activity in the vlPFC [20]. In that study, rhesus monkeys discriminated between two phonemes and reported whether they were the same or different. We found that, for the same nominal stimulus, vlPFC activity was modulated by choice. At first glance, this combination of studies from AL, ML, and vlPFC suggests a hierarchical processing stream in which sensory evidence early in the pathway (AL and ML) gets transformed into a categorical choice later in the pathway (vlPFC). However, although this posited hierarchy seems quite enticing, it is missing a critical piece of information: when do these computations occur? This timing information is critical because it can help us differentiate between neuronal responses related to feedforward and feedback connectivity.
Indeed, the ventral auditory pathway is as rich in feedback connections to cortical and subcortical regions as it is in feedforward connections [7,21]. Because of this rich interconnectivity, the presence of, for example, A1 choice-related activity does not necessarily imply that this information arises in A1, nor does it necessarily follow that it is part of a feedforward process in which this choice-related activity contributes causally to the eventual auditory decision. Similarly, if vlPFC choice-related activity occurs after the actual choice, our interpretation of its functional utility would be substantially different than if it occurred at the time of the choice.
One means to resolve this issue is to identify approaches by which a subject's behavioral responses can be related to underlying neurocomputational processes and, ultimately, to neuronal activity. A successful approach has been to combine behavioral tasks with variants of sequential-sampling models, like the drift-diffusion model [2]. These models quantify the process of converting incoming sensory evidence, which is represented in the brain as the noisy spiking activity of populations of relevant sensory neurons, into a decision variable that guides behavior [22]. A key benefit of these models is that they can make quantitative predictions about both choice and response time as a function of the manipulated stimulus parameter (e.g., signal-to-noise ratio). Thus, we can use these models to jointly fit (1) the psychometric function, which describes accuracy versus the manipulated stimulus parameter, and (2) the chronometric function, which describes response time versus the manipulated stimulus parameter. The fits of this model yield insights into the decision process that gives rise to the measured accuracy, response times, and trade-offs between the two [2]. Further, these model fits also allow us to approximate the time of occurrence of the decision process: the drift-diffusion model partitions a subject's response times into "decision" and "non-decision" times. Decision time reflects the time (relative to stimulus onset) needed to make a perceptual decision, whereas non-decision time reflects sensory delays, motor preparation, and other post-decision processing.
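The drift-diffusion model's joint prediction of choice and response time can be shown with a small simulation. This is a minimal sketch, not any of the fitted models discussed here; the bound, noise level, non-decision time, and drift rates are all arbitrary values chosen only to display the qualitative pattern: accuracy rises and decision time falls as stimulus strength grows, while the non-decision time adds a fixed floor to every response time.

```python
import numpy as np

def ddm_trial(drift, bound=1.0, dt=0.002, noise=1.0, non_decision=0.3, rng=None):
    """Simulate one drift-diffusion trial. Returns (choice, response_time):
    choice is 1 if the upper bound is hit, 0 for the lower bound;
    response_time = decision time + non-decision time (sensory/motor delays)."""
    if rng is None:
        rng = np.random.default_rng()
    x, t = 0.0, 0.0
    while abs(x) < bound:
        # Accumulate evidence: deterministic drift plus diffusion noise.
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return (1 if x >= bound else 0), t + non_decision

rng = np.random.default_rng(2)
results = {}
for drift in [0.5, 1.0, 2.0, 4.0]:           # stimulus strength (arbitrary units)
    trials = [ddm_trial(drift, rng=rng) for _ in range(200)]
    acc = float(np.mean([c for c, _ in trials]))   # one psychometric point
    rt = float(np.mean([t for _, t in trials]))    # one chronometric point
    results[drift] = (acc, rt)
    print(f"drift {drift:>3}: accuracy {acc:.2f}, mean RT {rt:.2f} s")
```

Sweeping the drift rate traces out the psychometric and chronometric functions together, which is why fitting both at once constrains the decision and non-decision components of the measured response times.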
Using this approach, we have made several important insights into the temporal dynamics of the auditory perceptual decision that is needed during the low-high task [17,23]. These temporal dynamics reflect this particular auditory decision, and whether or not they generalize to other auditory decisions is an open question. Regardless, relative to the time of stimulus onset, we found comparable stimulus sensitivity (i.e., the degree to which neuronal activity was modulated by the percentage of low- and high-frequency tone bursts in a sequence) and choice-related activity in ML and AL. However, on a neuron-by-neuron basis, when we compared AL's responses relative to the decision time, we found a correlation between stimulus sensitivity and choice-related activity. Importantly, this correlation became significant just before the decision time. We failed to find such a correlation in the ML data. This suggests that AL, but not ML, may transmit the information used in this decision. Consistent with these observations, microstimulation of AL sites biased the rhesus monkeys' behavior, whereas microstimulation of ML sites did not. vlPFC neurons appear, on average, to signal choice-related activity only after the inferred decision commitment; vlPFC may then play a role in monitoring or evaluating decision outcomes.
Where, then, is the brain region that actually encodes the decision? We have yet to identify this area. But we suspect that the parabelt may be a prime target because it receives input from AL and projects to vlPFC. Even without the identification of the putative site(s) that encode the actual decision, this temporal analysis strongly suggests a feedforward pathway between ML and the vlPFC, along with their respective functional roles.
We believe that being able to identify the temporal dynamics of a decision-making process, and then relating these dynamics to neuronal activity, is a powerful way to identify a flow of information. We encourage others to use a similar approach in order to distinguish between feedforward and feedback contributions to perception and behavior.

Conclusions and future questions
Of course, several fundamental questions remain. First, as noted above, understanding how feedforward versus feedback information contributes to neural correlates of perceptual judgments remains an open question. Relatedly, the degree to which subcortical processes interact with decision-making and these feedforward and feedback loops is an open question. Second, the degree to which different types of auditory judgments differentially engage different regions of the ventral pathway has yet to be fully articulated. A third question is to identify how the different computational processes (e.g., perceptual grouping, attention, and decision-making) that underlie auditory perception interact with one another at both the cortical and subcortical levels. For example, it remains an open issue whether and how attention differentially modulates neural correlates of auditory perception at different hierarchical levels of the ventral auditory pathway (e.g., A1 versus AL) [24]. Another example is the potential interaction between auditory perceptual grouping and decision-making. Finally, it is important to identify the manner by which the dorsal and ventral auditory pathways interact in order to form a consistent and coherent representation of the auditory scene, and the degree to which the dorsal pathway (and potentially other pathways) mediates perception.