Concurrent Multisensory Integration and Segregation with Complementary Congruent and Opposite Neurons

Our brain perceives the world by exploiting multiple sensory modalities to extract information about various aspects of external stimuli. If these sensory cues are from the same stimulus of interest, they should be integrated to improve perception; otherwise, they should be segregated to distinguish different stimuli. In reality, however, the brain faces the challenge of recognizing stimuli without knowing in advance whether sensory cues come from the same or different stimuli. To address this challenge and to recognize stimuli rapidly, we argue that the brain should carry out multisensory integration and segregation concurrently with complementary neuron groups. Studying an example of inferring heading-direction via visual and vestibular cues, we develop a neural model of concurrent multisensory processing that consists of two reciprocally connected modules, the dorsal medial superior temporal area (MSTd) and the ventral intraparietal area (VIP). Each module contains two distinct groups of neurons, congruent and opposite neurons. Specifically, congruent neurons implement cue integration, while opposite neurons compute the cue disparity, both optimally as described by Bayesian inference. The two groups of neurons provide complementary information which enables the neural system to assess the validity of cue integration and, if necessary, to recover the lost information associated with individual cues without gathering new inputs. Through this process, the brain achieves rapid stimulus perception if the cues come from the same stimulus of interest, and differentiates and recognizes stimuli based on individual cues with little time delay if the cues come from different stimuli of interest. Our study unveils the indispensable role of opposite neurons in multisensory processing and sheds light on how the brain achieves multisensory processing efficiently and rapidly.
Significance Statement

Our brain perceives the world by exploiting multiple sensory cues. These cues need to be integrated to improve perception if they come from the same stimulus, and segregated otherwise. Because it is not known in advance whether sensory cues come from the same or different stimuli, we propose that the brain should carry out multisensory integration and segregation concurrently with two different neuron groups. Specifically, congruent neurons implement cue integration, while opposite neurons compute the cue disparity, and the interplay between them achieves rapid stimulus recognition without information loss. We apply our model to the example of inferring heading-direction based on visual and vestibular cues and reproduce the experimental data successfully.

Figure. Multisensory integration and segregation. (A) Multisensory integration versus segregation. Two underlying stimulus features $s_1$ and $s_2$ independently generate two noisy cues $x_1$ and $x_2$, respectively. If the two cues are from the same stimulus, they should be integrated, and in the Bayesian framework, the stimulus estimation is obtained by computing the posterior $p(s_1|x_1, x_2)$ (or $p(s_2|x_1, x_2)$) utilizing the prior knowledge $p(s_1, s_2)$ (left). If two cues are from different stimuli, they should be segregated, and the stimulus estimation is obtained by computing the posterior $p(s_1|x_1)$ (or $p(s_2|x_2)$) using the single cues (right). (B) Information of single cues is lost after integration. The same integrated result $\hat{s} = 0^\circ$ is obtained after integrating two cues of opposite values ($\theta$ and $-\theta$) with equal reliability. Therefore, from the integrated result, the values of single cues are unknown.
The information of individual cues can be recovered by using the preserved disparity information if

In the present study, we explore whether opposite neurons are responsible for cue segregation in multisensory information processing. Experimental findings showed that many brain areas, rather than a single one, exhibit multisensory processing behaviors and that these areas are intensively and reciprocally connected with each other 8,9,14-16. The architecture of these multisensory areas

This prior reflects that the two stimulus features from the same stimulus tend to have similar values.

The parameter $\kappa_s$ specifies the concurrence probability of two stimulus features, and determines the extent to which the two cues should be integrated. In the limit $\kappa_s \to \infty$, it will lead to full

It has been revealed that the brain integrates visual and vestibular cues to infer heading-direction in a manner close to Bayesian inference 8,9. Following Bayes' theorem, optimal multisensory integration is achieved by computing the posterior of two stimuli according to

$$p(s_1, s_2 | x_1, x_2) \propto p(x_1|s_1)\, p(x_2|s_2)\, p(s_1, s_2).$$
Since the calculations of the two stimuli are exchangeable, hereafter we only present the results for $s_1$. The posterior of $s_1$ is calculated through marginalizing the joint posterior in the above equation,

$$p(s_1|x_1, x_2) \propto p(x_1|s_1) \int_{-\pi}^{\pi} p(x_2|s_2)\, p(s_1, s_2)\, ds_2 \propto p(s_1|x_1)\, p(s_1|x_2),$$

where we have used the conditions that the marginal prior distributions of $s_m$ and $x_m$ are uniform, i.e., $p(s_m) = p(x_m) = (2\pi)^{-1}$. Note that $p(s_1|x_2) \propto \int p(x_2|s_2)\, p(s_1, s_2)\, ds_2$ is approximated to be $\mathcal{M}(s_1; x_2, \kappa_{2s})$ through equating the mean resultant length of distribution (Eq. 12) 23.

The above equation indicates that in multisensory integration, the posterior of a stimulus given

Cue integration (blue) is the sum of the two vectors (green), and the cue disparity information (red) is the difference of the two vectors. (C-E) The mean and concentration of the integration (blue) and the cue disparity information (red) as a function of the cue reliability (C), cue disparity (D), and reliability of prior (E). In all plots, $\kappa_s = 50$, $\kappa_1 = \kappa_2 = 50$, $x_1 = 0^\circ$ and $x_2 = 20^\circ$, except for the quantity being varied: $\kappa_1 = \kappa_2$ in C, $x_2$ in D, and $\kappa_s$ in E.
Finally, since the product of two von Mises distributions is again a von Mises distribution, the posterior distribution is $p(s_1|x_1, x_2) = \mathcal{M}(s_1; \hat{s}_1, \hat{\kappa}_1)$, whose mean and concentration can be obtained from its moments given by

$$\hat{\kappa}_1 e^{j\hat{s}_1} = \kappa_1 e^{jx_1} + \kappa_{2s} e^{jx_2}, \qquad (4)$$

where $j$ is the imaginary unit. Eq. 4 is the result of Bayesian optimal integration in the form of von Mises distributions, and it provides the criterion to judge whether optimal cue integration is

Probabilistic model of multisensory segregation

The above probabilistic model for multisensory integration assumes that sensory cues originate from the same stimulus. If they come from different stimuli, the cues need to be segregated, and the neural system needs to infer stimuli based on individual cues. In practice, the brain needs to differentiate these two situations. In order to achieve reliable and rapid multisensory processing, we propose that while integrating sensory cues, the neural system simultaneously extracts the disparity information between cues, so that with this complementary information, the neural system can assess the validity of cue integration.

An accompanying consequence of multisensory integration is that the stimulus information associated with individual cues is lost once they are integrated (see Supplementary Fig. S1). Hence, besides assessing the validity of integration, extracting both congruent and disparity information by simultaneous integration and segregation enables the system to recover the lost information of individual cues if needed.

The disparity information of stimulus 1 obtained from the two cues is defined to be

$$p_d(s_1|x_1, x_2) \propto \frac{p(s_1|x_1)}{p(s_1|x_2)}, \qquad (5)$$

which is the ratio between the posteriors given each of the two cues and hence measures the discrepancy between the estimates from different cues. By taking the expectation of $\log p_d$ over the distribution $p(s_1|x_1)$, it gives rise to the Kullback-Leibler divergence between the two posteriors given each cue. This disparity measure was also used to discriminate alternative moving directions in ref. 24.

Utilizing the property of the von Mises distribution and the periodicity of heading directions ($-\cos(s_1 - x_2) = \cos(s_1 - x_2 - \pi)$), Eq. 5 can be re-written as

$$p_d(s_1|x_1, x_2) \propto p(s_1|x_1)\, p(s_1|x_2 + \pi). \qquad (6)$$

Thus, the disparity information between two cues can also be expressed as the product of the

The above equation is the criterion to judge whether the disparity information between two cues is encoded in the neural system.

the prior concentration $\kappa_s$ varies can be explained analogously (Fig. 3E).
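The vector picture of Fig. 3 (integration as the sum of two vectors, disparity as their difference) can be illustrated with a minimal Python sketch. It assumes that each single-cue posterior is a von Mises distribution represented by the complex number $\kappa e^{jx}$; the function name `integrate_and_segregate` and the example values are illustrative choices, not taken from the paper.

```python
import cmath
import math


def integrate_and_segregate(x1, kappa1, x2, kappa2s):
    """Return ((mean, conc) of integration, (mean, conc) of disparity).

    Assumption: the posterior given cue 1 is M(s1; x1, kappa1) and the
    posterior given cue 2 is approximated by M(s1; x2, kappa2s).
    Angles are in radians.
    """
    v1 = cmath.rect(kappa1, x1)    # direct cue as kappa1 * exp(j * x1)
    v2 = cmath.rect(kappa2s, x2)   # indirect cue as kappa2s * exp(j * x2)
    v_int = v1 + v2                # product of von Mises -> vector sum
    v_dis = v1 - v2                # ratio of von Mises -> vector difference
    return ((cmath.phase(v_int), abs(v_int)),
            (cmath.phase(v_dis), abs(v_dis)))


# Hypothetical example: two equally reliable cues 20 degrees apart.
(int_mean, int_conc), (dis_mean, dis_conc) = integrate_and_segregate(
    0.0, 50.0, math.radians(20.0), 50.0)
print(math.degrees(int_mean), int_conc)
print(math.degrees(dis_mean), dis_conc)
```

In this sketch the two concentrations depend on the cue disparity: as the cues move apart, the sum vector shrinks and the difference vector grows, which is the property that distinguishes the von Mises case from the Gaussian case.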

A notable difference between von Mises distribution and Gaussian distribution is that the concentration of integration and disparity information changes with cue disparity in von Mises distribution

between cues according to $\ln p_d(s_1|x_1, x_2) = \ln p(s_1|x_1) + \ln p(s_1|x_2 + \pi)$ (see Eq. 6). Analogous to multisensory integration, the optimal segregation can be achieved by

where the preferred stimulus of neurons satisfies $\bar{\theta}_j = \theta_j + \pi$ (see details in SI). That is, the

Other parameters are the same as those in Fig. 4.
inputs to update their activities (Eq. 14), and the multisensory integration and segregation will be

We further checked the responses of neurons to combined cues, and found that when there is no disparity between the two cues, the response of a congruent neuron is enhanced compared to the single cue conditions (green line in Fig. 5A), whereas the response of an opposite neuron is suppressed compared to its response to the direct cue (green line in Fig. 5B). These properties agree with the experimental data 8,9 and are also consistent with the interpretation that the integrated and segregated amplitudes are respectively proportional to the vector sum and difference in Fig. 3.

Following the experimental protocol 13, we also plotted the bimodal tuning curves of the example

In response to the noisy inputs in a cueing condition, the population activity of the same group of neurons in a module exhibits a bump shape (Fig. 6A), and the position of the bump is interpreted as the network's estimate of the stimulus (Fig. 6B) 27,29,30. In a single instance, we used the population

Fig. 6E with 6C-D). Certainly, the distribution of cue disparity information decoded from opposite neurons in the combined cue condition is wider than that under the direct cue condition (Fig. S2 purple). Note that when the cue disparity is larger than $90^\circ$, the relative response of congruent and opposite neurons will be reversed (results are not shown here).

To demonstrate that the network implements optimal cue integration and segregation and how

Fig. 3E), which is consistent with a previous study 11. We further systematically changed the network and input parameters over a large parameter region and compared the network results with the Bayesian prediction. Our results indicated that the network model achieves optimal integration and segregation robustly over a large range of parameters (Fig. S3), as long as the connection strengths are not so large that winner-take-all happens in the network model.

The above results elucidate that congruent neurons integrate cues, whereas opposite neurons compute the disparity between cues. Based on this complementary information, the brain can assess the validity of cue integration and can also recover the stimulus information associated with single cues that is lost due to integration. Below, rather than exploring the detailed neural circuit models, we demonstrate that the brain has resources to implement these two operations based on the activities of congruent and opposite neurons.
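To make the recovery step concrete, the following Python sketch inverts the sum-and-difference geometry of Fig. 3. It assumes that the integrated and disparity estimates are each summarized by a mean and a concentration and behave as the vector sum and difference of the two single-cue vectors. The function names and numbers are hypothetical, and the sketch is not the neural implementation discussed in the paper.

```python
import cmath
import math


def to_sum_and_difference(x1, kappa1, x2, kappa2):
    """Forward step: single-cue vectors -> integrated and disparity estimates."""
    v1, v2 = cmath.rect(kappa1, x1), cmath.rect(kappa2, x2)
    v_int, v_dis = v1 + v2, v1 - v2
    return ((cmath.phase(v_int), abs(v_int)),
            (cmath.phase(v_dis), abs(v_dis)))


def recover_single_cues(s_int, k_int, s_dis, k_dis):
    """Inverse step: solve the two linear equations for the single-cue vectors."""
    v_int = cmath.rect(k_int, s_int)
    v_dis = cmath.rect(k_dis, s_dis)
    v1 = (v_int + v_dis) / 2       # half of (sum + difference) gives cue 1
    v2 = (v_int - v_dis) / 2       # half of (sum - difference) gives cue 2
    return ((cmath.phase(v1), abs(v1)), (cmath.phase(v2), abs(v2)))


# Two opposite cues (+30 and -30 degrees) integrate to 0 degrees, so the
# integrated estimate alone cannot tell them apart; the disparity can.
theta = math.radians(30.0)
(s_int, k_int), (s_dis, k_dis) = to_sum_and_difference(theta, 40.0, -theta, 40.0)
(x1, k1), (x2, k2) = recover_single_cues(s_int, k_int, s_dis, k_dis)
print(math.degrees(s_int), math.degrees(x1), math.degrees(x2))
```

The example mirrors the case of two cues with opposite values and equal reliability: the integrated mean is the same for every such pair, and only the disparity vector distinguishes them.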

Animals face challenges of processing information fast in order to survive in natural environments, and over millions of years of evolution, the brain has developed efficient strategies to handle these challenges. In multisensory processing, such a challenge is to integrate/segregate multiple sensory cues rapidly without knowing in advance whether these cues are from the same or different stimuli. To resolve this challenge, we argue that the brain should carry out multisensory processing

neuron can be used to discriminate the cue disparity, we apply receiver-operating-characteristics (ROC) analysis to construct the neurometric function (Fig. 8B), which measures the fraction of correct discrimination (see Methods). Indeed, the opposite neurons can discriminate the cue disparity much more finely than congruent neurons (Fig. 8C). In addition, our model also reproduces the same discrimination task studied in refs. 8,9, i.e., to discriminate whether the heading-direction is on the left or right hand side of a reference direction under different cueing conditions (Fig. S4).

The present study only investigated integration and segregation of two sensory cues, but our model can be generalized to the cases of processing more than two cues that may happen in reality 34.

this study has shown that it can be done in a biologically plausible neural network, since the operation is expressed as solving the linear equation given by Eq. 8. A concern is, however, whether recovering is really needed in practice, since at each module, the neural system may employ an additional group of neurons to retain the stimulus information estimated from the direct cue. An

The prior $p(s_1, s_2)$ specifies the probability of occurrence of $s_1$ and $s_2$, and is set as a von Mises distribution of the discrepancy between two stimuli 11,20,21, given by Eq. 2. Note that the marginal prior of either stimulus, e.g., $p(s_1) = \int_{-\pi}^{\pi} p(s_1, s_2)\, ds_2 = 1/(2\pi)$, is a uniform distribution.
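The uniform marginal can be checked numerically. The sketch below assumes that Eq. 2 has the form $p(s_1, s_2) = \exp[\kappa_s \cos(s_1 - s_2)] / [(2\pi)^2 I_0(\kappa_s)]$, a normalization inferred here from the stated uniform marginal rather than copied from the paper. The helper names are hypothetical, and only the Python standard library is used.

```python
import math


def bessel_i0(kappa, n=2000):
    """Modified Bessel function I0 via I0(k) = (1/pi) * int_0^pi exp(k cos t) dt."""
    h = math.pi / n
    total = sum(math.exp(kappa * math.cos((i + 0.5) * h)) for i in range(n))
    return total * h / math.pi


def make_prior(kappa_s):
    """Return p(s1, s2) as a von Mises function of the discrepancy s1 - s2."""
    norm = (2.0 * math.pi) ** 2 * bessel_i0(kappa_s)  # assumed normalization

    def prior(s1, s2):
        return math.exp(kappa_s * math.cos(s1 - s2)) / norm

    return prior


def marginal_s1(prior, s1, n=2000):
    """Integrate the joint prior over s2 in [-pi, pi] with the midpoint rule."""
    h = 2.0 * math.pi / n
    return sum(prior(s1, -math.pi + (i + 0.5) * h) for i in range(n)) * h


prior = make_prior(50.0)
for s1 in (-2.0, 0.0, 1.3):
    print(s1, marginal_s1(prior, s1), 1.0 / (2.0 * math.pi))
```

Because the prior depends only on $s_1 - s_2$ and the variables are periodic, the marginal takes the same value for every $s_1$, which is what the printed comparison illustrates.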

Inference

The inference of underlying stimuli can be conducted by using Bayes' theorem to derive the posterior

$$p(s_1, s_2|x_1, x_2) \propto p(x_1|s_1)\, p(x_2|s_2)\, p(s_1, s_2). \qquad (10)$$

The posterior of either stimulus, e.g., stimulus $s_1$, can be obtained by marginalizing the joint posterior (Eq. 10) as follows (the posterior of $s_2$ can be similarly obtained by interchanging indices 1 and 2)

$$p(s_1|x_1, x_2) \propto p(x_1|s_1) \int_{-\pi}^{\pi} p(x_2|s_2)\, p(s_1, s_2)\, ds_2 \propto p(s_1|x_1)\, p(s_1|x_2), \qquad (11)$$

where we used the fact that both marginal distributions $p(s_m)$ and $p(x_m)$ are uniform and then

Finally, substituting the detailed expression into Eq. 11,

The expressions of the mean $\hat{s}_1$ and concentration $\hat{\kappa}_1$ can be found in Eq. 4. The expressions of $\Delta\hat{s}_1$ and $\Delta\hat{\kappa}_1$ in the disparity information can be similarly calculated and are shown in Eq. 7.
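The concentration $\kappa_{2s}$ of the indirect-cue posterior can be sketched numerically. The sketch assumes that "equating the mean resultant length" means $A(\kappa_{2s}) = A(\kappa_2) A(\kappa_s)$ with $A(\kappa) = I_1(\kappa)/I_0(\kappa)$, which is the standard property of a convolution of two von Mises distributions; the exact form of Eq. 12 is not reproduced in this text. All function names are hypothetical.

```python
import math


def mean_resultant_length(kappa, n=2000):
    """A(kappa) = I1(kappa) / I0(kappa), computed by midpoint integration."""
    h = math.pi / n
    num = den = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        w = math.exp(kappa * (math.cos(t) - 1.0))  # rescaled to avoid overflow
        num += w * math.cos(t)
        den += w
    return num / den


def kappa_from_length(r, lo=0.0, hi=1000.0, iters=60):
    """Invert A(kappa) = r by bisection; A increases monotonically with kappa.

    The search is limited to [lo, hi], so r must satisfy r < A(hi).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_resultant_length(mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def indirect_concentration(kappa2, kappa_s):
    """Concentration of the von Mises approximation to p(s1 | x2)."""
    r = mean_resultant_length(kappa2) * mean_resultant_length(kappa_s)
    return kappa_from_length(r)


print(indirect_concentration(50.0, 50.0))   # smaller than either input
print(indirect_concentration(50.0, 500.0))  # approaches kappa2 for a tight prior
```

Under this assumption the indirect cue is always less reliable than the direct likelihood would suggest, and it approaches $\kappa_2$ only when the prior concentration $\kappa_s$ becomes large.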

Dynamics of decentralized network model

We adopted a decentralized network model in this study 11. The network model contains two

where $I_m^n(\theta, t)$ is the feedforward input from unisensory brain areas conveying cue information.

$W_{rc}(\theta, \theta')$ is the recurrent connection from neuron $\theta'$ to neuron $\theta$ within the same group of neurons and in the same network module, which is set to be

where $a$ is the connection width and effectively controls the width of neuronal tuning curves.
where $\omega$ controls the magnitude of divisive normalization, and $[x]_+ = \max(x, 0)$ is the rectification function that sets negative values to zero. $D_m^n(t)$ denotes the response of the inhibitory neuron pool associated with neurons of type $n$ in network module $m$ at time $t$, which sums up the synaptic inputs of the same type of excitatory neurons $u_m^n(\theta, t)$ and also receives the inputs from the other type of neurons $u_m^{\bar{n}}(\theta, t)$,

$J_{int}$ is a positive coefficient not larger than 1, which effectively controls the sharing between the inhibitory neuron pools associated with the congruent and opposite neurons in the same network module. The partial sharing of the two inhibitory neuron pools inside the same network module introduces competition between the two types of neurons, improving the robustness of the network.
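The role of $\omega$ and $J_{int}$ can be illustrated with a small Python sketch. The functional form used here, a squared rectified input divided by $1 + \omega D$, with $D$ pooling the same squared rectified inputs, is an assumption: the corresponding equations are not reproduced in this text, and this form is simply a common choice in continuous attractor models. The function name and all numbers are hypothetical.

```python
def divisive_normalization(u_same, u_other, omega, j_int):
    """Firing rates of one neuron group under partially shared inhibition.

    u_same  : synaptic inputs of the group itself (list of floats)
    u_other : synaptic inputs of the other group in the same module
    omega   : strength of divisive normalization
    j_int   : share of the other group's input in the pool (0 <= j_int <= 1)

    Assumed form (not taken from the paper):
        D = sum([u_same]_+^2) + j_int * sum([u_other]_+^2)
        r = [u_same]_+^2 / (1 + omega * D)
    """
    drive_same = [max(u, 0.0) ** 2 for u in u_same]    # [x]_+ = max(x, 0)
    drive_other = [max(u, 0.0) ** 2 for u in u_other]
    pool = sum(drive_same) + j_int * sum(drive_other)  # inhibitory pool D
    return [d / (1.0 + omega * pool) for d in drive_same]


congruent_input = [1.0, 2.0, -1.0]
opposite_input = [3.0, 3.0, 3.0]
print(divisive_normalization(congruent_input, opposite_input, 0.1, 0.0))
print(divisive_normalization(congruent_input, opposite_input, 0.1, 0.5))
```

With `j_int = 0` the two groups are normalized independently, while a positive `j_int` lets strong activity in one group suppress the other, which is the competition described above.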

The feedforward inputs convey the direct cue information from the unisensory brain area to a network module, e.g., the feedforward inputs received by MSTd neurons are from MT which extracts

Demo tasks of network model

Testing network's performance of integration and segregation

We first applied each single cue to the network model individually. Under each cueing condition, we recorded the population activities in equilibrium state across time during cue presentation.

In the equilibrium state, the statistics of neuronal activities across time are equivalent to those across trials.
where $z_1^c(t)$ and $z_1^o(t)$ are the positions of the population activities of the congruent and opposite neurons in network module 1, respectively, which were decoded by using the population vector (Eq. 21).

In real-time reconstruction, the sum of firing rates represents the concentration of the distribution. This is supported by the finding that the reliability of the distribution is encoded by the summed firing rate in probabilistic population code 11,12.
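The decoding step can be sketched as follows. The sketch assumes that Eq. 21 has the standard population-vector form, i.e., the angle of the rate-weighted sum of unit vectors at the neurons' preferred directions, and it returns the summed firing rate as a stand-in for the concentration. The bump shape and neuron count in the example are hypothetical.

```python
import cmath
import math


def population_vector(rates, preferred):
    """Decode bump position and summed rate from a population response.

    rates     : firing rates of the neurons in one group
    preferred : preferred directions of the same neurons, in radians
    Returns (position in radians, summed firing rate).
    """
    vec = sum(r * cmath.exp(1j * th) for r, th in zip(rates, preferred))
    return cmath.phase(vec), sum(rates)


# Hypothetical noise-free bump centred at 0.5 rad on a ring of 180 neurons.
n = 180
preferred = [-math.pi + 2.0 * math.pi * i / n for i in range(n)]
rates = [math.exp(3.0 * (math.cos(th - 0.5) - 1.0)) for th in preferred]
position, total_rate = population_vector(rates, preferred)
print(position, total_rate)
```

Applying the same function to the congruent and opposite groups would give the positions $z_1^c(t)$ and $z_1^o(t)$, with the summed rates serving as the corresponding reliabilities under the stated assumption.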

Discriminating cue disparity on single neurons

A discrimination task was designed on the responses of single neurons to demonstrate that opposite neurons encode cue disparity information. The task is to discriminate whether the cue disparity, $x_1 - x_2$, is smaller or larger than $0^\circ$. In the discrimination task, the mean direction of the two cues is fixed at $0^\circ$ (i.e., $x_1 + x_2 = 0$), in order to rule out the influence of changes in the integrated direction on neuronal activity. Meanwhile, the disparity between the two cues, $x_1 - x_2$, is changed from $-32^\circ$ to $32^\circ$ with a step of $4^\circ$. For each combination of cue directions, we applied three cueing conditions (cue 1, cue 2, combined cues) to the network model for 30 trials and the firing rate distributions of the single neurons were obtained (Fig. 8A and B).
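The stimulus set and the ROC step can be sketched in Python. The sketch builds the cue pairs described above and computes the area under the ROC curve between two firing-rate samples, using the standard pairwise-comparison estimate. Which pairs of conditions are compared to build the neurometric function is not specified in this text, so the example rates are hypothetical.

```python
def cue_pairs(max_disparity=32, step=4):
    """Cue directions (x1, x2) in degrees with x1 + x2 = 0 and x1 - x2 = d."""
    return [(d / 2.0, -d / 2.0)
            for d in range(-max_disparity, max_disparity + 1, step)]


def roc_area(rates_a, rates_b):
    """Probability that a rate drawn from b exceeds one drawn from a.

    Ties count as one half. A value of 0.5 means the two conditions cannot
    be told apart; 1.0 (or 0.0) means perfect discrimination.
    """
    score = sum((b > a) + 0.5 * (b == a) for a in rates_a for b in rates_b)
    return score / (len(rates_a) * len(rates_b))


pairs = cue_pairs()
print(len(pairs), pairs[0], pairs[-1])

# Hypothetical single-neuron firing rates under two opposite disparities.
rates_negative = [11.0, 12.5, 10.0, 13.0]
rates_positive = [14.0, 15.5, 12.5, 16.0]
print(roc_area(rates_negative, rates_positive))
```

Repeating the ROC computation for each disparity magnitude would trace out a neurometric curve of the kind shown in Fig. 8B, under the assumption that responses at opposite disparities are compared.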