1 Introduction

When a pair of stereo-incompatible images is suddenly presented to our eyes, the initial superposition percept evolves rapidly (≈ 0.1–0.2 s) into seeing just one eye’s image, even when the stimuli are equally strong. A similar ‘percept-choice’ process also occurs at the onset of ambiguous monocular stimuli such as a Necker cube, or a transparent object that rotates in depth. In all such cases, neural competition within our visual system rapidly and spontaneously breaks the perceptual ambiguity which arises at the onset of stimuli that provide strong support for two (or more) incompatible percepts. Recent theory and experiments (Noest et al. 2007; Klink et al. 2008; Wilson 2007; Pearson and Brascamp 2008) have revealed how the (onset-driven) neural dynamics of percept-choice differs crucially from the classic ‘rivalry’ process (a slow, irregular cycle of percept-switches) that arises under sustained viewing of such stimuli (Alais and Blake 2004; Blake and Logothetis 2002). Summarized very briefly (see Noest et al. 2007 for details): Either of these processes can occur generically (under transient or sustained stimulation respectively) in a wide variety of ambiguity-encoding neural models with sufficiently strong, recurrent competition between populations and slow local adaptation (Matsuoka 1984; Lehky 1988; Laing and Chow 2002; Noest et al. 2007; Wilson 2007; Shpiro et al. 2009): The strong competition creates two attractor-states in which one population suppresses the other, but each attractor exists only when its dominant population is not too deeply adapted. At onset, percept-choice trajectories start from low activity in both populations, run closely along the separatrix between the two attractors, linger briefly near a saddle-point, and then quickly settle into one of the attractors, depending on the combined bias in activation and adaptation-states of both populations in the brief time between onset and leaving the saddle-point (Noest et al. 2007). 
Percept-switching is very different: Under sustained stimulation (longer than used in any experiments modelled in our present paper), the slow, noisy accumulation of adaptation in the dominant (stochastically firing) population gradually shifts the currently occupied attractor, and reduces its stability and domain of attraction as it approaches the saddle-node bifurcation that ends its existence, until any small perturbation can trigger a fast switch into the opposite attractor. Even longer stimulation then leads to a series of such percept-switches, separated by gamma-distributed intervals (see Footnote 1).

How each percept-choice depends on the history of previous percepts and stimuli has been investigated extensively by regularly removing the stimulus, usually before percept-switching starts, and presenting it again after a variable blank interval. The most basic finding is that percept-choices show strongly positive serial correlation, unless the blank interval T 0 is too short (roughly, T 0 < 0.4 s). When first discovered (Orbach et al. 1963, 1966), it was realized that this pattern presents a serious problem for models inferred from classic rivalry data, since the adaptation mechanism in these models causes the opposite of the previous percept to be chosen at each onset (as in standard ‘after-effects’). This conundrum persisted until the effect was rediscovered, and tentatively attributed to cognitive-level memory or ‘priming’ processes (Leopold et al. 2002). However, dynamical analysis (Noest et al. 2007) then identified that interaction between shunting adaptation and a small fixed neural baseline offers a simple and neurally viable mechanism that generates choice-repetition, without requiring any top-down intervention. It also predicts that choice-repetition should give way to choice-alternation at short blank times, as was confirmed by psychophysics (Noest et al. 2007; Klink et al. 2008). Since then, a series of psychophysics and modeling studies (mostly reviewed in Pearson and Brascamp 2008) have greatly extended the range of percept-choice phenomena covered by simple, neurally viable extensions of the basic model. For example, adaptation in stages preceding the stage where percept-choice (or switching) is generated explains (Noest et al. 2007) how non-ambiguous stimuli induce the classic (opposite percept) after-effect, whereas ambiguous stimuli induce choice-repetition (Pearson and Clifford 2004, 2005). Likewise, top-down ‘attention’ (i.e. gain-control) at early stages then explains biased percept-choice (Klink et al. 2008). 
Furthermore, incorporation of the fact that adaptation is a multi-timescale process allows the model to explain how percept-choice depends on a weighted sum of many previous percepts, including those generated by classic ‘rivalry’-oscillation (Brascamp et al. 2008); these phenomena also contradict an alternative model (Wilson 2007) that incorporates an explicit binary perceptual memory stage into a classic rivalry model. Adding nonlinearity to the ‘priming’ term enables modelling a variety of choice-sequences with ‘nested’ temporal structure that spans all timescales from about 0.5 to over 1,000 s (Brascamp et al. 2008). Finally, adding depth-structure and lateral interactions yields a relatively simple mechanistic explanation of hitherto perplexing data on how the spatial interaction between pairs of structure-from-motion elements depends on local disambiguation (Klink et al. 2009; Freeman and Driver 2006).

However, these models do not cover a class of experiments that directly probe which feature-level neural network connectivity and dynamics allows our visual system to rapidly resolve everyday visual ambiguities: These experiments use temporally interleaved presentations of two (or more) ambiguous stimulus sequences, where the stimuli of different sequences either have shifted visual feature values but equal locations (Maier et al. 2003), or have shifted positions but equal feature values (Chen and He 2004; Knapen et al. 2009). For two sequences, the temporal structure of this class of experiments consists of repeating a 4-step cycle:

| Step | Stimulus type        | Duration |
|------|----------------------|----------|
| 1    | Sequence-1 ambiguity | T 1      |
| 2    | Blank                | T 0      |
| 3    | Sequence-2 ambiguity | T 1      |
| 4    | Blank                | T 0      |

Importantly, the ON-duration T 1 is longer than the percept-choice timescale (≈ 0.1–0.2 s) but shorter than the typical time at which a spontaneous percept-switch occurs (usually several seconds), and the OFF-duration T 0 is much longer than the neural membrane timescale τ (which is < 0.1 s).
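The 4-step cycle can be written down as a small timing helper. The following is a minimal sketch, not taken from the experiments themselves: the function names `ics_schedule` and `active_sequence` and the argument names `t_on` (for T 1) and `t_off` (for T 0) are hypothetical and introduced only for illustration.

```python
from typing import Iterator, Optional, Tuple


def ics_schedule(t_on: float, t_off: float,
                 n_cycles: int) -> Iterator[Tuple[float, float, Optional[int]]]:
    """Yield (start, end, sequence) for every step of the interleaved cycle.

    sequence is 1 or 2 while that sequence's ambiguous stimulus is ON,
    and None during a blank. (Hypothetical helper, for illustration only.)
    """
    t = 0.0
    for _ in range(n_cycles):
        for seq in (1, 2):
            yield (t, t + t_on, seq)      # steps 1 and 3: ambiguous stimulus
            t += t_on
            yield (t, t + t_off, None)    # steps 2 and 4: blank
            t += t_off


def active_sequence(t: float, t_on: float, t_off: float) -> Optional[int]:
    """Return which sequence (1 or 2) is ON at time t, or None during a blank."""
    phase = t % (2.0 * (t_on + t_off))    # position within one 4-step cycle
    if phase < t_on:
        return 1
    if phase < t_on + t_off:
        return None
    if phase < 2.0 * t_on + t_off:
        return 2
    return None
```

Within this schedule, each individual sequence sees its own stimulus for T 1 and is then interrupted for 2 T 0 + T 1 before its next onset.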

The experiments reported so far are clearly just the beginning: Extension to many other feature-dimensions and spatial configurations promises to provide important information about how the known feature-selective receptive field and lateral connectivity of visual cortex networks combine with neuron-specific adaptation to produce a form of perceptual ‘priming’ that goes beyond the pure featural and spatial selectivity of neural tuning curves, and thus enables our visual system to rapidly resolve visual ambiguities in a way that respects the natural, continuous structure of featural as well as spatial dimensions.

Modelling should provide mechanistic insight into which of the many known neural and network properties are essential, how they shape the dynamics of percept-choice in such settings, and which predictions this implies. We approach these goals through three stages of modelling: We first show (Section 2) that the interleaved choice-sequence (ICS) phenomena reported so far can be captured by expanding our previously studied (Noest et al. 2007) basic model (2-population reduction) to a quasi-continuous ‘neural field’ version with visual-cortex type structure in either featural or spatial dimensions. This sets the stage for identifying and analyzing the crucial mechanism and its dynamics. We do this by means of two stages of model-reduction: In Section 3, we reduce the neural-field model with 2-ICS stimulation to a 6-population model. This allows us to show how the temporally non-overlapping sequences become coupled via the adaptation of ‘shared’ subsets of neurons that (i) receive feature-level input from both sequences, and (ii) are linked by cross-inhibitory coupling with the ‘main’ neural subsets, each of which is activated by just one of the sequences. Once we have extracted the essential dynamical processes (both at the fast activity timescale and the slow adaptation timescale) in this 6-population model, we are able to reduce the model even further (Section 4): First, the fast (≈ 0.1 s) dynamics of each choice-event can be reduced to evaluating an ‘instantaneous’ binary choice-indicator function, parametrized by the four main adaptation states at each stimulus onset. Second, the slow dynamics of these adaptation states can then be reduced to a pair of iterated nonlinear maps, coupled only via the choice-function. 
This final reduction enables deeper analytical and computational analysis, resulting in the prediction of phase-diagrams that delimit the existence and stability conditions for in-phase and/or anti-phase repetitive sequence-pair patterns, as well as their response to perturbations that cause occasional ‘glitches’ in individual percept-choice events. These predictions invite a variety of future experiments that should not only test our model, but also stimulate the use of analogous experiments and models to probe the role of other coupled perceptual ambiguity-resolution and rapid choice processes, such as may occur in saccadic eye-movements.

2 Models with continuous feature- and space-selectivity

The common network characteristic of the visual cortex, at each of the relatively early stages that are of interest to our present aims, is that each neuron responds selectively (with finite ‘tuning-width’) to a specific combination of a spatial location and several feature-values (e.g., orientation, color, stereo-depth, etc), and that the whole collection of neurons in each stage covers the whole visual field and some subset of the many dimensions of feature-space (e.g., stage MT encodes motion and stereo-depth, not color). Thus, the standard notion of a receptive field (RF) in visual field space ℝ2 actually extends to a multi-dimensional product-space ℝ2 × \(\mathbb{F}\), where the feature-space \(\mathbb{F}\) depends on the stage in question. Anatomically, the cortex is organized in columns, with each column containing RFs that overlap in visual space, but cover all of the relevant feature-space \(\mathbb{F}\).

Besides the mostly feedforward connections that define each neuron’s receptive field, there are cross-inhibitory connections. These mostly connect within each column, thus causing competition between co-localized measurements of incompatible feature values. These connections play a major role in much of our modelling, since the process of percept-choice is predicated on the presence of strong, mutually incompatible visual features in the same location. In contradistinction, the (excitatory) lateral connections, which implement proper ‘parallel transport’ of local features \(\phi \in \mathbb{F}\) across space, are not probed by the stimuli relevant to our present aims.

Any stimulus with well-defined visual-geometric structure thus activates a particular sub-network out of the full product-space structure ℝ2 × \(\mathbb{F}\), within several (but not generally all) of the many processing stages. Hence, the structure of the currently active network can be chosen (within limits), simply by presenting an appropriately designed visual pattern. In particular, the stimuli used in various percept-choice experiments are designed to be perceptually ambiguous but strong and featurally well-defined, so as to selectively activate particular sets of networks with fast, semi-local cross-inhibition and slow neural adaptation. This allows one to probe how these structural and dynamical properties interact within several variants on a common network-motif whose function is to resolve the many semi-local ambiguities that occur in everyday vision. Percept-choice dynamics is an extreme example of this process; it may be rare in nature but it is particularly suitable as a probe into the neural mechanisms of ambiguity-resolution because each choice between two incompatible percepts with equally strong stimulus support makes small internal signals that break this symmetry highly visible.

2.1 Featurally shifted, spatially coincident sequence pairs

The first reported ICS-psychophysics experiments (Maier et al. 2003) indicated that the effective interaction between two choice-sequences depended on similarity between the stimuli of the two sequences presented at the same location. The examples which showed this effect most clearly used stimuli with ambiguous rotation in depth—parallel-projected images of transparent but surface-textured objects rotating around an axis lying in the frontoparallel plane. By varying the angle between the rotation axes used in two temporally interleaved sequences, it was found that smaller inter-axes angles yielded stronger inter-sequence correlation between the percepts (a particular sense of rotation in depth) chosen within the interleaved sequences.

To elucidate the underlying neural dynamics of such phenomena, our first model type explicitly represents the motion-direction subspace of \(\mathbb{F}\), but lumps the spatial ‘fine-structure’ of the within-stimulus relation between local speed and position along each stimulus surface. This simplification focusses on the crucial effects, and it is reasonable given that the stimulus size used does not exceed the typical RF-size in the relevant neural stage (MT). It also fits the observation that each percept-choice in such settings affects whole surfaces (Klink et al. 2009) rather than their fine-structure.

Thus, we collapse the ℝ2-structure to a point, and the remaining feature space \(\mathbb{F}\) to a circle \(\mathbb{S}^1\) parametrized by the preferred motion directions ϕ ∈ (−π, +π) of neurons driven by oppositely moving pairs of surfaces (see Footnote 2), and write down the neural field dynamical equations that generalize our original 2-population model (Noest et al. 2007) to this continuous \(\mathbb{S}^1\) setting; deriving such coupled order-parameter field dynamics from noisy neuron-level dynamics and sparse restricted-range connectivity can be done by applying standard techniques first developed in Noest (1989). This leads us to

$$\begin{array}{rll} \tau \partial_t H(\phi,t) &=& X(\phi,t) -\{1+A(\phi,t)\} H(\phi,t) \\ && -\,\Gamma(\phi) \star S[H(\phi,t)] +\beta A(\phi,t) \end{array}$$
(1)
$$ \partial_t A(\phi,t) = -A(\phi,t) +\alpha S[H(\phi,t)] , $$
(2)

with neural generator potentials H(ϕ,t), inputs X(ϕ,t) from motion-tuned prestage neurons, adaptation levels A(ϕ,t), and firing rate function \(S[h] = h^2/(1 + h^2)\) for h > 0 and S[h] = 0 for h ≤ 0. During the ON-time of an ambiguous rotation-in-depth stimulus with rotation-axis angle ϕ 1, the input-distribution X(ϕ,t) is the sum of two ‘humps’ with shapes equal to the neural tuning-curve plotted in Fig. 1, and centered at the motion-directions ϕ = ϕ 1 ± π/2 (see top panels of Fig. 2 for two examples). Otherwise, X(ϕ,t) = 0. Cross-inhibition occurs with a strength depending on the distance between the preferred motion-directions of cells. This is modelled by the convolution kernel Γ(ϕ), which must be 2π-periodic and ϕ-symmetric, and have a broad maximum around the opposite direction: Γ(±π) = γ. Its minimum is taken as Γ(0) = 0, representing lack of self-inhibition. We actually use \(\Gamma(\phi) = \gamma \sqrt{|\sin(\phi/2) |}\) (plotted in Fig. 1), but the precise shape is non-critical; reasonable variants merely require recalibration of other model parameters. In any case, γ must be large enough to allow only 1-hump S[H(ϕ,t)]-responses at the end of each choice-event (and before switching sets in). Roughly similar neural-field models have been used to model classical rivalry, e.g. Laing and Chow (2002) and Kilpatrick and Bressloff (2010), but these lack the combination of shunting adaptation and βA(ϕ,t)-term (or equivalent) that is crucial (Noest et al. 2007) for generating observed percept-choice repetition.
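To make the ingredients of Eqs. (1) and (2) concrete, the following is a minimal numerical sketch rather than the simulation code behind the figures. It assumes a uniform discretization of the direction circle, a convolution normalized as an average over the circle, and a plain forward-Euler step; all three are choices made here for illustration, as are the function names (`firing_rate`, `make_grid`, `kernel_matrix`, `euler_step`). Parameter values and the shape of the input humps are not specified in this section and are left to the caller.

```python
import numpy as np


def firing_rate(h):
    """S[h] = h^2 / (1 + h^2) for h > 0, and 0 for h <= 0."""
    hp = np.maximum(h, 0.0)
    return hp**2 / (1.0 + hp**2)


def make_grid(n):
    """n equally spaced preferred directions phi in [-pi, pi)."""
    return -np.pi + 2.0 * np.pi * np.arange(n) / n


def cross_inhibition_kernel(phi, gamma):
    """Gamma(phi) = gamma * sqrt(|sin(phi/2)|): 0 at phi = 0, gamma at phi = +-pi."""
    return gamma * np.sqrt(np.abs(np.sin(phi / 2.0)))


def kernel_matrix(phi, gamma):
    """G[k, j] = Gamma(phi_k - phi_j); symmetric, with zero diagonal."""
    return cross_inhibition_kernel(phi[:, None] - phi[None, :], gamma)


def euler_step(H, A, X, G, dt, tau, alpha, beta):
    """One forward-Euler step of Eqs. (1) and (2).

    ASSUMPTION: the convolution Gamma * S is discretized as an average
    over the circle (G @ S / n); the source does not fix this normalization.
    """
    S = firing_rate(H)
    inhibition = G @ S / len(S)
    dH = (X - (1.0 + A) * H - inhibition + beta * A) / tau
    dA = -A + alpha * S
    return H + dt * dH, A + dt * dA
```

A run over the ON/OFF cycle would call `euler_step` repeatedly with X set to the two-hump input during ON-intervals and to zero during blanks; a different normalization of the convolution would simply rescale the effective γ.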

Fig. 1 Direction-column model: Shape of the (model-discretized) cross-inhibition kernel Γ(ϕ), as well as the input X(ϕ) and steady-state firing-rate S[H(ϕ)] when driven by an unambiguous motion stimulus

Fig. 2 Direction-column model ICS dynamics (Light/Dark = High/Low signal values): The distance between the rotation-axis angles ϕ i of the ambiguous-motion stimuli of the two sequences i determines whether the percept-choice sequences can coexist in either phase (shown for 67° inter-axis angle distance), or force each other into an in-phase pair (shown for 45° distance). In each case, the input activity-distribution X(ϕ,t) across the 2π-periodic motion-direction space ϕ consists of pairs of ‘humps’ around ϕ = ϕ i ± π/2, during the respective ON-intervals of each sequence-i stimulus. ON/OFF timing: T 1 = 0.5, T 0 = 0.5. On the left, we show the neural outputs S[H(ϕ,t)] which encode the chosen motion percepts, for both in- and anti-phase pairs. On the right, where an initial anti-phase pair decays to an in-phase pair, we show the outputs S[H(ϕ,t)] (middle) as well as the corresponding adaptation dynamics A(ϕ,t) (bottom)

This model allows us to make the first steps towards understanding the neural dynamics behind ICS-interaction phenomena in circular feature spaces, as first explored by Maier et al. (2003). To focus on the generic aspects, it is helpful to first consider the two extreme cases.

The case with equal rotation-axes (ϕ 1 = ϕ 2) corresponds to a single sequence of percept-choices (at doubled rate), nearly identical to the subject of our previous experiments and 2-population model (Noest et al. 2007; Klink et al. 2008). Based on these results, and given the T 0 = 1 s used, we predict relatively long runs of percept-choice repetition, with occasional sequence-flips due to neural noise (and/or the long-term cycling mechanism identified and modelled by Brascamp et al. 2009). Noise affects our present models, e.g. the column-based model (Eqs. (1) and (2)), via the same generic mechanism (see Appendix for analysis). Thus, the (formal) pair of interleaved sequences is maximally correlated, limited only by the fraction of noise-induced sequence-flips, in agreement with the Maier et al. (2003) results (see Footnote 3).

For orthogonal rotation axes (ϕ 1 = ϕ 2 + π/2, say), the model-structure becomes mirror-symmetric about each of the four motion-directions ϕ 1 ± π/2, ϕ 2 ± π/2. This implies that even the dynamical symmetry-breaking that constitutes percept-choices in sequence-1 (say) cannot bias the choice-dynamics in sequence-2 towards either of its competing percepts, and vice versa: The components of βA(ϕ,t) that couple distinct sequences are not only weak (because the generated S[H(ϕ,t)]-bumps are narrow, a parameter-contingent result) but their action is strictly choice-symmetric. Conversely, the symmetry constrains all choice-biasing components to act only within each sequence. This makes each of the two sequences nearly equal to single sequences studied in our previous experiments and 2-population model (Noest et al. 2007; Klink et al. 2008), especially since we showed there that only much larger choice-symmetric contributions have any effect, i.e. transition to a choice-alternation sequence (Fig. 2(b) in Noest et al. 2007). Again, those same mechanisms apply to our present models (see later sections and Appendix), so we predict that both sequences show long runs of percept-repetitions, separated by occasional sequence-flips when neuronal noise overrides the accumulated βA(ϕ,t) bias that favours repetitions. Moreover, the structural and dynamical model symmetry for orthogonal axes predicts that the percept-choices in one sequence are independent of those in the other. Indeed, the Maier et al. (2003) measure of inter-sequence coupling is at its minimum for the orthogonal-axes case, and the remaining 20–30% apparent coupling may again be attributed to the fraction of sequence-flips, roughly consistent with the same effects in the equal-axes case.

For inter-axes angles 0 < |ϕ 2 − ϕ 1 | < π/2, the results of Maier et al. (2003) interpolate very smoothly between the two extremes, but their data is averaged not only over noise-fluctuations but also over several observers, and later experiments (Carter and Cavanagh 2007) have shown that highly idiosyncratic and local random biases exist in percept-choice processes such as these. This makes it premature to try to reproduce data containing such hidden complexities before the basics of ICS-dynamics are clearly understood. Hence we focus on identifying and analyzing the relatively simple and generic dynamical structure behind basic ICS-interactions.

The first step is to identify how the existence of attractors for various ICS-patterns is affected by the inter-axes angle | ϕ 2 − ϕ 1 |. For this, our simple circular model (Eqs. (1) and (2)) without noise is most suitable. Topological reasoning rather than simulation then reveals the structure we seek, and guarantees that it is robust to finite structural disorder, e.g. as indicated by the Carter and Cavanagh (2007) results. The attractor-structure in the extreme cases must extend at least a finite distance into the intermediate range of | ϕ 2 − ϕ 1 |, since moving along this continuum corresponds to a continuous deformation of the dynamical system (Eqs. (1) and (2)), and attractors are structurally stable objects. From our symmetry-based analysis for the extreme cases, we already know that the orthogonal-axes case has four equivalent ICS-repetition attractors (each sequence independently repeats one of its pair of competing percepts), whereas only two of these attractors survive in the equal-axes case because the then dominant interaction between the two formal interleaved choice-sequences destroys the possibility of an “anti-phase” pattern of interleaved choice-repetitions (see Footnote 4). The remaining double-rate repetition sequences (two attractors) are equivalent to “in-phase” pairs of interleaved repetition sequences.

Because the attractors of both extremes extend smoothly at least a finite distance into the intermediate range, we extend the meaning of the terms “in-/anti-phase” to pairs of sequences containing percepts that are closer/further apart in ϕ-space (along the shortest path). Note that two of the four attractors in the orthogonal case extend to anti-phase attractors (related by inverting all percept-choices); the remaining two extend to in-phase attractors (similarly related). Crucially, the anti-phase attractors must disappear at some internal point of the | ϕ 2 − ϕ 1 | range, since their sequences cannot transform continuously into the only two stable sequences at the equal-axes extreme, i.e. the double-rate repetition sequences that extend to in-phase sequence-pairs. Conversely, the two in-phase attractors do persist along the whole range of | ϕ 2 − ϕ 1 |, since their sequence patterns are smoothly transformed into each other by moving between the two extremes.

In Fig. 2, we illustrate these very general and robust conclusions by explicit simulation of Eqs. (1) and (2): Both anti- and in-phase sequence pairs remain stable at moderately large | ϕ 2 − ϕ 1 |, but at smaller | ϕ 2 − ϕ 1 |, a system initialized into anti-phase quickly falls into a stable in-phase pattern. The actual (ϕ 1, ϕ 2)-values where anti-phase attractors disappear depend on all model parameters, including the structural disorder indicated by random local percept-biases (Carter and Cavanagh 2007). Modelling such idiosyncratic complications in a systematic way can only begin to be considered after elucidating the generic structure of ICS-dynamics. This is what our analysis provides.

Indeed, we note that our topological analysis of angle-dependent attractor structure extends well beyond models that can be deformed to Eqs. (1) and (2): For binocular rivalry between gratings of different orientation (or motion-direction), we get a doubled model structure (one per eye), with the cross-inhibition now running between eyes.

$$ \begin{array}{rll} \tau \partial_t H_i(\phi,t) &=& X_i(\phi,t) -[1+A_i(\phi,t)] H_i(\phi,t) \\ && -\,\Gamma(\phi) \star S[H_j(\phi,t)] +\beta A_i(\phi,t) \end{array}$$
(3)
$$ \partial_t A_i(\phi,t) = -A_i(\phi,t) +\,\alpha S[H_i(\phi,t)] \; ; \ i\neq j \in\{1,2\} . $$
(4)

Topologically the same attractor structure is predicted, and confirmed by simulation (with recalibrated parameters), even when adding moderate intra-ocular cross-inhibition (whose strength is bounded by the requirement that it must not create a counterfactual intra-ocular orientation-choice).

2.2 Spatially shifted, featurally equal sequence pairs

In several more recent ICS experiments, the two sequences are driven by stimuli with the same ambiguous featural content, but presented at (variably) shifted locations (Chen and He 2004; Knapen et al. 2009). Such stimuli select a different functional sub-network out of the full cortical product-space framework, as follows: The ambiguity common to both sequences is driven by a pair of competing feature-values that are so far apart in feature-space that we may safely neglect any feature-space overlap between the activated neurons, and thus label these populations by discrete indices i ≠ j ∈ {1,2}. On the other hand, the spatial extent of each activated population can no longer be reduced to a point, since the typical scale across which cross-inhibition operates (2–3 times the stimulus wavelength; Liu and Schor 1994) now tends to be less than the stimulus size and of the same order as the range of spatial overlaps probed in the most informative of these experiments. The neural-field dynamics model that should capture such situations becomes

$$\begin{array}{rll} \tau \partial_t H_i(r) &=& X_i(r) -[1+A_i(r)] H_i(r) \\ && -\,\Gamma(r) \star S[H_j(r)] +\beta A_i(r)\end{array}$$
(5)
$$ \partial_t A_i(r) = -A_i(r) +\,\alpha S[H_i(r)] \; ; \ i\neq j \ ; \ i,j\in\{1,2\} \ .$$
(6)

In this setting, the cross-inhibition received by a feature-i neuron at a location r comes from feature-j neurons within the neighborhood of r. Hence, the kernel Γ(r) has a symmetric peak at spatial offset r = 0. Its effective range is crudely known (Liu and Schor 1994; Alais et al. 2006) to be a few times the wavelength of the stimulus spatial frequency. We simply take this width-scale as our unit of spatial distance. Likewise, the pattern of inputs X(r,t) to the choice-stage will be a slightly blurred version of the stimulus; its effective blur-kernel will be roughly the convolution of the RF-kernels of the cells in the pre-processing stages that feed into the stage we model. Thus, we expect the spatial blur-scale of X(r,t) to be roughly similar to the Γ(r) blur-scale.

As shown in Fig. 3, this model generates the type of behavior reported in recent experiments (Chen and He 2004; Knapen et al. 2009): For strongly overlapping stimuli, choice-repetition sequences only exist as an in-phase pair. Conversely, the two sequences may also exist as a long-lived anti-phase pair when there is a sufficiently large spatial gap between the stimuli. The measured spatial shift between stimulus centers at which the phase-locking effect reaches half-maximum (Knapen et al. 2009) is of the order of one degree, which is also roughly equal to the stimulus-diameter. This fits at least qualitatively with the model, where the typical spatial scale is set by the diameter of the RF and that of the Γ(r) kernel; these are of the same order of magnitude for the stimuli used in the existing experiments.

Fig. 3 Spatially structured model examples: The inter-sequence interaction, which tends to enforce in-phase patterns, now depends on the existence of a gap or overlap between the stimuli belonging to each sequence. This again manifests itself as (top left) immediate decay to in-phase pattern for overlapping stimuli, or (bottom left) stability of any mutual phase when the gap is larger than the typical RF-size excited by the stimuli, or (right top and bottom) a slow transition to in-phase pattern for abutting stimuli. In this case, we show both the neural outputs S[H(r,t)] and their adaptation dynamics A(r,t). In all panels, blue and red denote the competing percepts

3 Reduction to six-population ODE model reveals the crucial mechanism of choice-sequence interaction

The fact that our neural-field models capture the general patterns of known 2-ICS behavior does not suffice to provide a thorough understanding of the crucial dynamical processes involved, but it does provide a useful first step: In these models, the huge complexity of visual-cortex connectivity and neural dynamics is already reduced to a concise set of simplified ingredients, which are thus shown to be at least sufficient. Moreover, the simulation results strongly suggest that the phenomenological interaction between the interleaved choice-sequences depends on the degree of overlap (in either featural or spatial dimensions) between the neural populations activated by the respective stimuli of each of the two sequences. To quantify and understand how such overlap may provide the core dynamical mechanism we seek, we need to reduce our (quasi-continuous) neural field models to the smallest set of ordinary differential equations (ODEs) that captures the structure and dynamics of the various neural subsets (overlapping or non-overlapping populations) that are activated by 2-ICS stimuli. The following considerations come into play:

The blank time T 0 between the stimuli of the interleaved sequences is much longer than the timescale τ of the fast (H) neural activity dynamics, so there can be no direct H-based dynamical coupling between sequences, even when there is strong overlap between the neural populations driven by each sequence. Only the slow decay of adaptation levels A can bridge the T 0-gap. Note that this is entirely in line with how the A-dynamics bridges the intra-sequence stimulus interruptions (here of length 2T 0 + T 1) as one crucial factor in the mechanism that allows long runs of percept-choice repetitions within each sequence, according to our widely supported (Pearson and Brascamp 2008) single-sequence model (Noest et al. 2007). However, adaptation is a very local process, probably acting within each neuron separately, so it can only carry any influence between choice-sequences insofar as it occurs in ‘shared’ neurons, i.e., neurons whose activity S[H] covaries strongly with the percept-choice dynamics of both sequences. As in the known (Noest et al. 2007) percept-choice process within a single sequence, the term βA in those ‘shared’ neurons will bias them towards repeating the most recent percept, i.e. the one which last occurred in the ‘opposite’ sequence. However, note that one more element is required to guarantee that a single, spatially or featurally homogeneous percept emerges at each onset: The ‘shared’ neuron population activity during stimulus-ON time must evolve largely in unison with that of the ‘non-shared’ populations activated during that time. Such H-based coupling is actually implemented by the cross-inhibition between competing features that underlies the very existence of a percept-choice process: This cross-inhibition is known to act across a finite range in real space and in feature-space (Alais et al. 2006) that is at least of the right order of magnitude to fit the observed inter-sequence interaction effects.
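The timescale argument can be illustrated with a short worked calculation. This is a sketch under stated idealizations, not part of the model derivation. Suppose, purely for illustration, that over an interval of length Δ the firing rate of a population is held at a constant value s. The adaptation equation (of the form of Eq. (2)) is then linear and integrates to

$$ A(t_0+\Delta) = \alpha s + \left[ A(t_0) - \alpha s \right] e^{-\Delta} . $$

Idealizing a silent interval as s = 0 (which neglects the small baseline firing that the βA term can sustain), adaptation decays as

$$ A(t_0+T_0) \approx A(t_0)\, e^{-T_0} , \qquad A(t_0+2T_0+T_1) \approx A(t_0)\, e^{-(2T_0+T_1)} , $$

where the second expression applies to a population that stays silent throughout the intra-sequence interruption. By contrast, with X = 0, negligible cross-inhibition, and A treated as frozen on the fast timescale, Eq. (1) gives

$$ H(t_0+\Delta) \approx \frac{\beta A}{1+A} + \left[ H(t_0) - \frac{\beta A}{1+A} \right] e^{-(1+A)\Delta/\tau} . $$

Under these assumptions, any stimulus-driven part of H has shrunk by a factor \(e^{-(1+A)T_0/\tau}\) at the end of a blank, which is negligible when \(T_0 \gg \tau\), whereas a fraction of roughly \(e^{-T_0}\) of the adaptation survives (time being measured in units of the adaptation time constant, as in Eq. (2)).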

Incorporating these mechanistic demands and considerations into the simplest neurally viable model, we arrive at the following 6-population ODE-model (see Footnote 5), whose general structure is sketched in Fig. 4:

$$ \begin{array}{rll} \tau \partial_t H_{i,k} &=& X_{i,k} -(1+A_{i,k}) H_{i,k} +\beta A_{i,k} \\ && -\,\gamma \left( S[H_{j,k}] + S[h_j] \right)\end{array}$$
(7)
$$\begin{array}{rll} \tau \partial_t h_{i} &=& \xi (X_{i,1}+X_{i,2}) -(1+a_{i}) h_{i} +\beta a_{i} \\ &&-\,\gamma S[h_{j}] -(\gamma/2)\left( S[H_{j,1}] + S[H_{j,2}] \right)\end{array}$$
(8)
$$\partial_t A_{i,k} = -A_{i,k} + \alpha S[H_{i,k}]$$
(9)
$$\partial_t a_{i} = -a_{i} + \alpha S[h_{i}]; \quad i\neq j \ ; \ i,j,k \in\{1,2\} \ .$$
(10)

Note that the fast/slow dynamical variables of the four main (non-shared) neural populations are denoted with capital-letter symbols (H i,k , A i,k ) as before, whereas those of the (smaller) ‘shared-neuron’ populations are indicated by lower-case letters (h i , a i ). As explained above, the shared populations receive inputs (weighted by ξ < 1 representing RF-tail strength) from the stimuli of both sequences k, and their fast-dynamics is sufficiently coupled via shared cross-inhibition (of overall strength γ) to that of the main populations of both k. Adaptation dynamics remains local, at least relative to the spreading of inputs (RF-size) and cross-inhibition kernels.
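To make the structure of Eqs. (7)–(10) concrete, the right-hand sides can be sketched in a few lines of Python. This is only an illustrative sketch: the firing-rate function `S` used here (zero below threshold, saturating above) is an assumed form, and all default parameter values except ξ = 0.25 are hypothetical placeholders rather than the values used for our figures.

```python
def S(H):
    """Assumed firing-rate function: silent for H <= 0, saturating above."""
    if H <= 0.0:
        return 0.0
    return H * H / (1.0 + H * H)


def rhs(H, A, h, a, X, tau=0.02, alpha=5.0, beta=0.25, gamma=3.3, xi=0.25):
    """Right-hand sides of Eqs. (7)-(10).

    H, A, X are 2x2 nested lists indexed [i][k] (percept i, sequence k);
    h, a are length-2 lists indexed [i]. Indices are 0-based here.
    Default parameter values are placeholders (assumptions).
    Returns (dH, dh, dA, da) in the same layout.
    """
    dH = [[0.0, 0.0], [0.0, 0.0]]
    dA = [[0.0, 0.0], [0.0, 0.0]]
    dh = [0.0, 0.0]
    da = [0.0, 0.0]
    for i in range(2):
        j = 1 - i  # index of the competing percept
        for k in range(2):
            # Eq. (7): cross-inhibition from the rival main and shared populations
            inhib = gamma * (S(H[j][k]) + S(h[j]))
            dH[i][k] = (X[i][k] - (1.0 + A[i][k]) * H[i][k]
                        + beta * A[i][k] - inhib) / tau
            # Eq. (9): local adaptation of the main populations
            dA[i][k] = -A[i][k] + alpha * S(H[i][k])
        # Eq. (8): shared population sees both stimuli (weight xi) and
        # inhibition from the rival shared and rival main populations
        inhib = gamma * S(h[j]) + 0.5 * gamma * (S(H[j][0]) + S(H[j][1]))
        dh[i] = (xi * (X[i][0] + X[i][1]) - (1.0 + a[i]) * h[i]
                 + beta * a[i] - inhib) / tau
        # Eq. (10): local adaptation of the shared populations
        da[i] = -a[i] + alpha * S(h[i])
    return dH, dh, dA, da
```

Any standard ODE integrator can be wrapped around `rhs`; the time unit is the adaptation time constant, so `tau` should be chosen much smaller than 1.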

Fig. 4

Reduction to a 6-population ODE-model: The fast/slow dynamical variables of the four main (‘non-shared’) neural populations are denoted with capital-letter symbols (H i,k , A i,k ) as before, whereas those of the (smaller) ‘shared-neuron’ populations are indicated by lower-case letters (h i , a i ). Note that the shared populations satisfy two crucial demands (see text for explanation): They receive inputs (weighted by ξ < 1) from the stimuli of both sequences k, and their fast-dynamics is sufficiently coupled (via shared cross-inhibition: red links) to that of the main populations of both k

This model allows us to precisely dissect and understand how the course of each percept-choice process (fast dynamics) is jointly controlled by the adaptation states at onset of the sequence-specific main populations as well as the shared populations. Indeed, Fig. 5 shows in detail the sequence of crucial dynamical effects during each choice event, which we can summarize as follows:

Fig. 5

Detailed views of the fast dynamics of a percept-choice (in sequence k = 1), showing how bias derived from the main-population adaptation state A i,1 can be overridden by the initial response of the shared-populations, which is biased by the a i state. Main panel (left) Trajectories of the crucial ‘membrane potential’ pairs H i,1 (red) and 5 h i (green) during the crucial first few τ-units after onset (filled/open dots mark time in units of τ and τ/3, respectively). To show how the shared-population delivers the effective coupling between sequences, we overlay the trajectories of two cases: In both, we assume a preceding percept-1 choice in sequence-1, leading to a main-population A 1,1 > A 2,1-state; as explained previously (Noest et al. 2007), the small facilitatory terms βA i,k then give percept-1 a subthreshold ‘head-start’ \(H_{1,1}^{\ast} > H_{2,1}^{\ast}\) at the present onset (foot-point of red trajectories). The shared-population a i -states, and hence the \(h_i^{\ast}\) head-starts (foot-point of green trajectories), contain a similarly biased contribution, but they also contain a ‘crosstalk’ contribution from the sequence-2 choice-history. To generate the ‘without crosstalk’ baseline trajectories, we blocked these sequence-2 contributions to a i , as if sequence-2 did not exist. The remaining imbalance a 1 > a 2 then biases the \(h_1^{\ast} > h_2^{\ast}\) head-start (green foot-point) in the same way as the main-population bias. As expected, the choice-dynamics then converges on percept-1 (trajectories curving to lower right-hand side). In the ‘with crosstalk’ case, we assume that sequence-2 has repeatedly chosen percept-2, such that it leads to an imbalance a 1 < a 2. Now we have a conflicting set of head-start biases \(H_{1,1}^{\ast} > H_{2,1}^{\ast}\) and \(h_1^{\ast} < h_2^{\ast}\) (see starting points of green and red trajectories). We chose (realistic) conditions such that the bias from a i actually overrides the bias from A i,1. 
Note that during the first phase (up to t ≈ 1.5τ), the h i (green) activations indeed grow while maintaining their bias towards percept-2. Via shared cross-inhibition, this gradually curves the (red) H i,1-trajectories away from their initial percept-1 biased course and towards the percept-2 side of the diagonal, before they reach the vicinity of the saddle point where the red trajectories diverge sharply, signalling that the system is essentially ‘committed’ to a particular percept-choice. While near the saddle, the main H i,1 suppress the smaller h i -signals by competition. For the final phase of the process, see the two side-panels. Side panels (right) Time-course of the same choice-process, now in terms of the neural firing rates S[H i,1], S[h i ]; these do not encode the important subthreshold ‘head-start’ biases mentioned above, but they drive the cross-inhibition which couples the shared and main population fast dynamics so as to generate a unified, jointly biased percept-choice. Lower/Upper panels show the choice-process with/without the a i -‘crosstalk’ which overrides the A i,1-derived bias. Note the initial < 2 τ phase where the shared-population signals couple their bias with that of the main population, and the ‘hesitation’-stage until t ≈ 4 τ, during which all shared population signals are suppressed. Afterwards, all four populations jointly accelerate towards the attractor that encodes the chosen percept, and essentially converge on it at t ≈ 7 τ

Without loss of generality, we consider a choice process within sequence-1, and assume that the preceding percept chosen in this sequence was percept-1, leading to a moderate imbalance A 1,1 > A 2,1 of the main-population adaptation at the current onset. As explained previously (Noest et al. 2007), this sets \(H_{1,1}^{\ast} > H_{2,1}^{\ast}\) at onset, giving percept-1 a ‘head-start’ (see foot-point of red trajectories). The shared-population a i -state contains a similarly biased contribution, but its net state also contains contributions from the sequence-2 choice-history, and we assume this to be a long series of percept-2 choices. In our ‘without crosstalk’ case, we removed these sequence-2 contributions—as expected, the overall choice process then converges on percept-1. However, in the ‘with crosstalk’ case, the long history of percept-2 choices leads to an imbalance a 1 < a 2, yielding a head-start bias \(h_1^{\ast} < h_2^{\ast}\) that conflicts with the \(H_{1,1}^{\ast} > H_{2,1}^{\ast}\) bias (see starting points of green and red trajectories). With conditions such that the a i -derived bias overrides the bias from A i,1, the h i (green) activations initially grow while maintaining their bias towards percept-2. Via shared cross-inhibition, this gradually curves the (red) H i,1-trajectories away from their percept-1 biased course towards the percept-2 side of the diagonal, before they reach the vicinity of the saddle point where both red lines diverge sharply, and thus make the percept-choice irreversible. While near the saddle, the main H i,1 suppress the smaller h i -signals by competition, but soon after (t ≈ 5 τ, see right-hand panels of Fig. 5), all four populations jointly accelerate towards the attractor that encodes the chosen percept (well outside the area plotted in the left panel), and essentially converge on it at t ≈ 7 τ.

Having analyzed the detailed fast-timescale dynamics of the percept-choice process, we can summarize its net behavior in terms of a mapping from the four-dimensional space of adaptation-states A i,k , a i at each sequence-k stimulus onset to the label i of the chosen percept. Here we can neglect the extremely small range where the A i,k , a i are so close to i-symmetry that the fast dynamics lingers near the saddle point for a large fraction of the stimulus-ON time T 1 ≫ τ. The full map is obviously symmetric under i,j and k,ℓ interchange, and we know from earlier studies (Noest et al. 2007) that it favors percept-i when A i,k > A j,k unless both adaptation levels become large. (Actually reaching this large-A regime requires such short within-sequence blank intervals 2T 0 + T 1 that it is probably unreachable in our setting). The relevant structure of this ‘choice-map’ can be viewed in Fig. 6. It shows the A i,1-dependence at a few realistic values of a i imbalance, to illustrate the effect of ‘crosstalk’ from sequence-2 choices on the active sequence-1 choice process. One may note the nearly linear effect of (realistically) small crosstalk bias on displacing the choice-boundaries. The general i,k-symmetry and smoothness properties seen here are retained in defining the choice-indicator function C i,k (Eq. (13)) for our next level of model-reduction.

Fig. 6

Dependence of a sequence-1 percept-choice on the main-population state A i,1 (axes), as biased by the shared-population adaptation state a i (curve parameters). Left panel shows overview; Right panel shows detail in the usually relevant range (long T 0 yield low A i,k ). All cases have input-sharing ξ = 0.25, implying a baseline condition with a i  ≈ 0.3 A i,k . To show the effect of crosstalk from a percept-2 choice in sequence-2, we add offsets δa 2 to a 2, as indicated. Region-coding: percept-1/2 is chosen in the yellow/blue regimes; in between these, the actual choice-boundary position depends on δa 2, as indicated. For inverted a i -imbalance, the choice-map pattern is obtained by interchanging the plot-axes. Note the roughly linear effect of the (realistically small) a i -imbalance on displacing the choice-boundaries in the low-A i,1 regime

4 Reduction to iterated A i,k -map: analysis and predictions

To enable detailed dynamical analysis that yields predictions of generic 2-ICS behavior well beyond existing experiments, it is very useful to perform another model-reduction step. Indeed, the results we obtained from the 6-population model allow us to condense all the crucial elements of 2-ICS dynamical behavior under a wide range of conditions into a much more tractable form: A discrete-time map that relates the adaptation-states A i,k at one stimulus-onset to the next, and thereby also determines the sequence of percept-choices.

The main reason why such a reduction can capture all the essentials of 2-ICS dynamics is that we have a sufficiently large separation of the relevant timescales: As shown in Fig. 5, each percept-choice effectively finishes within a few times the fast timescale τ after onset of the corresponding stimulus. This is not only fast with respect to the adaptation timescale but also with respect to the stimulus-ON duration T 1, for all presently relevant experiments. This allows us, firstly, to collapse the actual dynamics of an individual choice-event into a formally ‘instantaneous’ evaluation of a binary-valued function C i,k (m) ∈ {0,1} that indicates whether percept i in sequence k is chosen (1) or not (0) at a particular onset indexed by m = 2n + k, where n ∈ ℕ counts the full (2-ICS) stimulus cycles. Secondly, it allows us to describe the adaptation dynamics between one onset and the next as the sum of a passive exponential decay and an ‘adaptation-boost’ term C i,k (m) Q[A i,k (m)] which describes the amount of adaptation added to the chosen-percept population during the m-th stimulus-ON time. Explicit forms of the functions C i,k and Q[A] are constructed below (Sections 4.1.1 and 4.1.2).

4.1 Iterated A i,k -map for 2-ICS dynamics: general form

With T 0, T 1 denoting the stimulus OFF and ON durations, and n ∈ ℕ counting the full stimulus periods (length 2(T 0 + T 1)), the stimulus onsets in each sequence k ∈ {1,2} are counted by m = 2n + k, and the sequence dynamics is reduced to the iterated set of maps

$$ A_{i,k}(m) = e^{-T_0-T_1} A_{i,k}(m-1); \quad m=2n+k\;,\; \ell \neq k $$
(11)
$$\begin{array}{rll} A_{i,\ell}(m) &=& e^{-T_0-T_1} A_{i,\ell}(m-1) \\ && +\, C_{i,\ell}(m-1)\, e^{-T_0} Q[A_{i,\ell}(m-1)] \end{array}$$
(12)

where C i,k  ∈ {0,1} and Q[A i,ℓ] are the choice-function and adaptation-boost function introduced above, and specified below.

Note that Eq. (11) applies when the sequence-index k and onset-counter m have the same odd/even parity—only passive decay of adaptation happens in this time-interval. Equation (12) applies to the cases with unequal-parity (ℓ and m), and its last term reflects the fact that a choice-event occurred within its own sequence one onset earlier. Indeed, the C i,k (m)-term occurs only with equal parity of its sequence-index k and onset-counter m. Note also that the coupling between sequences now occurs via the choice-functions.
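The bookkeeping of Eqs. (11) and (12) can be illustrated with a minimal Python sketch of a single onset-to-onset update. The function name `map_step`, the 0-based indexing, and the convention of passing the chosen percept explicitly are assumptions made for illustration; `Q` is any callable standing in for the adaptation-boost function specified later.

```python
import math


def map_step(A, k, chosen, Q, T0, T1):
    """Advance all adaptation states from the current onset to the next one.

    A      : 2x2 nested list A[i][k] at the current onset (0-based indices)
    k      : sequence stimulated at the current onset
    chosen : percept that wins in sequence k at the current onset
    Q      : callable adaptation-boost function (a stand-in, see text)
    """
    decay = math.exp(-(T0 + T1))
    # Passive decay over one ON + OFF interval applies to every population.
    new_A = [[decay * A[i][s] for s in range(2)] for i in range(2)]
    # Only the chosen population of the stimulated sequence receives the
    # boost, which then decays over the subsequent OFF interval T0.
    new_A[chosen][k] += math.exp(-T0) * Q(A[chosen][k])
    return new_A
```

Applying `map_step` twice, once for each sequence, covers one full stimulus period 2(T 0 + T 1); for the population boosted in the first step, the boost then carries a total weight exp(−2T 0 − T 1).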

4.1.1 Choice indicator function C i,k , and elimination of shared-population signals

Our 6-population model (Section 3) revealed how the outcome of the rapid percept-choice process after stimulus onset (Fig. 5) is effectively determined by the adaptation states (at onset) in all neurons driven by the corresponding stimulus (Fig. 6). Thus, in this model, a percept-choice in a particular sequence k is determined by the four adaptation levels A i,k and a i at stimulus onset. The main-population states A i,k deliver the main bias toward one of the percepts, whereas the shared-population states a i convey a small extra bias that couples the two sequences. At first sight, it seems we must track all six adaptation values to model the full 2-sequence dynamics. However, we also saw (Fig. 5, right-hand panels) that the shared-population signals h i , and hence also their a i , directly follow the main-population dynamics after the percept-choice (we can neglect the short (≈ τ) post-onset phase in which the shared population actively biases the incipient choice). Also note that the percept-choices in both sequences will thus contribute to the shared-population a i . Hence, we can simplify the system again by deleting the a i as independent degrees of freedom, and use the (appropriately weighted) adaptation states A i,k of the main populations as the formal source of coupling-bias in the choice-function C i,ℓ for the other sequence, ℓ ≠ k. With these considerations about the coupling terms, we choose the simplest mathematical form of choice-function that satisfies the general i, j and k,ℓ symmetries, and captures the basic fact that the usual (uncoupled) choice of percept i for A i  > A j inverts at large adaptation-levels. We define C i,k  ∈ {0,1} to indicate that percept i in sequence k is chosen (1) or not (0), and formalize its dependence on the four main-population adaptation states at onset as

$$ C_{i,k} = \Theta\Big[ \big( A_{i,k} -A_{j,k} \big) \big\{ B -A_{i,k} -A_{j,k} +\eta_2 (A_{i,\ell} +A_{j,\ell}) \big\} +\eta_1 \big(A_{i,\ell} -A_{j,\ell} \big) \Big], \quad \ell \neq k, $$
(13)

where Θ[z ≤ 0] = 0; Θ[z > 0] = 1.

Note that the main effective coupling parameter η 1 delivers a sequence-ℓ driven bias to sequence-k choices. These are mainly determined by the sequence-k adaptation imbalance, whose effect inverts at a mean adaptation level set primarily by B, with a sequence-ℓ dependent shift weighted by the secondary coupling parameter η 2. Both η 1, η 2 are increasing (roughly linear) functions of the effective size and competitive strength of the ‘shared’ population, as captured in the 6-population model by the shared-input parameter ξ. In this model, the effective values of η 1 and η 2 then are of roughly equal magnitude, but this need not be so in reality.
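As a concrete illustration, Eq. (13) translates directly into a short Python function. The function name, the 0-based indices and the numerical values in the usage note are hypothetical; the sketch only mirrors the formula and is not tuned to any experiment.

```python
def choice_indicator(A, i, k, B, eta1, eta2):
    """Evaluate C_{i,k} of Eq. (13) for adaptation states A[i][k] (0-based).

    Returns 1 if percept i is chosen in sequence k, else 0.
    """
    j = 1 - i  # competing percept
    l = 1 - k  # other sequence
    own_diff = A[i][k] - A[j][k]
    own_sum = A[i][k] + A[j][k]
    other_diff = A[i][l] - A[j][l]
    other_sum = A[i][l] + A[j][l]
    z = own_diff * (B - own_sum + eta2 * other_sum) + eta1 * other_diff
    return 1 if z > 0.0 else 0
```

For example, with B = 2 and no coupling, the states A 1,k = 0.3, A 2,k = 0.1 select percept 1, whereas the strongly adapted pair A 1,k = 3, A 2,k = 2 selects percept 2, which is the inversion at large adaptation levels mentioned above. Because the argument of Θ changes sign under i,j interchange, exactly one of the two percepts is chosen whenever the argument is nonzero.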

4.1.2 Adaptation-boost function Q[A]

As soon as the fast-dynamics variables H i,k have essentially converged to a new percept-choice after stimulus onset (and until the end of the ON-interval T 1), we can approximate them by their formal fixed-point values

$$ H_{i,k}^{\ast} = \frac{X_{i,k} +\beta A_{i,k}}{1+A_{i,k}} \ , $$
(14)

thus reducing the full dynamical system to a (decoupled) set of nonlinear ODEs for the A i,k

$$ \partial_t A_{i,k} = -A_{i,k} + \alpha S[H_{i,k}^{\ast}] . $$
(15)

During the stimulus-OFF intervals T 0, the same approximation holds, with X i,k  = 0.

Integrating the A i,k -ODEs (Eq. (15)) from one stimulus onset to the next then yields the maps (Eqs. (11) and (12)), as follows: The trivial cases, yielding mere exponential decay, are for combinations of i,k and m such that C i,k (m) = 0, i.e., for populations that do not represent the percept chosen at onset m. For the (only) remaining population, which encodes the chosen percept, we have \(S[H_{i,k}^{\ast}]>0\) over essentially the full stimulus-ON time T 1 (up to an \(\mathcal{O}[\tau]\)-error from the choice-process). This contributes an A i,k -‘boost’ term denoted as Q[A i,k ], on top of the basic exponential decay term. Thus, the chosen-percept adaptation map is

$$ A_{i,k}(m+1) = e^{-T_0-T_1} A_{i,k}(m) + e^{-T_0} Q[A_{i,k}(m)] $$
(16)
$$ Q[A_{i,k}(m)] = \alpha \negthickspace \int_{m(T_0+T_1)}^{m(T_0+T_1)+T_1} \negthickspace S[H_{i,k}^{\ast}(t)] e^{t-m(T_0+T_1)-T_1} d t \ , $$
(17)

where the (slow) time-evolution of \(H_{i,k}^{\ast}\) is fully determined via the A i,k -ODE (Eq. (15)) with initial condition A i,k (m) at onset m, in combination with the fixed-point relation (Eq. (14)). Note that the function Q[A] therefore also depends on all other parameters in the original problem. For the purposes of this paper, as well as most experiments, the dependence on T 1 is most important, besides the explicit A-dependence. All other parameters do not qualitatively alter the A- and T 1-dependence shown in Fig. 7: Q decreases smoothly with A, and increases sublinearly with T 1, approaching gradual saturation on a timescale of order 1. These properties robustly determine the whole range of 2-ICS behaviors discussed in this paper.
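A minimal numerical sketch of Q[A] follows, under stated assumptions. Solving Eq. (15) over the ON-interval by variation of constants gives A(T 1) = e^{−T 1} A(0) + Q[A(0)], so Q can be read off from the end point of the integrated A-ODE without evaluating the integral in Eq. (17) separately. The firing-rate function `S` and the parameter values are assumed placeholders, and the fixed-step Runge-Kutta integrator is merely a convenient choice.

```python
import math


def S(H):
    """Assumed firing-rate function: silent for H <= 0, saturating above."""
    if H <= 0.0:
        return 0.0
    return H * H / (1.0 + H * H)


def boost_Q(A0, T1, X=1.0, alpha=5.0, beta=0.25, steps=2000):
    """Adaptation boost Q[A0] for stimulus-ON time T1.

    Integrates Eq. (15), with H* taken from Eq. (14), using classical RK4,
    then uses A(T1) = exp(-T1) * A0 + Q[A0].
    X, alpha, beta are placeholder values (assumptions).
    """
    def dA(A):
        H_star = (X + beta * A) / (1.0 + A)  # Eq. (14)
        return -A + alpha * S(H_star)        # Eq. (15)

    dt = T1 / steps
    A = A0
    for _ in range(steps):
        k1 = dA(A)
        k2 = dA(A + 0.5 * dt * k1)
        k3 = dA(A + 0.5 * dt * k2)
        k4 = dA(A + dt * k3)
        A += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return A - math.exp(-T1) * A0
```

With these placeholder settings, `boost_Q` decreases with the onset value `A0` and grows sublinearly with `T1`, in line with the qualitative properties stated above.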

Fig. 7

Typical dependence of the adaptation-‘boost’ Q on the adaptation-value A at onset, and the stimulus-ON time T 1. The smooth decrease with A and sublinear growth with T 1 are qualitatively common to all model-variants introduced in this paper, while depending smoothly on all model parameters

4.2 Existence and stability of choice-repetition sequences

Repetitive choice of a percept i by sequence k corresponds to choice-function values C i,k (2n + k) = 1, C j,k (2n + k) = 0 for all n. Assume that both sequences settle into such a repetition pattern, with arbitrary mutual relation. To check the existence and stability of such a solution, we can restrict analysis to the behavior of the A i,k (m) at times m = 2n + k when a percept-choice actually occurs in sequence k. Note also that the dynamical rules that govern both sequences will have the same general form.

To describe the underlying adaptation dynamics of such double-repeat sequences, it proves useful to apply the general iterated map dynamics (Eqs. (11) and (12)) twice (with appropriate label-permutations), corresponding to the full 2-sequence stimulus period 2(T 0 + T 1). Indeed, this yields the simple 2-timestep dynamics

$$\begin{array}{rll} A_{i,k}(m) &=& e^{-2(T_0+T_1)} A_{i,k}(m-2) \\ && +e^{-2T_0 -T_1} Q[A_{i,k}(m-2)] \end{array}$$
(18)
$$ A_{j,k}(m) = e^{-2(T_0+T_1)} A_{j,k}(m-2) \ . $$
(19)

Note that the dynamics of each sequence has now become formally uncoupled from that of the other, so we do not (yet) have to separate the two (perceptually very different) cases of ‘in-phase’ sequences (same percepts i) or ‘anti-phase’ sequences (different i). This dynamical independence is due to the fact that the original interaction occurs through the choice-maps C i,k , which are not only piecewise constant but now actually fixed, representing the assumption that the system produces repeating-choice sequences. Hence, we merely have to check whether this assumption is self-consistent, and whether the corresponding A-dynamics is stable.

4.2.1 Existence

First, we need to find the fixed points of the 2-timestep map (Eqs. (18) and (19)), which satisfy

$$ A_{i,k}^{\ast}(m) = \frac{e^{-2T_0 -T_1}}{1-e^{-2(T_0+T_1)}} Q[A_{i,k}^{\ast}(m)] \;\; \equiv \;\; \mathcal{A^{\ast}} $$
(20)
$$ A_{j,k}^{\ast}(m) = 0 \ . $$
(21)

The only nontrivial value \(\mathcal{A^{\ast}}\) can be computed numerically by iterating the map, or by efficient root-finding routines.
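Iterating the map is easy to sketch in Python. The helper below repeatedly applies the 2-timestep map of Eq. (18) until successive values agree; the function name, the tolerance and the boost function used in the usage note are hypothetical stand-ins, not the Q[A] of Eq. (17).

```python
import math


def fixed_point_A(Q, T0, T1, tol=1e-12, max_iter=10000):
    """Approximate the nontrivial fixed point of Eq. (20) by map iteration.

    Q is any callable boost function; convergence presupposes that the
    fixed point is linearly stable (see Section 4.2.2).
    """
    decay2 = math.exp(-2.0 * (T0 + T1))
    weight = math.exp(-2.0 * T0 - T1)
    A = 0.0
    for _ in range(max_iter):
        A_next = decay2 * A + weight * Q(A)  # Eq. (18)
        if abs(A_next - A) < tol:
            return A_next
        A = A_next
    raise RuntimeError("map iteration did not converge")
```

For instance, with the purely illustrative choice Q[A] = 1/(1 + A) and T 0 = T 1 = 1, the fixed-point condition reduces to a quadratic equation in A, against which the iterated value can be checked.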

Existence of the ‘repeat’-solutions then requires consistency with C i,k  = 1,C j,k  = 0. Note that the C-functions depend formally on all four A i,k , but now two of these are zero and the other two are equal to \(\mathcal{A^{\ast}}\), and \(\mathcal{A^{\ast}} e^{T_0 +T_1}\) respectively. To check consistency, we do have to distinguish between “in-phase” and “anti–phase” pairs of sequences:

Existence of the in-phase solution requires

$$ B+\mathcal{A^{\ast}}\{-1+\eta_2 e^{T_0 +T_1}\} +\eta_1 e^{T_0 +T_1} > 0 $$
(22)

This condition is satisfied throughout, because η 1 , η 2 > 0 and \(\mathcal{A}^{\ast} < B\) since both sequences are producing percept-repetitions.

Existence of the anti-phase solution requires

$$ B+\mathcal{A^{\ast}}\{-1+\eta_2 e^{T_0 +T_1}\} > \eta_1 e^{T_0 +T_1} $$
(23)

This bound can indeed be violated; it defines a boundary in the space of all model-parameters beyond which anti-phase repetition patterns cannot exist. An example is shown as the black-dashed line in Fig. 8: Stable anti-phase repetition only exists below this line in the selected (T 1,η 1, η 2)-subspace of model parameters.
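The two existence conditions are simple enough to check programmatically. The following sketch evaluates Eqs. (22) and (23) for a given fixed-point value and also returns the critical η 1 at which the anti-phase solution disappears, obtained by turning Eq. (23) into an equality. Function names and the numbers in the usage note are hypothetical.

```python
import math


def repetition_exists(A_star, B, eta1, eta2, T0, T1):
    """Check existence of in-phase and anti-phase repetition patterns.

    Implements Eq. (22) (in-phase) and Eq. (23) (anti-phase).
    """
    g = math.exp(T0 + T1)
    base = B + A_star * (-1.0 + eta2 * g)
    return {
        "in_phase": base + eta1 * g > 0.0,
        "anti_phase": base > eta1 * g,
    }


def eta1_critical(A_star, B, eta2, T0, T1):
    """Largest eta1 for which Eq. (23) can still hold (boundary value)."""
    g = math.exp(T0 + T1)
    return (B + A_star * (-1.0 + eta2 * g)) / g
```

As an example with made-up values A* = 0.05, B = 2, η 2 = 0.1 and T 0 = T 1 = 1, the critical coupling is about 0.27: anti-phase repetition exists for η 1 slightly below this value and is lost slightly above it, while the in-phase condition holds in both cases.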

Fig. 8

Existence of stable choice-repetition patterns and their response to a choice-glitch, as dependent on the blank-time T 0 and crosstalk parameters η: In-phase sequence pairs are linearly stable throughout this diagram, but anti-phase pairs exist (and are linearly stable) only below the black dashed line (regimes III,IV). Effect of a choice-glitch in one of a pair of in-phase sequences: (Regime I) Both sequences flip to opposite-percept. (Regimes II,III) Eventual return to previous in-phase sequences. (Regime IV) Only glitched sequence flips, yielding an anti-phase pair. When starting from anti-phase choice-sequences: (i.e., in III,IV): Glitched sequence flips, thus producing an in-phase pair. Two smaller panels (right) show that (lower panel) reducing the stimulus-on time T 1 roughly shifts the regime boundaries to higher T 0, whereas (upper panel) increasing the secondary coupling parameter η 2 mostly just removes the low-T 0 downturn in the (black-dashed) boundary for existence of anti-phase sequence pairs. See text for explanation of the underlying dynamics and mechanisms

We note that the large-T 1 behavior of this critical line is an exponential decay of η 1 with T 1, reflecting the adaptation time constant (which we took as our unit of time): For T 1 ≫ 1, we can approximate \(A^{\ast} \approx e^{-2T_0 -T_1} Q[0] \ll B\), so the anti-phase existence condition simplifies to

$$ \eta_1 < e^{-T_0-T_1} \left( B +\eta_2 Q[0] e^{-T_0}\right) \ . $$
(24)

In fact, since η 2 is of the same order as η 1, the asymptotic bound simplifies further, to \(\eta_1 < e^{-T_0-T_1} B\). This may provide experimental access to a simple relation between some of the effective model parameters.

4.2.2 Stability

Since the dynamics (Eqs. (11) and (12)) is independent of the mutual phase of the two interleaved choice-sequences, we assume without loss of generality that both consist of percept-1 choices. Thus, we may drop the k index. For example, the fixed points are \(A_{1,k}^{\ast}= A_1^{\ast}>0\) and \(A_{2,k}^{\ast}=A_2^{\ast}=0\). To study the dynamical stability, we study the dynamics of small perturbations d i , i.e., we write \(A_i(t)=A_i^{\ast} +d_i\), and expand the adaptation-boost function as \(Q[A_i] = Q[A_i^{\ast}] + q_i d_i + \mathcal{O}(d_i^2)\), introducing the ‘slope’ q i .

Substitution into the dynamics (Eqs. (11) and (12)) yields

$$ d_1(m) = e^{-2T_0 -T_1} (q_1 +e^{-T_1}) d_1(m-2) $$
(25)
$$ d_2(m) = e^{-2(T_0 +T_1)} d_2(m-2) $$
(26)

Writing each of these as d i (m) = λ i d i (m − 2), the stability conditions are | λ i | < 1. As expected, the d 2-dynamics is unconditionally stable, since both T 0,T 1 > 0. For d 1, we note that Q is a decreasing function of A, so q 1 < 0, and we have at least λ 1 < 1. Satisfaction of the lower bound λ 1 > − 1 is less self-evident, but numerical exploration shows that q 1 > − 0.2 for parameters that produce hitherto observed behavior, and that the q 1 < − 1 regime remains far below the q 1-range for any viable parameter set.
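The stability check amounts to evaluating the two multipliers of Eqs. (25) and (26). A small Python sketch, with hypothetical function names and example values, is:

```python
import math


def multipliers(q1, T0, T1):
    """Return (lambda_1, lambda_2) of the linearized 2-timestep map.

    lambda_1 follows from Eq. (25), lambda_2 from Eq. (26); q1 is the
    slope of Q at the fixed point.
    """
    lam1 = math.exp(-2.0 * T0 - T1) * (q1 + math.exp(-T1))
    lam2 = math.exp(-2.0 * (T0 + T1))
    return lam1, lam2


def is_stable(q1, T0, T1):
    """Linear stability requires both multipliers inside the unit interval."""
    lam1, lam2 = multipliers(q1, T0, T1)
    return abs(lam1) < 1.0 and abs(lam2) < 1.0
```

For q 1 = −0.2 and T 0 = T 1 = 1, both multipliers are small and positive, so perturbations die out within a few stimulus periods. An instability would require a much steeper slope combined with very short intervals, e.g. the artificial case q 1 = −3 with T 0 = T 1 = 0.01.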

4.3 Effects of noise or perturbations: “Glitch”-responses

Neural noise, or a visual bias pulse designed to probe the system in a more controlled manner, only starts to affect a stable percept-choice sequence when it causes a “glitch” (a percept that breaks the predicted pattern). The binary nature of each choice-event effectively ‘collapses’ all types of perturbations of the underlying neural dynamics (affecting either the H or A-variables or both) onto a unified and easily measurable response—see the Appendix for mechanistic analysis of these stochastic processes. This property of choice-dynamics provides a very welcome opportunity: We can already classify and analyze at the level of choice-sequences the generic types of dynamical consequences common to all such perturbations, without being blocked by the vastly more complicated task of computing how the probabilities of glitches and their consequences depend on the stimulus, network and noise parameters. Moreover, the problem is as yet experimentally vastly underconstrained, both in sequence statistics and in all essential parameters. Indeed, our present analysis of response-types and regimes is a prerequisite for future attempts to compute these statistics. Most importantly for the short term, our analysis yields a range of new predictions that provide direct experimental access to the crucial mechanisms of coupling between and within interleaved choice-sequences, e.g. by using specifically designed stimulus-bias pulses to induce a glitch.

With only neural noise, ICS-systems studied so far appear to produce choice-repetition runs of at least several cycles, so we can capture the important behavior by analyzing the dynamical consequences of isolated glitches.

4.3.1 Starting from in-phase choice-sequences

Assume that both sequences were repeating percept-1, say, but that a ‘glitch’ (percept-2 choice) occurs at onset m = 0. Thus, we set C 1,2(0) = 0, C 2,2(0) = 1, despite the fact that the adaptation state \(A_{1,2}(0)=A^{\ast}>0\), A 2,2(0) = A 2,1(0) = 0 and \( A_{1,1}(0)=e^{(T_0+T_1)} A^{\ast}\), would normally have yielded a percept-1 choice. We trace all possible consequences of this.

At onset m = 1

The k = 1 adaptation values are not yet affected by the glitch, but the k = 2 values are. Thus, we have

$$\begin{array}{lll} A_{1,1}(1) &=& A^{\ast}, \quad\quad\quad\;\: \quad A_{2,1}(1) = 0 \ ,\\ A_{1,2}(1) &=& e^{-T_0-T_1}A^{\ast}, \ \quad A_{2,2}(1) = e^{-T_0}Q[0] \ . \end{array}$$

Without the glitch, evaluating C i,1(1) would have yielded a percept-1 choice, but the ‘crosstalk’ from the changed k = 2 adaptation state may cause the k = 1 choice sequence to flip at this point. Whether this happens or not clearly depends directly on the time-parameters T 0, T 1, but also on all underlying model parameters via the effective crosstalk parameters η 1, η 2 and the function Q[A]. The blue line in Fig. 8 delineates the regime (I) in T 0, η 1 space (for realistic other parameters) where the glitch at m = 0 indeed causes a flipped choice in the other sequence at m = 1. Note that the blue line lies within the exclusive in-phase regime (I+II).

At onsets m ≥ 2

The k = 2 adaptation states are still affected by the glitch, independently of whether the k = 1 sequence flipped at m = 1 or not. Thus we have

$$ A_{1,2}(2) = e^{-2(T_0+T_1)}A^{\ast} \; , \; A_{2,2}(2) = e^{-T_1-2T_0}Q[0] \ . $$
(27)

This shift in adaptation balance from percept 1 towards percept 2 reduces the previously existing bias towards choosing percept 1, given that the system was in its repetition-stabilized regime. However, the actual choice, as defined by C i,2(2), also depends on the ‘crosstalk’ effect captured by the A i,1(2)-balance.

Now we need to distinguish several nested types of history, based firstly on the percept-choice outcomes at m = 1, and within these, on the choice occurring at m = 2. Further case-distinctions based on the m > 2 choices will prove unnecessary.

Case 1

If a choice-flip did occur at m = 1, i.e., if the (k = 1)-sequence chose percept-2, the A i,1 states are affected (for the first time), and we find

$$ A_{1,1}(2) = e^{-T_0-T_1}A^{\ast} \; , \; A_{2,1}(2) = e^{-T_0}Q[0] \ . $$
(28)

Note that the ratio between these ‘crosstalk’ terms is the same as the ratio between the two main adaptation terms A i,2 because they both arise from the same stimulus sequence, which however arrives with a smaller input-weight ξ onto the shared population. Evaluation of C i,k (2) yields percept-2 choices throughout this regime.

For all subsequent onsets, the same logic applies, with even stronger imbalances towards percept-2: Each choice at m > 2 is at least as biased towards percept-2 as the already computed choices at m − 2 > 0. Hence, the perceptual result in regime-I is that a single choice-glitch flips both choice-sequences. This is as expected from the fact that regime-I lies inside the regime where only in-phase choice-sequences exist (above the black-dashed line).

Case 2

If no choice-flip occurred at m = 1, i.e., if sequence-1 chose percept-1 as usual, the A i,1 states are not affected, so we have

$$ A_{1,1}(2) = e^{(T_0+T_1)}A^{\ast} \; , \; A_{2,1}(2) = 0 \ . $$
(29)

Now we have a conflict between this ‘crosstalk’ imbalance towards percept-1 and the main (k = 2) adaptation imbalance towards percept-2. Evaluating C i,k (2) yields the green line in Fig. 8, separating the conditions for choosing percept-i at m = 2, as follows.

Above the green line, percept-1 is chosen. The net effect is that the glitch at m = 0 is effectively ignored; this ‘resilience’ is essentially due to the coupling with sequence-1, which in this regime did not flip at m = 1 (and a fortiori simply continues repeating percept-1). As before, subsequent choice events further stabilize the now restored initial situation: Both choice-sequences continue repeating percept-1. Note that the region (between the green and blue lines in Fig. 8) where this occurs includes regime-III where anti-phase repetition also exists as a stable solution. Evidently, a single glitch is insufficient to reach the basin of attraction of this solution. For this to occur, one needs to go to the only remaining regime, analyzed below.

Below the green line, percept-2 is chosen. Again, subsequent choices simply stabilize the new situation: Sequence-1 repeats percept-1, and sequence-2 continues to repeat percept-2. Thus, the single glitch now initiates an anti-phase choice repetition pattern, by flipping the sequence in which it occurs, but not the other.
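The case analysis above can be replayed numerically by iterating the A-map from the in-phase fixed point with a forced glitch at m = 0. The sketch below is illustrative only: `Q_demo` is a hypothetical stand-in for Q[A], the parameter values in the usage note are placeholders, and sequences and percepts are reported with the 1-based labels used in the text.

```python
import math


def Q_demo(A):
    """Hypothetical stand-in for Q[A]: positive and decreasing in A."""
    return 1.0 / (1.0 + A)


def fixed_point(Q, T0, T1, n_iter=200):
    """Iterate the 2-timestep map (Eq. (18)) from A = 0 to approximate A*."""
    A = 0.0
    for _ in range(n_iter):
        A = math.exp(-2.0 * (T0 + T1)) * A + math.exp(-2.0 * T0 - T1) * Q(A)
    return A


def choose(A, k, B, eta1, eta2):
    """Chosen percept (0 or 1) in sequence k (0 or 1), following Eq. (13)."""
    l = 1 - k
    z = ((A[0][k] - A[1][k])
         * (B - A[0][k] - A[1][k] + eta2 * (A[0][l] + A[1][l]))
         + eta1 * (A[0][l] - A[1][l]))
    return 0 if z > 0.0 else 1


def run_glitch(T0, T1, B, eta1, eta2, Q=Q_demo, n_onsets=12):
    """Start from in-phase percept-1 repetition, force percept 2 in
    sequence 2 at m = 0, and return the list of (sequence, percept)."""
    A_star = fixed_point(Q, T0, T1)
    decay = math.exp(-(T0 + T1))
    # A[i][k]: sequence 2 is at its onset value A*, sequence 1 was boosted
    # one onset earlier and therefore holds A* / decay.
    A = [[A_star / decay, A_star], [0.0, 0.0]]
    choices = []
    for m in range(n_onsets):
        k = 1 if m % 2 == 0 else 0  # even m: sequence 2, odd m: sequence 1
        i = 1 if m == 0 else choose(A, k, B, eta1, eta2)
        boost = math.exp(-T0) * Q(A[i][k])
        A = [[decay * A[p][s] for s in range(2)] for p in range(2)]
        A[i][k] += boost
        choices.append((k + 1, i + 1))
    return choices
```

With T 0 = T 1 = 1, B = 2 and both couplings switched off, only the glitched sequence flips, producing an anti-phase pair; with a strong coupling such as η 1 = 1 (and η 2 = 0), sequence-1 flips as well at m = 1 and both sequences continue in percept-2. Intermediate couplings can be explored in the same way to locate the regime boundaries for a given choice of Q.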

4.3.2 Starting from anti-phase choice-sequences

Without loss of generality, we may assume that sequence-k was repeating percept-i with i = k, and that the glitch still occurs at m = 0. Thus, we set C 1,2(0) = 1, C 2,2(0) = 0, despite the adaptation states \(A_{1,2}(0)=A_{2,1}(0)=0 ,\, A_{2,2}(0)=A^{\ast}>0 ,\, A_{1,1}(0)=e^{(T_0+T_1)} A^{\ast}\), which would normally have specified a percept-2 choice.

At onset m = 1, the sequence-1 adaptation states are unaffected (favoring percept-1), but the sequence-2 values have become

$$ A_{1,2}(1)= e^{-T_0}Q[0] \; , \; A_{2,2}(1)=e^{-T_0-T_1}A^{\ast} \ . $$
(30)

Clearly, crosstalk from these terms favors percept-2 less than before the glitch, so sequence-1 is now biased even more than normal towards choosing percept-1.

At onset m = 2, we therefore only need to examine the consequences of the glitch in sequence-2, while sequence-1 continues in percept-1. The k = 1 adaptation states are still unaffected at this point, but the k = 2 states are given by

$$ A_{1,2}(2) = e^{-T_1 -2T_0}Q[0] \ , \ A_{2,2}(2) = e^{-2(T_0 +T_1)}A^{\ast} $$
(31)

As expected given this shift in balance towards percept-1, numerical evaluation of C i,2(2) shows that percept-1 is chosen for all realistic parameters. Thus, sequence-2 now has two consecutive percept-1 choices, and subsequent evolution only stabilizes this repetition more deeply (compare our earlier linear stability results). Hence the overall perceptual outcome is that a single choice-glitch in a linearly stable anti-phase sequence pair will change it into an in-phase pair.

4.4 Verification of A-map prediction robustness

The various types of glitch-responses we have discovered by exploiting the analytical simplicity of the iterated A-maps offer attractive opportunities for new experimental tests that probe the underlying neural mechanisms. Given that these predictions are made on the basis of our substantial and formally inexact reduction of many-parameter continuous-time models to highly simplified iterated A-maps, we finish our modelling by verifying the novel A-map predictions about glitch-responses within our much less reduced 6-population ODE-model. We find that these simulation results (see Figs. 9 and 10) reproduce each of the analytical predictions of the A-map dynamics, modulo some parameter rescaling expected under such model reduction. The main effects have also been confirmed in the quasi-continuous neural-field models we started with (Eqs. (1), (2), (5) and (6)). Without going through our two steps of brutal model-reduction, it would have been very hard to discover these predictions, and even harder to understand the crucial mechanisms behind them.

Fig. 9

Glitch-response types in the 6-population model match all types predicted by the reduction to the iterated A-map. \(H_{i,k}\) and \(A_{i,k}\) are shown in red/blue for k = 1/2, and \(h_i\) and \(a_i\) in green, with the i = 2 signals plotted as negative. In both columns, both sequences were repeating percept-1 until the glitch forces sequence-1 to percept-2 at t = 0. Left column: Sequence-2 also flips, starting an in-phase percept-2 sequence pair. Right column: Sequence-2 does not flip, and then restores sequence-1 to its previous phase.

Fig. 10

Glitch-response types in the same model and with the same plot-coloration as Fig. 9, under different conditions. Left column: Starting from an anti-phase pattern, only sequence-1 flips, thus creating an in-phase pattern. Right column: Starting from an in-phase pattern, flipping only sequence-1 creates an anti-phase pattern, which is stable in this regime (and beyond).

5 Discussion

Our nested set of models has yielded increasingly detailed insight into the neural network structure and dynamics that is probed by pairs of temporally interleaved percept-choice sequences. Each stage of model-reduction preserves the essential aspects of the phenomena while condensing the dynamical mechanisms into an increasingly simplified form that allows more detailed analytical and numerical investigation. This strategy has yielded both a unified and precise picture of the mechanism behind the existing data, and a firm basis for predictions that invite a wide variety of future experiments.

In summary, we showed how the various phenomenological interactions between interleaved choice-sequences depend crucially on the subset of neurons that are ‘shared’ between the neural populations driven by the stimuli of the two sequences, and that take part in resolving the perceptual ambiguity at each stimulus-onset by among-population competition. First of all, we noted that interactions between temporally well-separated choice-sequences can rely on the same adaptation-based ‘priming’ mechanisms that underlie the strong within-sequence choice-correlations, a mechanism that we already modelled (Noest et al. 2007) and that has been tested and refined in a wide array of experiments (Pearson and Brascamp 2008). However, additional network properties are crucially required for inter-sequence interactions: The ‘shared’ set of neurons whose adaptation-based priming implements the inter-sequence interaction is defined by requiring (i) that they are driven by stimuli from both sequences, and (ii) that during each (sequence-specific) ON-interval, they are synaptically coupled (by sufficiently strong cross-inhibition) with the presently activated main population, such that the percept-choice dynamics occurs in unison between the main and the shared population. The size of this shared neural subset obviously depends on the inter-sequence stimulus similarity (featural or spatial), and thus modulates the strength of inter-sequence interaction.

The crucial difference between our adaptation-based interactions and other coupling mechanisms that are carried only by ‘fast’-timescale activity is nicely illustrated by an experiment reported recently: Klink et al. (2009) studied interactions between two spatially separated percept-choice sequences (ambiguous structure-from-motion cylinders). In most cases, the two sequences were synchronously presented, and this revealed coupling across gap sizes up to roughly the stimulus diameter. However, this coupling vanished when the two sequences were presented in a temporally interleaved fashion. We attribute this to the absence of the ‘shared’ neuron set that is crucial to our type of inter-sequence interactions: At spatial gap sizes of the order of the stimulus-diameter (which limits the largest RF-size of activated neurons), there will be very few neurons with shared input and/or cross-inhibitory coupling that spans across the gap.

In an earlier experiment using choice-sequence pairs with stimuli that were both spatially well-separated and temporally interleaved (Chen and He 2004), a systematically anti-phase relation was found between the sequences. This can now be understood as reflecting weak inter-sequence coupling due to the near-absence of ‘shared’ neurons: We found that both anti-phase and in-phase sequences can then persist, so the phase relation that actually occurs will be determined by a combination of local bias and noise. In fact, local and idiosyncratic choice-biases have been found (Carter and Cavanagh 2007) and a recent experiment (Knapen et al. 2009) has shown that inter-sequence interactions only over-ride these local biases when the stimuli are partly overlapping, and become negligible for inter-stimulus gaps of roughly the stimulus diameter.

Our modelling also applies to a class of choice-sequence experiments which formally define their stimuli as forming a single sequence of percept-choices between binocularly rivalrous stimuli, but in which the eye-specific stimuli are swapped between the eyes on alternate presentations, e.g. some cases in Chen and He (2004), Pearson and Clifford (2004) and Grossmann and Dobbins (2006), and more extensive experiments in Sandberg et al. (2011; published during revision of our paper). We note that such stimulus sequences can be considered as two temporally interleaved sequences whose respective competing stimuli just ‘happen to be’ each other's eye-swapped copies. To monocular neurons, such stimuli are no different from any other spatially coincident 2-ICS stimuli which drive neural populations with little or no feature-overlap (or else they would not be binocularly rivalrous). The special (eye-swapped) relation between the two sequence stimuli can only have an effect on perception via binocularly driven neurons and/or ‘shared’ monocular neurons (a subset expected to be small in these conditions). Given our model predictions, this opens up a new approach to probing how rivalrous stimuli drive monocular and/or binocular neurons, and whether and how these neural populations are mutually coupled by inhibition and/or other fast interactions: The simplest case would be that the two stimuli drive non-overlapping and purely monocular neuron populations; this would predict equally stable in-phase and anti-phase interleaved sequences. Note that the anti-phase sequence pair would only appear to be ‘pattern-stabilized’, but would in fact consist of two interleaved, purely eye-based repetition sequences which just happen to be in mutual anti-phase with respect to eye-dominance at each choice event. Short stimulus perturbations that cause a single choice-glitch should then only flip that single choice-sequence and not the other.
More generally, if the stimuli also drive some monocular or binocular neurons that satisfy our criteria for forming a ‘shared’ neuron population, these will manifest themselves via a reduced parameter-range in which the anti-phase solution is stable (see Fig. 8 and footnote 6), and via the predicted responses to an isolated choice-glitch (see Figs. 8, 9 and 10). As a caveat, we note that all our predictions assume that the inter-sequence blank durations \(T_0\) are sufficiently long to avoid fast-timescale (H-based) interactions across each eye-swap event. This excludes applying our models to ‘flicker-and-switch’ binocular rivalry experiments (Lee and Blake 1999), which use blank durations so short (\(T_0\) < 100 ms) that they allow fundamentally different temporal interactions, carried by the slower H-timescale of parvocellular neurons which contribute to binocular processing.

More generally, the generic predictions we derived from our proposed neural mechanism for ICS-dynamics should provide a well-characterized tool for probing potential overlaps and interactions between several other types of neural processing streams which operate partly in parallel. Whenever a perceptual choice sequence can be set up within each stream, our model predictions about glitch-responses and stability-domains for the in-phase and anti-phase solutions can be used as tools for quantifying the existence and connectivity of ‘shared’ neuron populations that bridge the (formal or real) gaps between populations with known feature- and space-selective responses. One class of examples concerns the neural encoding of various feature dimensions: Besides quantifying the so far only qualitatively known interactions within the populations that encode orientation- or motion-direction and spatial positions, one could probe the connectivity within and between populations coding for color, stereo-depth, curvature, etc. An early study of this type (Grossmann and Dobbins 2006) used eye-swapping of colored ambiguous structure from motion (SFM) stimuli to show that choices between the competing SFM percepts can be decoupled from the choices between the binocularly rivalrous colors of the same dots that specify the ambiguous SFM. Our predictions about timing dependence and glitch-response types might be used to probe the underlying neural connectivity and dynamics in more detail. Other examples involve the question whether and how 1st-order and 2nd-order visual processing, which is known to exist in both the orientation and motion-direction domains, occurs in essentially separate stages or streams (with only their outputs converging at some late stage), or whether they interact, perhaps competitively, at specific earlier stages.
Similar questions exist with respect to the parvo-cellular and magno-cellular processing streams, and with respect to the stages at which signals from various sensory modalities (vision, audition, touch, etc.) interact. Any neural subsets shared between such formally different processing streams may thus be probed in new, probably informative ways by driving pairs of interleaved choice-sequences with stimuli that are considered to excite only one or the other of the two processing streams, and testing for inter-stream interactions manifested in the various glitch-response types predicted by our models.