Decoding spatiotemporal spike sequences via the finite state automata dynamics of spiking neural networks

Temporally complex stimuli are encoded into spatiotemporal spike sequences of neurons in many sensory areas. Here, we describe how downstream neurons with dendritic bistable plateau potentials can be connected to decode such spike sequences. Driven by feedforward inputs from the sensory neurons and controlled by feedforward inhibition and lateral excitation, the neurons transit between UP and DOWN states of the membrane potentials. The neurons spike only in the UP states. A decoding neuron spikes at the end of an input to signal the recognition of specific spike sequences. The transition dynamics is equivalent to that of a finite state automaton. A connection rule for the networks guarantees that any finite state automaton can be mapped into the transition dynamics, demonstrating the equivalence in computational power between the networks and finite state automata. The decoding mechanism is capable of recognizing an arbitrary number of spatiotemporal spike sequences, and is insensitive to the variations of the spike timings in the sequences.


Introduction
Meanings of temporally complex stimuli such as speech are embedded in patterns over time. Sensory neurons in areas such as the primary auditory cortex [1,2] and the retina [3,4] detect stimulus features within short time windows, and spike reliably whenever their preferred features appear. Different features are preferred by different sensory neurons. A stimulus that varies over a time span much longer than the time windows of the features drives spatiotemporal sequential spikes in the sensory neurons. Thus, a population of sensory neurons encodes complex temporal stimuli into spatiotemporal spike sequences [5]. How the brain's neural networks read out such codes is an important unsolved problem [6]-[14].
In this paper, we propose a decoding mechanism based on the spiking dynamics of biological neural networks. We note that spatiotemporal spike sequences are analogous to text strings: assigning a unique letter to each sensory neuron (or to each group of sensory neurons that detect the same feature) transforms the spike sequences into strings of these letters (figure 1). Decoding spatiotemporal spike sequences can thus be done similarly to recognizing written words. Finite state automata (FSA) are powerful mathematical models of machines that recognize a class of written words that belong to regular languages [15,16]. We show that the spiking dynamics of networks of neurons with transient bistable dendritic plateau potentials, which produce UP and DOWN states of the membrane potentials, can behave like FSA for decoding spatiotemporal spike sequences. Transitions between the UP and DOWN states are controlled by feedforward excitations and inhibitions from the sensory neurons, as well as lateral excitations between the neurons in the networks. A general connection rule is shown to ensure the mapping of any finite state automaton to the transition dynamics of the network. Our networks have the same computational power as FSA, and can recognize spatiotemporal spike sequences as complex and general as all text strings recognizable by FSA.
Our mechanism is biologically realistic. Transient bistable dendritic plateau potentials are observed in cortical neurons [17,18]; lateral excitation, feedforward excitation and feedforward inhibition are general features of cortical microcircuitry [19]. This is unlike previous studies on relating neural computations to FSA, which were mostly performed on artificial neural networks [20]. There are other biologically plausible mechanisms of recognizing temporal patterns, such as those based on tapped delay lines [8], synaptic depression [9], detection of synchrony [10], transient network dynamics [11], or single neuron integration [13,14]. Our mechanism is distinguished by its ability to process temporal patterns of arbitrary length, its tolerance of the timing variations of the input spikes, and its capability of recognizing, if necessary, an infinite number of patterns with common characteristics. It is an extension of our previous work on a similar mechanism that could recognize only a finite number of spatiotemporal spike sequences [12]. FSA is a key part of the Turing machine that is capable of universal computation [15], and also has been widely used in engineering systems such as natural language processing and speech recognition [16]. Our mechanism is a concrete demonstration that biological neural networks are computationally at least as powerful as FSA.

FSA
A FSA consists of a finite number of discrete states. Among them, one is designated as the start state, one or more as the end states, and one as the ground state φ. Driven by sequential input letters from an alphabet, the FSA changes its states according to a transition table that decides the next state given the current state and input letter. A string of input letters may drive the FSA from the start state to one of the end states. If so, the string is said to be 'recognized' by the FSA; otherwise, the string is said to be 'rejected'. An example of a FSA that recognizes words in a 'sheep language' (strings starting with b, followed by one or more a's and ending with !) [16] is shown in figure 2. Consider the string baaaa!. At the beginning, the FSA is in the start state S1; b induces a transition from S1 to S2; the first a from S2 to S3; subsequent a's from S3 to S3; and finally, ! from S3 to S4. Since ! is at the end of the input and S4 is an end state, the string is recognized. The number of a's in between b and ! can be any number larger than zero, and all such 'sheep words' are recognized. Consider any string that starts with a. From S1, a induces a transition to the ground state φ, and the FSA stays in this state regardless of the subsequent inputs; S4 is not reached at the end, and the string is rejected. Similarly, a string like bbaaa! is rejected because the FSA goes to φ after the second b. Incomplete sheep words such as baaa are also rejected, since at the end the state is S3, not S4. The FSA rejects all strings that are not sheep words. A key to the FSA computation is the AND operation: from a given state, the FSA can go to different states depending on the input. For instance, the sheep language automaton goes from S3 to S3 if the input is a, to S4 if it is !, and to φ if it is b. Recognition of specific strings is thus achieved through transitions based on state and input pairs.
The ground state φ is a special state; once a FSA goes to that state, it remains there. Reaching the ground state at any point of an input signals rejection. The end state can be reached during an input; but if it is not reached at the end, the input is still rejected.
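As a concrete illustration, the sheep language FSA of figure 2 can be written down as a transition table and run on strings (a minimal sketch in Python; the table and state names follow the text, but the code itself is ours):

```python
# Sheep language FSA (figure 2): states S1 (start), S2, S3, S4 (end),
# plus the absorbing ground state 'phi'. Transitions missing from the
# table send the automaton to the ground state.
TRANSITIONS = {
    ('S1', 'b'): 'S2',
    ('S2', 'a'): 'S3',
    ('S3', 'a'): 'S3',
    ('S3', '!'): 'S4',
}
END_STATES = {'S4'}

def recognizes(string):
    state = 'S1'  # start state
    for letter in string:
        # Undefined (state, letter) pairs go to phi, which absorbs all input.
        state = TRANSITIONS.get((state, letter), 'phi')
    # Recognized only if an end state is reached at the end of the input.
    return state in END_STATES
```

For example, recognizes('baaaa!') returns True, while bbaaa! and the incomplete baaa are rejected.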

Neuron model
Experiments on excitatory neurons in the hippocampus [17] and the neocortex [18] demonstrate that a strong, excitatory, short (around 5 ms) pulse delivered at a distal branch of a dendrite can drive the membrane potential of the branch to a plateau potential that is transiently stable for 50-100 ms. This plateau potential then drives the membrane potential of the soma to a state that is depolarized about 10 mV above the resting potential of the neuron. We refer to this transiently sustained depolarized state as the UP state. Without the dendritic plateau potential, the membrane potential stays near the resting potential; this is referred to as the DOWN state. A strong inhibitory input to a dendrite in the plateau potential can terminate the plateau and switch the neuron from the UP state to the DOWN state. Our mechanism of decoding spatiotemporal spike sequences relies on these properties of cortical neurons: the idea is that the UP state of a neuron corresponds to a state of a FSA.
A common approach of modeling a neuron with dendritic structure is to approximate segments of the dendrites and soma as compartments that are connected according to the morphology of the neuron, each compartment described by its own membrane potential, ion channels and synaptic inputs. Such a model can be complex, with numerous compartments and detailed distributions of ion channels [21], if the aim is to accurately describe the morphological and biophysical properties of single neurons. Our goal here is different. We aim to show how dendritic plateau potentials can be utilized for decoding spike sequences with neural networks. We therefore choose to construct a simple multi-compartment model that incorporates only the neuron properties essential for our mechanism.
Our model of an excitatory neuron consists of a soma and five distal dendrites (the number of dendritic compartments is not critical). The dendrites are connected to the soma with ohmic conductance. The soma is a leaky integrate-and-fire unit: it generates a spike when its membrane potential exceeds a threshold. After the spike, the membrane potential resets to a reset potential, and remains there for a refractory period of 5 ms. The soma also contains A-type potassium conductance [22,23], which is activated near the resting potential and inactivated at more depolarized potentials. This conductance enhances the robustness of our recognition mechanism. The dendrite is a leaky integrator as well, but does not generate spikes. All compartments have alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) and gamma-amino-butyric-acid (GABA) synaptic conductances, which mediate synaptic excitation and inhibition, respectively. A dendritic compartment also has N-methyl-D-aspartate (NMDA) synaptic conductance (excitatory), whose nonlinear dependence on the dendritic membrane potential leads to a transient bistable plateau potential that drives the UP state of the soma [17]. The properties of the excitatory neuron model are shown in figure 3.
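The appendix gives the full model equations. For orientation, the voltage dependence of an NMDA current that produces this kind of bistability is commonly modeled with a magnesium-block factor of the Jahr-Stevens form (the notation below is ours and illustrates this class of models, not necessarily the paper's exact equations):

```latex
I_{\mathrm{NMDA}} = - g_{\mathrm{NMDA}}\, B(V_d)\,\bigl(V_d - E_{\mathrm{NMDA}}\bigr),
\qquad
B(V_d) = \frac{1}{1 + \dfrac{[\mathrm{Mg}^{2+}]}{3.57\,\mathrm{mM}}\, e^{-V_d/(16.13\,\mathrm{mV})}}
```

Because B(V_d) rises steeply with depolarization, the net dendritic current-voltage curve can have two stable zero crossings when g_NMDA is large, corresponding to the DOWN and plateau (UP) equilibria of figure 3(D).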
All synaptic conductances follow a 'kick-and-decay' dynamics. The decay time constants of AMPA and GABA conductances are 5 ms, while that of NMDA is 100 ms; the latter determines how long the UP state is sustained without further inputs. At the time of a spike arrival, the conductance of a synapse that receives the spike increases by an amount equal to the strength of the synapse. In between spikes, the synaptic conductance decays exponentially. Noisy fluctuations of the membrane potentials are induced by random excitatory and inhibitory spike inputs at the soma and the dendrites. Unless specified otherwise, we set the somatic and dendritic membrane potentials to fluctuate with 1 mV standard deviation. The mathematical details of the excitatory neuron model and the synaptic dynamics are given in the appendix.
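The 'kick-and-decay' dynamics can be summarized in a few lines (a sketch; the function name and time-step handling are ours):

```python
import math

# Decay time constants from the text: AMPA and GABA 5 ms, NMDA 100 ms.
TAU_AMPA = TAU_GABA = 5.0   # ms
TAU_NMDA = 100.0            # ms

def step_conductance(g, dt_ms, tau_ms, spike_strength=0.0):
    """Advance one synaptic conductance by dt_ms: exponential decay between
    spikes, plus an instantaneous jump by the synaptic strength if a spike
    arrives at the end of the step."""
    return g * math.exp(-dt_ms / tau_ms) + spike_strength
```

For example, an AMPA conductance kicked to 1 decays to e^-1 ≈ 0.37 of its value 5 ms later, whereas an NMDA conductance decays by only about 5% over the same interval; this slow decay is what sustains the UP state between inputs.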
We model an inhibitory neuron as a single-compartment quadratic integrate-and-fire neuron [12,24] with a short AMPA time constant (1 ms), which makes it capable of responding with a short latency to an excitatory spike input (figure 4(B)). The details of the model can be found in [12]. The AMPA conductance on the inhibitory neuron is set to 0.6; there is no NMDA conductance.
All synaptic and ion conductances are scaled with the leak conductances of the corresponding compartments. [From the caption of figure 3: when the input is strong, the membrane potentials sustain large depolarized plateau potentials (the UP state). (D) The UP state is driven by a nonlinear current mediated through the NMDA conductance in the dendrite; curves of the net dendritic current as a function of the membrane potential V_d, for different levels of NMDA conductance, show that when the NMDA conductance is large there are two stable equilibrium membrane potentials (arrows).]

Network implementation of FSA
For any given FSA, we can wire the neurons in the following way to make the spiking dynamics of the network isomorphic to the FSA.
The number of states in the FSA equals the number of excitatory neurons. State Si of the FSA is represented by neuron Ni being in the UP state. The ground state φ corresponds to all neurons being in the DOWN state.
For each letter in the alphabet of the FSA, there is one sensory neuron (or one group of sensory neurons) whose spikes indicate the detection of a distinctive feature (the neuron or the group is labeled with the letter). In addition, there is a 'start neuron' s that detects the beginning of a sensory input, and an 'end neuron' e that detects the end. A spike from a sensory neuron corresponds to a letter input to the FSA. The state transitions are implemented through a network structure that agrees with the observed microcircuitry of cortical neural networks [19]. All sensory neurons excite an inhibitory neuron, which in turn inhibits the somata (synaptic strength G5) and dendrites (G4) of all excitatory neurons (figure 4(A)). The inhibitory neuron spikes once, with a delay of 2 ms, at each input spike, thus providing a global feedforward inhibition to all dendrites and somata time-locked to each sensory spike (figure 4(B)). Sensory neurons also selectively send feedforward excitatory inputs to the somata (G1) and the dendrites (G2) (figure 5). Excitatory neurons selectively connect to the dendrites (G3) to provide lateral excitation (figure 5). A dendrite receives input from at most one sensory neuron and one excitatory neuron. G1 is in a range such that an input spike to the soma makes the neuron spike if it is in the UP state but not if it is in the DOWN state; G2 and G3 are such that a dendrite cannot jump to the plateau potential if it receives only one spike from either a sensory neuron or an excitatory neuron, but can if it receives both; G4 is large enough for a feedforward inhibitory spike to turn off the plateau potential at a dendrite in all cases except when the dendrite receives both feedforward and lateral spikes.
The AND operation Si × h → Sj, which means a transition from Si to Sj if the input is h, is implemented by connecting sensory neuron h to the soma of neuron Ni, and both h and Ni to a dendrite of neuron Nj. There are no other excitatory connections to the dendrite. If Ni is in the UP state and h spikes, the dendrite goes to the plateau potential and makes Nj go to the UP state. Meanwhile, unless i = j, all dendrites of Ni are reset to the resting potential by the feedforward inhibition, and Ni returns to the DOWN state. This completes the operation.
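The calibration of G1-G4 described above amounts to two coincidence conditions, which can be caricatured as boolean functions (our abstraction; the real model is the membrane dynamics in the appendix):

```python
def soma_spikes(in_up_state, ff_soma_spike):
    # G1 is chosen so that a feedforward spike to the soma elicits a spike
    # only when the neuron is in the UP state; in the DOWN state the
    # A-type potassium current keeps the soma subthreshold.
    return in_up_state and ff_soma_spike

def dendrite_in_plateau_after_input(ff_dendrite_spike, lateral_spike):
    # Every sensory spike also triggers the global feedforward inhibition
    # (G4), which resets a dendrite to rest unless the dendrite receives
    # both a feedforward excitatory spike (G2) and a lateral spike (G3).
    return ff_dendrite_spike and lateral_spike
```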
At any moment, at most one neuron is in the UP state, and at most one dendrite of the neuron is in the plateau potential because different dendrites get inputs from different pairs of sensory and excitatory neurons.
The time intervals between input spikes are restricted to a range, since the transition between the UP and DOWN states takes a finite amount of time, and the UP state decays to the DOWN state spontaneously if the interval is too long. We set the range to 30-80 ms. The start neuron s connects only to a dendrite of N1, with synaptic strength G0 (figure 5), which is strong enough to produce the plateau potential despite the feedforward inhibition. A spike from s makes N1 go to the UP state, and the network is in the start state S1. The end neuron e connects to the somata of all neurons corresponding to the end states. When e spikes, one of these end state neurons spikes if it is in the UP state, indicating recognition of the input spike sequence. The recognition is signaled by the coincident spiking of e and an end state neuron; spiking of the end state neurons before e spikes does not count.

The values of the synaptic strengths can be found as follows. Firstly, determine the range of G1 by requiring that a feedforward input spike should not make the neuron spike in the DOWN state but should in the UP state. The A-current in the soma, which is inhibitory, is active in the DOWN state but inactive in the UP state; this makes the range of G1 large. Secondly, determine the lower limit for G4: it must be large enough that a spike from the inhibitory neuron can terminate the dendritic plateau potential and the UP state. Thirdly, for a value of G4 beyond the lower limit, determine a range for G2 and G3 using two requirements: (1) they should be small enough that a feedforward excitatory spike or a lateral spike alone cannot prevent the inhibitory spike from shutting down the plateau potential; (2) they should be large enough that a feedforward excitatory spike and a lateral excitatory spike together can produce the plateau potential from the resting potential despite the inhibition. Setting G2 = G3 simplifies the search.
Fourthly, G0 is set to a value large enough to produce a plateau potential when s spikes. Finally, the role of the inhibition to the soma is to reduce the possibility that a neuron spikes during the transition from the DOWN to the UP state due to the transient excitations at the dendrites, which can be large if G2 and G3 are large; G5 is set accordingly. We have confirmed that there is a wide range of parameters satisfying these criteria. For the rest of the paper, we set G1 = 2.5, G2 = G3 = 3, G4 = 5, G5 = 5 and G0 = 5.
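The connection rule can also be stated algorithmically. The sketch below compiles a transition table into the connection lists of the rule, and then runs the resulting network at the level of UP/DOWN states (our abstraction: synaptic strengths, membrane dynamics and spike timing are omitted, and the ground state is represented by None):

```python
def compile_fsa(transitions):
    """For each AND operation Si x h -> Sj: connect sensory neuron h to the
    soma of Ni, and both h and Ni to a dedicated dendrite of Nj."""
    soma_inputs = {}   # neuron -> set of sensory letters wired to its soma
    dendrites = []     # (owner neuron, feedforward letter, lateral source)
    for (si, h), sj in transitions.items():
        soma_inputs.setdefault(si, set()).add(h)
        dendrites.append((sj, h, si))
    return soma_inputs, dendrites

def run_network(transitions, letters, end_states, start='S1'):
    """A dendrite reaches the plateau only if it gets both its feedforward
    spike (the current letter) and its lateral spike (from the UP neuron);
    otherwise the feedforward inhibition sends the network to the ground
    state (None), from which it cannot recover."""
    _, dendrites = compile_fsa(transitions)
    up = start  # the spike from the start neuron s puts N1 in the UP state
    for h in letters:
        matches = [owner for (owner, ff, lateral) in dendrites
                   if ff == h and lateral == up]
        up = matches[0] if matches else None
    # The end neuron e spikes: recognition iff an end state neuron is UP.
    return up in end_states

# Example: the sheep language FSA of figure 2 as a transition table.
SHEEP = {('S1', 'b'): 'S2', ('S2', 'a'): 'S3',
         ('S3', 'a'): 'S3', ('S3', '!'): 'S4'}
```

Applied to the sheep language table, run_network accepts baaaa! and rejects ba!ba, mirroring the transitions described in the next section.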

The sheep language network
In figure 5, we show the network that implements the sheep language FSA (figure 2) and recognizes the spatiotemporal spike sequences that are analogous to the sheep words. There are four excitatory neurons, labeled N1 to N4, whose UP states correspond to the states S1 to S4. The network receives spike inputs from five sensory neurons: s, a, b, ! and e. One inhibitory neuron mediates the global feedforward inhibition; it receives inputs from all sensory neurons and connects to all somata and dendrites of the neurons (not shown in the figure). s is connected to a dendrite D11 of N1; this corresponds to the start arrow in the FSA transition diagram. b connects to the soma of N1 and to a dendrite D21 of N2, which also receives a connection from N1; this implements S1 × b → S2. a connects to the soma of N2 and to a dendrite D31 of N3, which also gets a connection from N2; this corresponds to S2 × a → S3. a also connects to the soma of N3 and to another dendrite D32, which gets an autaptic connection from N3; this implements S3 × a → S3. ! connects to the soma of N3 and to a dendrite D41 of N4, which is connected from N3; this is S3 × ! → S4. Finally, e connects to the soma of N4, the end state neuron.
In figure 6(A), we illustrate the spiking dynamics of the network driven by the input spike sequence sbaaaa!e, which is analogous to the word baaaa! in the sheep language. The intervals between the input spikes are randomly chosen between 30 ms and 80 ms. The input sequence successively turns on the UP states of N1, N2, N3, N3, N3, N3 and N4, and at the spike from e, N4 spikes, indicating recognition of the input spike sequence.
From S3, the sheep language FSA stays in S3 if the input is a, and goes to S4 if the input is !. In the network, the corresponding transitions are carried out through activation of different dendrites, depending on the sources of the excitatory spikes. The soma of N3 gets input from both a and !, and thus can spike in the UP state if the input spike is from either a or !. However, a and ! connect to different dendrites: a to D32 of N3, and ! to D41 of N4; D32 and D41 are also connected from N3. Thus, an input from a drives D32 to the plateau state and N3 stays in the UP state, whereas an input from ! drives D41 to the plateau state and N4 to the UP state. The dendrites hence implement the AND operations, and play a critical role in directing the network from the same state to different states under different inputs. It is easy to see that the network recognizes any input spike sequence of the form sba...a!e, which are analogous to the words in the sheep language.
The network rejects all spike sequences that are not analogous to the words in the sheep language. An example is shown in figure 6(B), in which the input sequence is sba!bae. The first four spikes sba! of the sequence successively produce the UP states in N1, N2, N3 and N4. However, the next input is from b, which does not send a connection to the soma of N4. At this input, none of the excitatory neurons spikes, and hence none of the dendrites gets a lateral excitation; N4 returns to the DOWN state due to the feedforward inhibition, and the network goes to the ground state. Subsequent spike inputs cannot make any of the neurons go to the UP state, and the network remains silent. In particular, the spike from e does not make N4 spike. The input sequence sba!bae is thus rejected. Similar reasoning shows that all non-sheep-language sequences, such as sbbbaaba!!e, are rejected. Hence, the spiking dynamics of the network shown in figure 5 is isomorphic to the FSA shown in figure 2.

Parity checking network
Even with a small number of neurons, our network can detect spike sequences with nontrivial characteristics. Consider spike sequences from two sensory neurons a and b. The task is to recognize all sequences that contain odd numbers of spikes from both a and b, and to reject all others. Two spike sequences can differ by only one spike, for instance sbabbe and sbabbae, yet one should be recognized and the other rejected. The order of the spikes does not matter. For text strings, this parity-checking task can be solved with a FSA with four states, as shown in figure 7(A). The start state S1 goes to S2 with input b and to S4 with a; S2 goes to S1 with b and to S3 with a; S3 goes to S2 with a and to S4 with b; and S4 goes to S3 with b and to S1 with a. From the state transition diagram it is easy to deduce that the end state S3 is reached at the end only if the input string contains odd numbers of a and b.
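The parity-checking transition table can be verified exhaustively against its defining property (a sketch; the table transcribes figure 7(A) as described in the text):

```python
from itertools import product

# Parity FSA of figure 7(A): S1 is the start state, S3 the end state.
# The table is total over {a, b}, so no ground state is needed here.
PARITY = {
    ('S1', 'b'): 'S2', ('S1', 'a'): 'S4',
    ('S2', 'b'): 'S1', ('S2', 'a'): 'S3',
    ('S3', 'a'): 'S2', ('S3', 'b'): 'S4',
    ('S4', 'b'): 'S3', ('S4', 'a'): 'S1',
}

def parity_recognizes(string):
    state = 'S1'
    for letter in string:
        state = PARITY[(state, letter)]
    return state == 'S3'

def both_counts_odd(string):
    # The defining property: odd numbers of both a's and b's.
    return string.count('a') % 2 == 1 and string.count('b') % 2 == 1

# Exhaustive check over all strings of length up to 8.
assert all(parity_recognizes(''.join(w)) == both_counts_odd(''.join(w))
           for n in range(9) for w in product('ab', repeat=n))
```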
The corresponding network is shown in figure 7(B), which is wired by implementing the FSA following the rule described previously. In figure 8(A), we show the dynamics of the network with input sbbbbaaaabbabbbaae, which contains 7 a's and 9 b's. N3 spikes when e does, indicating recognition of the sequence. In figure 8(B), we show the case of input sababaaaabbbaabbe, which contains 8 a's and 7 b's. N3 does not spike when e does, indicating rejection.

The network can tolerate noisy fluctuations in the membrane potentials of the neurons to some extent. To show this, we vary the maximum strengths G_soma,noise and G_dendrite,noise of the random noise spikes, while keeping the standard deviations of the fluctuations in the soma and dendrites the same. For each noise level, defined as the standard deviation of the somatic membrane potential during a period of no sensory inputs, we generate 500 random spike sequences of a and b. The sequences are uniformly sampled from all possible ones containing 1-10 spikes of a and b. The intervals between the spikes are randomly selected from 30 to 80 ms, as before. We compute the percentage of recognitions among the sequences that should be recognized, and the percentage of rejections among the sequences that should be rejected. The results are shown in figure 9. The network performance is perfect up to a noise level of about 1 mV. Beyond this level, the rates of false positives (recognizing a sequence that should be rejected) and false negatives (rejecting a sequence that should be recognized) go up. The false positive rate tends to grow faster with the noise than the false negative rate. This reflects the fact that with large noise, neurons tend to spend more time in the UP states, and the end state neuron readily spikes when e does.
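The sequence-sampling protocol of the robustness test can be sketched as follows (our reading of the text; the exact sampling distribution used in the paper may differ in detail):

```python
import random

def sample_test_sequence(rng, max_spikes=10):
    """Draw one random test sequence: a number of a/b spikes between 1 and
    max_spikes, random letters, and inter-spike intervals uniform in the
    allowed 30-80 ms range."""
    n = rng.randint(1, max_spikes)
    letters = [rng.choice('ab') for _ in range(n)]
    intervals_ms = [rng.uniform(30.0, 80.0) for _ in range(n)]
    return letters, intervals_ms

def should_recognize(letters):
    # Ground truth for the parity task: odd numbers of both a and b spikes.
    return letters.count('a') % 2 == 1 and letters.count('b') % 2 == 1
```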

Discussion
We have shown that biologically realistic neural networks can behave like FSA to decode spatiotemporal spike sequences. The states of the networks consist of the UP states of neurons sustained by plateau potentials in the dendrites. The transitions between the states are controlled by feedforward excitation, lateral excitation and feedforward inhibition. Coincident feedforward and lateral excitation is required for producing plateau potentials at the dendrites; this corresponds to the AND operation in FSA. Any FSA can be mapped into an equivalent network. Thus, the ability of our networks to recognize spatiotemporal spike sequences is the same as that of FSA to recognize text strings. One network can recognize an unlimited number of sequences with common characteristics, as demonstrated by the sheep language network and the parity checking network. The intervals between the spikes in the sequences can vary within a large range; thus, the networks can handle uneven temporal warping of sensory inputs.
Our mechanism relies on transitions between network states for temporal integration. This is similar to previous proposals that used short-term synaptic depression [9] or transient network dynamics [11] for imprinting the temporal information of inputs to the network states. Unlike these proposals, however, our mechanism is capable of integration over arbitrarily long time, since the dynamics of our networks is stable due to the binary nature of the plateau potentials and the tight inhibitory control of the transitions. In principle, any network dynamics might be related to FSA; however, such a relationship is often at best speculative or implicit. Our mechanism is explicitly understood in terms of FSA.
We have used a single neuron to represent one state. A simple extension is to replace each neuron with a group of neurons. In this case, state transitions are accompanied by synchronous spiking of the neurons representing the same state. Since the role of individual neurons is reduced, the networks should be more resistant to noise in single neurons. The same can be done with the inhibitory neuron: with a group of inhibitory neurons, the feedforward inhibition is accomplished by their synchronous spikes at each sensory input, and random spiking of an individual inhibitory neuron does not matter. Replacing each sensory neuron with a group should likewise increase the robustness of the sensory inputs against noise.
Transitions between bistable membrane potentials are widely observed in cortical pyramidal neurons in vivo [25]- [27]. We propose that such state transitions in single neurons can be utilized to integrate temporal information. In our networks, spikes of the excitatory neurons occur near the UP to DOWN state transitions, although they can also be absent during the transitions. Whether this is true in the cortical neurons remains to be seen. This prediction should be modified if a state is represented with a group of neurons rather than a single neuron, which might be more realistic. In this case, the neurons representing the same state should spike synchronously during the UP to DOWN state transitions; random spiking of the neurons in the UP states can be allowed.
The beginning and end of a sensory input are signaled with the start and end sensory neurons, respectively. These boundary detecting neurons are similar to the ON and OFF neurons found in sensory areas such as the retina [4,28] and the auditory cortex [29,30]. In our sequence detecting networks, the neurons corresponding to the start and end states of the FSA are connected by the start and end sensory neurons, respectively, and these specific connections are the only characteristic that distinguishes the start and end state neurons from the others in the networks. The coincident spiking of the end sensory neurons and the end state neurons signals the recognition of an input spike sequence. The neurons in the networks have a wide range of input selectivity: while a start state neuron spikes whenever there is a sensory input, an end state neuron usually spikes only to highly complex input patterns. Other neurons have intermediate selectivity.
Our model of the excitatory neuron is quite simple, and only crudely approximates the complex dendritic structures of cortical neurons. Moreover, the model lacks the wealth of ionic conductances found in the dendrite and soma of cortical neurons, as well as neurotransmitters that can help to stabilize the UP states [26]. A more elaborate neuron model might improve the robustness of our networks against noisy fluctuations.
In our networks, the interneuron has highly convergent inputs and highly divergent outputs. It spikes reliably with a short latency to the inputs from the sensory neurons. Inhibitory neurons with these properties have been found in the somatosensory cortex [31]. Although we rely on feedforward inhibition for controlling the state transitions, it is possible to also include feedback inhibition, often found in cortical microcircuitry, as done in our previous work [12]. This should widen the range of the synaptic strengths that make the networks function properly [12].
An unresolved issue is how the specific connections between the excitatory neurons and those from the sensory neurons to the dendrites of the excitatory neurons might be set through experience. In computer science, learning FSA through examples is an active research area [32]. Such research might inspire biologically plausible mechanisms of learning the specific connectivity for recognizing a given set of spatiotemporal spike sequences.
The soma generates a spike if V_s > V_th = −54 mV; it then resets V_s to the reset potential V_r = −64 mV and keeps V_s = V_r for a refractory period of 5 ms. The dendrite does not generate spikes. Noisy fluctuations of the membrane potentials are induced by random excitatory and inhibitory spike inputs at the soma and the dendrites. In each compartment, the random excitatory spike times are generated according to a Poisson process with a rate of 200 Hz; the same holds for the random inhibitory spikes. The synaptic strength at each random excitatory or inhibitory spike is randomly selected from 0 to G_soma,noise in the soma and from 0 to G_dendrite,noise in the dendrite. Unless specified otherwise, we set G_soma,noise = 0.3 and G_dendrite,noise = 0.07, which makes the somatic and dendritic membrane potentials fluctuate with a 1 mV standard deviation.
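The background noise input described here can be generated as follows (a sketch; the function name and interface are ours):

```python
import random

def poisson_spike_train(rate_hz, duration_ms, g_max, rng):
    """Generate one compartment's random background input: spike times from
    a Poisson process (200 Hz in the text), drawn via exponential
    inter-spike intervals, with each spike's strength uniform in [0, g_max]
    (g_max = G_soma,noise = 0.3 at the soma, G_dendrite,noise = 0.07 at a
    dendrite)."""
    spikes, t = [], 0.0
    while True:
        t += rng.expovariate(rate_hz / 1000.0)  # mean interval = 1000/rate ms
        if t >= duration_ms:
            return spikes
        spikes.append((t, rng.uniform(0.0, g_max)))
```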