Efficient information transfer by Poisson neurons

Recently, it has been suggested that certain neurons with Poissonian spiking statistics may communicate by discontinuously switching between two levels of firing intensity. Such a situation resembles in many ways the optimal information transmission protocol for the continuous-time Poisson channel known from information theory. In this contribution we employ the classical information-theoretic results to analyze the efficiency of such a transmission from different perspectives, emphasising the neurobiological viewpoint. We address both the ultimate limits, in terms of the information capacity under metabolic cost constraints, and the achievable bounds on performance at rates below capacity with fixed decoding error probability. In doing so we discuss optimal values of experimentally measurable quantities that can be compared with the actual neuronal recordings in a future effort.


INTRODUCTION
The problem of information processing and transmission in the brain is historically one of the most intensively studied topics in neuroscience [1][2][3]. Frequently, the theoretical approach to this problem relies on the methods of information theory [4], with emphasis on channel capacity as the ultimate fidelity criterion [5][6][7][8]. The operational interpretation of channel capacity relies on an essentially digital transmission protocol (not necessarily binary) and a special encoding-decoding setup known as the separation assumption [4,9]. Additionally, an information rate equal to the capacity is only an asymptotic quantity, assuming arbitrarily reliable communication, infinite decoder complexity and unbounded delay. The unconstrained (or asymptotically achievable) amount of information, however, might not be the main objective for biological systems, as metabolic costs and real-time information processing must also be considered [10][11][12]. In this paper we attempt to include multiple factors that might affect the efficiency of the actual information transmission, thus continuing the effort started in [11,13,14].
The available information-theoretic methods do not easily accommodate many of the detailed biological phenomena [4,15]. In particular, the different time scales of the dynamics dictate that dependence on history, as well as feedback, is frequently present in neural systems [16]. On the other hand, simplified models often provide a satisfactory description of experimental data [17][18][19]. The classic Poissonian assumption on neural spiking statistics seems sufficient under broad circumstances [20,21], and the capacity of Poisson neurons has been discussed in the neuroscientific context as well [22,23].
The key advantage here is that a plethora of results is available for Poisson communication [24][25][26][27][28], so that the model can be employed also outside of the neurobiological context (e.g., in [29]).
It was observed in [30] that certain neurons with Poissonian spiking statistics may communicate by discontinuously switching between two discrete levels of firing intensity. We calculate the asymptotic limits on reliable communication under such circumstances and address the problem of the effective information rate when the coding system is constrained in its bandwidth and complexity. We believe that the bandwidth limitation might be of special interest, since neural systems are generally limited by the finite speed of the underlying chemical and electrical processes [3]. The main goal of this paper is therefore to provide an information-theoretic interpretation of the results recently obtained by Mochizuki and Shinomoto [30]. We are convinced that our effort is timely, as the problem of constrained information transmission has recently been attracting attention in the theoretical neuroscience community [10-12, 14, 31].

Information capacity of the Poisson neuron
Let $(\Omega, \mathcal{F}, P)$ be a probability space and $\mathcal{F}_t$ a filtration on $\mathcal{F}$. The neuronal input is described by a stationary stochastic process $\lambda_t \geq 0$, $t \in [0, T]$, adapted to $\mathcal{F}_t$. The neuronal response is given by the doubly stochastic Poisson process $N_t$ with intensity $\lambda_t + \lambda_0$. In this scenario we assume that $\lambda_t$ is proportional to the driving synaptic current [18,32,33], $\lambda_0 \geq 0$ is the spontaneous activity rate, and $N_t$ is the number of postsynaptic spikes observed up to time $t$. The following two constraints are imposed on the input signal: the peak-amplitude constraint,

$$0 \leq \lambda_t \leq L, \qquad (1)$$

and the average-power constraint,

$$\mathrm{E}(\lambda_t) \leq \varrho L, \qquad (2)$$

where $0 \leq \varrho \leq 1$ is the maximum allowed average-to-peak ratio of the input signal. The information capacity (in nats per second) of the point process $N_t$ is given by

$$C(\varrho) = \lim_{T \to \infty} \sup \frac{1}{T} I(\lambda; N), \qquad (3)$$

where the supremum is taken over all signals $\lambda_t$ satisfying Eqs. (1) and (2), with no other restrictions on the form of the trajectories or their statistics. The mutual information between $\lambda_t$ and $N_t$ is

$$I(\lambda; N) = \mathrm{E} \int_0^T \left[ \phi(\lambda_t + \lambda_0) - \phi(\hat{\lambda}_t + \lambda_0) \right] \mathrm{d}t, \qquad (4)$$

where

$$\phi(x) = x \log x, \qquad (5)$$

$\hat{\lambda}_t = \mathrm{E}(\lambda_t \,|\, \mathcal{F}_t^N)$ and $\mathcal{F}_t^N$ is the canonical filtration of $N_t$ [34, Ch. 6.5]. Hence, the conditional mean $\hat{\lambda}_t$ is the minimum mean-square error estimate of $\lambda_t$ based on the history of the process $N_t$. It was shown in [25] (see also [24] and [34]) that, remarkably, Eq. (3) can be evaluated in a closed form,

$$C(\varrho) = L \left[ r \phi(1 + s) + (1 - r) \phi(s) - \phi(r + s) \right], \qquad (6)$$

where

$$s = \lambda_0 / L, \qquad (7)$$

$$r = \min\!\left( \varrho, \; (1 + s)^{1+s} s^{-s} e^{-1} - s \right). \qquad (8)$$

If the neuron is not spontaneously active, $\lambda_0 = 0$, then Eq. (6) reduces considerably to $C(\varrho) = -L \min(\varrho, e^{-1}) \log[\min(\varrho, e^{-1})]$. The capacity-achieving input signal $\lambda_t$ is a limiting case of the random telegraph wave, i.e., of the stationary process with a piecewise constant path taking only the extremal values $0$ and $L$. More precisely, let $n$ be a positive integer and consider the process $\lambda^n_t$ which admits the martingale representation of [25,34] (Eq. (10)), where $r$ is given by Eq. (8), $P[\lambda^n_0 = L] = r$ and $m^n_t$ is a martingale. The capacity achiever is the process $\lambda_t = \lim_{n \to \infty} \lambda^n_t$. As $n$ grows, the rate of transitions between the states $\lambda^n = 0$ and $\lambda^n = L$ is unbounded, and therefore the optimal input has infinite bandwidth. The optimal average-to-peak ratio of the input signal equals $r$. Note that if $\lambda_0 = 0$ and $\varrho = 1$ (no average-power constraint on $\lambda_t$), then the probability that $\lambda_t = L$ at any given time equals $e^{-1}$.
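The closed form of Eqs. (6)-(8) is straightforward to evaluate numerically. The following is a minimal sketch (the function name and the bit conversion at the end are ours, not from the paper):

```python
import math

def poisson_capacity(L, lam0, rho):
    """Capacity C(rho) of the Poisson neuron, Eqs. (6)-(8), in nats per second.

    L    : peak input firing rate (Hz)
    lam0 : spontaneous activity rate (Hz)
    rho  : maximum allowed average-to-peak ratio of the input signal
    """
    phi = lambda x: x * math.log(x) if x > 0 else 0.0  # x log x, with 0 log 0 = 0
    s = lam0 / L                                        # Eq. (7)
    # Optimal fraction of time at the peak level, (1+s)^(1+s) s^(-s) / e - s;
    # it tends to 1/e as s -> 0.
    p_opt = math.exp(phi(1 + s) - phi(s) - 1) - s
    r = min(rho, p_opt)                                 # Eq. (8)
    return L * (r * phi(1 + s) + (1 - r) * phi(s) - phi(r + s))  # Eq. (6)

# Example: L = 50 Hz, lam0 = 0.1 Hz, rho = 0.2 gives roughly 22.6 bit/s,
# matching the values quoted in the Results section.
c_bits = poisson_capacity(50.0, 0.1, 0.2) / math.log(2)
```

For $\lambda_0 = 0$ the function reduces to $-L \min(\varrho, e^{-1}) \log[\min(\varrho, e^{-1})]$, as in the text.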

Metabolic cost of neural activity
Both empirical and theoretical studies suggest that the metabolic cost of neuronal spiking activity, in terms of the expenditure of ATP molecules (ATPm), is proportional to the firing rate [35,36]. By employing the linear model of Laughlin and Attwell [36] we naturally define the average metabolic cost $W$ (in ATPm per second) evoked by the optimal input signal as

$$W = \kappa (\lambda_0 + r L) + \beta, \qquad (11)$$

where $\kappa = 0.71 \times 10^9$ ATPm is the cost of a single spike and $\beta = 0.34 \times 10^9$ ATPm/s is the basic metabolic rate required to maintain the membrane potential. In what follows we express $C(\varrho)$ in terms of $W$ as $C(W)$, even though, due to Eq. (8), the same value of $W$ may correspond to different $\varrho$ for high enough $\varrho$.
We propose the information efficiency $E$ (in bits per ATPm) as the ratio of the information capacity to the associated metabolic cost,

$$E = \frac{C(W)}{W}. \qquad (12)$$

The maximum efficiency with respect to the cost, $E^* = \max_W E$ (also known as the capacity per unit cost [37]), represents the maximum amount of information per ATPm that can be communicated at arbitrarily low probability of channel decoding error. The rationale for adopting Eq. (12) as the efficiency measure lies in the crucial importance of balancing performance against metabolic workload for living organisms [11,38,39].
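The point of maximal efficiency can be located by a simple grid search over $\varrho$. A sketch, assuming the capacity closed form of Eqs. (6)-(8) and the linear cost $W = \kappa(\lambda_0 + rL) + \beta$ consistent with Eq. (11); the grid resolution is an arbitrary choice of ours:

```python
import math

KAPPA = 0.71e9  # ATPm per spike, as in Eq. (11)
BETA = 0.34e9   # ATPm/s resting cost, as in Eq. (11)

def phi(x):
    return x * math.log(x) if x > 0 else 0.0  # x log x with 0 log 0 = 0

def capacity_and_cost(L, lam0, rho):
    """Return (C in bit/s, W in ATPm/s) for peak rate L, spontaneous rate lam0."""
    s = lam0 / L
    r = min(rho, math.exp(phi(1 + s) - phi(s) - 1) - s)            # Eq. (8)
    c_nats = L * (r * phi(1 + s) + (1 - r) * phi(s) - phi(r + s))  # Eq. (6)
    w = KAPPA * (lam0 + r * L) + BETA                              # Eq. (11)
    return c_nats / math.log(2), w

def max_efficiency(L, lam0, n_grid=2000):
    """Grid search for E* = max C(W)/W, Eq. (12), scanning rho over (0, 1)."""
    best = max((capacity_and_cost(L, lam0, i / n_grid) for i in range(1, n_grid)),
               key=lambda cw: cw[0] / cw[1])
    return best[0] / best[1], best   # E* in bit/ATPm, and the optimal (C, W) pair
```

For $L = 50$ Hz and $\lambda_0 = 0.1$ Hz the optimum falls at a low average-to-peak ratio, in line with the qualitative behavior reported in Fig. 1b.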

Coding capacity and Wyner's code
The extension of Shannon's discrete-time channel coding theorem to general continuous-time channels is mathematically non-trivial [40][41][42]. However, in his seminal work Wyner [26,27] proved (among other things) the achievability and converse theorems for the information capacity in Eq. (6) by exploiting an equivalence between the continuous-time Poisson channel and a certain discrete-time memoryless channel. In other words, the information capacity of the Poisson neuron equals its coding capacity. Hence, for any information rate below capacity, $0 \leq R < C$, and arbitrary $\varepsilon > 0$, there exists a channel code with $M \approx e^{RT}$ codewords (for $T$ sufficiently large) such that the average probability of decoding error $P_e$ satisfies $P_e \leq \varepsilon$ [4].
The construction of Wyner's code ensemble with integer parameters $M > 1$ and $k \geq 1$ proceeds by dividing the interval $[0, T]$ into $M^k$ subintervals and by assigning the value $\lambda = L$ to $(k/M) M^k$ of the bins and $\lambda = 0$ to the remaining bins (see [26, Sec. III.A] for details). In this way, $M$ distinct input signal waveforms (codewords) are created, each satisfying Eq. (2) while bearing similarity to the process $\lambda^n_t$ in Eq. (10). The maximum likelihood decoder observes the process $N_t$ and identifies the input waveform based on the spike counts. The probability of mismatch (decoding error) for the $m$-th input waveform is $P_{e,m}$. The sequence of codes provided by Wyner is essentially optimal, satisfying Gallager's random coding bound [4]. Therefore we can define the achievable average probability of decoding error at rate $R$ through

$$P_e \leq e^{-E(R)T}, \qquad (15)$$

where $E(R) > 0$ for all $0 \leq R < C$ and $P_e = \sum_{m=1}^{M} P_{e,m} / M$. The pair $\{R, E(R)\}$ can be expressed in an implicit parametric form in terms of $p \in (0, 1]$ following [26] (Eqs. (16)-(19)), with $s$ given by Eq. (7). The value $R_0 = E(0)$ is the cutoff rate [43] (Eq. (20)). The time window $T$ thus plays a role similar to the channel code blocklength in discrete-time channels [4, Theorem 7.3.2].
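A standard property of random-coding exponents of the Gallager form is the lower bound $E(R) \geq R_0 - R$ for $R < R_0$, obtained by evaluating the underlying maximization at its extreme parameter value. Under that assumption, Eq. (15) yields a conservative but explicit window estimate; in the sketch below $R_0$ is supplied as a plain number (its closed form for the Poisson neuron is Eq. (20) of [26]):

```python
import math

def window_from_cutoff(R, R0, Pe):
    """Coding window T (s) guaranteeing error <= Pe via Pe <= exp(-(R0 - R) T).

    R, R0 are in nats per second with 0 <= R < R0; the estimate relies on the
    conservative bound E(R) >= R0 - R for Gallager-type random-coding exponents.
    """
    if not 0.0 <= R < R0:
        raise ValueError("requires 0 <= R < R0")
    return math.log(1.0 / Pe) / (R0 - R)
```

The required window grows without bound as the rate approaches the cutoff rate, which is one way to see why operating near $R_0$ is costly in terms of delay.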

Effective information rate and required bandwidth
The capacity represents the maximal information rate achievable under the stringent reliability criterion $P_e \to 0$. Approaching capacity then requires unbounded coding windows $T$ and infinitely fast switching rates of the input signal [25,26]. It is practically impossible to implement such information transmission schemes. The error exponent $E(R)$ in Eq. (15) relates the information rate $R$, the coding window $T$ and the probability of decoding error $P_e$. Therefore, we employ it to analyze the possibility of effective information transfer at rates below capacity and at a tolerable value of $P_e$, as follows.
For a fixed value of $P_e$ and rate $R$ we define the required length of the coding window $T$ by Eq. (22), namely as the smallest $T$ satisfying both Eq. (23), $T \geq \ln(1/P_e)/E(R)$, and the condition in Eq. (24). Eq. (23) follows directly from Eq. (15) and guarantees that the error probability does not exceed $P_e$. The condition in Eq. (24) is necessary since $k \geq 1$ in Wyner's code, and it follows from the combination of Eq. (14) for $k = 1$ and Eq. (13).
Once the coding window $T$ is determined, the parameters $M, k$ of Wyner's code may be calculated through Eqs. (13) and (14). The maximal required bandwidth $B$ of the input signal is of particular practical importance. In engineering applications the bandwidth of a unit pulse is commonly defined as the inverse of its duration [44]; hence, from the properties of Wyner's code, we have

$$B = \frac{M^k}{T}. \qquad (25)$$

In other words, Eq. (25) implicitly defines the effective information rate $R_{\mathrm{eff}}(B)$, which is achievable with an input signal bandwidth not exceeding $B$ at some fixed error probability $P_e$.
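With the relaxed parameter choices used later in the Results section ($M = e^{RT}$, $k = rM$), the bin count $M^k$, and hence the bandwidth of Eq. (25), grows doubly exponentially in $RT$. A sketch working in $\log_{10}$ so the numbers stay finite; the numerical values in the comment are illustrative, not from the paper:

```python
import math

def log10_bins(R, T, r):
    """log10 of the number of subintervals M^k, with M = exp(R T) and k = r M."""
    M = math.exp(R * T)
    return (r * M) * (R * T) / math.log(10)   # log10(M^k) = k * log10(M)

def log10_bandwidth(R, T, r):
    """log10 of B = M^k / T from Eq. (25): each bin is a pulse of width T / M^k."""
    return log10_bins(R, T, r) - math.log10(T)

# Illustration: with T = 2 s and r = 0.2, doubling the rate from 1 to 2 nats/s
# multiplies the exponent of the bin count by more than an order of magnitude.
```

This is the quantitative face of the "explosive growth" of the number of time divisions discussed in the Results.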

RESULTS
In the following we evaluate all information rates and capacities in bits, using the standard conversion 1 bit = log(2) nat.
First we examine the dependence of the neuronal capacity given by Eq. (6) on the metabolic expenditure given by Eq. (11); more precisely, we examine the capacity-cost function $C(W)$. We set the peak input firing rate to $L = 50$ Hz and investigate the effect of spontaneous activity of four different intensities, $\lambda_0 = \{0.1, 1, 5, 10\}$ Hz. The chosen values are entirely physiological (note that the output firing rate peaks at $L + \lambda_0$ Hz) and represent a rather generic situation under various experimental conditions (e.g., [30,45,46]). We do not consider peak firing rates $L > 50$ Hz in order to guarantee the validity of the linear cost model in Eq. (11) [36] and of the Poissonian approximation to the real neuronal firing activity [47]. In practice it is possible to determine $L$ and $\lambda_0$ by fitting a hidden Markov model to the experimentally observed neuronal activity, as has been done in [30]. Alternatively, one may fix the two firing rates in advance, choosing them by hand or according to other principles, and fit the hidden Markov model to the data with these rates held constant.
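Estimating the two intensity levels from spike data amounts to fitting a two-state hidden Markov model with Poisson emissions. A self-contained Baum-Welch (EM) sketch on simulated binned counts follows; it is our illustration of the general procedure, not the exact fitting method of [30], and all numerical values (bin width, rates, transition probabilities, initial guesses) are arbitrary assumptions:

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(1)

# Simulate a two-state (telegraph) intensity and binned spike counts.
dt, n_bins = 0.1, 4000                  # bin width (s), number of bins
true_rates = np.array([1.0, 51.0])      # lam0 and lam0 + L, in Hz
P_true = np.array([[0.98, 0.02],        # per-bin state transition matrix
                   [0.05, 0.95]])
z = np.zeros(n_bins, dtype=int)
for t in range(1, n_bins):
    z[t] = rng.choice(2, p=P_true[z[t - 1]])
counts = rng.poisson(true_rates[z] * dt)

def fit_poisson_hmm(counts, dt, n_iter=40):
    """Baum-Welch (EM) for a two-state hidden Markov model with Poisson emissions."""
    T = len(counts)
    lam = np.array([2.0, 20.0])         # initial firing-rate guesses (Hz)
    A = np.full((2, 2), 0.5)            # transition-matrix guess
    pi = np.array([0.5, 0.5])           # initial state distribution
    fact = np.array([float(factorial(int(c))) for c in counts])
    for _ in range(n_iter):
        mu = lam * dt                   # expected counts per bin in each state
        B = np.exp(-mu) * mu ** counts[:, None] / fact[:, None]  # (T, 2) emissions
        # scaled forward recursion
        alpha = np.zeros((T, 2)); c = np.zeros(T)
        alpha[0] = pi * B[0]; c[0] = alpha[0].sum(); alpha[0] /= c[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[t]
            c[t] = alpha[t].sum(); alpha[t] /= c[t]
        # scaled backward recursion
        beta = np.ones((T, 2))
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (B[t + 1] * beta[t + 1])) / c[t + 1]
        # state and transition posteriors
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = alpha[:-1, :, None] * A[None] * (B[1:] * beta[1:])[:, None, :]
        xi /= xi.sum(axis=(1, 2), keepdims=True)
        # M-step updates
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        pi = gamma[0]
        lam = (gamma * counts[:, None]).sum(axis=0) / (gamma.sum(axis=0) * dt)
    return np.sort(lam)

lam_hat = fit_poisson_hmm(counts, dt)   # estimated (lam0, lam0 + L)
```

With well separated rates, as in the two-level regime considered here, the estimates recover the simulated intensities closely.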
The respective capacity-cost functions are shown in Fig. 1a. The metabolic cost $W$ for each $\lambda_0$ spans the whole possible range of values, since the parameter $\varrho$ in Eq. (2) varies continuously in the interval $(0, 1)$. The function $C(W)$ departs from zero at the minimum possible cost, $W_{\min} = \lambda_0 \kappa + \beta$, which corresponds to the metabolic cost of the spontaneous firing rate $\lambda_0$ plus the baseline cost, see Eq. (11). The unconstrained channel capacity is the limit of $C(W)$ as the metabolic cost is allowed to grow. Under all four spontaneous activity levels the unconstrained capacity exceeds 10 bit/s. The efficiency $E$ defined by Eq. (12) balances the information capacity against the required metabolic expenditure (Fig. 1b). Note that the factor $L/\lambda_0 = 1/s$ essentially describes the signal-to-noise ratio (SNR) of the Poisson neuron. The capacity in Eq. (6), as a function of the SNR, is proportional to $L$, and hence we may obtain different capacity values for situations with equal SNR. We see that as the SNR increases, the point of maximal efficiency $E^*$ moves towards lower costs. (At the same time the cost of 1 bit, equal to $1/E^*$, decreases, as expected intuitively.) The point of maximal efficiency is less pronounced for low SNR.
An alternative point of view is provided by analyzing the best possible performance when the metabolic cost is fixed, $W = \mathrm{const}$. It follows from Eq. (11) that $W$ and the average postsynaptic firing rate $\langle f \rangle$ are linearly related, due to $\langle f \rangle = \lambda_0 + rL$. We therefore maximize the capacity as a function of $s, L$ along the curve $\langle f \rangle = (s + r)L = \mathrm{const}$ (note that $r$ is a function of $s$). Since $L$ acts only as a scaling factor in Eq. (6), it is possible to verify that the solution to this constrained maximization problem requires $s = 0$. In other words, for a given value of $\langle f \rangle$, the Poisson neuron with parameters $\lambda_0 = 0$ and $L = \langle f \rangle / \min(\varrho, 1/e)$ attains the largest capacity. The effective rate for $B = 2$ Hz at $P_e = 10^{-5}$ is shown in Fig. 1c. For convenience we neglect the fact that $M$ and $k$ are integers and set $M = e^{RT}$, $k = rM$, interpolating the associated combinatorial factors by means of the gamma function $\Gamma(x)$ [48]. The most striking difference with respect to Fig. 1a is the significant drop in information transfer, roughly one order of magnitude. In addition, if the metabolic cost is neglected, the difference between the maximal values of $R_{\mathrm{eff}}$ becomes negligible with growing SNR; e.g., there is virtually no difference between $\lambda_0 = 0.1$ Hz and $\lambda_0 = 1$ Hz. Similarly to Fig. 1b, the ratio $R_{\mathrm{eff}}/W$ corresponds to the achievable efficiency (Fig. 1d). The notable difference here, besides the respective values of the maxima, is that the points of maximal efficiency occur at lower values of the metabolic cost (shown by vertical lines).
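The claim that $s = 0$ is optimal at a fixed mean output rate can be verified numerically along the constraint curve. A sketch with illustrative values $\langle f \rangle = 10$ Hz and $\varrho = 0.2$ (our choice, for demonstration only):

```python
import math

def phi(x):
    return x * math.log(x) if x > 0 else 0.0  # x log x with 0 log 0 = 0

def capacity_fixed_mean(s, f_mean, rho):
    """Capacity (nats/s) from Eq. (6) along the curve <f> = (s + r) L = const."""
    r = min(rho, math.exp(phi(1 + s) - phi(s) - 1) - s)   # Eq. (8)
    L = f_mean / (s + r)                                   # enforce the fixed mean rate
    return L * (r * phi(1 + s) + (1 - r) * phi(s) - phi(r + s))

f_mean, rho = 10.0, 0.2
caps = [capacity_fixed_mean(s, f_mean, rho) for s in (0.0, 0.05, 0.1, 0.2)]
# The largest value is attained at s = 0, i.e. lam0 = 0, and at s = 0 the
# capacity equals -L min(rho, 1/e) log min(rho, 1/e) with L = f_mean / rho.
```

The capacity decreases monotonically along this grid as $s$ grows, consistent with the constrained-maximization argument above.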
An alternative point of view is provided by fixing the metabolic cost and varying the maximum allowed input signal bandwidth. Fig. 2a shows such a situation for $\varrho = 0.2$. Although we currently do not have enough information on the biologically valid range of $B$, we believe that relatively small values, $B < 10$ Hz, are plausible. The value $\varrho = 0.2$ was chosen for illustration only; there is a qualitative correspondence between the results for different $\varrho$ (not shown). In terms of the metabolic cost and the capacity we have $W = 8.6 \times 10^9$ ATPm/s and $C(W) = 22.6$ bit/s for $\lambda_0 = 0.1$ Hz, $W = 9.2 \times 10^9$ ATPm/s and $C(W) = 19.8$ bit/s for $\lambda_0 = 1$ Hz, $W = 12.0 \times 10^9$ ATPm/s and $C(W) = 14.3$ bit/s for $\lambda_0 = 5$ Hz, and $W = 15.6 \times 10^9$ ATPm/s and $C(W) = 11.0$ bit/s for $\lambda_0 = 10$ Hz. At this value of the input signal average-to-peak ratio the growth of $R_{\mathrm{eff}}$ with $B$ is initially linear and independent of $\lambda_0$. This trend continues up to a certain critical $B$ (dependent on $\lambda_0$), from which point on the growth of the information rate becomes much slower. The ratio $R_{\mathrm{eff}}/(W B)$ (Fig. 2b) balances the information rate against the allocated signal bandwidth and the incurred metabolic expenses, although the actual importance of each factor can hardly be determined unquestionably. Nonetheless, the key observation here is the significant drop in information-transfer capability above the critical bandwidth (which grows with the SNR).
Finally, we illustrate the behavior of the key parameters of Wyner's code in dependence on the transmission rate $R$, at bandwidth not exceeding $B = 2$ Hz and $\varrho = 0.2$. The required coding window $T$ is shown in Fig. 3a. The value of $T$ is analogous to the discrete-time channel code blocklength, and hence the decoding delay and complexity increase with $T$ [4,14,49]. We observe that for a fixed constraint $\varrho$ and maximal allowed bandwidth $B$ there exists an information transmission rate $R > 0$ achieving the smallest $T$. The existence of such an "optimal" $R$ stems from the condition in Eq. (22). If the value of $P_e$ were decreased, the minimum of $T$ would become more pronounced and occur at smaller rates (not shown). The coding bandwidth in Eq. (25) combines the coding window duration with the number of time divisions $M^k$. The number of these divisions grows explosively once the optimal rate in Fig. 3a is crossed (Fig. 3b). Wyner's coding waveforms thus rapidly approach the capacity-achieving input signal in Eq. (10).

DISCUSSION AND CONCLUSIONS
In this contribution we have analyzed the information coding and transmission efficiency of a Poissonian neuron. We have determined the channel capacity and efficiency (Fig. 1a, b) and stressed their asymptotic properties and the impossibility of actually achieving these rates in practice. The focus of our effort is the explicit consideration of the decoding error probability and its influence on the code complexity (bandwidth or duration). These considerations might be of crucial importance for finite-sized, power-constrained and real-time operating communication systems, such as neurons. To the best of our knowledge, though, this part of information (or communication) theory has not been applied in neuroscience, with the exception of [14]. We have found that once the balance between information, metabolic cost and code parameters is taken into account, the achievable information rates drop significantly with respect to their asymptotic counterparts (Fig. 1c, d). Similar conclusions were reached in [14] for the case of a realistic Hodgkin-Huxley neuronal model. In addition, progress towards higher information rates is accompanied by a sudden and rapid increase in certain undesirable code parameters (Figs. 2 and 3). The difficulties associated with crossing the cutoff rate in Eq. (20) are well known [28,43]. Here we have demonstrated similar effects at lower rates, with respect to parameters of potential neurobiological importance.
Throughout this paper it has been assumed that the input signal (the presynaptic firing rate $\lambda_t$) has a very restricted form, taking only two possible values. This assumption stems directly from the mathematical fact that such a signal is (asymptotically) optimal for Poisson communication, and from the recent analysis by Mochizuki and Shinomoto [30] of real neuronal behavior. However, our methods do not answer the question whether, under certain coding constraints, a continuously varying $\lambda_t$ might be beneficial. In this context, Kostal and Kobayashi [14] have shown, without imposing the discreteness constraint on the presynaptic firing rate, that coding complexity restrictions result in an input with very few, well separated levels. In addition, a binary input signal usually near-achieves the information capacity in systems with low SNR [13,50,51].
Finally, a few remarks about the interpretation of our results are in order. First, the definition of the achievable $P_e$ in Eq. (15) is only approximate for small $T$. The reason lies in the fact that, unlike the unconstrained Gallager bound, which holds for all code lengths [4, Theorem 5.6.2], the power constraint in Eq. (10) prohibits an equally elegant formulation [4, Theorem 7.3.2]. On the other hand, Gallager's bound is an upper bound on the smallest possible $P_e$, so it is entirely possible that a code with $P_e$ satisfying Eq. (15) exists even for very small $T$. Second, the capacity in Eq. (6) cannot be further increased by causal feedback, that is, when $\lambda_t$ may depend on $N_\tau$, $\tau \in [0, t]$ [34]. The feedback, however, allows for the existence of simpler codes at given $R < C$ and $P_e$, and therefore increases the value of the error exponent [52]. The capacity thus cannot be increased by a non-Poissonian neuron either, provided that the presynaptic input still satisfies the constraints in Eqs. (1) and (2) and that the spontaneous activity is Poissonian with intensity $\lambda_0$. For further capacity-related results on the Poisson channel subject to additional constraints see [53][54][55].

Figure 1.
Figure 1. Capacity, efficiency and the effective information transfer rate of the Poisson neuron in dependence on the average (postsynaptic) metabolic cost $W$. Four different values of the spontaneous firing rate $\lambda_0$ are examined; the input signal (presynaptic firing intensity) is restricted to $\lambda_t \in [0, L]$ with $L = 50$ Hz. The capacity-cost function (a) exhibits the law of diminishing returns as $W$ grows. The efficiency $C(W)/W$ (b) shows a pronounced optimum for higher signal-to-noise ratios (SNR) $L/\lambda_0$. Both the capacity and the efficiency are asymptotic quantities in terms of coding-decoding delay and complexity. The effective information rate $R_{\mathrm{eff}}$ (c) is a non-asymptotic quantity with explicit coding considerations, i.e., $\lambda_t$ bandwidth not exceeding 2 Hz and decoding error probability $P_e = 10^{-5}$. The maximal $R_{\mathrm{eff}}$ is only about 5% of the unconstrained capacity value, and the benefit of increased SNR between $\lambda_0 = 1$ Hz and $\lambda_0 = 0.1$ Hz is negligible. The balance between the metabolic cost and the effective rate (d) also achieves only about 5% of the asymptotic values in (b); in addition, the optima occur at smaller metabolic cost (vertical lines).

Figure 2.
Figure 2. Effective information rate $R_{\mathrm{eff}}$ when the input signal bandwidth does not exceed $B$, under constant $\varrho = 0.2$, see Eq. (2). As in Fig. 1c, d it holds that $L = 50$ Hz and $P_e = 10^{-5}$. For small bandwidths the effective rate grows linearly and independently of the SNR (level of spontaneous activity $\lambda_0$), up to the critical bandwidth $B(\lambda_0)$ (a). The information rate per metabolic cost per allocated bandwidth (b) considers the three most important aspects of practical information transmission jointly. The rapid drop in $R_{\mathrm{eff}}/(W B)$ beyond the critical bandwidth shows that increasing the transmission rate is not equally easy in terms of practical coding parameters.

Figure 3.
Figure 3. Detailed parameters of Wyner's code with $B \leq 2$ Hz and $\varrho = 0.2$ in dependence on the transmission rate ($L = 50$ Hz and $P_e = 10^{-5}$). The required coding window duration $T$ corresponds to the classical discrete-time channel code blocklength. The decoding complexity and delay grow with $T$, so it is desirable to minimize $T$ for the given rate $R$ and $P_e$ (a). For all investigated SNRs there exists an information rate with minimal $T$, and this rate increases with the SNR. The coding window $[0, T]$ is divided into $M^k$ sub-intervals. The number of these bins grows explosively once the optimal information rate (minimal $T$) is crossed (b). Consequently, the required bandwidth of the input signal grows as well (Fig. 2a).