Decomposition of neuronal assembly activity via empirical de-Poissonization

Consider a compound Poisson process with jump measure $\nu$ supported by finitely many positive integers. We propose a method for estimating $\nu$ from a single, equidistantly sampled trajectory and develop associated statistical procedures. The problem is motivated by the question of whether nerve cells in the brain exhibit higher-order interactions in their firing patterns. According to the neuronal assembly hypothesis (Hebb [13]), synchronization of action potentials across neurons of different groups is considered a signature of assembly activity, but demonstrating it in recordings of neuronal activity has proved notoriously difficult. Our approach, based on a compound Poisson model, makes it possible to detect the presence of joint spike events of any order using only population spike count samples, thus bypassing both the "curse of dimensionality" and the need to isolate single-neuron spike trains in population signals.


Introduction
The cell assembly hypothesis (Hebb [13]) postulates dynamically interacting groups of neurons as building blocks of representation and processing of information in the neocortex. Synchronization of action potentials across neuronal groups was suggested as a potential signature of active assemblies (von der Malsburg [19]). In this scenario, the assembly members would exhibit specific higher-order correlations. It was shown that single nerve cells are sensitive to higher-order correlations in their input, and that they can react to such input in a correlation-specific way (Abeles [1], Diesmann et al. [8], Kuhn et al. [16], [17]). The statistical procedures suggested so far for the analysis of higher-order correlations from multiple parallel spike train recordings (Perkel et al. [23], Martignon et al. [20], [21], Grün et al. [9], [10], Nakahara and Amari [22], Gütig et al. [11], [12]) suffer from the "curse of dimensionality": the estimation of the necessary parameters (of order 2^N for a population of N neurons) poses serious problems, mainly due to insufficient sample sizes. As a consequence, most attempts to directly observe cell assemblies have resorted to pairwise interactions (Brown et al. [6]). However, when relying on second-order approaches one cannot draw direct conclusions about the presence of higher-order effects.
Here we present a novel procedure for the statistical analysis of higher-order patterns in massively parallel spike trains that circumvents the above-mentioned problem. It involves two elements: (i) We assume a model for neuronal populations with a simple descriptive parametrization of higher-order effects. Specifically, we employ correlated point processes, where correlations of any order are induced by "inserting" appropriate patterns of synchronous spikes. (ii) We base our inference on spike counts extracted from a single population (multi-unit) spike train, rather than from multiple parallel (single-unit) spike trains. This leads to a parsimoniously parametrized univariate estimation problem circumventing the curse of dimensionality, which greatly reduces the demands with respect to the size of empirical samples. This strategy also alleviates some of the difficulties arising when single-unit spike trains have to be extracted from multi-unit recordings; cf. Section 7.
The mathematical framework is as follows. Consider N neurons labelled 1, 2, . . . , N, and let A denote the collection of all non-empty subsets of the set N = {1, 2, . . . , N}. Associated with the neurons is a collection of point processes [pp's] that describe their joint activity. For each subset a ∈ A there is a pp X_a(t), t ≥ 0, the jump times of which indicate simultaneous firing of all neurons i ∈ a. Thus at the lowest level of single neurons there are N pp's X_i(t), t ≥ 0 (i ∈ N), where X_i(t) denotes the number of "single-cell" spikes occurring in the i-th neuron during the time interval [0, t]. At the level of neuron pairs there are $\binom{N}{2}$ pp's X_{ij}(t) counting the number of "two-cell" spikes that occur simultaneously in neurons i and j up till time t. Next, there are $\binom{N}{3}$ pp's X_{ijk}(t) counting the number of "triple-cell" spikes occurring simultaneously in neurons i, j, k up till time t. And so on.
The n-cell processes with n ≥ 2 account for possible n-fold interactions between the single neurons. These processes are not assumed to be observable; only their superposition is. Our goal is to disentangle this mixture, i.e., to extract information about 2-fold, 3-fold, etc. neuron interactions from the aggregate activity of the whole population.
Concerning the underlying n-cell processes we make the following assumptions.
(A1) All processes X_a(t), a ∈ A, are statistically independent.
(A2) For each a ∈ A, X_a(t) is a right-continuous Poisson (point) process [ppp] with constant jump rate µ(a), and such that X_a(0) = 0.
The sum activity of the population is described by the process

Z(t) = Σ_{a∈A} |a| X_a(t), t ≥ 0. (1.1)

The quantity Z(t) counts the number of spikes that occur anywhere in the population up till time t. Note that a (unit) jump of X_a(t) at time t = τ, say, contributes |a| spikes (|a| = cardinality of the subset a), hence makes Z(t) jump upwards by |a| units at time t = τ. (By (A1), (A2), the probability equals zero that there is any t ≥ 0 that is a jump time for more than one of the processes X_a, a ∈ A.) From representation (1.1) and our assumptions it follows that Z(t), t ≥ 0, is a compound Poisson process with jump measure ν given by a weighted sum of Dirac point masses,

ν = Σ_{n=1}^{N} ν_n δ_n, (1.2)

where

ν_n = Σ_{a∈A, |a|=n} µ(a)

is the jump rate of the process Y_n := Σ_{a∈A, |a|=n} X_a, the superposition of all n-cell processes. In terms of the latter, one also has the slightly more condensed representation

Z(t) = Σ_{n=1}^{N} n Y_n(t). (1.3)

The Poisson process underlying all jumps, Y_+ := Σ_n Y_n, is called the carrier process. Its jump rate is Σ_n ν_n = ν_+.

The activity of individual neurons can be described by pp's W_i, i ∈ N, where W_i represents the spike train at neuron i ∈ N. In terms of the n-cell processes X_a these single-unit spike trains are given by W_i = Σ_{a∋i} X_a. The collection {W_i, i ∈ N} of all single-unit pp's provides a multivariate representation of the population activity by means of correlated ppp's: each W_i is a Poisson pp with intensity λ_i = Σ_{a∋i} µ(a); the correlation (or "interaction") between W_i and W_j (i ≠ j) results from those n-cell processes X_a (n ≥ 2) which simultaneously contribute to W_i and W_j, i.e., correspond to firing patterns a ∈ A including both neurons i, j. Note, finally, that the superposition of all single-unit processes should yield the total activity of the population, which in fact it does:

Σ_{i∈N} W_i(t) = Σ_{i∈N} Σ_{a∋i} X_a(t) = Σ_{a∈A} |a| X_a(t) = Z(t). (1.4)

It has to be pointed out that the subsequent developments exclusively refer to the process Z and its assumed compound Poisson character.
The precise relations between, and properties of, the single-unit processes W i and the n-cell processes X a are technically irrelevant as long as they are compatible with the compound Poisson model. They do matter, however, in regard to the interpretation of the model and the results derived on its basis. The problem to be dealt with can now be stated as follows.
Problem: Suppose one is given the values of the process Z(t) at L equidistant time points t_l = lh, 1 ≤ l ≤ L, with spacing h > 0. The task is to estimate the jump measure ν, or at least the first M rates ν_n (n ≤ M), from these data.
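The sampling scheme can be sketched in a few lines of Python/NumPy (an illustrative transcription of the model, not code from the paper; the function name is ours):

```python
import numpy as np

def simulate_bin_counts(nu, h, L, rng):
    """Bin counts Z_l = Z(l*h) - Z((l-1)*h) of a compound Poisson process.

    nu -- dict mapping jump size n (>= 1) to rate nu_n (events/second).
    Over a bin of width h, the number of order-n jumps is Poisson(nu_n * h),
    independently across orders and bins, and each such jump adds n spikes.
    """
    Z = np.zeros(L, dtype=int)
    for n, rate in nu.items():
        Z += n * rng.poisson(rate * h, size=L)
    return Z

# parameters of Example 1, Section 6: (nu_1, ..., nu_5) = (40, 10, 4, 3, 1)
nu = {1: 40.0, 2: 10.0, 3: 4.0, 4: 3.0, 5: 1.0}
h, T = 0.02, 30.0
L = int(T / h)
Z = simulate_bin_counts(nu, h, L, np.random.default_rng(0))
```

Only these L integers are available to the statistician; the individual jumps of the X_a are not.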
Suitable estimates of the ν_n could be used to answer specific questions about neural networks, such as: Are pairwise correlations sufficient to describe the network activity? Do correlations of order > 10, say, exist?
Note that continuous observation of process Z(t) would allow registration of every single jump, hence make the problem quite simple. In practice, one depends on discrete sampling, in which case it is unknown how many jumps have contributed to the increment across an interval (unless the increment is ≤ 1). The information left after discretization is contained in the increments of Z(t) across the bins (t l−1 , t l ]. These bin counts represent poissonized versions of the jump measure, which is why we call our goal de-Poissonization. De-Poissonization as a powerful analytic method has found numerous applications in, e.g., information theory, the analysis of algorithms, and combinatorics (Jacquet and Szpankowski [14], Borodin et al. [5]), but we are not aware of applications similar to those presented here. For a technically related, continuous variant of our estimation problem see Jongbloed et al. [15].
The paper is organized as follows. In Section 2 we introduce the rate estimates using Fourier inversion. Their basic asymptotic properties are derived in Section 3. A more detailed study of the conditions for the validity of the underlying approximations is deferred to Appendix A. The connection between our approach and analytic de-Poissonization is explained in Section 4. Asymptotics-based statistical procedures are outlined in Section 5 and illustrated by Monte Carlo simulations in Section 6. The paper concludes with a discussion of some neurobiological and statistical aspects of our approach, addressing particularly the possible consequences of violations of the compound Poisson assumption, as well as possible further developments.

Estimating the rates ν_n: a Fourier-based proposal
Given the bin width h > 0, let Z_1, . . . , Z_L denote the increments of the process Z(t) across the successive sampled time points, Z_l = Z(lh) − Z((l−1)h). (To ease notation, we suppress the parameter h wherever feasible.) The Z_l are independent and identically distributed random variables with the common characteristic function

γ_h(θ) = E e^{iθZ_1} = exp( h Σ_{n=1}^{N} ν_n (e^{inθ} − 1) ). (2.1)

Our (first) approach to the estimation of the ν_n depends on the simple observation that the exponent of (2.1) is essentially a Fourier series from which its coefficients, the rates ν_n, can be recovered by Fourier inversion. With ν_+ = Σ_{n=1}^{N} ν_n we have

Σ_{n=1}^{N} ν_n e^{inθ} = ν_+ + h^{−1} log γ_h(θ), (2.2)

so by inverting (2.2) we get back the coefficients of interest,

ν_n = (2πh)^{−1} ∫_{−π}^{π} e^{−inθ} log γ_h(θ) dθ, n = 1, . . . , N (2.3)

(the constant ν_+ drops out since ∫_{−π}^{π} e^{−inθ} dθ = 0 for n ≥ 1). Eq. (2.3) suggests estimating the ν_n by substituting for the unknown characteristic function γ_h the empirical characteristic function [ecf] of the observed increments, γ̂_h(θ) = L^{−1} Σ_{l=1}^{L} e^{iθZ_l}, so that the ν_n are estimated as follows,

ν̂_n = (2πh)^{−1} ∫_{−π}^{π} e^{−inθ} log γ̂_h(θ) dθ. (2.4)

Note that the number of neurons, N, does not enter (2.4), hence need not be known. The exponential form of γ_h immediately yields the expression for the log characteristic function. Matters are more involved with the empirical version γ̂_h, whose (continuously defined) complex logarithm may traverse different branches. Although this happens only with small probability if L is large (by the consistency of the ecf as L ↑ ∞, uniformly in θ ∈ [−π, π]; see e.g. Prakasa Rao [24, Ch. 8.3]), related difficulties may well occur in finite samples. Ways of fixing them will develop from an alternative view of our estimation problem outlined in Section 4. It has to be emphasized that the estimates ν̂_n can assume negative values. We do not attempt to avoid this drawback, and rather consider such a case as indicative of a small value of the corresponding rate parameter.
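A direct numerical transcription of (2.4) might look as follows (a sketch of our own, not the authors' implementation; the branch of the complex logarithm is fixed by unwrapping the phase along the loop, and the integral is approximated on a midpoint grid):

```python
import numpy as np

def fourier_rate_estimates(Z, h, n_max, n_grid=4096):
    """Estimate nu_1, ..., nu_n_max by Fourier inversion (2.4) of the ecf."""
    step = 2 * np.pi / n_grid
    theta = -np.pi + (np.arange(n_grid) + 0.5) * step   # midpoint grid
    p = np.bincount(Z) / len(Z)                         # empirical frequencies
    ecf = np.exp(1j * np.outer(theta, np.arange(len(p)))) @ p
    # continuous branch of log ecf along the loop (log ecf(0) ~ 0)
    log_ecf = np.log(np.abs(ecf)) + 1j * np.unwrap(np.angle(ecf))
    ns = np.arange(1, n_max + 1)
    integrand = np.exp(-1j * np.outer(ns, theta)) * log_ecf
    return (integrand.sum(axis=1) * step).real / (2 * np.pi * h)

# quick check on synthetic data with nu_1 = 40, nu_2 = 10
rng = np.random.default_rng(1)
h, L = 0.01, 40000
Z = rng.poisson(40 * h, L) + 2 * rng.poisson(10 * h, L)
nu_hat = fourier_rate_estimates(Z, h, n_max=5)
```

On such data the estimates cluster near (40, 10, 0, 0, 0), with fluctuations of the order predicted in Section 3.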
Linear functionals λ = Σ_n c_n ν_n of the ν_n, with finitely many real constants c_n, are straightforwardly estimated by substituting for the ν_n their estimates,

λ̂ = Σ_n c_n ν̂_n = (2πh)^{−1} ∫_{−π}^{π} C(θ) log γ̂_h(θ) dθ, (2.5)

where in the last expression C(θ) = Σ_k c_k e^{−ikθ} denotes the Fourier series of the c_k (which appears when interchanging summation and integration). Of particular interest in our context are functionals of the form λ_m = Σ_{n=m}^{N} ν_n. Here λ_m represents the total activity of all n-cell processes of orders n ≥ m, and the question is whether λ_m = 0 for all m greater than some m_0 (and if so, what is m_0).
Since the number N of neurons generally is unknown, it would be desirable to replace λ_m by the infinite tail sum ρ_m = Σ_{n=m}^{∞} ν_n, with the understanding that ν_n = 0 for all but finitely many n. This is easily achieved on noting that the Fourier series associated with λ_m is

Σ_{n=m}^{N} e^{−inθ} = (e^{−imθ} − e^{−i(N+1)θ}) / (1 − e^{−iθ}).

By the Riemann-Lebesgue lemma the second term in the numerator contributes nothing in the limit N ↑ ∞, which leaves the kernel

R_m(θ) = e^{−imθ} / (1 − e^{−iθ}), (2.6)

and at least on any event where (log γ̂_h(θ))/(1 − e^{−iθ}) ∈ L^1[−π, π] it makes sense to estimate ρ_m by the expression

ρ̂_m = (2πh)^{−1} ∫_{−π}^{π} R_m(θ) log γ̂_h(θ) dθ. (2.7)

The integrability condition holds in fact with probability tending to 1 as L ↑ ∞. For the moment it suffices to note that the pole of R_m at θ = 0 is counterbalanced by the zero at θ = 0 of the log (empirical) characteristic function.
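The tail-sum estimator (2.7) admits the same kind of transcription (again a sketch under the stated assumptions; the midpoint grid never evaluates the removable singularity of R_m at θ = 0):

```python
import numpy as np

def tail_sum_estimates(Z, h, m_max, n_grid=4096):
    """Estimate rho_m = sum_{n >= m} nu_n via (2.7), for m = 1, ..., m_max."""
    step = 2 * np.pi / n_grid
    theta = -np.pi + (np.arange(n_grid) + 0.5) * step   # avoids theta = 0
    p = np.bincount(Z) / len(Z)
    ecf = np.exp(1j * np.outer(theta, np.arange(len(p)))) @ p
    log_ecf = np.log(np.abs(ecf)) + 1j * np.unwrap(np.angle(ecf))
    ms = np.arange(1, m_max + 1)
    R = np.exp(-1j * np.outer(ms, theta)) / (1 - np.exp(-1j * theta))
    return ((R * log_ecf).sum(axis=1) * step).real / (2 * np.pi * h)

rng = np.random.default_rng(1)
h, L = 0.01, 40000
Z = rng.poisson(40 * h, L) + 2 * rng.poisson(10 * h, L)   # nu_1 = 40, nu_2 = 10
rho_hat = tail_sum_estimates(Z, h, m_max=4)
```

For this configuration the true tail sums are (ρ_1, ρ_2, ρ_3, ρ_4) = (50, 10, 0, 0).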

Asymptotics
The basic properties of the proposed estimates that are required for, e.g., test construction can be derived by applying the usual methods of asymptotic statistics (e.g., Prakasa Rao [24]). This is done here in a heuristic fashion at first. Precise conditions under which the approximations can be expected to be valid are provided in Appendix A. By the law of large numbers, the ecf γ̂_h(θ) is close to its expected value γ_h(θ), so we write

γ̂_h(θ) = γ_h(θ) + ξ_h(θ) (3.1)

with a stochastic perturbation ξ_h. The expansion log γ̂_h = log γ_h + ξ_h/γ_h − (ξ_h/γ_h)²/2 + . . . , inserted into (2.4) with only the first term kept, gives the stochastic approximation

ν̂_n − ν_n ≈ U_n := (2πh)^{−1} ∫_{−π}^{π} e^{−inθ} ξ_h(θ)/γ_h(θ) dθ. (3.2)

From (3.2) one readily obtains the approximate (normal) distribution of the estimates ν̂_n on calculating moments of U_n and applying the central limit theorem. Clearly ξ_h is Hermitian, i.e., ξ_h(−θ) equals the complex conjugate of ξ_h(θ) for real θ, and Eξ_h(θ) = 0. Therefore EU_n = 0, and by the independence of the random variables Z_l one has for real θ_1, θ_2

E[ ξ_h(θ_1) ξ_h(θ_2) ] = L^{−1} ( γ_h(θ_1 + θ_2) − γ_h(θ_1) γ_h(θ_2) ) = (h²/T) Γ_h(θ_1, θ_2) γ_h(θ_1) γ_h(θ_2), (3.3)

where T = hL denotes the "observation length," and Γ_h the kernel

Γ_h(θ_1, θ_2) = h^{−1} ( γ_h(θ_1 + θ_2) / (γ_h(θ_1) γ_h(θ_2)) − 1 ). (3.4)

Hence, using (3.2) along with the variable substitution −θ_2 → θ_2 in the integration, we find that the asymptotic covariance of the estimates ν̂_m, ν̂_n is given by

T Cov(ν̂_m, ν̂_n) ≈ Ω_{m,n} := (2π)^{−2} ∫_{−π}^{π} ∫_{−π}^{π} e^{−imθ_1} e^{−inθ_2} Γ_h(θ_1, θ_2) dθ_1 dθ_2. (3.5)

Slightly more generally, the asymptotic covariance of the estimates λ̂_b, λ̂_c of two linear functionals corresponding to finite vectors of constants (b_k) = b, (c_k) = c is given by the bilinear form Σ_{m,n} b_m c_n Ω_{m,n}. Alternatively, one may pass to the Fourier series B(θ) = Σ_k b_k e^{−ikθ}, C(θ) = Σ_k c_k e^{−ikθ} and calculate the asymptotic covariance as

(2π)^{−2} ∫_{−π}^{π} ∫_{−π}^{π} B(θ_1) C(θ_2) Γ_h(θ_1, θ_2) dθ_1 dθ_2.

The latter procedure also applies to certain infinite compounds such as the tail sums ρ_m, in which case the Fourier series B, C become R_m, R_n, respectively; see (2.7). (Note that Γ_h(θ_1, θ_2) vanishes along the coordinate axes θ_1 = 0 and θ_2 = 0, cancelling the poles of R_m(θ_1) and R_n(θ_2).) Some insight into the structure of the (co-)variances can be gained by letting the bin width h tend to zero.
Then by (3.4) and (2.1),

Γ_h(θ_1, θ_2) = h^{−1} ( exp( h Σ_n ν_n (e^{inθ_1} − 1)(e^{inθ_2} − 1) ) − 1 ) → Σ_n ν_n (e^{inθ_1} − 1)(e^{inθ_2} − 1) as h ↓ 0.

When inserted into (3.5), this gives the approximation

Ω_{m,n} ≈ ν_n δ_{m,n} (m, n ≥ 1)

for (T times) the asymptotic covariances of the ν̂_n, which are thus uncorrelated to first order, with variance ν_n/T. These approximate variances are optimal in the sense that they are identical with those applying in the case where each n-cell ppp is directly observable. This is in fact hardly surprising considering that if hν_+ is small, then any bin contains at most one jump of the carrier process with high probability. Note, finally, that Ω_{n,n} = ν_n vanishes if ν_n = 0, indicating that ν̂_n tends to zero at a rate faster than O_p(T^{−1/2}) in this case. Of course, these conclusions apply strictly only in the limit h ↓ 0.
Let us end this section by indicating the kind of conditions that turn out, in Appendix A, to be decisive for the validity of the proposed normal approximations: T should be "large," h and ν_+/T "small," and hν_+ is a critical quantity that should be "moderate" at most.

Connection with analytic de-Poissonization
The estimates (2.4) can also be derived in a different manner using analytic de-Poissonization. This approach provides complementary information about our estimates that affords alternative ways of calculating the relevant quantities (other than by numerical integration), or even different estimation methods.
To achieve the necessary generality we will temporarily modify, and slightly abuse, the notation used so far. Let us first assume that

φ(w) = Σ_{k≥0} p_k w^k

is the generating function of some, for the moment arbitrary, discrete random variable Z ≥ 0 with P[Z = k] = p_k (k ≥ 0). Suppose that φ is analytic in a neighborhood W of the closed unit disc in the complex plane, and that it has no zeros within W. Then log φ, too, is analytic in W, and an application of Cauchy's integral formula (for this and other related facts see Ahlfors [2]) gives (on picking the main branch, where log 1 = 0)

log p_0 = log φ(0) = (2π)^{−1} ∫_{−π}^{π} log φ(e^{iθ}) dθ. (4.1)

More generally we have for any n ≥ 0

(2π)^{−1} ∫_{−π}^{π} e^{−inθ} log φ(e^{iθ}) dθ = (log φ)^{(n)}(0) / n!. (4.2)

Let β_n denote the expression on the right-hand side of (4.2), i.e., the n-th Taylor coefficient of log φ at the origin. Since φ^{(n)}(0)/n! = p_n, comparing coefficients in φ′ = φ · (log φ)′ shows that n p_n = Σ_{k=1}^{n} k β_k p_{n−k}, whence (4.2) yields the relations

β_0 = log p_0, β_n = ( n p_n − Σ_{k=1}^{n−1} k β_k p_{n−k} ) / (n p_0) (n ≥ 1). (4.3)
Returning to our setting we now let Z stand for the generic specimen of the increments Z_l. Then φ(e^{iθ}) = γ_h(θ), and a comparison with (2.1) to (2.3) shows that the right-hand side of (4.2) becomes hν_n for n ≥ 1, and −hν_+ for n = 0. Together with the evaluation (4.3) this implies that the rates ν_n are expressible as functions of the probabilities p_k, 0 ≤ k ≤ n,

hν_n = β_n = ( n p_n − Σ_{k=1}^{n−1} k β_k p_{n−k} ) / (n p_0), n ≥ 1. (4.4)

Moreover, since ν_+ = ρ_1 we get for the tail sums

hρ_m = −log p_0 − Σ_{n=1}^{m−1} hν_n, m ≥ 1. (4.5)

These relations supply a basis for alternative estimation approaches, to be explored elsewhere.
A key point of the preceding considerations is that the relations (4.1) to (4.5) apply not only to the "true" quantities ν_n, p_k, but likewise to the estimates ν̂_n, the empirical frequencies p̂_k = L^{−1} |{ l : Z_l = k }|, and the empirical characteristic function γ̂_h(t) = Σ_{k≥0} p̂_k e^{itk}, provided the latter has winding number n̂_0 = 0 w.r.t. the origin. For in this case γ̂_h(t) ≡ γ̂_h(e^{it}) can be continued to an analytic function γ̂_h(w) = Σ_k p̂_k w^k (a polynomial, in fact, since with probability 1 only finitely many p̂_k are nonzero) that is zero-free within the unit disc, the latter by the argument principle. Thus, if the winding number is zero, the arguments that led to Eq. (4.2) are valid for the ecf, too, whence

ν̂_n = (2πh)^{−1} ∫_{−π}^{π} e^{−inθ} log γ̂_h(θ) dθ = h^{−1} (log γ̂_h)^{(n)}(0) / n! =: h^{−1} β̂_n. (4.7)

The important conclusion (from the second equality) is that the estimates ν̂_n defined by (2.4) and (4.7), respectively, are identical if n̂_0 = 0. By evaluating the Taylor coefficients in (4.7) via the recursion (4.3), with p̂_k in place of p_k, one obtains as above

hν̂_n = β̂_n = ( n p̂_n − Σ_{k=1}^{n−1} k β̂_k p̂_{n−k} ) / (n p̂_0) (4.8)

if n̂_0 = 0, showing that the rate estimates can be directly calculated from the histogram of the bin counts in this case. Explicit expressions for asymptotic (co-)variances can be derived in a similar way; cf. Appendix B. Unfortunately, the formulae quickly become complicated, and a computer algebra system will be required to handle all but the simplest cases.
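When the winding number vanishes, (4.8) reduces estimation to simple arithmetic on the histogram. A sketch of the recursion (our transcription; it presupposes p̂_0 > 0):

```python
import numpy as np

def rates_from_histogram(Z, h, n_max):
    """Recover nu-hat_1, ..., nu-hat_n_max from the bin-count histogram via
    the recursion (4.3)/(4.8): with beta_n the n-th Taylor coefficient of
    log(sum_k p_k w^k), one has n*p_n = sum_{k=1}^n k*beta_k*p_{n-k},
    and nu-hat_n = beta_n / h. Requires p_0 > 0 (winding number zero)."""
    p = np.bincount(Z, minlength=n_max + 1) / len(Z)
    beta = np.zeros(n_max + 1)
    for n in range(1, n_max + 1):
        s = sum(k * beta[k] * p[n - k] for k in range(1, n))
        beta[n] = (n * p[n] - s) / (n * p[0])
    return beta[1:] / h

rng = np.random.default_rng(1)
h, L = 0.01, 40000
Z = rng.poisson(40 * h, L) + 2 * rng.poisson(10 * h, L)   # nu_1 = 40, nu_2 = 10
nu_hat = rates_from_histogram(Z, h, n_max=5)
```

On the same data this reproduces the Fourier-inversion estimates up to numerical integration error, as the identity of (2.4) and (4.7) predicts.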

Tests
Tests of null-hypotheses pertaining to the parameters ν_k can be constructed on the basis of our estimates and (co-)variance approximations, under the assumption that the estimates are consistent and approximately normally distributed. Practical implementation of any such test requires an estimate of the asymptotic (co-)variances, that is, of the kernel Γ_h(θ_1, θ_2); the other quantities are known. One obvious possibility is to replace the "true" characteristic function γ_h by its empirical version γ̂_h everywhere in (3.4). For an alternative that is expected to be more stable statistically, note that by (3.4) and (2.1) the kernel Γ_h(θ_1, θ_2) admits the explicit representation

Γ_h(θ_1, θ_2) = h^{−1} ( exp( h Σ_n ν_n (e^{inθ_1} − 1)(e^{inθ_2} − 1) ) − 1 ), (5.1)

which allows us to estimate Γ_h(θ_1, θ_2) by substituting estimates ν̂_n for the unknowns ν_n. The ν̂_n can assume negative values, with potentially unpleasant consequences. To avoid these, we plug in the positive parts ν̂_n^+ instead (x^+ = x if x > 0, and x^+ = 0 otherwise). In practice the sum in (5.1) has to be truncated, and the proposed estimate of Γ_h(θ_1, θ_2) becomes

Γ̂_h(θ_1, θ_2) = h^{−1} ( exp( h Σ_{n=1}^{K} ν̂_n^+ (e^{inθ_1} − 1)(e^{inθ_2} − 1) ) − 1 ), (5.2)

with some suitably chosen number K. For example, if a null-hypothesis implies ν_k = 0 for k > m, one may take K = m. This choice is not mandatory, however. One could also set K = M if ν_1, . . . , ν_M are the rates that are being estimated. Whatever the concrete choice of Γ̂_h, let T^{−1} Ω̂_{m,n}, T^{−1} Σ̂_{m,n}, . . . denote the estimated covariances obtained by replacing Γ_h in (3.5), (B.1), . . . by Γ̂_h. Standard ANOVA-type tests impose linear restrictions on the rates, such as H_0: "Aν = 0," where ν denotes the column vector with entries ν_1, . . . , ν_M and A is some q × M matrix of full rank q. The corresponding Wald-type test statistic

W = T (Aν̂)′ (A Ω̂ A′)^{−1} Aν̂

is approximately χ²-distributed with q degrees of freedom under H_0. Related two-sample analogs may be used to test for equality of ν in different subjects, or under different conditions in the same subject.
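In matrix form the Wald statistic is a one-liner. The toy sketch below uses invented numbers and the crude small-h diagonal approximation Ω̂ ≈ diag(ν̂_n^+) with an ad hoc positive floor, instead of the kernel-based estimate (5.2):

```python
import numpy as np

def wald_statistic(nu_hat, Omega_hat, A, T):
    """W = T * (A nu)' (A Omega A')^{-1} (A nu) for H0: A nu = 0."""
    Anu = A @ nu_hat
    return float(T * Anu @ np.linalg.solve(A @ Omega_hat @ A.T, Anu))

# toy numbers (not from the paper): test H0: nu_4 = nu_5 = 0
nu_hat = np.array([40.2, 9.7, 4.1, 0.15, -0.1])
Omega_hat = np.diag(np.maximum(nu_hat, 0.5))  # diagonal small-h approximation,
                                              # floored at 0.5 to stay invertible
A = np.zeros((2, 5))
A[0, 3] = 1.0
A[1, 4] = 1.0
T = 30.0
W = wald_statistic(nu_hat, Omega_hat, A, T)
reject = W > 5.991   # chi-squared 5% critical value, q = 2 degrees of freedom
```

With these numbers W = 30 · (0.15²/0.5 + 0.1²/0.5) = 1.95, so H_0 is not rejected.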
Other tests relevant to our goals pertain to the tail sums ρ_m. Let n* denote the maximal index k such that ν_k > 0. Clearly then, ρ_m > 0 for every m ≤ n* and ρ_m = 0 for every m > n*. This suggests considering null-hypotheses of the form H_0: "ρ_m = 0 for all m_1 ≤ m ≤ m_2," for some fixed pair m_1, m_2 such that 2 ≤ m_1 ≤ m_2. An associated test would reject H_0 if V = max_{m_1≤m≤m_2} V_m exceeds some critical value, where V_m = (T / Σ̂_{m,m})^{1/2} ρ̂_m. An approximate null-distribution for V is difficult to derive theoretically (except for the trivial case m_1 = m_2), but may be estimable by means of the bootstrap method. Alternatively, one may test each single hypothesis "ρ_m = 0," observing that V_m is approximately standard normally distributed (if ρ_m = 0), and finally apply some test-level correction. Most useful in applications may be a simple screening procedure: plotting the standardized statistics V_m (or the corresponding p-values) against m could suffice to reveal the major features of interest.

Simulation results
In the following we present simulation results for several parameter configurations, including one where the Fourier inversion method fails to perform reliably. For Example 1 we chose (ν_1, . . . , ν_5) = (40, 10, 4, 3, 1) and ν_k = 0 for all k > 5. The observation length was T = 30, and the bin width h = 0.02. (All times are specified in seconds, all rates in events/second.) For Example 2 we put ν_1 = 150 and ν_7 = 7, and ν_n = 0 otherwise. In this case we had T = 60 and h = 0.005. For each example we generated 50 realizations of the respective compound Poisson process by Monte Carlo simulation. In order to illustrate the connection with the usual multivariate approach we split each process into a multivariate Poisson process representing a hypothetical population of neurons (cf. the paragraph preceding Eq. (1.4)). The populations comprised N = 30 and N = 20 neurons, respectively, in the two examples shown, yielding average single-neuron firing rates of about 3 Hz and 10 Hz. For simplicity, we chose a maximum entropy representation in which all neurons have the same probability of participating in a particular pattern. That is, we generated independent ppp's Y_n with intensities ν_n and assigned the n spikes occurring at each jump of Y_n randomly to one of the $\binom{N}{n}$ n-element subsets a of the neurons. Recall, however, that the single spike trains are shown merely for illustration: our statistical analyses depend only on the bin counts of the compound process. These are displayed below the single spike trains in Figure 1.
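The maximum-entropy splitting can be sketched as follows (illustrative code of our own; drawing the jump times uniformly on [0, T] given their Poisson number is equivalent to simulating the ppp directly):

```python
import numpy as np

def multivariate_representation(nu, N, T, rng):
    """Maximum-entropy multivariate representation: for each order n, the
    jumps of an independent Poisson process Y_n with rate nu[n] are assigned
    to a uniformly random n-element subset of the N neurons.
    Returns a time-sorted list of (time, neuron) spike pairs."""
    spikes = []
    for n, rate in nu.items():
        k = rng.poisson(rate * T)              # number of order-n patterns
        times = rng.uniform(0.0, T, size=k)    # their jump times
        for t in times:
            members = rng.choice(N, size=n, replace=False)
            spikes.extend((t, i) for i in members)
    spikes.sort()
    return spikes

# Example 1 configuration: N = 30 neurons, T = 30 s
nu = {1: 40.0, 2: 10.0, 3: 4.0, 4: 3.0, 5: 1.0}
spikes = multivariate_representation(nu, N=30, T=30.0, rng=np.random.default_rng(2))
# expected total spike count: T * sum_n n*nu_n = 30 * 89 = 2670,
# i.e. about 3 Hz per neuron on average
```

The compound process is recovered by binning the spike times alone, discarding the neuron labels.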
For each simulated realization we estimated ν_n and ρ_n for n = 1, . . . , 12 using the Fourier method, with associated covariance matrix estimates obtained according to (5.2). As an illustration of hypothesis testing we calculated test power profiles β_n as follows: for each n = 1, . . . , 12 we determined β_n as the relative frequency among the 50 simulations in which the test statistic V_n = (T / Σ̂_{n,n})^{1/2} ρ̂_n exceeded the value 2, roughly corresponding to a one-sided 5% level test of the null-hypothesis "ρ_n = 0." The estimated rate profiles reproduce the "true" profiles fairly well (Figure 2, left-hand column). The sharp decrease of the power profile in Figure 2d) properly reflects the vanishing of all ν_n for n > 7. Such a clear separation cannot be expected in Example 1, where the transition from the non-zero to the zero rates is rather smooth; see Figure 2c).

Simulations sometimes exhibit a peculiar phenomenon: the estimates ν̂_n tend to form clusters. While in benign cases there is only one cluster, concentrated at the true rates, an increase of hν_+ induces splits of the estimated rates into two or more distinct clusters located at totally wrong rate profiles. The reason for this phenomenon is that the ecf has a nonzero winding number n̂_0 in such a case, making the log-ecf end up in a different branch. This gives us a simple diagnostic: the phase angle of γ̂_h(θ), ℑ log γ̂_h(θ), then changes by a nonzero amount 2πn̂_0 along the loop [−π, π] ∋ θ → γ̂_h(θ). In particular, ℑ log γ̂_h(π) − ℑ log γ̂_h(0) equals a nonzero multiple of π in such a case (and is zero otherwise).

[Figure 3: An example where plain Fourier-based de-Poissonization fails. Rate profiles (a,c,e) and imaginary parts of the log characteristic functions (b,d,f) of the "true" model are represented by black lines, the empirical analogs by blue or red lines, respectively, according as the winding number of the initial ecf is zero or non-zero. The deviating rate profiles (a) associated with a non-zero winding number of the ecf (b) are effectively corrected by "shrinking" (c,d) and "editing" (e,f), with one exception in the case of shrinking.]
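The winding-number diagnostic lends itself to automation (a sketch; in the second example the generating polynomial 0.1 + 0.8w + 0.1w² has a zero near w = −0.13 inside the unit disc, so the winding number is 1):

```python
import numpy as np

def winding_number(p, n_grid=4096):
    """Winding number around the origin of the ecf loop
    theta -> sum_k p_k exp(i*k*theta), theta in [-pi, pi],
    where p holds the empirical bin-count frequencies."""
    theta = np.linspace(-np.pi, np.pi, n_grid)
    ecf = np.exp(1j * np.outer(theta, np.arange(len(p)))) @ np.asarray(p)
    phase = np.unwrap(np.angle(ecf))        # continuous phase along the loop
    return int(round((phase[-1] - phase[0]) / (2 * np.pi)))

# benign case: histogram of a low-rate Poisson sample
rng = np.random.default_rng(3)
Z = rng.poisson(0.8, 5000)
n0_benign = winding_number(np.bincount(Z) / len(Z))

# pathological case: a zero of the ecf polynomial lies inside the unit disc
n0_bad = winding_number(np.array([0.1, 0.8, 0.1]))
```

A nonzero result flags exactly the branch failures described above, before any rate estimate is computed.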
A simple remedy is to shrink the loop towards 1 so that it no longer circles around 0: given some δ ∈ (0, 1), replace γ̂_h(θ) by γ̂_h(θ; δ) = δ + (1 − δ) γ̂_h(θ), and obtain rate estimates ν̂_k(δ) based on γ̂_h(·; δ) (rather than on γ̂_h(·)). As long as δ is small, the resulting perturbation of the estimates should be small, too. For a less crude corrective one starts out directly from the origin of non-zero winding numbers: these result if γ̂_h, considered as a polynomial in w ∈ C, has zeros within the unit disc (cf. Section 4). The idea here is to move any such zero "outside" by multiplying it with a suitable factor. For definiteness, let ε > 0 and suppose that γ̂_h(w) = Σ_{k=0}^{K} p̂_k w^k (K = max_l Z_l) has zeros α_1, . . . , α_K, so that γ̂_h(w) = Π_{k=1}^{K} (w − α_k)/(1 − α_k). Let "edited zeros" α̃_k = α̃_k(ε) be defined by α̃_k = (1 + ε) α_k/|α_k| if |α_k| ≤ 1 + ε, and α̃_k = α_k otherwise, and put

γ̃_h(w; ε) = Π_{k=1}^{K} (w − α̃_k)/(1 − α̃_k).

Then define rate estimates ν̂_n(ε) as above, now replacing γ̂_h(·) by γ̃_h(·; ε). Figure 3 illustrates both procedures for the following choice of parameters: (ν_1, . . . , ν_4) = (17, 11, 14, 6) and ν_k = 0 for k > 4; T = 60; h = 0.05; shrinking parameter δ = 0.02; editing parameter ε = 0.075. Our simulations indicate that editing induces less bias in the rate estimates than shrinking. Moreover, it always works, while for shrinking this holds true only if δ is selected adaptively. On the other hand, such an adaptive choice is easily implemented, and shrinking does not require the computation of all zeros of γ̂_h. We will come back to these matters elsewhere, along with more extensive simulation studies and applications to measured data.
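The editing step can be sketched with NumPy's root finder (our transcription; it assumes p̂_0 > 0, so no zero sits at the origin):

```python
import numpy as np

def edit_ecf_zeros(p, eps=0.075):
    """'Editing': move every zero of the ecf polynomial
    gamma(w) = sum_k p_k w^k lying in |w| <= 1 + eps radially out to
    |w| = 1 + eps, rebuild the polynomial, and renormalize so that the
    edited gamma(1) = 1. Assumes p[0] > 0 (no zero at the origin)."""
    roots = np.roots(p[::-1])           # np.roots expects highest degree first
    mod = np.abs(roots)
    edited = np.where(mod <= 1 + eps, (1 + eps) * roots / mod, roots)
    coef = np.poly(edited)[::-1]        # back to ascending coefficients
    return (coef / coef.sum()).real     # normalize: edited gamma(1) = 1

# a histogram whose ecf polynomial has a zero inside the unit disc
p = np.array([0.1, 0.8, 0.1])
coef = edit_ecf_zeros(p, eps=0.075)
# all zeros of the edited polynomial now lie outside the unit circle,
# so the recursion (4.8) or the inversion (2.4) can be applied safely
```

The edited coefficients are no longer exact frequencies, but they define a zero-free ecf surrogate to which the estimation machinery applies unchanged.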

Discussion
In this paper we have proposed a compound Poisson model along with associated statistical procedures for the analysis of higher-order interactions in neuronal assembly activity. The proposed model is purely phenomenological; explanation of neural mechanisms is neither intended nor within its scope. Its main purpose is to provide a formal framework for the statistical analysis of active assemblies in sampled data, to the degree that they are reflected by the occurrence of higher-order spike patterns. This goal is achieved without recourse to simultaneous recordings of the single-neuron spike trains. Only the compound signal is used, which comprises the superimposed spike trains of all neurons. It is then impossible to identify specific groups of neurons that participate in specific coordinated events. In return, our approach allows the design of statistical procedures that operate reliably on much smaller samples than other methods suggested previously. Moreover, it avoids certain technical problems associated with the identification of individual neurons in population signals typically obtained with extracellular recording electrodes: although spikes still have to be isolated from noise, and simultaneous spikes on a single channel have to be correctly identified as such, our approach practically eliminates the need for "spike sorting", a difficult and error-prone procedure (often based on a classification of spike wave-forms) through which spikes are assigned to individual neurons.
Technically, our approach hinges on the compound Poisson character of the bin count distribution. Ultimately, it is this assumption that establishes the connection between the observed counts and the not directly observable higher-order correlations in the activity of the neuron population. Doing without any such functional relation is impossible in inferences based on the compound activity only. On the other hand, as with any model the assumptions may not be adequate in applications, and potentially serious consequences have to be taken into account if they are violated. Let us address some issues that might turn out problematic with actual neuronal behavior.
It is currently hard to judge how well a real neuronal population conforms to the Poisson model, although irregular (i.e. Poissonian) spike trains are considered to be characteristic of the normal operation of neocortical networks. Recently, evidence has been accumulating that the single-neuron firing rates in the cortex of behaving animals are extremely low (≪ 1 Hz, Lee et al. [18]), much lower than previously thought (Shadlen and Newsome [26]). In our model, individual spikes are distributed over a large number of different assemblies, so "statistics of rare events" may be an adequate approximation. Deviations from Poisson statistics in real neurons, on the other hand, seem to occur mainly at higher firing rates, e.g. during stimulus response (Amarasingham et al. [3]) or in the large projection neurons of the motor cortex (Baker and Lemon [4]).
Our assumption of identically distributed bin counts means that the compound process Z, and the carrier process Y + , respectively, should have stationary increments within the observation period [0, T ]. While quasi-stationary stretches of data of sufficient length could possibly be obtained in anesthetized animals, the task-dependent modulations of neuronal firing rates in behaving animals generate inevitable non-stationarities. The consequences of unattended firing rate modulations are potentially quite serious: fluctuating rates increase the weight in the tail of the count distribution, which falsely predicts the presence of (higher-order) correlations. The related problems are lessened in experiments avoiding rapid changes of external conditions and long observation periods. In regard to the conflict between stationarity and efficiency, which demand short and long observation intervals, respectively, the parsimonious parametrization of the higher-order interactions appears as a particularly advantageous feature of our approach, since it helps to limit the length of the observation period required for reliable inferences.
Alternative parametrizations could also contribute to alleviating the consequences of non-stationarity. Thus one could treat the jump rate ν_+ of the carrier process as a nuisance parameter and consider the normalized rates ω_n = ν_n/ν_+ as the parameters of interest. This would be appropriate particularly if only the general activation level ν_+ changes across time while the proportions between the rates ν_n remain unaffected. Stabilizing the variances of the estimates may also be desirable, which would suggest the parametrization ω_n = (ν_n/ν_+)^{1/2}.
For simplicity we assumed that the "coordinated" spikes that constitute a pattern in our model of population activity occur strictly simultaneously. Such precision of synchrony cannot be expected in reality. Intuitively, this should not hamper our theory as long as the "jitter" is small compared to the width h of the sampling bins, which was 5-20 milliseconds in most test simulations we looked at. In fact, this generates yet another conflict of interest: "small bins" improve the performance of the estimators, while "large bins" are less affected by jittered patterns (and, perhaps, by residual correlation across time resulting from spike after-effects).
What could be gained from analyses such as those proposed in this paper?
Positive statistical evidence for higher-order correlations in neuronal spike trains would be a novel result, potentially challenging conventional rate coding theories, as well as statements about the sufficiency of second-order correlations for capturing the "essential aspects" of population activity in certain neuronal systems (e.g. Schneidman et al. [25], Shlens et al. [27]). However, spiking activity in a recurrent network will by necessity have a non-trivial statistical structure, potentially also including higher-order correlations. This is a consequence of network features such as shared input and multiple feedback loops. Careful analysis of corresponding network models is necessary in order to correctly identify such structurally determined correlations, and not to mistake them for signatures of dynamic neuronal assemblies.
In fact, let us first suppose that hν_+ stays bounded away from zero; then the relevant remainder tends to zero by (b) and (c). Similarly, if hν_+ = o(1), the claim follows from (a) and (c). These considerations, along with the exponential dependence of the asymptotic variance on hν_+ = −log p_0, point to the crucial role of the quantity hν_+; this is evident, on the other hand, considering that hν_+ is the expected number of points of the carrier process in a bin. The present findings are corroborated in the following.
Formulae for higher orders become unwieldy, but can be obtained using a computer algebra system.