Path-Integral Methods for Analyzing the Effects of Fluctuations in Stochastic Hybrid Neural Networks

We consider applications of path-integral methods to the analysis of a stochastic hybrid model representing a network of synaptically coupled spiking neuronal populations. The state of each local population is described in terms of two stochastic variables, a continuous synaptic variable and a discrete activity variable. The synaptic variables evolve according to piecewise-deterministic dynamics describing, at the population level, synapses driven by spiking activity. The dynamical equations for the synaptic currents are only valid between jumps in spiking activity, and the latter are described by a jump Markov process whose transition rates depend on the synaptic variables. We assume a separation of time scales between fast spiking dynamics with time constant \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\tau_{a}$\end{document}τa and slower synaptic dynamics with time constant τ. This naturally introduces a small positive parameter \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\epsilon=\tau _{a}/\tau$\end{document}ϵ=τa/τ, which can be used to develop various asymptotic expansions of the corresponding path-integral representation of the stochastic dynamics. First, we derive a variational principle for maximum-likelihood paths of escape from a metastable state (large deviations in the small noise limit \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\epsilon\rightarrow0$\end{document}ϵ→0). We then show how the path integral provides an efficient method for obtaining a diffusion approximation of the hybrid system for small ϵ. The resulting Langevin equation can be used to analyze the effects of fluctuations within the basin of attraction of a metastable state, that is, ignoring the effects of large deviations. We illustrate this by using the Langevin approximation to analyze the effects of intrinsic noise on pattern formation in a spatially structured hybrid network. In particular, we show how noise enlarges the parameter regime over which patterns occur, in an analogous fashion to PDEs. Finally, we carry out a \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$1/\epsilon$\end{document}1/ϵ-loop expansion of the path integral, and use this to derive corrections to voltage-based mean-field equations, analogous to the modified activity-based equations generated from a neural master equation.


Introduction
One of the major challenges in neuroscience is developing our understanding of how noise at the molecular and cellular levels affects dynamics and information processing at the macroscopic level of synaptically coupled neuronal populations. It is well known that the spike trains of individual cortical neurons in vivo tend to be very noisy, having interspike interval (ISI) distributions that are close to Poisson [1,2]. Indeed, one observes trial-to-trial variability in spike trains, even across trials in which external stimuli are identical. On the other hand, neurons are continuously bombarded by thousands of synaptic inputs, many of which are uncorrelated, so that an application of the law of large numbers would suggest that total input fluctuations are small. This would make it difficult to account for the Poisson-like behavior of individual neurons, even when stochastic ion channel fluctuations or random synaptic background activity is taken into account. One paradigm for reconciling these issues is the so-called balanced network [3][4][5]. In such networks, each neuron is driven by a combination of strong excitation and strong inhibition, which mainly cancel each other out, so that the remaining fluctuations occasionally and irregularly push the neuron over the firing threshold. Even in the absence of any external sources of noise, the resulting deterministic dynamics is chaotic and neural outputs are Poisson-like. Interestingly, there is some experimental evidence that cortical networks can operate in a balanced regime [6].
Another emergent feature of balanced networks is that they can support an asynchronous state characterized by large variability in single neuron spiking, and yet arbitrarily small pairwise correlations, even in the presence of substantial amounts of shared inputs [7]. Thus there is a growing consensus that the trial-to-trial irregularity in the spiking of individual neurons is often unimportant, and that information is typically encoded in firing rates. There is then another level of neural variability, namely, trial-to-trial variations in the firing rates themselves. Recent physiological data shows that the onset of a stimulus reduces firing-rate fluctuations in cortical neurons, while having little or no effect on the spiking variability [8]. Litwin-Kumar and Doiron have recently shown how these two levels of stochastic variability can emerge in a balanced network of randomly connected spiking neurons, in which a small amount of clustered connections induces firing-rate fluctuations superimposed on spontaneous spike fluctuations [9].
Various experimental and computational studies of neural variability thus motivate the incorporation of noise into rate-based neural network models [10]. One approach is to add extrinsic noise terms to deterministic models resulting in a neural Langevin equation [11][12][13][14][15]. An alternative approach is to assume that noise arises intrinsically as a collective population effect, and to describe the stochastic dynamics in terms of a neural master equation [16][17][18][19][20]. In the latter case, neurons are partitioned into a set of M local homogeneous populations labeled α = 1, . . . , M, each consisting of N neurons. The state of each population at time t is specified by the number N α (t) of active neurons in a sliding window (t, t + t], and transition rates between the discrete states are chosen so that standard rate-based models are obtained in the mean-field limit, where statistical correlations can be ignored. There are two versions of the neural master equation, which can be distinguished by the size of the sliding window width t. (Note that the stochastic models are keeping track of changes in population activity.) One version assumes that each population operates close to an asynchronous state for large N [18,19], so that one-step changes in population activity occur relatively slowly. Hence, one can set t = 1 and take N to be large but finite. The other version of the neural master equation assumes that population activity is approximately characterized by a Poisson process [17,20]. In order to maintain a one-step jump Markov process, it is necessary to take the limits t → 0, N → ∞ such that N t = 1. Thus, one considers the number of active neurons in an infinite background sea of inactive neurons, which is reasonable if the networks are in low activity states. (Note that it is also possible to interpret the master equation of Buice et al. in terms of activity states of individual neurons rather than populations [17,20].) One way to link the two versions of the neural master equation is to extend the Doi-Peliti path-integral representation of chemical master equations [21][22][23] to the neural case; the difference between the two versions then reduces to a different choice of scaling of the underlying action functional [18]. Buice et al. [17,20] used diagrammatic perturbations methods (Feynman graphs) to generate a truncated moment hierarchy based on factorial moments, and thus determined corrections to meanfield theory involving coupling to two-point and higher-order cumulants. They also used renormalization group methods to derive scaling laws for statistical correlations close to criticality, that is, close to a bifurcation point of the underlying deterministic model [17]. On the other hand, Bressloff [18,19] showed how the path-integral representation of the master equation can be used to investigate large deviations or rare event statistics underlying escape from the basin of attraction of a metastable state, following along analogous lines to previous work on large deviations in chemical master equations [24][25][26].
One limitation of both versions of the neural master equation is that they neglect the dynamics of synaptic currents. The latter could be particularly significant if the time scale τ of synaptic dynamics is larger than the window width t. Therefore, we recently extended the Buice et al. neural master equation by formulating the network population dynamics in terms of a stochastic hybrid system also known as a 'velocity' jump Markov process [27]. The state of each population is now described in terms of two stochastic variables U α (t) and N α (t). The synaptic variables U α (t) evolve according to piecewise-deterministic dynamics describing, at the population level, synapses driven by spiking activity. These equations are only valid between jumps in spiking activity N α (t), which are described by a jump Markov process whose transition rates depend on the synaptic variables. We also showed how asymptotic methods recently developed to study metastability in other stochastic hybrid systems, such as stochastic ion channels, motor-driven intracellular cargo transport, and gene networks [28][29][30][31][32], can be extended to analyze metastability in stochastic hybrid neural networks, in a regime where the synaptic dynamics is much slower than the spiking dynamics. In the case of ion channels, N α would represent the number of open channels of type α, whereas U α would be replaced by the membrane voltage V . On the other hand, for intracellular transport, N α would be the number of motors of type α actively transporting a cargo and U α would be replaced by spatial position along the track.
In this paper we show how a path-integral representation of a stochastic hybrid neural network provides a unifying framework for a variety of asymptotic perturbation methods. The basic hybrid neural network model is described in Sect. 2, where we consider several limiting cases. In Sect. 3, we reprise the path-integral construction of Bressloff and Newby [33], highlighting certain features that were not covered in the original treatment, including the connection with large-deviation principles [34], and potential difficulties in the thermodynamic limit N → ∞. In Sect. 4, we derive the basic variational principle that can be used to explore maximum-likelihood paths of escape from a metastable state, and relate the theory to the underlying Hamiltonian structure of the path-integral representation. In Sect. 5, we show how the pathintegral representation provides an efficient method for deriving a diffusion approximation of a stochastic hybrid neural network. Although the diffusion approximation breaks down when considering escape problems, it provides useful insights into the effects of fluctuations within the basin of attraction of a given solution. We illustrate this by using the diffusion approximation to explore the effects of noise on neural pattern formation in a spatially structured network. In particular, we show how noise expands the parameter regime over which patterns can be observed, in an analogous fashion to stochastic PDEs. Finally, in Sect. 6, we use the path-integral representation to derive corrections to voltage-based mean-field equations, along analogous lines to the analysis of activity-based mean-field equations arising from the neural master equation [17,20].

Stochastic Hybrid Network Model
We first describe a stochastic neural network model that generalizes the neural master equation [17,18,20] by incorporating synaptic dynamics. (A more detailed derivation of the model can be found in [27].) Note that there does not currently exist a complete, rigorous derivation of population rate-based models starting from detailed biophysical models of individual neurons, although some significant progress has been made in a series of papers by Buice and Chow on generalized activity equations for theta neurons [35][36][37]. Therefore, the construction of the stochastic rate-based model is phenomenological in nature. However, it is motivated by the idea that finite-size effects in local populations of neurons acts as a source of intrinsic noise. Consider a set of M homogeneous populations labeled α = 1, . . . , M, with N neurons in each population. (A straightforward generalization would be for each population to consist of O(N ) neurons.) The output activity of each population is taken to be a discrete stochastic variable A α (t) given by where N α (t) is the number of neurons in the αth population that fired in the time interval [t − t, t], and t is the width of a sliding window that counts spikes. The discrete stochastic variables N α (t) are taken to evolve according to a one-step jump Markov process: with corresponding transition rates Here F is a sigmoid firing-rate or gain function where γ , κ correspond to the gain and threshold, respectively, F 0 is the maximum firing rate, and U α (t) is the effective synaptic current into the αth population, which evolves (for exponential synapses) according to We will assume that N is large but finite and take N t = 1. In the dual limits N → ∞ and τ → 0, our model then reduces to the Buice et al. [17,20] version of the neural master equation. The resulting stochastic process defined by (2.1)-(2.5) is an example of a stochastic hybrid system based on a piecewise-deterministic process. That is, the transition rate ω + depend on U α , with the latter itself coupled to the associated jump Markov according to (2.5), which is only defined between jumps, during which U α (t) evolves deterministically. It is important to note that the time constant τ a cannot be identified directly with membrane or synaptic time constants. Instead, it determines the relaxation rate of a local population to the instantaneous firing rate. Introduce the probability density Prob U α (t) ∈ (u α , u α + du), N α (t) = n α ; α = 1, . . . , M = p(u, n, t|u 0 , n 0 , 0) du, with u = (u 1 , . . . , u M ) and n = (n 1 , . . . , n M ). It follows from (2.1)-(2.5) that the probability density evolves according to the differential Chapman-Kolmogorov (CK) equation (dropping the explicit dependence on initial conditions) with v α (u, n) = −u α + β w αβ n β , (2.7) and T α the translation operator: T ±1 α f (n) = f (n α± ) for any function f with n α± denoting the configuration with n α replaced by n α ± 1. Equation (2.6) can be reexpressed in the more general form The drift 'velocities' v α (u, n) for fixed n represent the piecewise-deterministic synaptic dynamics according to and W is defined in terms of the u-dependent transition matrix T for the jump Markov process, that is, For fixed u, the matrix W α is irreducible (which means that there is a non-zero probability of transitioning, possibly in more than one step, from any state to any other state in the jump Markov process). Moreover, all off-diagonal elements are non-negative. It follows that the full transition matrix W (n, m; u) also has these properties and, hence, we can apply the Perron-Frobenius theorem to show that there exists a unique invariant measure for the Markov process. That is, the master equation has a globally attracting steady state ρ(u, n) such that p(u, n, t) → ρ(u, n) as t → ∞.
The Perron-Frobenius theorem states that [38] a real square matrix with positive entries has a unique largest real eigenvalue (the Perron eigenvalue) and that the corresponding eigenvector has strictly positive components. If we define a new transition matrix W (n, m; u) by W (n, m; u) = W (n, m; u) + δ n,m W * , W * = max m W (m, m; u) + κ, for an arbitrary κ > 0, then we can apply the Perron-Frobenius theorem directly to W and thus to W . Since n W (n, m; u) = 0 for all m, that is, η(n) = (1, 1, . . . , 1) T is a left null-vector, it follows that the Perron eigenvalue is λ = 0. The unique invariant measure then corresponds to the right null-vector of W for fixed u: m W (n, m; u)ρ(u, m) = 0. (2.10) The steady-state solution ρ(u, n) of (2.6) can be factorized as ρ(u, A sufficient condition for (2.11) to hold is Since ρ 0 (u, −1) ≡ 0, it then follows that J (u, 0) = 0 and thus J (u, n) = 0 for all n. Hence, we obtain the positive steady-state solution The Perron-Frobenius theorem ensures that this is the unique positive solution. The fact that the steady state factorizes is a consequence of the fact that the transition rates do not involve any coupling between populations-the only coupling appears in the drift terms of (2.6). Strictly speaking, the Perron-Frobenius theorem applies to finite-dimensional matrices, so we are assuming that N is finite. Nevertheless, in the thermodynamic limit N → ∞, the corresponding normalized density reduces to a Poisson process with rate F (u): There are two time scales in the CK equation (2.8), the synaptic time constant τ and the time constant τ a , which characterizes the relaxation rate of population activity. In the limit τ → 0, (2.5) reduces to the neural master equation of Buice et al. [17,20]. First, note that the synaptic variables U α (t) are eliminated by setting v α = 0, that is, U α (t) = β w αβ A β (t). This then leads to a pure birth-death process for the discrete variables N α (t). That is, let P (n, t) = Prob[N (t) = n] denote the probability that the network of interacting populations has configuration n = (n 1 , n 2 , . . . , n M ) at time t, t > 0, given some initial distribution P (n, 0). The probability distribution then evolves according to the birth-death master equation [17,18,20] where Buice et al. [20] show that the network operates in a Poisson-like regime in which the rates of the Poisson process are stochastic variables whose means evolve according to the activity-based mean-field equation On the other hand, if τ a → 0 for fixed τ , then we obtain deterministic voltage or current-based mean-field equations Since ρ(u, n) is given by a product of independent Poisson processes with rates F (u α ), consistent with the operating regime of the Buice et al. master equation [17,20], it follows that and (2.17) reduces to the standard voltage or current-based activity equation Note that the limit τ a → 0 is analogous to the slow synapse approximation used by Ermentrout [39] to reduce deterministic conductance-based neuron models to voltage-based rate models. Now suppose that the network operates in the regime 0 < τ a /τ ≡ 1. There is then a natural small parameter in the system, , which allows a variety of perturbation methods to be used: (i) A quasi-steady-state (QSS) diffusion approximation of the stochastic hybrid system, in which the CK equation (2.6) reduces to a Fokker-Planck equation [27]. This exploits the fact that for small there are typically a large number of transitions between different firing states n while the synaptic currents u hardly change at all. This implies that the system rapidly converges to the (quasi) steady state ρ(u, n), which will then be perturbed as u slowly evolves. (ii) The diffusion approximation captures the Gaussian-like fluctuations within the basin of attraction of a fixed point of the mean-field equations. However, for small this yields exponentially large errors for the transition rates between metastable states. (A similar problem arises in approximating chemical and neural master equations by a Fokker-Planck equation in the large N limit [19,24,40].) However, one can use a Wentzel-Kramers-Brillouin (WKB) approximation of solutions to the full CK equation to calculate the mean first passage time for escape [27]. (iii) Another way to analyze the dynamics of a stochastic hybrid network is to derive moment equations. However, for a nonlinear system, this yields an infinite hierarchy of coupled moment equations, resulting in the problem of moment closure. In the case of small , one can expand the moment equations to some finite order in .
In this paper, we show how a path-integral representation of a stochastic hybrid system provides a unifying framework for carrying out all three perturbation schemes highlighted above.

One-Population Model
We now derive the path-integral representation of a stochastic hybrid neural network using the construction introduced in [33]. For ease of notation, we consider a one-population model (M = 1); the generalization to multiple populations is then straightforward (see Sect. 3.4). We first discretize time by dividing a given interval [0, T ] into N equal subintervals of size t such that T = N t and set u j = u(j t), n j = n(j t). (Note that the infinitesimal time interval t used in path discretization is distinct from the width of the moving window used in the construction of the stochastic neural network; see Sect. 2. One should also take care to distinguish between the discrete time label j and the population label α.) The conditional probability density for u 1 , . . . , u N , given u 0 and a particular realization of the stochastic discrete variables n j , j = 0, . . . , N − 1, is Inserting the Fourier representation of the Dirac delta function gives On averaging with respect to the intermediate states In order to evaluate the above path integral, we introduce the eigenvalue equation and let ξ (s) m be the adjoint eigenvector satisfying In our original construction of the path-integral representation [33], we arrived at (3.4) and its adjoint through trial and error, based on our previous work on WKB methods. It turns out that the principal eigenvalue of the linear equation (3.4) can be related to the rate function of large-deviation theory, as we explain in Sect. 3.2. A basic result from linear algebra is that R (s) and ξ (s) form a bi-orthonormal set for fixed u, q. First, rewrite (3.4) and (3.5) in the compact form Defining the inner product ξ (s) , R (s ) = n ξ (s) n R (s ) n , we see that which leads to the completeness relation for all u, q. Now suppose that we insert multiple copies of the identity (3.6) into the path integral (3.2) with q = q j at the (j + 1)th time step. That is, taking we find that to leading order in O( u, t). It is important to note that the total path integral is independent of the q j , since performing the summations over s j recovers the Kronecker deltas. Let us now introduce the probability density Substituting for P using (3.2) and (3.7), leads to By inserting the eigenfunction products and using the Fourier representation of the Dirac delta function, we have introduced sums over the discrete labels s j and new phase variables p j . However, this allows us to obtain a simple action principle in the limit → 0. Since the path integral is ultimately independent of the q j , we are free to set q j = i p j for all j , thus eliminating the final exponential factor. (Fixing the q j is analogous to gauge-fixing in field theory.) This choice means that we can perform the summations with respect to the intermediate discrete states n j using the orthogonality relation We thus obtain the result that s j = s for all j , which means that we can then take the continuum limit of (3.9) to obtain the following path integral from u(0) = u 0 to u(τ ) = u (after performing the change of variables i p j → p j , that is, performing a contour deformation in the complex p-plane): Applying the Perron-Frobenius theorem to the linear operator on the left-hand side of (3.4) for fixed u and q, shows that there exists a real, simple Perron eigenvalue. We assume that the eigenvalues are ordered such that λ 0 > Re(λ 1 ) ≥ Re(λ 2 ) . . . with λ 0 the Perron eigenvalue. Since λ 0 is the only eigenvalue with a positive eigenfunction, we require on physical grounds that the initial and final states are only non-vanishing for s = 0. It follows that the sum over s in (3.9) projects to the single term s = 0. Also note that the factor R (0) n (u, p(τ ))ξ (0) n 0 (u 0 , p(0)) in (3.9) essentially projects on to stochastic trajectories that start in the discrete state n 0 and terminate in the discrete state n. We will ignore any restrictions on these discrete states and simply consider the probability density (for fixed u(0) = u 0 ) (3.10) with the action and λ 0 the Perron eigenvalue of the linear equation Formally comparing to classical mechanics, we have a path integral in a phase space (u, p) consisting of a dynamical variable u(t) and its 'conjugate momentum' p(t) with the Perron eigenvalue λ 0 (u, p) interpreted as a Hamiltonian. This underlying Hamiltonian structure of a stochastic hybrid system has also been identified using large-deviation theory [34,41]; see below.

Large-Deviation Principles
It is important to point out that the formal derivation of the path integral (3.10), see also [33], involves a few steps that have not been justified rigorously. First, we 'gauge fix' the path integral by setting q j = p j with p j pure imaginary. However, when we carry out steepest descents, we assume that the dominant contribution to the path integral in the complex p-plane occurs for real p j . (There is an assumption as regards analytic continuation.) This then allows us to apply the Perron-Frobenius theorem to the linear operator of the eigenvalue equation. Second, we have not established that the discrete path integral converges to a well-defined functional measure in the continuum limit. Nevertheless, it turns out that the resulting action S[u, p] is identical to one obtained using large-deviation theory [41][42][43]. This connection has recently been established by Bressloff and Faugeras [34]. We briefly summarize the main results here. Following [34], we take as our starting point a Lagrangian large-deviation principle of Faggionato et al. [42,43], which applies to a wide class of stochastic hybrid systems. Here we state the LDP for the particular one-population neural model. Finally, take P u 0 ,n 0 to be the probability density functional or law of the set of trajectories in Y u 0 . The following large-deviation principle then holds [42,43]: Then, for any given path

16)
where the rate function J T : Y u 0 → [0, ∞) is given by Here the symbol ∼ means asymptotic logarithmic equivalence in the limit → 0.
A key idea behind the LDP is that a slow dynamical process coupled to the fast Markov chain on Γ rapidly samples the different discrete states of Γ according to some non-negative measure ψ . In the limit → 0, one has ψ → ρ, where ρ is the ergodic measure of the Markov chain. On the other hand, for small but non-zero , ψ is itself distributed according to the LDP (3.16), whereby one averages the different functions v n (x) over the measure ψ to determine the dynamics of the slow system. In our population model, we are interested in the synaptic current u (for a current-based or voltage-based model). Eliminating ψ(t) using a contraction principle then leads to the following LDP for {u(t)} t∈[0,T ] alone [41,43]: Roughly speaking, one can understand the contraction principle in terms of steepest descents. That is where D[ψ] is the appropriate functional measure on M + ([0, T ]) Γ . The path integral is then dominated by the infimum of the rate function in the limit → 0.
In [34], it is proven that the rate function (3.18) can be written in the form of an action with Lagrangian given by Since ∂ μ λ = y, we see that p = μ i.e., we can identify the Lagrange multiplier μ in the construction of the Lagrangian as the conjugate momentum p of the Hamiltonian where λ 0 (u, p) is the Perron eigenvalue of the linear equation (3.12). It follows that the action obtained from a large-deviation principle is identical to the action (3.11) derived using formal path-integral methods.

Calculation of Perron Eigenvalue
In our previous work [32,33], we obtained an explicit solution for the Perron eigenvalue and the associated positive eigenvector by taking the number of discrete states in each population to be infinite, that is, N → ∞. However, the classical Perron-Frobenius theorem applies to finite-dimensional Markov processes. One consequence of this is that the Perron eigenvalue develops singularities in the thermodynamic limit.
In order to explore this issue, let us return to the one-population eigenvalue equation (3.12), which takes the explicit form In the infinite-dimensional case, one can formally solve this equation using the trial positive solution This yields the following equation relating Λ and p: We now collect terms independent of n and linear in n, respectively, to obtain the pair of equations It follows that There is clearly a singularity at p = 1/w such that Λ(u, p) < 0 for p > 1/w, contradicting the requirement that the eigenfunction R Assuming that R (0) n = Λ n /n! for 0 ≤ n < N , with Λ given by (3.26) and p < 1/w (positive solution), we see that the first equation is satisfied by taking R (0) However, the second equation requires In the large-N limit with p < 1/w, we set This shows that the given ansatz is a good approximation to the eigensolution for large N and p < 1/w. Clearly, the given ansatz breaks down as p crosses p = 1/w. Although the Perron-Frobenius theorem guarantees a unique positive solution for finite N , it does not have a simple expression in the large N limit. In conclusion, our expression (3.27) for the Perron eigenvalue only holds for p < 1/w. This does not affect our subsequent analysis because we evaluate the path integral in regions for which p < 1/w.

Multi-population Model
Following along identical lines to the one-population model, we can derive a pathintegral representation of the solution of the multi-population CK equation ( u, p, n), (3.30) and ξ (0) is the corresponding adjoint eigenvector. For sufficiently small p α s, (3.30) can be solved for the Perron eigenvalue in the thermodynamic limit N → ∞ using the ansatz Substituting into (3.30) and using the explicit expressions for W and v α , we find that Collecting terms in n α for each α yields p β w βα , (3.33) and collecting terms independent of all n α gives Solving for each Λ α in terms of p, we have As in the one-population model, the Perron eigenvalue has singularities, reflecting the possible breakdown of the Perron-Frobenius theorem in the thermodynamic limit.

A Variational Principle and Optimal Paths of Escape
It is clear from the formal structure of the path integral (3.28) that each synaptic variable u α has a 'conjugate momentum' p α with λ 0 (u, p) the corresponding 'Hamiltonian' H . Applying steepest descents to the path integral for small yields a variational principle in which maximum-likelihood paths minimize the action (3.29). As is well known from classical mechanics, the least action principle leads to Hamilton's equationsu H (u, p), (4.1) describing a 'classical particle' moving in the phase space (u, p). What is the physical interpretation of the solutions to Hamilton's equations? In order to address this question, suppose that the underlying deterministic mean-field equation (2.19) has a stable fixed point u s with some basin of attraction Ω, as illustrated in Fig. 1. If the system starts within Ω, then on relatively short time scales we expect the system to rapidly converge to u s along a classical deterministic trajectory, with noise generating Gaussian-like fluctuations about this trajectory. However, on a longer time scale, a rare event (large fluctuation) will generate a path of escape from u s to the boundary It follows that (u s , 0) is a fixed point in the full phase space with the line p = 0 a stable manifold. Along this manifold, u converges to u s according to the scalar version of the mean-field equation (2.19). From the explicit expression for λ 0 (u, p), (3.27), we see that there exists another zero energy solution given by which is the unique, non-trivial solution of the equation for positive functions ψ m (u). (Note that p < 1/w so that we do not have to worry about the singular nature of the Perron eigenvalue in the limit N → ∞.) It corresponds to the trajectory along the unstable manifold of (u s , 0) and is the optimal path of escape from u s . Along this optimal path λ 0 = 0, so that the corresponding action is given by the quasipotential A similar situation holds for the higher-dimensional case, except that there are now multiple maximum-likelihood paths of escape from a metastable state [27,33].

The Diffusion Approximation and Neural Pattern Formation
Another useful application of the multi-population path integral (3.28) is that it provides a direct method for obtaining a Gaussian or diffusion approximation of the stochastic hybrid system, equivalent to the one obtained using the more complicated QSS reduction [27]. Performing the rescaling p → ip/ gives The Gaussian approximation involves Taylor expanding the Lagrangian to first order in , which yields a quadratic in p: where Q αγ (u) = β w αβ F (u β )w γβ . Performing the Gaussian integration then yields with action functional where V α (u) = −u α + β w αβ F (u β ). This path integral is identical in form to the Onsager-Machlup path-integral representation [44] of solutions to the Langevin equation where the W α (t) are independent Wiener processes. Since there is no additional Jacobian factor in the Onsager-Machlup path integral, it follows that the Langevin equation is of the Ito form. As we have discussed extensively elsewhere [27,33], the diffusion or Gaussian approximation breaks down when solving escape problems. On the other hand, it provides useful information when analyzing the effects of fluctuations within the basin of attraction of a metastable state. For example, it is well known within the context of PDEs that fluctuations can enlarge the parameter regime over which time-periodic (limit cycles) or spatially periodic (Turing patterns) can occur. A similar phenomenon exists for stochastic hybrid neural networks. We will illustrate this by considering Turing-like instabilities in a spatially structured hybrid neural network under the diffusion approximation.

Noise-Induced Pattern Formation
Consider a system of coupled homogeneous neural populations that are distributed on a regular d-dimensional lattice L, with lattice spacing α and site index α ∈ L. Following recent studies of stochastic pattern formation in RD systems [45][46][47][48][49][50][51], we investigate the occurrence of stochastic neural patterns by linearizing the spatially discrete Langevin equation (5.2) about a homogeneous stationary solution u 0 of the mean-field equation (2.19) and calculating the resulting power spectrum using discrete Fourier transforms. In order to reflect the homogeneous structure of the weights we also set into (5.2) and Taylor expanding to first order in Φ gives the multivariate Ornstein-Uhlenbeck process with and Considerable insight into the behavior of the system can now be obtained by transforming to Fourier space [45,51]. For simplicity, consider a 1D lattice with periodic boundary conditions, u α+N = u α for α = 1, . . . , N and set the lattice spacing α = 1. Introduce the discrete Fourier transforms with k = 2πm/N , m = 0, . . . , N − 1. Using the following result for convolutions: the discrete Fourier transform of the Langevin equation is and ξ(k, t) = 0, Note that the homogeneous equation determines the stability of the fixed point u 0 in the absence of noise. It follows that the deterministic system is stable provided that J 0 (k) < 0 for all k. Suppose that the gain μ = F (u 0 ) is treated as a bifurcation parameter. Clearly if w(k) is bounded and μ is sufficiently small, then J 0 (k) < 0 for all k. However, if max k { w(k)} = w(k c ) > 0 then the fixed point becomes marginally stable at μ = μ c = 1/ w(k c ), resulting in the growth of a spatially periodic pattern of wavenumber k c as μ crosses μ c . A standard neural mechanism for inducing a Turing-like instability is to have a combination of short-range excitation and longer-range inhibition [52,53]. This can implemented in the 1D scalar model by taking w to be the difference-of-Gaussians (More precisely, in order to match the periodic boundary conditions, we should take w(α) = n w D (α − nN).) Spectral theory can now be used to determine the effects of noise on pattern formation. First, Fourier transforming the Langevin equation (5.6) with respect to time gives It follows that Defining the power spectrum by (5.10) From the deterministic theory, we know that the system undergoes a Turing instability (stationary patterns) rather than a Turing-Hopf instability (oscillatory patterns) so we can set Ω = 0 and determine conditions under which S(k, 0) has a peak at a nonzero, finite value of k, which is an indication of a stochastic pattern. Substituting the explicit expression for Λ(k, 0) and B 0 (k), we have (5.11) Suppose that μ ≡ F (u 0 ) < μ c so the system is below the deterministic critical point for a Turing instability. Clearly S(k, 0) becomes singular as μ → μ c , consistent with the fixed point becoming unstable. The main new result is that S(k, 0) has a peak at the critical wavenumber k c for all μ, 0 < μ < μ c = w(k c ) −1 . This follows from the fact that λ(k) < 0 for all k in the subcritical regime with min k {|λ(k)|} = |λ(k c )|. Hence, S(k, 0) will have a peak at k = k c provided that 0 < λ(k c ) ≡ 1 − μ w(k c ) < 1 =⇒ μ < μ c . (5.12) This is illustrated in Fig. 2.

Continuum Limit
The above stochastic model of a spatially structured lattice of neural populations can be reduced to a stochastic neural field by taking a continuum limit. A heuristic derivation proceeds as follows. Suppose that there is a uniform density ρ of populations distributed in R d . We then reinterpret u α as the mean current averaged over the ρ α d populations in the infinitesimal volume α d centered at the lattice point α ∈ L. If an individual population in the set of populations centered at α is labeled by the pair (α, j ), then u α = j u α,j ρ α d .
We will assume that the weights are slowly varying on the length-scale α so that w αj,α j = w αα . (Relaxing this assumption can lead to additional sources of stochasticity as explored in [12,14].) The deterministic mean-field (2.19) for an individual population becomes Averaging with respect to j gives under the approximation that all local populations are in a similar state so Effectively, we are scaling the population firing-rate function by a factor ρ α d . Finally, setting α = x, u α (t) = u(x, t), ρw αα = w(x, x ), and taking the continuum limit α → 0 yields the deterministic neural field equation Applying a similar analysis to the diffusion matrix, we have Hence, Q is independent of the local population labels j , j and the Langevin equation (5.2) becomes (5.14) Averaging with respect to j and taking the continuum limit yields the following neural field model with spatiotemporal Gaussian white noise: where B x, x = w x, x 2F U x , t (5.16) and η(x, t) = 0, For finite α, we have introduced the scaling η α (t)/ √ α d = η(x, t). From a numerical perspective, any computer simulation would involve rediscretizing space and then solving a time-discretized version of the resulting stochastic neural field equation. On the other hand, in order to investigate analytically the effects of noise on spatiotemporal dynamics such as traveling waves, it is more useful to work directly with stochastic neural fields. One can then adapt various PDE methods for studying noise in spatially extended systems [15,[54][55][56][57][58]. Finally, note that a large-deviation principle for a stochastic neural field with additive noise has been developed in [59].

Generating Functionals and the 1/ Loop Expansion
One step beyond the Gaussian approximation is to consider corrections to the meanfield equation (2.19), which couple the mean synaptic current with higher-order moments. As demonstrated previously for neural master equations [17,18,20], path integrals provide a systematic method for generating the hierarchy of moment equations. We will illustrate this by calculating the lowest-order correction to mean-field theory based on coupling to second-order correlations. One could then take investigate the bifurcation structure of the higher-dimensional dynamical system along analogous lines to Touboul and Ermentrout [13]. However, certain caution must be exercised, since one does not keep track of the validity of the truncated moment equations. Note that the path-integral methods used in this section were originally introduced within the context of stochastic processes by Martin-Siggia-Rose [60], and have previously been applied to stochastic neural networks by Sompolinsky et al. [61,62] and Buice et al. [17,20].

Generating Functional and Moments
First note that the average synaptic current U α is given by and two-point correlations are Another important characterization of the system is how the mean synaptic current responds to small external inputs. Suppose that we add a small external source term h α (t) onto the right-hand side of the deterministic rate equation (2.19). Linearizing about the time-dependent solution of the unperturbed equation (h ≡ 0) leads to the following (non-autonomous) linear equation for the perturbed solution Introducing the Green's function or propagator G 0 αβ (t, t ) according to the adjoint equation we can express the linear response as In other words, in terms of functional derivatives Now suppose that we add a source term to the path-integral representation. This corresponds to adding a term γ h γ (t)p γ (t) dt to the action (3.29). It follows that the associated Green's function for the full stochastic model is given by The above analysis motivates the introduction of the generating functional Various moments of physical interest can then be obtained by taking functional derivatives with respect to the 'current sources' J, J. For example,

Effective Action and Corrections to Mean-Field Equations
Let us rescale the currents according to J → J/ and J → J/ so that we can apply a loop expansion of the path integral (6.8), which is a diagrammatic method for carrying out an expansion based on steepest descents or the saddle-point method. First, we introduce the exact means and we shift the variables by Expanding the action in (6.8) to second order in the shifted variables u, p yields an infinite-dimensional Gaussian integral, which can be formally evaluated to give where D[ν, ν] is the matrix with components We have introduced the vectors u r , r = 1, 2 with u 1 = u, u 2 = p. Using the following identity for a matrix M: In order to use the above expansion to determine corrections to the mean-field equations, it is first necessary to introduce a little more formalism. First, consider the Legendre transformation where W [J, J] = −N −1 log Z[J, J] and Γ is known as the effective action. Since , (6.13) it follows from functionally differentiating (6.12) that . (6.14) Dynamical equations for the physical mean fields ν α (t) are then generated by setting J = 0 = J in (6.14). Another useful result is obtained by functionally differentiating (6.13) with respect to the mean fields ν, ν: where ν 1 α = ν α , ν 2 α = ν α and J 1 α = J α , J 2 α = J α . Differentiating (6.14) with respect to J, J then shows that It now follows from (6.10) and (6.12) that Γ [ν, ν] = S eff [(ν, ν)] + O( 2 ). Moreover, (6.9) and (6.15) imply that D[ν, ν] = D[ν, ν] + O( ), that is, we can take D[ν, ν] to be the inverse of the two-point covariance matrix. The first-order correction to the mean-field equation (2.19) is then obtained from (6.14) after setting J = J = ν = 0: .
The functional derivative in the above equation forces t = t = t (see also [20]). Since the only non-vanishing, equal-time two-point correlation function when p = 0 is for r = s = 1, it follows that where C βγ (t) = U β (t)U γ (t) − U β (t) U γ (t) , and δ δp α (t) Evaluating the functional derivative of the action S given by (3.29) and ( The corrections to mean-field theory for a stochastic hybrid neural network differ significantly from those derived for the Buice et al. master equation [17,20]. There are two primary sources of such differences. One arises from the fact that the mean equation is in 'Amari form' (with the weight matrix outside the nonlinearity). This accounts for all the difference in (6.16) for the mean, which would otherwise be identical to that of Buice et al., and the last term involving C in (6.17). The other difference is in the non-homogeneous source term for the C equation, which appears as γ w αγ F (u γ )w βγ . Whereas the Buice et al. correlations are determined by multiple network motifs (with the lowest order being the direct connection w αβ from β to α), our result for the hybrid model indicates that the source term is given by divergent motifs indicating common input from a third population (population γ → populations α, β).

Discussion
In conclusion, we have constructed a path-integral representation of solutions to a stochastic hybrid neural network, and shown how this provides a unifying framework for carrying out various perturbation schemes for analyzing the stochastic dynamics, namely, large deviations, diffusion approximations, and corrections to mean-field equations. We highlighted the fact that the path-integral action can be expressed in terms of a Hamiltonian, which is given by the Perron eigenvalue of an appropriately defined linear operator. The latter depends on the transition rates and drift terms of the underlying hybrid system. The resulting action is consistent with that obtained using large-deviation theory.
In terms of the theory of stochastic neural networks, our hybrid model extends the neural master equation to include the effects of synaptic currents. In the limit of fast synapses one recovers the neural master equation, which can be viewed as a stochastic version of the 'Wilson-Cowan' rate equations (with the weight matrix inside the nonlinearity). On the other hand, in the case of slow synapses, one obtains a stochastic version of the 'Amari' rate equations. This leads to significant differences in the corrections to the mean-field equations. Finally, it should be noted that the path-integral formulation presented here can be applied to more general stochastic hybrid systems such as stochastic ion channels, molecular motors, and gene networks [28][29][30][31][32]. Thus one can view our path-integral construction as the hybrid analog of the Doi-Peliti path integral for master equations.