Topical Review

Path integral methods for the dynamics of stochastic and disordered systems

John A Hertz, Yasser Roudi and Peter Sollich

Published 12 December 2016 © 2016 IOP Publishing Ltd
Citation: John A Hertz et al 2017 J. Phys. A: Math. Theor. 50 033001. DOI: 10.1088/1751-8121/50/3/033001


Abstract

We review some of the techniques used to study the dynamics of disordered systems subject to both quenched and fast (thermal) noise. Starting from the Martin–Siggia–Rose/Janssen–De Dominicis–Peliti path integral formalism for the stochastic dynamics of a single variable, we provide a pedagogical survey of the perturbative, i.e. diagrammatic, approach to dynamics and how this formalism can be used for studying soft spin models. We review the supersymmetric formulation of the Langevin dynamics of these models and discuss the physical implications of the supersymmetry. We also describe the key steps involved in studying the disorder-averaged dynamics. Finally, we discuss the path integral approach for the case of hard Ising spins and review some recent developments in the dynamics of such kinetic Ising models.


1. Introduction

Studying the statistical properties of variables or classical fields subject to stochastic forces has a long history in physics. A key tool in this effort is the Langevin equation [1]. Originally introduced for describing the Brownian motion of a particle in a fluid, over the years, this has also been used for studying the dynamics of a variety of other systems including the standard models of critical phenomena, e.g. those described by the Ginzburg–Landau Hamiltonian [2] and, later on, the relaxation of spin glass models with soft spins [3, 4].

In order to study the dynamical version of the Ginzburg–Landau model in the presence of thermal noise one has to appeal to a perturbative analysis. A useful pedagogical description of the early methods used in such a perturbative treatment can be found in the seminal book by Ma [2] (chapter 5) among other places. A major advance was made by Martin, Siggia and Rose (MSR) in 1973 [5] who realized that one can study classical variables subject to stochastic noise by representing them as Heisenberg operators with appropriately defined commutation relations with conjugate fields. These observations allowed MSR to write down a generating functional that naturally lends itself to a perturbative treatment and the use of other field theoretic tools. The most important early development that followed was the work by De Dominicis, Peliti and Janssen [6–8]. They realized that the conjugate field, a fundamental ingredient of the MSR theory that was introduced there in some sense by hand, arises naturally if one takes a different point of view. Here one starts the analysis by writing the probability of a path taken by the stochastic variables or fields, which can be used to construct a generating functional expressed as a functional path integral. We refer the reader specifically to [9], which in addition to describing various extensions of the earlier work provides an instructive comparison between the MSR operator based formalism and the functional integral method of De Dominicis, Peliti and Janssen. This latter path integral formalism, which we describe in more detail in this review, has been used to study systems subject to both fast and quenched noise [3]. It paved the way for a more complete understanding of the ensemble averaged dynamics of these systems [4, 10] and, later on, the single sample [11, 12] dynamics of spin glasses and related models.

Despite the remarkable power of the path integral approach, a pedagogical introduction to the method and its key results is lacking. The aim of this paper is to provide a review to fill this gap. The material presented here has previously been used by the authors in a number of lectures and courses (see e.g. the online lectures in [13]); parts of it can also be found in [14].

Some of the material has also been reviewed and discussed elsewhere. The diagrammatic approach to the study of Langevin equations has been described in several classical books [2, 15, 16]. The supersymmetric method applied to dynamics is covered in [15], for example. In this review we aim to combine these classical results with more novel material, including results on the dynamics of hard spin models, in a coherent and consistent notation. By going into more depth in the calculations, we also hope to provide a useful resource, both for newcomers and more experienced researchers in the field. We also recommend that readers consult a recent review by Chow and Buice [17], which covers much of the same material that we do, studying some different model systems from a somewhat different perspective, and in particular demonstrating applications of these methods to the FitzHugh–Nagumo model of a neuron.

In what follows, we start by introducing the path integral formalism for the dynamics of a random variable that evolves according to a stochastic differential equation. We then illustrate how to perform a perturbation analysis and construct Feynman diagrams within this approach. We also show how this treatment can be cast into a supersymmetric form and discuss the physical interpretation of the supersymmetries that are revealed by this approach. Subsequently, we extend the treatment to systems with quenched interactions, especially spin glasses. Finally, we discuss how the path integral formulation can be used to study the dynamics of hard spin models with quenched interactions through a systematic expansion in the strength of the couplings. Before diving into the path integral formulation, however, we review in the next section the two main approaches (Ito and Stratonovich) for interpreting a stochastic differential equation.

2. Ito versus Stratonovich

Consider the linear Langevin equation

$\dot{\phi}(t)=-\mu\,\phi(t)+h(t)+\zeta(t)\qquad(1)$

where h(t) is a small field used to probe linear response and ζ is zero mean Gaussian noise with correlation function $\langle \zeta (t)\zeta (t^{\prime} )\rangle =2T\delta (t-t^{\prime} )$, where $\delta (\cdot )$ is the Dirac delta-function. Equation (1) can of course be integrated directly from an initial condition at time t = 0

$\phi(t)=\phi(0)\,\mathrm{e}^{-\mu t}+\int_0^t\mathrm{d}t'\;\mathrm{e}^{-\mu(t-t')}\left[h(t')+\zeta(t')\right]\qquad(2)$

and by averaging over the noise and, uncorrelated with it, over $\phi (0)$, we get the correlation function (for h = 0)

$C(t,t')\equiv\langle\phi(t)\,\phi(t')\rangle=\langle\phi^2(0)\rangle\,\mathrm{e}^{-\mu(t+t')}+\frac{T}{\mu}\left[\mathrm{e}^{-\mu|t-t'|}-\mathrm{e}^{-\mu(t+t')}\right]\qquad(3)$

and the response function

$R(t,t')\equiv\left.\frac{\delta\langle\phi(t)\rangle}{\delta h(t')}\right|_{h=0}=\Theta(t-t')\,\mathrm{e}^{-\mu(t-t')}\qquad(4)$

where ${\rm{\Theta }}(x)$ is the Heaviside step function, with ${\rm{\Theta }}(x)=1$ for $x\gt 0$ and ${\rm{\Theta }}(x)=0$ for $x\lt 0$.
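Both functions are easy to check numerically. The following is a minimal Python sketch (not part of the original calculation; all parameter values are illustrative assumptions) that integrates (1) with an Euler–Maruyama step in the Ito convention and estimates C and R, the latter via the noise-correlation identity R(t,t') = ⟨φ(t)ζ(t')⟩/(2T) derived in (5) below.

```python
# Minimal Euler-Maruyama sketch for the linear Langevin equation (1) with h = 0.
# Parameter values are illustrative; statistical accuracy improves with n_paths.
import numpy as np

rng = np.random.default_rng(0)
mu, T, dt = 1.0, 0.5, 1e-3
lag, n_paths = 1000, 400000                  # measure at time lag t - t' = lag*dt

phi = rng.normal(0.0, np.sqrt(T / mu), n_paths)            # stationary initial state
phi0 = phi.copy()
zeta0 = rng.normal(0.0, np.sqrt(2 * T * dt), n_paths)      # noise acting in first step
phi = phi - dt * mu * phi + zeta0
for _ in range(lag - 1):
    phi = phi - dt * mu * phi + rng.normal(0.0, np.sqrt(2 * T * dt), n_paths)

# correlation C(t,t') against the stationary result (3), and response via (5):
# discretely <phi(t) zeta_0> = 2*T*dt*R(t,t'), hence the extra factor of dt
print(np.mean(phi * phi0), (T / mu) * np.exp(-mu * lag * dt))
print(np.mean(phi * zeta0) / (2 * T * dt), np.exp(-mu * lag * dt))
```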

We will see below that in the diagrammatic perturbation theory for treating nonlinear Langevin equations, we will need the equal time response $R(t,t)$. This is a priori undefined, however, since the response function has a step discontinuity at $t^{\prime} =t$.

An alternative perspective on this issue is provided by writing the response as a correlation function with the noise. Using the fact that the average over the distribution of ζ is with probability weight $P[\zeta ]\sim \exp [-{(4T)}^{-1}\int {\rm{d}}{t}\,{\zeta }^{2}(t)]$, and integrating by parts, we can write

$R(t,t')=\frac{1}{2T}\,\langle\phi(t)\,\zeta(t')\rangle\qquad(5)$

For the equal-time response $R(t,t)$ we therefore require $\langle \phi (t)\zeta (t)\rangle $, an equal-time product of fluctuating quantities.

There are two conventions for assigning values to such quantities, due to Stratonovich and to Ito. We will briefly review these conventions and refer the reader to [18] (chapter 4) for a more comprehensive treatment.

2.1. Stratonovich convention

The idea of the Stratonovich convention is that, physically, ζ is a noise process with non-zero correlation time, so we should really write $\langle \zeta (t)\zeta (t^{\prime} )\rangle ={C}_{\zeta }(t-t^{\prime} )$ with Cζ an even function, decaying quickly to zero for $| t-t^{\prime} | $ greater than some small correlation time ${\tau }_{0}$, and whose integral is $\int {\rm{d}}{t}\,{C}_{\zeta }(t)=2T$. Then, setting the field h to zero as it is no longer needed at this point:

$\langle\phi(t)\,\zeta(t)\rangle=\int_0^t\mathrm{d}t'\;\mathrm{e}^{-\mu(t-t')}\,C_\zeta(t-t')\qquad(6)$

so

$\langle\phi(t)\,\zeta(t)\rangle=\int_0^t\mathrm{d}t'\;\mathrm{e}^{-\mu(t-t')}\,C_\zeta(t-t')=\int_0^t\mathrm{d}t'\;C_\zeta(t-t')=T\qquad(7)$

where the second equality follows from the fast decay of Cζ compared to the macroscopic timescales of ${ \mathcal O }(1/\mu )$ so that we can approximate ${{\rm{e}}}^{-\mu (t-t^{\prime} )}=1$. When considering $\langle \phi (t)\zeta (t^{\prime} )\rangle $ with $t\gt t^{\prime} $ (more precisely, $t-t^{\prime} \gg {\tau }_{0}$) on the other hand, all of the 'mass' of the correlation function is captured in the integration range; it therefore acts just like a δ-function and we get $\langle \phi (t)\zeta (t^{\prime} )\rangle =2T\exp [-\mu (t-t^{\prime} )]$ as expected from  (4) and (5).

To summarize, in the Stratonovich convention, the equal-time value R(t,t) = T/(2T) = 1/2 of the response function is half of that obtained in the limit $t^{\prime} \to t-0$. It is therefore also called the midpoint rule; see below.

2.2. Ito convention

The Ito convention effectively assumes that the noise $\zeta (t)$ acts 'after $\phi (t)$ has been updated', so it sets $\langle \phi (t)\zeta (t)\rangle ={\mathrm{lim}}_{t^{\prime} \to t+0}\langle \phi (t)\zeta (t^{\prime} )\rangle =0$, and hence also $R(t,t)=0$.

2.3. Discretization

We will later look at path integral representations of the dynamics and so need a discretization of the stochastic process $\phi (t)$. We will now see that Ito and Stratonovich can be viewed as corresponding to different discretization methods.

Let us discretize time $t=n{\rm{\Delta }}$, with Δ a small time step eventually to be taken to zero, and write ${\phi }_{n}=\phi (t=n{\rm{\Delta }})$ and ${h}_{n}=h(t=n{\rm{\Delta }})$. The noise variables over the interval Δ are ${\zeta }_{n}={\int }_{n{\rm{\Delta }}}^{(n+1){\rm{\Delta }}}{\rm{d}}t\,\zeta (t)$, with $\langle {\zeta }_{m}{\zeta }_{n}\rangle =2T{\rm{\Delta }}{\delta }_{{mn}}$, where ${\delta }_{{mn}}$ is the Kronecker symbol. Then a suitable discrete version of  (1) is

$\phi_{n+1}=\phi_n-\Delta\mu\left[\lambda\,\phi_{n+1}+(1-\lambda)\,\phi_n\right]+\Delta\left[\lambda\,h_{n+1}+(1-\lambda)\,h_n\right]+\zeta_n\qquad(8)$

for any $\lambda \in [0,1];$ here we are evaluating (the non-noise part of) the right-hand side of  (1) as a weighted combination of the values at the two ends of the interval Δ. It is easy to solve this linear recursion exactly: from

$\left(1+\Delta\mu\lambda\right)\phi_{n+1}=\left[1-\Delta\mu(1-\lambda)\right]\phi_n+\Delta\left[\lambda\,h_{n+1}+(1-\lambda)\,h_n\right]+\zeta_n\qquad(9)$

setting

$a=\frac{1-\Delta\mu(1-\lambda)}{1+\Delta\mu\lambda}\qquad(10)$

and assuming the initial condition ${\phi }_{0}=0$, we have

$\phi_n=\frac{1}{1+\Delta\mu\lambda}\sum_{m=0}^{n-1}a^{n-1-m}\left\{\Delta\left[\lambda\,h_{m+1}+(1-\lambda)\,h_m\right]+\zeta_m\right\}\qquad(11)$

$\langle\phi_n\rangle=\frac{1}{1+\Delta\mu\lambda}\sum_{m=0}^{n-1}a^{n-1-m}\,\Delta\left[\lambda\,h_{m+1}+(1-\lambda)\,h_m\right]\qquad(12)$

From this we can read off the response function ${R}_{{nm}}=\partial \langle {\phi }_{n}\rangle /\partial ({\rm{\Delta }}{h}_{m});$ setting $n=t/{\rm{\Delta }}$, $m=t^{\prime} /{\rm{\Delta }}$ and taking ${\rm{\Delta }}\to 0$ we then get for the continuous time response

$R(t,t')=\begin{cases}\mathrm{e}^{-\mu(t-t')}&t>t'\\ \lambda&t=t'\\ 0&t<t'\end{cases}\qquad(13)$

The value of λ only affects the equal-time response; we see that $\lambda =1/2$ gives the Stratonovich convention, while $\lambda =0$ gives Ito. Note that in the discretization  (8), $\lambda =1/2$ corresponds to evaluating the change in ϕ at the midpoint of the interval between n and $n+1$ (hence 'midpoint rule'). Ito ($\lambda =0$) on the other hand just evaluates at the earlier time. Note also that for different times, $t\ne t^{\prime} $, the two discretization schemes are equivalent as expected.

The two conventions for multiplying equal-time fluctuations are, for the systems we will look at, simply different ways of describing the same time evolution $\phi (t)$. Cases where the noise strength is coupled to ϕ, such as in $\dot{\phi }=f(\phi )+g(\phi )\zeta $, are more serious: a convention for the equal-time product $g(\phi )\zeta $ has to be adopted, and the two conventions here actually give different stochastic processes $\phi (t);$ the corresponding Fokker–Planck equations differ by a non-trivial drift term [18].

For the simple cases we consider, where the noise power T does not depend on t or ϕ, Ito versus Stratonovich is basically a matter of taste—calculated expectation values of physical observables are the same in the two approaches. Stratonovich is the more 'physical' because it corresponds to a noise process ζ with small but non-zero correlation time; it also obeys all the usual rules for transformation of variables etc. Ito is more obvious from the discretized point of view—it is very much what one might naively program in a simulation. We will also see below that it can lead to technical simplifications in calculations.
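The distinction is easy to see in practice for multiplicative noise. The sketch below (illustrative parameters, not from the text) integrates $\dot{\phi}=-\mu\phi+\phi\zeta$ once with the Ito-type Euler step and once with a Stratonovich-consistent Heun (midpoint) step; the two means separate by exactly the kind of drift term mentioned above, here $T\phi$.

```python
# Ito (Euler) vs Stratonovich (Heun/midpoint) for phi_dot = -mu*phi + phi*zeta,
# with <zeta(t)zeta(t')> = 2T delta(t-t'). The means obey d<phi>/dt = -mu<phi>
# (Ito) but d<phi>/dt = (T - mu)<phi> (Stratonovich). Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)
mu, T, dt, n_steps, n_paths = 1.0, 0.4, 1e-3, 2000, 100000

phi_ito = np.ones(n_paths)
phi_str = np.ones(n_paths)
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(2 * T * dt), n_paths)
    # Ito: noise coefficient g(phi) = phi evaluated at the start of the step
    phi_ito = phi_ito + dt * (-mu * phi_ito) + phi_ito * dW
    # Stratonovich: Heun predictor-corrector, i.e. midpoint evaluation of g(phi)
    pred = phi_str + dt * (-mu * phi_str) + phi_str * dW
    phi_str = phi_str + dt * (-mu * 0.5 * (phi_str + pred)) \
              + 0.5 * (phi_str + pred) * dW

t = n_steps * dt
print(phi_ito.mean(), np.exp(-mu * t))        # Ito mean
print(phi_str.mean(), np.exp((T - mu) * t))   # Stratonovich mean
```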

3. The Martin–Siggia–Rose/ Janssen–De Dominicis–Peliti path integral formulation

Now consider the nonlinear Langevin equation

$\dot{\phi}=f(\phi)+h(t)+\zeta(t)\qquad(14)$

and assume for simplicity that $\phi (0)=0$. It is straightforward to extend the formalism to systems with several components ${\phi }_{i};$ the inclusion of distributions of initial values is discussed briefly in section 6. We discretize as in  (8), abbreviating ${f}_{n}=f({\phi }_{n})$:

$\phi_{n+1}=\phi_n+\Delta\left[\lambda\left(f_{n+1}+h_{n+1}\right)+(1-\lambda)\left(f_n+h_n\right)\right]+\zeta_n\qquad(15)$

The plan from here is to write down a path integral for this process and evaluate the effects of nonlinearities in $f(\phi )$ perturbatively. Let Δ be fixed for now and let M be the largest value of the index n that we are interested in. Abbreviate $\phi =({\phi }_{1}\ldots {\phi }_{M})$ and let $\psi =({\psi }_{1}\ldots {\psi }_{M})$ be a vector of conjugate variables; then the relevant generating function is

$Z[\psi]=\left\langle\exp\left(\sum_{n=1}^{M}\psi_n\phi_n\right)\right\rangle=\int\mathrm{d}\phi\;P[\phi]\,\exp\left(\sum_{n=1}^{M}\psi_n\phi_n\right)\qquad(16)$

where $P[\phi ]$ is the probability of the entire history $\{{\phi }_{n}\}$. Averages can be computed by differentiation with respect to the ${\psi }_{n}$ at $\psi =0$.

Quenched disorder can also be treated: one averages the generating function over the quenched disorder before performing the perturbative expansion. In contrast to the case of equilibrium statistical mechanics, performing the average is unproblematic since, because $P[\phi ]$ is a normalized distribution, $Z[0]=1$, independent of the parameters of the model [3]. We return to this approach in section 9.

The M variables ${\phi }_{n}$ ($n=1\ldots M$) depend on the M noise variables ${\zeta }_{n}$ ($n=0\ldots M-1$). We know the distribution of the latter,

$P[\zeta]=\prod_{n=0}^{M-1}\left(4\pi T\Delta\right)^{-1/2}\exp\left(-\frac{\zeta_n^2}{4T\Delta}\right)\qquad(17)$

so we can find the distribution of the former:

$P[\phi]=P[\zeta(\phi)]\;J[\phi]\qquad(18)$

where $J[\phi ]=| \partial \zeta /\partial \phi | $ is the Jacobian. Using (15) to express the ζ in terms of the ϕ,

$\zeta_n=\phi_{n+1}-\phi_n-\Delta\left[\lambda\left(f_{n+1}+h_{n+1}\right)+(1-\lambda)\left(f_n+h_n\right)\right]\qquad(19)$

we thus have

$P[\phi]=J[\phi]\,\prod_{n=0}^{M-1}\left(4\pi T\Delta\right)^{-1/2}\exp\left(-\frac{\left[\zeta_n(\phi)\right]^2}{4T\Delta}\right)\qquad(20)$

where ${\rm{d}}\phi $ means ${\prod }_{n}{\rm{d}}{\phi }_{n}$. The square can be decoupled using conjugate integration variables $\hat{\phi }={\hat{\phi }}_{0}\ldots {\hat{\phi }}_{M-1}$:

$P[\phi]=J[\phi]\int\mathrm{d}\hat{\phi}\;\exp\left(-\sum_n\left[\mathrm{i}\hat{\phi}_n\,\zeta_n(\phi)+T\Delta\,\hat{\phi}_n^2\right]\right),\qquad\mathrm{d}\hat{\phi}=\prod_n\frac{\mathrm{d}\hat{\phi}_n}{2\pi}\qquad(21)$

Now it is time to evaluate $J[\phi ]$. Consider the matrix $\partial \zeta /\partial \phi $, remembering that the index for ζ runs from 0 to $M-1$, and the one for ϕ from 1 to M. The diagonal elements $\partial {\zeta }_{n}/\partial {\phi }_{n+1}$ are then, from  (19), $1-{\rm{\Delta }}\lambda {f}_{n+1}^{\prime }$ (with ${f}_{n}^{\prime }=f^{\prime} ({\phi }_{n})$); the elements just below the diagonal are $\partial {\zeta }_{n}/\partial {\phi }_{n}=-1-{\rm{\Delta }}(1-\lambda ){f}_{n}^{\prime }$. All other elements are zero; in particular the matrix has all zeros above the diagonal and so its determinant is the product of the diagonal elements, giving

$J[\phi]=\prod_{n=0}^{M-1}\left(1-\Delta\lambda\,f'_{n+1}\right)=\exp\left(-\lambda\Delta\sum_n f'_{n+1}\right)\qquad(22)$

where the last equality anticipates that Δ will be made small so that we can discard ${ \mathcal O }({{\rm{\Delta }}}^{2})$ terms in the exponent. Note that for the Ito convention ($\lambda =0$), $J\equiv 1$ identically, which is one of the reasons for preferring Ito. The alternative to the above direct evaluation of $J[\phi ]$ is to represent the determinant $| \partial \zeta /\partial \phi | $ as an integral over Grassmann variables; this will be discussed in section 7.

With $J[\phi ]$ evaluated, we have

$Z[\psi]=\int\mathrm{d}\phi\,\mathrm{d}\hat{\phi}\;\exp\left(-S[\phi,\hat{\phi}]+\sum_n\psi_n\phi_n\right)\qquad(23)$

Defining the average over a (normalized, complex valued) measure

$\langle\cdots\rangle_S=\int\mathrm{d}\phi\,\mathrm{d}\hat{\phi}\;\left(\cdots\right)\,\mathrm{e}^{-S[\phi,\hat{\phi}]}\qquad(24)$

with the 'action'

$S[\phi,\hat{\phi}]=\sum_n\left\{T\Delta\,\hat{\phi}_n^2+\mathrm{i}\hat{\phi}_n\left[\phi_{n+1}-\phi_n-\Delta\left(\lambda f_{n+1}+(1-\lambda)f_n\right)-\Delta h_{n+1}\right]+\lambda\Delta\,f'_{n+1}\right\}\qquad(25)$

one can also write

$Z[\psi]=\left\langle\exp\left(\sum_n\psi_n\phi_n\right)\right\rangle_S\qquad(26)$

From this representation one has, in particular (taking all derivatives at $\psi =h=0$, and adopting the convention ${\hat{\phi }}_{-1}=0$) the expressions for correlation and response functions

$C_{nm}\equiv\langle\phi_n\phi_m\rangle=\frac{\partial^2 Z}{\partial\psi_n\,\partial\psi_m}\qquad(27)$

$R_{nm}\equiv\frac{\partial\langle\phi_n\rangle}{\partial(\Delta h_m)}=\frac{\partial^2 Z}{\partial\psi_n\,\partial(\Delta h_m)}=\left\langle\phi_n\,\mathrm{i}\hat{\phi}_{m-1}\right\rangle\qquad(28)$

It also follows that averages of any product of $\hat{\phi }$'s vanish. This is because (as remarked above) Z = 1 for $\psi =0$, whatever value the hn take. As a result, derivatives of any order of Z with respect to h vanish when taken at $\psi =0$. We can combine the expressions for correlation and response if we define new variables as

$\eta_{1n}=\phi_n\qquad(29)$

$\eta_{2n}=\mathrm{i}\hat{\phi}_n\qquad(30)$

Arranging these into a big vector $\eta =({\eta }_{11},\ldots ,{\eta }_{1M},{\eta }_{20},\ldots ,{\eta }_{2,M-1})$ gives

$G\equiv\left\langle\eta\,\eta^{\rm{T}}\right\rangle=\left(\begin{array}{cc}C&R\\ R^{\rm{T}}&0\end{array}\right)\qquad(31)$

and the calculation of G (the 'propagator' in quantum field theory) is often the main goal of the analysis. In the case of non-zero initial conditions one also wants to be able to calculate the means $\langle \eta \rangle $, as discussed in section 5.5 below. The fields ψ and h have served their purpose and will be set to zero from now on.

4. Perturbation theory

To illustrate the perturbative expansion, consider a concrete example:

$f(\phi)=-\mu\,\phi-\frac{g}{3!}\,\phi^3\qquad(32)$

(the $3!$ factor is for later convenience). For g = 0, we recover a solvable linear Langevin equation; correspondingly, the action in  (25) becomes quadratic. For $g\ne 0$, we therefore separate this quadratic part and write

$S=S_0+S_{\mathrm{int}}\qquad(33)$

$S_0=\sum_n\left\{T\Delta\,\hat{\phi}_n^2+\mathrm{i}\hat{\phi}_n\left[\phi_{n+1}-\phi_n+\Delta\mu\left(\lambda\phi_{n+1}+(1-\lambda)\phi_n\right)\right]-\lambda\Delta\mu\right\}\qquad(34)$

$S_{\mathrm{int}}=\sum_n\left\{\mathrm{i}\hat{\phi}_n\,\frac{g\Delta}{3!}\left[\lambda\,\phi_{n+1}^3+(1-\lambda)\,\phi_n^3\right]-\frac{g\Delta}{2}\,\lambda\,\phi_{n+1}^2\right\}\qquad(35)$

Rearranging the first sum, the non-quadratic (or 'interacting' in a field theory) part of the action can be written in the simpler form

$S_{\mathrm{int}}=\frac{g\Delta}{3!}\sum_n\left[\lambda\,\eta_{2,n-1}+(1-\lambda)\,\eta_{2n}\right]\eta_{1n}^3-\frac{g\Delta}{2}\,\lambda\sum_n\eta_{1n}^2\qquad(36)$

We have dropped the last term, $\lambda {\hat{\phi }}_{M-1}{\phi }_{M}^{3}$ from the first sum; since (by causality) whatever we do with this last term does not affect any of the results for earlier times, we just have to make M larger than any times of interest for this omission to be irrelevant. Similarly, whether we have the sums start at n = 0 or n = 1 has a vanishing effect for ${\rm{\Delta }}\to 0$, so in the following we will leave off the summation ranges.

Now for g = 0 we have $S={S}_{0}$, and a corresponding normalized Gaussian measure over the η for which we denote averages by $\langle \cdots {\rangle }_{0}$ and the corresponding propagator by ${G}_{0}=\langle \eta {\eta }^{{\rm{T}}}{\rangle }_{0}$. For $g\ne 0$, our desired averages are then written as

$\langle X\rangle_S=\frac{\left\langle X\,\mathrm{e}^{-S_{\mathrm{int}}}\right\rangle_0}{\left\langle\mathrm{e}^{-S_{\mathrm{int}}}\right\rangle_0}=\left\langle X\,\mathrm{e}^{-S_{\mathrm{int}}}\right\rangle_0\qquad(37)$

Assuming g and hence ${S}_{\mathrm{int}}$ to be small, we thus arrive at the perturbative expansion

$\langle X\rangle_S=\sum_{k=0}^{\infty}\frac{(-1)^k}{k!}\left\langle X\,S_{\mathrm{int}}^{\,k}\right\rangle_0=\langle X\rangle_0-\langle X\,S_{\mathrm{int}}\rangle_0+\frac{1}{2!}\langle X\,S_{\mathrm{int}}^2\rangle_0-\cdots\qquad(38)$

which is a series expansion (in general asymptotic, non-convergent) in the nonlinearity parameter g. We can now evaluate each term in this series using an elementary fact about multivariate Gaussian statistics known to physicists as Wick's theorem: the average of any product of zero mean Gaussian random variables is found by summing over all possible pairings; symbolically (leaving out all indices etc)

$\langle\eta_1\eta_2\eta_3\eta_4\rangle=\langle\eta_1\eta_2\rangle\langle\eta_3\eta_4\rangle+\langle\eta_1\eta_3\rangle\langle\eta_2\eta_4\rangle+\langle\eta_1\eta_4\rangle\langle\eta_2\eta_3\rangle\qquad(39)$

Let us apply this to the simplest example. We know that $\langle 1{\rangle }_{S}=1$, so this should be true to all orders in the expansion (38). Up to ${ \mathcal O }(g)$ one has

$\langle1\rangle_S=1-\langle S_{\mathrm{int}}\rangle_0+{\mathcal{O}}(g^2)\qquad(40)$

$\langle S_{\mathrm{int}}\rangle_0=\frac{g\Delta}{3!}\sum_n\left\langle\left[\lambda\,\eta_{2,n-1}+(1-\lambda)\,\eta_{2n}\right]\eta_{1n}^3\right\rangle_0-\frac{g\Delta}{2}\,\lambda\sum_n C^0_{nn}\qquad(41)$

Using Wick's theorem, the fourth-order average is

$\left\langle\left[\lambda\,\eta_{2,n-1}+(1-\lambda)\,\eta_{2n}\right]\eta_{1n}^3\right\rangle_0=3\,R^0_{nn}\,C^0_{nn}\qquad(42)$

(Why the 3? There are 3 different pairings between the four variables—(1, 2) and (3, 4), (1, 3) and (2, 4), and (1, 4) and (2, 3)—but all give the same product of averages.) Here one sees how the equal time response function appears in the formalism. Inserting the value ${R}_{{nn}}^{0}=\lambda $ that we found earlier,  (41) thus becomes

$\langle1\rangle_S=1-\frac{g\Delta}{2}\,\lambda\sum_n C^0_{nn}+\frac{g\Delta}{2}\,\lambda\sum_n C^0_{nn}+{\mathcal{O}}(g^2)=1+{\mathcal{O}}(g^2)\qquad(43)$

as it should be. This illustrates how the non-trivial determinant $J[\phi ]$ that arises in the Stratonovich formalism ($\lambda =1/2$) and appears as the term proportional to λ in the interaction part of the action  (36) is essential for maintaining the correct normalization. Similar cancellations occur at all orders in g. Ito ($\lambda =0$) is simpler here: the terms in the perturbation expansion of $\langle 1{\rangle }_{S}=1+\cdots $ all vanish individually, rather than just cancelling each other out.

5. Diagrams

5.1. Basics

Diagrams are just a pictorial way of keeping track of the various terms in the perturbation expansion (38), as evaluated using Wick's theorem. Having illustrated the equivalence between Ito and Stratonovich above, we stick to Ito ($\lambda =0$) for now, where the non-trivial part of the action is simply

$S_{\mathrm{int}}=\frac{g\Delta}{3!}\sum_n\eta_{2n}\,\eta_{1n}^3\qquad(44)$

To illustrate the diagrammatic notation, consider again the expansion for the normalization factor

$\langle1\rangle_S=1-\langle S_{\mathrm{int}}\rangle_0+\frac{1}{2!}\langle S_{\mathrm{int}}^2\rangle_0-\cdots\qquad(45)$

We have dealt with the zeroth and first order terms above. The second order term is

$\frac{1}{2!}\langle S_{\mathrm{int}}^2\rangle_0=\frac{1}{2}\left(\frac{g\Delta}{3!}\right)^2\sum_{mn}\left\langle\eta_{2m}\eta_{1m}^3\;\eta_{2n}\eta_{1n}^3\right\rangle_0\qquad(46)$

Represent each of the summed over time indices m and n by a vertex with four 'legs' that symbolize the four η factors with the corresponding time index. Each vertex comes with a factor $-g/6$ from $-{S}_{\mathrm{int}}$. Both of the time indices are summed over and the result multiplied by Δ; in the limit ${\rm{\Delta }}\to 0$ these scaled sums of course become time integrals. The time indices are not fixed by our choice of observable to average, and are therefore called 'internal'—we will see external vertices in a moment. Now, having drawn the vertices, we can connect the legs in a number of ways, one for each pairing that Wick's theorem gives for the average in (46): a line between a vertex at m and one at n stands for a pair average $\langle {\eta }_{{\alpha }{m}}{\eta }_{{\beta }{n}}\rangle $ (${\alpha },{\beta }=1$ or 2). E.g. the diagram where the legs from both vertices are connected to legs from the same vertex only

Equation (47)

represents all the pairings where ${\eta }_{n}$'s are connected to ${\eta }_{n}$'s only, and ${\eta }_{m}$'s to ${\eta }_{m}$'s only. At each vertex there are three choices for pairings of this form, so this diagram has the value

$\frac{1}{2}\left[-\frac{g\Delta}{2}\sum_n R^0_{nn}\,C^0_{nn}\right]^2\qquad(48)$

This illustrates an important fact: if a diagram separates into subparts which are not connected, its value is just the product of the two diagrams separately (apart from the overall factor of 1/2 here, which comes from the expansion of $\exp (-{S}_{\mathrm{int}})$).

The diagram above certainly does not exhaust all the Wick pairings of  (46). So what are the other diagrams corresponding to  (46)? We can have either two mn-pairings, one nn and one mm; or four mn pairings. Altogether, one would therefore write  (46) in diagrams as:

Equation (49)

We write down the value of the last of these: start with ${\eta }_{2m}$. From the diagram, this must be connected to a η on the other vertex, i.e. either ${\eta }_{2n}$ or one of the three ${\eta }_{1n}$. In the first case, each of the remaining ${\eta }_{1m}$ legs is connected to a ${\eta }_{1n}$ leg; there are $3\times 2\times 1=6$ ways of making those connections (pairings). In the second case, the ${\eta }_{2}$ factor from the second vertex, ${\eta }_{2n}$, must be connected to one of the three ${\eta }_{1m}$ vertices, and the remaining two ${\eta }_{1m}$ and ${\eta }_{1n}$ legs can be connected to each other in two ways. Thus the value of the diagram is:

$\frac{1}{2}\left(\frac{g\Delta}{3!}\right)^2\sum_{mn}\left[6\,\langle\eta_{2m}\eta_{2n}\rangle_0\left(C^0_{mn}\right)^3+18\,R^0_{mn}\,R^0_{nm}\left(C^0_{mn}\right)^2\right]\qquad(50)$

(Exercise: evaluate the remaining second order diagram and, as a check, count all pairings. The diagram just above dealt with $6+18=24$ pairings, while the first one corresponded to $3\times 3=9$ pairings. Since we have 8 η's in total, there are $7\times 5\times 3\times 1\,=8!/(4!2{!}^{4})=105$ pairings overall, hence the remaining diagram must correspond to $105-24-9=72$ pairings.)
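Wick's theorem and the pairing counts used above are easy to verify mechanically. The sketch below (with an assumed example covariance and illustrative sample sizes) enumerates all pairings, checks a fourth moment against a Monte Carlo estimate, and reproduces the count of 105 pairings of eight variables from the exercise.

```python
# Sketch of Wick's theorem (39): enumerate pairings and check a Gaussian moment.
import itertools
import numpy as np

def pairings(idx):
    """Yield all ways of partitioning the index list into pairs."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for i in range(len(rest)):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, rest[i])] + tail

K = np.array([[1.0, 0.6, 0.2],
              [0.6, 1.5, 0.4],
              [0.2, 0.4, 0.8]])            # assumed covariance of (eta_0, eta_1, eta_2)

idx = [0, 0, 1, 2]                         # fourth moment <eta_0 eta_0 eta_1 eta_2>
wick = sum(np.prod([K[a, b] for a, b in p]) for p in pairings(idx))

rng = np.random.default_rng(3)
x = rng.multivariate_normal(np.zeros(3), K, size=2000000)
mc = np.mean(x[:, 0] * x[:, 0] * x[:, 1] * x[:, 2])
print(wick, mc)                                  # 3 pairings; values agree
print(sum(1 for _ in pairings(list(range(8)))))  # 7*5*3*1 = 105 pairings of 8 legs
```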

In terms of diagrams, it is now quite easy to understand that $\langle 1{\rangle }_{S}=1$ as it should to all orders in g. Consider an arbitrary diagram in the expansion; let us assume it is connected, otherwise consider the subdiagrams separately. Now within the Ito convention the response function $\langle {\eta }_{1n}{\eta }_{2m}{\rangle }_{0}$ is non-zero only for $n\gt m$ (for Stratonovich, the value for n = m is also non-zero; for $n\lt m$ it is zero either way from causality). So to make the diagram non-zero, we have to connect the ${\eta }_{2{n}_{1}}$ from a given vertex, with time index n1, say, to the ${\eta }_{1{n}_{2}}$ leg at another vertex, n2. The ${\eta }_{2{n}_{2}}$ from that vertex must in turn be connected to ${\eta }_{1{n}_{3}}$ on another vertex and so on. Eventually, because we have a finite number of vertices, we must come back to our original vertex n1. In the 'ring' sequence n1, n2, n3, ..., n1 there are at least two time indices that are in the 'wrong' order for the response function to be non-zero, so that the diagram contains at least one vanishing factor and thus vanishes itself.

The moral of the story so far is: diagrams factorize over disconnected sub-diagrams, and the sub-diagrams with only internal (summed-over) vertices vanish. The latter are also called 'vacuum diagrams' in field theory.

5.2. Diagrams for correlator/response; Dyson equation

Next we look at the diagrams for the propagator, which encapsulates correlation and response functions, $\langle {\eta }_{\alpha i}{\eta }_{\beta j}{\rangle }_{S}$ where $\alpha ,\beta \in \{1,2\}$. We now have two 'external' vertices with fixed time indices i and j; the propagator is represented as a double line between these two vertices. It is also called the 'full' or 'dressed' propagator, in contrast to the 'bare' propagator that is represented by the single lines in the diagrams and results from the pairings in Wick's theorem. To zeroth order in g, full and bare propagator are obviously equal. To first order, we have, from  (38),

$\langle\eta_{\alpha i}\eta_{\beta j}\rangle_S=\langle\eta_{\alpha i}\eta_{\beta j}\rangle_0-\langle\eta_{\alpha i}\eta_{\beta j}\,S_{\mathrm{int}}\rangle_0+{\mathcal{O}}(g^2)\qquad(51)$

$=G^0_{\alpha i,\beta j}-\frac{g\Delta}{3!}\sum_n\left\langle\eta_{\alpha i}\eta_{\beta j}\;\eta_{2n}\eta_{1n}^3\right\rangle_0+{\mathcal{O}}(g^2)\qquad(52)$

In diagrams, we have two new contributions from the first order term, depending on whether we pair up ${\eta }_{i}$ with ${\eta }_{j}$ or with one of the ${\eta }_{n}$. We can therefore write the full propagator as

Equation (53)

The second of these has a disconnected part with only internal vertices, so vanishes. The same argument applies to higher order diagrams as well: we only need to consider connected diagrams. Thus, evaluating the surviving diagram,

$\langle\eta_{\alpha i}\eta_{\beta j}\rangle_S=G^0_{\alpha i,\beta j}-\frac{g\Delta}{2}\sum_n\left[G^0_{\alpha i,1n}\,C^0_{nn}\,G^0_{2n,\beta j}+G^0_{\alpha i,2n}\,C^0_{nn}\,G^0_{1n,\beta j}\right]+{\mathcal{O}}(g^2)\qquad(54)$

If we define a matrix ${{\rm{\Sigma }}}^{1}$ by

$\Sigma^1_{1m,2n}=\Sigma^1_{2m,1n}=-\frac{g}{2}\,C^0_{nn}\,\Delta^{-1}\delta_{mn},\qquad\Sigma^1_{1m,1n}=\Sigma^1_{2m,2n}=0\qquad(55)$

(where ${\delta }_{{ij}}$ is the Kronecker symbol as before) and agree to absorb a factor of Δ into matrix products, so that

$(AB)_{\alpha i,\beta j}=\Delta\sum_{\gamma n}A_{\alpha i,\gamma n}\,B_{\gamma n,\beta j}\qquad(56)$

then we can write  (54) in matrix form simply as

$G=G^0+G^0\,\Sigma^1\,G^0\qquad(57)$

The way factors of Δ are absorbed here ensures that matrix multiplications become time integrals in the natural way for ${\rm{\Delta }}\to 0$, while the factor ${{\rm{\Delta }}}^{-1}{\delta }_{{mn}}$ in ${{\rm{\Sigma }}}^{1}$ becomes $\delta (t-t^{\prime} )$.

To first order in g, the above is the whole story for the propagator. Now look at the second order. Among the diagrams we have ones such as

Equation (58)

These two only differ in how the internal vertices are labeled; since the latter are summed over, we can lump the two diagrams together into one unlabeled diagram. This just gives a factor of 2, which cancels exactly the prefactor $1/2!$ from the second order expansion of the exponential in  (38). Again, the same happens at higher orders: at ${ \mathcal O }({g}^{k})$ we have a prefactor of $1/k!$ but also k internal vertices which can be labeled in $k!$ different ways, so the unlabeled diagram has a prefactor of one. Thus, the unlabeled diagram

Equation (59)

has the value (bearing in mind that the ${\eta }_{2}$ components at each of the internal vertices must be connected either to a ${\eta }_{1}$ leg at the other internal vertex, or to an external vertex, and that there are three choices at each vertex for which pair of ${\eta }_{1}$'s to connect to each other)

$\Delta^2\sum_{mn}\left[G^0_{\alpha i,1m}\left(-\frac{g}{2}C^0_{mm}\right)G^0_{2m,1n}\left(-\frac{g}{2}C^0_{nn}\right)G^0_{2n,\beta j}+G^0_{\alpha i,1m}\left(-\frac{g}{2}C^0_{mm}\right)G^0_{2m,2n}\left(-\frac{g}{2}C^0_{nn}\right)G^0_{1n,\beta j}\right]\qquad(60)$

Using the definition  (55), one sees that in matrix form this is simply

$G^0\,\Sigma^1\,G^0\,\Sigma^1\,G^0\qquad(61)$

To make this simple form more obvious we have included in  (60) the term in the second line, which vanishes because it contains a zero ${G}_{22}^{0}$ factor. We have an example here of a 'one-particle reducible' (1PR) diagram, which can be cut in two by cutting just one bare propagator line, namely, the one in the middle. The result illustrates that the value of such diagrams factorizes into the pieces they can be cut into; e.g. the diagram

Equation (62)

has the value ${G}_{0}{{\rm{\Sigma }}}^{1}{G}_{0}{{\rm{\Sigma }}}^{1}{G}_{0}{{\rm{\Sigma }}}^{1}{G}_{0}$. If we sum up all the diagrams of this form, we get

$G=G^0+G^0\Sigma^1G^0+G^0\Sigma^1G^0\Sigma^1G^0+\cdots\qquad(63)$

or

$G=G^0\left({\boldsymbol{I}}-\Sigma^1G^0\right)^{-1}=\left[(G^0)^{-1}-\Sigma^1\right]^{-1}\qquad(64)$

The inverses are relative to the appropriate unit element ${\boldsymbol{I}}$ for our redefined matrix multiplication, which has elements ${I}_{\alpha i,\beta j}={{\rm{\Delta }}}^{-1}{\delta }_{\alpha \beta }{\delta }_{{ij}}$. (So ${A}^{-1}A={\boldsymbol{I}}$ means, because of the extra factor Δ in the matrix multiplication, that ${A}^{-1}$ is ${{\rm{\Delta }}}^{-2}$ times the conventional matrix inverse of A.)
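At the level of plain matrix algebra this resummation is easily checked numerically. The sketch below uses small random matrices in place of the Δ-absorbed discrete-time objects (an illustration only, with the scale chosen so that the geometric series converges):

```python
# Check of (63)-(64): summing G0 (Sigma G0)^k reproduces (G0^{-1} - Sigma)^{-1}.
import numpy as np

rng = np.random.default_rng(4)
N = 8
G0 = 0.2 * rng.normal(size=(N, N))
Sigma = 0.2 * rng.normal(size=(N, N))      # small scale: series converges

series = np.zeros((N, N))
term = G0.copy()
for _ in range(200):
    series += term
    term = term @ Sigma @ G0               # append one more Sigma G0 factor

closed = np.linalg.inv(np.linalg.inv(G0) - Sigma)
print(np.max(np.abs(series - closed)))     # ~ 0: the Dyson form (68)
```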

The expression  (64) is an example of a resummation: we have managed to sum up an infinite subseries from among all the diagrams for the propagator. What if we want to sum up all the diagrams?

Again we can classify them into one-particle irreducible (1PI) diagrams, which cannot be cut in two by cutting a single line, and 1PR diagrams that factorize into their 1PI components. For example, to second order

Equation (65)

Equation (66)

More generally, if we denote the values of the different 1PI diagrams (defined analogously to ${{\rm{\Sigma }}}^{1}$, i.e. without the external G0 legs) by ${{\rm{\Sigma }}}^{1}$, ${{\rm{\Sigma }}}^{2}$, ... then we can get all possible (1PI and 1PR) diagrams by 'stringing' together all possible combinations of 1PI diagrams:

$G=G^0+G^0\Sigma G^0+G^0\Sigma G^0\Sigma G^0+\cdots\qquad(67)$

where ${\rm{\Sigma }}={\sum }_{i=1}^{\infty }{{\rm{\Sigma }}}^{i}$. Hence we see that the full propagator, with all diagrams summed up, can be written in the general form

$G=\left[(G^0)^{-1}-\Sigma\right]^{-1}\qquad(68)$

where Σ, the so-called 'self-energy', is the sum of all 1PI diagrams. Equation  (68) is called the Dyson equation. An alternative form that is often useful is

$G^{-1}=(G^0)^{-1}-\Sigma\qquad(69)$

Let us write this out in terms of the separate blocks corresponding to correlation and response functions: we have

$G=\left(\begin{array}{cc}C&R\\ R^{\rm{T}}&0\end{array}\right)\qquad(70)$

The inverse of any matrix of this block form has a vanishing 11 block, so the same is true of $G^{-1}$ and $(G^0)^{-1}$; from (69) the 11 block of $\Sigma$ must then also vanish,

$\Sigma=\left(\begin{array}{cc}0&\Sigma_{12}\\ \Sigma_{12}^{\rm{T}}&\Sigma_{22}\end{array}\right)\qquad(71)$

This can also be shown diagrammatically: a non-zero contribution to ${{\rm{\Sigma }}}_{11}$ would correspond to diagrams where the internal vertices that make connections to the two external vertices do so via ${\eta }_{1}$ legs. This leaves all the ${\eta }_{2}$ legs to be connected amongst the internal vertices, and then the same argument as for vacuum diagrams can be applied. Writing out  (69) we have thus

$R\left(R_0^{-1}-\Sigma_{12}^{\rm{T}}\right)={\boldsymbol{I}}\qquad(72)$

$C\left[(R_0^{\rm{T}})^{-1}-\Sigma_{12}\right]=R\left[R_0^{-1}C_0(R_0^{\rm{T}})^{-1}+\Sigma_{22}\right]\qquad(73)$

$0=0\qquad(74)$

$R^{\rm{T}}\left[(R_0^{\rm{T}})^{-1}-\Sigma_{12}\right]={\boldsymbol{I}}\qquad(75)$

where we have abused the notation by writing ${\boldsymbol{I}}$ also for the non-zero M × M sub-blocks of the original $2M\times 2M$ matrix ${\boldsymbol{I}}$. The first and last of these equations are both equivalent to

$R^{-1}=R_0^{-1}-\Sigma_{12}^{\rm{T}}\qquad(76)$

implying that ${{\rm{\Sigma }}}_{12}^{{\rm{T}}}$ acts like a self-energy for the response function. Rearranging, the components of the Dyson equation reduce to

$R=\left(R_0^{-1}-\Sigma_{12}^{\rm{T}}\right)^{-1}\qquad(77)$

$R^{-1}\,C\,(R^{\rm{T}})^{-1}=R_0^{-1}\,C_0\,(R_0^{\rm{T}})^{-1}+\Sigma_{22}\qquad(78)$

Using  (76), the last equation can also be solved explicitly for C as

$C=R\left[R_0^{-1}C_0(R_0^{\rm{T}})^{-1}+\Sigma_{22}\right]R^{\rm{T}}\qquad(79)$

In the above we have defined Σ to be plus the sum of all 1PI diagrams; the opposite sign convention also appears in the literature.

5.3. Self-consistency

Of course in general one cannot sum up all the diagrams for the self-energy. One must make some approximation. In  (64) we used just the lowest order diagram to approximate Σ. Can we easily improve the approximation? Yes, if we replace G0 in the expression for Σ by the full propagator G; diagrammatically,

Equation (80)

Which diagrams does this correspond to? This is easiest to find out order by order in g. To first order, Σ and G are

Equation (81)

Re-inserting G into Σ we get its form to second order and therefore also G

Equation (82)

Equation (83)

and then we can iterate to get

Equation (84)

So the simple operation of replacing G0 by G effectively sums up an infinite series of 'tadpole' diagrams in the expansion for the self-energy. This is called 'mean field theory'. It gives exact results for many models with weak long-ranged interactions; for such models the diagrams that have not been included are negligible in the thermodynamic limit. The approximation is also called 'self-consistent one-loop' or 'Hartree–Fock' approximation.

5.4. Diagrammatic conventions

We have used a particular way of drawing diagrams above that does not distinguish between the physical field $\phi ={\eta }_{1}$ and the conjugate or 'response' field $\hat{\phi }={\eta }_{2}$. This approach has the virtue of making all diagrams look essentially like in a static (equilibrium) ${\phi }^{4}$ field theory. It does however mean that each diagram can group a number of terms, involving response or correlation functions depending on where on each vertex the $\hat{\phi }$ variables are located in any given Wick pairing. If one wants to avoid this one can e.g. mark on each vertex the $\hat{\phi }$-leg by an outgoing arrow. Similarly in the response part of the propagator—the ${\eta }_{1}{\eta }_{2}$ sector in our previous convention—one would mark the $\hat{\phi }$ end by an arrow. The arrows then represent the flow of time because in all contractions of the type $\langle \hat{\phi }\phi \rangle $ the $\hat{\phi }$ must be at a time before the ϕ.

The rules for constructing diagrams are modified only slightly in this more detailed diagrammatic representation: an arrow from a vertex cannot connect back to the same vertex as this would give an equal-time response function, which vanishes in the Ito convention. At each vertex one has to consider the possible choices for how the leg with an arrow can be connected to other vertices or external nodes. Diagrams are non-vanishing only if the arrows give a consistent flow of time, i.e. arrows cannot form closed loops nor can two legs with outgoing arrows be connected.

Within the above convention, a propagator without an arrow on it is always a correlation function. In our example, the diagrammatic expansion up to second order of the correlation function then reads

Equation (85)

while for the response function there are fewer diagrams that contribute:

Equation (86)

Note that the expansions (85) and (86) are written as the direct analogs of (65) and differ from the latter only through the addition of the appropriate arrows. More compact expressions can be obtained in terms of the self-energy. For the correlation function specifically, the identity (79) may be more efficient for use in practical calculations. This is illustrated in the example in section 6 below, see (106) there.

5.5. Non-zero fields and initial values

So far we have discussed dynamics without applied fields hn and with zero initial value ${\phi }_{0}$. These restrictions can be lifted, as we now explain. We continue to use the Ito convention ($\lambda =0$). It suffices to discuss non-zero fields, because a non-zero initial value ${\phi }_{0}=c$ can be produced by setting ${h}_{0}=c/{\rm{\Delta }}$. This gives ${\phi }_{1}=c+{ \mathcal O }({{\rm{\Delta }}}^{1/2})$, which in the continuous time limit ${\rm{\Delta }}\to 0$ approaches c. In the same limit the difference between ${\phi }_{0}$ and ${\phi }_{1}$ is immaterial, so the above choice of h0 effectively fixes a non-zero initial condition.
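This equivalence is immediate to check in a simulation. The sketch below (illustrative parameters) propagates the linear dynamics once from $\phi(0)=c$ and once from ${\phi}_0=0$ with the impulse field ${h}_0=c/\Delta$, using the same noise realizations for both:

```python
# Impulse field h_0 = c/dt in the first Ito step vs explicit initial condition.
import numpy as np

rng = np.random.default_rng(5)
mu, T, dt, c, n_steps, n_paths = 1.0, 0.5, 1e-3, 2.0, 1000, 50000

phi_a = np.full(n_paths, c)                # explicit initial condition phi(0) = c
phi_b = np.zeros(n_paths)                  # phi_0 = 0 plus impulse h_0 = c/dt
for n in range(n_steps):
    zeta = rng.normal(0.0, np.sqrt(2 * T * dt), n_paths)
    phi_a = phi_a - dt * mu * phi_a + zeta
    h = c / dt if n == 0 else 0.0
    phi_b = phi_b - dt * mu * phi_b + dt * h + zeta

# both means approach c*exp(-mu*t); the difference is O(Delta)
print(phi_a.mean(), phi_b.mean(), c * np.exp(-mu * n_steps * dt))
```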

For the perturbative approach to work, we need a quadratic action S0 as our baseline while all non-quadratic terms are contained in the interacting part ${S}_{\mathrm{int}}$. If the field hn is non-zero, this generates linear terms in the action. We then have two choices: either we include these linear terms in ${S}_{\mathrm{int}}$ and treat them perturbatively, or we transform variables by expanding the full action around a stationary or saddle point.

The first approach is relatively straightforward: one now has an extra vertex, with only one $\hat{\phi }$-leg, and the perturbative expansion is jointly in g and the field amplitude. For the propagator all contributing diagrams must have a total number of legs on the internal vertices that is even so the number of field vertices must also be even, hence the expansion is effectively in g and h2. At ${ \mathcal O }({h}^{2})$ one has the extra diagram

Equation (87)

where each dot with a single connection is an h-vertex, while at ${ \mathcal O }({h}^{2}g)$ one gets

Equation (88)

We could represent the fact that all field vertices have a single $\hat{\phi }$-leg by an arrow pointing away from the vertex (see section 5.4). For the response function part of the propagator there would be an arrow pointing away from one of the external vertices as well so the ${ \mathcal O }({h}^{2})$ diagram in (87) vanishes as it contains an edge with opposing arrows. To a correlator $C(t,t^{\prime} )$ the diagram contributes in the continuous time limit

$\int_0^t\mathrm{d}t_1\int_0^{t'}\mathrm{d}t_2\;R_0(t,t_1)\,h(t_1)\;R_0(t',t_2)\,h(t_2)\qquad(89)$

The higher order diagrams can be evaluated similarly.

In the presence of a field even the mean $\langle {\phi }_{n}\rangle $ is in general non-zero. The perturbative expansion for this consists of diagrams with one external vertex and therefore requires an odd number of internal field vertices. The contributions to ${ \mathcal O }(h)$ and ${ \mathcal O }({hg})$ are

Equation (90)

One sees that most of the new diagrams (87) and (88) in the propagator are products of diagrams like these. In fact one can convince oneself that if one considers the connected propagator, defined as $\langle {\eta }_{\alpha i}{\eta }_{\beta j}\rangle -\langle {\eta }_{\alpha i}\rangle \langle {\eta }_{\beta j}\rangle $, then as in the case without a field only the connected diagrams remain [19]; the lowest order one of these is the third diagram in (88).

Next we look at the saddle point approach, where one defines new variables relative to a stationary point of the action, thus eliminating any linear terms in the transformed action. Looking at (23) with ${\psi }_{n}=0$, the saddle point conditions with respect to ${\phi }_{n}$ and ${\hat{\phi }}_{n}$ give

$\hat{\phi}_{n-1}=\hat{\phi}_n\left[1+\Delta f'(\phi_n)\right],\qquad 2T\Delta\,\hat{\phi}_n+\mathrm{i}\left[\phi_{n+1}-\phi_n-\Delta\left(f_n+h_n\right)\right]=0$

The 'initial' (at $n=M-1$) condition for the first of these equations, which results from stationarity with respect to ${\phi }_{M}$, is ${\hat{\phi }}_{M-1}=0$, and solving backwards in time then shows that ${\hat{\phi }}_{n}=0$ for all n at the saddle point. With this the second equation is just

$\phi^*_{n+1}=\phi^*_n+\Delta\left[f(\phi^*_n)+h_n\right]\qquad(91)$

which is the discrete time version of the deterministic (noise-free) time evolution $\dot{\phi }=f(\phi )+h$. If we call the solution of this ${\phi }^{* }$, then we need to use as new variables $\delta \phi =\phi -{\phi }^{* }$. In terms of these, by making use of the saddle point conditions, the (Ito) action reads

$S=\sum_n\left\{T\Delta\,\hat{\phi}_n^2+\mathrm{i}\hat{\phi}_n\left[\delta\phi_{n+1}-\delta\phi_n-\Delta\,\delta f_n\right]\right\}\qquad(92)$

The difference $\delta {f}_{n}={f}_{n}-{f}_{n}^{* }\,=\,f({\phi }_{n}^{* }+\delta {\phi }_{n})-f({\phi }_{n}^{* })$ needs to be expanded in $\delta {\phi }_{n}$ to read off the resulting vertices in the interacting part of the action. For our example (32),

$\delta f_n=-\left(\mu+\frac{g}{2}\,\phi_n^{*2}\right)\delta\phi_n-\frac{g}{2}\,\phi^*_n\,\delta\phi_n^2-\frac{g}{3!}\,\delta\phi_n^3\qquad(93)$

If we include the $-\mu \delta \phi $ term in the unperturbed action S0 then the latter has the same form as originally except for the change of variable from ϕ to $\delta \phi $, and the same diagrammatic expansion technique can be applied. The difference is that the interaction part ${S}_{\mathrm{int}}$ now contains three different kinds of vertices resulting from the terms proportional to g in (93), with two, three and four legs respectively and different time-dependent prefactors from the time dependence of ${\phi }^{* }$.

One point to bear in mind is that even though we have defined $\delta \phi $ relative to the saddle point solution, its average is not in general zero. To ${ \mathcal O }(g)$, for example, one has for $\langle \delta \phi (t)\rangle $ the diagram

Equation (94)

coming from the $-(g/2){\phi }^{* }\delta {\phi }^{2}$ term in  (93), which evaluates to

$\langle\delta\phi(t)\rangle=-\frac{g}{2}\int\mathrm{d}t'\;R_0(t,t')\,\phi^*(t')\,C_0(t',t')+{\mathcal{O}}(g^2)\qquad(95)$
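A quick numerical illustration of this point (with illustrative parameters and a constant field h, all assumptions of this sketch rather than choices made in the text): comparing the saddle-point trajectory $\phi^*$ with the simulated mean $\langle\phi(t)\rangle$ exhibits the negative ${\mathcal{O}}(g)$ fluctuation correction whose sign follows from (95).

```python
# Saddle-point (noise-free) trajectory vs the true mean of the nonlinear dynamics.
import numpy as np

rng = np.random.default_rng(6)
mu, g, T, h = 1.0, 1.0, 0.5, 1.0
dt, n_steps, n_paths = 1e-3, 3000, 50000

def f(p):                                   # drift of (32)
    return -mu * p - (g / 6.0) * p**3

phi = np.zeros(n_paths)                     # stochastic paths, phi(0) = 0
phi_star = 0.0                              # deterministic solution of (91)
for _ in range(n_steps):
    phi = phi + dt * (f(phi) + h) + rng.normal(0.0, np.sqrt(2 * T * dt), n_paths)
    phi_star = phi_star + dt * (f(phi_star) + h)

# <delta phi> = <phi> - phi* is negative here, consistent with the sign of (95)
print(phi.mean(), phi_star, phi.mean() - phi_star)
```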

6. A one-particle example

Let us apply the diagrammatic formalism developed above to our one-particle model  (32). We want to take a distribution over the initial values $\phi (0)$ into account here. But this only takes a small extension of the formalism: if the initial distribution is a zero mean Gaussian, we can simply include the average over $\phi (0)$ in the unperturbed average $\langle \ldots {\rangle }_{0};$ the measure is still Gaussian, so we can apply all of the above formalism, except that the values of C0 and R0, i.e. the components of the bare propagator, are affected by the presence of uncertainty in the initial condition. If the distribution has non-Gaussian parts, we include the Gaussian part as above and the remainder is put into the non-trivial part of the action ${S}_{\mathrm{int}}$, giving a new kind of vertex in the diagrams. A non-zero mean in a Gaussian initial distribution would also be put into ${S}_{\mathrm{int}}$.

Now let us write down the Dyson equation for our model. Having derived all relations previously so that they have the obvious limits for ${\rm{\Delta }}\to 0$, we work directly with continuous times. The bare response function is, from  (4),

$R_0(t,t')=\Theta(t-t')\,\mathrm{e}^{-\mu(t-t')}\qquad(96)$

The inverse of R0 is the operator ${\partial }_{t}+\mu $, since when applied to R0 it gives $\delta (t-t^{\prime} )$. The other quantity we need to write down the Dyson equation  (78) is ${R}_{0}^{-1}{C}_{0}{({R}_{0}^{-1})}^{{\rm{T}}}$. To find this, it is easiest to start from the fact that

$\phi(t)=\int_0^t\mathrm{d}t_1\;R_0(t,t_1)\left[\zeta(t_1)+\phi(0)\,\delta(t_1)\right]\qquad(97)$

The lower boundary of the integral here is meant as $0-\epsilon ;$ the same applies to all integrals that follow. Averaging the product $\phi (t)\phi (t^{\prime} )$ gives

$C_0(t,t')=\int_0^t\mathrm{d}t_1\int_0^{t'}\mathrm{d}t_2\;R_0(t,t_1)\left[2T\,\delta(t_1-t_2)+\langle\phi^2(0)\rangle\,\delta(t_1)\,\delta(t_2)\right]R_0(t',t_2)\qquad(98)$

so ${C}_{0}={R}_{0}{{MR}}_{0}^{{\rm{T}}}$ with $M({t}_{1},{t}_{2})$ given by the square brackets; thus

$\left[R_0^{-1}C_0(R_0^{\rm{T}})^{-1}\right](t_1,t_2)=M(t_1,t_2)=2T\,\delta(t_1-t_2)+\langle\phi^2(0)\rangle\,\delta(t_1)\,\delta(t_2)\qquad(99)$

Now we can write down the Dyson equation  (78):

$(\partial_t+\mu)\,R(t,t')-\int\mathrm{d}s\;\Sigma_{12}(s,t)\,R(s,t')=\delta(t-t')\qquad(100)$

$(\partial_t+\mu)\,C(t,t')-\int\mathrm{d}s\;\Sigma_{12}(s,t)\,C(s,t')=2T\,R(t',t)+\langle\phi^2(0)\rangle\,\delta(t)\,R(t',0)+\int\mathrm{d}s\;\Sigma_{22}(t,s)\,R(t',s)\qquad(101)$

To first order in g the self-energy is, from  (55),

$\Sigma_{12}(t,t')=-\frac{g}{2}\,C_0(t,t)\,\delta(t-t'),\qquad\Sigma_{22}(t,t')=0\qquad(102)$

Within this approximation, the Dyson equation becomes

$\left[\partial_t+\mu+\frac{g}{2}\,C_0(t,t)\right]R(t,t')=\delta(t-t')\qquad(103)$

$\left[\partial_t+\mu+\frac{g}{2}\,C_0(t,t)\right]C(t,t')=2T\,R(t',t)+\langle\phi^2(0)\rangle\,\delta(t)\,R(t',0)\qquad(104)$

From the first equation,

$R(t,t')=\Theta(t-t')\,\exp\left[-\mu(t-t')-\frac{g}{2}\int_{t'}^{t}\mathrm{d}s\;C_0(s,s)\right]\qquad(105)$

and with this the second equation for C can also be solved (compare  (79)) to give

$C(t,t')=2T\int\mathrm{d}s\;R(t,s)\,R(t',s)+\langle\phi^2(0)\rangle\,R(t,0)\,R(t',0)\qquad(106)$

For the simplest case where C0 is time-translation invariant, corresponding to $\langle {\phi }^{2}(0)\rangle ={C}_{0}(t,t)=T/\mu $, we see that the effect of g in the response function R is just to replace $\mu \to \mu +{gT}/(2\mu )$. At long times the effect on C is similar though at short times C will not be time-translation invariant.

If we make our first order (one-loop) approximation self-consistent, the only change is to replace C0 by C in  (102) and correspondingly in the Dyson equation, so that

$\Sigma_{12}(t,t')=-\frac{g}{2}\,C(t,t)\,\delta(t-t')\qquad(107)$

This has a simple interpretation: it corresponds to replacing our original nonlinear force term  (32) by

$f(\phi)=-\left[\mu+\frac{g}{2}\,\langle\phi^2(t)\rangle\right]\phi\qquad(108)$

with $\langle {\phi }^{2}(t)\rangle =C(t,t)$ to be determined self-consistently. (The non-self-consistent version instead sets $\langle {\phi }^{2}(t)\rangle ={C}_{0}(t,t)$.) Assuming that we can find a solution with $C(t,t)=c\,=$ constant, we get for C by inserting  (107) into  (106) and setting $\tilde{\mu }=\mu +{cg}/2$

$C(t,t')=\frac{T}{\tilde{\mu}}\left[\mathrm{e}^{-\tilde{\mu}|t-t'|}-\mathrm{e}^{-\tilde{\mu}(t+t')}\right]+\langle\phi^2(0)\rangle\,\mathrm{e}^{-\tilde{\mu}(t+t')}\qquad(109)$

For $\langle {\phi }^{2}(0)\rangle =T/\tilde{\mu }$ we then indeed get a time-translationally invariant $C(t,t^{\prime} )=(T/\tilde{\mu })\,\exp \,(-\tilde{\mu }| t-t^{\prime} | )$. This has $C(t,t)=T/\tilde{\mu }$ and so the self-consistent equation determining c and $\tilde{\mu }$ is

$c=\frac{T}{\tilde{\mu}}=\frac{T}{\mu+gc/2}\qquad\Leftrightarrow\qquad\tilde{\mu}=\mu+\frac{gT}{2\tilde{\mu}}\qquad(110)$

This is the self-consistent analog of our previous result $\tilde{\mu }=\mu +({gT})/(2\mu )$, to which it reduces when expanded to first order in g.

We see that for our particular model the self-consistent approximation gives a more sensible result than the 'vanilla' first order approximation: it allows a time-translation invariant solution for both C and R

$C(t,t')=\frac{T}{\tilde{\mu}}\,\mathrm{e}^{-\tilde{\mu}|t-t'|}\qquad(111)$

$R(t,t')=\Theta(t-t')\,\mathrm{e}^{-\tilde{\mu}(t-t')}\qquad(112)$

which also obeys the fluctuation–dissipation theorem (FDT), $R(t,t^{\prime} )=(1/T)(\partial /\partial t^{\prime} )C(t,t^{\prime} )$ for $t\gt t^{\prime} $.
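The self-consistent solution is straightforward to compare against a direct simulation of the nonlinear Langevin equation. The sketch below (illustrative parameters) solves (110) in closed form, via the positive root of the quadratic $\tilde{\mu}^2-\mu\tilde{\mu}-gT/2=0$, and checks the predicted equal-time correlator and decay rate; the agreement is approximate, as expected of a one-loop result.

```python
# Self-consistent one-loop prediction (110)-(112) vs direct simulation of (32).
import numpy as np

mu, g, T = 1.0, 1.0, 0.5
mu_t = 0.5 * (mu + np.sqrt(mu**2 + 2 * g * T))    # positive root of (110)
print("one-loop: mu_tilde =", mu_t, " predicted C(t,t) =", T / mu_t)

rng = np.random.default_rng(7)
dt, n_steps, n_paths, lag = 1e-3, 5000, 100000, 500
phi = rng.normal(0.0, np.sqrt(T / mu_t), n_paths)
for n in range(n_steps):
    phi = phi + dt * (-mu * phi - (g / 6.0) * phi**3) \
          + rng.normal(0.0, np.sqrt(2 * T * dt), n_paths)
    if n == n_steps - 1 - lag:
        phi_early = phi.copy()

print("simulated C(t,t) =", np.mean(phi**2))
print("decay ratio      :", np.mean(phi * phi_early) / np.mean(phi**2),
      "vs exp(-mu_tilde*lag*dt) =", np.exp(-mu_t * lag * dt))
```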

6.1. Mode-coupling theory

What if we want to improve the approximation to the self-energy further? The systematic approach is to include the lowest-order diagram not so far taken into account. We have the only first-order diagram already; the second-order 'tadpole' diagrams are also taken into account through self-consistency. The only missing second order diagram is therefore the 'watermelon' diagram

Equation (113)

To work out what contribution to Σ this gives, let us revert temporarily to discrete time notation and label the left and right vertex m and n, respectively. The elements ${{\rm{\Sigma }}}_{1m,2n}$ of ${{\rm{\Sigma }}}_{12}$ correspond to those pairings where a ${\eta }_{1m}$ leg from vertex m is attached to an external vertex that would be on the left, and the ${\eta }_{2n}$ leg from vertex n attached to an external vertex on the right. Internally (among the remaining legs) we thus have two ${\eta }_{1m}{\eta }_{1n}$ pairings and one ${\eta }_{2m}{\eta }_{1n}$ pairing. To work out the prefactor of the diagram, note that there are three choices for the externally attached ${\eta }_{1m};$ there are three choices for which of the ${\eta }_{1n}$ to pair up with ${\eta }_{2m};$ and two more choices for how to make the two remaining ${\eta }_{1m}{\eta }_{1n}$ pairings. Thus, the diagram gives for ${{\rm{\Sigma }}}_{12}$

$\Sigma^2_{1m,2n}=3\cdot 3\cdot 2\left(\frac{g}{3!}\right)^2\left(C^0_{mn}\right)^2R^0_{nm}=\frac{g^2}{2}\left(C^0_{mn}\right)^2R^0_{nm}\qquad(114)$

For ${{\rm{\Sigma }}}_{22}$, we have both the ${\eta }_{2m}$ and ${\eta }_{2n}$ legs attached externally, and 6 choices for how to connect the three ${\eta }_{1m}$ and ${\eta }_{1n}$ legs internally, giving

$\Sigma^2_{2m,2n}=6\left(\frac{g}{3!}\right)^2\left(C^0_{mn}\right)^3=\frac{g^2}{6}\left(C^0_{mn}\right)^3\qquad(115)$

We can again sum an infinite series of additional diagrams by replacing bare (${C}_{0},{R}_{0}$) by full quantities ($C,R$) here. Reverting to continuous time notation and including the first-order contribution in ${{\rm{\Sigma }}}_{12}$, we thus get for the self-energy in the self-consistent two-loop approximation

$\Sigma_{12}(t,t')=-\frac{g}{2}\,C(t,t)\,\delta(t-t')+\frac{g^2}{2}\,C^2(t,t')\,R(t',t)\qquad(116)$

$\Sigma_{22}(t,t')=\frac{g^2}{6}\,C^3(t,t')\qquad(117)$

A final comment: up to the order which we have considered, the self-energy components ${{\rm{\Sigma }}}_{12}(t,t^{\prime} )$ and ${{\rm{\Sigma }}}_{22}(t,t^{\prime} )$ are simple functions of the correlation and response functions. This is not normally true once higher order diagrams are taken into account. For example, if we went to third order in g we would have to include the diagram

Equation (118)

in the self energy; all other third order diagrams are automatically included by self-consistency. This diagram now has one internal vertex whose time index is not fixed by the two time indices that the self-energy carries. It therefore gives a contribution to ${{\rm{\Sigma }}}_{..}(t,t^{\prime} )$ which has an integral over this 'internal' time; e.g. for ${{\rm{\Sigma }}}_{22}$ we get a contribution

Equation (119)

Diagrams of even higher order contain additional time integrals; in general, then, the self-energy is a functional of the response and correlation functions.

An approximation, like that in (116) and (117), in which we keep only diagrams for ${\rm{\Sigma }}(t,t^{\prime} )$ that are functions of $G(t,t^{\prime} )$ and $C(t,t^{\prime} )$ has come to be called a mode coupling approximation. These have been studied extensively, often because they are exact for some mean-field (infinite-range) models [20].

7. Ghosts and supersymmetry

7.1. Using Grassmann variables

In section 4, we discussed how the results of the perturbation theory are independent of the value of λ chosen in the discretization of the Langevin equation: the choice of λ changes the Jacobian but does not affect correlation functions and other observables. As alluded to already in section 3, there is another way of including the Jacobian in the path integral using Grassmann variables, which is both conceptually interesting and can often simplify notation. This observation was first made, and its consequences on dynamics explored, by Feigel'man and Tsvelik [21, 22], who followed earlier work by Parisi and Sourlas [23] on supersymmetric properties of an equilibrium system in a random external field. The current section is devoted to describing this approach.

Before getting into the details of the supersymmetric formalism, we need to familiarize ourselves with Grassmann variables. Grassmann variables are characterized by the fact that they anticommute with each other. Multiplication of them is also associative; i.e. for Grassmann variables ${\xi }_{i}$, we have

$\xi_i\,\xi_j=-\xi_j\,\xi_i\qquad(120)$

$\left(\xi_i\,\xi_j\right)\xi_k=\xi_i\left(\xi_j\,\xi_k\right)\qquad(121)$

A consequence of anti-commutation is that

$\xi_i^2=0\qquad(122)$

Grassmann variables can be added to each other and also multiplied by complex numbers: one says formally that they form an algebra, i.e. a vector space over the complex numbers endowed with a multiplication. This means that, given (122), the most general functions of one and two Grassmann variables can be written respectively as

$f(\xi)=c_0+c_1\,\xi\qquad(123)$

$f(\xi_1,\xi_2)=c_0+c_1\,\xi_1+c_2\,\xi_2+c_{12}\,\xi_1\xi_2\qquad(124)$

where ${c}_{0},{c}_{1},{c}_{2}$ and c12 are arbitrary complex numbers.

Integration and differentiation for Grassmann variables are defined by

$\int\mathrm{d}\xi\;1=\frac{\partial}{\partial\xi}\,1=0,\qquad\int\mathrm{d}\xi\;\xi=\frac{\partial}{\partial\xi}\,\xi=1\qquad(125)$

and these lead to the following formulae that we will use later:

$\int\mathrm{d}\xi\,\mathrm{d}\overline{\xi}\;\mathrm{e}^{a\,\overline{\xi}\xi}=a\qquad(126)$

$\frac{\int\mathrm{d}\xi\,\mathrm{d}\overline{\xi}\;\xi\,\overline{\xi}\;\mathrm{e}^{a\,\overline{\xi}\xi}}{\int\mathrm{d}\xi\,\mathrm{d}\overline{\xi}\;\mathrm{e}^{a\,\overline{\xi}\xi}}=-\frac{1}{a}\qquad(127)$

As a consequence of the above, and using two independent sets of Grassmann variables ${\xi }_{n}$ and ${\overline{\xi }}_{n}$, one has the following important representation of the determinant of a matrix ${\boldsymbol{A}}$:

$\det{\boldsymbol{A}}=\int D[\xi\overline{\xi}]\;\exp\left(\sum_{mn}\overline{\xi}_m\,A_{mn}\,\xi_n\right)\qquad(128)$

where $D[\xi \overline{\xi }]={\prod }_{n}{\rm{d}}{\xi }_{n}{\rm{d}}{\overline{\xi }}_{n}$. The integrand on the right-hand side of (128) defines a Gaussian measure for Grassmann variables under which we have $\langle {\xi }_{m}{\overline{\xi }}_{n}\rangle =-{({{\boldsymbol{A}}}^{-1})}_{{mn}}$.
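The identity (128) can be verified mechanically with a small computer-algebra sketch: represent algebra elements as dictionaries mapping ordered tuples of generators to coefficients, track signs through permutation parity, and implement the Berezin integral as a left derivative. The conventions below (ordering of the measure, sign of the exponent) follow (126)–(128) as written here; everything else is an implementation assumption.

```python
# Minimal Grassmann-algebra sketch checking det A = int D[xi xibar] exp(xibar A xi).
import numpy as np
from math import factorial

def mono_mul(m1, m2):
    """Multiply two monomials; return (sign, sorted tuple), or (0, None) if zero."""
    if set(m1) & set(m2):
        return 0, None
    arr, sign = list(m1) + list(m2), 1
    for i in range(len(arr)):                    # parity of the sorting permutation
        for j in range(i + 1, len(arr)):
            if arr[i] > arr[j]:
                arr[i], arr[j] = arr[j], arr[i]
                sign = -sign
    return sign, tuple(arr)

def mul(x, y):
    out = {}
    for m1, c1 in x.items():
        for m2, c2 in y.items():
            s, m = mono_mul(m1, m2)
            if s:
                out[m] = out.get(m, 0.0) + s * c1 * c2
    return out

def integrate(x, gen):
    """Berezin integral over one generator (left derivative, then drop)."""
    out = {}
    for m, c in x.items():
        if gen in m:
            pos = m.index(gen)
            out[m[:pos] + m[pos + 1:]] = out.get(m[:pos] + m[pos + 1:], 0.0) \
                                         + c * (-1) ** pos
    return out

M = 3
rng = np.random.default_rng(8)
A = rng.normal(size=(M, M))
# generators: xi_n -> labels 0..M-1, xibar_m -> labels M..2M-1
S = {}
for m in range(M):
    for n in range(M):
        sgn, mono = mono_mul((M + m,), (n,))     # term xibar_m A_mn xi_n
        S[mono] = S.get(mono, 0.0) + sgn * A[m, n]

expS, power = {(): 1.0}, {(): 1.0}
for k in range(1, M + 1):                        # series truncates: S is nilpotent
    power = mul(power, S)
    for m, c in power.items():
        expS[m] = expS.get(m, 0.0) + c / factorial(k)

res = expS                                       # measure prod_n d(xi_n) d(xibar_n)
for n in reversed(range(M)):
    res = integrate(res, M + n)                  # d(xibar_n), innermost first
    res = integrate(res, n)                      # d(xi_n)
print(res.get((), 0.0), np.linalg.det(A))        # should agree
```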

Employing the representation (128) for the determinant of a matrix, we can write the Jacobian that appears in the transformation (18) in the following way. First, we recall that the non-zero elements of the Jacobian are

$\frac{\partial\zeta_n}{\partial\phi_{n+1}}=1-\Delta\lambda\,f'_{n+1}\qquad(129)$

$\frac{\partial\zeta_n}{\partial\phi_n}=-1-\Delta(1-\lambda)\,f'_n\qquad(130)$

and therefore

$J[\phi]=\int D[\xi\overline{\xi}]\;\exp\left(\sum_n\overline{\xi}_n\left[\left(1-\Delta\lambda\,f'_{n+1}\right)\xi_{n+1}-\left(1+\Delta(1-\lambda)\,f'_n\right)\xi_n\right]\right)\qquad(131)$

where ${\xi }_{n}$ (with n = 1,..., M) and ${\overline{\xi }}_{n}$ (with $n=0,\ldots ,\,M-1$) are Grassmann numbers.

The dynamical partition function is defined as before in (26) but with the average now also involving integration over the Grassmann variables,

$Z[\psi]=\left\langle\exp\left(\sum_n\psi_n\phi_n\right)\right\rangle\qquad(132)$

The action reads

$S=\sum_n\left\{T\Delta\,\hat{\phi}_n^2+\mathrm{i}\hat{\phi}_n\left[\phi_{n+1}-\phi_n-\Delta\left(\lambda f_{n+1}+(1-\lambda)f_n\right)-\Delta h_{n+1}\right]\right\}-\sum_n\overline{\xi}_n\left[\left(1-\Delta\lambda\,f'_{n+1}\right)\xi_{n+1}-\left(1+\Delta(1-\lambda)\,f'_n\right)\xi_n\right]\qquad(133)$

where the second line replaces the last term in  (25). In the continuous time limit ${\rm{\Delta }}\to 0$ this can be written as

$S=\int\mathrm{d}t\left\{T\hat{\phi}^2+\mathrm{i}\hat{\phi}\left[\partial_t\phi-f(\phi)-h\right]-\overline{\xi}\left[\partial_t-f'(\phi)\right]\xi\right\}\qquad(134)$

Let us see how the inclusion of the Grassmann 'ghosts' works out for our example case of $f(\phi )=-\mu \phi -(g/3!){\phi }^{3}$. Going back to discretized time temporarily will help us understand how to treat equal-time correlations. With $f(\phi )$ as given, the action can be written as

$S=S_0+S_{\mathrm{int}}\qquad(135)$

$S_0=\sum_n\left\{T\Delta\,\hat{\phi}_n^2+\mathrm{i}\hat{\phi}_n\left[\phi_{n+1}-\phi_n+\Delta\mu\left(\lambda\phi_{n+1}+(1-\lambda)\phi_n\right)\right]-\overline{\xi}_n\left[(1+\Delta\lambda\mu)\,\xi_{n+1}-\left(1-\Delta(1-\lambda)\mu\right)\xi_n\right]\right\}\qquad(136)$

$S_{\mathrm{int}}=\sum_n\left\{\mathrm{i}\hat{\phi}_n\,\frac{g\Delta}{3!}\left[\lambda\,\phi_{n+1}^3+(1-\lambda)\,\phi_n^3\right]-\frac{g\Delta}{2}\,\overline{\xi}_n\left[\lambda\,\phi_{n+1}^2\,\xi_{n+1}+(1-\lambda)\,\phi_n^2\,\xi_n\right]\right\}\qquad(137)$

The coefficient matrix ${\boldsymbol{A}}$ appearing in the ghost term of the bare action S0 has entries $1+{\rm{\Delta }}\lambda \mu $ on the main diagonal and $-1+{\rm{\Delta }}(1-\lambda )\mu $ on the diagonal below. This matrix is easily inverted to show that the ghost covariance is

$\langle\xi_m\overline{\xi}_n\rangle_0=-\frac{1}{1+\Delta\lambda\mu}\left[\frac{1-\Delta(1-\lambda)\mu}{1+\Delta\lambda\mu}\right]^{m-n-1}\simeq-\mathrm{e}^{-\mu\Delta(m-n)}\qquad(138)$

for $m\gt n$ and 0 otherwise; the last expression applies for ${\rm{\Delta }}\to 0$. The ghost correlator is therefore causal and reads in the continuum limit:

$\langle\xi(t)\,\overline{\xi}(t')\rangle_0=-\Theta(t-t')\,\mathrm{e}^{-\mu(t-t')}\qquad(139)$

While this is λ-independent, the dependence on λ reappears in how equal-time Wick contractions are treated in the perturbative expansion in powers of $-{S}_{\mathrm{int}}$. To see this, note that up to vanishing corrections the interacting action is

$S_{\mathrm{int}}=\frac{g\Delta}{3!}\sum_n\left[\lambda\,\mathrm{i}\hat{\phi}_{n-1}+(1-\lambda)\,\mathrm{i}\hat{\phi}_n\right]\phi_n^3-\frac{g\Delta}{2}\sum_n\left[\lambda\,\overline{\xi}_{n-1}+(1-\lambda)\,\overline{\xi}_n\right]\xi_n\,\phi_n^2\qquad(140)$

The square brackets in the first line, including the factor i, define what we previously called the response field ${\eta }_{2n}$, which has equal-time correlator with ${\phi }_{n}$ of ${R}_{{nn}}^{0}=\lambda $. Similarly we could now define the combination in square brackets in the second line as a new Grassmann response field ${\tilde{\xi }}_{n}$, which has equal-time correlator with ξ of $\langle {\xi }_{n}{\tilde{\xi }}_{n}{\rangle }_{0}=\lambda \langle {\xi }_{n}{\overline{\xi }}_{n-1}{\rangle }_{0}=-\lambda $ using  (138). If for simplicity of notation we do not distinguish between $\tilde{\xi }$ and $\overline{\xi }$ (and similarly ${\eta }_{2}$ and $\hat{\phi }$) then the upshot of this discussion is that for ${\rm{\Delta }}\to 0$ one can work directly with the continuous-time version

$S_{\mathrm{int}}=\frac{g}{3!}\int\mathrm{d}t\;\mathrm{i}\hat{\phi}\,\phi^3-\frac{g}{2}\int\mathrm{d}t\;\overline{\xi}\,\xi\,\phi^2\qquad(141)$

of the interacting action, provided that one remembers that the equal-time ghost correlator has to be set to $\langle \xi (t)\overline{\xi }(t){\rangle }_{0}=-\lambda $ in any Wick contractions, and similarly $\langle \phi (t)i\hat{\phi }(t){\rangle }_{0}=\lambda $. Note that there are never any contractions between ordinary and Grassmann variables because the average of any single Grassmann variable over a (Grassmann) Gaussian vanishes.

To see how the perturbation theory with ghosts works in practice, consider the response function to first order in the perturbation:

$R(t,t')=\left\langle\phi(t)\,\mathrm{i}\hat{\phi}(t')\right\rangle_S=\left\langle\phi(t)\,\mathrm{i}\hat{\phi}(t')\,\mathrm{e}^{-S_{\mathrm{int}}}\right\rangle_0\qquad(142)$

$=R_0(t,t')-\frac{g}{3!}\int\mathrm{d}s\,\left\langle\phi(t)\,\mathrm{i}\hat{\phi}(t')\;\mathrm{i}\hat{\phi}(s)\,\phi^3(s)\right\rangle_0+\frac{g}{2}\int\mathrm{d}s\,\left\langle\phi(t)\,\mathrm{i}\hat{\phi}(t')\;\overline{\xi}(s)\,\xi(s)\,\phi^2(s)\right\rangle_0+{\mathcal{O}}(g^2)\qquad(143)$

The physical piece of this is the first term and the contraction without equal-time response factors in the first integral:

$R_0(t,t')-\frac{g}{2}\int\mathrm{d}s\;R_0(t,s)\,C_0(s,s)\,R_0(s,t')\qquad(144)$

When $\lambda \ne 0$, i.e. for any convention other than Ito, there are two other non-zero contractions of the first integral:

$-\frac{g\lambda}{2}\,R_0(t,t')\int\mathrm{d}s\;C_0(s,s)-g\lambda\int\mathrm{d}s\;C_0(t,s)\,R_0(s,t')\qquad(145)$

In addition the two possible Wick contractions of the ghost term in the second line of  (143) give, bearing in mind that $\langle \overline{\xi }\xi {\rangle }_{0}=-\langle \xi \overline{\xi }{\rangle }_{0}$,

$+\frac{g\lambda}{2}\,R_0(t,t')\int\mathrm{d}s\;C_0(s,s)+g\lambda\int\mathrm{d}s\;C_0(t,s)\,R_0(s,t')\qquad(146)$

Because $\langle \phi \,{\rm{i}}\hat{\phi }{\rangle }_{0}=\lambda $ at equal-time while the ghost correlator has the opposite sign $\langle \xi \overline{\xi }{\rangle }_{0}=-\lambda $, the terms in  (145) and  (146) exactly cancel each other as they should.

In other words, whether or not we include the Jacobian, the extra terms  (145) do not appear in the perturbation theory, either because they are simply equal to zero, or because they cancel with the additional terms  (146) that arise from the ghost correlation functions from the Grassmann representation of the Jacobian. As will be elaborated in the next section, this is formally a consequence of a symmetry of the theory that represents the fact that the path probabilities are normalized to one.

7.2. Manifestly supersymmetric description

Let us now define the superfield Φ as

${\rm{\Phi }}(t,\theta,\overline{\theta})=\phi(t)+\overline{\theta}\,\xi(t)+\overline{\xi}(t)\,\theta+\overline{\theta}\theta\;\mathrm{i}\hat{\phi}(t)\qquad(147)$

where θ and $\overline{\theta }$ are themselves Grassmann variables and are sometimes referred to as Grassmann time. The action  (134) can then be written in terms of Φ in a compact way that reveals unexpected symmetries of the problem.

To see this, it is convenient to define

$V(\phi)=-\int^{\phi}\mathrm{d}\phi'\;f(\phi')=\frac{\mu}{2}\,\phi^2+\frac{g}{4!}\,\phi^4\qquad(148)$

so $f(\phi )=-{\rm{d}}{V}(\phi )/{\rm{d}}\phi $. The existence of the potential V is crucial for the supersymmetric description; it cannot be used for systems of more than one variable when these are non-equilibrium in the sense that the drift f cannot be written as a gradient.

In our current one-dimensional example where a potential can always be defined we have, by Taylor expansion around $V(\phi )$ and throwing away terms that vanish because of (122),

$V({\rm{\Phi }})=V(\phi)+V'(\phi)\left(\overline{\theta}\xi+\overline{\xi}\theta+\overline{\theta}\theta\,\mathrm{i}\hat{\phi}\right)+\frac{1}{2}\,V''(\phi)\left(\overline{\theta}\xi+\overline{\xi}\theta\right)^2\qquad(149)$

$\left(\overline{\theta}\xi+\overline{\xi}\theta\right)^2=-2\,\overline{\theta}\theta\;\overline{\xi}\xi\qquad(150)$

which leads to

$\int\mathrm{d}\theta\,\mathrm{d}\overline{\theta}\;V({\rm{\Phi }})=\mathrm{i}\hat{\phi}\,V'(\phi)-V''(\phi)\,\overline{\xi}\xi\qquad(151)$

giving us two of the terms in  (134).

The remaining terms can be written in terms of derivatives of Φ. We have

$\partial_\theta{\rm{\Phi }}=-\overline{\xi}-\overline{\theta}\,\mathrm{i}\hat{\phi}\qquad(152)$

$\partial_{\overline{\theta}}{\rm{\Phi }}=\xi+\theta\,\mathrm{i}\hat{\phi}\qquad(153)$

$\theta\,\partial_t{\rm{\Phi }}=\theta\,\partial_t\phi-\overline{\theta}\theta\,\partial_t\xi\qquad(154)$

If we then evaluate the quantity

$T\int\mathrm{d}\theta\,\mathrm{d}\overline{\theta}\;\left(\partial_\theta{\rm{\Phi }}\right)\left(\partial_{\overline{\theta}}{\rm{\Phi }}\right)=T\hat{\phi}^2\qquad(155)$

we find that we get the $T{\hat{\phi }}^{2}$ term in the action (134). Similarly, we find

$\int\mathrm{d}\theta\,\mathrm{d}\overline{\theta}\;\left(\partial_\theta{\rm{\Phi }}\right)\theta\,\partial_t{\rm{\Phi }}=\overline{\xi}\,\partial_t\xi-\mathrm{i}\hat{\phi}\,\partial_t\phi\qquad(156)$

which are the negatives of the terms in (134) involving time derivatives. Putting all these results together and defining ${\rm{d}}\tau \equiv {\rm{d}}{t}\,{\rm{d}}\theta \,{\rm{d}}\overline{\theta }$, we can write the action in the form

$S=\int\mathrm{d}\tau\left[\left(\partial_\theta{\rm{\Phi }}\right)\left(T\,\partial_{\overline{\theta}}-\theta\,\partial_t\right){\rm{\Phi }}+V({\rm{\Phi }})\right]\qquad(157)$

It will be handy to introduce the notation

$\overline{D}=\partial_\theta\qquad(158)$

$D=T\,\partial_{\overline{\theta}}-\theta\,\partial_t\qquad(159)$

so that

$S=\int\mathrm{d}\tau\left[\overline{D}{\rm{\Phi }}\;D{\rm{\Phi }}+V({\rm{\Phi }})\right]\qquad(160)$

Up to here the formalism is general enough—subject to the existence of the potential V—that it can describe also non-stationary dynamics, e.g. relaxation to equilibrium. From now on we restrict ourselves further by assuming we have a stationary state. The supersymmetric action then has several symmetries; the obvious one is time translation invariance. But it is also invariant under other 'translations' that involve shifts in the Grassmann times θ and $\overline{\theta }$. In what follows, we identify these invariances and investigate their physical meanings [15, 24].

Consider a 'translation' generated by the operator

$D'=\partial_{\overline{\theta}}\qquad(161)$

This produces a shift $\overline{\theta }\to \overline{\theta }+\epsilon $ and a change in Φ:

$\delta{\rm{\Phi }}=\epsilon\,D'{\rm{\Phi }}=\epsilon\,\partial_{\overline{\theta}}{\rm{\Phi }}\qquad(162)$

where $\epsilon$ is a Grassmann 'infinitesimal' that acts like a separate Grassmann variable. Now, by using (153) (or simply by substituting $\overline{\theta }\to \overline{\theta }+\epsilon $ in the definition (147) of the superfield), we find

${\rm{\Phi }}\to\left(\phi+\epsilon\,\xi\right)+\overline{\theta}\,\xi+\left(\overline{\xi}+\epsilon\,\mathrm{i}\hat{\phi}\right)\theta+\overline{\theta}\theta\;\mathrm{i}\hat{\phi}\qquad(163)$

But according to the definition of the superfield, whatever appears in the expression (163) not multiplied by θ, $\overline{\theta }$ or $\overline{\theta }\theta $ should be identified as the new ϕ, and whatever appears multiplied (from the right) by θ should be identified as the new $\overline{\xi }$. The component fields therefore transform as

$\phi\to\phi+\epsilon\,\xi\qquad(164)$

$\xi\to\xi\qquad(165)$

$\overline{\xi}\to\overline{\xi}+\epsilon\,\mathrm{i}\hat{\phi}\qquad(166)$

$\hat{\phi}\to\hat{\phi}\qquad(167)$

This transformation is called supersymmetric because it mixes 'bosonic' degrees of freedom (i.e. ϕ and $\hat{\phi }$) with 'fermionic' (i.e. Grassmann) ones.

To verify that such a transformation is indeed a symmetry of the model, we proceed analogously to what we would do to test, say, rotational invariance in an ordinary field theory with vector fields. We start by performing the 'rotation' (164)–(167) on the superfield components in each term of the integrand in (160), i.e. we substitute the transformed Φ and carry out the required derivatives with respect to time and Grassmann time. In general, this leads to a change in the integrand, which we then integrate over τ (i.e. over θ, $\overline{\theta }$ and t). If the result is zero we have a symmetry. As an example, let us see how the shift generated by ${D}^{\prime} $ affects the 'kinetic' term $\overline{{D}}{\rm{\Phi }}{D}{\rm{\Phi }}$ in the action. From (152) we see, using (166) and (167), that

Equation (168)

For the change in ${\rm{D}}{\rm{\Phi }}$, we see from (153) that the first term ${\partial }_{\overline{\theta }}{\rm{\Phi }}$ does not change as it involves only ξ and $\hat{\phi }$, and these do not change under (165) and (167). From (154), (164) and (165), the change in $\theta {\partial }_{t}{\rm{\Phi }}$ is

Equation (169)

Putting these contributions together, the change in $\overline{{D}}{\rm{\Phi }}{D}{\rm{\Phi }}$ is

Equation (170)

Only terms proportional to $\theta \overline{\theta }$ will survive the integration over θ and $\overline{\theta }$, but these cancel:

Equation (171)

A similar calculation shows that the 'potential' term $\int {\rm{d}}\tau \,V({\rm{\Phi }})$ is also unchanged. Thus, the action S is invariant under ${D}^{\prime} ={\partial }_{\overline{\theta }}$.

An analogous calculation shows that the kinetic term is not invariant under shifts generated by $\overline{{D}}={\partial }_{\theta }$. However, we can try combining a shift in θ with one in time, using the generator

Equation (172)

and see whether there is a value of α for which $\overline{{D}}{\rm{\Phi }}{D}{\rm{\Phi }}$ is invariant. Now the transformation of ξ acquires a new term proportional to ${\partial }_{t}\phi $,

Equation (173)

and $\hat{\phi }$ is also changed, proportional to ${\partial }_{t}\overline{\xi }$:

Equation (174)

The remaining variables ϕ and $\overline{\xi }$ transform according to

Equation (175)

Equation (176)

Then, after some algebra, we find that, under our trial $\overline{{D}}^{\prime} $, $\overline{{D}}{\rm{\Phi }}{D}{\rm{\Phi }}$ is changed by

Equation (177)

(here $\beta =1/T$). Integrating over θ and $\overline{\theta }$ leaves

Equation (178)

so we see that with the choice $\alpha =\beta $ this just reduces to

Equation (179)

This vanishes on integration over t (for stationary initial and final states), proving the invariance of this part of the action. Again the proof of invariance for the $V({\rm{\Phi }})$ term is similar to that for ${D}^{\prime} $, so the total action is invariant under $\overline{{D}}^{\prime} $.

The reader who is uneasy with the formal manipulations using superfields here can check these results by applying the transformations (164)–(167) for ${D}^{\prime} $ and (173)–(176) for $\overline{{D}}^{\prime} $ directly to ϕ, $\hat{\phi }$, ξ and $\overline{\xi }$, in the form (134) of the action not using superfields.

To see the meaning of these supersymmetries, we consider the supercorrelation function

Equation (180)

where 1 stands for $({t}_{1},{\theta }_{1},\overline{{\theta }_{1}})$ and analogously for 2. When we use (147) and expand the product, many terms vanish, either because the averages are of products of Grassmann variables with ordinary ones or of a pair of $\hat{\phi }$'s. The remaining terms are

Equation (181)

where ${\phi }_{1}$ means $\phi ({t}_{1})$, etc.

From the invariance of the action under ${D}^{\prime} $ (translations in $\overline{\theta }$), we have

Equation (182)

The vanishing of the term proportional to ${\theta }_{2}$ says that the ghost correlation function $\langle {\xi }_{1}{\overline{\xi }}_{2}\rangle $ has to be the negative of the response function ${\rm{i}}\langle {\phi }_{1}{\hat{\phi }}_{2}\rangle $ (as in (139)). The vanishing of the term proportional to ${\theta }_{1}$ says the same thing if we notice that the ghost correlation function here is $\langle \overline{\xi }\xi \rangle $, not $\langle \xi \overline{\xi }\rangle $. Thus, invariance under ${D}^{\prime} $, through its enforcement of the cancellation of disconnected diagrams, is just conservation of probability.

Analogously, for the $\overline{{D}}^{\prime} $ symmetry (172), we have

Equation (183)

This gives

Equation (184)

For ${t}_{1}\gt {t}_{2}$, $\langle {\hat{\phi }}_{1}{\phi }_{2}\rangle $ vanishes, so we have, also using ${\partial }_{{t}_{2}}\langle {\phi }_{1}{\phi }_{2}\rangle =-{\partial }_{{t}_{1}}\langle {\phi }_{1}{\phi }_{2}\rangle $ from time translation invariance,

Equation (185)

Thus,

Equation (186)

which is the FDT.

To summarize, the theory has three invariances: time translation, ${D}^{\prime} $ and $\overline{{D}}^{\prime} $. ${D}^{\prime} $ expresses conservation of probability, while $\overline{{D}}^{\prime} $ expresses the FDT, i.e. the fact that the system is in equilibrium.
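The FDT (186) is easy to verify numerically for the free theory, where the Langevin dynamics reduces to an Ornstein–Uhlenbeck process ${\partial }_{t}\phi =-\mu \phi +\zeta $ with $\langle \zeta (t)\zeta (t^{\prime} )\rangle =2T\delta (t-t^{\prime} )$. The following minimal Python sketch (all parameter values are illustrative) estimates the equilibrium correlation function $C(t)$ from unperturbed trajectories and the integrated response $\chi (t)={\int }_{0}^{t}{\rm{d}}{s}\,R(s)$ from a small field switched on at t = 0, and compares χ with $\beta [C(0)-C(t)]$, the integrated form of the FDT:

import numpy as np

rng = np.random.default_rng(0)
mu, T, dt, n_steps, n_samples, h = 1.0, 0.5, 0.01, 400, 20000, 0.5

def simulate(field):
    # Euler-Maruyama for d(phi) = (-mu*phi + field)*dt + sqrt(2T)*dW,
    # started from the equilibrium Gaussian of the unperturbed model.
    phi = rng.normal(0.0, np.sqrt(T / mu), size=n_samples)
    phi0 = phi.copy()
    C, m = [np.mean(phi * phi0)], [phi.mean()]
    for _ in range(n_steps):
        phi += dt * (-mu * phi + field) + np.sqrt(2 * T * dt) * rng.normal(size=n_samples)
        C.append(np.mean(phi * phi0))
        m.append(phi.mean())
    return np.array(C), np.array(m)

C, _ = simulate(0.0)      # equilibrium correlation function C(t) = <phi(t) phi(0)>
_, m = simulate(h)        # mean response to a field h switched on at t = 0
chi = m / h               # integrated response chi(t); exact here since the model is linear
fdt = (C[0] - C) / T      # beta * [C(0) - C(t)]
for k in (0, 100, 200, 400):
    print(f"t = {k * dt:4.1f}   chi = {chi[k]: .3f}   beta*[C(0)-C(t)] = {fdt[k]: .3f}")

Both columns approach $(1-{{\rm{e}}}^{-\mu t})/\mu $ and agree up to sampling noise, as the FDT demands.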

So far, we have treated a single-site problem. It is straightforward to extend the formalism to multiple degrees of freedom ${\phi }_{i}$. However, as emphasized before, the supersymmetric construction is permitted only when the drift is the negative gradient of a potential, ${f}_{i}=-\partial V/\partial {\phi }_{i}$. Otherwise the $\overline{{D}}^{\prime} $ supersymmetry fails. This means that the FDT is not obeyed; the system, even though it may possess a steady state, is not in equilibrium. (Of course, it is well-known from simple arguments making no reference to supersymmetry that models with non-gradient drifts, as arising e.g. from asymmetric coupling matrices, do not satisfy the FDT.)

7.3. Superdiagrams

In addition to the insight it provides into the symmetries of the problem, the supersymmetric formulation can also be of practical advantage in calculations. For example, in diagrammatic perturbation theory, one need only draw the diagrams for the static problem. Another example can be found in Biroli's analysis [11] of dynamical TAP equations for the p-spin glass, where the entire structure of the argument and the equations could be carried over from the static treatment of Plefka [25]. Here we sketch how to do diagrammatic perturbation theory in the superfield language and show, in a simple example, how it reduces to the diagrams we had in the MSR formalism with the $\langle \phi (t)\phi (t^{\prime} )\rangle $ and $\langle \phi (t)\hat{\phi }(t^{\prime} )\rangle $ correlation functions [20].

We write our standard model  (14) in the form

Equation (187)

with

Equation (188)

and

Equation (189)

To do perturbation theory, we can expand $\exp (-{S}_{\mathrm{int}})$ in g and apply Wick's theorem to evaluate the resulting averages, just as we did in section 5. Wick's theorem holds also for Grassmann variables although in principle one has to be careful with sign changes that arise from changing the order of the variables when performing contractions. For example, $\langle {\xi }_{1}{\overline{\xi }}_{1}{\xi }_{2}{\overline{\xi }}_{3}\rangle =\langle {\xi }_{1}{\overline{\xi }}_{1}\rangle \langle {\xi }_{2}{\overline{\xi }}_{3}\rangle -\langle {\xi }_{1}{\overline{\xi }}_{3}\rangle \langle {\xi }_{2}{\overline{\xi }}_{1}\rangle $. Fortunately this is not an issue in our context as we only need averages of powers of the superfield Φ, and from (147) Φ only contains products of pairs of Grassmann variables: commuting such pairs through each other never gives any minus signs.
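The sign bookkeeping can be made concrete with a small self-contained Python sketch that enumerates all pairings of a string of ghost operators together with the fermionic permutation sign. The operator labels and the rule that only $\langle \xi \overline{\xi }\rangle $ contractions are non-zero are illustrative conventions of ours, chosen to match the example just given:

def wick(ops):
    # All pairings of the operator list, each with its fermionic permutation sign.
    if not ops:
        return [(1, [])]
    first, rest = ops[0], ops[1:]
    out = []
    for k, partner in enumerate(rest):
        sign = (-1) ** k              # moving 'partner' next to 'first' crosses k operators
        remaining = rest[:k] + rest[k + 1:]
        for s, pairs in wick(remaining):
            out.append((sign * s, [(first, partner)] + pairs))
    return out

def normalize(sign, pairs):
    # Orient every pair as <xi xibar>; contractions <xi xi> and <xibar xibar> vanish.
    oriented = []
    for a, b in pairs:
        if a[0] == 'xibar' and b[0] == 'xi':
            a, b, sign = b, a, -sign  # one anticommutation to reorder the pair
        if a[0] != 'xi' or b[0] != 'xibar':
            return None
        oriented.append((a, b))
    return sign, oriented

ops = [('xi', 1), ('xibar', 1), ('xi', 2), ('xibar', 3)]
for sign, pairs in wick(ops):
    res = normalize(sign, pairs)
    if res is not None:
        sign, pairs = res
        term = ' '.join(f"<xi_{a[1]} xibar_{b[1]}>" for a, b in pairs)
        print(('+' if sign > 0 else '-') + ' ' + term)

Running this prints the two terms quoted above, $+\langle {\xi }_{1}{\overline{\xi }}_{1}\rangle \langle {\xi }_{2}{\overline{\xi }}_{3}\rangle $ and $-\langle {\xi }_{1}{\overline{\xi }}_{3}\rangle \langle {\xi }_{2}{\overline{\xi }}_{1}\rangle $.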

The first thing we need for the perturbation theory is the correlation function or propagator of the non-interacting system. Integrating (188) by parts, we get

Equation (190)

It is convenient to write this as

Equation (191)

where ${[\ldots ]}_{-}$ and ${[\ldots ]}_{+}$ denote the commutator and anti-commutator respectively. It is then simple to show that

Equation (192)

and

Equation (193)

The term in (191) involving the anticommutator can be neglected because it vanishes on time integration, so we can write S0 in the appealing form

Equation (194)

with${}^{9}$

Equation (195)

From this, we identify

Equation (196)

as the inverse of the free propagator:

Equation (197)

where we have used the fact that $({\overline{\theta }}_{1}-{\overline{\theta }}_{2})({\theta }_{1}-{\theta }_{2})$ acts as a delta-function in the Grassmann times. Now, multiplying by $({{D}}^{(2)}+\mu )$ from the left, using the fact that ${({{D}}^{(2)})}^{2}={\partial }_{t}^{2}$, and Fourier transforming in time, we arrive at

Equation (198)

Back in the time domain, this is

Equation (199)

where $t={t}_{1}-{t}_{2}$. It is comforting to confirm that we can get this result in another, simpler way, using the representation (181) with the equalities implied by ${D}^{\prime} $ invariance (182):

Equation (200)

Using the free correlation and response functions (3) and (4) here then leads to (199).

To see how the diagrammatics work in this formalism, consider the second-order watermelon graph  (113) for the self-energy,

Equation (201)

(Here we are doing the resummed expansion in which Σ is a functional of the full correlation function Q, not just Q0, as in section 6.)

From the representation (200), the part of $Q(1,2)$ with no Grassmann times multiplying it is the correlation function, and the parts multiplied by $\pm ({\overline{\theta }}_{2}-{\overline{\theta }}_{1}){\theta }_{\mathrm{1,2}}$ are the retarded and advanced response functions, respectively. Expanding $Q{(1,2)}^{3}$, we get

Equation (202)

This has the same form as (200). We note that the components of ${\rm{\Sigma }}(1,2)$ here are just the ${ \mathcal O }({g}^{2})$ contributions to the self-energies ${{\rm{\Sigma }}}_{12}$, ${{\rm{\Sigma }}}_{21}$ and ${{\rm{\Sigma }}}_{22}$ that we found in the conventional MSR theory (116) and (117), including the factor of 3 in ${{\rm{\Sigma }}}_{12}$ and ${{\rm{\Sigma }}}_{21}$.

It is useful to discuss in more detail how perturbation-theoretic corrections to $Q(1,2)$ in the supersymmetric formulation correspond to results obtained in the conventional MSR theory. The two theories differ superficially in two ways: (1) In the conventional theory we have to keep track of two kinds of correlation functions ($\langle \phi \phi \rangle $ and ${\rm{i}}\langle \phi \hat{\phi }\rangle $) while in the supersymmetric formulation we have only one (super)field and need only draw the diagrams we would have in statics. (2) In the supersymmetric theory both real and Grassmann times of intermediate vertices in the graphs are integrated over, while in the conventional theory only ordinary times are integrated over. To compare the two ways of doing the calculation, we consider the first term obtained in expanding the Dyson equation in ${\rm{\Sigma }}(1,2)$:

Equation (203)

To resolve ${\rm{\Delta }}Q(1,4)$ into components, we use first the fact that Grassmann factors of the form $({\overline{\theta }}_{2}-{\overline{\theta }}_{1}){\theta }_{2}$ are idempotent under convolution, e.g.

Equation (204)

That means that the retarded part of ${\rm{\Delta }}Q(1,4)$ (the part proportional to $({\overline{\theta }}_{4}-{\overline{\theta }}_{1}){\theta }_{4}$) is, in the notation we have used earlier ($\langle \phi ({t}_{1})\phi ({t}_{2})\rangle =C({t}_{1},{t}_{2})$, ${\rm{i}}\langle \phi ({t}_{1})\hat{\phi }({t}_{2})\rangle =R({t}_{1},{t}_{2})$)

Equation (205)

and analogously for the advanced part.

We also get a contribution to ${\rm{\Delta }}Q(1,4)$ from the first term in ${\rm{\Sigma }}(2,3)$. Since that term contains no Grassmann times, this contribution will contain a factor

Equation (206)

These integrations pick out factors of the retarded function $R({t}_{1},{t}_{2})$ and the advanced function $R({t}_{4},{t}_{3})$, respectively, so we find a contribution, involving no Grassmann times, of

Equation (207)

These are exactly the second-order contributions to $R({t}_{1},{t}_{4})$ and $C({t}_{1},{t}_{4})$ that we would find in the conventional formulation from expanding the Dyson equation to first order in the self-energies ${{\rm{\Sigma }}}_{12}$ and ${{\rm{\Sigma }}}_{22}$ in (116) and (117), again up to ${ \mathcal O }(g)$ terms not written explicitly here. Thus, because of the algebra of the Grassmann times in the supersymmetric formulation, the results of the multiplication and convolution of the supercorrelation functions, when reduced to components, reproduce the terms found in the conventional MSR theory. This result extends to all graphs in perturbation theory, because it depends only on (1) the idempotency of factors like $({\overline{\theta }}_{2}-{\overline{\theta }}_{1}){\theta }_{2}$ and (2) the fact that multiple correlator lines between a pair of interaction vertices, like those in (201), combine as in (202).

8. An interacting example

To generalize the discussion so far to more interesting interacting models one can for example add a linear interaction between different degrees of freedom or 'soft spins' ${\phi }_{i}$. This gives the Langevin equation of motion

Equation (208)

where we assume that there are no self-interactions, hence ${J}_{{ii}}=0$. If the couplings are otherwise symmetric, ${J}_{{ij}}={J}_{{ji}}$, then this dynamics obeys detailed balance because it represents noisy gradient descent ${\partial }_{t}{\phi }_{i}=-{\partial }_{{\phi }_{i}}H+{\zeta }_{i}(t)$ on the energy function

Equation (209)

which can be thought of as a soft spin version of the Sherrington–Kirkpatrick model [26]. The diagrammatic technique in its MSR incarnation (including fermionic ghost variables, if one chooses to follow the Stratonovich convention) can be applied irrespective of any such restriction, i.e. whether or not the system obeys detailed balance. A supersymmetric treatment, on the other hand, does require symmetric interactions.

The generating functional has the form  (26) with action, written directly in the continuous time limit for h = 0 and using the Ito convention,

Equation (210)

In this action the interaction gives an additional vertex with two legs, $\int {\rm{d}}{t}\,{\sum }_{{ij}}{J}_{{ij}}{\hat{\phi }}_{i}(t){\phi }_{j}(t)$. (As before we do not introduce a new symbol for this vertex as the meaning is clear from the number of legs attached.) The diagrammatic expansion now becomes a joint expansion in both g and the interaction amplitude; formally one could set ${J}_{{ij}}=J{\hat{J}}_{{ij}}$, consider the ${\hat{J}}_{{ij}}$ fixed and expand in J. To illustrate the new diagrams that appear we restrict ourselves to g = 0. The expansion of the propagator to second order in J is then simply a chain of up to three bare propagators joined by the new two-legged interaction vertices.

All propagators and vertices now carry site indices in addition to the time index and the 'sector' label 1 or 2 for physical (ϕ) and conjugate ($\hat{\phi }$) fields. If we focus on the response function part of the propagator, the expansion becomes

Equation (211)

Writing out the diagrams this translates to

Equation (212)

where all internal site indices ($k,l,m,n$) are to be summed over. Here ${R}_{{ij}0}(t,t^{\prime} )$ is the response function of the unperturbed dynamics, which because of the absence of interactions is diagonal in the site indices, ${R}_{{ij}0}(t,t^{\prime} )={\delta }_{{ij}}{R}_{0}(t,t^{\prime} )$. The time integrals become simple products in frequency space, giving for the Fourier transform ${R}_{{ij}}(\omega )=\int {\rm{d}}{t}\,{R}_{{ij}}(t,t^{\prime} ){{\rm{e}}}^{{\rm{i}}\omega (t-t^{\prime} )}$

Equation (213)

or in matrix form

Equation (214)

Equation (215)

Inserting ${{\boldsymbol{R}}}_{0}^{-1}(\omega )=(\mu -{\rm{i}}\omega ){\boldsymbol{I}}$ where ${\boldsymbol{I}}$ is the identity matrix gives ${\boldsymbol{R}}(\omega )={[(\mu -{\rm{i}}\omega ){\boldsymbol{I}}-{\boldsymbol{J}}]}^{-1}$. Given that we are considering g = 0 where the dynamics is purely linear, this is easily seen to be the exact result. This was possible to obtain here because we were able to sum up all diagrams, or equivalently because only a single diagram (the bare quadratic vertex) contributes to the self-energy.
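This resummation is simple to check numerically. The Python sketch below draws an arbitrary illustrative sample of weak couplings and compares the exact g = 0 response ${\boldsymbol{R}}(\omega )={[(\mu -{\rm{i}}\omega ){\boldsymbol{I}}-{\boldsymbol{J}}]}^{-1}$ with the expansion (213) truncated at second order in J:

import numpy as np

rng = np.random.default_rng(1)
N, mu, omega = 200, 1.0, 0.3
J = 0.1 * rng.normal(size=(N, N)) / np.sqrt(N)   # illustrative weak couplings
np.fill_diagonal(J, 0.0)                         # no self-interactions

R0 = 1.0 / (mu - 1j * omega)                     # bare propagator (site-diagonal)
R_exact = np.linalg.inv((mu - 1j * omega) * np.eye(N) - J)
R_pert = R0 * np.eye(N) + R0**2 * J + R0**3 * (J @ J)   # (213) up to second order in J
print("max |exact - perturbative| :", np.abs(R_exact - R_pert).max())

The maximal deviation is ${ \mathcal O }({J}^{3})$, as expected for a geometric series truncated after its third term.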

Up to here the discussion applies for general Jij. To simplify further one needs to make assumptions on the statistics of these interactions. One interesting case is that of a soft-spin Sherrington–Kirkpatrick model for which the Jij are zero-mean Gaussian random variables, uncorrelated for different index pairs ij, except that the symmetry in the interaction matrix is imperfect:

Equation (216)

The symmetry parameter has the value $\kappa =1$ for a fully symmetric matrix, while $\kappa =0$ gives a fully asymmetric matrix.

For the local response function Rii one can in general simplify  (213) to

Equation (217)

where the term of first order in Jij vanishes due to the lack of self-interactions. For the soft-spin SK model, the sum ${\sum }_{m}{J}_{{im}}{J}_{{mi}}$ in the second order term has average $(N-1)\kappa {J}^{2}/N=\kappa {J}^{2}+{ \mathcal O }(1/N)$ while its variance is ${ \mathcal O }(1/N)$. For large N it is therefore self-averaging, i.e. equal to $\kappa {J}^{2}$ with probability one. Hence

Equation (218)

As one might have expected, because this model has each node interacting with all others the nodes become equivalent, making the local response functions Rii independent of i. It remains true at higher orders that the local response only depends on the overall coupling amplitude J and the symmetry parameter κ. This can be shown by a separate diagrammatic argument [2729] or by explicit averaging of the path generating function Z over the disorder, i.e. the statistics of the Jij. This latter approach is the subject of the following section.
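The self-averaging statement is also easy to test. One standard construction that realizes the statistics (216) mixes independent symmetric and antisymmetric Gaussian matrices with weights $a=\sqrt{(1+\kappa )/2}$ and $b=\sqrt{(1-\kappa )/2}$; the construction and all parameter values in this sketch are illustrative:

import numpy as np

rng = np.random.default_rng(2)
N, J, kappa = 2000, 1.0, 0.6
a, b = np.sqrt((1 + kappa) / 2), np.sqrt((1 - kappa) / 2)
G, H = rng.normal(size=(N, N)), rng.normal(size=(N, N))
S = (G + G.T) / np.sqrt(2)                  # symmetric, unit-variance entries
A = (H - H.T) / np.sqrt(2)                  # antisymmetric, unit-variance entries
Jmat = (J / np.sqrt(N)) * (a * S + b * A)   # realizes the correlations (216)
np.fill_diagonal(Jmat, 0.0)

loops = np.einsum('im,mi->i', Jmat, Jmat)   # sum_m J_im J_mi, one value per site i
print("mean:", loops.mean(), "  std:", loops.std(), "  target kappa*J^2:", kappa * J**2)

The site-to-site standard deviation is ${ \mathcal O }(1/\sqrt{N})$, consistent with the ${ \mathcal O }(1/N)$ variance quoted above.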

9. Quenched averaged dynamics of soft spin models

In the previous sections, we described how the dynamics of a set of interacting scalar variables ${\phi }_{i}$ that evolve according to a stochastic differential equation can be studied perturbatively using the path integral formalism. When the evolution equations depend on random quantities such as the Jij above, then instead of studying a single system with fixed Jij one can consider an ensemble of systems, each with a different set of interactions Jij drawn from some distribution. The resulting ensemble averages are expected to reflect the behavior of a typical sample as far as macroscopic quantities such as the average local response $(1/N){\sum }_{i}{R}_{{ii}}$ are concerned: these are self-averaging, i.e. to leading order dependent only on the overall statistics of the interactions in the system. Where the interactions are weak and long-ranged as in the soft-spin SK model, all spins also become equivalent so even the local responses Rii are self-averaging as we saw above. This would not be true, for example, for systems with interactions on a network with finite connectivity.

We illustrate the approach for the dynamics given by (208), with dynamic action in (210) and assuming Gaussian statistics for the Jij as specified in (216). The derivation that we outline here was first described in [4, 30] for symmetric interactions; the role of the degree of asymmetry in the interactions was analysed later in [31].

Using the fact that the average of $\exp ({\rm{i}}{\boldsymbol{x}}\cdot {\boldsymbol{z}})$ over a vector of zero mean Gaussian random variables ${\boldsymbol{z}}$ with covariance ${{\boldsymbol{A}}}^{-1}$ is $\exp (-{\boldsymbol{x}}\cdot {{\boldsymbol{A}}}^{-1}{\boldsymbol{x}}/2)$, the ${\boldsymbol{J}}$-average (denoted by an overline) of the part of the generating functional Z that depends on the Jij reads

Equation (219)

Equation (220)

Equation (221)

One can drop the restriction $i\ne j$ for large N as including the i = j terms only gives subleading corrections. A mean-field decoupling of the quartic terms gives

Equation (222)

Equation (223)

Here

Equation (224)

are the average local correlation and response functions of the system. They are in principle dependent on the specific trajectory of the system, but from the law of large numbers will assume deterministic values for $N\to \infty $. This heuristic argument can be justified by a formal calculation introducing conjugate order parameters to C and R and making a saddle point argument [4, 30, 31]. Because I now factorizes over sites i, the entire ensemble-averaged partition function $\overline{Z}$ factorizes as well. The contributions from Li give an effective noise with correlation function ${J}^{2}C(t,t^{\prime} )$, and a delay term with memory kernel $\kappa R(t,t^{\prime} )$.

All spins have the same action, so we can drop the index i and write down the effective dynamics as

Equation (225)

Equation (226)

The correlation and response need to be found self-consistently from this, and to facilitate calculation of the response we have added back in the field term h(t).

Let us now take a moment to study the dynamics of the bare model (g = 0) that arises from the above equation. At g = 0, and assuming that for long times a stationary regime is reached where $R(t,t^{\prime} )=R(t-t^{\prime} )$, the equation of motion in frequency space is

Equation (227)

Averaging both sides over the noise term ${\zeta }_{\mathrm{eff}}$ gives for the mean $m(\omega )=\langle \phi (\omega )\rangle $

Equation (228)

From this, we find the response function

Equation (229)

To leading order in J2 this gives ${R}^{-1}(\omega )={R}_{0}^{-1}(\omega )-\kappa {J}^{2}{R}_{0}(\omega )$, which after inverting and re-expanding to ${ \mathcal O }({J}^{2})$ agrees with the perturbative result  (218) as it should.

The critical dynamics of the system can be understood via the low frequency behavior of the response function. In fact, one can define a characteristic response time scale as

Equation (230)

Taking the ω-derivative of both sides of (229) and multiplying by R gives

Equation (231)

and evaluating at $\omega =0$ one finds

Equation (232)

The response timescale can therefore be expressed as

Equation (233)

In other words, when $R(0)={J}^{-1}$, the relaxation time scale of the system diverges and the system exhibits critical slowing down. From (229) one sees that the critical value of J obeys $J=\mu -{J}^{2}/J$, hence $J=\mu /2$. Using the explicit solution of (229), which reads

Equation (234)

shows further that at criticality the response function has a singularity $\propto \,{\omega }^{1/2}$ for $\omega \to 0$. In the time domain this corresponds to a power law tail $R(t-t^{\prime} )\propto {(t-t^{\prime} )}^{-3/2}$, which is responsible for the diverging mean timescale.
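Both features can be exhibited numerically from the self-consistency relation (229) at g = 0. The sketch below treats the fully symmetric case $\kappa =1$, where (229) becomes the quadratic equation ${J}^{2}{R}^{2}(\omega )-(\mu -{\rm{i}}\omega )R(\omega )+1=0$; we take the root that reduces to the bare propagator as $J\to 0$, which should reproduce (234) up to branch conventions that the reader may wish to check:

import numpy as np

mu = 1.0

def R(omega, J):
    # Root of J^2 R^2 - (mu - i*omega) R + 1 = 0 reducing to 1/(mu - i*omega) as J -> 0.
    z = mu - 1j * omega
    return (z - np.sqrt(z**2 - 4 * J**2 + 0j)) / (2 * J**2)

for J in (0.3, 0.45, 0.49, 0.499):           # approach to criticality at J = mu/2
    print(f"J = {J}:  R(0) = {R(0.0, J).real:.4f}   1/J = {1 / J:.4f}")

for w in (1e-2, 1e-4, 1e-6):                 # sqrt(omega) singularity at J = mu/2
    dR = R(w, mu / 2) - R(0.0, mu / 2)
    print(f"omega = {w:.0e}:  |R(omega) - R(0)| / sqrt(omega) = {abs(dR) / np.sqrt(w):.4f}")

As $J\to \mu /2$ the zero-frequency response approaches ${J}^{-1}$, and at criticality the printed ratio converges to a constant, confirming the ${\omega }^{1/2}$ singularity.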

10. Path integrals for hard spin models

So far we have considered models in which the dynamical variables took continuous values. Here we show that, with some modifications, the same approach can be used to study the dynamics of models involving Ising spins.

Consider a system of N binary spin variables, ${\boldsymbol{\sigma }}=({\sigma }_{1},\ldots ,{\sigma }_{N})$ with ${\sigma }_{i}=\pm 1$. We consider synchronous dynamics, where time t advances in discrete steps and at each time step all spins are updated. Specifically, we assume that the spins decide their states according to the following probability distribution

Equation (235)

Equation (236)

If the external fields ${h}_{i}^{\mathrm{ext}}$ are constant in time as written above and if the couplings are symmetric, i.e. ${J}_{{ij}}={J}_{{ji}}$, this dynamics reaches an equilibrium state where detailed balance is satisfied and configurations are visited according to the Boltzmann distribution $p({\boldsymbol{\sigma }})\propto \exp (-H)$ with the Hamiltonian [32]

Equation (237)

where we have introduced the abbreviation $\mathrm{lc}(x)\equiv \mathrm{ln}(2\cosh (x))$ and emphasize that it is the total fields ${h}_{i}({\boldsymbol{\sigma }})$, which depend on the configuration, that appear inside the log $\cosh $.
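Simulating the synchronous dynamics (235) and (236) is straightforward, since $\exp ({\sigma }_{i}{h}_{i})/(2\cosh {h}_{i})=(1+{\sigma }_{i}\tanh {h}_{i})/2$. A minimal Python sketch, with an arbitrary illustrative sample of couplings and external fields:

import numpy as np

rng = np.random.default_rng(3)
N, n_steps = 100, 1000
J = rng.normal(size=(N, N)) / np.sqrt(N)       # illustrative couplings
np.fill_diagonal(J, 0.0)
h_ext = 0.1 * rng.normal(size=N)

sigma = rng.choice([-1.0, 1.0], size=N)        # random initial configuration
mags = np.zeros(N)
for t in range(n_steps):
    h = h_ext + J @ sigma                      # total fields, as in (236)
    p_up = 0.5 * (1.0 + np.tanh(h))            # P(sigma_i = +1) = e^h / (2 cosh h)
    sigma = np.where(rng.random(N) < p_up, 1.0, -1.0)
    mags += sigma
print("time-averaged magnetizations:", mags[:5] / n_steps)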

An alternative to synchronous dynamics that is commonly studied in the literature is continuous time Glauber dynamics [33], in which spins are updated individually and at random (exponentially distributed) time intervals. This is achieved by prescribing that in any infinitesimal time interval $\delta t\to 0$, every spin is flipped with probability

Equation (238)

Again, for constant external fields and symmetric couplings an equilibrium Boltzmann distribution is reached, but this time with the familiar Ising Hamiltonian

Equation (239)

For derivations of these equilibrium distributions see [34, 35]. Although the Glauber dynamics is the one that leads to the usual Ising Hamiltonian, in what follows we focus on the synchronous case for two reasons: (i) the path integral formulation is slightly simpler in terms of notation and (ii) the synchronous case has been the focus of recent work on the dynamics of hard spin models, including our own on the inverse problem of inferring the couplings in the model from the statistics of the spin history [3638].

10.1. Path integral formulation

The generating functional for the dynamics is defined as

Equation (240)

where the average denoted by $\langle \cdots \rangle $ is over the distribution of trajectories generated according to the probability distribution  (235) and the ${\psi }_{i}$ are fields that allow one to obtain the statistics of the ${\sigma }_{i}$ by taking derivatives. The key to writing a path integral representation of the generating functional in this case is to work with the local fields hi(t), which are continuous for $N\to \infty $, and not the original spin variables:

Equation (241)

where ${\boldsymbol{J}}$ is the interaction matrix, $\mathrm{Tr}$ indicates a sum over all spin trajectories $\{{\boldsymbol{\sigma }}(t)\}$ and similarly $D[h]$ an integral over all field trajectories $\{{\boldsymbol{h}}(t)\}$. The discrete time variable range is $t=0,1,\ldots ,T-1$ where T is the final time.

The delta function in (241) is introduced to enforce the definition (236) and can be written as a Fourier transform, leading to

Equation (242)

Equation (243)

This expression can be used in several ways. One is to average over the distribution of ${\boldsymbol{J}}$; the second is to derive mean-field equations for the system. We do not go into the first route here, i.e. the quenched averaged dynamics of the system, as this is both very involved and also discussed in detail elsewhere [39]. However, as an example of how one can use the path integral formulation (242) of the generating functional, we consider in the following subsections the simple saddle point approximation to the path integral as well as possible improvements to this.

10.2. Saddle point

To derive the saddle point equations, we first perform the trace in (242), which for a single timestep involves

Equation (244)

Equation (245)

Using this we can write

Equation (246)

Equation (247)

The saddle point equations for stationarity of S with respect to the hi(t) and ${\hat{h}}_{i}(t)$ are then

Equation (248)

Equation (249)

These equations can be written in a simpler form in terms of the magnetizations in the system biased by ψ, which are generally given by ${m}_{i}(t)={\partial }_{{\psi }_{i}(t)}\mathrm{ln}Z$. In the saddle point approximation $Z\approx {Z}_{s}\equiv \exp (-S)$, and because S is stationary with respect to h and $\hat{h}$,

Equation (250)

Here we have used 's' superscripts to denote saddle point values and introduced ${\mu }_{i}(t)$ as a convenient abbreviation for later use. The saddle point equations then simplify to

Equation (251)

Equation (252)

For the physical dynamics we want the solution at $\psi =0$, which we denote with a '0' superscript. One can show that this has ${\hat{h}}_{i}^{0}(t)=0$. Indeed, if one considers the dynamics over a finite number of timesteps $t=1,2,\ldots ,T$, then both ${\boldsymbol{h}}(t)$ and $\hat{{\boldsymbol{h}}}(t)$ are defined over the range $t=0,1,\ldots ,T-1$ and accordingly one finds that in the first saddle point equation  (248) taken at $t=T-1$, the $\hat{{\boldsymbol{h}}}(t+1)$-term inside the tanh is absent. For $\psi =0$ this equation then dictates ${\hat{h}}_{i}^{0}(T-1)=0$, and working backwards in time from there one finds recursively ${\hat{h}}_{i}^{0}(t)=0$ for all t. Accordingly (251) and (252) simplify to the standard mean field equations for the magnetizations,

Equation (253)

Equation (254)

Note that the saddle point value of the action (247) is then zero for $\psi =0$, as expected from the normalization condition $Z[0]=1$.
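As an illustration, iterating the naive mean-field equations (253) and (254) takes only a few lines; the couplings and fields below are again an arbitrary illustrative sample:

import numpy as np

rng = np.random.default_rng(4)
N, n_steps = 100, 50
J = 0.3 * rng.normal(size=(N, N)) / np.sqrt(N)   # illustrative weak couplings
np.fill_diagonal(J, 0.0)
h_ext = 0.5 * rng.normal(size=N)

m = np.zeros(N)
for t in range(n_steps):
    m = np.tanh(h_ext + J @ m)   # m_i(t+1) = tanh(h_i^ext + sum_j J_ij m_j(t))
print("naive mean-field magnetizations:", m[:5])

For weak couplings the iteration converges rapidly to a fixed point; these are the magnetizations that the fluctuation corrections discussed next aim to improve.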

10.3. Beyond the saddle point: naive approach

One can go beyond the saddle point approximation and take into account the Gaussian fluctuations around the saddle point to obtain a better estimate of the generating functional and hence of the equations of motion for the mi(t). The Gaussian corrections to the log generating functional are

Equation (255)

where $\delta h$ indicates the deviation of h from its saddle point value. We denote by ${\partial }^{2}S$ the matrix of the second derivatives of S with respect to h and $\hat{h}$ calculated at the saddle point; this has entries

Equation (256)

Equation (257)

Equation (258)

From the corrected log generating functional $\mathrm{ln}Z[\psi ]={\mathrm{ln}Z}_{s}[\psi ]+A[\psi ]$ we obtain a corrected expression for the magnetization

Equation (259)

To calculate the ψ-derivative of A at $\psi =0$ we can either appeal to numerical methods or evaluate A approximately. One relatively simple approximation, which will give us an intuitive feeling for the corrections and turns out to capture the lowest order corrections in J, is to only keep the equal-time ($t=t^{\prime} )$ elements of the matrix ${\partial }^{2}S$, i.e. to discard the second term in  (257). The matrix then separates into blocks corresponding to the different timesteps t. Each of those blocks is of size $2N\times 2N$. Let us order the elements so that the top left N × N sub-block contains the entries from  (256). This block, which we will denote by ${{\boldsymbol{\alpha }}}_{t}$, is diagonal and vanishes at the physical saddle point, where ${\psi }_{i}(t)=0$ and ${\hat{h}}_{i}^{s}(t)={\hat{h}}_{i}^{0}(t)=0$. The two off-diagonal blocks  (257) are $-{\rm{i}}{\boldsymbol{I}}$ within our approximation. If we call the bottom right N × N block ${{\boldsymbol{\beta }}}_{t}$, then we have by a standard determinant block identity

Equation (260)

Near the physical saddle point ${{\boldsymbol{\alpha }}}_{t}$ is small so we can linearize the logarithm to get

Equation (261)

Equation (262)

Linearizing also the diagonal entries of ${{\boldsymbol{\alpha }}}_{t}$ that appear in the first square bracket, and setting the last factor to its value at the physical saddle point, then gives

Equation (263)

When we take the derivative of this with respect to ${\psi }_{i}(t)$, we do in principle get a term from the dependence of ${\hat{{\boldsymbol{h}}}}^{s}$ on ψ. But as ${\hat{{\boldsymbol{h}}}}^{s}$ is multiplied by a factor of ${\boldsymbol{J}}$, this will give us a contribution to the derivative that is of higher order in J than the main term from the explicit dependence on ${\psi }_{i}(t)$. Discarding this higher order contribution (our second approximation, in addition to neglecting correlations between different time steps in ${\partial }^{2}S$) yields for the derivative

Equation (264)

Here we have implicitly already set $\psi =0$ because we linearized around the physical saddle point throughout. Using this result in (259), we obtain the corrected magnetization as

Equation (265)

This is the result of our naive approach to including Gaussian corrections in the saddle point integral. Note that because of the normalization $Z[0]=1$, we do not expect Gaussian corrections to $\mathrm{ln}Z[0]$ at the physical saddle point. The approximation we made in evaluating $A[\psi ]$ as a sum of local-in-time terms preserves this requirement: $A[0]=0$ as is clear from  (260) given that ${{\boldsymbol{\alpha }}}_{t}=0$ when $\psi =0$.

10.4. Beyond the saddle point: Plefka and Legendre transform approaches

It is instructive to compare  (265) with the so-called dynamical TAP equations for our system [12]

Equation (266)

which for stationary magnetizations reduce to the better-known equilibrium TAP equations [40, 41]. The negative term inside the hyperbolic tangent is known as the Onsager correction, which improves on the naive estimate of the mean effective field acting on a spin.

One now observes that the term inside the curly brackets on the right-hand side of our simpler corrected equation (265) is exactly of the form of this Onsager correction, so there is a close relation between the two approaches. However, there are two significant differences. First, while the Onsager term in (266) appears inside the $\tanh $, in (265) it is outside, as if we had linearized in the correction. This is important: it means there is nothing on the right-hand side of (265) to stop the magnetizations from going outside the physical range $[-1,1]$. The origin of this difference is the fact that while the dynamical TAP method corrects the naive estimate of the field acting on a spin, the Gaussian fluctuation approach described in the previous subsection directly estimates corrections for the magnetizations themselves.

The second difference between (265) and (266) is that in the latter it is the corrected magnetizations ${m}_{i}^{\mathrm{TAP}}$ themselves that appear in the Onsager term, making the approximation self-consistent. In our naive method, on the other hand, the Onsager term is evaluated at the uncorrected saddle-point magnetizations.

The dynamical TAP equations (266) can be derived by the Plefka approach [25]. This is an elegant method that captures corrections to the effective fields as explained above, ensuring in contrast to (265) that the magnetizations remain in the physical range even when corrections to the naive mean field estimate are large. This is achieved by working not with the generating functional directly (or the Helmholtz free energy in the equilibrium case), but with its Legendre transform with respect to the magnetizations (the analogue of the equilibrium Gibbs free energy). In this way, one essentially approximates the system with an independent spin model in which each spin feels an effective field ${h}_{i}^{\mathrm{eff}}$. This leads to expressions of the form ${m}_{i}(t)=\tanh ({h}_{i}^{\mathrm{eff}}(t-1))$ for the magnetizations, which therefore always lie between −1 and 1. The effective fields are determined so as to best match the Gibbs free energy of the original model, by a perturbation expansion to second order in J.
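A minimal sketch of how such a self-consistent update can be solved in practice is given below. We stress that the Onsager term used here, $-{m}_{i}(t+1){\sum }_{j}{J}_{{ij}}^{2}(1-{m}_{j}^{2}(t))$, is our reading of the synchronous dynamical TAP equation (266) of [12] and should be treated as an assumption; the parameters are illustrative. Because ${m}_{i}(t+1)$ appears on both sides, each time step is solved by a short inner fixed-point loop:

import numpy as np

rng = np.random.default_rng(5)
N, n_steps = 100, 50
J = 0.3 * rng.normal(size=(N, N)) / np.sqrt(N)
np.fill_diagonal(J, 0.0)
h_ext = 0.5 * rng.normal(size=N)

m = np.zeros(N)
for t in range(n_steps):
    field = h_ext + J @ m                  # naive effective field
    onsager = (J**2) @ (1.0 - m**2)        # assumed Onsager coefficient
    m_new = np.tanh(field)                 # start the inner loop from the naive value
    for _ in range(100):                   # inner self-consistency loop for m(t+1)
        m_new = np.tanh(field - m_new * onsager)
    m = m_new
print("dynamical TAP magnetizations:", m[:5])

However large the Onsager term becomes, the outer $\tanh $ keeps every ${m}_{i}$ inside $[-1,1]$, which is exactly the advantage over the linearized correction in (265).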

Interestingly, the philosophy of working with the Legendre transform can also be applied to the approach we have described above, which initially has a rather different starting point, namely a saddle point approximation with Gaussian corrections. We show this in the Appendix, where we demonstrate that by switching to the Gibbs free energy and keeping only first order terms in the corrections Ai one can retrieve exactly the dynamical TAP equations. We refer the reader to [42, 43] for applications of the same idea in the context of the equilibrium Ising and Potts models.

11. Summary and discussion

Path integral methods and the associated perturbative and diagrammatic techniques are among the most powerful tools used in almost every part of theoretical physics. Disordered systems, such as spin glasses, neural networks, models of the immune system etc., are no exception and, over the years, path integrals have played a significant role in the study of equilibrium and out-of-equilibrium properties of these systems. In disordered systems, by definition, interactions in a single system do not have any trivial structure; there is symmetry only in their statistics, as described by the distribution from which they are drawn. These systems are typically subject to external stochastic forces with both rapidly changing and quenched random components. Furthermore, these sources of randomness are expected to play crucial roles in the physics and should therefore be included explicitly in studying such systems. The path integral and diagrammatic methods treated in textbooks on field theoretic tools for other physical systems usually do not deal with disordered systems. Therefore, we tried in this paper to describe some of the key field theoretic and path integral techniques in a manner that can be applied directly to disordered systems.

We started by studying the dynamics of a scalar degree of freedom evolving according to a Langevin equation and showed how one can do perturbative calculations for this model and represent them in diagrammatic form. We then discussed the supersymmetries of the action that features in the path integral description of this system, the physical intuition and meaning behind them, and how they can help in doing diagrammatic perturbation theory by reducing the graphs to be considered to those in simpler, well-understood equilibrium calculations. Our next step was to study the dynamics of systems of interacting 'soft-spin' variables subject to Langevin equations with random interactions, first using a perturbative treatment for a single sample of the interactions, and then via a conceptually different approach based on averaging the generating function in its path integral formulation over the disorder.

Finally we switched to hard-spin models, focusing on the Ising model with synchronous update dynamics, for a single sample with arbitrary couplings. Here, as opposed to the way we treated soft-spin models, the path integral is not written in terms of the spins directly but in terms of the fields acting on them. We discussed how, from such a path integral formulation, one can derive approximate equations for the mean magnetizations at the saddle point level and compute the Gaussian fluctuation corrections around the saddle point. We showed that a naive calculation of these corrections yields equations of motion for the magnetizations that can lead to unphysical predictions. These issues can be avoided by going to improved approximation schemes like dynamical TAP equations, as described recently in other papers using a path integral formulation [12, 44], as well as several alternative approaches [38, 40, 45]. We closed the paper with the intriguing observation, however, that issues with naively corrected mean-field equations can also be cured within the general approach presented in this paper, using a Legendre transform to switch to the Gibbs free energy. It is tempting to infer that the Legendre transform implicitly achieves a resummation of the most important diagrams, but we have not been able to show this explicitly and leave this point as an open question for future work.

Although we have tried to cover what we view as the key concepts in this review, we have had to leave out several important issues to maintain a coherent focus. We list two of these in the following.

The use of dynamical models in inference. The path integral approach described here is designed initially for a forward problem: given the interactions in a system (as well as external fields etc where relevant), predict the dynamics of the order parameters of interest. In recent years, there has been strong interest in the inverse problem: given observations of the order parameter dynamics, find the interactions; for a review see e.g. [46]. This interest has been generated by recent advances in recording technology in various fields of science, allowing massive data sets to be gathered that invite researchers to attempt to reverse-engineer the underlying systems from the observed data.

The path integral formulation for forward problems described here is a natural first step in finding inverse equations, which in the simplest case consist of an inversion of the forward equations. Furthermore, in some cases, the path integral method can be immediately applied to the inverse problem itself. For instance, in the presence of hidden degrees of freedom, when trying to reconstruct the interactions in a spin model by observing only some of the spins and not the others, then calculating the likelihood of the data involves tracing over the trajectory of the hidden spins. This can then be done using the path integral methods discussed in this review [47].

The extended Plefka expansion. In a spin system at equilibrium, the magnetization of each spin is really the only parameter of interest, with spin–spin correlations then determined indirectly from the linear response of the system. This is not the case for out-of-equilibrium dynamics. Here out-of-equilibrium correlation and response functions need to be calculated in addition to the magnetizations in order to achieve a full statistical description of the system. So while at equilibrium and also in the simplest form of the dynamical TAP approach for Ising models [12] it makes sense to perform a Plefka expansion by fixing only the magnetization (and in the dynamics, the conjugate fields), in the full dynamical treatment response and correlation functions should also be fixed as order parameters. This is even more important for soft spin models, where even at equilibrium one would want to use at least the means and variances of the local degrees of freedom as order parameters. Such 'extended' Plefka expansions have been used for the so-called p-spin spherical spin glass model in [11] and, more recently, for general systems of coupled Langevin equations [48] and Ising spin glasses [44]. They should be a productive route for further progress in the field, e.g. by extending them further to incorporate inference from observed data [49].

Other interesting current work in this field includes that of Cooper and Dawson [50], Polettini [51], and Nemoto et al [52]. These lie beyond the scope of our elementary treatment of the subject, but we hope that our introduction will equip readers with the tools to explore these and other more advanced problems.

Acknowledgments

This work has been partially supported by the Marie Curie Initial Training Network NETADIS (FP7, grant 290038). YR was also financially supported by the Kavli Foundation, Norwegian Research Council, Center of Excellence Scheme (grant number 223262) and a Starr Foundation fellowship at the Simons Center for Systems Biology, Institute for Advanced Study. The authors are also grateful to Luca Peliti for fruitful discussions. PS acknowledges the stimulating research environment provided by the EPSRC Centre for Doctoral Training in Cross-Disciplinary Approaches to Non-Equilibrium Systems (CANES, EP/L015854/1).

Appendix

The Legendre transform approach starts from the saddle point value of the log generating functional

Equation (A.1)

We first rewrite this expression in a form that will simplify some of the algebra below, by decomposing the function $\mathrm{lc}(\cdot )$ in (A.1) as

Equation (A.2)

Equation (A.3)

Here ${H}_{2}[x]$ is simply the entropy of a single spin with magnetization x. This decomposition gives, bearing in mind the definition of ${\mu }_{i}(t)$ in  (250),

Equation (A.4)

The second sum in the second line now cancels the second sum in the first line because of the saddle point equation  (252), and we end up with the expression

Equation (A.5)

The Legendre transform of this expression with respect to the magnetizations mi(t) is

Equation (A.6)

where the ψ are to be expressed in terms of the m by solving

Equation (A.7)

We already know that $\partial \mathrm{ln}{Z}_{s}[\psi ]/\partial {\psi }_{i}(t)={m}_{i}^{s}(t)={\mu }_{i}(t)$, so at this saddle point level one has ${m}_{i}(t)={\mu }_{i}(t)$. The second saddle point equation  (252) then becomes ${{\boldsymbol{h}}}^{s}(t)={{\boldsymbol{h}}}^{\mathrm{ext}}+{\boldsymbol{J}}{\boldsymbol{m}}(t)$ and inserting this gives

Equation (A.8)

The equation of state is, by standard Legendre transform properties,

Equation (A.9)

which yields

Equation (A.10)

At ${\psi }_{i}(t)=0$ this is naturally solved by ${m}_{i}(t+1)=\tanh ({h}_{i}^{\mathrm{ext}}+{\sum }_{j}{J}_{{ij}}{m}_{j}(t))$. So far we have not achieved anything new: we have merely provided an alternative way of obtaining the naive mean-field equations (253) and (254) we already had.

Now let us consider the Legendre transform of the generating functional including the Gaussian correction, $\mathrm{ln}Z[\psi ]=\mathrm{ln}{Z}_{s}[\psi ]+A[\psi ]$

Equation (A.11)

where as before ψ is to be treated as a function of the magnetizations, as determined by the condition

Equation (A.12)

Here we have used the earlier definition ${A}_{i}(t)=\partial A/\partial {\psi }_{i}(t)$.

Expressing the right-hand side of (A.11) in terms of m, and keeping only terms linear in Ai, we obtain, as shown below,

Equation (A.13)

Remarkably, all the Ai terms have cancelled here and the only difference from the saddle point result ${{\rm{\Gamma }}}_{s}[m]$ in (A.8) is the naive addition of the correction term A, expressed in terms of m.

From ${\rm{\Gamma }}[m]$ we can now derive the equation of state for the magnetizations from

Equation (A.14)

Equation (A.15)

If we are only interested in corrections to quadratic order in J for the field acting on each spin, we can drop the terms in the second line of the equation above as the term in square brackets is already ${ \mathcal O }({J}^{2})$. In the physical limit $\psi \to 0$ one then obtains

Equation (A.16)

We can now appreciate the role played by the Legendre transform: the corrections we obtain act on the effective fields and so are inside the $\tanh $. As explained in the main text this makes more physical sense.

To evaluate $\partial A/\partial {m}_{i}(t)$, one can start from the expression  (262) for A. Replacing ${h}_{i}^{s}(t)$ in this using the saddle point equation  (252) gives

Equation (A.17)

As A is already ${ \mathcal O }({J}^{2})$, replacing all ${\mu }_{i}(t)$ by mi(t) in this expression only gives a negligible correction of ${ \mathcal O }({J}^{4})$. Also the first square bracket is small at the physical saddle point, of ${ \mathcal O }({J}^{2})$, so the derivative of the final factor with respect to mi(t) can be dropped. Finally the derivative of the tanh can also be neglected as it is of order J. One thus finds to leading order the simple result

Equation (A.18)

Combined with (A.16) this yields the dynamical TAP equation (266).

It remains to show  (A.13). Using the expression (A.5) for $\mathrm{ln}{Z}_{s}$ and ${m}_{i}(t)\,={\mu }_{i}(t)+{A}_{i}(t)$ in (A.11), we can write Γ as

Equation (A.19)

We have shifted the time index t by one in the second and fourth sum for later convenience. Now we express the right-hand side of (A.19) in terms of m and keep only terms linear in Ai. For the last sum we need the following identity, which can be obtained from the definition of ${\mu }_{i}(t)$ in (250) together with ${m}_{i}(t)={\mu }_{i}(t)+{A}_{i}(t)$:

Equation (A.20)

To linear order in Ai, the various terms on the right-hand side of (A.19) are then

Equation (A.21)

Equation (A.22)

Equation (A.23)

Equation (A.24)

Putting all these together, we notice that several terms cancel pairwise: the second term on the right-hand side of the first equation above with the first term of the last equation, the second terms in the second and last equations, and the last terms in the second and last equations. This yields

Equation (A.25)

Using the fact that from the first saddle point equation (251)

Equation (A.26)

we can write the terms in the second line of (A.25), together with the first term in the third line, as

Equation (A.27)

As the term in curly braces is itself a correction term that is non-zero only because of the difference between ${\mu }_{i}(t)$ and mi(t), this overall expression can be neglected as subleading. The remaining terms of (A.25) then give exactly (A.13) as claimed. Note that keeping only terms linear in Ai in this calculation (terms which eventually cancel) in some sense plays the same role as expanding to second order in J in the Plefka method, as Ai is of ${ \mathcal O }({J}^{2})$.

Footnotes

  • We use brackets, $\langle \cdot \rangle $, to denote averages over all noise histories and, where relevant (as in equation (3)), initial conditions. Whether the initial condition average is included or not will be clear from the context. Other kinds of brackets, with subscripts, that appear later in the paper will denote averages over specific distributions that appear in the theoretical development.

  • Digression on equilibrium statistical mechanics: equilibrium averages can be evaluated by a diagrammatic expansion very similar to the one here. The main difference is that the partition function (whose perturbative expansion is the same as the one for $\langle 1{\rangle }_{S}$ above) is not automatically normalized. Averages are thus written as $\langle \ldots \rangle =\langle \ldots {{\rm{e}}}^{-{S}_{\mathrm{int}}}{\rangle }_{0}/\langle {{\rm{e}}}^{-{S}_{\mathrm{int}}}{\rangle }_{0}$ and the denominator, when expanded, ensures again that disconnected diagrams vanish. Equivalently, one can think of averages as derivatives of the log partition function; the perturbative expansion for $\mathrm{ln}Z$ consists of just the connected diagrams from the expansion of Z.

  • It is common in the literature to use a different symbol, such as a cross, to represent the external field. Here, in the interest of a more uniform notation, we stick with dots, and the number of lines connected at a dot indicates whether it means h, g, or a coupling of some still-different order, such as the cubic vertex in (93).

  • We could of course add any term proportional to ${\partial }_{t}$ to ${{D}}^{(2)}$ without changing the action, i.e. we could put any coefficient a in front of the ${\partial }_{t}$ term in (193) and the action would remain unchanged. Choosing a = 2 would correspond to including the anticommutator term from (191). However, the choice a = 1, i.e. ${{D}}^{(2)}={[\overline{{D}},{D}]}_{-}$, will prove convenient when we want to invert this operator to find the unperturbed superfield correlation function.
