Abstract
We review some of the techniques used to study the dynamics of disordered systems subject to both quenched and fast (thermal) noise. Starting from the Martin–Siggia–Rose/Janssen–De Dominicis–Peliti path integral formalism for a single variable stochastic dynamics, we provide a pedagogical survey of the perturbative, i.e. diagrammatic, approach to dynamics and how this formalism can be used for studying soft spin models. We review the supersymmetric formulation of the Langevin dynamics of these models and discuss the physical implications of the supersymmetry. We also describe the key steps involved in studying the disorder-averaged dynamics. Finally, we discuss the path integral approach for the case of hard Ising spins and review some recent developments in the dynamics of such kinetic Ising models.
1. Introduction
Studying the statistical properties of variables or classical fields subject to stochastic forces has a long history in physics. A key tool in this effort is the Langevin equation [1]. Originally introduced to describe the Brownian motion of a particle in a fluid, it has over the years also been used to study the dynamics of a variety of other systems, including the standard models of critical phenomena, e.g. those described by the Ginzburg–Landau Hamiltonian [2], and, later on, the relaxation of spin glass models with soft spins [3, 4].
In order to study the dynamical version of the Ginzburg–Landau model in the presence of thermal noise one has to appeal to a perturbative analysis. A useful pedagogical description of the early methods used in such a perturbative treatment can be found in the seminal book by Ma [2] (chapter 5) among other places. A major advance was made by Martin, Siggia and Rose (MSR) in 1973 [5], who realized that one can study classical variables subject to stochastic noise by representing them as Heisenberg operators with appropriately defined commutation relations with conjugate fields. These observations allowed MSR to write down a generating functional that naturally lends itself to a perturbative treatment and the use of other field theoretic tools. The most important early development that followed was the work by De Dominicis, Peliti and Janssen [6–8]. They realized that the conjugate field, whose introduction was a fundamental insight and step in the MSR theory, where it was in some sense put in by hand, arises naturally if one takes a different point of view. Here one starts the analysis by writing the probability of a path taken by the stochastic variables or fields, which can be used to construct a generating functional expressed as a functional path integral. We refer the reader specifically to [9], which in addition to describing various extensions of the earlier work provides an instructive comparison between the MSR operator based formalism and the functional integral method of De Dominicis, Peliti and Janssen. This latter path integral formalism, which we describe in more detail in this review, has been used to study systems subject to both fast and quenched noise [3]. It paved the way for a more complete understanding of the ensemble averaged dynamics of these systems [4, 10] and, later on, the single sample [11, 12] dynamics of spin glasses and related models.
Despite the remarkable power of the path integral approach, a pedagogical introduction to the method and its key results is lacking. The aim of this paper is to provide a review to fill this gap. The material presented here has previously been used by the authors in a number of lectures and courses (see e.g. the online lectures in [13]); parts of it can also be found in [14].
Some of the material has also been reviewed and discussed elsewhere. The diagrammatic approach to the study of Langevin equations has been described in several classical books [2, 15, 16]. The supersymmetric method applied to dynamics is covered in [15], for example. In this review we aim to combine these classical results with more novel material, including results on the dynamics of hard spin models, in a coherent and consistent notation. By going into more depth in the calculations, we also hope to provide a useful resource, both for newcomers and more experienced researchers in the field. We also recommend that readers consult a recent review by Chow and Buice [17], which covers much of the same material that we do, studying some different model systems from a somewhat different perspective, and in particular demonstrating applications of these methods to the FitzHugh–Nagumo model of a neuron.
In what follows, we start by introducing the path integral formalism for the dynamics of a random variable that evolves according to a stochastic differential equation. We then illustrate how to perform a perturbation analysis and construct Feynman diagrams within this approach. We also show how this treatment can be cast into a supersymmetric form and discuss the physical interpretation of the supersymmetries that are revealed by this approach. Subsequently, we extend the treatment to systems with quenched interactions, especially spin glasses. Finally, we discuss how the path integral formulation can be used to study the dynamics of hard spin models with quenched interactions through a systematic expansion in the strength of the couplings. Before diving into the path integral formulation, however, we review in the next section the two main approaches (Ito and Stratonovich) for interpreting a stochastic differential equation.
2. Ito versus Stratonovich
Consider the linear Langevin equation
\[
\partial_t \phi(t) = -\mu\,\phi(t) + h(t) + \zeta(t) \tag{1}
\]
where $h(t)$ is a small field used to probe linear response and $\zeta$ is zero mean Gaussian noise with correlation function $\langle \zeta(t)\zeta(t')\rangle = 2T\,\delta(t-t')$, where $\delta(t)$ is the Dirac delta-function. Equation (1) can of course be integrated directly from an initial condition at time t = 0,
\[
\phi(t) = \phi(0)\,e^{-\mu t} + \int_0^t dt'\, e^{-\mu(t-t')}\left[ h(t') + \zeta(t') \right] \tag{2}
\]
and by averaging over the noise and, uncorrelated with it, $\phi(0)$, we get the correlation function (for h = 0)6
\[
C(t,t') \equiv \langle \phi(t)\phi(t')\rangle = \langle \phi^2(0)\rangle\, e^{-\mu(t+t')} + \frac{T}{\mu}\left[ e^{-\mu|t-t'|} - e^{-\mu(t+t')} \right] \tag{3}
\]
and the response function
\[
R(t,t') \equiv \left.\frac{\delta \langle \phi(t)\rangle}{\delta h(t')}\right|_{h=0} = e^{-\mu(t-t')}\,\Theta(t-t') \tag{4}
\]
where $\Theta(t)$ is the Heaviside step function, with $\Theta(t) = 1$ for $t > 0$ and $\Theta(t) = 0$ for $t < 0$.
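As a quick numerical aside (not part of the original derivation), the stationary statistics of the linear Langevin equation can be checked by direct simulation. The sketch below assumes the conventional parametrization $\partial_t\phi = -\mu\phi + \zeta$ with $\langle\zeta(t)\zeta(t')\rangle = 2T\delta(t-t')$; it estimates the stationary variance, which should approach $T/\mu$, and the correlation at time lag $1/\mu$, which should be smaller by a factor $e^{-1}$. All parameter values are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative parameters for d(phi)/dt = -mu*phi + zeta, <zeta(t)zeta(t')> = 2T delta(t-t')
mu, T, dt = 1.0, 0.5, 0.01
n_traj, n_steps = 20000, 500          # ensemble of trajectories, 5 time units each

rng = np.random.default_rng(0)
# start from the stationary distribution (variance T/mu), so statistics are
# time-translation invariant from the outset
phi = rng.normal(0.0, np.sqrt(T / mu), size=n_traj)
snapshot = None
for n in range(n_steps):
    # Ito (Euler-Maruyama) update; the integrated noise per step has variance 2*T*dt
    phi = phi - mu * phi * dt + rng.normal(0.0, np.sqrt(2 * T * dt), size=n_traj)
    if n == 399:                      # keep phi at t = 4, one time unit before the end
        snapshot = phi.copy()

var = phi.var()                       # estimate of the equal-time correlator, expected T/mu
corr = np.mean(snapshot * phi)        # estimate of the correlator at lag t - t' = 1
print(var, corr)
```

Both estimates should match the analytic stationary correlator to within a few per cent for these ensemble sizes.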
We will see below that in the diagrammatic perturbation theory for treating nonlinear Langevin equations, we will need the equal time response $R(t,t)$. This is a priori undefined, however, since the response function has a step discontinuity at $t = t'$.
An alternative perspective on this issue is provided by writing the response as a correlation function with the noise. Using the fact that the average over the distribution of ζ is taken with probability weight $P[\zeta] \propto \exp\!\left[-\frac{1}{4T}\int dt\,\zeta^2(t)\right]$, and integrating by parts, we can write
\[
R(t,t') = \frac{\delta\langle\phi(t)\rangle}{\delta h(t')} = \left\langle \frac{\delta\phi(t)}{\delta\zeta(t')} \right\rangle = \frac{1}{2T}\,\langle \phi(t)\,\zeta(t')\rangle \tag{5}
\]
For the equal-time response we therefore require $\langle \phi(t)\zeta(t)\rangle$, an equal-time product of fluctuating quantities.
There are two conventions for assigning values to such quantities, due to Stratonovich and to Ito. We will briefly review these conventions and refer the reader to [18] (chapter 4) for a more comprehensive treatment.
2.1. Stratonovich convention
The idea of the Stratonovich convention is that, physically, ζ is a noise process with non-zero correlation time, so we should really write $\langle \zeta(t)\zeta(t')\rangle = C_\zeta(t-t')$ with $C_\zeta$ an even function, decaying quickly to zero for $|t-t'|$ greater than some small correlation time $\tau_\zeta$, and whose integral is $\int dt\, C_\zeta(t) = 2T$. Then, setting the field h to zero as it is no longer needed at this point:
\[
\phi(t) = \phi(0)\,e^{-\mu t} + \int_0^t dt'\, e^{-\mu(t-t')}\,\zeta(t') \tag{6}
\]
so
\[
\langle \phi(t)\zeta(t)\rangle = \int_0^t dt'\, e^{-\mu(t-t')}\, C_\zeta(t'-t) \approx \int_0^t dt'\, C_\zeta(t'-t) = T \tag{7}
\]
where the second equality follows from the fast decay of $C_\zeta$ compared to the macroscopic timescales of $e^{-\mu(t-t')}$, so that we can approximate $e^{-\mu(t-t')} \approx 1$ over the range where $C_\zeta$ is non-zero; the final equality holds because $C_\zeta$ is even, so only half of its total 'mass' $2T$ lies in the integration range $t' < t$. When considering $\langle \phi(t)\zeta(t')\rangle$ with $t' < t$ (more precisely, $t - t' \gg \tau_\zeta$) on the other hand, all of the 'mass' of the correlation function is captured in the integration range; it therefore acts just like a δ-function and we get $\langle \phi(t)\zeta(t')\rangle = 2T\,e^{-\mu(t-t')} = 2T\,R(t,t')$ as expected from (4) and (5).

To summarize, in the Stratonovich convention, the equal-time value R(t,t) = T/(2T) = 1/2 of the response function is half of that obtained in the limit $t' \to t^-$. It is therefore also called the midpoint rule; see below.
2.2. Ito convention
The Ito convention effectively assumes that the noise acts 'after $\phi$ has been updated', so it sets $\langle \phi(t)\zeta(t)\rangle = 0$, and hence also $R(t,t) = 0$.
2.3. Discretization
We will later look at path integral representations of the dynamics and so need a discretization of the stochastic process (1). We will now see that Ito and Stratonovich can be viewed as corresponding to different discretization methods.
Let us discretize time as $t = n\Delta$, with Δ a small time step eventually to be taken to zero, and write $\phi_n = \phi(n\Delta)$ and $h_n = h(n\Delta)$. The noise variables over the interval Δ are $\zeta_n = \int_{n\Delta}^{(n+1)\Delta} dt\,\zeta(t)$, with $\langle \zeta_m \zeta_n \rangle = 2T\Delta\,\delta_{mn}$, where $\delta_{mn}$ is the Kronecker symbol. Then a suitable discrete version of (1) is
\[
\phi_{n+1} = \phi_n + \Delta\left[ -\mu\left( \lambda\phi_{n+1} + (1-\lambda)\phi_n \right) + \lambda h_{n+1} + (1-\lambda)h_n \right] + \zeta_n \tag{8}
\]
for any $\lambda \in [0,1]$; here we are evaluating (the non-noise part of) the right-hand side of (1) as a weighted combination of the values at the two ends of the interval Δ. It is easy to solve this linear recursion exactly: from
\[
\phi_{n+1}\,(1+\lambda\mu\Delta) = \phi_n\left[ 1 - (1-\lambda)\mu\Delta \right] + \Delta\left[ \lambda h_{n+1} + (1-\lambda)h_n \right] + \zeta_n \tag{9}
\]
setting
\[
\gamma = \frac{1 - (1-\lambda)\mu\Delta}{1 + \lambda\mu\Delta}
\]
and assuming the initial condition $\phi_0 = 0$, we have
\[
\phi_n = \frac{1}{1+\lambda\mu\Delta} \sum_{m=0}^{n-1} \gamma^{\,n-1-m} \left\{ \Delta\left[ \lambda h_{m+1} + (1-\lambda)h_m \right] + \zeta_m \right\} \tag{10}
\]
From this we can read off the response function $R_{nm} = \partial\langle\phi_n\rangle/\partial(\Delta h_m)$: setting $t = n\Delta$, $t' = m\Delta$ and taking $\Delta \to 0$ we then get for the continuous time response
\[
R(t,t') = e^{-\mu(t-t')} \quad \text{for } t > t', \qquad R(t,t) = \lambda \tag{11}
\]
The value of λ only affects the equal-time response; we see that $\lambda = 1/2$ gives the Stratonovich convention, while $\lambda = 0$ gives Ito. Note that in the discretization (8), $\lambda = 1/2$ corresponds to evaluating the change in ϕ at the midpoint of the interval between n and $n+1$ (hence 'midpoint rule'). Ito ($\lambda = 0$) on the other hand just evaluates at the earlier time. Note also that for different times, $t \neq t'$, the two discretization schemes are equivalent as expected.
The two conventions for multiplying equal-time fluctuations are, for the systems we will look at, simply different ways of describing the same time evolution (1). Cases where the noise strength is coupled to ϕ, such as in $\partial_t\phi = -\mu\phi + \phi\,\zeta(t)$, are more serious: a convention for the equal-time product has to be adopted, and the two conventions there actually give different stochastic processes; the corresponding Fokker–Planck equations differ by a non-trivial drift term [18].
For the simple cases we consider, where the noise power T does not depend on t or ϕ, Ito versus Stratonovich is basically a matter of taste—calculated expectation values of physical observables are the same in the two approaches. Stratonovich is the more 'physical' because it corresponds to a noise process ζ with small but non-zero correlation time; it also obeys all the usual rules for transformation of variables etc. Ito is more obvious from the discretized point of view—it is very much what one might naively program in a simulation. We will also see below that it can lead to technical simplifications in calculations.
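The claim that, for additive noise, the two conventions describe the same process can be made concrete numerically. The sketch below (illustrative parameters, not from the text) implements the λ-discretization (8) of the linear equation for the Ito ($\lambda = 0$) and Stratonovich ($\lambda = 1/2$) choices, feeding both the same noise realization; the trajectories agree up to discretization error and the stationary statistics coincide.

```python
import numpy as np

mu, T, dt = 1.0, 0.5, 0.01            # illustrative parameter values
n_traj, n_steps = 10000, 500

rng = np.random.default_rng(5)
noise = rng.normal(0.0, np.sqrt(2 * T * dt), size=(n_steps, n_traj))
phi0 = rng.normal(0.0, np.sqrt(T / mu), size=n_traj)   # start near stationarity

def run(lam):
    # discretization (8): phi_{n+1} = phi_n - dt*mu*(lam*phi_{n+1} + (1-lam)*phi_n) + zeta_n,
    # solved for phi_{n+1} (the equation is linear, so this step is exact)
    phi = phi0.copy()
    for n in range(n_steps):
        phi = ((1 - (1 - lam) * mu * dt) * phi + noise[n]) / (1 + lam * mu * dt)
    return phi

ito, strat = run(0.0), run(0.5)       # Ito vs Stratonovich, same noise history
print(ito.var(), strat.var(), np.abs(ito - strat).max())
```

Both variances come out close to the stationary value $T/\mu$, and the worst-case difference between the two sets of trajectories is of order Δ, vanishing as the time step is refined.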
3. The Martin–Siggia–Rose/Janssen–De Dominicis–Peliti path integral formulation
Now consider the nonlinear Langevin equation
\[
\partial_t \phi(t) = f(\phi(t)) + h(t) + \zeta(t)
\]
and assume for simplicity that $\phi(0) = 0$. It is straightforward to extend the formalism to systems with several components; the inclusion of distributions of initial values is discussed briefly in section 6. We discretize as in (8), abbreviating $f_{n,\lambda} = f(\lambda\phi_{n+1} + (1-\lambda)\phi_n)$ and $h_{n,\lambda} = \lambda h_{n+1} + (1-\lambda)h_n$:
\[
\phi_{n+1} = \phi_n + \Delta\,( f_{n,\lambda} + h_{n,\lambda} ) + \zeta_n
\]
The plan from here is to write down a path integral for this process and evaluate the effects of the nonlinearities in $f$ perturbatively. Let Δ be fixed for now and let M be the largest value of the index n that we are interested in. Abbreviate $\phi = (\phi_1, \ldots, \phi_M)$ and let $\psi = (\psi_1, \ldots, \psi_M)$ be a vector of conjugate variables; then the relevant generating function is
\[
Z(\psi) = \left\langle \exp\!\left( i\sum_n \psi_n\phi_n \right) \right\rangle = \int d\phi\; P(\phi)\, \exp\!\left( i\sum_n \psi_n\phi_n \right)
\]
where $P(\phi)$ is the probability of the entire history $\phi_1, \ldots, \phi_M$. Averages can be computed by differentiation with respect to the $\psi_n$ at $\psi = 0$.
Quenched disorder can also be treated: one averages the generating function over the quenched disorder before performing the perturbative expansion. In contrast to the case of equilibrium statistical mechanics, performing the average is unproblematic since, because $P(\phi)$ is a normalized distribution, $Z(\psi = 0) = 1$, independent of the parameters of the model [3]. We return to this approach in section 9.
The M variables $\phi_n$ ($n = 1, \ldots, M$) depend on the M noise variables $\zeta_n$ ($n = 0, \ldots, M-1$). We know the distribution of the latter,
\[
P(\zeta) = \prod_{n=0}^{M-1} \frac{1}{\sqrt{4\pi T\Delta}}\, \exp\!\left( -\frac{\zeta_n^2}{4T\Delta} \right)
\]
so we can find the distribution of the former:
\[
P(\phi) = P(\zeta(\phi))\; |J|, \qquad J = \det\!\left( \frac{\partial\zeta_m}{\partial\phi_n} \right)
\]
where $J$ is the Jacobian. Using (15) to express the ζ in terms of the ϕ,
\[
\zeta_n = \phi_{n+1} - \phi_n - \Delta\,( f_{n,\lambda} + h_{n,\lambda} )
\]
we thus have
\[
P(\phi) \propto |J|\, \exp\!\left[ -\frac{1}{4T\Delta} \sum_n \left( \phi_{n+1} - \phi_n - \Delta f_{n,\lambda} - \Delta h_{n,\lambda} \right)^2 \right]
\]
where $f_{n,\lambda}$ means $f(\lambda\phi_{n+1} + (1-\lambda)\phi_n)$ as before. The square can be decoupled using conjugate integration variables $\hat\phi_n$:
\[
\exp\!\left( -\frac{x_n^2}{4T\Delta} \right) = \sqrt{\frac{T\Delta}{\pi}} \int d\hat\phi_n\, \exp\!\left( -T\Delta\,\hat\phi_n^2 + i\hat\phi_n x_n \right)
\]
Now it is time to evaluate $J$. Consider the matrix $\partial\zeta_m/\partial\phi_n$, remembering that the index for ζ runs from 0 to $M-1$, and the one for ϕ from 1 to M. The diagonal elements are then, from (19), $\partial\zeta_n/\partial\phi_{n+1} = 1 - \lambda\Delta f'_{n,\lambda}$ (with $f' \equiv df/d\phi$); the elements just below the diagonal are $\partial\zeta_n/\partial\phi_n = -1 - (1-\lambda)\Delta f'_{n,\lambda}$. All other elements are zero; in particular the matrix has all zeros above the diagonal and so its determinant is the product of the diagonal elements, giving
\[
J = \prod_n \left( 1 - \lambda\Delta f'_{n,\lambda} \right) = \exp\!\left[ \sum_n \ln\left( 1 - \lambda\Delta f'_{n,\lambda} \right) \right] \approx \exp\!\left[ -\lambda\Delta \sum_n f'(\phi_n) \right]
\]
where the last equality anticipates that Δ will be made small so that we can discard $O(\Delta^2)$ terms in the exponent. Note that for the Ito convention ($\lambda = 0$), $J = 1$ identically, which is one of the reasons for preferring Ito. The alternative to the above direct evaluation of $J$ is to represent the determinant as an integral over Grassmann variables; this will be discussed in section 7.
With $J$ evaluated, we have
\[
Z(\psi) = \int \prod_n d\phi_n\, d\hat\phi_n\; \exp\!\left\{ \sum_n \left[ -T\Delta\,\hat\phi_n^2 + i\hat\phi_n\left( \phi_{n+1} - \phi_n - \Delta f_{n,\lambda} - \Delta h_{n,\lambda} \right) - \lambda\Delta f'(\phi_n) + i\psi_n\phi_n \right] \right\}
\]
Defining the average over a (normalized, complex valued) measure
\[
\langle \cdots \rangle = \int \prod_n d\phi_n\, d\hat\phi_n\; (\cdots)\; e^{-S}
\]
with the 'action'
\[
S = \sum_n \left[ T\Delta\,\hat\phi_n^2 - i\hat\phi_n\left( \phi_{n+1} - \phi_n - \Delta f_{n,\lambda} - \Delta h_{n,\lambda} \right) + \lambda\Delta f'(\phi_n) \right]
\]
one can also write
\[
Z(\psi) = \left\langle \exp\!\left( i\sum_n \psi_n\phi_n \right) \right\rangle
\]
From this representation one has, in particular (taking all derivatives at $\psi = 0$, and adopting the convention $R_{nm} = \partial\langle\phi_n\rangle/\partial(\Delta h_m)$) the expressions for correlation and response functions
\[
C_{nm} = \langle \phi_n\,\phi_m \rangle, \qquad R_{nm} = \langle \phi_n\; i\hat\phi_m \rangle
\]
It also follows that averages of any product of $i\hat\phi$'s vanish. This is because (as remarked above) Z = 1 for $\psi = 0$, whatever values the hn take. As a result, derivatives of any order of Z with respect to h vanish when taken at $\psi = 0$. We can combine the expressions for correlation and response if we define new variables as
\[
\eta_n = \phi_n, \qquad \eta_{M+n} = i\hat\phi_n
\]
Arranging these into a big vector $\eta = (\eta_1, \ldots, \eta_{2M})$ gives
\[
G = \langle \eta\,\eta^{\rm T} \rangle = \begin{pmatrix} C & R \\ R^{\rm T} & 0 \end{pmatrix}
\]
and the calculation of G (the 'propagator' in quantum field theory) is often the main goal of the analysis. In the case of non-zero initial conditions one also wants to be able to calculate the means $\langle \phi_n \rangle$, as discussed in section 5.5 below. The fields ψ and h have served their purpose and will be set to zero from now on.
4. Perturbation theory
To illustrate the perturbative expansion, consider a concrete example:
\[
\partial_t \phi = -\mu\phi - \frac{g}{3!}\,\phi^3 + \zeta \tag{32}
\]
(the factor 1/3! is for later convenience). For g = 0, we recover a solvable linear Langevin equation; correspondingly, the action in (25) becomes quadratic. For $g \neq 0$, we therefore separate this quadratic part and write $S = S_0 + S_{\rm int}$, with $S_0$ the g = 0 action.
Rearranging the first sum, the non-quadratic (or 'interacting' in a field theory) part of the action can be written in the simpler form
\[
S_{\rm int} = \sum_n \Delta \left[ \frac{g}{3!}\; i\hat\phi_n\,\phi_n^3 - \lambda\,\frac{g}{2}\,\phi_n^2 \right] \tag{36}
\]
We have dropped the last ($n = M$) term from the first sum; since (by causality) whatever we do with this last term does not affect any of the results for earlier times, we just have to make M larger than any times of interest for this omission to be irrelevant. Similarly, whether we have the sums start at n = 0 or n = 1 has a vanishing effect for $\Delta \to 0$, so in the following we will leave off the summation ranges.
Now for g = 0 we have $S = S_0$, and a corresponding normalized Gaussian measure over the η for which we denote averages by $\langle \cdots \rangle_0$ and the corresponding propagator by $G_0$. For $g \neq 0$, our desired averages are then written as
\[
\langle \cdots \rangle = \left\langle \cdots\; e^{-S_{\rm int}} \right\rangle_0
\]
Assuming g and hence $S_{\rm int}$ to be small, we thus arrive at the perturbative expansion
\[
\langle \cdots \rangle = \left\langle \cdots \left( 1 - S_{\rm int} + \frac{1}{2}\,S_{\rm int}^2 - \cdots \right) \right\rangle_0 \tag{38}
\]
which is a series expansion (in general asymptotic, non-convergent) in the nonlinearity parameter g. We can now evaluate each term in this series using an elementary fact about multivariate Gaussian statistics known to physicists as Wick's theorem: the average of any product of zero mean Gaussian random variables is found by summing over all possible pairings; symbolically (leaving out all indices etc)
\[
\langle x_1 x_2 x_3 x_4 \rangle = \langle x_1 x_2 \rangle\langle x_3 x_4 \rangle + \langle x_1 x_3 \rangle\langle x_2 x_4 \rangle + \langle x_1 x_4 \rangle\langle x_2 x_3 \rangle
\]
and similarly for higher (even) orders, while the average of an odd number of variables vanishes.
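Wick's theorem can be checked by brute force: the following sketch draws a large sample from an arbitrary four-dimensional zero mean Gaussian (the covariance matrix below is a random illustrative choice) and compares the fourth moment with the sum over the three pairings.

```python
import numpy as np

rng = np.random.default_rng(2)

# build an arbitrary valid (positive definite) covariance matrix K
A = rng.normal(size=(4, 4))
K = A @ A.T / 8 + 0.5 * np.eye(4)

# large sample of zero mean Gaussian vectors with covariance K
x = rng.multivariate_normal(np.zeros(4), K, size=2_000_000)

# left: Monte Carlo estimate of <x1 x2 x3 x4>
lhs = np.mean(x[:, 0] * x[:, 1] * x[:, 2] * x[:, 3])
# right: Wick's theorem, sum over the three pairings
rhs = K[0, 1] * K[2, 3] + K[0, 2] * K[1, 3] + K[0, 3] * K[1, 2]
print(lhs, rhs)
```

The two numbers agree to within the Monte Carlo sampling error.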
Let us apply this to the simplest example. We know that $Z(\psi = 0) = 1$, so this should be true to all orders in the expansion (38). Up to $O(g)$ one has
\[
1 = \langle e^{-S_{\rm int}} \rangle_0 = 1 - \langle S_{\rm int} \rangle_0 + O(g^2)
= 1 - \sum_n \Delta \left[ \frac{g}{3!}\, \langle i\hat\phi_n\,\phi_n^3 \rangle_0 - \lambda\,\frac{g}{2}\, \langle \phi_n^2 \rangle_0 \right] + O(g^2)
\]
Using Wick's theorem, the fourth-order average is
\[
\langle i\hat\phi_n\,\phi_n^3 \rangle_0 = 3\, \langle i\hat\phi_n\,\phi_n \rangle_0\, \langle \phi_n\,\phi_n \rangle_0 = 3\, R^0_{nn}\, C^0_{nn}
\]
(Why the 3? There are 3 different pairings between the four variables—(1, 2) and (3, 4), (1, 3) and (2, 4), and (1, 4) and (2, 3)—but all give the same product of averages.) Here one sees how the equal time response function appears in the formalism. Inserting the value $R^0_{nn} = \lambda$ that we found earlier, (41) thus becomes
\[
1 = 1 - \sum_n \Delta \left[ \frac{g}{3!}\, 3\lambda\, C^0_{nn} - \lambda\,\frac{g}{2}\, C^0_{nn} \right] + O(g^2) = 1 + O(g^2)
\]
as it should be. This illustrates how the non-trivial determinant that arises in the Stratonovich formalism ($\lambda \neq 0$) and appears as the term proportional to λ in the interaction part of the action (36) is essential for maintaining the correct normalization. Similar cancellations occur at all orders in g. Ito ($\lambda = 0$) is simpler here: the terms in the perturbation expansion of $Z(0)$ all vanish individually, rather than just cancelling each other out.
5. Diagrams
5.1. Basics
Diagrams are just a pictorial way of keeping track of the various terms in the perturbation expansion (38), as evaluated using Wick's theorem. Having illustrated the equivalence between Ito and Stratonovich above, we stick to Ito ($\lambda = 0$) for now, where the non-trivial part of the action is simply
\[
S_{\rm int} = \frac{g}{3!} \sum_n \Delta\; i\hat\phi_n\,\phi_n^3
\]
To illustrate the diagrammatic notation, consider again the expansion for the normalization factor
\[
Z(0) = \langle e^{-S_{\rm int}} \rangle_0 = 1 - \langle S_{\rm int} \rangle_0 + \frac{1}{2} \langle S_{\rm int}^2 \rangle_0 - \cdots
\]
We have dealt with the zeroth and first order terms above. The second order term is
\[
\frac{1}{2} \langle S_{\rm int}^2 \rangle_0 = \frac{1}{2} \left( \frac{g}{3!} \right)^2 \sum_{mn} \Delta^2\; \langle i\hat\phi_m\,\phi_m^3\;\, i\hat\phi_n\,\phi_n^3 \rangle_0 \tag{46}
\]
Represent each of the summed over time indices m and n by a vertex with four 'legs' that symbolize the four factors ($i\hat\phi$ and three ϕ's) with the corresponding time index. Each vertex comes with a factor $-g/3!$ from the expansion of $e^{-S_{\rm int}}$. Both of the time indices are summed over and each sum multiplied by Δ; in the limit $\Delta \to 0$ these scaled sums of course become time integrals. The time indices are not fixed by our choice of observable to average, and are therefore called 'internal'—we will see external vertices in a moment. Now, having drawn the vertices, we can connect the legs in a number of ways, one for each pairing that Wick's theorem gives for the average in (46): a line between a vertex at m and one at n stands for a pair average (a bare correlation $C^0_{mn}$ or response $R^0_{mn}$). E.g. the diagram where the legs from both vertices are connected to legs from the same vertex only
represents all the pairings where legs at m are connected to legs at m only, and legs at n to legs at n only. At each vertex there are three choices for pairings of this form, so this diagram has the value
\[
\frac{1}{2} \left( \frac{g}{3!} \right)^2 \left( \sum_n \Delta\; 3\, R^0_{nn}\, C^0_{nn} \right)^2
\]
This illustrates an important fact: if a diagram separates into subparts which are not connected, its value is just the product of the two diagrams separately (apart from the overall factor of 1/2 here, which comes from the expansion of $e^{-S_{\rm int}}$).
The diagram above certainly does not exhaust all the Wick pairings of (46). So what are the other diagrams corresponding to (46)? We can have either two mn-pairings, one nn and one mm; or four mn pairings. Altogether, one would therefore write (46) in diagrams as a sum of these three contributions.

We write down the value of the last of these: start with $i\hat\phi_m$. From the diagram, this must be connected to a leg on the other vertex, i.e. either $i\hat\phi_n$ or one of the three $\phi_n$. In the first case, each of the remaining $\phi_m$ legs is connected to a $\phi_n$ leg; there are $3! = 6$ ways of making those connections (pairings). In the second case, the $i\hat\phi$ factor from the second vertex, $i\hat\phi_n$, must be connected to one of the three $\phi_m$ legs, and the remaining two $\phi_m$ and $\phi_n$ legs can be connected to each other in two ways. Thus the value of the diagram is:
\[
\frac{1}{2} \left( \frac{g}{3!} \right)^2 \sum_{mn} \Delta^2 \left[ 6\, \langle i\hat\phi_m\, i\hat\phi_n \rangle_0 \left( C^0_{mn} \right)^3 + 3 \cdot 3 \cdot 2\; R^0_{nm}\, R^0_{mn} \left( C^0_{mn} \right)^2 \right]
\]
where the first term vanishes because averages of products of $i\hat\phi$'s are zero.

(Exercise: evaluate the remaining second order diagram and, as a check, count all pairings. The diagram just above dealt with $6 + 18 = 24$ pairings, while the first one corresponded to $3 \times 3 = 9$ pairings. Since we have 8 η's in total, there are $7!! = 105$ pairings overall, hence the remaining diagram must correspond to $105 - 24 - 9 = 72$ pairings.)
In terms of diagrams, it is now quite easy to understand that $Z(0) = 1$ to all orders in g, as it should. Consider an arbitrary diagram in the expansion; let us assume it is connected, otherwise consider the subdiagrams separately. Now within the Ito convention the response function $R^0_{nm}$ is non-zero only for $n > m$ (for Stratonovich, the value for n = m is also non-zero; for $n < m$ it is zero either way from causality). So to make the diagram non-zero, we have to connect the $i\hat\phi$ from a given vertex, with time index n1, say, to a ϕ leg at another vertex, n2. The $i\hat\phi$ from that vertex must in turn be connected to a ϕ on another vertex and so on. Eventually, because we have a finite number of vertices, we must come back to our original vertex n1. In the 'ring' sequence n1, n2, n3, ..., n1 there are at least two time indices that are in the 'wrong' order for the response function to be non-zero, so that the diagram contains at least one vanishing factor and thus vanishes itself.
The moral of the story so far is: diagrams factorize over disconnected sub-diagrams, and the sub-diagrams with only internal (summed-over) vertices vanish. The latter are also called 'vacuum diagrams' in field theory.
5.2. Diagrams for correlator/response; Dyson equation
Next we look at the diagrams for the propagator, which encapsulates correlation and response functions, $G_{ij} = \langle \eta_i\,\eta_j \rangle$. We now have two 'external' vertices with fixed time indices i and j; the propagator is represented as a double line between these two vertices. It is also called the 'full' or 'dressed' propagator, in contrast to the 'bare' propagator $G_0$ that is represented by the single lines in the diagrams and results from the pairings in Wick's theorem. To zeroth order in g, full and bare propagator are obviously equal. To first order, we have, from (38),
\[
G_{ij} = \langle \eta_i\,\eta_j \rangle_0 - \langle \eta_i\,\eta_j\; S_{\rm int} \rangle_0 + O(g^2)
\]
In diagrams, we have two new contributions from the first order term, depending on whether we pair up $\eta_i$ with $\eta_j$ or with one of the legs of the internal vertex. We can therefore write the full propagator as
The second of these has a disconnected part with only internal vertices, so vanishes. The same argument applies to higher order diagrams as well: we only need to consider connected diagrams7 . Thus, evaluating the surviving diagram,
\[
G_{ij} = G^0_{ij} - \frac{g}{2} \sum_n \Delta \left[ \langle \eta_i\,\phi_n \rangle_0\; C^0_{nn}\; \langle i\hat\phi_n\,\eta_j \rangle_0 + \langle \eta_j\,\phi_n \rangle_0\; C^0_{nn}\; \langle i\hat\phi_n\,\eta_i \rangle_0 \right] \tag{54}
\]
If we define a matrix Σ by
\[
\Sigma_{nm} = -\frac{g}{2}\; C^0_{nn}\; \frac{\delta_{nm}}{\Delta} \tag{55}
\]
(where $\delta_{nm}$ is the Kronecker symbol as before) and agree to absorb a factor of Δ into matrix products, so that
\[
(AB)_{nm} = \sum_k \Delta\, A_{nk}\, B_{km}
\]
then we can write (54) in matrix form simply as
\[
G = G_0 + G_0\, \Sigma\, G_0
\]
The way factors of Δ are absorbed here ensures that matrix multiplications become time integrals in the natural way for $\Delta \to 0$, while the factor $\delta_{nm}/\Delta$ in Σ becomes $\delta(t - t')$.
To first order in g, the above is the whole story for the propagator. Now look at the second order. Among the diagrams we have ones such as
These two only differ in how the internal vertices are labeled; since the latter are summed over, we can lump the two diagrams together into one unlabeled diagram. This just gives a factor of 2, which cancels exactly the prefactor $1/2!$ from the second order expansion of the exponential in (38). Again, the same happens at higher orders: at order $g^k$ we have a prefactor of $1/k!$ but also k internal vertices which can be labeled in $k!$ different ways, so the unlabeled diagram has a prefactor of one. Thus, the unlabeled diagram
has the value (bearing in mind that the $i\hat\phi$ leg at each of the internal vertices must be connected either to a ϕ leg at the other internal vertex, or to an external vertex, and that there are three choices at each vertex for which pair of ϕ's to connect to each other) a sum of products of bare propagators and factors $-\frac{g}{2}\,C^0_{nn}$. Using the definition (55), one sees that in matrix form this is simply
\[
G_0\, \Sigma\, G_0\, \Sigma\, G_0
\]
To make this simple form more obvious we have included in (60) a term in the second line which vanishes because it contains a zero factor. We have an example here of a 'one-particle reducible' (1PR) diagram, which can be cut in two by cutting just one bare propagator line, namely, the one in the middle. The result illustrates that the value of such diagrams factorizes into the pieces they can be cut into; e.g. the diagram
has the value $G_0\,\Sigma\,G_0\,\Sigma\,G_0\,\Sigma\,G_0$. If we sum up all the diagrams of this form, we get
\[
G = G_0 + G_0\,\Sigma\,G_0 + G_0\,\Sigma\,G_0\,\Sigma\,G_0 + \cdots
\]
or
\[
G = \left( G_0^{-1} - \Sigma \right)^{-1} \tag{64}
\]
The inverses are relative to the appropriate unit element for our redefined matrix multiplication, which has elements $\delta_{nm}/\Delta$. (So $A^{-1}A = \mathbb{1}$ means, because of the extra factor Δ in the matrix multiplication, that $A^{-1}$ is $1/\Delta^2$ times the conventional matrix inverse of A.)
The expression (64) is an example of a resummation: we have managed to sum up an infinite subseries from among all the diagrams for the propagator. What if we want to sum up all the diagrams?
Again we can classify them into one-particle irreducible (1PI) diagrams, which cannot be cut in two by cutting a single line, and 1PR diagrams that factorize into their 1PI components. For example, to second order
More generally, if we denote the values of the different 1PI diagrams (defined analogously to Σ above, i.e. without the external G0 legs) by $\Sigma^{(1)}$, $\Sigma^{(2)}$, ... then we can get all possible (1PI and 1PR) diagrams by 'stringing' together all possible combinations of 1PI diagrams:
\[
G = G_0 + G_0\,\Sigma\,G_0 + G_0\,\Sigma\,G_0\,\Sigma\,G_0 + \cdots
\]
where $\Sigma = \Sigma^{(1)} + \Sigma^{(2)} + \cdots$. Hence we see that the full propagator, with all diagrams summed up, can be written in the general form
\[
G = \left( G_0^{-1} - \Sigma \right)^{-1} \tag{68}
\]
where Σ, the so-called 'self-energy', is the sum of all 1PI diagrams. Equation (68) is called the Dyson equation. An alternative form that is often useful is
\[
G^{-1} = G_0^{-1} - \Sigma \tag{69}
\]
Let us write this out in terms of the separate blocks corresponding to correlation and response functions: we have
\[
G = \begin{pmatrix} C & R \\ R^{\rm T} & 0 \end{pmatrix}, \qquad
G^{-1} = \begin{pmatrix} 0 & (R^{\rm T})^{-1} \\ R^{-1} & -R^{-1} C\, (R^{\rm T})^{-1} \end{pmatrix}
\]
and $G_0^{-1}$ has the same structure, so that also the 11 block of Σ must vanish,
\[
\Sigma_{11} = 0
\]
This can also be shown diagrammatically: a non-zero contribution to $\Sigma_{11}$ would correspond to diagrams where the internal vertices that make connections to the two external vertices do so via ϕ legs. This leaves all the $i\hat\phi$ legs to be connected amongst the internal vertices, and then the same argument as for vacuum diagrams can be applied. Writing out (69) we have thus
\[
(R^{\rm T})^{-1} = (R_0^{\rm T})^{-1} - \Sigma_{12}, \qquad
R^{-1} C\, (R^{\rm T})^{-1} = R_0^{-1} C_0\, (R_0^{\rm T})^{-1} + \Sigma_{22}, \qquad
R^{-1} = R_0^{-1} - \Sigma_{21}
\]
where we have abused the notation by writing C, R etc. also for the non-zero M × M sub-blocks of the original $2M \times 2M$ matrices. The first and last of these equations are both equivalent to
\[
R^{-1} = R_0^{-1} - \Sigma_{21} \tag{76}
\]
(since $\Sigma_{12} = \Sigma_{21}^{\rm T}$), implying that $\Sigma_{21}$ acts like a self-energy for the response function. Rearranging, the components of the Dyson equation reduce to
\[
R = R_0 + R_0\,\Sigma_{21}\,R, \qquad
R^{-1} C\, (R^{\rm T})^{-1} = R_0^{-1} C_0\, (R_0^{\rm T})^{-1} + \Sigma_{22} \tag{78}
\]
Using (76), the last equation can also be solved explicitly for C as
\[
C = R \left[ R_0^{-1} C_0\, (R_0^{\rm T})^{-1} + \Sigma_{22} \right] R^{\rm T} \tag{79}
\]
In the above we have defined Σ to be plus the sum of all 1PI diagrams; the opposite sign convention also appears in the literature.
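The algebra behind this resummation is just a matrix geometric series, which is easy to verify numerically. The matrices below are random toy examples (illustrative, not propagators of any particular model), scaled so that the series converges; summing $G_0 + G_0\Sigma G_0 + G_0\Sigma G_0\Sigma G_0 + \cdots$ reproduces $(G_0^{-1} - \Sigma)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 8

# toy "bare propagator" and "self-energy"; small enough that the series converges
G0 = np.eye(N) + 0.1 * rng.normal(size=(N, N))
Sigma = 0.15 * rng.normal(size=(N, N))

# geometric series G = G0 + G0 Sigma G0 + G0 Sigma G0 Sigma G0 + ...
G_series = np.zeros((N, N))
term = G0.copy()
for _ in range(200):
    G_series += term
    term = G0 @ Sigma @ term        # next term: prepend another G0 Sigma factor

# resummed Dyson form G = (G0^{-1} - Sigma)^{-1}
G_dyson = np.linalg.inv(np.linalg.inv(G0) - Sigma)
print(np.abs(G_series - G_dyson).max())
```

The maximum deviation between the truncated series and the resummed form is at machine-precision level once enough terms are included.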
5.3. Self-consistency
Of course in general one cannot sum up all the diagrams for the self-energy. One must make some approximation. In (64) we used just the lowest order diagram to approximate Σ. Can we easily improve the approximation? Yes, if we replace G0 in the expression for Σ by the full propagator G; diagrammatically,
Which diagrams does this correspond to? This is easiest to find out order by order in g. To first order, Σ and G are
Re-inserting G into Σ we get its form to second order and therefore also G
and then we can iterate to get
So the simple operation of replacing G0 by G effectively sums up an infinite series of 'tadpole' diagrams in the expansion for the self-energy. This is called 'mean field theory'. It gives exact results for many models with weak long-ranged interactions; for such models the diagrams that have not been included are negligible in the thermodynamic limit. The approximation is also called 'self-consistent one-loop' or 'Hartree–Fock' approximation.
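As a scalar caricature of this self-consistency (an illustration under stated assumptions, not a calculation from the text), one can take the stationary equal-time correlator c of the model (32) in the self-consistent one-loop approximation, where the effective decay rate is shifted by the tadpole, $\mu \to \mu + (g/2)c$, so that c obeys $c = T/(\mu + gc/2)$. This fixed-point equation can be solved by exactly the kind of iteration described above:

```python
mu, T, g = 1.0, 0.5, 2.0         # illustrative parameter values

c = T / mu                       # start from the bare (g = 0) value
for _ in range(100):             # self-consistent iteration: c <- T / (mu + g*c/2)
    c = T / (mu + 0.5 * g * c)

# the fixed point solves (g/2) c^2 + mu c - T = 0, so it is also known in closed form
exact = (-mu + (mu**2 + 2 * g * T) ** 0.5) / g
print(c, exact)
```

The iteration converges rapidly because the update map is a contraction near the fixed point; in field-theoretic language each iteration inserts one more tadpole into the self-energy.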
5.4. Diagrammatic conventions
We have used a particular way of drawing diagrams above that does not distinguish between the physical field ϕ and the conjugate or 'response' field $i\hat\phi$. This approach has the virtue of making all diagrams look essentially like in a static (equilibrium) field theory. It does however mean that each diagram can group a number of terms, involving response or correlation functions depending where on each vertex the ϕ and $i\hat\phi$ variables are located in any given Wick pairing. If one wants to avoid this one can e.g. mark on each vertex the $i\hat\phi$-leg by an outgoing arrow. Similarly in the response part of the propagator—the $\langle \phi\; i\hat\phi \rangle$ sector in our previous convention—one would mark the $i\hat\phi$ end by an arrow. The arrows then represent the flow of time because in all contractions of the type $\langle \phi\; i\hat\phi \rangle$ the $i\hat\phi$ must be at a time before the ϕ.
The rules for constructing diagrams are modified only slightly in this more detailed diagrammatic representation: an arrow from a vertex cannot connect back to the same vertex as this would give an equal-time response function, which vanishes in the Ito convention. At each vertex one has to consider the possible choices for how the leg with an arrow can be connected to other vertices or external nodes. Diagrams are non-vanishing only if the arrows give a consistent flow of time, i.e. arrows cannot form closed loops nor can two legs with outgoing arrows be connected.
Within the above convention, a propagator without an arrow on it is always a correlation function. In our example, the diagrammatic expansion up to second order of the correlation function then reads
while for the response function there are fewer diagrams that contribute:
Note that the expansions (85) and (86) are written as the direct analogs of (65) and differ from the latter only through the addition of the appropriate arrows. More compact expressions can be obtained in terms of the self-energy. For the correlation function specifically, the identity (79) may be more efficient for use in practical calculations. This is illustrated in the example in section 6 below, see (106) there.
5.5. Non-zero fields and initial values
So far we have discussed dynamics without applied fields hn and with zero initial value $\phi_0 = 0$. These restrictions can be lifted, as we now explain. We continue to use the Ito convention ($\lambda = 0$). It suffices to discuss non-zero fields, because a non-zero initial value $\phi(0) = c$ can be produced by setting $h_0 = c/\Delta$. This gives $\phi_1 = c + O(\Delta^{1/2})$, which in the continuous time limit approaches c. In the same limit the difference between $t = 0$ and $t = \Delta$ is immaterial, so the above choice of h0 effectively fixes a non-zero initial condition.
For the perturbative approach to work, we need a quadratic action S0 as our baseline while all non-quadratic terms are contained in the interacting part $S_{\rm int}$. If the field hn is non-zero, this generates linear terms in the action. We then have two choices: either we include these linear terms in $S_{\rm int}$ and treat them perturbatively, or we transform variables by expanding the full action around a stationary or saddle point.
The first approach is relatively straightforward: one now has an extra vertex, with only one $i\hat\phi$-leg, and the perturbative expansion is jointly in g and the field amplitude. For the propagator all contributing diagrams must have a total number of legs on the internal vertices that is even, so the number of field vertices must also be even; hence the expansion is effectively in g and $h^2$. At $O(h^2)$ one has the extra diagram
where each dot with a single connection is an h-vertex8 , while at $O(g h^2)$ one gets
We could represent the fact that all field vertices have a single $i\hat\phi$-leg by an arrow pointing away from the vertex (see section 5.4). For the response function part of the propagator there would be an arrow pointing away from one of the external vertices as well, so the diagram in (87) vanishes as it contains an edge with opposing arrows. To a correlator the diagram contributes in the continuous time limit
\[
\int dt_1 \int dt_2\; R_0(t, t_1)\, h(t_1)\; R_0(t', t_2)\, h(t_2)
\]
The higher order diagrams can be evaluated similarly.
In the presence of a field even the mean $\langle \phi_i \rangle$ is in general non-zero. The perturbative expansion for this consists of diagrams with one external vertex and therefore requires an odd number of internal field vertices. The contributions to $O(h)$ and $O(gh)$ are
One sees that most of the new diagrams (87) and (88) in the propagator are products of diagrams like these. In fact one can convince oneself that if one considers the connected propagator, defined as $\langle \eta_i\,\eta_j \rangle - \langle \eta_i \rangle\langle \eta_j \rangle$, then as in the case without a field only the connected diagrams remain [19]; the lowest order one of these is the third diagram in (88).
Next we look at the saddle point approach, where one defines new variables relative to a stationary point of the action, thus eliminating any linear terms in the transformed action. Looking at (23) with $\psi = 0$, the saddle point conditions with respect to $\phi_n$ and $\hat\phi_n$ give
\[
i\hat\phi_{n-1} = i\hat\phi_n \left[ 1 + \Delta f'(\phi_n) \right], \qquad
2T\Delta\,\hat\phi_n = i\left( \phi_{n+1} - \phi_n - \Delta f(\phi_n) - \Delta h_n \right)
\]
The 'initial' (at the final time) condition for the first of these equations, which results from stationarity with respect to $\phi_M$, is $\hat\phi_{M-1} = 0$, and solving backwards in time then shows that $\hat\phi_n = 0$ for all n at the saddle point. With this the second equation is just
\[
\phi_{n+1} = \phi_n + \Delta\left[ f(\phi_n) + h_n \right]
\]
which is the discrete time version of the deterministic (noise-free) time evolution $\partial_t\phi = f(\phi) + h$. If we call the solution of this $\phi^*$, then we need to use as new variables $\delta\phi_n = \phi_n - \phi^*_n$. In terms of these, by making use of the saddle point conditions, the (Ito) action reads
\[
S = \sum_n \left\{ T\Delta\,\hat\phi_n^2 - i\hat\phi_n \left[ \delta\phi_{n+1} - \delta\phi_n - \Delta\left( f(\phi^*_n + \delta\phi_n) - f(\phi^*_n) \right) \right] \right\}
\]
The difference $f(\phi^*_n + \delta\phi_n) - f(\phi^*_n)$ needs to be expanded in $\delta\phi_n$ to read off the resulting vertices in the interacting part of the action. For our example (32),
\[
f(\phi^* + \delta\phi) - f(\phi^*) = -\left[ \mu + \frac{g}{2}\,(\phi^*)^2 \right] \delta\phi - \frac{g}{2}\,\phi^*\,\delta\phi^2 - \frac{g}{3!}\,\delta\phi^3 \tag{93}
\]
If we include the term $-\mu\,\delta\phi$ in the unperturbed action S0 then the latter has the same form as originally except for the change of variable from ϕ to $\delta\phi$, and the same diagrammatic expansion technique can be applied. The difference is that the interaction part now contains three different kinds of vertices resulting from the terms proportional to g in (93), with two, three and four legs respectively and different time-dependent prefactors from the time dependence of $\phi^*$.
One point to bear in mind is that even though we have defined $\delta\phi$ relative to the saddle point solution, its average is not in general zero. To $O(g)$, for example, one has for $\langle \delta\phi(t) \rangle$ the diagram
coming from the $i\hat\phi\,\frac{g}{2}\,\phi^*\,\delta\phi^2$ term in (93), which evaluates to
\[
\langle \delta\phi(t) \rangle = -\frac{g}{2} \int dt'\; R_0(t, t')\, \phi^*(t')\, C_0(t', t')
\]
6. A one-particle example
Let us apply the diagrammatic formalism developed above to our one-particle model (32). We want to take a distribution over the initial values into account here. But this only takes a small extension of the formalism: if the initial distribution is a zero mean Gaussian, we can simply include the average over in the unperturbed average; the measure is still Gaussian, so we can apply all of the above formalism, except that the values of C0 and R0, i.e. the components of the bare propagator, are affected by the presence of uncertainty in the initial condition. If the distribution has non-Gaussian parts, we include the Gaussian part as above and the remainder is put into the non-trivial part of the action , giving a new kind of vertex in the diagrams. A non-zero mean in a Gaussian initial distribution would also be put into .
Now let us write down the Dyson equation for our model. Having derived all relations previously so that they have the obvious limits for , we work directly with continuous times. The bare response function is, from (4),
The inverse of R0 is the operator , since when applied to R0 it gives . The other quantity we need to write down the Dyson equation (78) is . To find this, it is easiest to start from the fact that
The lower boundary of the integral here is meant as the same applies to all integrals that follow. Averaging the product gives
so with given by the square brackets; thus
Now we can write down the Dyson equation (78):
To first order in g the self-energy is, from (55),
Within this approximation, the Dyson equation becomes
From the first equation,
and with this the second equation for C can also be solved (compare (79)) to give
For the simplest case where C0 is time-translation invariant, corresponding to , we see that the effect of g in the response function R is just to replace . At long times the effect on C is similar though at short times C will not be time-translation invariant.
If we make our first order (one-loop) approximation self-consistent, the only change is to replace C0 by C in (102) and correspondingly in the Dyson equation, so that
This has a simple interpretation: it corresponds to replacing our original nonlinear force term (32) by
with to be determined self-consistently. (The non-self-consistent version instead sets .) Assuming that we can find a solution with constant, we get for C by inserting (107) into (106) and setting
For we then indeed get a time-translationally invariant . This has and so the self-consistent equation determining c and is
This is the self-consistent analog of our previous result , to which it reduces when expanded to first order in g.
We see that for our particular model the self-consistent approximation gives a more sensible result than the 'vanilla' first order approximation: it allows a time-translation invariant solution for both C and R
which also obeys the fluctuation–dissipation theorem (FDT), for .
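As a sketch of how the self-consistent one-loop (Hartree) closure is solved in practice, assume again the quartic drift f(φ) = −μφ − gφ³ and noise correlator 2Tδ(t−t′) (neither is fixed by the extracted text above). The Hartree replacement gφ³ → 3g⟨φ²⟩φ then turns the equal-time correlator c into the fixed point of c = T/(μ + 3gc):

```python
import numpy as np

# Self-consistent one-loop (Hartree) closure, under the assumption that
# model (32) reads dphi/dt = -mu*phi - g*phi**3 + xi with
# <xi(t)xi(t')> = 2*T*delta(t-t').  Replacing g*phi^3 by 3*g*c*phi
# (c = equal-time correlator) gives an Ornstein-Uhlenbeck process with
# effective stiffness mu + 3*g*c, whose stationary variance must equal c.

mu, g, T = 1.0, 0.2, 1.0

c = T / mu                          # start from the g = 0 value
for _ in range(100):                # fixed-point iteration (a contraction here)
    c = T / (mu + 3 * g * c)

# Closed form: positive root of 3*g*c^2 + mu*c - T = 0
c_exact = (-mu + np.sqrt(mu**2 + 12 * g * T)) / (6 * g)
assert abs(c - c_exact) < 1e-10

# Re-expanding to first order in g recovers the plain perturbative result
c_first_order = T / mu - 3 * g * T**2 / mu**3
```

Note how the self-consistent value interpolates smoothly in g, whereas the plain first-order expression is only the leading correction.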
6.1. Mode-coupling theory
What if we want to improve the approximation to the self-energy further? The systematic approach is to include the lowest-order diagram not so far taken into account. We already have the only first-order diagram; the second-order 'tadpole' diagrams are also taken into account through self-consistency. The only missing second order diagram is therefore the 'watermelon' diagram
To work out what contribution to Σ this gives, let us revert temporarily to discrete time notation and label the left and right vertex m and n, respectively. The elements of correspond to those pairings where a leg from vertex m is attached to an external vertex that would be on the left, and a leg from vertex n is attached to an external vertex on the right. Internally (among the remaining legs) we thus have two pairings and one pairing. To work out the prefactor of the diagram, note that there are three choices for the externally attached ; there are three choices for which of the to pair up with, and two more choices for how to make the two remaining pairings. Thus, the diagram gives for
For , we have both the and legs attached externally, and 6 choices for how to connect the three and legs internally, giving
We can again sum an infinite series of additional diagrams by replacing bare () by full quantities () here. Reverting to continuous time notation and including the first-order contribution in , we thus get for the self-energy in the self-consistent two-loop approximation
A final comment: up to the order which we have considered, the self energy components and are simple functions of the correlation and response functions. This is not normally true once higher order diagrams are taken into account. For example, if we went to third order in g we would have to include the diagram
in the self energy; all other third order diagrams are automatically included by self-consistency. This diagram now has one internal vertex whose time index is not fixed by the two time indices that the self-energy carries. It therefore gives a contribution to which has an integral over this 'internal' time; e.g. for we get a contribution
Diagrams of even higher order contain additional time integrals; in general, then, the self-energy is a functional of the response and correlation functions.
An approximation, like that in (116) and (117), in which we keep only diagrams for that are functions of and has come to be called a mode coupling approximation. These have been studied extensively, often because they are exact for some mean-field (infinite-range) models [20].
7. Ghosts and supersymmetry
7.1. Using Grassmann variables
In section 4, we discussed how the results of the perturbation theory are independent of the value of λ chosen in the discretization of the Langevin equation: the choice of λ changes the Jacobian but does not affect correlation functions and other observables. As alluded to already in section 3, there is another way of including the Jacobian in the path integral using Grassmann variables, which is both conceptually interesting and can often simplify notation. This observation was first made, and its consequences on dynamics explored, by Feigel'man and Tsvelik [21, 22], who followed earlier work by Parisi and Sourlas [23] on supersymmetric properties of an equilibrium system in a random external field. The current section is devoted to describing this approach.
Before getting into the details of the supersymmetric formalism, we need to familiarize ourselves with Grassmann variables. Grassmann variables are characterized by the fact that they anticommute with each other. Multiplication of them is also associative; i.e. for Grassmann variables , we have
A consequence of anti-commutation is that
Grassmann variables can be added to each other and also multiplied by complex numbers: one says formally that they form an algebra, i.e. a vector space over the complex numbers endowed with a multiplication. This means that, given (122), the most general functions of one and two Grassmann variables can be written respectively as
where and c12 are arbitrary complex numbers.
Integration and differentiation for Grassmann variables are defined by
and these lead to the following formulae that we will use later:
As a consequence of the above, and using two independent sets of Grassmann variables and , one has the following important representation of the determinant of a matrix :
where . The integrand on the right-hand side of (128) defines a Gaussian measure for Grassmann variables under which we have .
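The determinant identity (128) can be verified numerically with a toy implementation of the Berezin calculus. The integration conventions used below (the integral of 1 vanishes, the integral of a single generator is 1, and the measure is ordered pairwise) are assumptions chosen so that the one-variable case gives the matrix element itself:

```python
import math
import numpy as np

# Toy Berezin calculus to check (128):
#   integral of exp(-sum_ij xibar_i A_ij xi_j) over all pairs
#   dxi_i dxibar_i equals det A.
# A Grassmann element is a dict {sorted tuple of generator indices:
# coefficient}; products pick up the sign of the permutation needed to
# sort the combined index tuple.

def sign(seq):
    """Sign of the permutation sorting seq (count inversions)."""
    s = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                s = -s
    return s

def mul(x, y):
    out = {}
    for k1, v1 in x.items():
        for k2, v2 in y.items():
            if set(k1) & set(k2):
                continue                      # xi^2 = 0
            key = tuple(sorted(k1 + k2))
            out[key] = out.get(key, 0.0) + sign(k1 + k2) * v1 * v2
    return out

def add(x, y, c=1.0):
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0.0) + c * v
    return out

def gexp(x, nmax):
    out, term = {(): 1.0}, {(): 1.0}
    for n in range(1, nmax + 1):              # series terminates: x is nilpotent
        term = mul(term, x)
        out = add(out, term, 1.0 / math.factorial(n))
    return out

def integrate(x, i):
    """Berezin integral over generator i: coefficient of xi_i, with the
    sign from anticommuting xi_i to the front of each monomial."""
    out = {}
    for k, v in x.items():
        if i not in k:
            continue
        p = k.index(i)
        key = k[:p] + k[p + 1:]
        out[key] = out.get(key, 0.0) + (-1) ** p * v
    return out

# Generators: xi_i -> index 2*i, xibar_i -> index 2*i + 1.
rng = np.random.default_rng(0)
N = 3
A = rng.standard_normal((N, N))
expo = {}
for i in range(N):
    for j in range(N):
        expo = add(expo, mul({(2 * i + 1,): 1.0}, {(2 * j,): 1.0}), -A[i, j])
res = gexp(expo, N)
for idx in range(2 * N):                      # measure dxi_1 dxibar_1 dxi_2 ...
    res = integrate(res, idx)
assert abs(res.get((), 0.0) - np.linalg.det(A)) < 1e-10
```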
Employing the representation (128) for the determinant of a matrix, we can write the Jacobian that appears in the transformation (18) in the following way. First, we recall that the non-zero elements of the Jacobian are
and therefore
where (with n = 1,..., M) and (with ) are Grassmann numbers.
The dynamical partition function is defined as before in (26) but with the average now also involving integration over the Grassmann variables,
The action reads
where the second line replaces the last term in (25). In the continuous time limit this can be written as
Let us see how the inclusion of the Grassmann 'ghosts' works out for our example case of . Going back to discretized time temporarily will help us understand how to treat equal-time correlations. With as given, the action can be written as
The coefficient matrix appearing in the ghost term of the bare action S0 has entries on the main diagonal and on the diagonal below. This matrix is easily inverted to show that the ghost covariance is
for and 0 otherwise; the last expression applies for . The ghost correlator is therefore causal and reads in the continuum limit:
While this is λ-independent, the dependence on λ reappears in how equal-time Wick contractions are treated in the perturbative expansion in powers of . To see this, note that up to vanishing corrections the interacting action is
The square brackets in the first line, including the factor i, define what we previously called the response field , which has equal-time correlator with of . Similarly we could now define the combination in square brackets in the second line as a new Grassmann response field , which has equal-time correlator with ξ of using (138). If for simplicity of notation we do not distinguish between and (and similarly and ) then the upshot of this discussion is that for one can work directly with the continuous-time version
of the interacting action, provided that one remembers that the equal-time ghost correlator has to be set to in any Wick contractions, and similarly . Note that there are never any contractions between ordinary and Grassmann variables because the average of any single Grassmann variable over a (Grassmann) Gaussian vanishes.
To see how the perturbation theory with ghosts works in practice, consider the response function to first order in the perturbation:
The physical piece of this is the first term and the contraction without equal-time response factors in the first integral:
When , i.e. for any convention other than Ito, there are two other non-zero contractions of the first integral:
In addition the two possible Wick contractions of the ghost term in the second line of (143) give, bearing in mind that ,
Because at equal-time while the ghost correlator has the opposite sign , the terms in (145) and (146) exactly cancel each other as they should.
In other words, whether or not we include the Jacobian, the extra terms (145) do not appear in the perturbation theory, either because they are simply equal to zero, or because they cancel with the additional terms (146) that arise from the ghost correlation functions from the Grassmann representation of the Jacobian. As will be elaborated in the next section, this is formally a consequence of a symmetry of the theory that represents the fact that the path probabilities are normalized to one.
7.2. Manifestly supersymmetric description
Let us now define the superfield Φ as
where θ and are themselves Grassmann variables and are sometimes referred to as Grassmann time. The action (134) can then be written in terms of Φ in a compact way that reveals unexpected symmetries of the problem.
To see this, it is convenient to define
so . The existence of the potential V is crucial here: the supersymmetric description cannot be used for systems of more than one variable when these are non-equilibrium in the sense that the drift f cannot be written as a gradient.
In our current one-dimensional example where a potential can always be defined we have, by Taylor expansion around and throwing away terms that vanish because of (122),
which leads to
giving us two of the terms in (134).
The remaining terms can be written in terms of derivatives of Φ. We have
If we then evaluate the quantity
we find that we get the term in the action (134). Similarly, we find
which are the negatives of the terms in (134) involving time derivatives. Putting all these results together and defining , we can write the action in the form
It will be handy to introduce the notation
so that
Up to here the formalism is general enough—subject to the existence of the potential V—that it can describe also non-stationary dynamics, e.g. relaxation to equilibrium. From now on we restrict ourselves further by assuming we have a stationary state. The supersymmetric action then has several symmetries; the obvious one is time translation invariance. But it is also invariant under other 'translations' that involve shifts in the Grassmann times θ and . In what follows, we identify these invariances and investigate their physical meanings [15, 24].
Consider a 'translation' generated by the operator
This produces a shift and a change in Φ:
where is a Grassmann 'infinitesimal' that acts like a separate Grassmann variable. Now, by using (153) (or simply by substituting in the definition (147) of the superfield), we find
But according to the definition of the superfield, whatever appears in the expression (163) not multiplied by θ, or should be identified as the new ϕ, and whatever appears multiplied (from the right) by θ should be identified as the new . The component fields therefore transform as
This transformation is called supersymmetric because it mixes 'bosonic' degrees of freedom (i.e. ϕ and ) with 'fermionic' (i.e. Grassmann) ones.
To verify that such a transformation is indeed a symmetry of the model, we proceed analogously to what we would do to test, say, rotational invariance in an ordinary field theory with vector fields. We start by performing the 'rotation' (164)–(167) on the superfield components in each term of the integrand in (160), i.e. we substitute the transformed Φ and carry out the required derivatives with respect to time and Grassmann time. In general, this leads to a change in the integrand, which we then integrate over τ (i.e. over θ, and t). If the result is zero we have a symmetry. As an example, let us see how the shift generated by affects the 'kinetic' term in the action. From (152) we see, using (166) and (167), that
For the change in , we see from (153) that the first term does not change as it involves only ξ and , and these do not change under (165) and (167). From (154), (164) and (165), the change in is
Putting these contributions together, the change in is
Only terms proportional to will survive the integration over θ and , but these cancel:
A similar calculation shows that the 'potential' term is also unchanged. Thus, the action S is invariant under .
An analogous calculation shows that the kinetic term is not invariant under shifts generated by . However, we can try combining a shift in θ with one in time, using the generator
and see whether there is a value of α for which is invariant. Now the transformation of ξ acquires a new term proportional to ,
and is also changed, proportional to :
The remaining variables ϕ and transform according to
Then, after some algebra, we find that, under our trial , is changed by
(here ). Integrating over θ and leaves
so we see that with the choice this just reduces to
This vanishes on integration over t (for stationary initial and final states), proving the invariance of this part of the action. Again the proof of invariance for the term is similar to that for , so the total action is invariant under .
The reader who is uneasy with the formal manipulations using superfields here can check these results by applying the transformations (164)–(167) for and (173)–(176) for directly to ϕ, , ξ and , in the form (134) of the action not using superfields.
To see the meaning of these supersymmetries, we consider the supercorrelation function
where 1 stands for and analogously for 2. When we use (147) and expand the product, many terms vanish, either because the averages are of products of Grassmann variables with ordinary ones or of a pair of 's. The remaining terms are
where means , etc.
From the invariance of the action under (translations in ), we have
The vanishing of the term proportional to says that the ghost correlation function has to be the negative of the response function (as in (139)). The vanishing of the term proportional to says the same thing if we notice that the ghost correlation function here is , not . Thus, invariance under , through its enforcement of the cancellation of disconnected diagrams, is just conservation of probability.
Analogously, for the symmetry (172), we have
This gives
For , vanishes, so we have, also using from time translation invariance,
Thus,
which is the FDT.
To summarize, the theory has three invariances: time translation, and . expresses conservation of probability, and expresses the FDT, i.e. the fact that the system is in equilibrium.
So far, we have treated a single-site problem. It is straightforward to extend the formalism to multiple degrees of freedom . However, as emphasized before, the supersymmetric construction is permitted only when the drift is the negative gradient of a potential, . Otherwise the supersymmetry fails. This means that the FDT is not obeyed; the system, even though it may possess a steady state, is not in equilibrium. (Of course, it is well-known from simple arguments making no reference to supersymmetry that models with non-gradient drifts, as arising e.g. from asymmetric coupling matrices, do not satisfy the FDT.)
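This last point is easy to illustrate numerically. Take linear Langevin dynamics dx/dt = −Ax + ξ as a minimal stand-in (an assumption; this is not the soft-spin model of the text): stationarity always gives C(−τ) = C(τ)ᵀ for the lagged correlator, so time-reversal symmetry, i.e. equilibrium, requires C(τ) to be a symmetric matrix, which holds for a gradient (symmetric) drift matrix and fails for a non-gradient one:

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

# Linear Langevin toy model:  dx/dt = -A x + xi,
# <xi_i(t) xi_j(t')> = 2*T*delta_ij*delta(t - t').
# Stationary covariance: A C0 + C0 A^T = 2*T*I; lagged correlator
# C(tau) = expm(-A*tau) @ C0.  Equilibrium (time reversal) requires
# C(tau) = C(tau)^T.

T, tau = 1.0, 0.7

def lagged_corr(A):
    C0 = solve_continuous_lyapunov(A, 2 * T * np.eye(len(A)))
    return expm(-A * tau) @ C0

A_sym = np.array([[2.0, 0.5], [0.5, 1.0]])    # gradient (potential) drift
A_asym = np.array([[2.0, 0.9], [-0.3, 1.0]])  # non-gradient drift

C_sym = lagged_corr(A_sym)
C_asym = lagged_corr(A_asym)
assert np.allclose(C_sym, C_sym.T)        # equilibrium: time reversible
assert not np.allclose(C_asym, C_asym.T)  # detailed balance / FDT broken
```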
7.3. Superdiagrams
In addition to the insight it provides into the symmetries of the problem, the supersymmetric formulation can also be of practical advantage in calculations. For example, in diagrammatic perturbation theory, one need only draw the diagrams for the static problem. Another example can be found in Biroli's analysis [11] of dynamical TAP equations for the p-spin glass, where the entire structure of the argument and the equations could be carried over from the static treatment of Plefka [25]. Here we sketch how to do diagrammatic perturbation theory in the superfield language and show, in a simple example, how it reduces to the diagrams we had in the MSR formalism with the and correlation functions [20].
We write our standard model (14) in the form
with
and
To do perturbation theory, we can expand in g and apply Wick's theorem to evaluate the resulting averages, just as we did in section 5. Wick's theorem holds also for Grassmann variables although in principle one has to be careful with sign changes that arise from changing the order of the variables when performing contractions. For example, . Fortunately this is not an issue in our context as we only need averages of powers of the superfield Φ, and from (147) Φ only contains products of pairs of Grassmann variables: commuting such pairs through each other never gives any minus signs.
The first thing we need for the perturbation theory is the correlation function or propagator of the non-interacting system. Integrating (188) by parts, we get
It is convenient to write this as
where and denote the commutator and anti-commutator respectively. It is then simple to show that
and
The term in (191) involving the anticommutator can be neglected because it vanishes on time integration, so we can write S0 in the appealing form
with
From this, we identify
as the inverse of the free propagator:
where we have used the fact that acts as a delta-function in the Grassmann times. Now, multiplying by from the left, using the fact that , and Fourier transforming in time, we arrive at
Back in the time domain, this is
where . It is comforting to confirm that we can get this result in another, simpler way, using the representation (181) with the equalities implied by invariance (182):
Using the free correlation and response functions (3) and (4) here then leads to (199).
To see how the diagrammatics work in this formalism, consider the second-order watermelon graph (113) for the self-energy,
(Here we are doing the resummed expansion in which Σ is a functional of the full correlation function Q, not just Q0, as in section 6.)
From the representation (200), the part of with no Grassmann times multiplying it is the correlation function, and the parts multiplied by are the retarded and advanced response functions, respectively. Expanding , we get
This has the same form as (200). We note that the components of here are just the contributions to the self-energies , and that we found in the conventional MSR theory (116) and (117), including the factor of 3 in and .
It is useful to discuss in more detail how perturbation-theoretic corrections to in the supersymmetric formulation correspond to results obtained in the conventional MSR theory. The two theories differ superficially in two ways: (1) In the conventional theory we have to keep track of two kinds of correlation functions ( and ) while in the supersymmetric formulation we have only one (super)field and need only draw the diagrams we would have in statics. (2) In the supersymmetric theory both real and Grassmann times of intermediate vertices in the graphs are integrated over, while in the conventional theory only ordinary times are integrated over. To compare the two ways of doing the calculation, we consider the first term obtained in expanding the Dyson equation in :
To resolve into components, we use first the fact that Grassmann factors of the form are idempotent under convolution, e.g.
That means that the retarded part of (the part proportional to ) is, in the notation we have used earlier (, )
and analogously for the advanced part.
We also get a contribution to from the first term in . Since that term contains no Grassmann times, this contribution will contain a factor
These integrations pick out factors of the retarded function and the advanced function , respectively, so we find a contribution, involving no Grassmann times, of
These are exactly the second-order contributions to and that we would find in the conventional formulation from expanding the Dyson equation to first order in the self-energies and in (116) and (117), again up to terms not written explicitly here. Thus, because of the algebra of the Grassmann times in the supersymmetric formulation, the results of the multiplication and convolution of the supercorrelation functions, when reduced to components, reproduce the terms found in the conventional MSR theory. This result extends to all graphs in perturbation theory, because it depends only on (1) the idempotency of factors like and (2) the fact that multiple correlator lines between a pair of interaction vertices, like those in (201), combine as in (202).
8. An interacting example
To generalize the discussion so far to more interesting interacting models one can for example add a linear interaction between different degrees of freedom or 'soft spins' . This gives the Langevin equation of motion
where we assume that there are no self-interactions, hence . If the couplings are otherwise symmetric, , then this dynamics obeys detailed balance because it represents noisy gradient descent on the energy function
which can be thought of as a soft spin version of the Sherrington–Kirkpatrick model [26]. The diagrammatic technique in its MSR incarnation (including fermionic ghost variables, if one chooses to follow the Stratonovich convention) can be applied irrespective of any such restriction, i.e. whether or not the system obeys detailed balance. For a supersymmetric treatment, on the other hand, symmetry of the interactions is necessary.
The generating functional has the form (26) with action, written directly in the continuous time limit for h = 0 and using the Ito convention,
In this action the interaction gives an additional vertex with two legs, . (As before we do not introduce a new symbol for this vertex as the meaning is clear from the number of legs attached.) The diagrammatic expansion now becomes a joint expansion in both g and the interaction amplitude; formally one could set , consider the fixed and expand in J. To illustrate the new diagrams that appear we restrict ourselves to g = 0. The expansion of the propagator to second order in J is then simply
All propagators and vertices now carry site indices in addition to the time index and the 'sector' label 1 or 2 for physical (ϕ) and conjugate () fields. If we focus on the response function part of the propagator, the expansion becomes
Writing out the diagrams this translates to
where all internal site indices () are to be summed over. Here is the response function of the unperturbed dynamics, which because of the absence of interactions is diagonal in the site indices, . The time integrals become simple products in frequency space, giving for the Fourier transform
or in matrix form
Inserting where is the identity matrix gives . Given that we are considering g = 0 where the dynamics is purely linear, this is easily seen to be the exact result. This was possible to obtain here because we were able to sum up all diagrams, or equivalently because only a single diagram (the bare quadratic vertex) contributes to the self-energy.
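The resummation can be checked numerically at a fixed frequency. Assuming the bare response has the Ornstein–Uhlenbeck form R0(ω) = (−iω + μ)⁻¹ (an illustrative choice, since the bare model is not fixed here), the partial sums of the geometric series above converge to the matrix inverse [R0⁻¹(ω)𝟙 − J]⁻¹:

```python
import numpy as np

# Check, at one frequency, that the diagrammatic series
#   R = R0 + R0 J R0 + R0 J R0 J R0 + ...   (site-space matrices)
# sums to R = (R0^{-1} * I - J)^{-1}, with an assumed bare response
# R0(w) = 1/(-1j*w + mu).  The series converges when the spectral
# radius of R0*J is below one, which holds for the small J used here.

rng = np.random.default_rng(1)
N, mu, w = 6, 1.0, 0.3
J = 0.1 * rng.standard_normal((N, N))
np.fill_diagonal(J, 0.0)                  # no self-interactions

R0 = 1.0 / (-1j * w + mu)                 # scalar bare response at frequency w
closed = np.linalg.inv(np.eye(N) / R0 - J)

series = np.zeros((N, N), dtype=complex)
term = R0 * np.eye(N)
for _ in range(200):                      # sum (R0 J)^k R0
    series += term
    term = R0 * (J @ term)

assert np.max(np.abs(series - closed)) < 1e-10
```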
Up to here the discussion applies for general Jij. To simplify further one needs to make assumptions on the statistics of these interactions. One interesting case is that of a soft-spin Sherrington–Kirkpatrick model for which the Jij are zero-mean Gaussian random variables, uncorrelated for different index pairs ij, except that the symmetry in the interaction matrix is imperfect:
The symmetry parameter has the value for a fully symmetric matrix, while gives a fully asymmetric matrix.
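Such a partially symmetric Gaussian matrix is conveniently sampled by mixing a symmetric and an antisymmetric Gaussian matrix; the normalization var(J_ij) = J²/N used below is an assumption consistent with the SK scaling:

```python
import numpy as np

# Sampling partially symmetric couplings as in (216): zero-mean Gaussian
# J_ij with (assumed) variance J^2/N and E[J_ij J_ji] = kappa * J^2 / N,
# built by mixing a symmetric and an antisymmetric Gaussian matrix.

def sample_couplings(N, J, kappa, rng):
    a = rng.standard_normal((N, N))
    s = (a + a.T) / np.sqrt(2)          # symmetric part, unit variance
    b = rng.standard_normal((N, N))
    t = (b - b.T) / np.sqrt(2)          # antisymmetric part, unit variance
    x = np.sqrt((1 + kappa) / 2) * s + np.sqrt((1 - kappa) / 2) * t
    Jmat = (J / np.sqrt(N)) * x
    np.fill_diagonal(Jmat, 0.0)         # J_ii = 0: no self-interactions
    return Jmat

# kappa = 1 gives a symmetric matrix, kappa = 0 an uncorrelated
# (fully asymmetric) one; check the statistics empirically.
rng = np.random.default_rng(0)
N, J, kappa = 1000, 1.0, 0.5
Jmat = sample_couplings(N, J, kappa, rng)
off = ~np.eye(N, dtype=bool)
assert abs(N * np.mean(Jmat[off] ** 2) - J**2) < 0.05
assert abs(N * np.mean((Jmat * Jmat.T)[off]) - kappa * J**2) < 0.05
```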
For the local response function Rii one can in general simplify (213) to
where the term of first order in Jij vanishes due to the lack of self-interactions. For the soft-spin SK model, the sum in the second order term has average while its variance is . For large N it is therefore self-averaging, i.e. equal to with probability one. Hence
As one might have expected, because this model has each node interacting with all others the nodes become equivalent, making the local response functions Rii independent of i. It remains true at higher orders that the local response only depends on the overall coupling amplitude J and the symmetry parameter κ. This can be shown by a separate diagrammatic argument [27–29] or by explicit averaging of the path generating function Z over the disorder, i.e. the statistics of the Jij. This latter approach is the subject of the following section.
9. Quenched averaged dynamics of soft spin models
In the previous sections, we described how the dynamics of a set of interacting scalar variables that evolve according to a stochastic differential equation can be studied perturbatively using the path integral formalism. When the evolution equations depend on random quantities such as the Jij above, then instead of studying a single system with fixed Jij one can consider an ensemble of systems, each with a different set of interactions Jij drawn from some distribution. The resulting ensemble averages are expected to reflect the behavior of a typical sample as far as macroscopic quantities such as the average local response are concerned: these are self-averaging, i.e. to leading order dependent only on the overall statistics of the interactions in the system. Where the interactions are weak and long-ranged as in the soft-spin SK model, all spins also become equivalent so even the local responses Rii are self-averaging as we saw above. This would not be true, for example, for systems with interactions on a network with finite connectivity.
We illustrate the approach for the dynamics given by (208), with dynamic action in (210) and assuming Gaussian statistics for the Jij as specified in (216). The derivation that we outline here was first described in [4, 30] for symmetric interactions; the role of the degree of asymmetry in the interactions was analysed later in [31].
Using the fact that the average of over a vector of zero mean Gaussian random variables with covariance is , the -average (denoted by an overline) of the part of the generating functional Z that depends on the Jij reads
One can drop the restriction for large N as including the i = j terms only gives subleading corrections. A mean-field decoupling of the quartic terms gives
Here
are the average local correlation and response functions of the system. They are in principle dependent on the specific trajectory of the system, but from the law of large numbers will assume deterministic values for . This heuristic argument can be justified by a formal calculation introducing conjugate order parameters to C and R and making a saddle point argument [4, 30, 31]. As I now factorizes over sites i, so does the entire ensemble-averaged partition function . The contributions from Li give an effective noise with correlation function , and a delay term with memory kernel .
All spins have the same action, so we can drop the index i and write down the effective dynamics as
The correlation and response need to be found self-consistently from this, and to facilitate calculation of the response we have added back in the field term h(t).
Let us now take a moment to study the dynamics of the bare model (g = 0) that arises from the above equation. At g = 0, and assuming that for long times a stationary regime is reached where , the equation of motion in frequency space is
Averaging both sides over the noise term gives for the mean
From this, we find the response function
To leading order in J2 this gives , which after inverting and re-expanding to agrees with the perturbative result (218) as it should.
The critical dynamics of the system can be understood via the low frequency behavior of the response function. In fact, one can define a characteristic response time scale as
Taking the ω-derivative of both sides of (229) and multiplying by R gives
and evaluating at one finds
The response timescale can therefore be expressed as
In other words, when , the relaxation time scale of the system diverges and the system exhibits critical slowing down. From (229) one sees that the critical value of J obeys , hence . Using the explicit solution of (229), which reads
shows further that at criticality the response function has a singularity for . In the time domain this corresponds to a power law tail , which is responsible for the diverging mean timescale.
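The divergence of the timescale can be sketched numerically. Since (229) is not reproduced above, we assume for illustration that the ω-space self-consistency takes the form 1/R(ω) = −iω + μ − κJ²R(ω) with a bare 'mass' μ; the characteristic time then comes out as τ = (μ² − 4κJ²)^(−1/2), diverging as J approaches μ/(2√κ):

```python
import numpy as np

# Sketch of critical slowing down, under the assumed self-consistency
#   1/R(w) = -1j*w + mu - kappa * J**2 * R(w)
# (an illustrative stand-in for (229)).  The physical root at w = 0 and
# tau = -1j * R'(0) / R(0) give tau = (mu^2 - 4*kappa*J^2)^(-1/2).

def response_time(mu, J, kappa):
    s = np.sqrt(mu**2 - 4 * kappa * J**2)
    R0 = (mu - s) / (2 * kappa * J**2)      # root with R0 -> 1/mu as J -> 0
    # check that R0 solves the w = 0 self-consistency
    assert abs(1.0 / R0 - (mu - kappa * J**2 * R0)) < 1e-9
    return 1.0 / (mu - 2 * kappa * J**2 * R0)

mu, kappa = 1.0, 1.0
taus = [response_time(mu, J, kappa) for J in (0.1, 0.3, 0.45, 0.499)]
assert all(t2 > t1 for t1, t2 in zip(taus, taus[1:]))   # tau grows towards Jc
assert abs(response_time(mu, 0.3, kappa)
           - (mu**2 - 4 * kappa * 0.3**2) ** -0.5) < 1e-9
```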
10. Path integrals for hard spin models
So far we have considered models in which the dynamical variables took continuous values. Here we show that, with some modifications, the same approach can be used to study the dynamics of models involving Ising spins.
Consider a system of N binary spin variables, with . We consider synchronous dynamics, where time t advances in discrete steps and at each time step all spins are updated. Specifically, we assume that the spins decide their states according to the following probability distribution
If the external fields are constant in time as written above and if the couplings are symmetric, i.e. , this dynamics reaches an equilibrium state where detailed balance is satisfied and configurations are visited according to the Boltzmann distribution with the Hamiltonian [32]
where we have introduced the abbreviation and emphasize that it is the total fields , which depend on the configuration, that appear inside the log .
An alternative to synchronous dynamics that is commonly studied in the literature is continuous time Glauber dynamics [33], in which spins are updated individually and at random (exponentially distributed) time intervals. This is achieved by prescribing that in any infinitesimal time interval , every spin is flipped with probability
Again, for constant external fields and symmetric couplings an equilibrium Boltzmann distribution is reached, but this time with the familiar Ising Hamiltonian
For derivations of these equilibrium distributions see [34, 35]. Although the Glauber dynamics is the one that leads to the usual Ising Hamiltonian, in what follows we focus on the synchronous case for two reasons: (i) the path integral formulation is slightly simpler in terms of notation and (ii) the synchronous case has been the focus of recent work on the dynamics of hard spin models, including our own on the inverse problem of inferring the couplings in the model from the statistics of the spin history [36–38].
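For concreteness, here is a minimal simulation of the synchronous dynamics. Since (235) is not reproduced above, we assume the standard parallel update rule of this literature: each spin takes the value +1 with probability (1 + tanh h_i(t))/2, where h_i(t) = θ_i + Σ_j J_ij s_j(t):

```python
import numpy as np

# Minimal synchronous (parallel) kinetic Ising simulation, assuming the
# standard update rule for (235):
#   P(s_i(t+1) = +/-1) = exp(+/- h_i(t)) / (2*cosh(h_i(t))),
#   h_i(t) = theta_i + sum_j J_ij s_j(t).

def run(J, theta, steps, rng):
    N = len(theta)
    s = rng.choice([-1, 1], size=N)
    history = np.empty((steps, N))
    for t in range(steps):
        h = theta + J @ s
        p_up = 0.5 * (1.0 + np.tanh(h))            # P(s_i = +1 | h_i)
        s = np.where(rng.random(N) < p_up, 1, -1)  # all spins updated at once
        history[t] = s
    return history

# Sanity check: with J = 0 the stationary magnetization is tanh(theta).
rng = np.random.default_rng(0)
N, steps = 200, 1000
theta = 0.5 * np.ones(N)
hist = run(np.zeros((N, N)), theta, steps, rng)
m = hist[100:].mean()                              # discard the transient
assert abs(m - np.tanh(0.5)) < 0.02
```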
10.1. Path integral formulation
The generating functional for the dynamics is defined as
where the average denoted by is over the distribution of trajectories generated according to the probability distribution (235) and the ψ are fields that allow one to obtain the statistics of the spin trajectories by taking derivatives. The key to writing a path integral representation of the generating functional in this case is to work with the local fields hi(t), which are continuous for , and not the original spin variables:
where is the interaction matrix, indicates a sum over all spin trajectories and similarly an integral over all field trajectories . The discrete time variable range is where T is the final time.
The delta function in (241) is introduced to enforce the definition (236) and can be written as a Fourier transform, leading to
This expression can be used in several ways. One is to average over the distribution of the couplings; the second is to derive mean-field equations for the system. We do not go into the first route here, i.e. the quench-averaged dynamics of the system, as it is both very involved and discussed in detail elsewhere [39]. However, as an example of how one can use the path integral formulation (242) of the generating functional, we consider in the following subsections the simple saddle point approximation to the path integral as well as possible improvements to this.
10.2. Saddle point
To derive the saddle point equations, we first perform the trace in (242), which for a single timestep involves
Using this we can write
The saddle point equations for stationarity of S with respect to the hi(t) and are then
These equations can be written in a simpler form in terms of the magnetizations in the system biased by ψ, which are generally given by . In the saddle point approximation , and because S is stationary with respect to h and ,
Here we have used 's' superscripts to denote saddle point values and introduced as a convenient abbreviation for later use. The saddle point equations then simplify to
For the physical dynamics we want the solution at ψ = 0, which we denote with a '0' superscript. One can show that this has . Indeed, if one considers the dynamics over a finite number of timesteps , then both and are defined over the range and accordingly one finds that in the first saddle point equation (248) taken at ψ = 0, the -term inside the tanh is absent. For this equation then dictates , and working backwards in time from there one finds recursively for all t. Accordingly (251) and (252) simplify to the standard mean field equations for the magnetizations,
Note that the saddle point value of the action (247) is then zero for ψ = 0, as expected from the normalization condition .
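The standard mean-field equations (253) and (254) can be iterated forward in time directly. A minimal sketch, with the inverse temperature absorbed into the couplings and fields, and all names illustrative:

```python
import numpy as np

def mean_field_trajectory(J, theta, m0, T):
    """Iterate the naive mean-field update m(t+1) = tanh(theta + J m(t))
    for T steps, returning the whole magnetization trajectory.

    Sketch of the saddle point result; beta is absorbed into J and theta.
    """
    m = np.asarray(m0, dtype=float)
    traj = [m]
    for _ in range(T):
        m = np.tanh(theta + J @ m)  # each spin feels the mean field of the others
        traj.append(m)
    return np.array(traj)           # shape (T+1, N)

# demo: with J = 0 and theta = 0 any initial magnetization relaxes in one step
traj = mean_field_trajectory(np.zeros((2, 2)), np.zeros(2), [0.8, -0.3], 5)
```

Since the update stays inside the tanh, these naive mean-field magnetizations always remain in (−1, 1).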
10.3. Beyond the saddle point: naive approach
One can go beyond the saddle point approximation and take into account the Gaussian fluctuations around the saddle point to obtain a better estimate of the generating functional and hence of the equations of motion for the mi(t). The Gaussian corrections to the log generating functional are
where indicates the deviation of h from its saddle point value. We denote by the matrix of the second derivatives of S with respect to h and calculated at the saddle point; this has entries
From the corrected log generating functional we obtain a corrected expression for the magnetization
To calculate the ψ-derivative of A at we can either appeal to numerical methods or evaluate A approximately. One relatively simple approximation, which will give us an intuitive feeling for the corrections and turns out to capture the lowest order corrections in J, is to keep only the equal-time elements of the matrix , i.e. to discard the second term in (257). The matrix then separates into blocks corresponding to the different timesteps t. Each of those blocks is of size . Let us order the elements so that the top left N × N sub-block contains the entries from (256). This block, which we will denote by , is diagonal and vanishes at the physical saddle point, where and . The two off-diagonal blocks (257) are within our approximation. If we call the bottom right N × N block , then we have by a standard determinant block identity
Near the physical saddle point is small so we can linearize the logarithm to get
Linearizing then also the diagonal entries of that appear in the first square bracket, and accordingly setting the last factor to its value at the physical saddle point gives
When we take the derivative of this with respect to , we do in principle get a term from the dependence of on ψ. But as is multiplied by a factor of , this will give us a contribution to the derivative that is of higher order in J than the main term from the explicit dependence on . Discarding this higher order contribution—as our second approximation in addition to neglecting correlations between different time steps in —yields for the derivative
Here we have implicitly already set because we linearized around the physical saddle point throughout. Using this result in (259), we obtain the corrected magnetization as
This is the result of our naive approach to including Gaussian corrections in the saddle point integral. Note that because of the normalization , we do not expect Gaussian corrections to at the physical saddle point. The approximation we made in evaluating as a sum of local-in-time terms preserves this requirement: as is clear from (260) given that when .
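To see concretely why a correction added outside the tanh can push magnetizations out of the physical range, consider the following schematic form of a naively corrected update, obtained by linearizing a TAP-like update in the correction. Here `onsager` is a single number standing in for the Onsager-type sum over neighbours; the function and the specific numbers are illustrative, not taken from the text.

```python
import numpy as np

def naive_corrected(g, onsager):
    """Schematic naively corrected magnetization: the Onsager-like term is
    added *outside* the tanh, multiplying the linearized response
    (1 - tanh^2 g). Illustrative form, in the spirit of (265)."""
    t = np.tanh(g)
    return t - (1.0 - t**2) * t * onsager

# a weak correction stays physical, a strong one does not
m_weak = naive_corrected(np.arctanh(0.5), 0.1)    # 0.5 - 0.75*0.5*0.1 = 0.4625
m_strong = naive_corrected(np.arctanh(0.5), 6.0)  # 0.5 - 0.75*0.5*6.0 = -1.75
```

For strong couplings the correction term can dominate and drive the "magnetization" far outside [−1, 1], which is the pathology discussed in the next subsection.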
10.4. Beyond the saddle point: Plefka and Legendre transform approaches
It is instructive to compare (265) with the so-called dynamical TAP equations for our system [12]
which for stationary magnetizations reduce to the better-known equilibrium TAP equations [40, 41]. The negative term inside the hyperbolic tangent is known as the Onsager correction, which improves on the naive estimate of the mean effective field acting on a spin.
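Because the Onsager term in (266) involves the corrected magnetization at the new time step itself, each time step requires solving an implicit equation; a short inner fixed-point loop suffices in practice. A sketch with illustrative names, β absorbed into J and θ, and the inner-loop strategy being our implementation choice rather than something prescribed by the text:

```python
import numpy as np

def tap_step(m, J, theta, inner_iters=50):
    """One step of dynamical-TAP-like equations (sketch, illustrative names):

        m_i(t+1) = tanh( theta_i + sum_j J_ij m_j(t)
                         - m_i(t+1) * sum_j J_ij**2 * (1 - m_j(t)**2) )

    The implicit dependence on m(t+1) is resolved by a short inner
    fixed-point loop. Since the Onsager correction sits inside the tanh,
    the result is always in (-1, 1), unlike the naive correction.
    """
    g = theta + J @ m                 # naive mean field
    V = (J**2) @ (1.0 - m**2)         # Onsager coefficient for each spin
    m_new = np.tanh(g)                # initial guess: naive mean-field value
    for _ in range(inner_iters):
        m_new = np.tanh(g - m_new * V)
    return m_new

# demo: fairly strong random couplings, magnetizations stay physical
rng = np.random.default_rng(2)
N = 5
J = rng.standard_normal((N, N)) / np.sqrt(N)
np.fill_diagonal(J, 0)
m1 = tap_step(np.full(N, 0.5), J, np.zeros(N))
```

Even when the Onsager term is large, every iterate is the output of a tanh and so remains in the physical range.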
One now observes that the term inside the curly brackets on the right-hand side of our simpler corrected equation (265) is exactly of the form of this Onsager correction, so there is a close relation between the two approaches. However, there are two significant differences. First, while the Onsager term in (266) appears inside the tanh, in (265) it is outside, as if we had linearized the tanh in the correction. This is important: it means there is nothing on the right-hand side of (265) to stop the magnetizations from going outside the physical range [−1, 1]. The origin of this difference is the fact that while the dynamical TAP method corrects the naive estimate of the field acting on a spin, the Gaussian fluctuation approach described in the previous subsection directly estimates corrections for the magnetizations themselves.
The second difference between (265) and (266) is that in the latter it is the corrected magnetizations themselves that appear in the Onsager term, making the approximation self-consistent. In our naive method, on the other hand, the Onsager term is evaluated at the uncorrected saddle-point magnetizations.
The dynamical TAP equations (266) can be derived by the Plefka approach [25]. This is an elegant method that captures corrections to the effective fields as explained above, ensuring in contrast to (265) that the magnetizations remain in the physical range even when corrections to the naive mean field estimate are large. This is achieved by working not with the generating functional directly (or the Helmholtz free energy in the equilibrium case), but with its Legendre transform with respect to the magnetizations (the analogue of the equilibrium Gibbs free energy). In this way, one essentially approximates the system with an independent spin model in which each spin feels an effective field . This leads to expressions of the form for the magnetizations, which therefore always lie between −1 and 1. The effective fields are determined so as to best match the Gibbs free energy of the original model, by a perturbation expansion to second order in J.
Interestingly, the philosophy of working with the Legendre transform can also be applied to the approach we have described above, which initially has a rather different starting point, namely a saddle point approximation with Gaussian corrections. We show this in the Appendix, where we demonstrate that by switching to the Gibbs free energy and keeping only first order terms in the corrections Ai one can retrieve exactly the dynamical TAP equations. We refer the reader to [42, 43] for applications of the same idea in the context of the equilibrium Ising and Potts models.
11. Summary and discussion
Path integral methods and the associated perturbative and diagrammatic tools are among the most powerful methods used in almost every part of theoretical physics. Disordered systems, such as spin glasses, neural networks, models of the immune system etc are no exception and, over the years, path integrals have played a significant role in the study of equilibrium and out-of-equilibrium properties of these systems. In disordered systems, by definition, interactions in a single system do not have any trivial structure; there is symmetry only in their statistics, as described by the distribution from which they are drawn. These systems are typically subject to external stochastic forces with both rapidly changing and quenched random components. Furthermore, these sources of randomness are expected to play crucial roles in the physics and should therefore be included explicitly in studying such systems. The path integral and diagrammatic methods treated in textbooks on field theoretic tools for other physical systems usually do not deal with disordered systems. Therefore, we tried in this paper to describe some of the key field theoretic and path integral techniques in a manner that can be applied directly to disordered systems.
We started by studying the dynamics of a scalar degree of freedom evolving according to a Langevin equation and showed how one can do perturbative calculations for this model and represent them in diagrammatic form. We then discussed the supersymmetries of the action that features in the path integral description of this system, the physical intuition and meaning behind them, and how they can help in doing diagrammatic perturbation theory by reducing the graphs to be considered to those in simpler, well-understood equilibrium calculations. Our next step was to study the dynamics of systems of interacting 'soft-spin' variables subject to Langevin equations with random interactions, first using a perturbative treatment for a single sample of the interactions, and then via a conceptually different approach based on averaging the generating function in its path integral formulation over the disorder.
Finally we switched to hard-spin models, focusing on the Ising model with synchronous update dynamics, for a single sample with arbitrary couplings. Here, as opposed to the way we treated soft-spin models, the path integral is not written in terms of the spins directly but in terms of the fields acting on them. We discussed how, from such a path integral formulation, one can derive approximate equations for the mean magnetizations at the saddle point level and compute the Gaussian fluctuation corrections around the saddle point. We showed that a naive calculation of these corrections yields equations of motion for the magnetizations that can lead to unphysical predictions. These issues can be avoided by going to improved approximation schemes like dynamical TAP equations, as described recently in other papers using a path integral formulation [12, 44], as well as several alternative approaches [38, 40, 45]. We closed the paper with the intriguing observation, however, that issues with naively corrected mean-field equations can also be cured within the general approach presented in this paper, using a Legendre transform to switch to the Gibbs free energy. It is tempting to infer that the Legendre transform implicitly achieves a resummation of the most important diagrams, but we have not been able to show this explicitly and leave this point as an open question for future work.
Although we have tried to cover what we view as the key concepts in this review, we have had to leave out several important issues to maintain a coherent focus. We list two of these in the following.
The use of dynamical models in inference. The path integral approach described here is designed initially for a forward problem: given the interactions in a system (as well as external fields etc where relevant), predict the dynamics of the order parameters of interest. In recent years, there has been strong interest in the inverse problem: given observations of the order parameter dynamics, find the interactions; for a review see e.g. [46]. This interest has been generated by recent advances in recording technology in various fields of science, allowing massive data sets to be gathered that invite researchers to attempt to reverse-engineer the underlying systems from the observed data.
The path integral formulation for forward problems described here is a natural first step in finding inverse equations, which in the simplest case consist of an inversion of the forward equations. Furthermore, in some cases, the path integral method can be immediately applied to the inverse problem itself. For instance, in the presence of hidden degrees of freedom, when trying to reconstruct the interactions in a spin model by observing only some of the spins and not the others, then calculating the likelihood of the data involves tracing over the trajectory of the hidden spins. This can then be done using the path integral methods discussed in this review [47].
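For the synchronous kinetic Ising model with fully observed spins, the inverse problem can be attacked by maximum likelihood: the log-likelihood of an observed spin history factorizes over spins and time steps, and its gradient with respect to the couplings has the simple closed form dL/dJij = Σt [si(t+1) − tanh hi(t)] sj(t). A sketch of one gradient-ascent step, with illustrative names and learning rate, and β absorbed into J and θ:

```python
import numpy as np

def kinetic_ising_ml_step(S, J, theta, lr=0.1):
    """One gradient-ascent step on the log-likelihood of a spin history S
    (array of shape (T+1, N), entries ±1) under the synchronous model.

    Gradient (illustrative names, per-sample average):
        dL/dJ_ij     ~ mean_t [ s_i(t+1) - tanh(h_i(t)) ] * s_j(t)
        dL/dtheta_i  ~ mean_t [ s_i(t+1) - tanh(h_i(t)) ]
    with h(t) = theta + J s(t).
    """
    S_prev, S_next = S[:-1], S[1:]
    H = S_prev @ J.T + theta          # fields at each observed time step
    err = S_next - np.tanh(H)         # prediction error per spin and step
    grad_J = err.T @ S_prev / len(S_prev)
    grad_theta = err.mean(axis=0)
    return J + lr * grad_J, theta + lr * grad_theta

# demo: one update from a random ±1 history, starting from zero parameters
rng = np.random.default_rng(3)
T, N = 20, 4
S = rng.choice([-1, 1], size=(T + 1, N)).astype(float)
J1, th1 = kinetic_ising_ml_step(S, np.zeros((N, N)), np.zeros(N))
```

With hidden spins, by contrast, this likelihood involves a trace over the unobserved trajectories, which is where the path integral machinery of this review becomes essential [47].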
The extended Plefka expansion. In a spin system at equilibrium, the magnetization of each spin is really the only parameter of interest, with spin–spin correlations then determined indirectly from the linear response of the system. This is not the case for out-of-equilibrium dynamics. Here out-of-equilibrium correlation and response functions need to be calculated in addition to the magnetizations in order to achieve a full statistical description of the system. So while at equilibrium and also in the simplest form of the dynamical TAP approach for Ising models [12] it makes sense to perform a Plefka expansion by fixing only the magnetization (and in the dynamics, the conjugate fields), in the full dynamical treatment response and correlation functions should also be fixed as order parameters. This is even more important for soft spin models, where even at equilibrium one would want to use at least the means and variances of the local degrees of freedom as order parameters. Such 'extended' Plefka expansions have been used for the so-called p-spin spherical spin glass model in [11] and, more recently, for general systems of coupled Langevin equations [48] and Ising spin glasses [44]. They should be a productive route for further progress in the field, e.g. by extending them further to incorporate inference from observed data [49].
Other interesting current work in this field includes that of Cooper and Dawson [50], Polettini [51], and Nemoto et al [52]. These lie beyond the scope of our elementary treatment of the subject, but we hope that our introduction will equip readers with the tools to explore these and other more advanced problems.
Acknowledgments
This work has been partially supported by the Marie Curie Initial Training Network NETADIS (FP7, grant 290038). YR was also financially supported by the Kavli Foundation, Norwegian Research Council, Center of Excellence Scheme (grant number 223262) and a Starr Foundation fellowship at the Simons Center for Systems Biology, Institute for Advanced Study. The authors are also grateful to Luca Peliti for fruitful discussions. PS acknowledges the stimulating research environment provided by the EPSRC Centre for Doctoral Training in Cross-Disciplinary Approaches to Non-Equilibrium Systems (CANES, EP/L015854/1).
Appendix
The Legendre transform approach starts from the saddle point value of the log generating functional
We first rewrite this expression in a form that will simplify some of the algebra below, by decomposing the function in (A.1) as
Here is simply the entropy of a single spin with magnetization x. This decomposition gives, bearing in mind the definition of in (250),
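For reference, the entropy of a single spin with magnetization x referred to here is the standard binary entropy; writing it as S(x) (a label we introduce for this sketch only),

```latex
S(x) = -\frac{1+x}{2}\,\ln\frac{1+x}{2} \;-\; \frac{1-x}{2}\,\ln\frac{1-x}{2} .
```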
The second sum in the second line now cancels the second sum in the first line because of the saddle point equation (252), and we end up with the expression
The Legendre transform of this expression with respect to the magnetizations mi(t) is
where the ψ are to be expressed in terms of the m by solving
We already know that , so at this saddle point level one has . The second saddle point equation (252) then becomes and inserting this gives
The equation of state is, by standard Legendre transform properties,
which yields
At ψ = 0 this is naturally solved by . So far we have not achieved anything new: we have merely provided an alternative way of obtaining the naive mean-field equations (253) and (254) we already had.
Now let us consider the Legendre transform of the generating functional including the Gaussian correction,
where as before ψ is to be treated as a function of the magnetizations, as determined by the condition
Here we have used the earlier definition .
Expressing the right-hand side of (A.11) in terms of m, and keeping only terms linear in the Ai, we obtain, as shown below,
Remarkably, all the Ai terms have cancelled here and the only difference to the saddle point result in (A.8) is the naive addition of the correction term A, expressed in terms of m.
From we can now derive the equation of state for the magnetizations from
If we are only interested in corrections to quadratic order in J for the field acting on each spin, we can drop the terms in the second line of the equation above as the term in square brackets is already . In the physical limit one then obtains
We can now appreciate the role played by the Legendre transform: the corrections we obtain act on the effective fields and so are inside the . As explained in the main text this makes more physical sense.
To evaluate , one can start from the expression (262) for A. Replacing in this using the saddle point equation (252) gives
As A is already , replacing all by mi(t) in this expression only gives a negligible correction of . Also the first square bracket is small at the physical saddle point, of , so the derivative of the final factor with respect to mi(t) can be dropped. Finally the derivative of the tanh can also be neglected as it is of order J. One thus finds to leading order the simple result
Combined with (A.16) this yields the dynamical TAP equation (266).
It remains to show (A.13). Using the expression (A.5) for and in (A.11), we can write Γ as
We have shifted the time index t by one in the second and fourth sum for later convenience. Now we express the right-hand side of (A.19) in terms of m and keep only terms linear in Ai. For the last sum we need the following identity, which can be obtained from the definition of in (250) together with :
To linear order in Ai, the various terms on the right-hand side of (A.19) are then
Putting all these together we notice that the second term on the right-hand side of the first equation above and the first term of the last equation, the second terms in the second and last equations, as well as the last terms in the second and last equations cancel each other, yielding
Using the fact that from the first saddle point equation (251)
we can write the terms in the second line of (A.25), together with the first term in the third line, as
As the term in curly braces is itself a correction term that is non-zero only because of the difference between and mi(t), this overall expression can be neglected as subleading. The remaining terms of (A.25) then give exactly (A.13) as claimed. Note that keeping only terms linear in Ai in this calculation—which then eventually cancel—in some sense plays the same role as expanding to second order in J in the Plefka method, as Ai is of order J².
Footnotes
- 6
We use brackets, , to denote averages over all noise histories and, where relevant (as in equation (3)), initial conditions. Whether the initial condition average is included or not will be clear from the context. Other kinds of brackets, with subscripts, that appear later in the paper will denote averages over specific distributions that appear in the theoretical development.
- 7
Digression on equilibrium statistical mechanics: equilibrium averages can be evaluated by a diagrammatic expansion very similar to the one here. The main difference is that the partition function (whose perturbative expansion is the same as the one for above) is not automatically normalized. Averages are thus written as and the denominator, when expanded, ensures again that disconnected diagrams vanish. Equivalently, one can think of averages as derivatives of the log partition function; the perturbative expansion for consists of just the connected diagrams from the expansion of Z.
- 8
It is common in the literature to use a different symbol, such as a cross, to represent the external field. Here, in the interest of a more uniform notation, we stick with dots, and the number of lines connected at a dot indicates whether it means h, g, or a coupling of some still-different order, such as the cubic vertex in (93).
- 9
We could of course add any term proportional to to without changing the action, i.e. we could put any coefficient a in front of the term in (193) and the action would remain unchanged. Choosing a = 2 would correspond to including the anticommutator term from (191). However, the choice a = 1, i.e. , will prove convenient when we want to invert this operator to find the unperturbed superfield correlation function.