On the Interpretation of Stratonovich Calculus

The Itô-Stratonovich dilemma is revisited from the perspective of the interpretation of Stratonovich calculus using shot noise. Over the long time scales of the displacement of an observable, the principal issue is how to deal with finite versus zero autocorrelation of the stochastic noise. The former (non-zero) noise autocorrelation structure preserves the normal chain rule using a mid-point selection scheme, which is the basis of Stratonovich calculus, whereas the instantaneous autocorrelation structure of Itô's approach does not. By considering the finite decay of the noise correlations on time scales very short relative to the overall displacement times of the observable, we suggest a generalization of the integral Taylor expansion criterion of Wong and Zakai (1965) for the validity of the Stratonovich approach.


Introduction
Stochastic dynamical models are basic to the understanding of the role of random forcing in a wide range of scientific and engineering systems (e.g. [2-4]). The central theoretical approaches arise from the Einstein and Langevin studies of Brownian motion [5,6], which provide mathematically different but physically equivalent and complementary descriptions of the fate of a body (a pollen particle in water observed under a microscope) under the influence of random non-deterministic collisions (water molecules). Einstein determined the time evolution of the probability density of particles by solving the Fokker-Planck equation, whereas Langevin wrote down an explicit deterministic momentum equation for a particle augmented by a Gaussian white noise forcing which perturbs the particle trajectory. The approach of Langevin now constitutes a canonical 'stochastic differential equation' (SDE) for a daunting scope of systems, but since its introduction the mathematical and physical interpretation of the noise term has been discussed and debated.
Whether viewing the problem from configuration space through the solution of the Fokker-Planck equation, solving a Langevin equation, or generating statistical realizations by performing Monte-Carlo simulations of the particles under consideration (e.g. [7]), a consistent interpretation and calculational scheme for the noise structure is needed. The approach developed by Itô [8] rests upon the Markovian and Martingale properties; the former captures the concept of a 'memoryless' process, wherein the conditional probability distribution of future states depends solely on the present state, and the latter that, given all prior events, the expectation value of future stochastic events equals the present value. These properties have the advantage of simplifying many complicated time integrals but the disadvantage of requiring a new calculus which does not obey the traditional chain rule. In contrast, the approach of Stratonovich [9] does not invoke the Martingale property; it preserves the chain rule and allows white noise to be treated as a regular derivative of a Brownian (or Wiener) process, $W_t$.
A model SDE (to which we shall return later) for a variable x(t) is

$$\mathrm{d}x = a(x,t)\,\mathrm{d}t + b(x,t)\,\mathrm{d}W_t, \qquad (1)$$

wherein the first term is 'deterministic' and the second term is 'stochastic'. The case in which $b(x,t)$ is constant is referred to as additive noise and when it depends on $x = x(t)$ it is multiplicative noise. Now, despite being able to integrate equation (1) in a formal sense as

$$x(t) - x(0) = \int_0^t a(x,s)\,\mathrm{d}s + \int_0^t b(x,s)\,\mathrm{d}W_s, \qquad (2)$$

the stochastic integral must itself be defined as the limit of a Riemann sum,

$$\int_0^t b(x,s)\,\mathrm{d}W_s = \lim_{n\to\infty} \sum_{k=0}^{n-1} b\big(x(\tilde{t}_k)\big)\big[W_{t_{k+1}} - W_{t_k}\big], \qquad \tilde{t}_k \in [t_k, t_{k+1}]. \qquad (3)$$

The issues are (i) the Brownian process $W_t$ is nowhere differentiable, and because $\mathrm{d}W_t/\mathrm{d}t$ is δ-autocorrelated, in any interval on the real line the white noise it represents fluctuates an infinite number of times with an infinite variance, and (ii) the defining limit of the integral in equation (3) depends on the place in $[t_k, t_{k+1}]$ where $\tilde{t}_k$ is chosen; the choice that provides the seed of the dilemma and the origin of the two different calculi.
In the Itô approach the choice is $\tilde{t}_k = t_k$, which maintains the Martingale property due to the fact that $t_k$ is the present value in the integrand, thereby forcing the expectation value in equation (3) to take the present value. In contrast, the Stratonovich approach chooses the mid-point $\tilde{t}_k = (t_k + t_{k+1})/2$, which preserves the normal rules of calculus. Stratonovich referred to his choice as a 'symmetrization' between past and future. Despite its wide usage, we have not found a physical interpretation of the basis of the past/future symmetrization of Stratonovich calculus. The outline of this paper is as follows. In the next section we summarize our approach and discuss its connection to the dilemma generally and to other studies. We then outline the conventional physical science viewpoint and the perspective of stochastic calculus, before coming to our main point. Finally, we discuss the relationship to other approaches before concluding.
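The sensitivity of the Riemann sum to the evaluation point can be checked numerically. The following sketch (Python with NumPy; it adopts the common endpoint-average form of the Stratonovich sum, which is equivalent in the limit to the mid-point time choice) evaluates the stochastic integral of $W$ against $\mathrm{d}W$ on a single sampled path. Two exact algebraic identities make the contrast sharp: the averaged sum telescopes to $W_T^2/2$, while the left-point sum equals $(W_T^2 - \sum_k \Delta W_k^2)/2$, and the quadratic variation $\sum_k \Delta W_k^2$ converges to $T$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample a Brownian path W on [0, T] with n steps.
T, n = 1.0, 100_000
dW = rng.normal(0.0, np.sqrt(T / n), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))

# Riemann sums for the stochastic integral of W against dW:
ito = np.sum(W[:-1] * dW)                    # evaluate at the left endpoint t_k
strat = np.sum(0.5 * (W[:-1] + W[1:]) * dW)  # average of the two endpoints

# Exact identities, valid on any sampled path:
#   strat = W_T^2 / 2                  (the sum telescopes)
#   ito   = (W_T^2 - sum(dW^2)) / 2    (and sum(dW^2) -> T as n grows)
qv = np.sum(dW**2)
print(strat - W[-1]**2 / 2)        # ~ 0 up to roundoff
print(ito - (W[-1]**2 - qv) / 2)   # ~ 0 up to roundoff
print(qv)                          # close to T
```

The difference between the two sums is therefore $T/2$ in the limit, which is precisely the Itô-Stratonovich discrepancy.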

A brief comparison and contrast
Wong and Zakai [1] argued that in any real-world system perfect white noise does not exist, and that Brownian motion $x(t)$ is approximated by a sequence of smooth processes $x_n(t)$; as $n \to \infty$ they recovered Stratonovich calculus. Accordingly, the choice of stochastic calculus resides in the characteristics of the noise and continuity arguments. In finance, it is argued that short time-scale processes are truly discontinuous and thus Itô calculus is preferred (e.g. [10]), thereby maintaining the Martingale property. However, in physics the continuous motion of Brownian particles influenced by high-frequency white noise has long been considered within the framework of normal calculus [11]. Conceptually, there is no clear distinction between the statistics of water molecules colliding with pollen grains and the trading of options or stocks. Hence, it remains an open exercise to answer the questions of when, how and whether to use continuity considerations as a core criterion to choose between the calculi discussed here.
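The Wong-Zakai mechanism can be made concrete with a minimal sketch (Python; it assumes the simplest smoothing, a piecewise-linear interpolant $W_n$ of a sampled Brownian path). Within each segment $\mathrm{d}x/\mathrm{d}t = x\,\mathrm{d}W_n/\mathrm{d}t$ is an ordinary linear ODE, so $x$ is multiplied by $\exp(\Delta W_k)$ exactly, and the smooth-noise solution is $x_0 e^{W_T}$: the Stratonovich solution of $\mathrm{d}x = x\,\mathrm{d}W$, not the Itô solution $x_0 e^{W_T - T/2}$.

```python
import math
import numpy as np

rng = np.random.default_rng(1)

# Brownian increments on [0, T]; the piecewise-linear interpolant W_n has
# constant slope dW_k / (T/n) on each segment.
T, n, x0 = 1.0, 1000, 1.0
dW = rng.normal(0.0, math.sqrt(T / n), size=n)

# On each segment the smooth-noise ODE dx/dt = x * (slope) is a regular
# linear ODE, so x multiplies by exp(dW_k) exactly.
x = x0
for dw in dW:
    x *= math.exp(dw)

WT = dW.sum()
strat_solution = x0 * math.exp(WT)          # Stratonovich: x = x0 exp(W_T)
ito_solution = x0 * math.exp(WT - T / 2.0)  # Ito: x = x0 exp(W_T - T/2)

print(abs(x - strat_solution))  # ~ 0: smooth noise recovers Stratonovich
print(x / ito_solution)         # the Ito solution differs by a factor exp(T/2)
```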
As noted in the introduction, the complementary approaches of Einstein and Langevin, with a reliance on δ-autocorrelated noise, provide consistent testable predictions of Brownian motion. The combination of this and the additional consistency with Stratonovich calculus led to the suggestion that physical scientists should avoid Itô calculus [12]. Although Van Kampen [13] cautiously suggested that the Langevin equation is intrinsically insufficient for representing systems with internal noise, it is still the case that Stratonovich calculus is appropriate for both internal and external noise, and there is an enormous number of problems in which one cannot make this distinction.
Here we focus on an ostensibly physical argument to discuss the origin of Stratonovich calculus. We generalize the theorem of Wong and Zakai [1] using the integral Taylor expansion and appealing solely to the $L^2$ integrability of a function, thereby avoiding discussions of the regularity of a stochastic process. Because the principal difference between the Itô-Langevin and the Stratonovich-Langevin formulations lies in the drift term, we focus on this in our approach and examine the intrinsic nature of short time-scale processes approximated by white noise to decide which calculus is more appropriate. As opposed to Turelli [14], who took as a continuous deterministic model an approximation of a discretized model (in population dynamics), we begin with a well-defined deterministic model.

The conventional physical science perspective
The term 'white noise' refers to a random signal $\Gamma(t)$ in which all frequencies contribute equally to the power spectral density. For a general stationary random process the autocorrelation is

$$C(\tau) = \langle \Gamma(t)\Gamma(t+\tau) \rangle, \qquad (4)$$

where $\langle \cdot \rangle$ is the ensemble average, and $C(\tau)$ is by definition independent of $t$ and symmetric/even in $\tau$. In science and engineering the traditional form of the Langevin equation, slightly different from that in equation (1), is written as

$$\frac{\mathrm{d}x}{\mathrm{d}t} = a(x,t) + b(x,t)\,\Gamma(t), \qquad (5)$$

where the noise forcing is δ-autocorrelated as

$$\langle \Gamma(t)\Gamma(t') \rangle = \sigma^2 \delta(t - t'). \qquad (6)$$

Following Risken [15], we assume that the time increment $\tau$ is small relative to the time over which the macroscopic system evolves. Inserting (6) into (5), we can treat equation (5) iteratively over $[t, t+\tau]$ and then take the ensemble average, which is

$$\langle x(t+\tau) - x(t) \rangle = a(x,t)\,\tau + b\,\frac{\partial b}{\partial x} \int_t^{t+\tau}\mathrm{d}t' \int_t^{t'}\mathrm{d}t''\,\langle \Gamma(t')\Gamma(t'') \rangle + O(\tau^2). \qquad (7)$$

By the definition of white noise discussed above, the δ-function in $\langle \Gamma(t')\Gamma(t'') \rangle$ sits at the upper endpoint $t'' = t'$ of the inner integral, so only half of its contribution is considered and the value of the inner integral becomes

$$\int_t^{t'} \delta(t' - t'')\,\mathrm{d}t'' = \frac{1}{2}. \qquad (8)$$

What is the physical meaning of the half contribution of the delta function? The entire interval contributes unity. We shrink the interval near $t'$ to $(t' - \epsilon,\, t' + \epsilon)$, where $\epsilon$ can be interpreted as the size of a subinterval in the Riemann summation discussed above in distinguishing the stochastic calculi. Because the delta function is even,

$$\int_{t'-\epsilon}^{t'} \delta(t' - t'')\,\mathrm{d}t'' = \int_{t'}^{t'+\epsilon} \delta(t' - t'')\,\mathrm{d}t'' = \frac{1}{2}.$$

Despite this being just a simple mathematical modification, we can interpret it as a representation that the weighting from the past is equal to that from the future in the neighbourhood of $t'$. The origin of this symmetry comes from the time symmetry of the autocorrelation function $C(\tau)$. We will connect this half contribution of the autocorrelation to the Stratonovich integral below, but we note now that the above development yields

$$\langle x(t+\tau) - x(t) \rangle = \left[ a(x,t) + \frac{\sigma^2}{2}\, b\,\frac{\partial b}{\partial x} \right]\tau,$$

which is exactly the same as that obtained from Stratonovich calculus.
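The half contribution at the endpoint can be seen with a short numerical sketch (Python with NumPy), in which the δ-function is represented by a narrow, even Gaussian whose width is much smaller than the integration interval $\epsilon$ (both width values here are arbitrary choices):

```python
import numpy as np

# A narrow, even "nascent" delta function: a Gaussian of width << eps.
def delta_approx(s, width):
    return np.exp(-s**2 / (2.0 * width**2)) / (width * np.sqrt(2.0 * np.pi))

width, eps = 1e-4, 1e-2
s = np.linspace(0.0, eps, 200_001)  # only the one-sided interval [t', t'+eps]
vals = delta_approx(s, width)
ds = s[1] - s[0]
half = ds * (vals.sum() - 0.5 * (vals[0] + vals[-1]))  # trapezoidal rule

print(half)  # close to 1/2: the endpoint picks up half of the delta function
```

By the evenness of the Gaussian, integrating over $(t'-\epsilon, t')$ instead gives the same value, which is the past/future symmetry referred to above.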

The stochastic calculus perspective
Mathematicians have long questioned the validity of the conventional definition of white noise because Brownian motion $W_t$ is nowhere differentiable. Namely, defining $\Gamma(t) = \mathrm{d}W_t/\mathrm{d}t$ is poorly grounded in normal calculus, and we must revisit the interpretation of equation (5) in light of the mathematical definition of Brownian motion $W_t$: its increments $W_t - W_s$ are independent Gaussian random variables with zero mean and variance $t - s$.
Therefore, equation (5) can be rewritten as

$$x(t+\tau) - x(t) = \int_t^{t+\tau} a(x,s)\,\mathrm{d}s + \int_t^{t+\tau} b(x,s)\,\mathrm{d}W_s.$$

On the one hand, using this integral form avoids the issue of the differentiability of $W_t$, but this comes at the cost of having to introduce a different sort of calculus. However, before coming to the issue of the implementation of a particular stochastic calculus, we perform the same expansion as discussed previously in (7) and (8), which leads to

$$x(t+\tau) - x(t) = a(x,t)\,\tau + b(x,t)\,(W_{t+\tau} - W_t) + b\,\frac{\partial b}{\partial x} \int_t^{t+\tau}\int_t^{t'} \mathrm{d}W_{t''}\,\mathrm{d}W_{t'} + \cdots. \qquad (14)$$

The first term, $a(x,t)\,\tau$, can be interpreted as the deterministic response to macro-scale forcing, and the second term embodies the interaction between a macroscopic quantity $b(x,t)$ and the microscopic accumulation of random noise forcing. On a time $\tau$ that is short from the macroscopic perspective, the interaction between the two scales manifests itself multiplicatively. The contribution of the micro-scale to the change of the macroscopic quantity $x$ appears as a double integral of the Brownian motion. It is at this point that we must decide upon a specific calculus, and this choice determines the value of the double integral in equation (14).
The double integral can be written as $\int_t^{t+\tau} (W_{t'} - W_t)\,\mathrm{d}W_{t'}$ and approximated by a Riemann sum as $\sum_k (W_{\tilde{t}_k} - W_t)(W_{t_{k+1}} - W_{t_k})$. As should be intuitive from the previous discussion, the confusion lies in the choice of $\tilde{t}_k$ for $W_{\tilde{t}_k}$. As can be seen below, this choice determines the average of the integral. Itô [8] argued that by choosing $\tilde{t}_k = t_k$ the most important characteristic preserved is the Martingale property. This implies that the expectation value of a future event is equivalent to that of the present value in the subinterval $[t_k, t_{k+1}]$. Hence, the expectation value of the double integral is zero. The disadvantage of Itô calculus is that the chain rule of normal calculus is lost; one needs additional terms to preserve the Martingale property during integration. On the other hand, Stratonovich [9] chose $\tilde{t}_k = (t_k + t_{k+1})/2$, for which the expectation value of the integral is $\tau/2$. This approach preserves the chain rule, implying that Stratonovich calculus must be used when we approximate noise processes that are continuous and have piece-wise continuous derivatives (e.g. [1]). When we approximate the rapid fluctuations of a stochastic process as white noise, we must preserve the normal rules of calculus if we are to capture the dynamics of Brownian motion.
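The two expectation values of the double integral can be illustrated with a short Monte Carlo sketch (Python with NumPy; the path count and step number are arbitrary choices). Sampling $W$ on a grid twice as fine makes the value of $W$ at the mid-point times directly available:

```python
import numpy as np

rng = np.random.default_rng(2)

tau, n, paths = 1.0, 100, 10_000
dt = tau / n

# Simulate W on a grid twice as fine, so W exists at the midpoints too.
dW_fine = rng.normal(0.0, np.sqrt(dt / 2.0), size=(paths, 2 * n))
W = np.concatenate([np.zeros((paths, 1)), np.cumsum(dW_fine, axis=1)], axis=1)

W_left = W[:, 0:-1:2]           # W at the left endpoints t_k
W_mid = W[:, 1::2]              # W at the midpoints (t_k + t_{k+1}) / 2
dW = W[:, 2::2] - W[:, 0:-2:2]  # increments W_{t_{k+1}} - W_{t_k}

ito_mean = np.mean(np.sum(W_left * dW, axis=1))   # expectation -> 0
strat_mean = np.mean(np.sum(W_mid * dW, axis=1))  # expectation -> tau / 2

print(ito_mean, strat_mean)
```

With these parameters the left-point average is statistically indistinguishable from zero, while the mid-point average sits near $\tau/2 = 0.5$.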

Autocorrelation structure and stochastic calculi: a simple example
In a classical physical world where continuity is more or less guaranteed, Stratonovich calculus is evidently the most appropriate. However, when we consider the collisions of particles or the buying and selling of stocks or options, these short time-scale phenomena appear to be discontinuous. Such circumstances lead us to question the validity of the continuity of a stochastic process. That said, it may be impossible to find events that have zero autocorrelation. Thus it is a natural question to examine the effect of a non-zero autocorrelation in a multiplicative noise process by investigating limiting behaviors, which may provide some insight into the nature of the appropriate stochastic calculus. Hence we consider the simplest case of multiplicative behavior, $\mathrm{d}x/\mathrm{d}t = x F(t)$, where $F(t)$ is a stochastic process with a finite decorrelation time $\tau_c$. For this example, using the same procedure used to arrive at equation (8), we find that the noise contribution to the drift is controlled by the double time integral of the autocorrelation of $F$ over a sub-interval $\tau_m = \tau/n$ of the Riemann summation. Whereas it is always the case that $\tau \gg \tau_c$, the value of $\tau_m$ depends on the fineness $n$ of the subdivision, with $\tau_m$ increasing as $n$ decreases; coarsening the temporal resolution. So long as $\tau_m \gg \tau_c$ we can still justify the use of the stochastic differential equation, but once $\tau_m \leqslant \tau_c$ this is no longer the case. Therefore, increasing $\tau_m$ to values above $\tau_c$, but still small relative to $\tau$, one reaches a value $\tau_m \equiv \tau^\star$ that is sufficiently short that the decay of the noise can be observed but sufficiently large that the stochastic differential equation is valid. This is achieved as follows. Consider $n^\star$ such that $\tau \gg \tau_m = \tau/n^\star \gg \tau_c$. The limit $n^\star \geqslant n \gg 1$ ensures that the right hand side converges to one half of the integrated autocorrelation (equation (20)), and hence it is the mid-point selection rule in the Riemann sum that is associated with the finite noise autocorrelation.
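As an illustrative special case, suppose the autocorrelation of $F$ decays exponentially, $C(s) = (\sigma^2/2\tau_c)\,e^{-|s|/\tau_c}$, normalized so that its integral over all lags is $\sigma^2$ (an assumed model, not the only possibility). The double time integral over a sub-interval $\tau_m$ then has a closed form whose ratio to the white-noise value $\sigma^2\tau_m/2$ exhibits both regimes:

```python
import math

def double_integral(tau_m, tau_c, sigma2=1.0):
    """Closed form of int_0^tau_m dt' int_0^t' C(t' - t'') dt'' for the
    exponential autocorrelation C(s) = (sigma2 / (2 tau_c)) exp(-|s| / tau_c),
    normalized so that its integral over all lags equals sigma2."""
    return 0.5 * sigma2 * (tau_m - tau_c * (1.0 - math.exp(-tau_m / tau_c)))

tau_c = 1e-3
for tau_m in (1e-4, 1e-3, 1e-2, 1e-1):
    # Ratio to the white-noise (half-delta) value sigma2 * tau_m / 2.
    print(tau_m, double_integral(tau_m, tau_c) / (0.5 * tau_m))
```

For $\tau_m \gg \tau_c$ the ratio approaches unity, reproducing the half contribution of the δ-function; for $\tau_m \lesssim \tau_c$ it collapses, signalling that the stochastic differential equation description is no longer justified.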
Traditionally, however, this selection rule has been interpreted as the 'magical' choice required in order to recover the normal rules of calculus, and Stratonovich calculus itself does not provide any new dynamical insight that plays a role analogous to the Martingale property of Itô calculus. Hence, we seek a basic understanding of how the mid-point selection rule emerges as a consequence of applying the central limit theorem to colored noise. Our approach is to consider shot noise, whose autocorrelation can be represented by a δ-function.
What is the origin of the δ-function? Brownian motion can be generated by a collection of independent random processes. In the Riemann-summation approximation of the integral we use subintervals $\tau_n = \tau/n$ with integer $n$. As discussed in the argument leading to equation (20), so long as $n$ is large we have the freedom to choose it, and this choice determines the time scale $\tau_n$ over which the decay of the noise forcing is observed; when $\tau_n \gg \tau_c$ the noise signal appears as a discrete packet, appropriately simulated as shot noise (e.g. [19]).
Consider two 'square' signals with amplitude $h$ and time duration $w$, occurring with probability $\alpha w$. When both are contained in the same packet and have the same sign ($\pm h$) they are positively correlated; otherwise they are independent and uncorrelated. In the probability domain $\Omega$ the autocorrelation is

$$C(\tau) = \int_\Omega n_1 n_2\, P(n_1, n_2)\,\mathrm{d}\Omega,$$

where $n_1$ ($n_2$) is a random variable occurring at the time $t$ ($t+\tau$) and $P(n_1, n_2)$ is the joint probability density function of $n_1$ and $n_2$. We need only consider the case when $\tau$ is smaller than $w$, for which $C(\tau) = \alpha h^2 (w - |\tau|)$. Thus, the constant $\alpha h^2$ corresponds to $\sigma^2$, the pulse area $hw$ is analogous to the increment $\mathrm{d}W_t$, and hence in the limit of vanishing pulse width $C(\tau) \to \sigma^2 \delta(\tau)$. Therefore, despite our argument emerging from shot noise, it leads naturally to the definition of white noise, in which there exists a finite time correlation that is too minute to be realized when viewed from a coarse time scale, a coarseness which depends on the observer.
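This autocorrelation can be checked by direct simulation. The sketch below (Python with NumPy; the rate, amplitude, and width values are arbitrary choices) builds a train of Poisson-arriving square pulses of random sign and estimates $C(\tau)$ on a grid; for this pulse shape the standard Campbell-theorem result is $C(\tau) = \alpha h^2 (w - |\tau|)$ for $|\tau| < w$ and zero beyond:

```python
import numpy as np

rng = np.random.default_rng(3)

# Square pulses: amplitude +/- h, duration w, Poisson rate alpha per unit time.
alpha, h, w = 2.0, 1.0, 0.2
T, dt = 5000.0, 0.01
x = np.zeros(int(T / dt))

n_pulses = rng.poisson(alpha * T)
starts = rng.uniform(0.0, T, size=n_pulses)
signs = rng.choice([-1.0, 1.0], size=n_pulses)
for s, sgn in zip(starts, signs):
    i0 = int(s / dt)
    i1 = min(int((s + w) / dt), len(x))
    x[i0:i1] += sgn * h

def autocorr(x, lag_steps):
    if lag_steps == 0:
        return np.mean(x * x)
    return np.mean(x[:-lag_steps] * x[lag_steps:])

# Campbell's theorem for this pulse shape: C(tau) = alpha h^2 (w - |tau|).
print(autocorr(x, 0))              # ~ alpha h^2 w = 0.4
print(autocorr(x, int(0.1 / dt)))  # ~ alpha h^2 (w - 0.1) = 0.2
print(autocorr(x, int(0.3 / dt)))  # ~ 0: the lag exceeds the pulse width w
```

The estimated autocorrelation is flat-topped only at zero lag and decays linearly to zero at lag $w$, i.e. a narrow triangular spike that acts as a δ-function on time scales coarse compared with $w$.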
Clearly, a similar procedure applies directly to the Riemann sum so long as the same constraints on the subdivision of the time domain are in place. When $\tau_n \simeq w$, two signals of the same sign are positively correlated in the interval; hence the random variables $n_1$ and $n_2$ are correlated. Moreover, the factor of $1/2$ originates in the time order of $n_1$ and $n_2$, which is traced to the time-ordering constraint imposed when the joint density $P(n_1, n_2)$ is calculated. In other words, the present value $n_1$ is influenced in a time-symmetric manner, and the factor of $1/2$ reflects this symmetry through a properly ordered time integration.
In this simple argument one sees that the δ-autocorrelated noise (used in the definition of white noise) and the Stratonovich calculus approach have the same origin. Both formalisms use different approaches to capture the accumulated influence of short time-scale correlations of a noise source that are not represented in the long time-scale dynamics. Importantly, temporal continuity of the noise is not a necessary condition. In order to demonstrate this, we used discontinuous autocorrelated shot noise to recover both the δ-function definition of white noise and the mid-point selection procedure of Stratonovich calculus. Moreover, one can make the simple case more realistic by treating h as a random variable and the same logic holds.
Finally, we note that despite our pedagogical example $\mathrm{d}x/\mathrm{d}t = x F(t)$, the results are general: replacing $x$ on the right hand side by $b(x)$ extends the argument to any multiplicative noise case. There are of course a myriad of ways to decide which calculus is most appropriate to the problem at hand. Hence, whereas the integral Taylor expansion method enables one to focus solely on the nature of the short time-scale processes, the comparison between the integral of the autocorrelation and the variance of the short time-scale process focuses the choice on the specific mathematical model or scientific problem. Of particular interest to us is the question of how the stability of the non-autonomous SDEs of interest in climate dynamics is influenced by such considerations [20].
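Numerically, the two calculi correspond to different time-stepping schemes, and the drift difference $(\sigma^2/2)\,b\,\partial b/\partial x$ is visible directly in the sample mean. A sketch for $b(x) = x$ (Python with NumPy; the Euler-Maruyama scheme converges to the Itô solution and the Heun predictor-corrector scheme to the Stratonovich solution; the path and step counts are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

# Multiplicative model dx = x dW (i.e., b(x) = x), with x(0) = 1, on [0, T].
T, n, paths = 1.0, 500, 20_000
dt = T / n

x_ito = np.ones(paths)   # Euler-Maruyama: converges to the Ito solution
x_str = np.ones(paths)   # Heun (midpoint): converges to the Stratonovich solution
for _ in range(n):
    dw = rng.normal(0.0, np.sqrt(dt), size=paths)
    x_ito = x_ito + x_ito * dw
    pred = x_str + x_str * dw                  # predictor step
    x_str = x_str + 0.5 * (x_str + pred) * dw  # trapezoidal corrector

# Ito: x = exp(W_T - T/2), so the mean stays at 1.
# Stratonovich: x = exp(W_T), so the mean grows as exp(T/2) ~ 1.65.
print(x_ito.mean(), x_str.mean())
```

The same comparison with $b(x)$ in place of $x$ reproduces the general noise-induced drift of the Stratonovich interpretation.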

Related approaches
According to our result using shot noise, even an infinitesimal noise correlation in a stochastic differential equation can be interpreted using Stratonovich calculus. One might consider our analysis a generalized version of the Wong and Zakai [1] approach because the shot noise here is not a $C^1$ function. Considering the ubiquity of colored noise in real systems, Itô calculus might be interpreted as an idealized mathematical procedure, applicable only to true white noise processes that are never realized in nature. Hence, we ask whether there are situations in which Itô calculus can be used for colored noise.
The conceptual model underlying a stochastic differential equation is that of a principal deterministic process whose fate is influenced by short time-scale fluctuations, which are treated as noise. In building a mathematical model it is common to ignore the influence of the short time-scale processes on the deterministic dynamics, but there are situations where this is known to be a poor assumption, such as in the presence of inertial [21] or feedback [22] effects. Indeed, Kupferman and colleagues [21] studied systems with multiplicative colored noise and inertia and found that if the correlation time of the noise is shorter (longer) than the relaxation time, the limiting stochastic differential equation takes the Itô (Stratonovich) form. Similarly, Itô calculus is invoked to interpret experiments wherein the time delay of the feedback is much larger than the noise correlation time [22].
These results may also be reinterpreted within the framework of our shot noise formalism. When the inertial or feedback time scales are much shorter than the noise decorrelation time, one can ignore the former and focus on the latter using the logic that led us to the Stratonovich calculus interpretation. However, when the inertial or feedback time scales are much longer than the noise decorrelation time, the duration $w$ of the shot noise is no longer equivalent to the decorrelation time scale of the noise. Rather, in order for the principal dynamics to be valid, we must interpret $w$ as the inertial or feedback time scale. Hence, during a time increment $w$, we must assume that two consecutive random events $n_1$ and $n_2$ are independent, $P(n_1, n_2) = P(n_1)P(n_2)$, which is analogous to equation (24). Therefore, despite the colored noise in this case, Itô calculus prevails. Furthermore, this description provides an unambiguous interpretation of the discrete nature of a system [18], which is believed to be a central criterion for the use of Itô calculus.

Conclusion
When a system is described as having multiplicative noise, the criterion for choosing the most appropriate stochastic calculus, Itô or Stratonovich, has been a source of confusion and great discussion. In most areas of physical science, where white noise is defined in terms of a δ-function autocorrelation, Stratonovich calculus is preferred, mainly due to its consistency with the results emerging from the Fokker-Planck equation and the fluctuation-dissipation theorem. In economics, where the Martingale property is considered the most essential aspect of stochastic random variables, Itô calculus is widely accepted and used in the development of models. In population dynamics, on the other hand, Itô calculus is viewed as a proper continuous approximation of an underlying discrete model in which the Martingale property is guaranteed by construction. However, it would appear more prudent to choose the calculus depending on a clear set of objective considerations.
The core difference between the approaches is seen through the presence of the Martingale property in the multiplicative noise term of the stochastic Langevin equation, and hence manifests itself in the drift term of the associated Fokker-Planck equation. Thus, the principal ambiguity is associated with the ensemble mean of the product of two Brownian motion processes and in the original form it is difficult to assess the validity of the Martingale property. Here we used integral Taylor expansions to pinpoint the source of the deviation between the two calculi, which resides in the characteristics of the short time-scale noise process, the nature of which is described by the autocorrelation. The approach allows one to see the origin of the Stratonovich calculus, which resides in the mid-point selection scheme that is synonymous with including the effect of a finite autocorrelation.
It appears that most realistic signals simulated by white noise have non-Markovian structure, that is, a finite decorrelation time $\tau_c$. When one writes down a model stochastic differential equation, one typically initially assumes that $\tau_c$ is very small relative to the characteristic deterministic dynamical time scales, in which case white noise is a proper approximation. Thus, in such a case, the integral of the noise autocorrelation function and the variance of the noise are of a similar order, and the δ-function autocorrelation is a good approximation that is consistent with Stratonovich's calculus. However, when the integral of the noise autocorrelation function is much smaller than the variance of the noise, the δ-function autocorrelation is not an appropriate definition of white noise and Itô's calculus is appropriate.
Focusing on the noise itself, in the microscopic limit it is no longer Brownian but is instead represented by discontinuous finite-time signals, or shot noise. A small but finite autocorrelation of shot noise can be characterized by a δ-function, and the accumulation over time of its autocorrelation is represented by the mid-point selection procedure in the Riemann sum of Stratonovich calculus. This demonstrates that the origin of δ-correlated noise and Stratonovich calculus is the infinitesimal autocorrelation of the stochastic noise. Thus, the consistency between the Fokker-Planck equation and Stratonovich calculus has the same physical source, the finite autocorrelation of the stochastic noise that is ignored on time scales long relative to the fluctuation time scales. Therefore, a finite noise autocorrelation is the key criterion for choosing a stochastic calculus, but we must take care in interpreting its origin generally as well as on a case by case basis.