A Distributed Procedure for Computing Stochastic Expansions with Mathematica

The solution of a (stochastic) differential equation can be locally approximated by a (stochastic) expansion. If the vector field of the differential equation is a polynomial, the corresponding expansion is a linear combination of iterated integrals of the drivers and can be calculated using Picard iterations. However, such expansions grow exponentially fast in their number of terms, due to their specific algebra, which limits their practical use. We present a Mathematica procedure that addresses this issue by re-parametrising the polynomials and splitting the load into parts that are as small as possible and can be processed and manipulated independently. This alleviates the large memory requirements and makes the procedure perfectly suited for parallelized computation. We also present an iterative implementation of the shuffle product (as opposed to the more usual recursive one), as well as a fast way of calculating the expectation of iterated Stratonovich integrals of Brownian motion.


Motivation and mathematical background
In this section, we introduce the mathematical background and motivation for manipulating expansions and iterated integrals. The next subsection introduces the Picard procedure, a simple iterative way to derive a local approximation of the solution of a differential equation. Iterated integrals are then introduced, and the two are finally combined to define the expansions.

Motivation and notation
Consider the following differential equation:
$$dY_t = f(Y_t, \theta)\, dX_t,$$
where $Y_t \in \mathbb{R}^m$, $X_t \in \mathbb{R}^n$ and $f(\cdot, \theta) \in \mathbb{R}^{m \times n}$. The parameters of the function $f$ are collected in the variable $\theta$. We call each $X^{(i)}$ a driver of the differential equation. We assume the functions $f_{j,i}$, $i \in \{1, \dots, n\}$, $j \in \{1, \dots, m\}$, to be polynomials. The initial value of $Y_t$ is set to $Y_0$.
The objective is to derive a local approximation of the solution in terms of $f$, $Y_0$ and the iterated integrals of the drivers $X_t$, that is, integrals of the form
$$\int_{0<s_1<\cdots<s_k<t} dX^{(i_1)}_{s_1} \cdots dX^{(i_k)}_{s_k}.$$

Picard iterations
Picard iterations provide a way of deriving local approximations of solutions of differential equations. They are defined as:
$$Y_{0,T}(r+1, j) = \sum_{i=1}^{n} \int_0^T f_{j,i}(Y_s(r, \cdot))\, dX^{(i)}_s,$$
where $Y_{0,T}(r, j)$ is the $j$-th component of the approximation of $Y_{0,T} = \int_0^T dY_s = Y_T - Y_0$ after $r$ iterations. Thus, the first iteration gives:
$$Y_{0,T}(1, j) = \int_0^T f_{j,1}(Y_0)\, dX^{(1)}_s + \dots + \int_0^T f_{j,n}(Y_0)\, dX^{(n)}_s.$$
Note that if we are interested in the actual value of the solution at a time $t \in [0, T]$, its Picard approximation is given by $Y_t(r, j) = Y_0 + Y_{0,t}(r, j)$.
One of the main successes in the theory of rough paths was to give precise conditions on X and f for Picard iterations to converge ([3]).
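For intuition, Picard iterations can be carried out exactly for a simple deterministic equation. The following Python sketch (our own encoding, not part of the article's Mathematica code) iterates $y_{r+1}(t) = y_0 + \int_0^t a(1 - y_r(s))\,ds$, storing each iterate as a list of polynomial coefficients in $t$:

```python
from fractions import Fraction

def picard_ode(a, y0, r):
    """Picard iterates for dy/dt = a*(1 - y), y(0) = y0.

    Each iterate is a polynomial in t, stored as a coefficient list
    [c0, c1, ...] meaning c0 + c1*t + c2*t^2 + ...
    """
    y = [Fraction(y0)]                       # y_0(t) = y0
    for _ in range(r):
        # integrand f(y_r(s)) = a - a*y_r(s), as a polynomial in s
        integrand = [-a * c for c in y]
        integrand[0] += a
        # y_{r+1}(t) = y0 + integral from 0 to t of f(y_r(s)) ds
        y = [Fraction(y0)] + [c / (k + 1) for k, c in enumerate(integrand)]
    return y
```

With $a = 1$ and $y_0 = 0$, four iterations reproduce the Taylor coefficients of the exact solution $1 - e^{-t}$ up to order 4.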

Iterated integrals
Iterated integrals are integrals of the form:
$$X^{(i_1, \dots, i_k)}_{s,t} = \int_{s<u_1<\cdots<u_k<t} dX^{(i_1)}_{u_1} \cdots dX^{(i_k)}_{u_k}.$$
By definition, integrating an iterated integral produces an iterated integral:
$$\int_s^t X^{(i_1, \dots, i_k)}_{s,u}\, dX^{(i)}_u = X^{(i_1, \dots, i_k, i)}_{s,t}.$$
If $X$ is a geometric $p$-rough path, i.e. it can be approximated by paths of bounded variation (see [3] for a precise definition), then the integrals obey the usual integration-by-parts rule, which can be generalized as follows:
$$X^{\tau}_{s,t}\, X^{\rho}_{s,t} = \sum_{\sigma \in \tau \,⧢\, \rho} X^{\sigma}_{s,t}.$$
The shuffle product $\tau ⧢ \rho$ of two words $\tau$ and $\rho$ is the set of all words using the letters of $\tau$ and $\rho$ such that these letters remain in their original order. For example, $13 ⧢ 42 = \{1342, 1432, 1423, 4132, 4123, 4213\}$, but 1324, for example, does not belong to the set. This result holds only in the case of deterministic drivers $X$ or Stratonovich integrals. A similar relation exists for Itô integrals but requires a small correcting term ([5]), making them slightly less practical in this context. A simple transformation allows diffusion processes to be defined with respect to Itô or Stratonovich integrals ([1]). It immediately follows from equation (8) that any power of an iterated integral is a sum of iterated integrals, and that a polynomial of iterated integrals is also a linear combination of single iterated integrals. The number of terms in the product of iterated integrals grows fast, exponentially if all letters are different.
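The shuffle product itself is easy to state recursively: every shuffle of $\tau a$ and $\rho b$ ends either with $a$ or with $b$. A minimal Python sketch of this recursion (the function name and the encoding of words as strings are ours, not the article's):

```python
def shuffle(alpha, beta):
    """Recursive shuffle product of two words, kept as a list so that
    multiplicities are preserved: sh(ua, vb) = sh(u, vb).a + sh(ua, v).b
    """
    if not alpha:
        return [beta]
    if not beta:
        return [alpha]
    return [w + alpha[-1] for w in shuffle(alpha[:-1], beta)] + \
           [w + beta[-1] for w in shuffle(alpha, beta[:-1])]
```

For example, shuffle("13", "42") returns six words, each keeping 1 before 3 and 4 before 2.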

Expansions
We are now in a position to derive expansions for the solution of a differential equation. Picard iterations yield:
$$Y_{0,T}(2, j) = \sum_{i=1}^{n} \int_0^T f_{j,i}\big(Y_0 + Y_{0,s}(1, \cdot)\big)\, dX^{(i)}_s.$$
Since $Y_{0,s}(1, \cdot)$ is a sum of iterated integrals, the polynomials $f_{ji}(Y_s(1) + Y_0)$ are also sums of iterated integrals, and so is their integration with respect to $X^{(i)}$. $Y_{0,T}(2)$ is thus a sum of iterated integrals, and by recursion all the $Y_{0,T}(r, \cdot)$ are. A formal proof in the context of differential equations driven by rough paths can be found in [4].

Example
Consider the Ornstein-Uhlenbeck process: $dy_t = a(1 - y_t)\,dt + b\,dW_t$ with $y_0 = 0$. In this case, $X^{(1)}_t = t$, $X^{(2)}_t = W_t$ and $X^{(i)}_0 = 0$ for $i \in \{1, 2\}$. Applying Picard iterations, we obtain:
$$Y_{0,T}(1) = \int_0^T a\, dX^{(1)}_s + \int_0^T b\, dX^{(2)}_s = aX^{(1)} + bX^{(2)},$$
$$Y_{0,T}(2) = \int_0^T a\big(1 - (aX^{(1)}_s + bX^{(2)}_s)\big)\, dX^{(1)}_s + \int_0^T b\, dX^{(2)}_s = aX^{(1)} + bX^{(2)} - a^2 X^{(1,1)} - ab\, X^{(2,1)}.$$
The solution of the stochastic differential equation can thus be approximated by a series of iterated integrals of the drivers, whose coefficients are functions of the parameters. The iterated integrals capture the statistics of the drivers and are separated from the parameters. This derivation can be readily implemented in Mathematica ([6]) (see [5] for an implementation of the shuffle product) but suffers from a major drawback: each product of iterated integrals being a shuffle product, the number of terms produced grows extremely fast (exponentially in the worst cases) and rapidly becomes unmanageable. In the next section, we introduce a re-parametrisation of the problem that circumvents this issue by providing an alternative representation of the expansion which can be processed in a distributed manner, alleviating the large memory requirements.
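Because the drift and volatility of this example are affine in $y$, each Picard step here only appends a letter to existing words and never needs an actual shuffle. A small Python sketch of the derivation (our own encoding: words are tuples over {1 = time, 2 = Brownian motion}; the expansion is a dictionary word → coefficient):

```python
def ou_picard(a, b, r):
    """Expansion Y_{0,T}(r) for dy = a(1 - y)dt + b dW, y_0 = 0.

    One Picard step: Y(r+1) = int a(1 - Y(r)) dX1 + int b dX2,
    i.e. the words (1,) and (2,) plus each word of Y(r) extended by a 1.
    """
    Y = {}
    for _ in range(r):
        nxt = {(1,): a, (2,): b}
        for w, c in Y.items():
            nxt[w + (1,)] = nxt.get(w + (1,), 0) - a * c
        Y = nxt
    return Y
```

Two iterations give $aX^{(1)} + bX^{(2)} - a^2 X^{(1,1)} - ab\, X^{(2,1)}$, as derived by hand.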
Re-parametrisation of the polynomials
One thing to note in the derivation of expansions is that each successive iteration requires the explicit linear combination of iterated integrals from the previous iteration; evaluating $Y_{0,T}(r)$ requires the complete expansion of $Y_{0,T}(r-1)$. As the number of terms grows, manipulating this object rapidly becomes unwieldy.
In this section, we describe a re-parametrisation of the polynomials which bypasses the need for an expansion in terms of iterated integrals. It provides a more compact representation of the approximate solution and is naturally amenable to parallel processing. For clarity of exposition, we first introduce the approach in the case of a one-dimensional ($m = 1$) differential equation with $n$ drivers. We assume the polynomials $f_{1i}$ to be of degree less than or equal to $q$. In the last subsection, the procedure is generalized to $m$-dimensional differential equations.

One-dimensional case
We first remark that a polynomial $P(y)$ can be written in terms of $y - y_0$ via its Taylor expansion around $y_0$:
$$P(y) = \sum_{k=0}^{q} \frac{P^{(k)}(y_0)}{k!} (y - y_0)^k.$$
Next, we introduce the following new operation $\circ$ for iterated integrals:
$$X^{\alpha} \circ X^{\beta} = \int_0^T X^{\alpha}_{0,s}\, X^{\beta^-}_{0,s}\, dX^{(\beta_{\mathrm{end}})}_s,$$
where $\beta^-$ is the word $\beta$ with the last letter removed and $\beta_{\mathrm{end}}$ is the last letter of $\beta$. This is a non-associative, non-commutative operation; in fact, it can be viewed as a non-commutative dendriform operation. We can rewrite the Picard iteration using this operation. Note that from now on, the interval $[s, t]$ will be fixed to $[0, T]$ and will be omitted.
Therefore, if we define the objects $Q$ as:
$$Q_k = \sum_{i=1}^{n} \frac{f_{1i}^{(k)}(Y_0)}{k!}\, X^{(i)}, \qquad k \in \{0, \dots, q\},$$
the Picard iteration takes the following form:
$$Y(r+1) = \sum_{k=0}^{q} Y(r)^k \circ Q_k,$$
where $Y(r)^k$ is the $k$-th power with respect to the usual product.

Description of the approach
Consider a one-dimensional differential equation with quadratic functions $f_{1i}$, i.e. $q = 2$. The new representation yields:
$$Y(r+1) = Q_0 + Y(r) \circ Q_1 + Y(r)^2 \circ Q_2,$$
where $Q_k$ is the $k$-th object $Q$ and $Y(r)^2$ is the power with respect to the usual product. Crucially, this new representation does not require the explicit computation of the shuffle products, keeping the number of terms under control. Moreover, the expression can be expanded into its summands, and each summand can be processed independently. Thus, we avoid handling one large expression and are able to parallelize the computation.
Note that the representation only depends on the maximum degree of the polynomials, not on the number of drivers. The Picard iteration can therefore be performed only once, stored in a file, and used at a later date for any system with the same maximum degree $q$.
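To make the place-holder idea concrete, here is a minimal Python sketch (our own names Q, circ, times; not the article's Mathematica code) that builds the compact representation for $q = 2$ as a symbolic tree, with both products left uninstantiated:

```python
def Q(k):
    return ("Q", k)                 # place-holder object Q_k

def circ(x, y):
    return ("circ", x, y)          # non-commutative, non-associative product

def times(x, y):
    return ("times", x, y)         # the usual product, also a place-holder

def picard_q(r):
    """Compact representation of Y(r) for a 1-dim equation with q = 2:
    Y(r+1) = Q_0 + Y(r) o Q_1 + Y(r)^2 o Q_2, with nothing expanded."""
    Y = Q(0)                        # Y(1) = Q_0
    for _ in range(r - 1):
        Y = ("plus", Q(0), circ(Y, Q(1)), circ(times(Y, Y), Q(2)))
    return Y
```

The tree grows with $r$, but no shuffle product has been computed: expansion into monomials is deferred to a later, distributable stage.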
Using this representation, it is possible to derive the stochastic expansions in a few stages:
1. Expand the expression $Y(r)$ into its monomials $u$. Each monomial $u$ is a function of $Q$'s that uses the non-commutative product $\circ$. Importantly, the objects $Q$ and the product $\circ$ are only used as place-holders at this stage.
2. For each monomial $u$, the objects $Q$ are replaced by their values from the model at hand. Each $Q$ is a weighted sum of the drivers $X^{(i)}$ (eq. 17), so each $u$ becomes a polynomial $V$ of the $X^{(i)}$ in terms of the non-commutative product. As in the previous step, the product is still only employed as a place-holder and not instantiated.
3. Each polynomial $V$ is expanded into its monomials $v$. Each $v$ is a function of the $X^{(i)}$ and $\circ$.
4. For each monomial $v$, the product $\circ$ is instantiated; its actual definition in terms of the shuffle product is only used at this last stage.
Each process only requires a fraction of the memory that a direct approach (replacing the $Q$'s by the $X^{(i)}$ and using the product's definition) would. Moreover, at all stages, each monomial can be processed independently from the rest, leading to natural parallelization. It is also important to note that the whole expression is stored in a file and is thus never held in memory in its entirety. The exact details of this procedure are given in section 3.

Generalization
In the multidimensional case, the Taylor expansion of a polynomial requires a larger number of terms, involving cross-products between the components of the vector $Y$. The objects $Q$ are no longer indexed by $k \in \{0, \dots, q\}$ but by the set $OW_m(0, q)$ of the ordered words of length up to $q$ written with the letters $\{1, \dots, m\}$, and are defined accordingly for each component $j \in \{1, \dots, m\}$. The constant $c(\tau)$ is the number of different words that can be constructed using the letters in $\tau$. The Picard iteration is modified in the same fashion.

Implementation in Mathematica
This section describes how this new approach was implemented in Mathematica.
In this part, iterated integrals of the drivers ($X^{(i_1, \dots, i_k)}$) are denoted $J^{(i_1, \dots, i_k)}$ to follow convention, and drivers are numbered from 0, with the first driver representing time.

Shuffle product
Since each product of two iterated integrals is a shuffle product, special care must be taken over its implementation. [5] has shown a way of writing the product in Mathematica, as a direct and natural translation of the recursive definition. We present a new implementation of the shuffle product. The product is computed iteratively instead of recursively and is based on string transformation. To calculate the shuffle product of two words $\alpha$ and $\beta$, we first set the string 'aa...aabb...bb', with as many a's and b's as there are letters in $\alpha$ and $\beta$ respectively (line 2). We then replace all occurrences of 'ab' by 'ba' (line 6) and keep iterating the replacement until none is possible (i.e. when the word is 'bb...bbaa...aa'), while keeping track of the new words generated (lines 4-6). The set of all words thus created is the shuffle product of two arbitrary words of the same lengths as $\alpha$ and $\beta$. Finally, the letters a and b in each word are replaced by their actual values from $\alpha$ and $\beta$ (lines 7-9).
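A Python transcription of this string-based algorithm (a sketch of the idea; the line numbers in the text refer to the original Mathematica listing, not to this code):

```python
def shuffle_iterative(alpha, beta):
    """Iterative shuffle product: start from 'a...ab...b', repeatedly
    replace 'ab' by 'ba' to enumerate all interleaving patterns, then
    substitute the actual letters of alpha and beta."""
    patterns = {"a" * len(alpha) + "b" * len(beta)}
    frontier = set(patterns)
    while frontier:                 # iterate until no new pattern appears
        new = set()
        for w in frontier:
            for i in range(len(w) - 1):
                if w[i:i + 2] == "ab":
                    new.add(w[:i] + "ba" + w[i + 2:])
        frontier = new - patterns
        patterns |= new
    words = []
    for p in patterns:              # substitute the actual letters
        ia = ib = 0
        word = ""
        for ch in p:
            if ch == "a":
                word += alpha[ia]
                ia += 1
            else:
                word += beta[ib]
                ib += 1
        words.append(word)
    return words
```

Each pattern of a's and b's is reachable from the initial string by adjacent swaps, so the set of patterns enumerates all interleavings exactly once before substitution.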
Figure 1 shows the relative performance of the implementations. The product of two iterated integrals with random words is calculated with both the recursive definition and the iterative one. The ratio of the time spent by the former over the latter is recorded, and the process is repeated 1,000 times for each final-word length. On average, the iterative method is a bit slower on short words but faster for long words (from a total length of 8). Since longer words are more numerous, using the iterative method is advantageous.

Non-commutative product
The non-commutative and non-associative product $\circ$ is implemented in Mathematica as an operation with no built-in meaning. We only set a few basic properties for this product:

Picard iteration
The usual Picard iteration is implemented with a helper function PicardIteration as follows: For example, Picard[f, x0, 4] outputs the stochastic approximation of the SDE with the functions f collected in a list in the first argument. This was used in [4] for a system with linear drift and quadratic variance.
With the new representation in $Q$, it can be written directly as in eq. 2: R is the number of iterations to be calculated and q the maximum degree of the polynomials f. PicardQ1Dim[] produces a very compact representation of the expansion, which needs to be processed further in order to give the same result as Picard[].

Distributed processing of monomials
Going from the compact representation provided by PicardQ1Dim[] to the linear combination of iterated integrals $J$ is done in a few stages. Each stage modifies the representation of the stochastic expansion in such a way that a) computational requirements are minimized and b) the work can be parallelized. As described in section 2.2, the workflow goes as follows: 1. PicardQ1Dim[] produces a polynomial in $Q$ and $\circ$.
2. Each monomial (in $Q$ and $\circ$) is extracted and stored in a list (actually a file).
3. For each monomial, the $Q$'s are replaced by their values in terms of the $J$'s (eq. 17). The monomials thus become polynomials in $J$ and $\circ$.
4. The monomials (in $J$ and $\circ$) of each polynomial are extracted and stored in a file.
5. The product $\circ$ is instantiated in terms of the shuffle product (eq. 10). The result is a linear combination of iterated integrals $J$ for each monomial. The stochastic expansion is the overall sum of all those linear combinations.
As can be seen in figure 2, the different polynomials are successively expanded in terms of monomials, which are in turn processed independently. Since each expression is effectively broken down into small parts and dumped into a file, a much larger number of terms can be computed, as shown in the next section.
Since each processing stage follows roughly the same logic, with slightly different transformation rules, we only detail the first one: going from the polynomial in $Q$ and $\circ$ to its monomials. This is done step by step, and not by applying a global rule, since doing so would produce a very large output and run the risk of failing due to memory limits. The data to process is in workingFile. At each step, an entry is read from this file and processed by one of the kernels (lines 20-37). If the result is a monomial, it is added to the final file (line 22). Otherwise, it still requires further processing and is added to a temporary file (lines 25, 33, 36, 37). When all entries in workingFile have been processed, workingFile is replaced by the temporary file (line 40). The algorithm loops back until all monomials have been extracted to finalFile. Note that each kernel has its own temporary and final files, as writing to the same file in parallel typically results in corrupted files and missing entries. Processing an entry consists of finding the highest 'Plus' in the expression and distributing the monomials around it. In this manner, the polynomial is first decomposed into the largest sub-polynomials, which are then processed further separately.
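The logic of that loop can be sketched in Python (an in-memory analogue with our own names; a deque stands in for workingFile and finalFile, and parallel kernels and file handling are omitted): find a 'Plus' in the entry, distribute around it, and re-queue anything that is not yet a monomial.

```python
from collections import deque

def is_plus(e):
    return isinstance(e, tuple) and e[0] == "plus"

def step(e):
    """One processing step: return a list of simpler expressions,
    or None if e is already a monomial (contains no 'plus')."""
    if is_plus(e):
        return list(e[1:])                        # split a top-level sum
    if isinstance(e, tuple):
        for i, f in enumerate(e):
            if is_plus(f):                        # distribute over the sum
                return [e[:i] + (t,) + e[i + 1:] for t in f[1:]]
            if isinstance(f, tuple):
                sub = step(f)
                if sub is not None:
                    return [e[:i] + (s,) + e[i + 1:] for s in sub]
    return None

def extract_monomials(expr):
    working, final = deque([expr]), []
    while working:
        e = working.popleft()                     # read one entry
        out = step(e)
        if out is None:
            final.append(e)                       # a monomial: done
        else:
            working.extend(out)                   # needs further processing
    return final
```

Processing one entry at a time keeps the peak memory proportional to the largest single entry rather than to the fully expanded polynomial.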

Expectation of an iterated integral
It is often of interest to calculate the moments of the solution of the SDE, and these can be approximated by computing the moments of the stochastic expansion. Since the expansion is a weighted sum of iterated integrals $J$, its expectation is simply the weighted sum of the integrals' expectations. If the drivers consist of time and Brownian motions, the expectation of an iterated integral has a simple analytic form that can be arrived at recursively ([5]).
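A Python sketch of this recursion (our own encoding: a word is a tuple of driver indices, with 0 = time and $i \geq 1$ independent Brownian motions; the expectation over $[0, t]$ is returned as a pair (c, k) meaning $c\, t^k$). The rules assumed here are: a trailing 0 integrates in time; a repeated trailing Brownian letter contributes a factor 1/2 plus a time integration; any other trailing Brownian letter gives expectation zero:

```python
from fractions import Fraction

def e_strat(word):
    """E of the iterated Stratonovich integral J^word over [0, t]."""
    if not word:
        return (Fraction(1), 0)
    last = word[-1]
    if last == 0:                                 # ...ds: integrate in time
        c, k = e_strat(word[:-1])
        return (c / (k + 1), k + 1)
    if len(word) >= 2 and word[-2] == last:       # odW_i odW_i -> (1/2) ds
        c, k = e_strat(word[:-2])
        return (c / 2 / (k + 1), k + 1)
    return (Fraction(0), 0)                       # unmatched Brownian letter
```

For instance, $E[J^{(1,1)}] = t/2$ and $E[J^{(1,1,1,1)}] = t^2/8$, which matches $E[W_t^4]/4! = 3t^2/24$.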
Only two things are required from the user: the definition of the objects $Q$ in a transformation rule, easily obtained by differentiation (eq. 17), and the number of Picard iterations to be computed. In this case, the three $Q$'s are given, and the transformation rule corresponding to this system is: If the first driver is time, we can follow the convention that drivers are numbered from 0: A small number (e.g. 4) of Picard iterations is sufficient for a good approximation (see [4]). An optional parameter is the maximum word length of the iterated integrals.
Running the algorithm with these parameters, we obtain the expansion, which is a weighted sum of 676 iterated integrals and already has a ByteCount (size) of 394,288. Its expectation is a much smaller expression: This is confirmed by the previous implementation of the Picard iteration with the simple code: However, if we now set the initial value to an arbitrary $y_0$ instead of 0, the stochastic expansion contains a much larger number of terms. It is calculated in the same manner using $Q$ as: This results in an expansion with 10,710 unique iterated integrals and a ByteCount of 15,841,624. This cannot be confirmed by the simple code, which runs out of memory and crashes before completing. Note that the expansion is usually left in a file whose entries are the parts of the linear combination. Thus, it is possible, for example, to compute the expectation of the expansion without having to store it in memory at any point. Moreover, while computationally expensive, these expansions can be calculated once in a general case and saved in a file for future applications, without having to be recomputed de novo.

Conclusion
Stochastic expansions provide a local approximation of the solution of a stochastic (or deterministic) differential equation. They can be used in a variety of applications, from simulation to parameter estimation. However, as the number of terms grows exponentially with the desired precision, they can rapidly become unwieldy to manipulate.
We presented a new way of calculating these expansions that bypasses the limitations of the usual approach, via a re-parametrisation of the problem and the parallelization of the computation. We have shown in a simple example that our method was able to compute the expansion where a direct approach failed. We also presented two new approaches for efficiently deriving the shuffle product of two iterated integrals and the expectation of an iterated integral when the drivers are time and Brownian motions.
So far, our approach has been implemented for one-dimensional differential equations. However, the theoretical foundation for the multi-dimensional case is available, as presented in section 2.3. Now that the computing requirements have been alleviated, an implementation for the general case is possible. Stochastic expansions will then be available for more complex systems.

Figure 1: Log2 of the average relative speed of the iterative over the recursive method.

Figure 2: Workflow. From the compact representation in $Q$ and $\circ$ to the linear combination of iterated integrals $J$.