Tutorial on the generation of ergodic trajectories with projection-based gradient descent

A vehicle trajectory is ergodic with respect to some spatial distribution if the time spent in a region is proportional to the region's integrated density. One method of generating ergodic trajectories is projection-based trajectory optimisation, a gradient descent method that supports non-linear dynamics and balances trajectory ergodicity with control effort. In this study, the authors survey the existing literature on projection-based trajectory generation, focusing on implementation. They introduce an easy-to-use, open-source software package that generates ergodic trajectories orders of magnitude faster than previously reported. The authors' goal is to simplify the implementation of projection-based ergodic control so it can be applied widely across disciplines.

Nomenclature

x(t): state at time t according to trajectory x
T: trajectory duration
f: system dynamics; ẋ(t) = f(x(t), u(t))
ϕ: distribution over X
c: spatial distribution representing the time-averaged statistics of trajectory x
x_c: centre of a specific cell in X_cells
V_c: size of a single cell in X_cells
N: number of discrete time steps over which the system is controlled
Δt: time step between discrete states
x: discrete trajectory consisting of N + 1 states x_0, x_1, …, x_N
ũ: discrete control sequence consisting of N control inputs ũ_0, ũ_1, …, ũ_{N−1}
δx_n, δũ_n: perturbations about the state x_n and control input ũ_n
A_n, B_n: gradients of f with respect to state and control, evaluated at timestep n
Ã_n, B̃_n: matrices such that δx_{n+1} = Ã_n δx_n + B̃_n δũ_n

Objective
J: objective function
J_b: boundary penalty function
q: weight on ergodic score
R: weight on control effort
c_b: weight on boundary penalty

Optimisation
z: descent direction for x; contains a direction z_n for each state x_n
ṽ: descent direction for ũ; contains a direction ṽ_n for each control input ũ_n
ã_n: gradient of J with respect to x_n, evaluated at (x, ũ)
b_n: gradient of J with respect to ũ_n, evaluated at (x, ũ)
I: identity matrix with appropriate dimensions
Q_D: weight on state descents in the LQ descent direction optimisation; often Q_D = I
R_D: weight on control input descents in the LQ descent direction optimisation; often R_D = I
Q_P: weight on state deviations in the projection operator gains LQR; often Q_P = Q_D = I
R_P: weight on control deviations in the projection operator gains LQR; often R_P = R_D = I
γ: step size
ρ ∈ (0, 1): Armijo gradient constant
τ ∈ (0, 1): Armijo step size shrink factor

Other
p_x, p_y: x-coordinate, y-coordinate
ν_x, ν_y: speed in the x-direction, speed in the y-direction

Introduction

A trajectory is ergodic with respect to a spatial distribution if the time spent in a region is proportional to the integrated spatial density of that region. Fig. 1 compares an ergodic trajectory to one that maximises time spent in the highest density region.
The ergodic trajectory distributes its time between the modes in proportion to their density, whereas the maximising trajectory just moves to the highest density point.
There are a number of robotic applications where an ergodic trajectory is a sensible strategy. One compelling example is active sensing, where a mobile sensor acts to reduce uncertainty in an environment. Examples from the literature include localisation of hidden targets with mobile sensors [1][2][3]. In these tasks, the spatial distribution is an expected information density that captures the expected reduction in target location uncertainty by making a measurement from a given state. The reduction in uncertainty can be constructed with the sensor model and information-theoretic quantities like Fisher information or entropy. An ergodic trajectory balances the competing goals of exploitation and exploration [2,3]; more time is spent sampling regions with high information density (exploitation), but sensing time is not restricted to the most dense point, making the agent robust to modelling errors (exploration). Of course, an ergodic trajectory represents an open-loop plan, and the expected information density can change as measurements are made. Closed-loop behaviour can be attained by updating the expected information density with new measurements and generating new ergodic trajectories in a model-predictive fashion [1][2][3].
A discretised trajectory is just a set of points, and generating a set of points that is ergodic with respect to a spatial distribution is trivial: one can simply sample points from the distribution. However, the agent's dynamics must be accounted for; even if it is dynamically feasible for the agent to reach these points, the resulting trajectory might be inefficient. The goal is a trajectory that is ergodic while satisfying the agent's dynamics and minimising control effort.
There are a number of methods to generate ergodic trajectories [4][5][6][7], but the most common has been projection-based trajectory optimisation [8]. This method is a gradient descent on an objective function consisting of ergodic and control costs. Projection-based optimisation is attractive because it allows for non-linear agent dynamics and balances ergodicity with control effort, generating efficient trajectories. A number of extensions have been proposed, including discretisation and allowing distributions over non-Euclidean domains [1][2][3][8][9][10].
This paper summarises prior research on projection-based ergodic trajectory generation, with a focus on details relevant to implementation. Although the focus is on ergodic trajectory generation, this paper also serves as a brief introduction to path planning topics like projection-based optimisation and trajectory integration. Because our goal is efficient implementation, we focus on generating discrete trajectories. We also present an open-source software package for projection-based ergodic trajectory generation. This package is written in the high-performance, high-level computing language Julia [11]. Preliminary results suggest this package is orders of magnitude faster than previous implementations, meaning it could realistically be used on a robot in real time.
The paper proceeds as follows. Section 2 describes how ergodicity is measured in different domains. Section 3 describes how projection-based trajectory optimisation works. Our software package and examples are introduced in Section 4. Additional implementation details are covered in Section 5.

Measuring trajectory ergodicity
Consider an arbitrary domain X ⊂ ℝ s . Let ϕ: X → ℝ be a distribution over X, where ϕ(x) is the density at a point x ∈ X. A continuous trajectory over time horizon T is a function x: [0, T] → X. We use x to refer to the entire trajectory and x(t) ∈ X to refer to the state at the time t. We require a metric capturing the ergodicity of trajectory x with respect to the distribution ϕ.
A popular ergodic metric defined by Mathew and Mezić [12] decomposes ϕ and x into Fourier coefficients. The spatial distribution is decomposed into ∏_{i=1}^s (K_i + 1) coefficients ϕ_k, where the vector k = [k_1, …, k_s] is a multi-index used to simplify notation; ϕ_k is short for ϕ_{k_1, k_2, …, k_s} and ∑_k is short for ∑_{k_1=0}^{K_1} ⋯ ∑_{k_s=0}^{K_s}. The coefficients can be computed with

ϕ_k = ∫_X ϕ(x) F_k(x) dx, (1)

where F_k is the Fourier basis function for the vector k.
The trajectory's time-averaged statistics are captured by the distribution

c(x) = (1/T) ∫_0^T δ(x − x(t)) dt, (2)

where δ(⋅) is the Dirac delta function. In ergodic control, we are effectively driving the trajectory's time-averaged statistics c to match the spatial distribution ϕ. For this reason, (2) contains a normalising factor 1/T that ensures c integrates to 1. Likewise, ϕ must be a proper density that also integrates to 1, so we can compare c and ϕ. The metric we consider compares c and ϕ by breaking both into Fourier coefficients, although another metric uses the KL divergence, a common measure of the similarity of two distributions [7]. We should note that the use of the Dirac delta distribution in c means we only consider states in the trajectory. However, there may be situations where the agent footprint is larger than a single state. Consider an autonomous painting task in which the robot has a brush with some thickness, or an information gathering task in which the agent observes all states within some radius of its own state. Then it might make sense to define c so it includes all states in the footprint at any given time. Using different agent footprints has recently been explored [7], but the majority of ergodic control work, especially for information gathering, has used the Dirac delta footprint. Further, the projection-based method is formulated to use the Dirac delta, so we use it here.
Decomposing the distribution c into Fourier coefficients is similar to decomposing ϕ, but we can leverage the fact that δ(⋅) will be zero at all points except those along the trajectory. Therefore, the trajectory's coefficients c_k are

c_k(x) = (1/T) ∫_0^T F_k(x(t)) dt. (3)

The ergodic metric proposed by Mathew and Mezić [12] is the sum of the squared differences between the distribution and trajectory coefficients:

ℰ(x) = ∑_k Λ_k ∥c_k(x) − ϕ_k∥². (4)

Proceeding, we drop the argument to c_k to simplify the notation. The scaling factor Λ_k weights low frequency features more heavily [12] and is defined as

Λ_k = (1 + ∥k∥²)^{−(s+1)/2}. (5)

The ergodic metric is a functional, or function over functions.
Here, ℰ is a functional over the trajectory x, which is itself a function yielding the sensor state at time t. Minimising a trajectory functional is a common problem in optimal control, allowing existing optimisation schemes to be leveraged. Further, ℰ is differentiable, allowing it to be used in gradient descent techniques. The formal definition of trajectory ergodicity is that ℰ(x) → 0 as T → ∞ [12]. However, we call a trajectory x ergodic when ℰ(x) is small, even for a finite horizon T. One way to validate an implementation of the ergodic metric is to construct a distribution ϕ′ from the trajectory coefficients c_k:

ϕ′(x) = ∑_k c_k F_k(x).

If the ergodic metric is low, ϕ(x) ≃ ϕ′(x) for all x ∈ X and the distributions should be visually similar.
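To make the metric concrete, the following sketch computes the coefficients and the metric for a 1D example. It is written in Python for illustration (the accompanying package is in Julia), and the domain length, Gaussian density and sampled trajectories are all assumptions made for this example.

```python
import numpy as np

L = 1.0                        # domain length (illustrative)
K = 10                         # highest Fourier index
xs = np.linspace(0.0, L, 200)  # integration grid over the domain
dx = xs[1] - xs[0]

# Normalised density phi over [0, L]: a Gaussian bump, assumed for illustration.
phi = np.exp(-0.5 * ((xs - 0.3) / 0.05) ** 2)
phi /= np.sum(phi) * dx

def basis(k, x):
    # cosine basis with its normalising term h_k (see the next section)
    h_k = np.sqrt(L) if k == 0 else np.sqrt(L / 2.0)
    return np.cos(k * np.pi * x / L) / h_k

# Distribution coefficients phi_k (eq. (1)) and trajectory coefficients
# c_k (eq. (21)) for points sampled directly from phi.
phi_k = np.array([np.sum(phi * basis(k, xs)) * dx for k in range(K + 1)])
traj = np.random.default_rng(0).normal(0.3, 0.05, size=101)
c_k = np.array([np.mean(basis(k, traj)) for k in range(K + 1)])

def metric(ck):
    # ergodic metric of eq. (4) with weights Lambda_k, s = 1
    Lam = (1.0 + np.arange(K + 1) ** 2) ** -1.0
    return np.sum(Lam * (ck - phi_k) ** 2)

E_good = metric(c_k)
E_bad = metric(np.array([basis(k, 0.9) for k in range(K + 1)]))  # parked at 0.9
print(E_good < E_bad)   # True: sampled points are far more ergodic
```

As expected, points sampled from ϕ yield a low metric, while a trajectory parked at a single point far from the density's mode scores poorly.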

Decomposition in Euclidean space
Equation (1) shows how to decompose a distribution ϕ, but we still need to provide a basis function F_k. Suppose ϕ covers a rectangular domain X ⊂ ℝ^s with the lower left corner at the origin, so that X = [0, L_1] × ⋯ × [0, L_s]. Mathew and Mezić [12] suggest the basis function

F_k(x) = (1/h_k) ∏_{i=1}^s cos(k_i π x_i / L_i), (6)

where x_i is the ith element of point x and h_k is a normalising term defined as [12]

h_k = ( ∫_X ∏_{i=1}^s cos²(k_i π x_i / L_i) dx )^{1/2}. (7)

For example, if the domain is two-dimensional (2D), existing in ℝ², then

F_k(x) = (1/h_k) cos(k_1 π x_1 / L_1) cos(k_2 π x_2 / L_2). (8)

Trajectory generation requires computing the gradient of this basis function. If the domain exists in ℝ², the gradient is

∇F_k(x) = (1/h_k) [ −(k_1 π / L_1) sin(k_1 π x_1 / L_1) cos(k_2 π x_2 / L_2), −(k_2 π / L_2) cos(k_1 π x_1 / L_1) sin(k_2 π x_2 / L_2) ]ᵀ. (9)
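The basis function and its gradient can be checked numerically. The following Python sketch (illustrative only; the domain bounds and test point are arbitrary) compares the analytic gradient of the 2D cosine basis with finite differences.

```python
import numpy as np

L1, L2 = 1.0, 2.0   # rectangular domain [0, L1] x [0, L2], assumed

def h(k1, k2):
    # integral of cos^2 over [0, Li] is Li for k = 0 and Li/2 otherwise
    a = L1 if k1 == 0 else L1 / 2.0
    b = L2 if k2 == 0 else L2 / 2.0
    return np.sqrt(a * b)

def F(k1, k2, x):
    return np.cos(k1 * np.pi * x[0] / L1) * np.cos(k2 * np.pi * x[1] / L2) / h(k1, k2)

def gradF(k1, k2, x):
    c1, c2 = np.cos(k1 * np.pi * x[0] / L1), np.cos(k2 * np.pi * x[1] / L2)
    s1, s2 = np.sin(k1 * np.pi * x[0] / L1), np.sin(k2 * np.pi * x[1] / L2)
    return np.array([-(k1 * np.pi / L1) * s1 * c2,
                     -(k2 * np.pi / L2) * c1 * s2]) / h(k1, k2)

# Central finite differences should match the analytic gradient closely.
x = np.array([0.37, 1.21])
eps = 1e-6
num = [(F(2, 3, x + eps * e) - F(2, 3, x - eps * e)) / (2 * eps)
       for e in np.eye(2)]
print(np.allclose(num, gradF(2, 3, x), atol=1e-5))   # True
```

This kind of finite-difference check is a cheap way to validate the gradients before wiring them into the descent.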

Decomposition in SE(2)
The special Euclidean group SE(2) contains the set of planar translations and rotations. A state in SE(2) consists of a 2D position and a heading θ. Spatial distributions over SE(2) are common in mobile sensing tasks, where the measurement model depends on robot position and orientation. The mobile sensor might be a ground vehicle or a flying robot with an orientation-dependent sensor like a camera; what the sensor sees depends on the vehicle's location and orientation. Then the expected information density covers a region in SE(2), denoting the value of each robot position-orientation combination. Miller and Murphey [9] suggest using the following basis function for SE(2) domains:

F_k(x) = exp(i[k_1 ψ + (k_2 − k_1)θ]) J_{k_1−k_2}(k_3 r), (10)

where i is the imaginary unit, J_{k_1−k_2} is the (k_1 − k_2)th-order Bessel function and (r, ψ, θ) is the agent's state expressed in polar coordinates. That is, if the state is x = [p_x, p_y, θ]ᵀ, then

r = √(p_x² + p_y²),  ψ = arctan(p_y / p_x). (11)

When using distributions over ℝ^s, c_k and ϕ_k are scalars, so the norm in (4) simply functions as an absolute value. However, the basis function for SE(2) defined in (10) returns complex numbers. Therefore, when using distributions over SE(2), the Euclidean norm on complex numbers should be used [9]:

∥c_k − ϕ_k∥² = Re(c_k − ϕ_k)² + Im(c_k − ϕ_k)², (12)

where the functions Re and Im yield the real and imaginary parts of their arguments, respectively. The multi-index k takes a slightly different meaning in the context of SE(2) [13]. The indices k_1 and k_2 take integer values between −K_1, …, 0, …, K_1 and −K_2, …, 0, …, K_2. In theory, k_3 can take any non-negative value. To limit the number of coefficients used, we limit k_3 to positive integers up to some K_3.
The inverse Fourier transform used to reconstruct a distribution from the coefficients ϕ_k is also more complex than in the Euclidean case:

ϕ′(x) = ∑_{k_1, k_2} ∫_0^∞ ϕ_k F̄_k(x) k_3 dk_3, (13)

where ∑_{k_1, k_2} is the sum over all k_1 and k_2, and F̄_k is the complex conjugate of F_k. If we limit k_3 to positive integers as described above, the inverse transform simplifies to

ϕ′(x) ≈ ∑_k ϕ_k F̄_k(x). (14)

We found the approximation introduced by limiting k_3 to integer values to be reasonable; reconstructed distributions roughly matched the originals. To find the gradient with respect to the state x = [p_x, p_y, θ]ᵀ, we start with the partial derivatives with respect to each state variable. Writing exp(⋅) = exp(i[k_1 ψ + (k_2 − k_1)θ]) and letting J′ denote the derivative of the Bessel function, the partial derivative with respect to p_x is

∂F_k/∂p_x = [k_3 J′_{k_1−k_2}(k_3 r) (∂r/∂p_x) + i k_1 J_{k_1−k_2}(k_3 r) (∂ψ/∂p_x)] exp(⋅), (15)

where the derivatives are

∂r/∂p_x = p_x / r,  ∂ψ/∂p_x = −p_y / r². (16)

The derivative ∂F_k/∂p_y is identical to (15) but uses the derivatives

∂r/∂p_y = p_y / r,  ∂ψ/∂p_y = p_x / r². (17)

The partial derivative with respect to θ is

∂F_k/∂θ = i(k_2 − k_1) J_{k_1−k_2}(k_3 r) exp(⋅). (18)

Putting together the partial derivatives, the gradient of the SE(2) basis function is

∇F_k(x) = [∂F_k/∂p_x, ∂F_k/∂p_y, ∂F_k/∂θ]ᵀ. (19)

Discretisation
So far, we have discussed ergodicity in continuous domains and trajectories. However, most real systems are discretely implemented; consider active sensing tasks, where sensors often output discrete measurements at some sampling frequency. We want these measurement locations to be ergodic. Furthermore, discretising will enable faster trajectory generation. Therefore, we focus on discrete trajectories. Computational improvements can also come from discretising the density ϕ. The domain can be discretised into a set of cells X_cells. We assume each cell has the same size V_c, with the notion of size depending on the number of dimensions in X (size is width in 1D, area in 2D, volume in 3D etc.). Denoting the centre point of a cell as x_c, the Fourier coefficient ϕ_k can then be approximated as

ϕ_k ≈ ∑_{x_c ∈ X_cells} ϕ(x_c) F_k(x_c) V_c. (20)

Just as in the continuous case, the distribution must be normalised so that ∑_{x_c ∈ X_cells} ϕ(x_c) V_c = 1. Of course, the approximation improves as the number of cells increases, but increasing the number of cells increases computational expense. Trajectories are often discretised in path planning and control problems. We use x to denote a discrete trajectory consisting of N + 1 discrete points with Δt time between them. We use x_n ∈ X to denote the state at timestep n (or time nΔt), for n = 0, …, N. The set of N discrete control inputs is denoted ũ; for all n = 0, …, N − 1, ũ_n is the control input applied at t = nΔt for a duration of Δt.
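The cell-based approximation of (20) can be sketched in Python as follows; the Gaussian density and cell count are assumptions for illustration. The k = (0, 0) coefficient is a convenient sanity check because it reduces to the integral of ϕ, which is 1 after normalisation.

```python
import numpy as np

L1, L2 = 1.0, 1.0
n_cells = 100                         # cells per dimension (assumed)
edges = np.linspace(0.0, 1.0, n_cells + 1)
centres = (edges[:-1] + edges[1:]) / 2.0
X, Y = np.meshgrid(centres, centres, indexing="ij")
V_c = (1.0 / n_cells) ** 2            # area of one cell

# Normalised Gaussian density over the cell centres, so sum(phi) * V_c = 1.
phi = np.exp(-0.5 * (((X - 0.3) / 0.1) ** 2 + ((Y - 0.7) / 0.1) ** 2))
phi /= phi.sum() * V_c

def phi_k(k1, k2):
    # discretised coefficient of eq. (20) using the 2D cosine basis
    h1 = np.sqrt(L1) if k1 == 0 else np.sqrt(L1 / 2.0)
    h2 = np.sqrt(L2) if k2 == 0 else np.sqrt(L2 / 2.0)
    Fk = np.cos(k1 * np.pi * X / L1) * np.cos(k2 * np.pi * Y / L2) / (h1 * h2)
    return np.sum(phi * Fk) * V_c

print(phi_k(0, 0))   # ≈ 1: the k = 0 coefficient just integrates phi
```

In practice these coefficients are computed once, up front, for all multi-indices k.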
Using this trajectory discretisation, the trajectory Fourier coefficients c_k can be defined. Prabhakar et al. [10] suggested using a left Riemann sum:

c_k = (1/(NΔt)) ∑_{n=0}^{N−1} F_k(x_n) Δt = (1/N) ∑_{n=0}^{N−1} F_k(x_n).

We prefer a slightly different discretised approximation for c_k:

c_k = (1/(N + 1)) ∑_{n=0}^{N} F_k(x_n). (21)

This version makes explicit that we are looking for an ergodic distribution of discrete points. This makes sense in certain contexts, such as information gathering, where mobile sensors often receive measurements discretely at some frequency. We also add the final state x_N to the ergodic score; otherwise, there is no reason to apply any control at time step N − 1, in which case x_{N−1} and ũ_{N−2} are the last state and control input that matter in the optimisation, effectively reducing the horizon to N − 1. However, some users might want to remove the first or final states from the ergodic score. For example, if no measurement is made in the first state (perhaps measurements are only made after moving to a state), we might not count it towards the ergodicity. It is possible to remove the first or final states from the ergodic metric, as long as the sum in (21) is divided by the total number of points used in the sum.

System dynamics
When designing a trajectory, the system's dynamics need to be satisfied. We describe a system's dynamics by showing how the system state evolves in time as a function of state and control input:

ẋ(t) = f(x(t), u(t)). (22)

The dynamics can also have an explicit time dependence, but we ignore it here without loss of generality.
In discrete trajectories with points temporally spaced Δt apart, the state update is

x_{n+1} = x_n + f(x_n, ũ_n)Δt. (23)

This update is the forward Euler method, an explicit integration scheme. It is efficient to compute but can introduce errors, as we discuss in Section 5.3. In order to optimise trajectories efficiently, it sometimes helps to linearise about a nominal trajectory. Suppose we have a continuous trajectory x with the corresponding control sequence u. We can linearise about this trajectory to see how perturbations, small changes in the trajectory, propagate. The perturbations δx(t) and δu(t) are small changes in the state and control at time t. If the perturbations are small, we can approximate their propagation as

δẋ(t) = A(t)δx(t) + B(t)δu(t), (24)

where A(t) and B(t) represent the linearisation at time t:

A(t) = ∇_x f(x(t), u(t)),  B(t) = ∇_u f(x(t), u(t)). (25)

Note that f is a vector-valued function, so ∇_x f is a matrix whose entry in row i and column j is

[∇_x f]_{ij} = ∂f_i/∂x_j. (26)

We use the subscript on ∇ to denote which argument to differentiate with respect to; we drop the subscript if there is only one argument. We can write a perturbation update for the discrete case using the forward Euler method:

δx_{n+1} = δx_n + (A_n δx_n + B_n δũ_n)Δt, (27)

where A_n and B_n are simply evaluated at time nΔt:

A_n = A(nΔt),  B_n = B(nΔt). (28)

We can simplify the discrete perturbation update to

δx_{n+1} = Ã_n δx_n + B̃_n δũ_n, (29)

where the matrices Ã_n = I + A_n Δt and B̃_n = B_n Δt, and I is the appropriately-sized identity matrix.
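As an illustration of the perturbation update in (29), the following Python sketch linearises an assumed Dubins-style model (state [p_x, p_y, θ] with a turn-rate input) and checks that the linearised update closely matches the nonlinear forward Euler step for a small perturbation.

```python
import numpy as np

v, dt = 0.3, 0.1   # forward speed and time step, chosen for illustration

def f(x, u):
    # Dubins-style dynamics: constant speed, steered by turn rate u
    return np.array([v * np.cos(x[2]), v * np.sin(x[2]), u])

def step(x, u):
    return x + f(x, u) * dt           # forward Euler update

def lin(x, u):
    # Jacobians of f, then A_tilde = I + A*dt and B_tilde = B*dt (eq. (29))
    A = np.array([[0.0, 0.0, -v * np.sin(x[2])],
                  [0.0, 0.0,  v * np.cos(x[2])],
                  [0.0, 0.0,  0.0]])
    B = np.array([[0.0], [0.0], [1.0]])
    return np.eye(3) + A * dt, B * dt

x, u = np.array([0.5, 0.5, 0.3]), 0.2
At, Bt = lin(x, u)

# Propagate a small perturbation two ways; they should nearly agree.
dx, du = np.array([1e-4, -2e-4, 1e-4]), 5e-5
exact = step(x + dx, u + du) - step(x, u)
approx = At @ dx + (Bt @ [du]).ravel()
print(np.allclose(exact, approx, atol=1e-7))   # True
```

The residual is second order in the perturbation size, which is exactly the approximation the projection operator later has to correct.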

Objective and constraints
The goal in ergodic trajectory generation is to produce trajectories that yield a low ergodic metric while using little control effort. This goal is captured with the objective

J(x, ũ) = qℰ(x) + ∑_{n=0}^{N−1} ũ_nᵀ R ũ_n Δt + J_b(x), (30)

where q weights the ergodic metric, R weights control effort and J_b is an optional boundary penalty applied if the sensor leaves the domain.

There are two constraints on the optimisation. The first is that the trajectory adheres to the vehicle's dynamics. The second is that the first state of the resulting trajectory must match the initial state x_0. The optimisation problem can then be defined as

minimise_{x, ũ}  J(x, ũ)
subject to  x_{n+1} = x_n + f(x_n, ũ_n)Δt,  n = 0, …, N − 1,  with x_0 fixed. (31)

In practice, the boundary penalty J_b is often needed. The issue arises from the periodic nature of the basis functions in the ergodic metric [3,10]. A result of this periodicity is that a trajectory can achieve a low ergodic cost while spending no time in the search domain over which ϕ is defined. This phenomenon is well described in Section 3.7.3 of [3], where a quadratic barrier function is briefly suggested. Instead of constraining the trajectory to remain in the domain, we penalise deviations from the domain [14]. To keep the trajectory in a rectangular domain, we use

J_b(x) = c_b ∑_{n=0}^{N} ∑_{i=1}^{s} ( max(x_{n,i} − L_i, 0) + max(−x_{n,i}, 0) )², (32)

where c_b > 0 weights the boundary penalty and x_{n,i} is the ith element of state x_n.

The weights q, R and c_b are chosen for the desired performance. When q is large compared to the elements of R, preference is given to minimising the ergodic metric. When the elements of R are large compared to q, preference is given to minimising control effort. If the resulting trajectory leaves the domain, c_b can be increased.
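A quadratic boundary penalty of this kind can be sketched in a few lines of Python; the domain bounds, penalty weight and trajectories below are illustrative assumptions.

```python
import numpy as np

L = np.array([1.0, 1.0])   # domain upper bounds; lower bounds at 0 (assumed)
c_b = 1.0                  # boundary penalty weight

def J_b(traj):
    # states inside [0, L] incur no cost; deviations are penalised quadratically
    over = np.maximum(traj - L, 0.0)    # distance past the upper bound
    under = np.maximum(-traj, 0.0)      # distance below zero
    return c_b * np.sum(over ** 2 + under ** 2)

inside = np.array([[0.2, 0.5], [0.9, 0.1]])
outside = np.array([[1.3, 0.5], [-0.2, 0.1]])
print(J_b(inside))    # 0.0
print(J_b(outside))   # 0.3**2 + 0.2**2 ≈ 0.13
```

Because the penalty is zero everywhere inside the domain, it leaves the optimisation untouched until the trajectory actually strays.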

Solution method
Unfortunately, the optimisation problem defined in (31) is not convex and there exists no efficient way to find the global optimum. However, we can use an iterative gradient descent method -specifically, a projection-based descent. The projection method is attractive because nonlinear constraints are removed during calculation of the descent direction, making the computation much simpler. However, the resulting trajectory might not be dynamically feasible, so it is projected back into feasible space.
Before starting the optimisation, the spatial distribution ϕ must be decomposed into Fourier coefficients ϕ k . This decomposition can be computationally expensive and must be included when evaluating the time to generate a trajectory. After spatial decomposition, the optimisation is started with an initial trajectory and control sequence, which are iteratively modified until a local minimum in the objective function is reached. Often, this initial trajectory is generated by applying a constant control input. However, the initial trajectory can change the local minimum reached, so it can make sense to run the optimisation from different initial trajectories.
Each iteration consists of five steps. First, the matrices Ã n and B n are computed for each state and control in the current trajectory. Second, the descent direction is computed. Third, gains for the projection operator are computed. Fourth, a descent step size γ is selected. Finally, the step size and descent direction are combined with the current trajectory to yield a candidate trajectory. This candidate might not be dynamically feasible, so it is projected into a feasible trajectory. This trajectory is then modified in the next iteration unless a convergence criterion is met. This criterion is typically a directional derivative threshold.

Descent direction
The descent direction describes how the current trajectory and control sequence should change to decrease the objective function. The trajectory descent direction is denoted z and consists of N + 1 directions z_n, one for each state x_n in x. Likewise, ṽ is the descent direction for the control sequence ũ, with one direction ṽ_n for each input ũ_n. The total descent direction (z, ṽ) is derived from the gradient of the objective and can be found by solving the following linear quadratic (LQ) problem:

minimise_{z, ṽ}  ∑_{n=0}^{N} [ ã_nᵀ z_n + ½ z_nᵀ Q_D z_n ] + ∑_{n=0}^{N−1} [ b_nᵀ ṽ_n + ½ ṽ_nᵀ R_D ṽ_n ]
subject to  z_{n+1} = Ã_n z_n + B̃_n ṽ_n,  z_0 = 0, (34)

where ã_n and b_n are the gradients of the objective J with respect to state x_n and control ũ_n, respectively. Each element of the initial state descent direction z_0 is constrained to zero because the sensor's starting state is fixed and cannot be changed. The descent direction is constrained to the perturbational dynamics from (29) because the descent direction is effectively a perturbation from the current trajectory. Equation (34) has been theoretically justified [3,8,15], but we briefly provide some intuition here. Our goal is to perform gradient descent on the objective in (31). Because the gradient points in the direction of greatest increase of the function, we should step in the opposite direction, which is {−ã_n} and {−b_n}. However, the resulting trajectory might not satisfy the agent's dynamics. Therefore, we start with the optimisation problem of minimising ∑ ã_nᵀ z_n + b_nᵀ ṽ_n, which is minimised when {z_n} and {ṽ_n} point along the negative gradients. Generally, imposing non-linear constraints on an optimisation problem makes it much harder to solve, so we apply the linear constraint from (29), assuming the descent direction is a perturbation about the current trajectory. Later, a projection operator will correct any errors by imposing the dynamics constraints. The objective ∑ ã_nᵀ z_n + b_nᵀ ṽ_n is unbounded below because the linear constraint does not prevent the descent directions from being arbitrarily scaled.
Therefore, we penalise the descent magnitude with the quadratic cost ∑ z_nᵀ Q_D z_n + ṽ_nᵀ R_D ṽ_n. We assume the positive semi-definite matrix Q_D and positive definite matrix R_D are constant instead of using Q_n and R_n that depend on timestep n; in most cases, there is no reason to vary these weights with time. It is tempting to choose Q_D and R_D to reflect the weights on ergodic and control costs, but the weights q and R already appear in ã_n and b_n, respectively. In practice, Q_D and R_D are usually set to identity matrices.
Let us now explicitly define the gradient ã_n. The objective's gradient with respect to state x_n is

ã_n = q ∇_{x_n} ℰ + ∇_{x_n} J_b. (35)

The gradient of the ergodic metric is

∇_{x_n} ℰ = (2/(N + 1)) ∑_k Λ_k (c_k − ϕ_k) ∇F_k(x_n), (36)

where ∇F_k(x_n) depends on the basis function used. When using the SE(2) basis function, c_k − ϕ_k and ∇F_k(x_n) are complex, and their real and imaginary parts are combined according to the squared Euclidean norm on complex numbers, yielding a real gradient [9]. The boundary penalty gradient with respect to x_n has ith element

[∇_{x_n} J_b]_i = 2c_b ( max(x_{n,i} − L_i, 0) − max(−x_{n,i}, 0) ). (37)

This gradient is zero if x_n is within the domain. The control gradient b_n is

b_n = Δt R ũ_n. (38)
Once the gradients are computed, a Riccati-like backwards iteration can be used to compute the solution to the LQ problem. Starting from P_N = Q_D and r_N = ã_N, we iterate backwards for n = N − 1, …, 0:

K_n = (R_D + B̃_nᵀ P_{n+1} B̃_n)⁻¹ B̃_nᵀ P_{n+1} Ã_n, (39)
P_n = Q_D + Ã_nᵀ P_{n+1} (Ã_n − B̃_n K_n), (40)
r_n = ã_n + (Ã_n − B̃_n K_n)ᵀ r_{n+1} − K_nᵀ b_n. (41)

Once the Riccati gains are computed, the descent directions for each individual state and control input follow from a forward pass with z_0 = 0:

ṽ_n = −K_n z_n − (R_D + B̃_nᵀ P_{n+1} B̃_n)⁻¹ (B̃_nᵀ r_{n+1} + b_n),
z_{n+1} = Ã_n z_n + B̃_n ṽ_n.
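The backward-forward structure of this solve can be sketched as follows. This Python example uses one standard form of the affine-LQ recursion on a small random problem and verifies it against a direct solve of the same quadratic programme; it is a sketch, not the package's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
N, ns, nu = 4, 3, 2
A = [rng.normal(size=(ns, ns)) * 0.3 + np.eye(ns) for _ in range(N)]
B = [rng.normal(size=(ns, nu)) * 0.1 for _ in range(N)]
a = [rng.normal(size=ns) for _ in range(N + 1)]   # state gradients a_n
b = [rng.normal(size=nu) for _ in range(N)]       # control gradients b_n
Q, R = np.eye(ns), np.eye(nu)                     # Q_D, R_D

# Backward pass: value function terms P_n, r_n and gains K_n, offsets C_n.
P, r = Q.copy(), a[N].copy()
K, C = [None] * N, [None] * N
for n in reversed(range(N)):
    S = R + B[n].T @ P @ B[n]
    K[n] = np.linalg.solve(S, B[n].T @ P @ A[n])
    C[n] = np.linalg.solve(S, B[n].T @ r + b[n])
    P = Q + A[n].T @ P @ (A[n] - B[n] @ K[n])
    r = a[n] + (A[n] - B[n] @ K[n]).T @ r - K[n].T @ b[n]

# Forward pass for the descent directions, with z_0 = 0.
z, v = np.zeros(ns), []
for n in range(N):
    v.append(-K[n] @ z - C[n])
    z = A[n] @ z + B[n] @ v[-1]

# Direct check: express the states as a linear map of the stacked controls
# and minimise  sum a_n'z_n + b_n'v_n + (1/2)(z_n'Qz_n + v_n'Rv_n).
G = np.zeros((N + 1, N, ns, nu))
for n in range(N):
    G[n + 1, n] = B[n]
    for m in range(n + 2, N + 1):
        G[m, n] = A[m - 1] @ G[m - 1, n]
Gm = G.transpose(0, 2, 1, 3).reshape((N + 1) * ns, N * nu)
H = Gm.T @ np.kron(np.eye(N + 1), Q) @ Gm + np.kron(np.eye(N), R)
h = Gm.T @ np.concatenate(a) + np.concatenate(b)
v_direct = np.linalg.solve(H, -h)
print(np.allclose(np.concatenate(v), v_direct, atol=1e-6))   # True
```

The backward pass costs O(N) small matrix operations, whereas the direct solve grows cubically with the horizon, which is why the Riccati form is used in practice.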

Projection operator
For the moment, let us assume we have a step size γ > 0 and ignore the process by which we attain it. This step size can be used with the descent direction (z, ṽ) to create a candidate trajectory α and candidate control sequence μ:

α = x + γz,  μ = ũ + γṽ.

The candidate trajectory and control sequence might not be feasible; they might not satisfy the system dynamics if the system is non-linear. Therefore, the candidate (α, μ) is projected into feasible space with the projection operator P:

P(α, μ, x_0):  ũ_n = μ_n + K_n(α_n − x_n),  x_{n+1} = x_n + f(x_n, ũ_n)Δt, (42)

where the gains K_n come from an LQ regulator (LQR) problem. If the agent's dynamics are non-linear, the projected trajectory will deviate from the candidate trajectory. However, the candidate trajectory was computed to reduce the objective function, so we want to minimise these deviations. If we use δx_n and δũ_n to denote deviations from the candidate trajectory and control sequence at timestep n, we can write the following LQR problem:

minimise  ∑_{n=0}^{N} δx_nᵀ Q_P δx_n + ∑_{n=0}^{N−1} δũ_nᵀ R_P δũ_n (43)
subject to  δx_{n+1} = Ã_n δx_n + B̃_n δũ_n,  n = 0, …, N − 1.

The gains K_n can be found with the same Riccati equations used for the LQ problem, but substituting the projection weights Q_P and R_P for the descent weights Q_D and R_D. If Q_P = Q_D and R_P = R_D, the resulting gains K_n from the LQ and LQR problems will be equal, eliminating the need to actually solve the LQR problem. Setting Q_D and R_D to identity matrices usually provides acceptable results [16], and we suggested setting the LQ weights to identity matrices in Section 3.4; therefore, we arrive at Q_P = Q_D and R_P = R_D in practice. Regardless of how the LQR gains K_n are obtained, they are then plugged into the projection operator from (42).
Recall that matrices Ã n and B n are linearisations about the trajectory at the beginning of the current iteration. Let us call this the nominal trajectory. In the projection operator, we are not minimising perturbations about this trajectory but rather the candidate trajectory. However, we assume this candidate is just a perturbation about the nominal trajectory and that the linearisation is still valid [16,17].
If the system dynamics are linear, the candidate trajectory will be feasible, because the descent (LQ) problem uses linearised dynamics. The candidate trajectory will then be a fixed point of P; the projected trajectory equals the input trajectory [15]. Therefore, the projection is unnecessary if the agent has linear dynamics.
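The projection operator of (42) is simple to implement: roll the dynamics forward while feeding back on the difference between the candidate and the evolving state. The following Python sketch uses assumed scalar dynamics and a constant gain for brevity (in practice the gains K_n come from the LQR solve); by construction, the projected trajectory satisfies the forward Euler update exactly.

```python
import numpy as np

dt = 0.1

def f(x, u):
    # scalar single-integrator-with-drag dynamics, assumed for illustration
    return u - 0.5 * x

def project(alpha, mu, K, x0):
    # projection operator: u_n = mu_n + K (alpha_n - x_n), then integrate
    x, u = [x0], []
    for n in range(len(mu)):
        u.append(mu[n] + K * (alpha[n] - x[n]))
        x.append(x[n] + f(x[n], u[n]) * dt)
    return np.array(x), np.array(u)

# An infeasible candidate (it ignores the dynamics entirely) ...
alpha = np.linspace(0.0, 1.0, 11)
mu = np.zeros(10)
x_proj, u_proj = project(alpha, mu, K=2.0, x0=0.0)

# ... becomes feasible: every state satisfies the Euler update exactly.
feasible = all(
    np.isclose(x_proj[n + 1], x_proj[n] + f(x_proj[n], u_proj[n]) * dt)
    for n in range(10)
)
print(feasible)   # True
```

The feedback term pulls the feasible trajectory toward the candidate, so a candidate that nearly satisfies the dynamics is changed very little.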

Step size and termination
Once a descent direction is found, a step size is needed. There exist simple ways to pick one; if γ_0 is an arbitrary constant and i is the optimisation iteration, then γ = γ_0/i or γ = γ_0/√i will converge to a local optimum. These simple options are efficient to compute but might lead to excessive iterations.
Backtracking line search has been used effectively in existing projection-based ergodic trajectory research [1,3,[8][9][10]. The candidate step size is initialised to γ_0 and multiplied by a constant τ ∈ (0, 1) each iteration; for example, τ = 0.5 halves the candidate step size each iteration. Iteration terminates when the Armijo condition [18] is satisfied:

J(x + γz, ũ + γṽ) ≤ J(x, ũ) + ργ ⟨∇J, (z, ṽ)⟩, (44)

where ρ ∈ (0, 1) is an arbitrary constant and the directional derivative, or slope, along the descent direction is

⟨∇J, (z, ṽ)⟩ = ∑_{n=0}^{N} ã_nᵀ z_n + ∑_{n=0}^{N−1} b_nᵀ ṽ_n, (45)

where ã_n and b_n are the gradients from (35) and (38). This backtracking search is an inexact line search, meant to efficiently find a step size that is 'good enough' and sufficiently decreases the objective function. The idea is to start with a large step size and reduce it until the Armijo condition is satisfied; this process is guaranteed to lead to convergence in the trajectory optimisation [17]. With enough iterations, the Armijo condition will eventually be satisfied. If the value of ρ were 1, the right side of (44) would simply be a first-order approximation of the left side. Because ρ < 1, the right side of (44) will be larger than the left for sufficiently small γ [14].
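The backtracking search can be sketched in a few lines; this Python example applies the Armijo condition of (44) to a simple 1D function rather than the trajectory objective, with γ_0, ρ and τ chosen arbitrarily.

```python
def armijo_step(J, x, grad, direction, gamma0=1.0, rho=0.5, tau=0.5):
    """Shrink gamma until J(x + gamma*d) <= J(x) + rho*gamma*slope."""
    slope = grad * direction    # directional derivative along the descent
    gamma = gamma0
    while J(x + gamma * direction) > J(x) + rho * gamma * slope:
        gamma *= tau            # backtrack: shrink the candidate step
    return gamma

# Illustrative 1D objective with its gradient, starting from x = 0.
J = lambda x: (x - 2.0) ** 2
x = 0.0
grad = 2.0 * (x - 2.0)          # = -4
gamma = armijo_step(J, x, grad, direction=-grad)
print(gamma, J(x + gamma * -grad) < J(x))   # 0.5 True
```

For a descent direction the loop always terminates, since the condition holds for sufficiently small γ.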
Once the step size γ has been selected, the descent is carried out and projected into feasible space, and we assign the result as our new trajectory and control sequence:

x, ũ := P(x + γz, ũ + γṽ, x_0).
A first-order necessary condition for local optimality is that the directional derivative in the descent direction (45) equal zero. Therefore, iterations can be carried out until the magnitude of this term falls below some threshold [3]; thresholds on the order of 1×10⁻³ have been used [8]. Alternative termination conditions include a maximum number of iterations or an ergodic threshold. However, there is no guarantee that a specified ergodic threshold will be reached; due to the vehicle dynamics or the spatial distribution, it is possible that any feasible trajectory will have a high ergodic cost.

Software package

We present ErgodicControl.jl, a software package that generates ergodic trajectories using the methods described in this paper. This package is written in the Julia programming language, a high-level programming language with good performance [11]. As a result, our software package offers two major benefits. The first benefit is ease of use. In contrast to C code, Julia code can be run interactively and plotted easily. Below is some example code that could be run as a script or entered line by line into the Julia shell. Fig. 2 shows the resulting plot. This code is only an example, and the package might change in the future. Current code and documentation can be found at https://github.com/dressel/ErgodicControl.jl.
It is also easy to augment the code with new types and abilities. For example, the user can introduce new dynamics models by extending the abstract Dynamics type. The user must extend functions defining how these dynamics are to be linearised. This is feasible in a language like C, but easier in a higher-level language like Julia. All examples in this paper are generated with the software package, highlighting its flexibility.
The second major benefit of ErgodicControl.jl is that it is much faster than previously reported results. The computational expense of projection-based trajectory generation is cited as a drawback and important area of future work [3,10,19]. Long compute times are especially detrimental in tasks that require recomputing trajectories in real time. Recomputing trajectories in a model-predictive fashion has been suggested for active sensing tasks like localisation; the agent plans a trajectory based on the current expected information density and recomputes as new measurements are made and the information density changes [2].

Speed comparison
To compare the speed with existing software, we use a double integrator example presented by Prabhakar et al. [10]. This example is a 30 s trajectory with time steps of Δt = 0.6 s, designed to be ergodic with respect to a double Gaussian in ℝ². The ergodic and control weights were q = 1 and R = 0.01I. The Fourier decomposition used K_1 = K_2 = 5 coefficients. This number seems to capture the spatial distribution well while keeping computation reasonable, so we use the same number in all examples shown in this paper.
Prabhakar et al. [10] report that optimising discrete trajectories is much faster than optimising continuous trajectories because simple Riemann sums are easier to compute than more complicated integration schemes. However, we also discretise the domain into 100² cells and integrate with (20). Prabhakar et al. [10] report around 50 s to run 100 iterations of the gradient descent, while our code performs 100 iterations in roughly 0.2 s, over two orders of magnitude faster. We only run 100 iterations to compare with previous work; the trajectory reaches a directional derivative criterion of 1×10⁻³ after 43 iterations and 0.09 s. The resulting trajectory looks similar to those in Fig. 3.
Much of this speed improvement is likely due to our use of Julia, whereas previous work used Mathematica [10]. We might see further improvements with a C++ implementation. Further, there are other methods for generating ergodic trajectories that do not use projection-based optimisation and can be performed more quickly. However, our speed improvement demonstrates that projection-based ergodic trajectories can be generated in real time; to our knowledge, no other software package achieves this. As a result, users can now generate projection-based ergodic trajectories for applications that require computing trajectories in real time.
However, there is still room for improvement. For example, trajectory generation in SE(2) is much slower because of the computationally expensive Bessel function. Trajectory generation for an SE(2) example with the domain split into 50³ cells takes roughly 3 s; half is spent on the domain's Fourier decomposition and half on the 56 iterations to convergence. This is too slow for real-time use, and future work might try to minimise the cost of the Bessel function calls.

Effect of domain discretisation
Domain discretisation is one factor in our code's improved performance. A reasonable concern is the effect of this discretisation: does the resulting coarse numerical integration over the domain affect the generated trajectory? To investigate, we performed the same example from above with 10² cells and 1000² cells. The resulting trajectories can be seen in Fig. 3. The discretisation level does not seem to have much effect on the trajectory. However, the 1000² trajectory takes longer to compute: 100 iterations took 0.35 s, about half of which was spent decomposing the spatial distribution. In contrast, both the 10² and 100² trajectories took 0.2 s, and almost none of this time was spent decomposing the distribution because the numerical integration was much coarser. Although this test suggests the domain discretisation level has little effect on the resulting trajectory, it is not conclusive; discretisation might introduce noticeable errors with distributions that are less smooth or when more Fourier coefficients are used.
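To make the discretisation concrete, the sketch below approximates the Fourier coefficients of a smooth density on the unit square with a Riemann sum over cell centres, at a coarse and a fine grid. This is an illustrative Python/NumPy sketch, not code from ErgodicControl.jl; the function names (`basis`, `decompose`) are our own, and the basis normalisation factor h_k is omitted for brevity.

```python
import numpy as np

def basis(k1, k2, X, Y):
    # Cosine basis on the unit square; the h_k normalisation is omitted.
    return np.cos(k1 * np.pi * X) * np.cos(k2 * np.pi * Y)

def decompose(phi, cells_per_side, K=5):
    """Riemann-sum Fourier decomposition of a density phi over [0, 1]^2."""
    n = cells_per_side
    centres = (np.arange(n) + 0.5) / n           # cell centres
    X, Y = np.meshgrid(centres, centres, indexing="ij")
    Vc = 1.0 / n**2                              # cell volume
    density = phi(X, Y)
    density = density / (density.sum() * Vc)     # normalise to integrate to 1
    coeffs = np.zeros((K + 1, K + 1))
    for k1 in range(K + 1):
        for k2 in range(K + 1):
            coeffs[k1, k2] = np.sum(density * basis(k1, k2, X, Y)) * Vc
    return coeffs

# A smooth Gaussian bump centred in the domain.
gauss = lambda X, Y: np.exp(-((X - 0.5)**2 + (Y - 0.5)**2) / (2 * 0.1**2))

coarse = decompose(gauss, 10)
fine = decompose(gauss, 1000)
```

For this smooth distribution the coarse and fine grids agree closely, which is consistent with the observation above that the discretisation level has little effect.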

Boundary penalty
A final example shows the importance of the boundary penalty in certain cases. We generate a trajectory for a Dubins car [20] with a velocity of 0.3 m/s and a minimum turn radius of 0.1 m. Fig. 4 shows two trajectories generated identically, except that on the left no boundary penalty is applied. In this case, the car leaves the domain because applying control (rotating the car) has some cost; the car can instead apply no control and leverage the periodic basis functions. The basis functions make it seem that Gaussian distributions are mirrored across the boundary; by applying no control, the car can traverse these mirrored distributions and achieve an ergodic trajectory. When the penalty from (32) is applied and weighted with c_b = 1, the car stays in the domain.
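The exact form of the penalty (32) is not repeated here, but a common choice is a quadratic penalty on the portion of the state outside the domain. The following Python sketch is our own illustrative form of such a penalty (not the package's implementation): zero inside a box domain and growing quadratically outside it, with a smooth gradient that pushes states back toward the boundary.

```python
import numpy as np

def boundary_penalty(x, lo, hi, cb=1.0):
    """Quadratic penalty on the portion of a state x outside the box [lo, hi].

    Zero inside the domain, so trajectories that stay inside are unaffected;
    grows with the squared distance outside the box, giving a smooth gradient.
    """
    below = np.minimum(x - lo, 0.0)   # negative where x < lo, else zero
    above = np.maximum(x - hi, 0.0)   # positive where x > hi, else zero
    penalty = cb * np.sum(below**2 + above**2)
    gradient = 2.0 * cb * (below + above)
    return penalty, gradient

# Inside the unit square: no penalty, zero gradient.
p_in, g_in = boundary_penalty(np.array([0.5, 0.5]), 0.0, 1.0)
# Outside in x: non-zero penalty, gradient pointing back into the domain.
p_out, g_out = boundary_penalty(np.array([1.2, 0.5]), 0.0, 1.0)
```

Because the penalty is differentiable, its gradient can simply be added to the objective gradient ã_n for each state.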

Additional implementation details
We have covered the basic process of generating ergodic trajectories, but there are a number of details and extensions that are relevant to certain problems. We cover some of these here.

Workspace and agent state space
So far we have ignored an important detail concerning the problem domain. Sometimes, the agent's state space differs from the domain over which our distribution is defined. That is, our agent has some state space X_a such that X_a ≠ X. Then the agent trajectory is a function x: [0, T] → X_a; in the discrete case, we can say x_n ∈ X_a ≠ X. This mismatch is not uncommon. For example, consider a distribution over a domain X ⊂ ℝ² and an agent with double integrator dynamics in ℝ². The agent state space X_a ⊂ ℝ⁴ consists of two position and two velocity state variables. This scenario might occur in a mobile sensing task where measurements depend only on the agent's 2D position, but we need the velocities to propagate the agent's dynamics. Another example is a Dubins car with trajectories ergodic with respect to a distribution over ℝ²: the car's state is three-dimensional (2D position plus heading), whereas the distribution's domain consists only of the 2D position. We call X the workspace or distribution domain and X_a the agent's state space.
In the above examples, X a is just an augmentation of X; velocity or heading variables are simply added to the position variables. However, in other applications, the mismatch is more significant. Consider a robotic arm with several links. We might want the end effector of this arm to move ergodically with respect to some distribution in Euclidean space, but the arm's configuration space might be expressed more conveniently in the arm's link angles [3]. The ergodic metric depends on the end effector's position in the workspace, but controlling the system requires the link angles.
To rectify the mismatch between X and X_a, we need a differentiable function g: X_a → X mapping the agent's state space to the workspace. In the double integrator example, with agent state x(t) = [p_x, p_y, ν_x, ν_y]^⊤, the mapping is g(x(t)) = [p_x, p_y]^⊤, leaving only the position variables. Likewise, a Dubins vehicle with the state x(t) = [p_x, p_y, θ]^⊤ would have the mapping g(x(t)) = [p_x, p_y]^⊤. The joint example is more complicated and involves solving the Euler-Lagrange equations, but a relation g(x(t)) can be derived [3]. The agent states must be mapped into the distribution domain before feeding them to the ergodic metric. Instead of ℰ(x) and c_k(x), we have ℰ(g(x)) and c_k(g(x)), where g(x) applies g to each state x_n in x. By the chain rule, an extra term ∇g(x_n) appears in the ergodic gradient, where x_n′ = g(x_n): the gradient of F_k(x_n′) with respect to the agent state x_n is ∇g(x_n) ∇F_k(x_n′).
Consider the 2D double integrator example. The extra term ∇g(x_n) is a constant 4 × 2 matrix whose top 2 × 2 block is the identity and whose bottom 2 × 2 block is zero. The gradient of the basis function with respect to the mapped state g(x_n) is a vector of length two; that is, ∇F_k(x_n′) ∈ ℝ². Left-multiplying by ∇g(x_n) simply pads this vector with two zeros, leading to a vector of length four. The agent state is four-dimensional, so the gradient with respect to this variable must be of length four. This example shows that augmentation mismatches are trivial to correct: we simply take the gradient with respect to the agent state variables that make up the distribution domain and set the partial derivatives with respect to the augmented variables to zero.
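The padding argument can be checked numerically. In this Python sketch, `g` and its constant gradient matrix `grad_g` are written out for the double integrator; `dFk` is a hypothetical workspace gradient used purely for illustration.

```python
import numpy as np

# Double integrator state [px, py, vx, vy]; the workspace map g keeps positions.
def g(x):
    return x[:2]

# Gradient of g: a constant 4 x 2 matrix (identity over the position
# variables, zeros over the velocity variables).
grad_g = np.vstack([np.eye(2), np.zeros((2, 2))])

# A hypothetical basis-function gradient in the 2D workspace.
dFk = np.array([0.3, -0.7])

# Chain rule: the gradient w.r.t. the 4D agent state is the workspace
# gradient padded with zeros for the velocity variables.
dFk_agent = grad_g @ dFk
```

The result is the workspace gradient with two trailing zeros, matching the dimension of the agent state.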

Shifting rectangular domains
Recall that in Section 2 the rectangular domain was assumed to have its lower corner at the origin. If the rectangular domain is instead shifted, the basis functions must be evaluated with the states shifted back towards the origin, and the boundary penalty and its gradients change in a similar way.

Integration
So far we have only integrated our dynamics with the simple forward Euler method. This integration scheme is explicit -only current state information is used to generate the state at the next time step. Using a large step size Δt can lead to numerical issues when integrating with the explicit forward Euler method because it does not conserve energy; each step in the integration can artificially add energy to the system, leading to unstable trajectories [10].
To understand how energy is added, consider a mass attached to a spring. Initially, the mass starts with zero velocity and some displacement from the spring equilibrium. If generated with the explicit Euler method, the next state will have the same displacement (the initial state had no velocity) but non-zero velocity, because the initial displacement induced a spring force. The potential energy is the same, but kinetic energy has been added because the velocity is no longer zero. Our integration has increased the total energy of the system, even though no external force acts on it. Likewise, suppose the mass started at the spring equilibrium but with non-zero velocity. Integration leads to the same velocity, as there is no displacement and hence no spring force in the initial state, but the position changes. Now potential energy has been added and the kinetic energy remains the same, again increasing the system's total energy. Integrating over many steps incurs further energy errors, leading to oscillations that grow in amplitude.
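The energy growth can be verified on the spring-mass system just described. In this Python sketch (unit mass and unit spring constant; the step size and horizon are our own choices), each forward Euler step multiplies the total energy by exactly 1 + Δt², so the oscillation grows without bound even though no external force acts on the mass.

```python
import numpy as np

# Unit mass on a unit spring: p' = v, v' = -p.  Total energy is (p^2 + v^2)/2.
dt, steps = 0.1, 100
p, v = 1.0, 0.0
energies = [(p**2 + v**2) / 2]
for _ in range(steps):
    # Forward Euler: both updates use only the old state.
    p, v = p + dt * v, v - dt * p
    energies.append((p**2 + v**2) / 2)
# Algebraically, p'^2 + v'^2 = (1 + dt^2)(p^2 + v^2), so the energy
# grows by a factor (1 + dt^2) every step.
```

With Δt = 0.1, the energy grows by roughly a factor of 2.7 over 100 steps, illustrating why large step sizes cause unstable trajectories.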
If the system is linear, we can avoid these integration errors by performing an exact integration. This method assumes the control is constant over the step (which we already assume) and allows us to change the continuous-time matrices A and B into discrete forms Ã and B̃ [21]. The exact integration requires the matrix exponential:

Ã = e^{AΔt},  B̃ = (∫₀^{Δt} e^{Aτ} dτ) B.

To use exact integration, we would use these Ã_n and B̃_n matrices in our LQ constraints. The projection operator is unnecessary as the dynamics are linear. Because this method is an exact integration over the time step, it does not produce integration errors. Unfortunately, it only works for linear dynamics.
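As a sketch of this exact (zero-order-hold) discretisation, the truncated series below approximates e^{AΔt} and its integral; for the 1D double integrator, A² = 0, so the series is exact after two terms. The function name and truncation length are our own choices, not from any particular library.

```python
import numpy as np
from math import factorial

def discretise(A, B, dt, terms=20):
    """Zero-order-hold discretisation via a truncated matrix-exponential series.

    A_tilde = e^{A dt},  B_tilde = (integral_0^dt e^{A tau} dtau) B,
    using e^{A dt} = sum_k (A dt)^k / k!  and its term-by-term integral.
    """
    n = A.shape[0]
    A_tilde = np.zeros((n, n))
    integral = np.zeros((n, n))
    Ak = np.eye(n)
    for k in range(terms):
        A_tilde += Ak * dt**k / factorial(k)
        integral += Ak * dt**(k + 1) / factorial(k + 1)
        Ak = Ak @ A
    return A_tilde, integral @ B

# 1D double integrator: position and velocity driven by an acceleration input.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Ad, Bd = discretise(A, B, dt=0.5)
# Closed form for this system: Ad = [[1, dt], [0, 1]], Bd = [[dt^2 / 2], [dt]].
```

In practice a library matrix exponential can be used instead of the series; the sketch keeps the computation self-contained.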
If the agent has nonlinear dynamics, symplectic integration can be used, which is more accurate than the forward Euler [10,22]. The symplectic Euler is particularly relevant in systems where we can partition the state into a position p and a velocity ν [22]. Suppose we have equations that describe how the position and velocity change:

ṗ = f_p(p, ν, u),  ν̇ = f_ν(p, ν, u).

In the forward Euler, we update the velocity and position explicitly; we only use current state information to generate the next state. In the symplectic Euler, we update the velocity implicitly and the position explicitly. That is, the velocity at the next step also depends on the velocity at that next step, as demonstrated below:

p_{n+1} = p_n + Δt f_p(p_n, ν_{n+1}, u_n),
ν_{n+1} = ν_n + Δt f_ν(p_n, ν_{n+1}, u_n).
The symplectic Euler is also called the semi-implicit Euler because only the velocity is solved for implicitly.
To use symplectic integration, we replace the forward Euler with the symplectic Euler in the projection operator's state update shown in (42). We also need a linearised version of (51) for the LQ and LQR constraints. This can be done by splitting the linearised dynamics into block form:

[δṗ; δν̇] = [A₁₁ A₁₂; A₂₁ A₂₂] [δp; δν] + [B₁; B₂] δu.

These blocks form the function f_δν:

f_δν(δp, δν, δu) = A₂₁ δp + A₂₂ δν + B₂ δu.

We can then apply the semi-implicit update from (51). Recall that we solve for δν_{n+1} implicitly, so we end up with δν_{n+1} on both sides of the equation:

δν_{n+1} = δν_n + Δt (A₂₁ δp_n + A₂₂ δν_{n+1} + B₂ δu_n).
We can solve for δν_{n+1}:

δν_{n+1} = [(I − Δt A₂₂)⁻¹ Δt A₂₁] δp_n + [(I − Δt A₂₂)⁻¹] δν_n + [(I − Δt A₂₂)⁻¹ Δt B₂] δu_n.

We can define the bracketed parts as blocks in new matrices Ã_n and B̃_n that describe the discrete update:

δν_{n+1} = Ã₂₁ δp_n + Ã₂₂ δν_n + B̃₂ δu_n.
We can perform the same procedure with the position update, substituting for δν_{n+1} and deriving new block matrices. The explicit position update is

δp_{n+1} = δp_n + Δt (A₁₁ δp_n + A₁₂ δν_{n+1} + B₁ δu_n).

The resulting update is

δp_{n+1} = Ã₁₁ δp_n + Ã₁₂ δν_n + B̃₁ δu_n,

where the block matrices are as follows:

Ã₁₁ = I + Δt A₁₁ + Δt A₁₂ Ã₂₁,  Ã₁₂ = Δt A₁₂ Ã₂₂,  B̃₁ = Δt B₁ + Δt A₁₂ B̃₂.

At time step n, we can take the continuous matrices A_n and B_n and use these equations to create Ã_n and B̃_n, which are then used in the LQ and LQR constraints.

Prabhakar et al. [10] presented a good example demonstrating the importance of symplectic integration. The trajectory is 50 steps long with a step size of Δt = 0.6 s (for a time horizon of 30 s). The system is an undamped oscillator. The agent starts from rest at (0.0, −0.49) and the trajectory is initialised using a constant u_n = [10⁻⁴, 10⁻⁴]^⊤. We generate trajectories using explicit, exact, and symplectic integration, terminating the optimisation with a directional derivative criterion of 1 × 10⁻⁴. Fig. 5 shows the resulting trajectories. Explicit integration artificially adds energy to the system. Fighting this extra energy is difficult, so the resulting trajectory deviates wildly from the domain, leveraging the periodicity of the ergodic metric. It is tempting to simply impose a boundary cost and stick with explicit Euler integration. This keeps the trajectory in the domain, as Fig. 5 shows, but the system's energy is still artificially increased. Table 1 shows the control and computational costs for each trajectory. When using explicit integration and a boundary penalty, the control cost is much higher because the control inputs fight against the artificially introduced kinetic energy. Further, solving takes much longer because the solver must balance the competing interests of artificial energy and bounding the domain.
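The block-matrix construction can be sanity-checked on a linear system, where the semi-implicit update can also be applied directly. This Python sketch (our own function names, with a spring-mass test system) builds the discrete matrices from the continuous blocks and confirms both routes agree.

```python
import numpy as np

def symplectic_blocks(A11, A12, A21, A22, B1, B2, dt):
    """Discrete block matrices for the linearised semi-implicit Euler update.

    Solves dv_{n+1} = dv_n + dt (A21 dp_n + A22 dv_{n+1} + B2 du_n) for
    dv_{n+1}, then substitutes the result into the explicit position update.
    """
    n = A22.shape[0]
    inv = np.linalg.inv(np.eye(n) - dt * A22)
    At21 = inv @ (dt * A21)
    At22 = inv
    Bt2 = inv @ (dt * B2)
    At11 = np.eye(A11.shape[0]) + dt * A11 + dt * A12 @ At21
    At12 = dt * A12 @ At22
    Bt1 = dt * B1 + dt * A12 @ Bt2
    A_tilde = np.block([[At11, At12], [At21, At22]])
    B_tilde = np.vstack([Bt1, Bt2])
    return A_tilde, B_tilde

# Spring-mass test system: dp' = dv, dv' = -dp + du.
A11, A12 = np.zeros((1, 1)), np.eye(1)
A21, A22 = -np.eye(1), np.zeros((1, 1))
B1, B2 = np.zeros((1, 1)), np.eye(1)
dt = 0.1
A_tilde, B_tilde = symplectic_blocks(A11, A12, A21, A22, B1, B2, dt)

# Cross-check against the semi-implicit update applied directly.
dp, dv, du = 0.3, -0.2, 0.5
dv_next = dv + dt * (-dp + du)      # A22 = 0, so no implicit solve is needed
dp_next = dp + dt * dv_next         # position update uses the new velocity
direct = np.array([dp_next, dv_next])
blocked = A_tilde @ np.array([dp, dv]) + (B_tilde @ np.array([du])).ravel()
```

Both routes produce the same next perturbation, confirming the block derivation.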
It is important to note that symplectic integration is not always needed, but it becomes important as step size increases or in systems with dynamics sensitive to artificial energy increases.

Time-evolving spatial distributions
Suppose a spatial distribution evolves with time in a known manner. The spatiotemporal distribution, denoted ϕ(x, t), yields a spatial density ϕ(x) at a given time t. We can generate trajectories that are ergodic with respect to ϕ(x, t) by extending ergodicity to time-evolving distributions. To do so, we simply add time as a variable in the workspace [3]. For a rectangular domain in ℝˢ, the resulting basis function is

F_k(x, t) = (1/h_k) cos(k_{s+1} π t / T) ∏_{i=1}^{s} cos(k_i π x_i / L_i),

where k is of length s + 1, with the last element corresponding to the newly added time dimension. The distribution decomposition is similar to the time-invariant version shown in (20). Because the trajectory is discrete, one spatial density is needed per time step. The decomposition is normalised by the number of spatial densities so the spatiotemporal distribution is normalised, leading to

ϕ_k = (1/(N + 1)) Σ_{n=0}^{N} Σ_{x_c ∈ X_cells} ϕ(x_c, nΔt) F_k(x_c, nΔt) V_c.

As noted by Miller [3], the optimisation is unchanged beyond the computation of the ergodic metric and gradient. Fig. 6 shows an ergodic trajectory for a single integrator and a time-evolving distribution.
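As an illustration, the sketch below computes one spatiotemporal coefficient by summing a Riemann-sum spatial decomposition over the discrete time slices and normalising by N + 1. It is a Python sketch with our own names and a hypothetical drifting Gaussian; the h_k normalisation factor is omitted.

```python
import numpy as np

def spatiotemporal_coeff(phi, k, T, N, cells=50):
    """Fourier coefficient of a time-evolving density phi(x, y, t) on [0, 1]^2.

    k = (k1, k2, kt): the last index is the added time dimension.  One
    spatial density is used per discrete time step, and the sum is
    normalised by the number of slices, N + 1.
    """
    k1, k2, kt = k
    centres = (np.arange(cells) + 0.5) / cells
    X, Y = np.meshgrid(centres, centres, indexing="ij")
    Vc = 1.0 / cells**2
    total = 0.0
    for n in range(N + 1):
        t = n * T / N
        density = phi(X, Y, t)
        density = density / (density.sum() * Vc)   # each slice integrates to 1
        Fk = (np.cos(k1 * np.pi * X) * np.cos(k2 * np.pi * Y)
              * np.cos(kt * np.pi * t / T))
        total += np.sum(density * Fk) * Vc
    return total / (N + 1)

# A Gaussian whose centre drifts across the domain over time.
drift = lambda X, Y, t: np.exp(-((X - 0.2 - 0.6 * t)**2 + (Y - 0.5)**2) / 0.02)
c000 = spatiotemporal_coeff(drift, (0, 0, 0), T=1.0, N=10)
```

With each slice normalised to integrate to one, the k = (0, 0, 0) coefficient equals one, mirroring the time-invariant case.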

Multi-agent trajectories
Suppose we have a multi-agent system consisting of N_a agents and we wish their combined motion to be ergodic with respect to some spatial distribution. We denote the set of individual agent trajectories {x} and use x_n^j to refer to the state of agent j at time step n. The multi-agent system's trajectory coefficients are an average of the individual trajectory coefficients:

c_k({x}) = (1/N_a) Σ_{j=1}^{N_a} c_k(x^j).

The ergodic gradient with respect to each agent's states likewise picks up a factor of 1/N_a. Otherwise, the optimisation is unchanged. The LQ and LQR problems can be solved with respect to the overall system state, which is just a concatenation of the individual agent states. The gradients at each time step can also be concatenated. The constraint and reward matrices are block diagonal matrices constructed from the individual agent matrices. The multi-agent system can be heterogeneous, with each agent having a different dynamic model. An example multi-agent trajectory is shown in Fig. 7. The multi-agent system consists of two single integrator agents, one starting from (0.5, 0.02) and the other from (0.2, 0.98). The resulting trajectory is efficient: each agent covers the distribution mode closest to its starting point, resulting in an ergodic multi-agent trajectory with low control cost.
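Averaging the coefficients is straightforward to implement. The Python sketch below (our own helper names, unnormalised cosine basis on the unit square) computes a multi-agent coefficient from individual trajectory coefficients.

```python
import numpy as np

def traj_coeff(x, k):
    """Trajectory Fourier coefficient c_k for one agent on the unit square.

    x is an (N+1, 2) array of states; the h_k normalisation is omitted.
    """
    k1, k2 = k
    F = np.cos(k1 * np.pi * x[:, 0]) * np.cos(k2 * np.pi * x[:, 1])
    return F.mean()

def multi_agent_coeff(trajs, k):
    # The multi-agent coefficient is the average of the individual ones.
    return np.mean([traj_coeff(x, k) for x in trajs])

rng = np.random.default_rng(0)
trajs = [rng.random((100, 2)) for _ in range(3)]   # three random 2D trajectories
c = multi_agent_coeff(trajs, (1, 2))
```

Because each agent contributes equally, heterogeneous dynamics pose no problem: only the states entering `traj_coeff` matter.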

Spectral multiscale coverage
In addition to presenting a commonly used ergodic metric, Mathew and Mezić [12] derived a simple algorithm for designing ergodic trajectories, called spectral multiscale coverage (SMC). SMC consists of simple feedback laws that are only valid for multi-agent systems of single or double integrators. Most importantly, SMC has low computational cost and generates trajectories quickly, so we briefly cover it here and compare it to projection-based optimisation.
The SMC feedback laws were designed to maximise the rate of decay of the ergodic metric at the end of an infinitesimal time horizon. For single integrator systems, the feedback law yields a control input u_j(t) for agent j at time t:

u_j(t) = −u_max B_j(t) / ‖B_j(t)‖₂,

where u_max is the maximum agent speed and

B_j(t) = t N_a Σ_k Λ_k (c_k(t) − ϕ_k) ∇F_k(x_j(t)),

where N_a is the number of agents in the system and c_k(t) is the decomposition of the trajectory executed so far. Denoting the current time step n_c, so that t = n_c Δt, the decomposition is

c_k(t) = (1/(N_a t)) Σ_{j=1}^{N_a} Σ_{n=0}^{n_c} F_k(x_n^j) Δt.

In the double integrator case, Mathew and Mezić [12] decompose agent j's state into the agent position p_j(t) and velocity ν_j(t), leading to a similar feedback law in which F_max, the maximum force that can be applied, replaces u_max, and a parameter c_ν penalises kinetic energy; when c_ν is high, the agent moves more slowly. SMC is fast and can easily be computed in real time. However, it suffers from two major drawbacks. First, it only works for two dynamics models: single and double integrators. Second, the resulting trajectories typically require far more control effort than those from projection-based optimisation. This inefficiency arises from the greedy nature of the algorithm and its lack of a control penalty. To see the effects of SMC's greed, consider an agent starting at one mode of a bimodal distribution, where each mode has equal density. Because the agent has already spent some time at the first mode, it will move to the second mode to balance the time spent at the modes, thereby reducing the ergodic metric. However, once the agent has spent more time at the second mode than it had at the first, it will return to the first mode to rebalance. A more efficient approach would spend half of the time at the first mode and then move to the second for the remaining time; however, this approach takes the entire trajectory into account, not just the next step.
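To make the single integrator law concrete, the following Python sketch runs the SMC feedback loop for one agent on the unit square with a uniform target density. The weights Λ_k = (1 + ‖k‖²)^{-3/2} follow the s = 2 case of Mathew and Mezić [12]; the basis normalisation h_k is omitted, and the step size, horizon, and starting point are our own choices.

```python
import numpy as np

K, u_max, dt, steps = 5, 0.3, 0.01, 200
ks = [(k1, k2) for k1 in range(K + 1) for k2 in range(K + 1)]
Lam = {k: (1 + k[0]**2 + k[1]**2) ** -1.5 for k in ks}   # weights for s = 2
phi_k = {k: (1.0 if k == (0, 0) else 0.0) for k in ks}   # uniform density

x = np.array([0.2, 0.7])
Fsum = {k: 0.0 for k in ks}   # running sums of F_k along the trajectory
traj = [x.copy()]
for n in range(1, steps + 1):
    for k in ks:
        Fsum[k] += np.cos(k[0] * np.pi * x[0]) * np.cos(k[1] * np.pi * x[1])
    B = np.zeros(2)
    for (k1, k2) in ks:
        ck = Fsum[(k1, k2)] / n   # running time-average coefficient c_k(t)
        dF = np.array([
            -k1 * np.pi * np.sin(k1 * np.pi * x[0]) * np.cos(k2 * np.pi * x[1]),
            -k2 * np.pi * np.cos(k1 * np.pi * x[0]) * np.sin(k2 * np.pi * x[1]),
        ])
        B += Lam[(k1, k2)] * (ck - phi_k[(k1, k2)]) * dF
    u = -u_max * B / max(np.linalg.norm(B), 1e-12)   # saturated feedback law
    x = x + dt * u
    traj.append(x.copy())
traj = np.array(traj)
```

Each loop iteration is a handful of trigonometric evaluations, which is why SMC runs so much faster than iterative trajectory optimisation.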
Mathew and Mezić [12] call this effect multiscale -the agent first covers the distribution coarsely and then starts to cover it more finely.
There are applications where this multiscale effect is desirable, such as uniform coverage problems, where the goal is to evenly sample points in some domain [12]. Given enough time, an agent following an SMC trajectory will visit all states with non-zero density. In contrast, a projection-based trajectory is generated with a specific horizon in mind and balances ergodicity with control effort, leading to trajectories that cover a space more sparsely. This feature helps explain why robotic painting is better solved by SMC [19]. Replicating an image requires covering some areas very densely, even at a high control cost, and it is hard to coax the projection-based method to this behaviour [19]. Further, many discrete points are needed to faithfully recreate an image, adding to the already high computational cost of projection-based optimisation.
ErgodicControl.jl also generates SMC trajectories, and Fig. 8 shows an example of the multiscale effect. Trajectories are generated for an agent with single integrator dynamics and 400 time steps. When using SMC, the agent oscillates between the distribution's two modes, which is an inefficient way to cover them. However, the entire trajectory is computed in only 0.001 s, whereas the projection-based trajectory reached a directional derivative criterion of 1 × 10⁻⁴ in 0.48 s.

Conclusion
Projection-based trajectory optimisation is not the only technique for generating ergodic trajectories. However, it has been the focus of much more research and its various improvements are spread across several publications. This paper is our attempt to condense this work and cast light on implementation details that could fluster new users. We also present a software package, ErgodicControl.jl, which generates ergodic trajectories with projection-based optimisation faster than previously reported. Our goal is to automate ergodic trajectory generation so it can be applied to new applications without having to 'reinvent the wheel'. The package can be found at https://github.com/dressel/ErgodicControl.jl.
Future work will leverage our software package to evaluate ergodic control in a number of different applications. We will also add functionality to the package, including new ergodic trajectory generation methods such as a sampling-based motion planner [7] and a potential field-based method [6].

Acknowledgments
This work was funded by National Science Foundation grant DGE-114747.