Finite-Horizon, Energy-Optimal Trajectories in Unsteady Flows

Intelligent mobile sensors, such as uninhabited aerial or underwater vehicles, are becoming prevalent in environmental sensing and monitoring applications. These active sensing platforms operate in unsteady fluid flows, including windy urban environments, hurricanes, and ocean currents. Often constrained in their actuation capabilities, the dynamics of these mobile sensors depend strongly on the background flow, making their deployment and control particularly challenging. Therefore, efficient trajectory planning with partial knowledge about the background flow is essential for teams of mobile sensors to adaptively sense and monitor their environments. In this work, we investigate the use of finite-horizon model predictive control (MPC) for the energy-efficient trajectory planning of an active mobile sensor in an unsteady fluid flow field. We uncover connections between the finite-time optimal trajectories and finite-time Lyapunov exponents (FTLE) of the background flow, confirming that energy-efficient trajectories exploit invariant coherent structures in the flow. We demonstrate our findings on the unsteady double gyre vector field, which is a canonical model for chaotic mixing in the ocean. We present an exhaustive search through critical MPC parameters including the prediction horizon, maximum sensor actuation, and relative penalty on the accumulated state error and actuation effort. We find that even relatively short prediction horizons can often yield nearly energy-optimal trajectories. These results are promising for the adaptive planning of energy-efficient trajectories for swarms of mobile sensors in distributed sensing and monitoring.


Introduction
The ability to generate energy-efficient trajectories that take advantage of the inherent motions of a background flow field has significant implications for monitoring large bodies of water with intelligent mobile sensors [1][2][3], furthering our understanding of the climate and natural ecosystems [4][5][6]. Developments in this area also present economic opportunities for cost reduction in industries that rely heavily on maritime transport and shipping. Self-powered mobile sensors typically have complex performance tradeoffs, limiting size, weight, and power (SWAP). Further, most mobile sensors will only have partial and imperfect information about the ambient flow field, resulting in a finite-horizon predictive window in which to make decisions about their trajectories. Improving the generation of energy-efficient trajectories that intelligently leverage the flow field to go with the flow may have significant benefits in extending the duration and reach of these mobile sensing platforms. This work provides an extensive analysis of trajectories generated through a finite-horizon model predictive control (MPC) optimization of a mobile sensor in a time-varying background flow across a wide range of system parameters. Further, we relate the control performance and efficiency to the alignment of these trajectories with coherent structures in the background flow.
An extensive literature has investigated various algorithms for trajectory generation in such transport applications. For example, graph search algorithms and stochastic optimization have been investigated for path planning [7][8][9]. Assimilating in-situ observations obtained by mobile sensors in an adaptive fashion into ocean models has also been explored, for example with mixed integer programming algorithms [10,11]. Coordinated control of ocean gliders for adaptive ocean sensing has been exhaustively studied in Monterey Bay [12][13][14][15]. Algorithms inspired by computational fluid dynamics have also been used to explore coordinated control of swarms in flow fields [16][17][18][19][20]. However, there has been relatively little work in developing a deep understanding of the connection between the dynamics of the flow field and the nature of the optimal trajectories within the flow fields, with a few notable exceptions [21][22][23][24].
A key challenge in exploring this connection is the complexity of fluid flow fields, which typically involve the existence of multiple scales in space and time.
To understand the complexity of fluids, techniques from dynamical systems are often employed. Lagrangian coherent structures (LCS) have emerged as a robust and principled approach to uncover invariant manifolds that mediate the transport of material in unsteady fluid flows [25][26][27][28][29][30][31]. Specifically, LCS define the transport barriers in a flow field that attract or repel passive drifters. There has been considerable work in the development of algorithms to accurately and efficiently compute these structures from data [27,30][32][33][34][35][36][37][38][39]. The finite-time Lyapunov exponent (FTLE) is a scalar value that characterizes the divergence from a trajectory over a finite time interval, and is often used to compute LCS. The FTLE method has been successfully applied to domains of bio-propulsion [40], medicine [41,42], the spread of microbes [43], and the study of aerodynamics [44,45].
The ideas from both trajectory generation and the theory of LCS have been related in the past [21][22][23][24]. A predecessor of this idea was the planning of space missions using invariant manifolds [46]. In the context of ocean transport, Inanc, Shadden, and Marsden [21] showed that the optimal trajectories of autonomous agents generated using a receding-horizon optimal control algorithm overlap with Lagrangian coherent structures. Moreover, Senatore and Ross [23] exploited this idea further to generate energy optimal paths by controlling the agents to track the background LCS. Recent papers have further explored the connections between optimal control and LCS [47][48][49] in the context of path planning in the ocean. However, there is still a need to better understand how the prediction horizon and relative cost of actuation in the autonomous agent optimization relate to the use of coherent structures in the unsteady background flow.
In this work, we investigate the explicit connections between finite-horizon energy-optimal trajectories of a mobile sensor and the underlying background flow dynamics. We specifically analyze how key parameters of the MPC-based optimization affect how the resulting autonomous agent trajectory utilizes unsteady fluid coherent structures for energy-efficient transport. This analysis is performed on the double gyre flow field, which is a testbed to understand mixing and transport in the ocean. A summary of our methodology is shown in Figure 1. The choice of MPC is particularly relevant in this work, as both the FTLE and MPC rely on finite-time horizons in their computations. To explore this connection, we perform an exhaustive search through several of the trajectory optimization parameters that are important to practitioners, including the prediction time horizon and step size for MPC, the relative cost of actuation versus state tracking error, and the maximum agent velocity. We find that there are strong correlations between the presence of background FTLE ridges and the actuation energy expenditure at the corresponding locations along the trajectory.
Figure 1: Overview of the proposed methodology for analyzing the connections between finite-horizon energy-optimal trajectories and the FTLE field. A self-propelling agent is controlled to transit from a starting location to a goal location through a finite-time horizon energy-optimal trajectory in a time-varying double gyre flow field. The resulting agent trajectory, along with the finite-horizon predicted trajectories at each time step, are shown and color-coded based on instantaneous energy expenditure (top left). The trajectory history (solid) and the future forecast trajectory bundle (dashed) at an example time instant are shown (top right); the instantaneous FTLE ridges are also shown below these, with blue indicating the repelling LCS and red indicating the attracting LCS. As can be observed from the snapshots taken at four particular times (bottom), the energy expenditure along the planned trajectory (given by the color of the dashed line) and the shape of the finite-horizon trajectory depend on the evolution of the local FTLE ridges.
The remainder of this work is organized as follows. In Section 2, the core methodology of MPC and FTLE are discussed. MPC will be the primary optimization algorithm used to generate trajectories, and these will be analyzed using FTLE fields. Section 3 describes models for the mobile sensor dynamics and actuation, along with the dynamics of the unsteady double gyre background flow field. The main results are presented in Section 4, including in-depth analysis of trajectories generated across a wide range of system parameters. In particular, the time horizon of the MPC optimization, the relative cost of actuation versus state tracking error, and the frequency of the background flow oscillation are all investigated. Section 5 provides a summary of results and a discussion of limitations with suggestions for future work. Appendix A also provides additional plots and analysis of the data that was not presented in the main text.

Methodology
In this section, we introduce two approaches for analyzing and generating optimal trajectories for a mobile sensor in an unsteady background flow: FTLE fields and MPC. First, we introduce the computation of FTLE fields [25,27,30] for passive tracer particles to extract Lagrangian coherent structures from a time-varying flow field. This method is particularly important to characterize the uncontrolled behaviour of drifters in terms of finite-time attraction and repulsion behaviours. Next, we introduce the preliminaries of finite-horizon MPC, which is an online control optimization algorithm that optimizes a cost function defined over a finite-time prediction horizon. We will use MPC for trajectory optimization of a mobile sensor in an unsteady background flow. MPC is a natural choice, since the mobile sensor will have limited actuation authority, and information about the flow field will only be approximate as it is limited to a finite-time horizon.

Finite-Time Lyapunov Exponents
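Given a vector field $v(x(t), t) : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^n$, the dynamics of a passive drifter are given by
$$\frac{d}{dt} x(t) = v(x(t), t). \tag{1}$$
Here, $t \in \mathbb{R}$ represents time, and $x(t) \in \mathbb{R}^n$ is the position of the drifter, where typically $n = 2$ or $3$, depending on the dimension of space. The FTLE field can be used to determine the LCS of an unsteady vector field [25,27,30]. The LCS are curves or surfaces in the domain that strongly attract or repel nearby trajectories $x(t)$, making them time-varying analogues of stable and unstable invariant manifolds in dynamical systems theory [50]. The FTLE algorithm is as follows. First, a grid of drifters is initialized at time $t_0$ and numerically integrated through the flow field $v(x(t), t)$ for a fixed amount of time (i.e., the time horizon) $T \in \mathbb{R}$, resulting in a flow map $\Phi_{t_0}^{t_0+T} : \mathbb{R}^n \to \mathbb{R}^n$:
$$\Phi_{t_0}^{t_0+T} : x(t_0) \mapsto x(t_0 + T) = x(t_0) + \int_{t_0}^{t_0+T} v(x(\tau), \tau)\, d\tau. \tag{2}$$
The flow map operator $\Phi_{t_0}^{t_0+T}$ takes each drifter at an initial condition $x(t_0)$ and returns its new position $x(t_0 + T)$ after it is advected through the vector field for a time $T$. Next, the Jacobian matrix of partial derivatives of the flow map, $D\Phi_{t_0}^{t_0+T}$, is computed using finite differences for each drifter in the grid, represented by the coordinates $i, j \in \mathbb{Z}^+$, such that
$$\left(D\Phi_{t_0}^{t_0+T}\right)_{i,j} = \begin{bmatrix} \dfrac{x_{i+1,j}(t_0+T) - x_{i-1,j}(t_0+T)}{x_{i+1,j}(t_0) - x_{i-1,j}(t_0)} & \dfrac{x_{i,j+1}(t_0+T) - x_{i,j-1}(t_0+T)}{y_{i,j+1}(t_0) - y_{i,j-1}(t_0)} \\ \dfrac{y_{i+1,j}(t_0+T) - y_{i-1,j}(t_0+T)}{x_{i+1,j}(t_0) - x_{i-1,j}(t_0)} & \dfrac{y_{i,j+1}(t_0+T) - y_{i,j-1}(t_0+T)}{y_{i,j+1}(t_0) - y_{i,j-1}(t_0)} \end{bmatrix}, \tag{3}$$
where $x, y \in \mathbb{R}$ are the horizontal and vertical components of the position vector $x(t)$. This flow map Jacobian is used to compute the Cauchy-Green deformation tensor, given by
$$\Delta_{i,j} = \left(D\Phi_{t_0}^{t_0+T}\right)_{i,j}^{*} \left(D\Phi_{t_0}^{t_0+T}\right)_{i,j}, \tag{4}$$
where $*$ represents the matrix transpose, not to be confused with the duration of integration $T$. Finally, the largest eigenvalue $\lambda_{\max}$ of $\Delta_{i,j}$ for each drifter $i, j$ is used to compute the FTLE field:
$$\sigma_{i,j} = \frac{1}{|T|} \ln \sqrt{\lambda_{\max}(\Delta_{i,j})}. \tag{5}$$
Alternatively, $\sigma_{i,j}$ can be computed from the largest singular value of $D\Phi_{t_0}^{t_0+T}$, obtained from its singular value decomposition (SVD), since this singular value equals $\sqrt{\lambda_{\max}(\Delta_{i,j})}$. It is important to note that for unsteady flow fields, the FTLE field will also vary in time, so that at each new time step a new grid of particles must be reinitialized and advected through the flow.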
This procedure is typically quite expensive, although there are algorithms to eliminate redundant calculations [32,33].
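The steps above can be sketched in a few lines of dependency-free Python. This is only an illustrative sketch under stated assumptions, not the implementation used in this work: the function names `rk4_advect` and `ftle_field`, the fixed-step RK4 integrator, and the steady saddle test flow are all choices made here for illustration. The saddle flow $v = (x, -y)$ is convenient because its flow map is known in closed form, so the FTLE should equal 1 everywhere.

```python
import math

def rk4_advect(velocity, x, y, t0, T, steps=200):
    """Advect one drifter from t0 to t0 + T with fixed-step RK4 (T may be negative)."""
    h = T / steps
    t = t0
    for _ in range(steps):
        k1x, k1y = velocity(x, y, t)
        k2x, k2y = velocity(x + 0.5 * h * k1x, y + 0.5 * h * k1y, t + 0.5 * h)
        k3x, k3y = velocity(x + 0.5 * h * k2x, y + 0.5 * h * k2y, t + 0.5 * h)
        k4x, k4y = velocity(x + h * k3x, y + h * k3y, t + h)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        t += h
    return x, y

def ftle_field(velocity, xs, ys, t0, T):
    """Return a dict {(i, j): FTLE} for the interior nodes of the grid xs x ys."""
    # Flow map: final position of every drifter on the grid.
    phi = [[rk4_advect(velocity, x, y, t0, T) for y in ys] for x in xs]
    sigma = {}
    for i in range(1, len(xs) - 1):
        for j in range(1, len(ys) - 1):
            dx = xs[i + 1] - xs[i - 1]
            dy = ys[j + 1] - ys[j - 1]
            # Central-difference flow map Jacobian [[a, b], [c, d]].
            a = (phi[i + 1][j][0] - phi[i - 1][j][0]) / dx
            b = (phi[i][j + 1][0] - phi[i][j - 1][0]) / dy
            c = (phi[i + 1][j][1] - phi[i - 1][j][1]) / dx
            d = (phi[i][j + 1][1] - phi[i][j - 1][1]) / dy
            # Cauchy-Green tensor (Jacobian transpose times Jacobian): [[p, q], [q, r]].
            p = a * a + c * c
            q = a * b + c * d
            r = b * b + d * d
            # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
            lam_max = 0.5 * (p + r) + math.sqrt(0.25 * (p - r) ** 2 + q * q)
            sigma[(i, j)] = math.log(math.sqrt(lam_max)) / abs(T)
    return sigma

def saddle(x, y, t):
    # Hypothetical steady test flow, used here only to check the sketch.
    return x, -y

grid = [-0.5 + 0.25 * k for k in range(5)]
sigma = ftle_field(saddle, grid, grid, t0=0.0, T=1.0)
```

Passing a negative `T` integrates the drifters backward in time. For an unsteady flow, `ftle_field` would be called again at each new `t0`, which is where the cost noted above comes from.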
Lagrangian coherent structures are often computed as ridges of the FTLE field, which requires an additional step of computing the Hessian of $\sigma_{i,j}$ for ridge extraction. FTLE based on drifters integrated forward in time, $T > 0$, results in coherent structures that repel drifters. Similarly, FTLE based on drifters integrated backward in time, $T < 0$, results in coherent structures that attract drifters. These can be seen in Figure 1 as red and blue curves, where the red curves are attracting and the blue are repelling. FTLE fields and the resulting LCS are related to almost invariant sets from statistical dynamical systems [51][52][53][54]. In particular, LCS act as separatrices in the flow, segmenting different regions where passive tracers remain trapped [55]. FTLE and LCS have also been used extensively to analyze ocean flows [56][57][58], for example to model the spread of pollution [59]. More broadly, FTLE has been used to study coherent structures and mixing in a wide range of other flows [60][61][62][63][64][65]. In this work, we will use FTLE fields generated from passive particles to investigate the trajectories of active mobile sensors, to understand how and when these sensors exploit structures in the flow field for energy-efficient transport.

Model Predictive Control
The dynamics of mobile sensors operating in real environments are often strongly nonlinear and subject to hardware constraints, time delays, non-minimum phase dynamics, instability, and restrictions on actuation capability. These limitations make the use of traditional linear control approaches challenging, motivating the powerful model predictive control optimization [66][67][68][69] described here. In this work, we use MPC to generate trajectories for a mobile sensor in an unsteady background flow and investigate how these trajectories vary with the optimization parameters.
In general, the dynamics of a nonlinear system with actuation $u \in \mathbb{R}^m$ can be written as
$$\frac{d}{dt} x(t) = g(x(t), u(t), t), \tag{6}$$
where $g$ is the controlled vector field. In the context of this paper, the state $x$ can be either the position of the agent, as in the previous section, or both the position and velocity of the agent. MPC is a powerful method for calculating the actuation $u$ by formulating an iterative optimization problem that minimizes a cost function over a finite-time horizon. The controller enacts this optimal actuation policy for a short time, often for a single time step, and then the optimization problem is solved again, initialized at the current state. In this way, MPC is quite robust to model uncertainty and disturbances, as the optimization is continuously being reinitialized as new information is available about how the system actually responds to the actuation. Computing over a finite-time horizon might also make MPC more flexible and faster than a global optimization technique, especially for chaotic systems, which may result in stiff long-time optimizations. These benefits make MPC more versatile, and more widely used, than other traditional trajectory generation algorithms. Finally, the FTLE and MPC computations are both performed over a finite time horizon, suggesting the potential for a connection between the outputs of the two algorithms.
Typically, the optimization cost for MPC can be formulated as
$$J = \int_{t}^{t+T_H} \left( e(\tau)^\top Q_1\, e(\tau) + u(\tau)^\top R\, u(\tau) \right) d\tau + e(t+T_H)^\top Q_2\, e(t+T_H), \tag{7}$$
subject to the system dynamics in (6) and control constraints imposed by physical limitations:
$$u_{\min} \le u(\tau) \le u_{\max}. \tag{8}$$
Here, $u_{\min}$ and $u_{\max}$ are the minimum and maximum values the components of $u$ can take, respectively. For example, the actuators may be unable to produce thrusts beyond a certain value. The state error is given by $e(t) = x(t) - x_{\text{goal}}$. The finite-time horizon over which we forecast our model for the optimization is $T_H \in \mathbb{R}^+$; this term is similar to $T$, the advection time used to calculate FTLE. $R \in \mathbb{R}^{m \times m}$ is a positive definite matrix that quantifies the penalty on actuation effort, and $Q_1 \in \mathbb{R}^{n \times n}$ and $Q_2 \in \mathbb{R}^{n \times n}$ are positive semi-definite matrices that quantify the penalty on deviations of the state from the goal throughout the trajectory and at the final time step, respectively. For computational purposes, (7) is often discretized. The sampling time step is $\Delta t$, the discretization of $d\tau$. It is possible to improve the computational speed and convergence of the algorithm with a warm start, which uses the trajectory computed in a previous instance as the initial guess for the trajectory in the next instance [70].
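To make the receding-horizon loop concrete, the following dependency-free Python sketch discretizes the cost with a left Riemann sum, rolls the dynamics out with forward Euler, and applies only the first planned control before re-planning with a warm start. Everything specific here is an assumption made for illustration: the scalar weights `q`, `r`, and `q_final` stand in for $Q_1$, $R$, and $Q_2$, the symmetric bound `u_max` stands in for $u_{\min}$ and $u_{\max}$, the projected gradient descent with finite-difference gradients is a simple stand-in for a proper nonlinear programming solver, and the uniform-current test system is hypothetical.

```python
def rollout(g, x0, t0, controls, dt):
    """Forward-Euler rollout of dx/dt = g(x, u, t); returns states x_0 .. x_N."""
    xs = [tuple(x0)]
    for k, u in enumerate(controls):
        x = xs[-1]
        dx = g(x, u, t0 + k * dt)
        xs.append((x[0] + dt * dx[0], x[1] + dt * dx[1]))
    return xs

def cost(g, x0, t0, controls, dt, goal, q, r, q_final):
    """Discretized cost: running state error and effort, plus a terminal error."""
    xs = rollout(g, x0, t0, controls, dt)
    total = 0.0
    for x, u in zip(xs[:-1], controls):
        e2 = (x[0] - goal[0]) ** 2 + (x[1] - goal[1]) ** 2
        total += (q * e2 + r * (u[0] ** 2 + u[1] ** 2)) * dt
    x_end = xs[-1]
    return total + q_final * ((x_end[0] - goal[0]) ** 2 + (x_end[1] - goal[1]) ** 2)

def optimize(g, x0, t0, guess, dt, goal, q, r, q_final, u_max, iters=30):
    """Projected gradient descent with finite-difference gradients and backtracking."""
    def clip(value):
        return max(-u_max, min(u_max, value))

    u = [list(c) for c in guess]
    best = cost(g, x0, t0, u, dt, goal, q, r, q_final)
    eps = 1e-6
    for _ in range(iters):
        grad = []
        for k in range(len(u)):
            row = []
            for d in range(2):
                u[k][d] += eps
                row.append((cost(g, x0, t0, u, dt, goal, q, r, q_final) - best) / eps)
                u[k][d] -= eps
            grad.append(row)
        step = 1.0
        while step > 1e-8:
            # Gradient step followed by projection onto the box constraints.
            trial = [[clip(u[k][d] - step * grad[k][d]) for d in range(2)]
                     for k in range(len(u))]
            trial_cost = cost(g, x0, t0, trial, dt, goal, q, r, q_final)
            if trial_cost < best:
                u, best = trial, trial_cost
                break
            step *= 0.5
        else:
            break  # no improving step was found
    return u

def mpc(g, x0, goal, horizon_steps, dt, n_steps, q=1.0, r=1.0, q_final=0.0, u_max=1.0):
    """Receding-horizon loop: plan over the horizon, apply the first control, repeat."""
    x, t = tuple(x0), 0.0
    guess = [[0.0, 0.0] for _ in range(horizon_steps)]
    path, applied = [x], []
    for _ in range(n_steps):
        plan = optimize(g, x, t, guess, dt, goal, q, r, q_final, u_max)
        dx = g(x, plan[0], t)
        x = (x[0] + dt * dx[0], x[1] + dt * dx[1])
        t += dt
        path.append(x)
        applied.append(tuple(plan[0]))
        guess = plan[1:] + [plan[-1]]  # warm start: shift the previous plan
    return path, applied

def drift_plus_control(x, u, t):
    # Hypothetical test system: uniform current of 0.1 in x plus the agent's velocity.
    return (0.1 + u[0], u[1])

start, goal = (1.0, 0.5), (0.0, 0.0)
path, applied = mpc(drift_plus_control, start, goal, horizon_steps=10, dt=0.1,
                    n_steps=40, q=1.0, r=0.1, u_max=0.5)
```

Here the prediction horizon is `horizon_steps * dt`, so lengthening the horizon at fixed `dt` increases the number of decision variables and the cost of every re-plan.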

Model Problem
We now discuss the models used to simulate the agent dynamics and the unsteady flow field the mobile sensor operates within. We also provide specific parameters that are used for all numerical experiments.

Sensor Dynamics
In a two-dimensional setting, a simple kinematic model for the dynamics of the mobile sensor is given by adding the velocity due to actuation, $u(t)$, to the background flow velocity $v(x(t), t)$:
$$\frac{d}{dt} x(t) = v(x(t), t) + u(t). \tag{9}$$
The state $x(t) = [x, y] \in \mathbb{R}^2$ is the position vector. The key assumption in this model is that, without control, the velocity of the sensor, $dx/dt$, matches the velocity of the background fluid flow. Thus, the uncontrolled mobile sensor can be considered as a passive Lagrangian drifter, and (9) reduces to (1) when $u(t) = 0$ at all times. Moreover, the model assumes that the sensor can generate its own relative velocity $u(t) = [u_x, u_y] \in \mathbb{R}^2$ in addition to the flow-induced velocity. It is possible to develop more sophisticated models for the mobile sensor dynamics that include inertial and rotational dynamics; in Zhang et al. [22], it was shown that trajectories based on such models also show strong correlation with the presence of background LCS.

Double Gyre Flow Field
We will investigate the motion of the mobile sensor above in the unsteady double gyre flow field described here. The double gyre flow is an analytically defined, periodic vector field that is often used to study mixing and coherent structures related to those found in geophysical circulations.
In particular, the double gyre represents a typical large-scale ocean circulation phenomenon often observed in the northern mid-latitude ocean basins. This circulation is quite dominant and is persistent, consisting of sub-polar and sub-tropical gyres. As a major type of ocean circulation, several main features of the double gyre phenomena have been identified through analyzing observational data and numerical simulations [71][72][73].
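The double gyre velocity field is derived from the stream function
$$\psi(x, y, t) = A \sin(\pi f(x, t)) \sin(\pi y),$$
where the time dependency is introduced by
$$f(x, t) = a(t)\, x^2 + b(t)\, x,$$
with time dependent coefficients
$$a(t) = \epsilon \sin(\omega t), \qquad b(t) = 1 - 2 \epsilon \sin(\omega t).$$
Here, $\epsilon$ dictates the magnitude of oscillation in the x-direction, $\omega$ is the angular oscillation frequency, and $A$ controls the velocity magnitude. Unless stated otherwise, the parameters used for the double gyre flow field are as in Shadden et al. [27], where $A = 0.1$, $\epsilon = 0.25$, and $\omega = 2\pi/10$. The resulting velocity field is given by
$$v(x, y, t) = \begin{bmatrix} -\dfrac{\partial \psi}{\partial y} \\ \dfrac{\partial \psi}{\partial x} \end{bmatrix} = \begin{bmatrix} -\pi A \sin(\pi f(x, t)) \cos(\pi y) \\ \pi A \cos(\pi f(x, t)) \sin(\pi y)\, \dfrac{\partial f}{\partial x} \end{bmatrix}.$$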
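As a quick sanity check on the signs, the standard double gyre of Shadden et al. [27] can be written out directly. The sketch below is illustrative only: the function names are chosen here, and the formulas are the usual textbook form of this flow with the parameter values quoted above. Comparing the velocity against finite differences of the stream function is a cheap way to catch a sign or chain-rule error.

```python
import math

A, EPS, OMEGA = 0.1, 0.25, 2 * math.pi / 10  # parameter values quoted in the text

def stream_function(x, y, t, A=A, eps=EPS, omega=OMEGA):
    """Double gyre stream function psi = A sin(pi f) sin(pi y)."""
    a = eps * math.sin(omega * t)
    b = 1.0 - 2.0 * eps * math.sin(omega * t)
    f = a * x * x + b * x
    return A * math.sin(math.pi * f) * math.sin(math.pi * y)

def double_gyre_velocity(x, y, t, A=A, eps=EPS, omega=OMEGA):
    """Velocity (vx, vy) = (-dpsi/dy, dpsi/dx) of the double gyre."""
    a = eps * math.sin(omega * t)
    b = 1.0 - 2.0 * eps * math.sin(omega * t)
    f = a * x * x + b * x
    dfdx = 2.0 * a * x + b
    vx = -math.pi * A * math.sin(math.pi * f) * math.cos(math.pi * y)
    vy = math.pi * A * math.cos(math.pi * f) * math.sin(math.pi * y) * dfdx
    return vx, vy

# Compare the analytic velocity with central differences of the stream function.
x, y, t, h = 0.7, 0.3, 2.0, 1e-6
vx, vy = double_gyre_velocity(x, y, t)
dpsi_dy = (stream_function(x, y + h, t) - stream_function(x, y - h, t)) / (2 * h)
dpsi_dx = (stream_function(x + h, y, t) - stream_function(x - h, y, t)) / (2 * h)
```

Because $f(2, t) = 2$ for all $t$, the horizontal velocity vanishes on the wall $x = 2$, so the domain $[0, 2] \times [0, 1]$ stays closed even though the interior separatrix oscillates.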

Specific Control Objective
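By combining the mobile sensor model and the double gyre flow field, the dynamics of the sensor are given by
$$\frac{d}{dt} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -\pi A \sin(\pi f(x, t)) \cos(\pi y) + u_x \\ \pi A \cos(\pi f(x, t)) \sin(\pi y)\, \dfrac{\partial f}{\partial x} + u_y \end{bmatrix},$$
with $f(x, t)$ the time-dependent term of the double gyre stream function. The objective is to move a mobile sensor from a starting location at coordinates $x_{\text{start}} = [2, 1]$ to a goal location at $x_{\text{goal}} = [0.5, 0.5]$. The cost function takes the form of (7), with the state-error and actuation penalties weighted by Q and R, respectively.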

Results
In this section, we examine energy-efficient trajectories for an active mobile sensor generated using MPC across a range of hyperparameters, including the prediction horizon, penalty weights on the state error and control effort, and the double gyre oscillation frequency. Our goal is to understand the sensitivity of the trajectory to parameters and to uncover performance tradeoffs, for example with the time horizon of optimization. We find a large sweet spot where effective, energy-efficient trajectories are generated. Further, we establish connections between the optimal mobile sensor trajectories and the Lagrangian coherent structures of the underlying flow field.

Trajectories with Different Relative Actuation Cost, R/Q
The ratio R/Q sets the relative cost of actuation, and varying this parameter is important to understand performance tradeoffs when the mobile sensor has a limited actuation budget. As R/Q is increased, corresponding to actuation being more expensive, the agent actuates less, and the state tracking error increases. This increase in state tracking error tends to correspond to larger steady-state limit cycles about the goal state. The weighted actuation cost $J_u = R\, u^\top u\, \Delta t$ increases with R/Q, as we fix Q = 1 and increase R; however, the unweighted actuation $u^\top u\, \Delta t$ decreases with R/Q. Importantly, the trend of cost versus R/Q is not strictly monotonic, and there are discontinuous jumps corresponding to bifurcations in the orbit; the non-monotonic behavior and bifurcations are more pronounced for other $T_H$ in the Appendix. For small R/Q values such as R/Q = 2 and R/Q = 3, the agent moves around the goal state in a tight orbit, and this orbit continuously expands as R/Q increases, as shown for R/Q = 15. However, between R/Q = 25 and R/Q = 26 the trajectory undergoes a rapid qualitative change, where the radius of the orbit around the goal state jumps.
It is interesting to note in Figure 2 that the R/Q = 2 agent has an initial loop in the right basin, while the R/Q = 3 agent does not. This behavior is counter-intuitive, as the R/Q = 2 agent should expend control more freely, and thus more aggressively seek the goal state. As shown in Figure A.1 in the Appendix, the more aggressive agent does move away from the starting state faster initially; however, it becomes trapped on the side of a repelling LCS farther away from the goal location and must make an entire orbit around the right gyre before approaching the goal state. The maximum agent velocity is smaller than the maximum gyre velocity, so even the most aggressive agents are unable to break out of the right gyre without precise timing. This type of bifurcation also occurs for fixed R/Q = 2 by varying the time horizon, as in Figure 3. In this case, the behavior is more consistent with intuition, as the longer time horizon trajectories avoid being trapped in the right gyre.
Previous work [21] suggests that low-energy trajectories tend to coincide with the LCS of the background flow. In our example, even for R/Q = 2 and R/Q = 3, the mobile agent can be seen aligning with and exploiting the coherent structures. For example, in the top left of Figure 2, the R/Q = 2 sensor moves along the intersection of the attracting and repelling LCS as it orbits the goal state. In the next section, we will see that the agent also precisely times its actuation before and after crossing a repelling LCS to take advantage of the background drift.

Instantaneous Energy vs. FTLE Ridge
Given the existing connection of low-energy trajectories and FTLE ridges, we are interested in how the energy is utilized along a trajectory. Figure 4 shows how the agent 'schedules' an increase in actuation to cross a repelling (blue) FTLE ridge. After crossing, the agent decreases its actuation, as it is naturally repelled from the blue ridge and attracted by the red ridge into the left basin. Similar timing and utilization of the FTLE ridges is observed for a wide range of time horizons and control aggressiveness.

Periodic Orbits
We observe that controlled trajectories often form periodic orbits around the goal state, as seen in Figures 2, 5, and 6. Because the background flow field is periodic, the agent would require constant actuation to stay fixed at the goal state. Instead, the agent trajectory tends to form a periodic orbit around the goal, balancing state tracking error and control expenditure. Typically, this orbit is larger for agents with a tighter energy budget (i.e., for larger R/Q). Many past studies have focused on trajectory planning where the final state is fixed at the goal. However, given the constantly evolving background flow field and its dominant effect on mobile sensor dynamics, it is important in practice to consider cases where the final state cannot be fixed. Figure 6 also indicates that the shape of the final periodic orbit depends on the frequency of the double gyre oscillation, with the frequency of the agent orbit synchronizing with the gyre frequency.
Figure 5: The mobile sensor settles on periodic orbits around the goal state (left), and the magnitude of the Fourier transform of the instantaneous energy spent by the mobile sensor (right). We observe that the time series of the energy spent is periodic with frequencies at integer multiples of the double gyre oscillation frequency, which correspond to the peaks in the right plot.
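The harmonic structure described for the energy time series can be checked with a plain discrete Fourier transform. The sketch below is a minimal illustration on a synthetic signal, not on data from this work: the amplitudes, sampling step, and record length are assumptions chosen so that the record spans exactly five forcing periods, and `dft_magnitudes` is a name introduced here.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitude of the discrete Fourier transform for bins 0 .. N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(samples))
        mags.append(abs(s))
    return mags

omega = 2 * math.pi / 10   # gyre frequency quoted in the text
dt, n = 0.25, 200          # assumed sampling: 50 time units = 5 forcing periods

# Synthetic stand-in for the instantaneous energy: a mean value plus the
# forcing frequency and its second harmonic (illustrative amplitudes).
energy = [1.0 + math.cos(omega * m * dt) + 0.5 * math.cos(2 * omega * m * dt)
          for m in range(n)]

mags = dft_magnitudes(energy)
base_bin = round(omega / (2 * math.pi) * n * dt)  # DFT bin of the forcing frequency
# Two strongest non-constant components of the spectrum.
peaks = sorted(range(1, len(mags)), key=lambda k: mags[k], reverse=True)[:2]
```

For a real trajectory, the transient approach to the orbit would be discarded first, and a record that does not span an integer number of periods would smear each peak across neighboring bins.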

MPC Parameter Sweep
We now present an exhaustive sweep through two of the most critical parameters for MPC, the prediction horizon $T_H$ and the cost function penalty ratio R/Q, for different gyre oscillation frequencies $\omega$. The first two parameters are related to the power and prediction capability of the mobile sensor, and the third parameter characterizes the unsteadiness of the background flow. We perform a full parameter sweep for the time horizon ($T_H \in [0, 10]$) and the cost penalty ratio R/Q $\in [0, 100]$, for the double gyre frequency $\omega \in [\pi/6, \pi/3]$. For each parameter value, we compute the state tracking error and the (unweighted) actuation energy expenditure, integrated along the entire trajectory. Figure 7 shows the results from the MPC parameter sweep. For all time horizons and gyre frequencies, we observe that the trajectories sweep out a Pareto front in control expenditure versus state tracking error as R/Q is varied logarithmically from 0 to 100. The bottom row of Figure 7 shows three representative trajectories along the Pareto front. As R/Q is increased, there is often a sharp drop in control cost with a relatively small increase in state tracking error, suggesting that there are energy-efficient trajectories that achieve relatively good tracking performance. However, we observe a break point in this monotonic trend, beyond which increasing R/Q results in rapid deterioration of the state error with relatively little decrease in control cost. This break point corresponds to the scenario where the motion of the sensor is dominated by the background flow, and the chaotic nature of the flow field dominates the state and energy errors. This phenomenon is more evident for smaller time horizons. It is observed that longer prediction horizons produce trajectories that are more energy efficient with smaller state errors. This is expected, as longer time horizons include more information about the flow field in the optimization. This trend is weaker for small-to-moderate R/Q and is more pronounced for larger R/Q. The shape of the Pareto curve also changes with the double gyre frequency. This shape change is particularly evident for moderate frequencies, suggesting a "resonance" in the interaction of trajectories with the background flow. Resonance with changing gyre frequency has been explored in the context of inertial particles in the double gyre flow [31].
Figure 7: Multiple simulations were carried out at each R/Q ratio, spaced logarithmically from 0 to 100, with the time horizon ranging from 1 to 10 and the gyre frequency ranging from 2π/4 to 2π/14. The data presented here are in the form of scatter plots for each gyre frequency, with each color representing the Pareto optimal tradeoff curve between the total energy spent along each trajectory and the sum of deviations from the target along the trajectory. The trajectories shown in the bottom row correspond to the highlighted purple circles (1, 2, 3) on the Pareto optimal curve corresponding to $\omega_4 = 2\pi/10$.
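Given one (energy, error) pair per simulated trajectory, the Pareto front is the subset of points that no other point beats in both quantities at once. The following sketch shows one simple way to extract it; the function name and the sample values are hypothetical and are not results from this work.

```python
def pareto_front(points):
    """Return the non-dominated (energy, error) pairs, sorted by increasing energy."""
    front = []
    best_error = float("inf")
    # After sorting by energy, a point is non-dominated only if its error is
    # strictly lower than that of every cheaper point seen so far.
    for energy, error in sorted(points):
        if error < best_error:
            front.append((energy, error))
            best_error = error
    return front

# Hypothetical (total energy, accumulated state error) pairs for illustration.
samples = [(1.0, 5.0), (2.0, 3.0), (2.5, 4.0), (4.0, 1.0), (5.0, 1.5)]
front = pareto_front(samples)
```

In this toy set, the points (2.5, 4.0) and (5.0, 1.5) are dropped because a cheaper trajectory achieves a lower error, which is the same criterion that separates the tradeoff curve from the remaining scatter.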

Sensor Velocity
To gain further insight into the dependency of sensor actuation velocity on the background flow velocity, we compare their distributions along the resulting trajectory at different R/Q values. Figure 8 shows histograms of the magnitude and orientation of the sensor actuation velocity versus the background flow velocity, for a range of R/Q values. It can be observed that more aggressive agents with smaller R/Q have larger actuation velocity magnitudes and tend to move perpendicular to the background flow. Agents with larger R/Q, corresponding to more conservative actuation policies, tend to have smaller actuation velocity and align their actuation in the direction of the flow field to take advantage of the background flow. Except in the most aggressive R/Q = 1 case, the mobile sensor rarely uses the maximum control velocity. Additional plots with the x- and y-components of the agent velocity are presented in Figure A.11 in the Appendix.
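One way to quantify the orientation of the actuation relative to the flow at each time step is the signed angle between the two velocity vectors, which can then be binned into a histogram. The helper below is a minimal sketch; its name and the sign convention (counterclockwise from the flow to the actuation) are assumptions made here, and the exact quantity plotted in Figure 8 may be defined differently.

```python
import math

def actuation_angle_deg(u, v):
    """Signed angle in degrees from the flow velocity v to the actuation velocity u."""
    cross = v[0] * u[1] - v[1] * u[0]
    dot = v[0] * u[0] + v[1] * u[1]
    return math.degrees(math.atan2(cross, dot))

aligned = actuation_angle_deg((1.0, 0.0), (2.0, 0.0))         # actuating with the flow
perpendicular = actuation_angle_deg((0.0, 1.0), (1.0, 0.0))   # actuating across the flow
opposed = actuation_angle_deg((-1.0, 0.0), (2.0, 0.0))        # actuating against the flow
```

Angles near zero correspond to the conservative, flow-aligned behavior described for larger R/Q, while angles near 90 degrees in magnitude correspond to the cross-flow actuation of the more aggressive agents.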

Discussion and Conclusions
In this work, we have investigated the behavior of finite-horizon optimal trajectories for a controlled mobile sensor in an unsteady double gyre flow field, as both the control and flow field parameters were varied. In particular, finite-time model predictive control was used to generate energy-optimal trajectories for a range of parameters, particularly the prediction horizon and the relative penalty between the state error and control effort. The double gyre oscillation frequency was varied to study its influence on the resulting trajectories. We have constrained the maximum actuation velocity to be less than the largest background flow velocity, so that some degree of intelligent planning is required to reach the goal in an energy-optimal manner. Through a quantitatively exhaustive study, we have uncovered several interesting trends and established connections between the finite-horizon optimal mobile sensor trajectories and the coherent structures of the underlying flow field. By varying the relative cost of actuation and deviations in the state (i.e., R/Q), the control cost and state error sweep out a Pareto front, and there is often a sweet spot where relatively good state tracking performance can be achieved with low actuation costs. These energy-efficient trajectories tend to align with the Lagrangian coherent structures to take advantage of the unsteady background flow. Importantly, we find that it is often possible to generate effective, energy-efficient mobile sensor trajectories with a relatively short prediction horizon, which is promising for the future design of trajectories with limited or partial knowledge of the background flow field.
We observe a rough trend of lower state error when control is less expensive, which agrees with the intuition that the agent is able to more directly pursue the goal state by actuating more aggressively. However, this trend is not monotonic, as there are several cases where slightly decreasing the control cost results in worse state tracking performance. These non-monotonic changes in the cost versus R/Q correspond to bifurcations in the agent trajectories, which correspond either to longer trajectories or to discontinuous jumps in the shape and size of the periodic orbit around the goal state. These bifurcations are more common for smaller prediction horizons, which is also consistent with the intuition that smaller prediction horizons may lead the agent to get trapped by unfavorable flow structures. Similarly, for a fixed relative control cost, there are bifurcations in the optimal trajectory with variations in the time horizon. These bifurcations are relevant in the context of generating mobile sensor trajectories using model predictive control, as small changes in the weights can lead to drastically different trajectories. Upon closer inspection, these bifurcations correspond to the agent trajectory passing through a Lagrangian coherent structure, after which the two trajectory behaviors diverge.
It is also important to note that the energy-efficient trajectories typically result in periodic orbits around the target position, since the unsteady double gyre is periodically oscillating. Previous studies in trajectory generation have mainly focused on solving boundary value optimizations for trajectories keeping the start and end points fixed. Our results show that these assumptions can be relaxed, and moreover, it is possible to reach periodic steady states with little actuation even when the uncontrolled drifter dynamics are chaotic. These periodic orbits correspond to station-keeping or hovering behavior, which is often desirable. This work has several implications for the control of individual mobile robots and swarms of robots in geophysical flows. The ability to generate energy-efficient trajectories that take advantage of the background flow with a short prediction horizon is promising for practical applications. The ability to maintain close periodic orbits around the goal state may also enable efficient long-time monitoring. For example, fixed-wing unmanned aerial vehicles must often loiter over an area for sensing and monitoring. Variations in the shape of periodic orbits and the Pareto optimal curves over different R/Q ratios with the gyre oscillation frequency have implications for ocean applications, which exhibit a wide range of spatiotemporal scales with varying oscillating frequencies. We also observed an increase in the expenditure of the sensor's actuation energy as it approached background LCS. This result is beneficial in the context of identifying background coherent structures by observing the energy expenditure patterns of controlled agents. This is an important problem with ongoing work [76][77][78]. These results are also potentially useful in the design of scalable navigation algorithms for mobile sensor swarms where the objective is to maintain cohesion or connectivity between agents.
This work motivates a number of interesting future directions. Our results indicate that it is possible to design nearly optimal, energy-efficient trajectories, even with short prediction horizons for the model predictive control; however, it was assumed that the background flow was known perfectly over this short horizon. It will be important to further explore the robustness of these trajectory optimizations in more realistic scenarios with partial, noisy, and uncertain information about the background flow. This analysis may benefit from recent works that have investigated the sensitivity of FTLE calculations to uncertain flow field data [79,80] as well as how FTLE can be used to propagate uncertainties through chaotic flow maps [81]. Because the optimization result depends strongly on how the MPC trajectories interact with LCS of the background flow field, it may also be possible to incorporate knowledge about the LCS more directly into the optimization. Even with uncertain or partially observed flow field information, the LCS are often quite persistent, and it may be possible to develop time-varying maps of the coherent structures in different geographical regions, for example off the Horn of Africa or in the Gulf of Mexico. In addition, it will be interesting to explore the use of other techniques for coherent structure identification and modal decomposition [82,83]. Further study is also required to characterize the dynamics and coherent structures of the controlled vector field of the agent given a specific control policy. In addition, all the results in this paper were developed through the study of the double gyre flow field. It will be interesting to perform similar investigations for a variety of flow fields. For example, it will be important to explore how these results change when the flow exhibits a wider range of multiscale behavior in space and time. Extending the analysis to three-dimensional flows will also be critical.

A Appendix
Here we present additional information that provides a more detailed analysis of the performance of MPC trajectories for various parameters. In addition to these extra figures, we point the reader to the online videos.
In Figure A.1, we see the evolution of trajectories with R/Q = 2 and R/Q = 3 with a time horizon T_H = 4 to explain why the R/Q = 2 trajectory initially appears to perform worse than the R/Q = 3 trajectory in Figure 2 from the main text. In particular, it appears that the more aggressive R/Q = 2 agent ends up on the wrong side of the blue LCS, which forces it to take a full revolution in the right gyre before making it to the left gyre where the goal state resides. We observe this phenomenon in several different parameter regimes, where small changes in the parameters may force agents into extra orbits in the right gyre. Figures A.2-A.10 provide similar information to Figure 2 in the main text, but with different time horizons. Even for a short time horizon of T_H = 1, the most aggressive controllers achieve relatively good state tracking performance. However, the cost versus R/Q curves for T_H ≤ 3 are considerably less monotonic than those for T_H ≥ 4, indicating several more bifurcations in the trajectory shape. For T_H ∈ [4,7], the behavior is fairly regular, exhibiting the same qualitative bifurcation behavior. Interestingly, there is a trend of bifurcations occurring later for larger T_H in this range, as the longer time-horizon controllers are able to achieve slightly better trajectories for larger R/Q values.
Finally, Figure A.11 provides the histograms of the x- and y-components of the agent velocity, complementing the data in Figure 8 from the main text. Here, u_x and u_y are the x- and y-components of the sensor's actuation, respectively. At each instant, the sensor is also moving over a background double gyre current vector whose components v_x, v_y are given by Equation 12. The top row is a histogram of the x-component of the control actions u_x (in red) against the x-component of the background current velocity v_x(x_s, y_s, t) (in black), where x_s, y_s are the sensor coordinates at time t. The second row is a similar plot for the y-component. We observe that the gyre takes values beyond the actuation capacity of the sensor, which highlights the under-actuated nature of the problem. Also, at low R/Q ratios, the distribution of control actions has two peaks at ±0.1, which corresponds to a situation similar to bang-bang control. As we increase the R/Q ratio, the distribution of control actions moves to a single peak centered around zero, corresponding to the use of very little control effort compared to the background velocity.
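The under-actuation noted above can be illustrated with a minimal sketch that evaluates the double gyre velocity on a grid and compares its magnitude with the actuation bound. The parameter values A = 0.1, ε = 0.25, and ω = 2π/10 are the commonly used double gyre values from the LCS literature and are an assumption here, as is the functional form, which may differ in detail from Equation 12. The bound of 0.1 is read off the histogram peaks described above, and the function names are hypothetical.

```python
import numpy as np

# Assumed parameters: commonly used double gyre values, not taken from Equation 12.
A, EPS, OMEGA = 0.1, 0.25, 2.0 * np.pi / 10.0
U_MAX = 0.1  # assumed actuation bound, read off the histogram peaks at +/-0.1


def double_gyre_velocity(x, y, t, amp=A, eps=EPS, omega=OMEGA):
    """Return (v_x, v_y) of the unsteady double gyre on [0, 2] x [0, 1]."""
    a = eps * np.sin(omega * t)
    b = 1.0 - 2.0 * a
    f = a * x**2 + b * x          # time-dependent deformation of the x-coordinate
    dfdx = 2.0 * a * x + b
    v_x = -np.pi * amp * np.sin(np.pi * f) * np.cos(np.pi * y)
    v_y = np.pi * amp * np.cos(np.pi * f) * np.sin(np.pi * y) * dfdx
    return v_x, v_y


def fraction_exceeding(u_max=U_MAX, t=0.0, n=201):
    """Fraction of grid points where the flow speed exceeds u_max, and the peak speed."""
    x, y = np.meshgrid(np.linspace(0.0, 2.0, 2 * n - 1), np.linspace(0.0, 1.0, n))
    v_x, v_y = double_gyre_velocity(x, y, t)
    speed = np.hypot(v_x, v_y)
    return float(np.mean(speed > u_max)), float(speed.max())


if __name__ == "__main__":
    frac, peak = fraction_exceeding()
    print(f"peak flow speed: {peak:.3f}, actuation bound: {U_MAX}")
    print(f"fraction of domain where flow outruns the sensor: {frac:.2f}")
```

Under these assumed parameters, the peak flow speed is πA, roughly three times the assumed actuation bound, so the sensor cannot overpower the current in much of the domain and must instead exploit the flow structure.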