Survey of optimal motion planning

: Optimal motion planning becomes more and more important these days and is critical for motion planning algorithms developed in the academia to be applicable to real-world applications. The objective of this article is to survey the current state-of-the-art on optimal motion planning algorithms in terms of their three main components: the decision variables and problem formulation, constraints, and objectives. A selection of proposed techniques is reviewed along with a discussion of their effectiveness and limitations. The side-by-side comparison presented in this survey helps to gain insight into the strengths and limitations of the reviewed approaches and assists with system-level design choices.


Introduction
Autonomous robots are becoming increasingly important in both industry and everyday life. In industry, rising labour costs are motivating manufacturers to consider using more robots in factories. For example, the average minimum wage in China increased by >20% in 2012, while the supply of manufacturing robots has also increased by 51% in China [1]. Europe and USA exhibit similar trends: autonomous robots are being designed to make workers more productive and make manufacturers more competitive in terms of price and quality.
The tremendous improvement in the design and availability of intelligent robots over the last decade is based on progress in many related areas, including computer vision, artificial intelligence, machine learning, control, sensor systems, and mechanical systems. Compared to traditional industrial robots, one important feature of the modern intelligent robot system is the motion planning, which can be informally described as follows. Given a robot with a description of its dynamics and the environment, and a set of goal states, the motion planning is to find a sequence of control inputs so as to drive the robot from its initial state to one of the goal states while obeying the rules of the environment, e.g. not colliding with the surrounding obstacles. The capability of efficient and safe motion planning is essential for robots to work reliably in dynamic environments along with humans.
The concept of motion planning was first presented in [2], and a large variety of algorithms have been proposed in the past decades. For the detailed history and summarisation of the motion planning, please refer to classical textbooks such as [3][4][5].
The initial focus of the motion planning research is concentrated on finding a complete planning algorithm, where an algorithm is said to be complete if it terminates in finite time, returning a valid solution if one exists, and failure otherwise. Many complete algorithms have been proposed for low-dimensional robots (with state space dimensions ≤ 3) in the past decades, and the representatives are geometric methods such as bug algorithm [6], visibility roadmaps [7], and their variants. The main idea of these methods is to build a graph structure over the 2D or 3D environment, and then use graph-search algorithms such as A* to find the shortest path over the graph either greedily or exhaustively. Unfortunately, the problem of complete planning is known to be very hard from the perspective of computational complexity. For instance, a basic version of the motion planning problem called the piano movers problem is proven to be PSAPCE-hard [8].
Practical planners came around with the development of celldecomposition methods [9] and potential fields [10,11]. These approaches relaxed the completeness requirement to the resolution completeness, i.e. the ability to return a valid solution if one exists and if the resolution parameter of the algorithm is sufficiently small. These planners demonstrated good performance in complex environments in terms of the success rate and the computational time. However, their practical applications were mostly limited to robots with state spaces up to five dimensions, because the decomposition-based methods suffered from an exponential number of cells, and potential field methods suffered from local optima.
To plan the trajectory for robots with high degrees-of-freedom (DOF), such as the industrial robots (usually six to seven DOFs) and humanoid robots (usually 30 or more DOFs), one main revolution to the motion planning is the development of samplingbased algorithms [12,13], and representatives are probabilistic roadmaps (PRM) and rapidly exploring random trees (RRT). These methods demonstrated to be very effective for motion planning in high-dimensional spaces and continuously attracted attention over the last decade, including some very recent variants and improvements [14][15][16]. Instead of using an explicit representation of the environment, sampling-based algorithms rely on a collision checking module to provide information about the feasibility of candidate trajectories and to connect a set of points sampled from the obstacle-free space in order to build a graphic roadmap of feasible trajectories. The roadmap is then used to construct the solution to the original motion planning problem. Informally speaking, sampling-based methods bypass the curse of dimensionality problem by avoiding explicit construction of obstacles in the state space. It provides a relaxed guarantee called the probabilistic completeness in the sense that the probability that the planner fails to return a solution if one exists decays to zero as the number of samples approaches infinity. The rate of decay of the probability of failure is exponential under the assumption that the environment has good 'visibility' properties [12,17].
The early work on motion planning focused on finding a trajectory that satisfies the constraints imposed by the environment of the application. However, a feasible solution is not sufficient in most applications where the quality of the solution returned by the motion planning algorithm is also important. For instance, in many industrial applications, the users are interested in solution paths with the minimum time to execute, so as to achieve the highest productivity. This requirement results in the problem of optimal IET Cyber-Syst. Robot., 2019, Vol. 1 Iss. 1, pp. [13][14][15][16][17][18][19] This is an open access article published by the IET and Zhejiang University Press under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/) motion planning, i.e. computing motion plans that are of the minimum cost with respect to a given cost functional, such as the length of a path or energy consumed. This problem has been proven to be very challenging even in basic cases [18].
There are some other limitations of the early motion planning work discussed above. First, it is difficult for these approaches to take into account the complex constraints imposed by the robot dynamics, though some early work was able to deal with simple kinodynamic constraints (e.g. the velocity or acceleration bounds) [19,20]. The lack of consistency is another major limitation of the early sampling-based methods, especially for those multi-query algorithms like RRT and its variants [13]. Here, the consistency requires that given two queries with similar initial/goal states, the resulting trajectories should be similar. It is well-known that sampling-based planning is not consistent, which significantly limits their application in industrial application where the predictability of the trajectory is very important. Both these limitations can be handled by the optimal motion planning methods.
Informally speaking, optimal motion planning methods formulate the motion planning task as an optimisation problem, which includes three components: i. a suitable representation of the problem in terms of the choice of decision variables; ii. a suitable representation of the constraints; iii. a suitable representation of the objectives.
Given the optimisation problem, we can then solve it using either out-of-box optimisation solvers or using specially designed optimisation approaches. Here, we will discuss all the optimal motion planning approaches according to their different choices in terms of the three optimisation components: variables, constraints, and objectives, as summarised in Table 1.
Here, we will discuss three mainstream methods for optimal motion planning, including i. grid-based methods with motion primitives and state cells; ii. sampling-based methods that are asymptotic optimal; iii. trajectory optimisation techniques.
In particular, we will focus on analysing the advantages and disadvantages of these approaches, and whenever possible, provide potential solutions to the limitations of previous optimal motion planning algorithms.
The rest of the paper is organised in the following manner. We discuss the decision variable representation of the optimal motion planning problem in Section 2, investigate different ways to formulate planning constraints in Section 3, and then summarise different ways to describe the objectives in Section 4.

Decision variables and representation for optimal motion planning
The grid-based optimal motion planning algorithms [21][22][23][24][25][26] divide the configuration space into finite cells based on a regular structure (often a grid) of some size or resolution and thus are frequently referred to as search-based planning methods. Each cell represents the set of all configuration space states that fall within it. A representative state such as the cell centre is chosen to determine whether a cell is free or not. A graph is constructed by assigning a vertex to each cell, and by adding edges between vertices that are close in the configuration space and can be connected by a feasible motion. Vertices and edges are only added when they are found to locate in the collision-free region of the configuration space. All edges can have costs associated with them, providing a way to minimise a cost function along the path instead of just finding any path. Using this representation, the optimal motion planning problem is turned into finding a path optimisation problem over the graph. Note that even with a finite number of vertices, the constructed graph can often be too large to fit into the memory. As a result, the graph is often generated in an online and on-demand fashion, where vertices/edges are verified and allocated only when the graph search algorithm research them. In this way, even for problems with an infinite number of states, the search-based method can find the goal by only allocating memory for a small portion of the configuration space. There are many extensions of the grid-based search. For instance, they were extended to run in an anytime fashion [27,28] to deal with dynamic environments [29], and to handle systems with differential constraints [30]. These have also been successfully demonstrated on various robotic platforms. However, optimality guarantees of these algorithms are only ensured up to the grid resolution. In addition, the number of grid points grows exponentially with the dimensionality of the state space in the worst case, thus the worst-case running time of these algorithms can be slow.
Sampling-based optimal motion planning algorithms can be considered as a variant of the grid-based optimal motion planning, and the main difference is that the vertices are generated by random sampling instead of fixed grid cells. These algorithms avoid explicitly representing configuration space obstacles, which can be computationally expensive. They instead sample vertices and connect them with local paths in the collision-free state space resulting in a graph optimisation problem with the joint configuration as the decision variables. Due to the randomness, the sampling-based approach returns zigzag trajectories which are not acceptable for many applications. Some RRT variants have used different heuristics to improve the path quality but are not provably optimal [31]. One important progress was achieved through the use of random graph theory to rigorously show that with appropriate parameters and an A * -alike rewiring procedure, roadmap-based approaches can achieve asymptotic optimality [32]. In particular, each new sample must be tested for connection with at least a logarithmic number of neighbours as a function of the total number of nodes using a steering function. The resulting algorithms PRM* and RRT* motivated a set of new methods that are asymptotically optimal, including the anytime [33], incremental [34], lazy [35] variants, and some other heuristics [36]. The roadmaps of these asymptotically optimal methods can be slow to construct and quickly grow too large for storage for fast online queries, especially for relatively high-dimensional instances. To overcome this challenge, techniques that guarantee asymptotic nearoptimality using sparse roadmaps have been proposed, including [37][38][39][40][41].
A summary of the different obstacle representations used in the optimal motion planning algorithms is provided in Table 2.
Trajectory optimisation is an established method in robotics to generate a high-quality path from an initial trajectory that may be in-collision or dynamically infeasible. It has also been used as a post-processing phase to remove redundant or jerky motion in trajectories generated by traditional planning algorithms [5]. Trajectory optimisation works by solving a constrained non-convex optimisation problem over the trajectory space in terms of a sequence of states. The objective usually is the smoothness of the trajectory. Common inequality constraints include obstacle avoidance, joint limits, joint speed limits, inverse singularity etc., and common equality constraints include the end-effector pose (e.g. reaching a target pose at the end of the trajectory) and orientation constraints (e.g. keeping a held object upright). For under-actuated, non-holonomic planning, additional equality constraints are added to ensure consistent kinematics and dynamics. Many approaches use the waypoint-based representation for trajectory with each waypoint being a robot configuration and use cost-gradient information for minimisation. Such methods Table 1 Taxonomy of optimal motion planning algorithms in terms of the decision variables, optimisation objectives, and optimisation constraints Variables Objectives Constraints include CHOMP [42], STOMP [43] and Trajopt [44]. It is also possible to use a parameterised spline to represent the trajectory as a sequence of spline segments, and this representation has been used in humanoid robot motion planning [45,46]. There is extensive, closely related work in optimal control, which focuses less on collisions and more on systems with complex dynamical properties. Some of the most notable work includes differential dynamic programming [47,48], iterative LQR [49], approximate inference control [50], and optimisation-based control involving contacts [51][52][53][54]. Recently, the relationship between the trajectory optimisation and simultaneous localisation and mapping (SLAM) has also been explored [55].

Constraints for optimal motion planning
In this part, we first analyse three typical types of constraints that are widely used: the collision-free constraints, the pose constraints, and the dynamics constraints. We will discuss how to deal with general constraints. After that, we will briefly discuss how find feasible solutions to these constraints.

Collision-Free constraints
Both sampling-based and grid-based methods use collisionchecking procedure to guarantee all the vertices to be collision-free and use continuous collision checking or other local planning procedures to guarantee all edges to be collision-free. Thus, the collision-free constraints are implicitly satisfied in the resulting graph. Thus, we here only discuss how the trajectory optimisation formulates the collision-free constraints. A successful collision-free motion planner for general problems should take into account the global geometric information of obstacles. For instance, the sampling-based planners use the sampling procedure to explore the entire configuration space and obtain the global geometric information step by step. Such global exploration makes the sampling-based planners probabilistic complete. Almost all trajectory optimisation methods only utilise local geometric knowledge about the environment, and thus are not complete, i.e. they cannot guarantee to perform well in all scenarios.
For instance, Trajopt [44] determines its collision avoidance strategy based on the gradient information. Such gradient can be misleading in many scenarios (with Fig. 1 as a simple example) since instead of using the entire geometric information about the obstacle, it only leverages the local penetration depth information between the robot and the obstacles.
While geometric methods and CHOMP [42] can succeed in the simple scenario in Fig. 1, they are not general enough to resolve all situations violating the collision-free constraints. CHOMP requires the pre-computation of a potential field in terms of the signed distance to the obstacles, where the potential field can be computed using two different approaches: i. calculate a distance function for geometric obstacle primitives (boxes, spheres, and cylinders); ii. discretise the entire workspace and calculate distance value for each volume using Euclidean distance transformation (EDT) algorithm.
In this way, CHOMP is able to leverage the global geometric information about the obstacles. However, it can be trapped in the local optima, especially for scenarios with non-convex obstacles. As a result, selecting a good initial guess plays an important role in improving the performance of CHOMP [56]. Note that the reasons why Trajopt and CHOMP need good initial guesses are different. For Trajopt, the distance gradient may not be sufficient to find a feasible solution, and randomness is required to generate enough 'perturbation' such that the initial guess can locate in some attraction region where the gradient can lead the trajectory into a collision-free status. CHOMP fails mainly due to the highly non-linear property of the distance function, which makes the global optimisation challenging.

Pose constraints
In contrast to the collision-free constraint, trajectory optimisation methods outperform the sampling-based planner when dealing with pose constraints. The main reason is that the pose constraints induces a lower dimensional manifold, i.e. one with zero measure in the configuration space. As a result, the sampling-based planning algorithms, even when using various heuristics, are difficult to generate a sufficient number of samples to reconstruct the connectivity in the constrained manifold [57]. In Table 3, we summarise the different strategies that sampling-based algorithms can use to deal with pose constraints.
Next, in Table 4, we compare the sampling-based methods and trajectory optimisation methods while solving the pose constraints.

Dynamics constraints
Dynamics constraints are usually the most challenging one for motion planning [19], and there are a wide variety of differential dynamics constraints including non-holonomic constraints and kinodynamic constraints.
Non-holonomic constraints mean the non-integrable speed constraints. This concept was first introduced by Laumond [58] and is mostly about kinematics which can be formulated using action spaces and configuration transition functions. Kinodynamic constraints were introduced by Donnald et al. [59], which added second order or high-order constraints to configuration spaces. Interesting work about motion planning and optimisation taking into account non-holonomic constraints includes [60], which used sinusoids to steer cars. One solution to deal with the nonholonomic constraints is using discrete time models and reachability graphs.
To deal with kinodynamic constraints, a popular solution is to define motion primitives. For example, Frazzoli [61] used motion primitives to manoeuvre drones; Lamiraux [62] used motion primitives to expand RRT trees, where the node expansion procedure was also optimised. Hsu [63] adapted his EST method to deal with kinodynamic constraints, where the expansion of motion primitives was performed in the configuration-time space and this method is able to deal with moving obstacles.
Besides kinodynamic constraints, there exists work that considered other challenging dynamic constraints such as the friction force [64].
For the drone dynamics, another popular solution besides motion primitives is to use mixed-integer linear programming (MILP), which is first introduced in [65]. In this approach, the environment is segmented into several convex regions [66,67], and a piecewise polynomial function is used to generate a trajectory that is both free of collision and feasible in dynamics.
The optimal motion planning taking into account the dynamics constraints has many applications in humanoid robots, where balancing is one of the most important factors. There are many classical work about this topic, including [45,46,53,68,69]. Some recent work on soft robot control also benefited from the progress in this area [70].

General constraints
The collision-free constraint and pose constraint (for high-DOF robots) share three properties: i. too complex to have an explicit and analytical representation of the constraint manifold in the configuration space; ii. the distance to the constraint manifolds is a non-linear function; iii. both essentially an inverse kinematics problem.
In Table 5, we further investigate the constraints involved in the motion planning in terms of two factors: the measure of the constraint manifold in the configuration space, and whether the distance to the constraint manifold in the configuration space is well-defined.

Find feasible solutions
Given the constraints, one remaining problem is how to find the feasible set for the optimal motion planning problem, i.e. the set of trajectories that satisfy all the constraints. This is a non-trivial problem due to the non-linearity of many constraints. From the comparison in Table 5, it is obvious that heuristics play an important role in the performance of the motion planning. We here provide a brief comparison between the heuristic sampling-based approach and the trajectory optimisation technique using a concrete example.
Urmson et al. [31] proposed a heuristic sampling-based algorithm (hRRT), which biases samples to the state with higher quality by shaping the probability distribution of samples. It emphasised the importance of an appropriate probability distribution in the configuration space for fast convergence. Fig. 2 shows a situation where traditional RRT algorithm may take a long time to find a solution while hRRT can be very fast.
The difference between RRT and hRRT arises because similar to the trajectory optimisation, the sampling-based algorithms also suffer from the lack of global geometric information about the environment when the planning starts, and have to spend a lot of time exploring the configuration space in a local propagation manner. A suitable heuristics can encode the global geometric

/ -
Note: Both projection and trajectory optimisation are heuristic methods, and thus they are both prone to local minima since the pose constraint is non-linear.  information implicitly and thus can significantly improve the search performance of the sampling-based methods. Many heuristics have been proposed in previous work to help the planners satisfy general constraints in an efficient manner. Most of these heuristics are designed for the collision-free constraint, which is considered as the most challenging 'hard' constraint in motion planning. In particular, OBPRM [71], Gaussian sampling [72], retraction-based planners [17,73], and methods specially designed for narrow passages [12,74]. Adaptive sampling strategies have also been proposed that evolve while more information about collision-free configuration space can be inferred via sampling. For instance, [75,76] approximated the free space using a set of size-varying balls around nodes in the RRT representation. Reference [77] approximated the configuration space with a set of prior samples, either collision-free or incollision. Reference [78] extended the adaptive sampling approach in [77] to non-holonomic motion planning by defining the utility of local paths. Reference [79] constructed roadmaps in both the free configuration space and the configuration space obstacles, for generating more samples in narrow passages.

Objectives for optimal motion planning
Traditional sampling-based algorithms (e.g. RRT, PRM) do not take path cost or quality into account, and most of the time they generate non-optimal value [32]. Luna et al. summarised three ways to improve the quality of sampling-based solution paths [80] as Table 6 lists.
Both [80,81] demonstrated that given a time budget for planning, offline heuristic techniques for solution optimisation can outperform specialised planners which compute the globally optimal path (RRT*) in certain scenarios. Optimisation techniques are also combined with the RRT solution in [82].
For grid-based methods, costs are assigned to graph vertices or edges, and then the resolution-based optimality is guaranteed by the underlying graph search algorithm.
For trajectory optimisation, the construction of appropriate objectives is the main way to achieve a path with desirable properties.

Time-optimal trajectory optimisation
An intuitive solution to take into account time is to use a mixed configuration-time space. Most sampling-based methods could be adapted to the configuration-time space by introducing metrics to deal with the asymmetric time dimension [5]. However, when there is bound in speed, the adaption becomes difficult. Researchers studied the problem using motion states and velocity tuning [83], or decoupled coordination [84,85], but these methods were not complete. Another way is to define action spaces and configuration transition functions. Robot configurations are supposed to change following actions and their correspondent configuration transition functions.
Generally, solving a time-optimal motion planning in an environment with obstacles is not an easy task. However, there is a series work [86][87][88][89][90] on solving the time-optimal trajectory generation in an open space, and the focus is how the complex dynamics determines the time-optimal behaviour of the robot. By leveraging this time-optimal trajectory following procedure, some two-steps are proposed to avoid obstacles in a time-optimal manner [91,92]. In the first step, a smooth and collision-free trajectory is planned with simplified or zero dynamics taking into account. In the second step, the planner will try to follow the first-step trajectory using the time-optimal following property to compute a trajectory that is both collision-free and time-optimal. Similar techniques have been used for time-optimal RRT replanning in [93].

Other objective functions
There are some other objective functions that are also widely considered in the optimal motion planning, including the trajectory smoothness [42,44,94], the energy efficiency [95,96], which heavily used the tools in Lie groups.

Summary and conclusions
The past three decades have seen increasingly rapid progress in the optimal motion planning technology. This rapid progress has been enabled by major theoretical advances in the computational complexity, random sampling, and optimisation techniques. Different ways to formalise the optimisation problem have been proposed, including graph/grid formulation and waypoint trajectory optimisation. A large number of different constraints, including challenging collision-free constraints, pose constraints, and nonholonomic constraints, have been tackled. A huge number of objective functions have been proposed to achieve trajectories with different desirable properties. However, there are still many open problems left, such as how to incorporate collision-free constraints in an efficient and effective manner; how to take into account the dynamics constraints in the optimisation framework without twostep sub-optimal solutions. We hope this survey can serve as references for further research in the optimal motion planning area.

Acknowledgments
This work is supported by GRF 17204115 and 21203216.