Computational load reduction of the agent guidance problem using Mixed Integer Programming

This paper presents a solution to the agent-guidance problem in an environment with obstacles, a problem whose avoidance techniques have been studied extensively in recent years. There is still a gap between the solution times required to obtain a trajectory and those demanded by real-world applications, which usually face a tradeoff between limited on-board processing performance and the high volume of computing operations demanded by real-time operation. In this paper we propose a deferred-decision technique that produces clusters used for obstacle avoidance as the agent moves through the environment, much as a driver at night lights up only the road ahead as the car moves along a highway. By considering the spatial and temporal relevance of each obstacle throughout the planning process and pruning areas that belong to the constrained domain, one may relieve the inherent computational burden of avoidance. This strategy reduces the number of operations required and increases it only on demand, since a computationally heavier problem is tackled only if the simpler ones are infeasible. It is an improvement based solely on problem modeling which, for example, may offer processing times of the same order of magnitude as the lower bound given by the relaxed form of the problem.


Introduction
In the last decades, computers have assumed an increasing share of tasks previously assigned to humans. Repetitive chores such as automobile assembly on a production line can already be almost fully automated. Nevertheless, other tasks still depend on human judgment, such as aircraft guidance. In such a case, unpredicted events sometimes require immediate responses that demand fast replanning to obtain alternative feasible routes.
This work proposes a computationally efficient solution to the problem of calculating the trajectory of mobile agents in an environment populated by obstacles. Linear programming by itself cannot directly model this type of problem, since the solution domain is nonconvex. However, recent research on agent trajectory planning [1][2][3][4][5][6][7][8][9][10] shows that it is still possible to use linear programming subject to integer-variable constraints to optimize the trajectory of an agent in such an environment. We use mixed-integer linear programming under the assumption that the totality of the available space need not be considered a valid possibility for the agent at any and every moment. In the car example above, the headlights favor lighting the ground near the car over the more distant areas; even though these also require illumination, they are not as relevant as the nearer ones for steering the agent. Fig 1 shows an example of the use of binary variables in an instance of a trajectory planning problem. Initially at the position indicated by index k, the agent takes the envelope (continuous lines) of each obstacle (dashed lines) to obtain the trajectory along steps k + 1, k + 2, . . ., k + 4. In such a case, each obstacle has a set of binary variables, indicated below the figure, that denote at each timestep whether the agent is respectively to the left-hand side of, below, to the right-hand side of, or above each obstacle. Therefore, 28 binary variables would be necessary at every timestep to avoid the obstacle set in this case.
This problem of obstacle avoidance is NP-hard [11][12][13], and the current approaches found in the scientific literature are tailored to each specific circumstance. For example, [14] obtains the optimal solution after exploring the complete solution space, which may demand, for complex environments, an intense computational effort that might be unattainable within a small time slice. Under the assumption that the positions of the fixed obstacles in the environment are known in advance, we propose a strategy to obtain a sub-optimal trajectory for the agent through obstacle clustering. Basically, we combine the traditional mixed-integer algorithms of obstacle avoidance with an iterative-deepening implicit tree search over the subproblems of cluster avoidance, with a non-increasing clustering distance. As the number of obstacles gives an upper bound on the number of clusters, by reducing the clustering distance we detect the best existing solution for the trajectory planning problem.
The aforementioned complexity of this problem means that, as the number of obstacles increases, trajectory optimization becomes considerably more costly. A natural plan, then, is to reduce the number of obstacles considered throughout the process. Given the scheme in Fig 1, it would be possible to cluster obstacles 1 and 2, since the agent moves Above both, which sets the fourth column for them along the entire trajectory. Note that this decreases the size of the problem to be solved, since a single cluster replaces both obstacles.
The immediate reaction of a human being in traffic, when driving a car and trying to escape a collision with two other cars ahead, is to promptly cluster the obstacles ahead, like other cars or lamp posts, and seek complementary, obstacle-free territory for navigation. As previously noted, this is one of the approaches taken in this paper. In such a case, the fastest way to compute a safe path is to cluster nearby obstacles and initially seek to drive the car into an open nearby position. The question that remains is how far apart two obstacles may be and still be clustered together, and the strategy one would choose again follows the intuition that one should first look for safe positions in the wider open areas.
However, other approaches can additionally be adopted to reduce the complexity of this problem. Consider in Fig 2 the complete map of the environment for the obstacles of Fig 1, where Q represents the terminal set. Among other simplifications, after step k + 2 the entire trajectory of the agent only moves away from obstacles 1 and 2, for example. Considering them as obstacles to be avoided, when it is already predicted that a collision will not happen, is a source of computational waste. In addition, take for example obstacles 14 and 20. Right after step k + 1, the entire trajectory of the agent is predicted to stay below both obstacles and to the right-hand side of obstacle 20. The expense of limited computational resources on irrelevant calculations such as these is also useless, since they are already predicted not to aid in the computation of the final trajectory. These are a preview of the techniques we will use here.

Motion planning and Mixed-Integer programming
In the literature, examples of motion planning are abundant and, regardless of the medium or vehicle used, the methods may be split into two separate layers: path planning and trajectory planning. Path planning methods build routes without time parametrization, as in [15] or [16], which define crossover and mutation operators on genetic algorithms, or in [17], which uses a wavelet-based decomposition to ease the computational load of a multiresolution path planner.
Trajectory planning methods, on the other hand, search for routes that respect the movement constraints along time. For example, while in [18] a linear regression model predicts the evolution of human mobility in regions of a megalopolis, [19] uses statistical modeling to investigate the hierarchical structure of accident causes in autonomous vehicles and [20] proposes a proportional-integral-derivative (PID) controller for the real-time robotic stabilization of a robotic arm to act upon a dynamically moving human with a tumor.
However, it is possible to combine both path and trajectory planning by using Mixed-Integer Programming (MIP). The MIP approach is a general problem-solving framework that involves both discrete and continuous variables. In the case of agent movement, for example, we may use the former to model the activation of brake mechanisms while the latter calculates the speed and yaw angle in a curve. By constraining the domain of some variables to integers only, MIP becomes a much more general approach than a Linear Program (LP).
In particular, as in [21], for certain measures only Boolean variables are considered (thus assuming values 0 or 1). This procedure allows the system dynamics to be modeled by transforming the propositional logic equations that derive from it into mixed-integer inequalities, which can be computed by existing mixed-integer programming solvers, such as CPLEX [22], Gurobi [23], Xpress [24], Xpress-MP [25] or ParaXpress [26].
Applications of MIP in the literature are plentiful, whether in terrestrial [1, 5-8, 18, 19], aquatic [27][28][29], aerial [2, 4, 9, 16, 30-32] or even space [33] environments. For instance, the authors of [14] propose an approximate model of aircraft dynamics using linear constraints and apply a MIP approach to the trajectory planning of airplanes. The model ensures collision avoidance for each aircraft and guarantees the fulfillment of the desired hard constraints. In turn, [34] applies a constraint-tightening strategy to obtain a robust solution that guarantees finite-time arrival into an arbitrary target set. In spite of unknown disturbances, the central idea is to hold back a "border" for feedback action as time goes by.
We assume in this manuscript that the computation time demanded by trajectory planning grows with the number of binary variables used for obstacle avoidance. This is not always valid, but serves as a rule of thumb for improving computational performance. In [35], this number is used to express the complement of the polytope regions, associating a unique number with each obstacle. As the sequence of powers of 2 is superincreasing, according to [36], any integer between 0 and N can be coded in ⌈log2(N + 1)⌉ binary digits, which is the number of binary variables necessary to distinguish between N different regions.
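As an illustration of this bound, the following sketch (our own, not taken from [35] or [36]; the function names are ours) computes the number of binary digits needed and the corresponding distinct codes:

```python
import math

def region_bits(n_regions: int) -> int:
    """Number of binary variables needed to index n_regions regions."""
    return math.ceil(math.log2(n_regions + 1))

def encode_region(index: int, n_bits: int) -> list[int]:
    """Binary code (most significant bit first) for a region index."""
    return [(index >> b) & 1 for b in reversed(range(n_bits))]

n = region_bits(20)  # e.g. 20 obstacle-free regions
codes = {i: encode_region(i, n) for i in range(20)}
assert len(set(map(tuple, codes.values()))) == 20  # all codes distinct
```

Encoding region indices rather than keeping one binary per region is what keeps the variable count logarithmic in the number of regions.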
In other words, [35] sets a global limit on the encoding itself. It is important to recall that the direct MIP solution to the trajectory optimization problem in an environment with obstacles is hard because, at every timestep, each obstacle face introduces nonconvexity into the solution space [11][12][13]. As a consequence, for every obstacle with N_f faces, N_f nonconvex constraints must be added at each timestep to ensure that the desired trajectory remains outside the obstacle. As a result, to obtain further processing speedups, the techniques that followed [35] involve a pre-processing step, performed separately from the optimization. For example, [37] proposes a strategy that requires a pre-planned path to define intermediate target sets, known as waypoints. Alternatively, [38] obtains large ellipsoidal regions of convex obstacle-free space in intricate environments with a greedy convex segmentation technique, and [31] provides an entirely collision-free path with a reduced number of integer variables. In turn, [39] divides large and complex environments into smaller segments through several pre-processing steps, [40] uses convex optimization to obtain target defect areas, and [41] builds a convex lifting which partitions the space and reduces the problem to convex optimization.
However, actively-changing scenarios often demand a dynamically-built method, and this is a special contribution offered here. For instance, [42] develops a similar approach with a three-stage algorithm: first it computes a collision-free path through the environment, next it generates convex polytopes that contain such a route, and then it poses a MIP to determine the dynamically feasible path. Yet, we know that peripheral, possibly distant, obstacles are not as significant as those circumvented along a path, which opens the way to performance gains in scenarios with many obstacles.
Additionally, it is worth noting that the approaches presented here do not rely on pre-processed trajectory segments. However, as will become clear, the results we achieved introduce a feasible and versatile way of solving the trajectory planning problem, specifically through obstacle clustering.

Materials and methods
In this article we study the problem of maneuvering an agent into a target region in a two-dimensional environment. The agent has state x, composed of its position r and velocity v, and must not collide with obstacles. The dynamics of the agent is represented in Eq (1), and Model Predictive Control (MPC) is used to perform the maneuver of the system in Eq (1). The core of the control problem is to optimally choose the predictions x̂[k + j|k] and the corresponding û[k + j|k] for each timestep j ∈ {0, . . ., N}. To attain the target set Q in finite time, we take the horizon length N[k] as a decision variable within the optimization, which represents the predicted time of entering it. This paper thus deals with the task of maneuvering an agent with linear dynamics while minimizing the cost function in Eq (2), subject to the terminal constraint in Eq (3d), x[k + N[k] + 1|k] ∈ Q[N[k] + 1].

The notation used throughout the paper is the following: r_x ∈ R, position along a coordinate axis in a horizontal plane regarding an arbitrary origin; r_y ∈ R, position along a coordinate axis (perpendicular to the first) in a horizontal plane regarding an arbitrary origin; v_x ∈ R, velocity regarding the r_x position; v_y ∈ R, velocity regarding the r_y position; a_x ∈ R, acceleration regarding the v_x velocity; a_y ∈ R, acceleration regarding the v_y velocity; dimension of the random obstacle environment in the same axis as the one of r_z, where r_z refers to either the same axis of r_x or r_y; b_z ∈ R, relative coordinate of the obstacle center in the same axis as the one of r_z; h_z ∈ R, average length of a random obstacle in the same axis as the one of r_z; B_z ∈ R, coordinate of the center of the obstacle regarding an arbitrary origin, where z refers to either the same axis of r_x or r_y; f_0z ∈ R, position of the smallest coordinate of the environment regarding an arbitrary origin; f_1z ∈ R, position of the largest coordinate of the environment along the same coordinate axis as f_0z; m_z ∈ R, length of a border which defines the region that contains the obstacle set; N_ob^b ∈ N, number of bygone obstacles in a territory; N_ob^e ∈ N, number of exterior obstacles in a territory.
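For concreteness, linear agent dynamics of this kind are often instantiated as a discrete-time double integrator. The sketch below is our own assumption (with an arbitrary sampling period), not necessarily the exact model of Eq (1), for the state x = [r_x, r_y, v_x, v_y] and input u = [a_x, a_y]:

```python
import numpy as np

dt = 0.1  # sampling period (an assumed value)

# Discrete-time double integrator: positions integrate velocities,
# velocities integrate the commanded accelerations.
A = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
B = np.array([[dt**2 / 2, 0],
              [0, dt**2 / 2],
              [dt, 0],
              [0, dt]], dtype=float)

def step(x, u):
    """One prediction step x[k+1] = A x[k] + B u[k]."""
    return A @ x + B @ u
```

In the MPC formulation these same A and B matrices appear as equality constraints linking consecutive predicted states over the horizon.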

Time minimization
It is possible to transform a variable-horizon problem involving time minimization into a fixed-horizon one with the use of integer variables. We assume a polytopic set Q of terminal constraints, described by D_m ∈ R^(N_Q×4) and H_m ∈ R^(N_Q). To perform the time minimization task, the terminal constraint in Eq (3d) can be rewritten by making use of auxiliary binary variables b[j], defined as b[j] = 1 if N[k] = j and b[j] = 0 otherwise. The constraints in Eq (5) impose, for some j such that b[j] = 1 and in interplay with the cost function to be redefined in Eq (7), that the agent must be within Q, where N_s is a fixed maximum horizon that must be larger than the optimal N[k]. To ensure that the binary variable b assumes the unitary value exactly once over the horizon N_s, an additional constraint is imposed. As a result, the variable-horizon cost function in Eq (2) can be rewritten using a binary variable vector b to ensure correctness in a fixed-horizon approach: the control is zero for all steps beyond the chosen maximal horizon N̄. This is a consequence of the relaxation of Eqs (3a) and (3c) after N[k] in the minimization of Eq (2).
Fig 4 presents a scheme that represents the relaxation of the time-minimization binary variables in a trajectory planning problem. Note that N*[k = 2] = N*[k = 1] − 1, i.e. the optimal horizon at k = 2 is one unit smaller than the one at k = 1. The constraints in Eq (5) are relaxed wherever the corresponding binary is zero. To accomplish this, the scalar M_t must be chosen such that M_t > d_m^i x − h_m^i for all admissible x [21]. That is, M_t must be chosen large enough for every x reachable in N̄ steps, so as to serve as a barrier that allows numerical solvers to correctly modulate the domains of action of the continuous variables through the binary variables b[k + j|k], 0 ≤ j ≤ N_s − 1. Such variables act as sidings that separate the action domain of each continuous instance of the problem and allow the solver to perform global optimization calculations by evaluating multiple local domains at once. Note also that, for k = 2, the binary variable b[k + N̄|k], which marks the arrival at Q in the last step of the simulation horizon, is clear, so that Eq (7) ≡ Eq (2) in this problem instance.
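The role of the binary vector b can be illustrated with a minimal sketch: b[j] = 1 only at the first predicted step whose position satisfies the polytopic terminal constraint D_m p ≤ H_m, so that b sums to one over the horizon. The terminal box and N_s below are assumed values of ours; this emulates the semantics of the binaries rather than solving the actual MIP:

```python
import numpy as np

N_s = 10  # fixed maximum horizon (assumed value)

# Polytopic terminal set Q = {p : D_m p <= H_m}: here the box
# 8 <= p_x <= 10, 3 <= p_y <= 5 (assumed values).
D_m = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]], dtype=float)
H_m = np.array([10.0, -8.0, 5.0, -3.0])

def arrival_binaries(pred_positions):
    """Emulate b: set b[j] = 1 at the first predicted position inside Q
    and 0 elsewhere, so that sum(b) == 1; returns (b, arrival step)."""
    b = np.zeros(N_s, dtype=int)
    for j, p in enumerate(pred_positions):
        if np.all(D_m @ p <= H_m):  # terminal membership test
            b[j] = 1
            return b, j
    return b, None
```

In the full MIP the solver chooses b freely, and the big-M terms of Eq (5) deactivate the terminal constraint wherever b[j] = 0.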

Obstacle avoidance
The optimization of trajectories in two-dimensional territories clear of obstacles presents an inherently convex search space, but the insertion of an obstacle into the region may render the problem nonconvex. Hence, we describe the commonly adopted remodeling that follows.
As a common requirement in an agent guidance problem, its formulation includes an obstacle avoidance task. Any polytopic obstacle O can be represented by the set of linear inequalities in Eq (8). We avoid collisions with obstacles by imposing that, at each time step, the position of the agent is outside at least one face of each obstacle. This is done through the binary variable b^O_{f,j}, which indicates whether the agent is outside face f of obstacle O at time step k + j.
As we need x[k] ∉ O ⊂ R^4 for all k ∈ N, the avoidance constraints of Eq (9) must hold at every timestep. Obstacle avoidance in a trajectory planning problem can then be attained with the strategy of Alg 1. Initially we load the simulation parameters and, while the agent state x[k] does not attain Q, based on N_s and the positions of the obstacle set R_ob, we assemble the matrices A and B that contain the constraints of the problem, plan the trajectory by solving a traditional MIP problem, and finally evaluate and update the system state.
Alg 1. Closed-loop receding horizon maneuvering with Obstacle Avoidance. However, as each obstacle partitions the search space into N_f disjoint regions, the overall performance in environments with tens of obstacles deteriorates severely with respect to the initial obstacle-free case.
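A minimal sketch of the big-M form of these avoidance constraints for one axis-aligned rectangular obstacle follows; the value of M and the face ordering are our assumptions. Each row (a, base) encodes a @ r ≤ base + M(1 − b_f), so a face constraint binds only when its binary b_f is set, and the MIP additionally imposes Σ_f b_f ≥ 1:

```python
import numpy as np

M = 1e4  # big-M constant (an assumed value)

def avoidance_rows(obstacle):
    """Rows (a, base) of the face constraints of an axis-aligned
    rectangular obstacle (x_min, y_min, x_max, y_max); the MIP form is
    a @ r <= base + M * (1 - b_f) for the binary b_f of face f."""
    x0, y0, x1, y1 = obstacle
    return [(np.array([ 1.0,  0.0]),  x0),   # agent left of the obstacle
            (np.array([ 0.0,  1.0]),  y0),   # agent below
            (np.array([-1.0,  0.0]), -x1),   # agent right
            (np.array([ 0.0, -1.0]), -y1)]   # agent above

def can_satisfy_sum(r, obstacle):
    """True iff some face constraint holds with b_f = 1, i.e. the
    requirement sum_f b_f >= 1 is satisfiable at position r."""
    return any(a @ r <= base for a, base in avoidance_rows(obstacle))
```

A position strictly inside the rectangle violates all four rows at once, so no binary assignment can meet the sum constraint there, which is exactly what excludes the obstacle interior from the feasible set.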

Inter-sample avoidance
In addition to guaranteeing obstacle avoidance in the sampled time steps, inter-sample avoidance is achieved by applying the avoidance constraints from step j + 1 at the preceding timestep j, as proposed by [43]. This imposes additional constraints to the problem, but employs no additional binary variables, as Eq (10) shows.

Graph theory background
In this work, we assume that each obstacle is a static object with known coordinates. Let R_ob ⊂ R^(N_f N_ob) contain the smallest and largest coordinates of the projection of each obstacle of O onto the position space. Let also T be a territory with target set Q in an environment around the agent that contains N_ob obstacles in a given clustering region, with a corresponding maximum clustering distance d_c^max, chosen by the trajectory planner. Then T can be mapped onto an undirected graph G = (O, E), where the order of the graph is |O| = N_ob and |E| indicates the number of obstacle pairs that maintain a distance d ≤ d_c^max. As an example, two nodes O_1 and O_2, corresponding to obstacles O_1 and O_2 in O with points p_1 ∈ O_1 and p_2 ∈ O_2, will be connected if Eq (13) holds, where the function r_d(·) returns the position of its argument.
The fulfillment of Eq (13) for each pair of obstacles is recorded by the adjacency matrix A, a square matrix of order N_ob in which a_ij = 1 if and only if the vertices (obstacles, in this case) O_i and O_j are connected, and a_ij = 0 otherwise.
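The assembly of A can be sketched as follows, using the minimum distance between axis-aligned rectangles as the interobstacle distance (a simplification of the condition in Eq (13); the function names are ours):

```python
import numpy as np

def rect_distance(a, b):
    """Minimum distance between two axis-aligned rectangles, each given
    as (x_min, y_min, x_max, y_max); zero if they touch or overlap."""
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return np.hypot(dx, dy)

def adjacency(rects, d_max):
    """a_ij = 1 iff obstacles i and j keep a distance d <= d_max."""
    n = len(rects)
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if rect_distance(rects[i], rects[j]) <= d_max:
                A[i, j] = A[j, i] = 1
    return A
```

For axis-aligned rectangles the gap in each axis reduces to a pair of corner comparisons, which is why only simple comparisons are needed to build A.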

The connectivity matrix C = (A + I)^r records the reachability of the graph: its entry c_ij is nonzero if and only if there is a path between the obstacles corresponding to nodes O_i and O_j with length l ≤ r [45]. If lines i and j in C are equal (or columns, as A and C are symmetric for undirected graphs), then O_i and O_j will have paths to the same nodes. In this case, they will belong to the same connected component, or cluster, as we will name it from now on.

Remark 1.
To verify whether the obstacles corresponding to nodes O_i and O_j belong to the same cluster, it is enough to compare the i-th and j-th lines (or columns) of C. If they are equal, then the obstacles belong to the same cluster.
As an example, the terrain that contains a set of N ob = 6 obstacles (Fig 6a), can be represented by the graph in Fig 6b, which is described both by the adjacency matrix A and the connectivity matrix C in Eq (14).
The resulting clustering procedure is computationally efficient, since in order to obtain the connected components of O it is enough to assemble A, with a simple set of comparisons between the obstacle positions, and then to obtain C, with only matrix multiplications. In Eq (13), the interobstacle distance is obtained as the Hausdorff distance between the sets defined by the points inside obstacles O_i and O_j.
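A sketch of this matrix-multiplication step, including the row comparison of Remark 1, could read as follows (the function names are ours):

```python
import numpy as np

def connectivity(A):
    """C = (A + I)^(N_ob - 1), thresholded to {0, 1}: c_ij = 1 iff the
    obstacles of nodes i and j are joined by some path in G."""
    n = A.shape[0]
    C = np.linalg.matrix_power(A + np.eye(n, dtype=int), n - 1)
    return (C > 0).astype(int)

def same_cluster(C, i, j):
    """Remark 1: obstacles i and j share a cluster iff rows i and j
    of C are equal."""
    return bool(np.array_equal(C[i], C[j]))
```

Adding the identity before exponentiation keeps every node reachable from itself, so paths of any length up to N_ob − 1 accumulate in a single matrix power.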
Here, as R_ob contains the stacked x- and y-coordinates of the lower left-hand and upper right-hand corners of every non-intersecting obstacle, determining the interobstacle distances Δr_o between every two obstacles requires only a simple set of comparisons and operations on the corner positions of every obstacle pair. It is worth noting that, as long as the environment does not change and is supposed to be static, Δr_o can be computed offline and used throughout the agent movement, since it depends only on the environment topology. It is also worth remarking that the clustering distance d_c is the generalization of d_c^max for multiple clustering regions, as the next section will highlight. Alg 2 checks the co-pertinence of two obstacles to the same cluster, and Alg 3 represents the clustering strategy. Here, Δr_ao is an array in which each element Δr_ao[i] is the distance between the agent and obstacle O_i. The main idea is to obtain the matrix C, composed of N_c lines and N_ob columns, which indicates, for the agent position r and the obstacle set positions R_ob, whether two obstacles are neighbors in a given clustering region. The first for loop identifies the innermost clustering region each obstacle belongs to, in order to set the respective clustering distance d_c[μ] for each μ-th obstacle. The next two nested for loops build the adjacency matrix A; as there are no self-loops in G, we must add A to the identity matrix before raising it to the (N_ob − 1)-th power, so as to propagate the links that pass through a node and capture the maximum number of obstacles clustered together in the environment. Next we obtain the connectivity matrix C and, as we are only interested in identifying the connected components of obstacles, we finally set their connectivity entries to 1.
Alg 3. Obstacle clustering. These tools allow us to build a clustering strategy that directly ties the reduction of the computational effort of the trajectory planning problem in the presence of obstacles to a more refined clustering applied only to closer obstacles. This is the subject of the next subsection.

Dynamic obstacle clustering strategy
The approaches proposed by [30, 38] consider the interobstacle relative positions to find convex regions of obstacle-free space, which is analogous to considering the interobstacle distances Δr_o to perform the clustering procedure before the optimization per se. In the trajectory planning problem, the computational speedup these approaches obtain is a consequence of taking into account, in the avoidance modeling, the clusters themselves instead of the obstacles. As the clusters are potentially fewer than the obstacles, fewer binary avoidance variables and constraints will be used, which produces a problem that can be solved with less computational effort.
This work adds to this rationale the relative agent-obstacle position Δr_ao. The strategy we adopt segments this distance into regions around the agent and adjusts the effective interobstacle clustering distance d_c in each of these regions. For obstacles in a region closer to the agent, a more subtle clustering with a smaller value of d_c is adopted. For more distant obstacles, a coarser clustering may be used, with a greater value of d_c, as these obstacles are not in the immediate future of the agent. In this way, two obstacles can be clustered together while distant from the agent, but possibly separate into distinct clusters as the agent approaches them along its trajectory. Since fewer binary avoidance variables and constraints would be used compared with the single-clustering-distance case, a smaller total computational effort would be necessary, saving computational time in finding the desired trajectory.
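A possible reading of this region-dependent choice of d_c, with hypothetical radii and clustering distances, is the following sketch:

```python
def clustering_distance(dist_agent_obstacle, regions):
    """Return the clustering distance d_c of the innermost clustering
    region containing the obstacle. `regions` is a list of (radius, d_c)
    pairs sorted by increasing radius; beyond the outermost radius the
    coarsest d_c is kept."""
    for radius, d_c in regions:
        if dist_agent_obstacle <= radius:
            return d_c
    return regions[-1][1]

# Hypothetical inner, surroundings and outer regions (R_ic, R_sc, R_oc)
regions = [(5.0, 1.0), (10.0, 2.0), (20.0, 4.0)]
```

As the agent moves, an obstacle's distance crosses a region boundary and its d_c shrinks, which is what splits coarse clusters into finer ones near the agent.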
To ease the exposition, here we consider only polytopic (more specifically, rectangular) obstacles with sides parallel to the axes. Nonetheless, it must be remarked that the technique proposed in this paper can be extended to polygonal obstacles of any form.

Close obstacles clustering.
This section describes the first clustering strategy this paper proposes. A typical scenario for trajectory planning is shown in Fig 7. Each clustering region is represented as a circle around the agent with the respective clustering radii R ic , R sc and R oc . The clustering distances d ic , d sc and d oc are depicted at the bottom of the figure.
After obtaining the connectivity matrix C, which describes the interconnections among close obstacles, the next step is to identify the coordinates of the clusters, as Alg 4 shows. This is done by boxing the convex hull of the connected components of obstacles in the environment, according to the value of d_c chosen in the clustering algorithm. After that, it is just a matter of solving the optimization problem by considering cluster avoidance, with coordinates R_κ, instead of obstacle avoidance, with coordinates R_ob, under the constraints proposed in the previous section. The scheme in Fig 8 represents this strategy for three different agent positions. Initially, the agent estimates a path that goes down the largest cluster, to the east (Fig 8a). However, as the agent starts moving, some obstacles of this cluster enter the surroundings zone (Fig 8b) and the algorithm splits it into two smaller clusters, which frees a corridor for the agent to reach the target (Fig 8c).

Alg 4. Cluster coordinates extraction.
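For axis-aligned rectangular obstacles, boxing the convex hull of each connected component, as Alg 4 does, amounts to taking componentwise minima and maxima of the member coordinates; a sketch (the names are ours, and membership follows the row comparison of Remark 1):

```python
import numpy as np

def cluster_boxes(rects, C):
    """Box each connected component: the smallest axis-aligned rectangle
    containing all member obstacles, with rects[i] given as
    (x_min, y_min, x_max, y_max) and C the connectivity matrix."""
    boxes, seen = [], set()
    for i in range(len(rects)):
        if i in seen:
            continue
        members = [j for j in range(len(rects))
                   if np.array_equal(C[i], C[j])]
        seen.update(members)
        m = np.array([rects[j] for j in members])
        boxes.append((m[:, 0].min(), m[:, 1].min(),
                      m[:, 2].max(), m[:, 3].max()))
    return boxes
```

The returned boxes are the cluster coordinates R_κ that replace R_ob in the avoidance constraints.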
We depict the algorithm that represents this strategy in Alg 5. Basically, to the usual problem of trajectory planning, with instructions shown in black, we add an obstacle clustering phase, shown in blue. Initially, we load the simulation parameters, i.e. obstacle positions, clustering distances, and so on, and obtain the adjacency and connectivity matrices A and C. In possession of C, it is straightforward to obtain the cluster configuration R_κ and perform a traditional trajectory planning to reach the target region P_Q. Fig 9 shows a flowchart of the main variable data types produced throughout the different steps of the Close Obstacles Clustering algorithm. To the regular flow of an obstacle avoidance algorithm, represented by the feedback on state x[k], we prepend the clustering step, represented by the green box, which produces the coordinates of the cluster configuration R_κ from the obstacle positions R_ob. Notice that the complete underlying clustering logic described along this section in Algs 2, 3, 4 and 5 is implicit here.

Complementary strategies for reducing the number of obstacles
In this section we propose additional strategies to reduce the number of obstacles along the trajectory planning problem. Specifically, both the position history and forward movement predictions can help speed up the trajectory planning process.

Bygone obstacles rebuttal strategy
After optimization, a trajectory between the initial position r[k] and the target set is found. If all the predicted positions of the agent are closer to P_Q than a given obstacle whose projection onto the position space is O_b, then such obstacle may be replaced by a single constraint independent of the obstacle avoidance binary variables, which may then be removed from the optimization problem from k + 1 onwards. In such a case, we first determine a straight line perpendicular to the segment p̄f connecting the point p ∈ O_b and the point f ∈ P_Q that produce the minimum distance between O_b and P_Q, and then we translate this straight line to pass through the point p. The half-plane defined by this line that contains the current position is guaranteed, by construction, to contain all predicted positions; it defines the hard constraint that can replace the bygone obstacle in the subsequent trajectory and ensures that the agent does not collide with it.
Therefore, we can replace the obstacle of projection O_b itself (and its binary avoidance variables) by hard constraints that ensure the agent will be confined to a region away from the obstacle. Here, s_{O_b} contains the coefficients of the straight line perpendicular to p̄f and c_{O_b} contains the constant terms that ensure that this perpendicular line passes through the point p that is closest to P_Q.
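A sketch of the construction of these coefficients in our notation (p is the obstacle point closest to the target and f the corresponding target point, both assumed given; the sign convention of the inequality is our choice):

```python
import numpy as np

def bygone_halfplane(p, f):
    """Coefficients (s, c) of the hard constraint s @ r >= c replacing a
    bygone obstacle: the line passes through p, the obstacle point
    closest to the target set, and is perpendicular to the segment pf,
    with f the closest target point; the feasible side faces the target."""
    p = np.asarray(p, dtype=float)
    f = np.asarray(f, dtype=float)
    s = f - p            # normal of the separating line, pointing at f
    c = float(s @ p)     # the line passes through p
    return s, c
```

Since every predicted position is closer to the target than the obstacle, all of them lie on the feasible side of this line, while the obstacle lies on the other.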
The illustrative example in Fig 10 represents the concept inherent to this strategy. Each panel depicts in continuous lines both the current and the succeeding positions of the agent, and the scheme exhibits the situation after all the movement updates at each step.
In Fig 11 there is a combined representation of this approach with the Close Obstacles Clustering strategy at three different agent positions, as the agent approaches the target. For clarity, we omit both the clustering regions and the clustering distances.
It is important to note that this approach only considers an obstacle as bygone if the predicted agent-target distances are smaller than the obstacle-target ones in all future steps. This policy thus works by reducing the search space through the gradual removal of obstacles at the edges of the environment.

Alg 6. Bygone Obstacles Rebuttal. We represent the algorithm that underlies this strategy in Alg 6, where dist(r[k], P_Q) returns the Euclidean distance between r[k] and P_Q. In case of an obstacle rebuttal, the coordinates of its projection are removed from R_ob and both N_ob and o must be decremented, so that the number of obstacles and the subsequent stacked obstacle are correctly evaluated in the next iteration. The inclusion of this algorithm in the planning process is highlighted in blue in Alg 7. Through the assurance that the agent has moved away from the obstacle and will continue to do so along the entire predicted trajectory, it is possible to rebut bygone obstacles while ensuring the existence of a feasible solution to the optimization problem at the next sample times. The flowchart in Fig 12 represents the data produced throughout the main steps of the Bygone Obstacles Rebuttal algorithm. To the previous clustering algorithm of Fig 9 we prepend a bygone obstacle rebuttal step, represented in blue, which removes the bygone obstacles and adds the respective hard constraints to the problem to be solved thenceforth.
This strategy replaces the initial N_f + 1 mixed-integer avoidance constraints of each bygone obstacle with a single hard constraint, and allows the removal of N_s N_f binary variables for each bygone obstacle that is replaced by a simple inequality.

Exterior obstacles contempt strategy
Once the agent obtains a path to the target P_Q, if all the predicted positions of the agent lie on the same side of some obstacle whose projection onto the position space is O_e, then such obstacle may be replaced by the hard constraint corresponding to this very side. The example in Fig 13 illustrates this idea. It is worth emphasizing that, while in Eq (9) f ranges from 1 to N_f, in the case of the exterior obstacle of Eq (16), f assumes a single value, which is that of the constraint that is always satisfied. A simple check of whether the binary avoidance variables keep the same value for a given obstacle side along the complete prediction horizon would represent a sufficient condition for identifying an obstacle as exterior. However, there are cases in which an obstacle can be exterior without having all binary variables fixed to True for a certain side throughout the whole prediction horizon, due to the redundancy resulting from the overlap of valid regions in the presence of a polygonal obstacle.
For example, at k = k_0 + 1 in Fig 13a, if the solver sets the variable of the valid region to the left-hand side of the obstacle instead of the one above it, the obstacle would not be identified as exterior at k = k_0 + 2 in Fig 13b. In such a case, some obstacles could be removed at a certain instant but would not be. Therefore, we do not rely solely on the binaries to verify the obstacle contempt condition; instead, we check whether the constraint condition is respected for each cluster by the predicted positions of the agent at every instant. The choice of which constraint to adopt when more than one valid region overlaps then becomes a design decision. Thus, as the agent circumvents a cluster and proceeds on its path to the target, the obstacles inside it become exterior and a single hard constraint is added, replacing the whole cluster.
In Fig 14 there is a mixed representation of this approach with the Close Obstacles Clustering strategy in three different agent positions. The pseudocode that identifies exterior obstacles is represented in Alg 8. It basically consists of comparing the sides of each cluster, previously identified in the clustering step, with the most extreme predicted trajectory coordinates, either at the left-hand side, below, at the right-hand side or above, where ∨ is the boolean OR operator. Here, ext is an array of boolean variables, in which ext[c] = True if cluster c is exterior. Then, to replace the exterior obstacle it is enough to add the hard constraint that corresponds to the new exterior cluster to the planning problem to be solved thenceforth, with the proper contempt of the obstacles that belong to this cluster from the obstacle set, as Alg 9 shows. Here, the function Get Active Exterior Constraint() returns the coefficients of the exterior constraint that replaces the exterior obstacle. We represent the pseudocode of this strategy in Alg 10, with the contempt of exterior obstacles highlighted in blue, while the flowchart in Fig 15 represents the data produced throughout the main steps of the Exterior obstacles contempt algorithm.
Once a cluster is identified as exterior, this strategy replaces its initial N_f + 1 mixed-integer avoidance constraints with a single hard constraint for the whole cluster, and allows the removal of N_s N_f binary variables for each cluster that is replaced by a simple inequality.
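To make the exterior test concrete, the following sketch checks the predicted positions directly rather than the binary variables, as discussed above; the function name, the box layout and the side labels are illustrative assumptions, not Alg 8 itself:

```python
import numpy as np

def exterior_side(traj, box):
    """Return the side on which ALL predicted positions lie relative to
    the axis-aligned envelope box = (xmin, ymin, xmax, ymax):
    'L' (left), 'B' (below), 'R' (right), 'A' (above), or None.

    Checking positions directly avoids missing exterior obstacles when
    overlapping valid regions leave the binaries ambiguous.
    """
    xmin, ymin, xmax, ymax = box
    x, y = traj[:, 0], traj[:, 1]
    if np.all(x <= xmin):
        return 'L'
    if np.all(y <= ymin):
        return 'B'
    if np.all(x >= xmax):
        return 'R'
    if np.all(y >= ymax):
        return 'A'
    return None  # positions straddle the box or lie inside it

# A predicted trajectory that stays above a unit box
traj = np.array([[0.0, 2.0], [1.0, 2.5], [2.0, 3.0]])
print(exterior_side(traj, (0.0, 0.0, 1.0, 1.0)))  # 'A'
```

When a side is returned for every obstacle of a cluster, the whole cluster can be replaced by the single hard constraint for that side.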

Iterative clustering distance tuning
To decrease the number of binary variables in the optimization problem, the Close obstacles clustering strategy may eliminate spaces initially available for navigation. This, in turn, might render the optimization problem infeasible, even when a feasible trajectory could be determined in the absence of clustering. In this section we describe a scheme that circumvents this problem and allows a feasible trajectory to the final region to be obtained, whenever one exists.
To the previous cluster avoidance algorithm in Alg 10 we append an iterative deepening search over the clustering avoidance subproblems, with a non-increasing clustering distance highlighted in blue. In Alg 11, if we initialize the clustering distance d_c with a value that is too large, the problem may be infeasible. As a result, the if condition of line 7 would evaluate to True, d_c would be reduced according to the expression in Eq (17) and the continue statement of line 9 would restart the loop. Since the clustering distance is held as a global variable, in the next loop iteration the algorithm would make its prediction with the updated value, and execution would proceed until the algorithm found a feasible path to the target set. On the other hand, if the initial value of d_c did not produce an infeasible problem, then a trajectory to the final region would already have been found.
Here, 0 < s_r < 1 is a shrinking rate for d_c. Small values of s_r imply a sharp cutback on d_c and fewer clustering attempts until the strategy finds a viable path to Q, potentially in a configuration with many clusters in the environment. On the other hand, high values of s_r entail a gradual reduction of d_c and allow the attainment of a cluster configuration closer to the minimal number of clusters, possibly at the expense of a greater number of infeasible solution trials.
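The shrink-and-retry loop can be sketched as follows; `solve` is a hypothetical stand-in for one clustering-plus-MILP attempt, and the geometric update mirrors the role of Eq (17) under that assumption:

```python
def plan_with_tuning(solve, d_c, s_r=0.75, d_min=1e-3):
    """Iterative clustering distance tuning (sketch).

    solve(d_c) returns a trajectory, or None when the clustered problem
    is infeasible. On infeasibility, d_c is shrunk by the rate
    0 < s_r < 1, freeing navigable space through smaller, more numerous
    clusters, until a feasible path is found.
    """
    while d_c > d_min:
        trajectory = solve(d_c)
        if trajectory is not None:
            return trajectory, d_c
        d_c *= s_r  # shrink the clustering distance and retry
    raise RuntimeError("no feasible clustering distance found")

# Toy solver: feasible only once the clustering distance drops below 2
toy_solve = lambda d: ["path"] if d < 2.0 else None
print(plan_with_tuning(toy_solve, d_c=4.0, s_r=0.5))  # (['path'], 1.0)
```

With s_r = 0.5 the distance halves on each infeasible trial, illustrating the trade-off discussed above between the number of attempts and how close the final configuration stays to the minimal number of clusters.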
The flowchart in Fig 16 represents the data produced throughout the main steps of the Iterative clustering algorithm (Alg 11, Closed-loop receding horizon maneuvering with the Iterative Clustering Distance Tuning algorithm). To the previous Exterior clustering flowchart of Fig 15 we add the continue statement, represented in blue, which decreases the clustering distance in the avoidance subproblem to be solved thenceforth.

Simulation scenarios
The agent model is that of a particle [14,34,37] moving in a plane with axes r_x and r_y orthogonal to each other, and we define the position vector as r^T = [r_x r_y]. The inputs are the accelerations a_x and a_y, which result in velocities v_x and v_y aligned respectively with the r_x and r_y axes. In state-space, the continuous-time model is ẋ = A_c x + B_c u, with state vector x^T = [r_x v_x r_y v_y] and control vector u^T = [a_x a_y]; matrices A_c and B_c are given by Eqs (18) and (19). Since the MPC controller is implemented in discrete time, the plant model must be discretized to serve as the internal controller model. By Zero-Order Hold discretization, the discrete-time matrices are given by Eqs (20) and (21).
Here we assume the units are m for length and s for time. The following parameters are adopted in the controller settings of the agent. The weight γ was chosen to adjust a compromise between fuel expense and time minimization, so that both contributions are comparable in the cost function.
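The discretization step can be reproduced with a short numerical sketch; the sampling period T below is an assumed value for illustration, not necessarily the paper's setting. Because A_c is nilpotent for the double integrator (A_c² = 0), the Zero-Order Hold matrix exponentials reduce to exact finite sums:

```python
import numpy as np

T = 0.5  # sampling period in s (assumed value for illustration)

# Continuous-time particle model, state x = [r_x v_x r_y v_y]^T, per Eqs (18)-(19)
A_c = np.array([[0., 1., 0., 0.],
                [0., 0., 0., 0.],
                [0., 0., 0., 1.],
                [0., 0., 0., 0.]])
B_c = np.array([[0., 0.],
                [1., 0.],
                [0., 0.],
                [0., 1.]])

# Zero-Order Hold: A_d = e^{A_c T}, B_d = (integral of e^{A_c s} ds on [0,T]) B_c.
# Since A_c @ A_c == 0, the exponential series terminate after the linear term.
A_d = np.eye(4) + A_c * T
B_d = (np.eye(4) * T + A_c * T**2 / 2) @ B_c

print(A_d)  # identity with T coupling each position to its velocity
print(B_d)  # T^2/2 on the position rows, T on the velocity rows
```

The resulting A_d and B_d have the familiar double-integrator structure, consistent with the forms that Eqs (20) and (21) report.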
The initial state and terminal set that we use here are summarized in Table 1 for each clustering strategy. The agent starts at rest and is required to reach the terminal set with low speed. Representative maps are shown in Fig 17a, with the blue dot representing the initial condition of the Unclustered, Clustering, Bygone and Exterior strategies and the red dot the initial position of the Iterative strategy, while Fig 17b shows the random map used. In the Iterative strategy simulations, the agent was brought closer to the central obstacles to increase the chances of an initially infeasible problem. To provide a performance reference for the algorithms proposed in this work, we employ a version of the problem with the obstacle avoidance constraints relaxed as a benchmark for the simulation times. This version, denominated Relaxed from now on, is obtained by changing the right-hand side of Eq (9b) to 0, which admits, as a valid solution state, that none of the avoidance variables be activated at any time step.
Also recall that, as a simplification, we approximate each obstacle by its rectangular envelope with sides parallel to the axes; [34] provides a more general formulation if necessary. Finally, the CPLEX toolbox from IBM ILOG was used for solving the MILP problem in the Matlab environment. An obstacle with sides parallel to the axes and center at [B_x B_y]^T can be randomly generated according to the expression in Eq (22).

Random obstacles generation
For the generation of uniformly distributed pseudorandom numbers, we used the rand function in the Matlab environment.
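Since the full definition of Eq (22) is not reproduced here, the following sketch only illustrates the idea under assumed bounds; all limits, the seeding, and the envelope convention are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded analogue of Matlab's rand

def random_obstacle(x_max=10.0, y_max=10.0, half_side_max=1.0):
    """Draw a uniformly distributed center [B_x, B_y]^T and half-side
    lengths, returning the axis-aligned rectangular envelope as
    (xmin, ymin, xmax, ymax). All bounds are illustrative assumptions."""
    center = rng.uniform([0.0, 0.0], [x_max, y_max])
    half = rng.uniform(0.1, half_side_max, size=2)
    xmin, ymin = center - half
    xmax, ymax = center + half
    return xmin, ymin, xmax, ymax

obstacles = [random_obstacle() for _ in range(50)]  # e.g. a 50-obstacle map
```

Seeding the generator makes the random maps reproducible across simulation batches, which matters when averaging times over repeated runs.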

Unclustered scenario
For comparison purposes, we obtained the average time to calculate the complete trajectory that results from the obstacle avoidance procedure in an unclustered scenario, as [34] proposes. For the Unclustered, Clustering, Bygone, Exterior and Iterative strategies in the maps of Fig 17a and 17b, we obtained each of these times as an average of 30 simulations. The resulting path for the Unclustered strategy is represented in Fig 18a, with a zoom of the curve drawn among the obstacles in Fig 18b. The average simulation times along each iteration of Alg 1 are monotonically decreasing. In Fig 20d, since there are still some obstacles of the cluster to the south of the agent in the inner zone, the algorithm considers d_ic as the active clustering distance and does not merge the clusters to the south and to the southwest of the territory, totaling 13 clusters in step 8.
In Fig 20e, the clusters to the south and to the southwest of the agent are lumped together, and the same also happens to the clusters to the north and to the northwest of the territory in the following steps (Fig 20f and 20g). Once there is no significant change in the environment, the initially planned trajectory is strictly followed, with J*[k + 1] = Ĵ[k + 1|k] = J*[k] − γ‖û[k|k]‖₁ − 1, where Ĵ[k + 1|k] is the predicted solution cost for time k + 1 given k, J*[k + 1] is the optimal cost at time k + 1 and ‖û[k|k]‖₁ is the ℓ₁-norm of the control effort applied at time step k.
In this approach, the obstacle set is used only as input of the clustering procedure, without any update to its components. This means that the computational load remains approximately fixed throughout the maneuver, except for the natural clustering variations due to the change in the relative position between the agent and the obstacles. This is evident in the representation of the number of obstacles N_ob at the right of every plot in Fig 20a-20h.
Fig 21a shows the average time spent on each algorithm iteration with the Clustering strategy. Regarding the computational times along each algorithm iteration, the main consequence of the clustering procedure is an expressive cutback when compared to those of the Unclustered case in Fig 19. By clustering two close obstacles, we remove the need for the additional branches that the branch-and-bound algorithm would create along the path of an agent circumventing these obstacles.
The average clustering time along the whole trajectory of the agent is t_{K,k} ≈ 0.18 s. The average total simulation time, i.e. the average of the sum of the times of all optimizations performed until the target set was reached, is t_c ≈ 3.79 s. The Clustering strategy has an average speedup of 16.26 with regard to the Unclustered strategy.

Bygone obstacles rebuttal strategy
In Fig 22 we have the clustering strategy with the rebuttal of bygone obstacles as the agent moves towards the goal area, i.e. running Alg 7.
In the beginning, the trajectory of Fig 22a is the same as that of Fig 20a, since this strategy only removes obstacles that are already moving away from the agent and will continue to do so thenceforth. This approach gradually replaces these obstacles by simple convex constraints once the agent is predicted to always move away from them, as the drop in N_ob at the right-hand side of each plot in Fig 22 shows. The average time spent on each algorithm iteration is shown in Fig 21b. Along all steps of the trajectory, the average time necessary to identify the bygone obstacles is t_{B,b} ≈ 0.01 s and the average clustering time is t_{K,b} ≈ 0.08 s. With the Bygone obstacles rebuttal strategy, the average total simulation time, i.e. the average of the sum of all the optimization times until the target set was reached, is t_b ≈ 3.14 s, with an average nominal speedup of S_b ≈ 19.64 with regard to the Unclustered strategy.

Exterior obstacles contempt strategy
In Fig 24a, the Exterior obstacles clustering strategy (Alg 10) initially identifies N_ob^e = 13 exterior obstacles. We represent them in a green-yellowish color in the timestep in which the algorithm identifies them and in yellow in the subsequent steps.
One can see that the exterior obstacle contempt strategy complements the effects of the bygone obstacle rebuttal one. The obstacles that belong to exterior clusters are promptly removed after the first step, and while the agent circumvents both the central and northwestern clusters, some obstacles of these clusters become bygone obstacles before they are treated as exterior ones; see the obstacles in the lower half of the central cluster and to its left-hand side. Along all steps of the trajectory, the average time required to identify the exterior obstacles is t_{E,e} ≈ 0.00 s, the average identification time of bygone obstacles is t_{B,e} ≈ 0.00 s and the average clustering time is t_{K,e} ≈ 0.05 s. With the Exterior obstacles clustering strategy, the average total simulation time, i.e. the average of the sum of all the optimization times until the target set was reached, is t_e ≈ 1.61 s, with an average nominal speedup of S_e ≈ 38.33 with respect to the Unclustered strategy.

Iterative clustering distance tuning
The results for the Iterative Clustering Distance Tuning strategy in Alg 11 can be seen in Fig 25. The agent's initial state is set as x_0 = [0.5 0 3.5 0]^T (i.e. the agent is brought closer to the obstacle set) and the first guess for the clustering distance is increased to d_{c0} = [3 6 9]^T, to ensure the agent is initially surrounded by a cluster. The shrinking rate is taken as s_r = 0.75. Initially, the algorithm groups the obstacles into a cluster configuration that covers the entire territory, which is not shown in Fig 25. Since no movement is possible, as a single cluster surrounds the agent and prevents it from choosing a viable action, the algorithm shrinks the clustering parameters. However, a second infeasible problem arises, which demands another reclustering procedure.
The result is the maneuver of Fig 25a, with the agent moving south to leave the avoidance area of the largest cluster. As the agent begins to circumvent it, after Fig 25b, its central-south obstacles enter the surroundings zone. The agent makes a less sharp curve through the newly opened area between the clusters, and the trajectory ends with N_ob^p = 1 and N_ob^e = 49. The average times needed for the first and the second reclustering steps, which include the time CPLEX needed to conclude that the problem was infeasible, are t_{R,i,1} ≈ t_{R,i,2} ≈ 0.01 s. Along all steps of the trajectory, the average time required to identify exterior obstacles is t_{E,i} ≈ 0.00 s, the average identification time of bygone obstacles is t_{B,i} ≈ 0.00 s and the average clustering time is t_{K,i} ≈ 0.05 s.
With the Iterative clustering distance tuning strategy, the average total simulation time, i.e. the average of the sum of all optimization times until the target set was reached, is t_i ≈ 1.03 s, with an average speedup of S_i ≈ 59.53 with regard to the results of the Unclustered strategy.

Iterative clustering distance tuning in the random obstacle scenario
We used the Close, Bygone, Exterior and Iterative Clustering strategies in another environment, with obstacles randomly generated according to Eq (22). The algorithm adjusts the initial clustering distance 4 times, which takes respectively t_{R,r,1} ≈ t_{R,r,2} ≈ t_{R,r,3} ≈ 0.00 s and t_{R,r,4} ≈ 0.01 s in each reclustering operation, including the time CPLEX needed to deduce that the problem was infeasible, until the 50 obstacles of Fig 26a are arranged into 7 clusters. With the Iterative clustering distance tuning strategy, the average total simulation time, i.e. the average of the sum of all simulation times until the agent reaches the target set, is t_r ≈ 2.63 s, with an average speedup of S_r ≈ 13.43 in relation to the results of the Unclustered strategy in the random scenario. A comparison of the trajectories these strategies obtained can be seen in Fig 27 and is discussed in detail in the Discussion section.

Obstacle classification
The obstacle composition for the cases studied in this work can be seen in Fig 23. The main tendency we observe is that the more elaborate the strategy, the faster the substitution of the binary obstacle-avoidance variables. While the Bygone obstacle clustering strategy in Fig 23a replaces all binary avoidance constraints at k = 14, the addition of the Exterior clustering strategy achieves the same at k = 10 and the adoption of the Iterative tuning heuristic accomplishes it at k = 7. These strategies cooperate in the substitution of binary avoidance variables, and the preponderance of any of them over the others is intrinsically related to the clustering distances adopted, the obstacle topology in the environment, and the order in which they are carried out. However, one practical limitation of these algorithms is the need for full knowledge of the obstacle environment topology upon which they will operate.

Evolution of the cost function and its prediction
In Fig 28 there is a representation of the cost function evolution, in continuous lines, and its prediction, in dashed lines, along all the simulations presented here.
Along all the steps of the Close, Bygone, Exterior and Unclustered strategies, the cost function J*[k] corresponds exactly to the prediction Ĵ[k + 1|k] made in the previous step, as the dashed horizontal extensions of the value of Ĵ[k + 1|k] show. However, in the Iterative strategy, both for k = 3 in Fig 28d and for k = 6 in Fig 28f, when an obstacle occupies an inner clustering region, the clustering algorithm frees previously clustered space that becomes available for the agent's movement, which makes the value of the cost function smaller than the prediction previously made.
This fact was observed here only in the Iterative clustering strategy, but it is worth noting that it is a consequence of the distribution of the d_c values in the clustering regions around the agent. With fine parameter tuning, the Close strategy itself would exhibit the same phenomenon.

Trajectory comparison
In Fig 27 we can compare the trajectories that each strategy obtains in the map of Fig 17a. While the Unclustered strategy optimizes a trajectory that overcomes the obstacles that compose the central cluster through an internal path, the beginning of the Clustering, Clustering + Bygone and Clustering + Bygone + Exterior strategies is exactly the same, moving north to circumvent it.
After surpassing the central cluster at time t = k_0 + 8, the trajectories slightly separate until they rejoin at the lower left corner of Q, at t = k_0 + 14. In this case, the Exterior strategy takes a path that prioritizes vertical displacement to the detriment of horizontal displacement, whereas the Bygone and the Clustering strategies take paths closer to a straight line toward the target.
The Iterative clustering strategy in black, on the other hand, chooses the first available path freed in the iterative deepening search among the obstacles. Thereby, the trajectory of the agent overcomes the central cluster underneath.
In Fig 27, the comparison of the paths obtained by the Clustering, Bygone, Exterior and Iterative strategies against the one optimized by the Unclustered strategy alone shows that the more comprehensive the changes in the original planning problem, the more the final trajectory potentially diverges from the optimal path. In this case, the adoption of these strategies increases the value of the cost function by approximately 2.8%, 2.8%, 2.8% and 21.2%, respectively.
In Fig 29 we have a comparison of the trajectories that the Iterative and the Unclustered strategies optimize in the random scenario of Fig 17b. The Iterative strategy obtains the nozzle-shaped trajectory shown in black, which shows that the path produced can substantially differ from the optimal one, as a consequence of the different constraints that make up the problem. Here, there is an increment of approximately 30.7% in the value of the cost function with the adoption of the clustering strategies.
In the next subsection we will see the outcomes of the adjustments made in the original planning problem, especially regarding the optimization times.

Computational time
The strategies proposed in this paper have two main effects. The first is their impact on the resulting trajectories, as discussed in the last subsection. The second is the time each strategy spends to obtain not only the Clustering configuration, but also the Bygone and Exterior constraints, as well as the overall time spent on Infeasible iterations.
The trajectory of the Relaxed strategy is a straight line from the initial agent position to the target set. In this case, the agent behaves as if there were no obstacles to be avoided; naturally, it is not possible to attain a faster approach. Although not drawn in Figs 27 and 29, its solution time can be used as a benchmark that gives a lower bound on the optimization times.
From the data in Table 2, the main inference is that the extra time spent in the techniques we propose is negligible when compared to the total simulation time. The CPLEX optimization time falls progressively as more elaborate clustering strategies are used because the effect of each new component adds up incrementally to the previous ones.
The Effective speedup column measures the typical behavior of each strategy in practical situations. It is of no use to guarantee speedup only in the problem-solving phase, the so-called Nominal speedup here, if we add overhead in other steps, transferring computational load to unmonitored stages preceding the main computation. It is necessary to decrease the total solution time, and from Table 2 we can see that the Nominal speedup propagates to the other solving phases, which renders an Effective reduction in the computing times of approximately 52, 148, 446 and 996 times for the Close, Bygone, Exterior and Iterative strategies, respectively.
All the strategies presented here offer substantial Nominal and Effective speedups with regard to the Unclustered scenario. Besides that, the Exterior and Iterative strategies also offer simulation times in the same order of magnitude as the lower bound offered by the Relaxed strategy, which shows the effectiveness of the proposed techniques.
The same results for the Unclustered and Iterative strategies in the random environment of Fig 17b are shown in Table 3. Finally, the times spent on reclustering steps in the Iterative strategy are represented in Table 4.

Conclusion
In this work we proposed a combination of approaches that guaranteed performance improvement through the reduction of the computational load in large-scale trajectory planning methods with obstacle avoidance. While traditional methods find the globally optimal path via a complete search of the solution space, the technique we proposed pruned areas that belonged to the constrained domain, which eased the exponential burden related to obstacle avoidance. This improvement dealt directly with the computation of the complete path of an agent, i.e. without the definition of any waypoints, which could have resulted in faster computational times. Through a deferred decision-based technique, we tackled the computationally heavier problems only when necessary, reducing the solution search time.
We proposed an algorithm that takes into account the pertinence of each obstacle, based on its temporal relevance to the agent guidance problem. This strategy offered computational speedup based solely on problem modeling, i.e. independent of the computer architecture or the variable encoding chosen, which naturally opened space for further improvements on subsequent research.
With the clustering of obstacles, the number of regions to be avoided can be reduced offline, with a theoretical lower limit of one. The reduction in the number of regions to be avoided entails a lower computational load online. Since the clustering algorithm is cheap and runs offline, this yields overall better computational times. However, the clusters impact performance in terms of the cost function, as the optimal solution without clustering might no longer be feasible. Therefore, one identifies a compromise between optimality and computational burden. The iterative clustering distance tuning makes it possible to automatically find clustering distances that are quasi-minimal for the feasibility of the optimization problem with clustering.
The bygone obstacles rebuttal and the exterior obstacles contempt strategies require a previous solution to the optimization problem, therefore they cannot be run offline before the optimization. On the other hand, these two strategies remove obstacles that do not affect the optimal trajectory, therefore optimality is maintained.
As future research directions we can list: extension to multiple agents; optimization of the cluster configuration; development of an obstacle avoidance strategy with anytime algorithm capacity; use of other techniques, such as complex networks to identify the clusters or metaheuristics to generate convex regions to speed up the MIP solving phase; use of different encodings to improve the solver performance with regard to obstacle avoidance; and the study of the effect of uncertainty in the model through the use of robust optimization and fuzzy programming.