Two-Step Asynchronous Iterative Formation Control for Heterogeneous Vehicles in Highway Scenarios

In this article a novel algorithm for the on-line control of formations of heterogeneous vehicles is proposed for highway traffic scenarios. A two-step iterative strategy relying on both a discrete and a continuous space representation, using Dynamic Programming and reference tracking via Model Predictive Control, is formulated. It allows the creation and arbitrary reshaping of vehicle formations in a unified context, providing an applicability in a wide range of practical situations. Only position measurements and basic communication capabilities for global coordination are required for the automated vehicles. Scalability and flexibility are achieved by the underlying decoupled structure of the algorithm, while an asynchronous functioning is enforced due to trigger signals that are exchanged between the vehicles and the coordinator. Different scenarios are simulated and discussed to evidence the effectiveness of the proposed strategy.

Aerial Vehicles [6], [7] and Automated Underwater Vehicles [8] proved formation control to be fundamental in succeeding at many different tasks which inherently require groups of agents to cooperate (e.g. exploration, surveillance, etc.). To this end, different paradigms can be resorted to such as, for instance, 1D and 2D formation control (with, e.g., leader-following or behavior-based algorithms), and consensus control (see, e.g., [9], [10]).
In the automotive field, anyway, up to now the greatest effort in the multi-agent context has been devoted specifically to platoon control (see, e.g., [11]- [17]). This yields better road exploitation (through an increased traffic density), improved safety and comfort, and in many cases better energy efficiency [18], as, e.g., for heavy-duty vehicles [19], [20].
The development of technologies enabling Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communication introduced the concept of Connected and Automated Vehicles (CAVs), which can effectively communicate to achieve coordinated formation-level objectives. As a natural consequence, it appears evident that extending formation control to the 2D case could greatly extend the range of possible applications and improve further the overall performance. In this framework, formation creation and management systems could be considered as part of a much broader smart transportation infrastructure, where a multi-level architecture enforces efficient and reliable traffic control [21], [22].
Despite the apparent importance of developing effective 2D formation control strategies in highway scenarios, to the best of the authors' knowledge little research effort has been devoted up to now in this particular direction. Specifically, although some techniques currently available for non-holonomic mobile robots could in principle be applied to the automotive case (see, e.g. [23]), the high velocities and the peculiar constraints imposed by the highway context need additional conditions to be addressed properly. Additionally, the need to deal with heterogeneous vehicles poses usually non-trivial issues in the majority of the solutions currently available.
In this paper, a novel algorithm is proposed for on-line creation and dynamic reshaping of heterogeneous formations in highways scenarios, which relies on a two-step iterative procedure. The controlled vehicles are required to be CAVs, or semi-automated vehicles able to exhibit CAV capabilities on demand. A global coordinator must exist, that is able to communicate with every involved vehicle and perform basic coordination operations. This latter can easily be integrated in a This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ multi-level traffic control architecture which computes on-line the required formation size, shape, position and velocity. Please consider that a very basic and preliminary implementation of the proposed algorithm has been published in [24], where its validity has been assessed experimentally.
The core idea of the present work resides in the construction of a virtual reference frame attached to a virtual leader, which enables the construction of a (virtual) grid covering a specified part of the highway and allowing for the subsequent two-step iterative procedure. The two stages involve a discretized and a continuous space framework which allow to flexibly address different problems: a) In the first step, the considered part of the highway is spatially discretized and Dynamic Programming (DP) is employed to find the best path for each vehicle in order to reach and join a specified formation. The involved cost functions can be tuned at runtime to pursue particular requirements or encourage/discourage certain behaviors, for the sake of performance and safety. For instance, one may want to discourage lane changes to reduce lateral maneuvers, or encourage joins of vehicles from lanes which may become unavailable in the near future. Additionally, an arbitrary number of DPs can be specified in order to improve the formation creation performance in terms of its duration, without affecting any other aspect of the strategy. b) The subsequent step involves the creation of smooth trajectories along the previously computed (optimal) paths, which are then followed making use of Model Predictive Control (MPC) and single-track vehicle models. At this level, safety and comfort can be considered, while taking into account physical constraints, which are specific to the individual vehicles. The proposed novel approach features the following main properties: i) The most critical operations (for controlling individual vehicles) are carried out in a safe environment where only a small subset of controlled vehicles has non-zero relative velocity with respect to the rest of the formation. ii) Great structural flexibility is given by both the iterative and mixed discrete-continuous nature, which make it straightforward to include more advanced features and expand the already present ones or tailor them to specific needs. iii) The inclusion of a mechanism to arbitrarily reshape the formation allows one to consider many practical cases (for instance, the restriction of available lanes due to working sites, or the dynamic joining or leaving of particular vehicles), while the use of an MPC formulation enables high-speed maneuvers.

II. PROBLEM STATEMENT
Let us consider a highway H = {1, 2, . . . , N l } as a set of N l lanes of uniform and constant width w l , indexed in ascending order starting at the rightmost lane. A number N of vehicles, which are required to implement autonomous driving features involving both longitudinal and lateral dynamics at high speed, is considered in the set of indices V = {1, 2, . . . , N} that should create the formation. Such vehicles start initially uncontrolled and proceed, scattered on the highway according to usual traffic patterns, with velocities v i (t) > 0, i ∈ V.
Assumption 1: All quantities necessary for the local control loops in the individual vehicles are available. In particular, each vehicle must access its own velocity (both along the local x and y axis) and yaw rate. Additionally, the position of each vehicle in a global reference frame (e.g. provided by GPS) must be made available to the coordinator. Due to the need for communication between the vehicles, a reliable information exchange must be guaranteed by either V2V or V2I architectures, or a mixture of the two.
Remark 1: In practice, sensors and observers are employed to access such measurements, as, e.g., proposed in [25] and [26]. The adoption of such strategies is out of the scope of the present work.
A generic framework is introduced where at some point in time t 1 a signal generated by a high-level controller triggers the creation of a formation in a certain area of the highway, with a specific shape. The formation proceeds with a predefined constant speedv > 0.
The shape of the formation can be arbitrarily defined, provided that an underlying grid-like pattern is respected, where the road is partitioned into cells as shown in Fig. 1. This means that the required formation must be representable by a matrix G = {g hj } of dimension L v × N l (with g hj being the element in the h-th row and j-th column), where L v is the number of cells along the lanes. It must be large enough to cover all the vehicles in V to be considered. Such a matrix can be defined by using zero entries except at positions, where a vehicle must be placed to construct the formation. For instance represent a square formation on 2 lanes (G 1 ) and a lattice formation on 3 lanes (G 2 ), respectively. The columns represent the individual lanes, i.e. the element in the first column and first row corresponds to the foremost cell on the first lane. The matrix G can be changed to dynamically address different objectives.
In order to translate the matrix representation of the formation in a spatial description, the width of the lanes w l is used as cells width. An additional parameter d > 0 specifying the intervehicle distance of the desired vehicle's centers is considered for the longitudinal direction. It is equivalent to the length of the desired cells in the formation. Fig. 1 shows a schematic representation of the concept, where the red dots correspond to the elements of the considered matrix G.
Assumption 2: The length L v d of the considered part of highway covers all of the vehicles to be controlled. The number of available cells in the formation equals the number of vehicles N to be controlled.
Assumption 3: All of the vehicles present in the considered region must actually be under control, with the ability to perform the required communication and control tasks.
Remark 2: Assumption 3 enables one to effectively implement the proposed strategy, but can sometimes be slightly relaxed in practice. For instance, the considered highway H may include only a contiguous subset of physical lanes, so that the ones not considered can be populated by uncontrolled vehicles. Also note that all vehicles are required to implement a normal operating mode -emergency situations will not be considered in the present work. A normal operation mode is defined such that no unexpected behavior of the vehicles (e.g. failures) and no critical situations on the lanes (e.g. due to unaccounted obstacles) occur.
Flexibility and scalability are two crucial characteristics for multi-agent control systems, and therefore are here addressed as specific requirements. The former relates to the possibility for the algorithm to be effective in a broad range of scenarios as well as to the possibility of implementing changes in parts of the algorithm without affecting the others. Scalability relates to the possibility to effectively extending the approach to an arbitrarily large number N of considered vehicles and lanes, which should not require major changes in the control architecture both from an algorithmic and an infrastructural point of view.
To formally state the problem, a global reference frame XY is introduced for the highway such that the lanes are parallel to the X axis and the edge of lane 1 coincides with Y = 0 (see Fig. 1). Additionally, a reference frame xy is considered with the same orientation as XY but moving in the positive X direction with velocityv(t) > 0. In both frames, the y-position of the center of the h-th lane is as depicted in Fig. 1. The grid is formally defined in terms of its cells, that are indexed in the space S = {1, 2, . . . , N l } × {1, 2, . . . , L v }. The spatial coordinates of the cell centers in the frame xy are given by for (h, j) ∈ S. Hence, the problem can be stated as follows: Problem 1: Find a scalable (for N → ∞) and flexible control strategy for the set of vehicles V with positions p i (t) = [p x,i (t) p y,i (t)] T with respect to the frame xy such that they converge to a position-based formation specified by the reference formation matrix G. Specifically, after having defined the desired formation in space as the set with h,j g hj (t) = N , a strategy must be found such that (5) and, at the same time, withv(t) being the required constant speed for the formation.
In addition, the constraints have to be respected without collisions between vehicles, independently of geometrical and dynamical vehicle characteristics (heterogeneous formations).

III. PROPOSED STRATEGY
The proposed formation control algorithm starts at time t 1 , where a higher-level control layer selects the parameters G, d, andv in order to pursue a particular objective on a specified section of the highway. The strategy is composed of two sequential, independent phases: (i) an initialization phase (described in III-A), to be performed once for a set of controlled vehicles, and (ii) an iterative phase composed of two successive steps (presented in III-B) which remains active throughout the whole control time to provide reshaping capabilities.
An event-triggered approach is proposed to provide an asynchronous activation of the different steps (see dotted lines in Fig. 2), while a sharp separation is put in place between global procedures (cheaper, carried out by the coordinator) and the local ones (more expensive, implemented in a distributed way at the vehicles level). Specifically, the general strategy is designed such that vehicles receive position references from the coordinator, and trigger it back when specific events occur. A high-level schematic representation of the proposed procedure is presented in Fig. 3 and explained next.

A. Initialization Phase
The initialization phase is started as soon as the coordinator receives an update signal from the high-level controller, which also provides the formation parameters. This phase is crucial in enabling the subsequent iterative phase, and thus must be performed once for every different selection of vehicles V.   (1). The grey cells must be covered to obtain the desired formation.
Among them, the vehicle with largest (global) X coordinate is designated as the virtual leader and referred to as vehicle 1. Then, a virtual grid, which constitutes one of the key concepts of the proposed strategy, is built in the moving reference frame xy of the formation (see Section II for its definition). The grid is attached to it in such a way that the x position of the leader p x,1 coincides with the center of one cell according to the first row of matrix G (see, e.g., g 11 in Fig. 4). This means that the grid moves with the velocity of the leader, i.e.v(t) = v 1 (t).
All the vehicles are subsequently required to attain and keep the reference velocityv, which is constant in this phase, and send trigger signals to the coordinator when this velocity is reached. Once the triggers are received from all of the vehicles (meaning that they exhibit no relative velocity with respect to the virtual grid), an association map Θ : S → V is defined based on the cell positions in x direction, i.e. p x , as stated in (3) to assign target cells to individual vehicles. This is done by evaluating where Variable p x,i represents the position of vehicle i in x direction.
To ensure that only one vehicle is assigned to one cell, the assignment process starts in the very first cells of the formation, see g 11 , g 12 and g 13 in Fig. 4. All vehicles, which cannot be assigned to a specific cell (because already a different vehicle was assigned to the cell), are associated with the next cell towards the end of the formation, and the assignment process is repeated for that new cell. Hence, all vehicles on a lane are uniquely assigned to an individual cell.
Note that the vehicles communicate their position to the coordinator, which performs the construction of suitable maps Θ and Ω. This process occurs at constant velocityv, so that all vehicles maintain their communicated positions independently of the time taken for computations and communications. At this point the coordinator sends to all involved vehicles (except the leader) a reference position to be tracked, which corresponds to the center of the assigned cell. Again, when convergence to a neighborhood of the (moving) reference position is attained (see 'init_distance_threshold' in Table I), the coordinator is notified by the respective vehicle. At the end of the initialization phase at time t 2 > t 1 , every controlled vehicle is placed, without speed difference with respect to other vehicles, at the center of a cell and the next (iterative) phase is started.

B. Iterative Phase
Each iteration in the iterative phase is composed of two distinct steps, namely the generation of reference positions for the vehicles to compose and keep the formation (computed by the coordinator and described in III-B1) and a reference tracking procedure (reported in III-B2), carried out by the vehicles independently. Specific features of the proposed approach can be controlled by means of the parameters shown in Table I, and described subsequently.

1) First
Step -Reference Generation: The generation of optimal paths through the grid cells is performed for all the involved vehicles through Dynamic Programming (DP) [27].
In particular, the goal cells are chosen among the currently non-occupied cells which are marked as to be covered to build the formation, starting with successive rows of matrix G. This means that the formation is created from the foremost part of the defined shape to avoid "holes," i.e. non-occupied goal cells, in between the vehicles in the formation. For example, the goal cells are selected in the order of g 11 , g 12 , g 21 , g 22 in Fig. 4.
The algorithm follows several steps to determine the vehicles to move into the formation: i) One DP problem is solved for all vehicles in the considered area of interest on the highway for the first goal cell (a detailed formulation of the DP is presented below).
ii) The vehicle with the lowest cost is selected to proceed to this first goal cell. Hence, all cells from the actual position of this vehicle to the goal cell are blocked, i.e. marked as not available. For the case of identical costs for two or more paths, one can select any one of them because they are equivalent in terms of the defined cost function. iii) If parameter 'max_parallel_dps' is larger than one, the next goal cell is chosen and a second round of DPs is solved for all remaining vehicles considering that the blocked cells are not available any more. The next minimum-cost vehicle is selected to move to the second goal cell and the corresponding cells on the way to the goal are marked as not available. iv) This procedure of assigning goal cells, solving the DPs, selecting the next vehicle to move and blocking the needed cells is continued until either the maximal number of DPs is reached or no feasible path to the actual goal cell is available. v) If the selected vehicles have finished their transition towards the formation (details see below), the procedure from (i) to (iv) is continued for the remaining goal cells, where all blocked cells, which are not goal cells of the previous round of DPs, are labeled as free again. This procedure is repeated until the desired formation is finally reached. Remark 3: Please note that the introduced algorithm ensures that no dead-lock situations can occur and the final formation is achieved in the end. The initial cell positions are uniquely assigned using (8) and (9). A crucial point in the algorithm is to define the formation such that the number of cells to be covered equals the number of vehicles in the considered region of the highway, see Problem 1 and Assumption 2. This is the task of the high-level controller shown in Fig. 2 and does not constitute any limitation of the proposed method. The procedure explained in points (i) to (v) ensures that the desired formation is reached in the end. Here, a vital part is the mechanism of blocking nonavailable cells to avoid dead-lock or other conflicting situations. In the worst case, only one vehicle can move to a goal cell in the formation (possibly with large cost). However, due to the assumptions of a sufficiently large number of target cells in the formation, the availability of one free cell is guaranteed. Consequently, at least one (and up to 'max_parallel_dps') DP is feasible. This implies that at least one vehicle moves towards its goal cell. The corresponding (blocked) cells on the way to the goal are marked free for the next round of DPs after the transition of the corresponding vehicle is finished.
Thanks to the flexibility offered by the proposed strategy, in fact, a desired choice of the selection policy can be made (possibly, changing over time). One can realize different metrics as, e.g., the total DP cost or the total expended energy.
The described process of subsequent DPs can be formalized using an increasing integer p ∈ N + associated to every subsequent round of considered DPs. Additionally, every cell in the grid c ∈ S is considered as a discrete state g p (q) ∈ S visited at step q during task p, with q = 1, 2, . . . , P DP and g p (q) a map associating, for every problem p, a cell to the step q. A set of is defined, corresponding to forward, backward, right, left and still, respectively. The actions are referred to the relative frame, i.e. a still action requires that a vehicle keeps the velocity of the reference frame. The left and right actions entail also a shift forward of one cell to better account for the non-holonomicity of the vehicles. In addition to set A, a set of feasible actions A(g p (q)) ⊆ A is defined. It only considers actions which do not lead to a conflict with occupied cells and ensure that the vehicles remain on one of the defined lanes of the highway. A cost function c(u(q)) is specified by the coordinator at any iteration depending on feasible actions u(q) ∈Ā(g p (q)). Note that this way of defining the states and costs leads to the fact that space constraints, i.e. the width of the highway, are automatically considered. Based on the occupation map Ω p considered during the p-th round of DPs, a goal cell g * p is chosen. The problem can then be formally stated as that of finding an optimal actions sequenceū * i,p = (u * i,p (0), u * i,p (1), . . . , u * i,p (P DP − 1)) leading from the location g i,p (0) to the goal state g * p , for every vehicle i ∈ V. Based on the Bellman's optimality principle, this translates in solving iteratively the optimization problems for q = P DP − 1, P DP − 2, . . . , 1, 0, where the cost function J i (q) is defined such that As a consequence of (12) and (13), entity J i (0) is the total optimal cost for vehicle i. A minimum-cost vehicle i * can then be found as that associated with the lowest cost. Hence, an optimal sequence of cellsḡ i * ,p = (g i * ,p (0), . . . , g i * ,p (P DP − 1)) follows for vehicle i * that is induced by the optimal action sequence u * i * ,p . The occupation map is then updated for the next round of DPs, marking all of the cells inḡ i * ,p as occupied. Additional cells are blocked in the case of lane changes as shown in Fig. 5. The light shaded cell is marked as occupied even though it does not belong to the cells composing the optimal path (darker shaded), in order to avoid the possibility of generating another path (corresponding to the dashed red trajectory) that would cross the already composed path (solid green).
The procedure is then repeated, for a number of times specified in 'max_parallel_dps', considering the new occupation map Ω. It is stopped earlier if none of the currently available vehicles can find an optimal path to the goal cell, meaning that a feasible path is not physically available, yet. However, at least one DP is feasible for sure in each round of DPs until the final formation is created.
The reference generation is then performed as a subsequent task, which periodically updates position referencesp i . In particular smooth reference trajectories are generated for all p minimum-cost vehicles i * . Third-order curves are used here to generate positions and yaw references, but any trajectory generation procedure providing sufficiently smooth paths can be effectively adopted.
For all of the other vehicles, which are not associated with a minimum-cost path, the reference generation mechanism comprises the possibility of adopting different strategies depending of the particular situation and the algorithm settings. The default reference is the center of the cell, which is currently associated with the vehicle. Two different flags can be employed to modify the behavior of non-minimum cost vehicles. The 'vehicles_approach' flag enables the possibility for the vehicles to approach the formation, being associated sequentially to free cells (if existent) ahead of the currently occupied one. In that case, a minimum number of cells 'free_cells_ahead' is to be kept free between the final position of the considered vehicle and the last cell composing the formation, i.e. the last row of G, or the next cell occupied by a vehicle. The main advantage of such an approach is that the non-minimum cost vehicles uniformly move towards the formation to reduce the creation time of the desired overall formation.
As an additional mechanism, a subset of lanes 'outer_lanes_go_backwards' can be defined such that non-minimum cost vehicles assigned to cells belonging to these lanes are successively assigned to cells at the back of the currently occupied one. The variable 'free_cells_behind' can be set so that a number of cells is kept free with respect to the closest vehicle in the same lane. A maximum distance, in terms of cells, between the farthest cell that can be assigned with respect to the formation and the formation itself can be specified in 'max_cells_dist_behind'. This particular feature is included so that situations can be considered, where vehicles are required to join the formation from other lanes, as, for instance, in case of construction sites ahead. To this end, a discount factor ('outer_lanes_discount') is also included so that lane changes performed from the lanes specified in 'outer_lanes_go_backwards' are discounted. This encourages the inclusion of the vehicles in the formation.
Note that in order to implement the local MPC control loops, a sequence of references is computed over the prediction horizon P considering a sampling time T s which must match in the coordinator and the single vehicles. Additionally, a reference yaw angleψ i is included, so that at each sampling time a reference sequencē is defined for vehicle i. The reference generation is entirely carried out inside of the coordinator, which provides the reference positions to the respective vehicles. The communication between the coordinator and the vehicles is on demand, but only asynchronously done when needed as, e.g., in the case that a vehicle reaches its desired position in the formation. At this point the reference tracking step (described in III-B2) is performed by the subset of vehicles obtaining a minimumcost path in the DP step. While all of them need to trigger the coordinator once they reach their goal cell, the information about the subset of vehicles in charge of performing trajectory tracking is particularly important. In fact, if the flag 'new_dp_after_completion' is set, the iteration finishes as soon as one vehicle finished trajectory tracking. Please note that all other vehicles are stopped in their process of trajectory tracking only if they reach a safe position, i.e. the neighborhood of a cell center. If 'new_dp_after_completion' is not set, the coordinator waits for all of the minimum-cost vehicles to carry out their associated tasks before starting the successive iteration of reference generation.

2) Second
Step -Reference Tracking: For each of the vehicles required to implement local control loops, a reference sequence (14) is received from the coordinator for the position and the yaw angle. On the basis of this information, an MPC reference tracking problem is formulated as a quadratic program, where the subscript i indicating the vehicles indices is omitted for the sake of simplicity.
The single-track model [28], [29] is employed assuming that no torque is generated by differences among the forces applied on the two wheels of the same axle, and additionally the steering angles are equal for the two front wheels. As a direct consequence, the vehicles can be considered as formally having a single wheel for each axle. The resulting equations for the longitudinal and lateral velocities, v x (t) and v y (t), and the yaw angle velocityψ(t) in the vehicle reference frame are then ⎧ ⎪ ⎨ ⎪ ⎩v where m and J z are the mass and the moment of inertia of the vehicle, respectively. The inputs are the total longitudinal forces F l,f (t) and F l,r (t) exerted by the front and rear wheels and the steering angle δ(t), respectively. The total longitudinal force F x (t) = F x,f (t) + F x,r (t) is then obtained projecting the input forces exerted by the wheels with respect to the vehicle reference frame, induced by the steering angle δ(t), i.e. F x,f (t) = F l,f (t) cos(δ(t)) − F c,f (t) sin(δ(t)) and F x,r (t) = F l,r (t). Here, the forces in y direction are given by F y,f (t) = F x,f (t) sin(δ(t)) + F x,f (t) cos(δ(t)) and F y,r (t) = F c,r (t). The cornering forces F c,f (t) and F c,r (t) are obtained resorting to a linear description of the tireroad interaction forces [28] in terms of the wheels sideslip angles α f (t) and α r (t). In particular, F c, , where C f and C r are the cornering stiffness coefficients for the front and rear wheels. The force F loss (t) in (15) accounts for the aerodynamic and rolling resistance losses, while l f and l r are the distances between the front and rear axle and the vehicle center of mass, respectively.
Assumption 4: The input forces F l,f (t) and F l,r (t) are actually exerted on the corresponding axles.
Remark 4: Regarding Assumption 4, it is understood that in practice, the actually controllable inputs are torques exerted at the wheels level, which interact with the ground and generate forces according to the tire-road interaction characteristic and the wheels dynamics. Thus, it is required that fast-enough lowlevel slip controllers are available (see, e.g. [30]) so that the input (reference) force can safely be considered as the actually exerted force. In addition, low-level stability control systems can be included to ensure that the cornering tire-road characteristic curve remains in the linear region during all the maneuvers.
Since (15) describe quantities in the vehicle reference frame, the kinematic model is also considered to obtain the positions in the global reference frame.
In order to implement the local position-tracking at the vehicles level, Model Predictive Control (MPC) is exploited, see, e.g., [31]. A sampling time T s is introduced along with a prediction horizon P ∈ N (based on which the references in (14) are generated) and a sequence of inputs is considered for every sampling step k (corresponding to the sampling time kT s ). Please note that the formulation adopted in this work relies on a continuous time framework, so that the input signal is a piece-wise constant function, namely With respect to the problem under consideration, the vehicle inputs are taken as u(t) = [F l,f (t) F l,r (t) δ(t)] T see (15) and Assumption 4. Then, a successively linearized version of the model in (15) and (16) is computed on the basis of the initial vehicle state for each step k, so that a system in the form is obtained, where f (·) is the vector function obtained stacking the right-hand sides of (15) and (16) (thus representing the complete vehicle model in form of first order differential equations), while A k ∈ R 6×6 and B k ∈ R 6×3 are obtained through a standard linearization procedure around the reference state x k and the reference input u k = [0, 0, 0] T . These latter are therefore the Jacobians of f (·) with respect of x(t) and u(t), respectively. As a consequence, Δx The MPC cost function is then defined to reflect the control objectives on the prediction horizon, namely the tracking of the references (14), by means of the diagonal matrix Q 0, wherer l =x l − r l , with r l as defined in (14), andx l is the vector of predicted states at step l according to model (16) starting from x(kT s ) and sampled over the considered prediction horizon, i.e. at time instants (k + 1)T s , . . . , (k + P )T s . Thus,r l represents the predicted tracking error at the time step l, obtained sampling the trajectories computed inside the MPC procedure, dependent on the input sequenceū k . The diagonal matrix R 0 is employed to weight the inputs in the cost function. Apart from the linear dynamics constraints given by (19), a set of convex constraints is introduced for the states and inputs. In particular, the relationships are enforced for the inputs, while are used for the velocity, the acceleration and the jerk by means of the backward finite-difference method. The terms v(t − ) and v(t −− ) are the velocities sampled at one and two previous sampling instants, respectively. Note that this formulation, as well as the adoption of the relationships (19), (20), (21) and (22), allows to cast the MPC optimization problem into the form of a quadratic program. Once obtained the optimal sequence, according to the receding horizon approach, only the first input, u * k , is actually applied for t ∈ [kT s , (k + 1)T s ). Note that, although not explicitly included in the presented formulation for the sake of simplicity, constraints addressing collision avoidance can be included, see, e.g., [32], [33].
Remark 5: The linear MPC formulation adopted here focuses on a QP formulation to keep the complexity low, and constituting a basic but performing solution which can be further extended. For instance, in order to introduce robustness, one could resort to additional techniques based on integral sliding modes (ISM). For example, works as [34]- [36] propose a multi-rate architecture in which a fast ISM correction term is aimed at rejecting the matched disturbances with a negligible increase in complexity. Additionally, the linearization strategy proposed in [37] can be straightforwardly adopted with possible improvements in prediction accuracy. While the mentioned strategies preserve the QP nature of the optimization problem, the introduction of nonlinear constraints and cost functions can be exploited to consider, e.g., explicit safety constraints or efficiency-related metrics. The adoption of such modifications is always a trade-off between computational complexity and actual practical improvements.
Remark 6: Please note that a basic MPC formulation is used in this work to convey the main points of the two-step iterative procedure. The actually chosen MPC formulation is not the focus of this work, but can be extended in many ways. For example, one can add terminal constraints and a terminal set in the underlying optimization problem to guarantee stability. This is explained, e.g., in [38] and [39] in detail.
The coordinator is periodically updated by the vehicles as soon as they enter a neighborhood of a cell (radius 'goal_distance_threshold'), so that the considered cell is assigned to it in the map Θ and all of the cells behind it, previously occupied, are freed in Ω. This is crucial, if 'new_dp_after_completion' is set, in maintaining a coherent description of the cells status. In any case, the vehicles trigger the coordinator as soon as they enter a neighborhood of the center of the goal cell.

IV. SIMULATION RESULTS
Two different simulation scenarios are considered in order to exemplify the proposed approach and show the flexibility of the proposed strategy. In the first simulation (described in IV-A), a rectangular formation is created and then reshaped in order to leave the first lane free. This maneuver highlights the capability of the algorithm to effectively create the formation from the initial traffic flow and to straightforwardly adopt reconfiguration mechanisms when required, reacting on the external signals coming from the high-level controller and the algorithm parameters in Table I. The second simulation, discussed in IV-B, is proposed in order to evidence the ability of creating and reshaping formations with arbitrary positions for the vehicles. In particular, the possibility of dynamically reconfigure the formation to allow a vehicle to exit the highway is shown.
The continuous-time simulations and the MPC optimizations are carried out for the two considered simulation scenarios in Matlab using the CasADi framework [40].

A. Formation Creation
A 2 × 4 rectangular formation is created in the first two lanes as the first task addressed in this scenario, starting from N = 8 vehicles randomly placed on the highway. Although similar to the maneuvers proposed in [24], here a higher number of vehicles is considered that proceed at high speed (v = 30 m/s). Additionally, the vehicles are equipped with local MPC controllers and a maximum of three rounds of DPs are considered ('max_parallel_dps'= 3). The approach mechanism presented in Section III-B is exploited for non-minimum cost vehicles ('free_cells_ahead'= 1). The considered highway comprises 3 lanes (i.e. H = {1, 2, 3}) of width w l = 4 m, and the dynamics of the vehicles follow the description in (15) and (16).
The local MPC loops consider a prediction horizon P = 8 with sampling time T s = 0.3 s. The vehicles have random masses m ∈ [800, 1400]kg and random moment of inertias J z ∈ [900, 1300]kgm 2 . In addition, the same input constraints are considered such that the total input force is constrained by  In Fig. 6, some key frames of the simulation are reported. The vehicles start in random positions on the lanes of the highway. Red boxes represent desired goal cells of the formation, whereas dark blue cells are used to link individual vehicles to their currently covered cells. Blue and green lines stand for cell center tracking and path tracking trajectories, respectively. Frame 6(a) depicts a situation in which the initialization phase is in progress, with the grid already been constructed (with cell length d = 25 m) and attached to the leader (red square). The cells to be occupied by the formation are highlighted in red, while the blue segments link the vehicles with the respective associated cell centers. The blue lines highlight the measured distance between the vehicle and the currently associated cell.
In frame 6(b), taken during the iterative phase, it can be highlighted how three different trajectories have been generated (green curves) with respect to three different goal cells in the formation ('max_parallel_dps'= 3). Re-planning is considered for this particular situation ('new_dp_after_completion'= 1) and red circles correspond to the instantaneous position references. In frame 6(c), it is possible to evidence how the formation is effectively created (approx. at t = 95 s). The reshaping occurs afterwards, and entails moving all of the vehicles to the second lane, to free the first one. To do so, it is only required that the high-level controller passes to the coordinator the new reference formation and triggers for an update. The iterative phase is then carried out as usual, only considering the new reference cells. In this particular case one can see how the vehicles move backwards to achieve the new shape (frame 6(d)) due to the fact that, additionally with respect to the results from the DP stages (in this case, 'max_parallel_dps' is lowered to 1), a mechanism for which in the first lane the vehicles are encouraged to decelerate is employed (i.e. 'outer_lanes_go_backward'= 1), with 'free_cells_behind'= 0. In Frame 6(e) the new formation is finally attained.

B. Formation Reshape
In the second maneuver, the formation creation is followed by a situation in which a vehicle not occupying the rightmost lane needs to exit the highway, and therefore has to leave the formation. A reconfiguration maneuver is required to be performed so that space is left for the considered vehicle to safely reach the rightmost lane and, consequently, be able to leave the highway. For the sake of simplicity, the same parameters employed for the previous simulation are kept, so that the only difference lies in the number of vehicles (N = 9), the initial positions and velocities, and the reference shape of the formation, which covers all the 3 lanes of the highway and is 3 cells long.
In the first simulation frame in Fig. 7(a), the initial vehicles positions are depicted, together with the cells to be occupied in order to cover the reference formation shape. Frame 7(b) represents a situation in which, during the iterative phase, the vehicles move towards goal cells, independently and in parallel. The first formation of the vehicles is reported in frame 7(c). Then, the vehicle which needs to leave the highway triggers a leave request to the coordinator, and this latter reshapes the formation to allow the vehicle reaching the rightmost lane through free cells (see the new reference cells in frame 7(d)). Frame 7(e) reports the backup movements of nearby vehicles to achieve such positions, while the final shape is achieved in Frame 7(f).
This highlights the fact that arbitrary formations can be imposed and the algorithms underlying structure takes care of moving the vehicles appropriately and in a completely automatic way. Following the arrow in Fig. 7(f), the respective vehicle can safely move to the first lane and then exit, implementing the same trajectory generation and following technique proposed in this work or any other control strategy guaranteeing collision avoidance in the traversing of the cells.

V. CONCLUSION AND OUTLOOK
In this article a novel algorithm for the creation and dynamic reshaping of formations in highway scenarios is proposed for the case of heterogeneous vehicles. The basic idea relies on a mixed discrete-continuous space representation in a two-step iterative procedure.
Thanks to the inherent flexibility of the algorithm, the inclusion of specific features is possible easily and may be particularly appealing to increase the range of practical applicability of the strategy. One of the main improvements in this direction may be the inclusion of mechanisms to effectively account for the presence of uncontrolled vehicles in the area of interest (e.g. by blocking additional cells for the DPs). Moreover, although the discussion about specific V2V or V2I communication strategies is out of the scope of the present work, the adoption of specific strategies constitutes a possible additional research question.
At the level of the local control loops, additional interaction between the vehicles may be proposed to increase safety or control performance, as, e.g., in [14] in for platoon control.
The discussed simulation results highlight the effectiveness of the approach, with particular reference to the scalability. In fact, the application of the presented strategy to an arbitrary number of vehicles does not require modifications to the algorithm in order to achieve the desired behavior. Additionally, the introduction of mechanisms such as the parallelization of the DP computations reduces the overall time to create formations in many cases, addressing effectively the situation in which a large number of vehicles is present.