Path planning and collision avoidance methods for distributed multi-robot systems in complex dynamic environments

Abstract: Multi-robot systems are experiencing increasing popularity in joint rescue, intelligent transportation, and other fields. However, path planning and navigation obstacle avoidance among multiple robots, as well as dynamic environments, raise significant challenges. We propose a distributed multi-mobile robot navigation and obstacle avoidance method in unknown environments. First, we propose a bidirectional alternating jump point search A* algorithm (BAJPSA*) to obtain the robot's global path in the prior environment and further improve the heuristic function to enhance efficiency. We construct a robot kinematic model based on the dynamic window approach (DWA), present an adaptive navigation strategy, and introduce a new path tracking evaluation function that improves path tracking accuracy and optimality. To strengthen the security of obstacle avoidance, we modify the decision rules and obstacle avoidance rules of the single robot and further improve the decision avoidance capability of multi-robot systems. Moreover, the mainstream prioritization method is used to coordinate the local dynamic path planning of our multi-robot systems to resolve collision conflicts, reducing the difficulty of obstacle avoidance and simplifying the algorithm. Experimental results show that this distributed multi-mobile robot motion planning method can provide better navigation and obstacle avoidance strategies in complex dynamic environments, which provides a technical reference in practical situations.


Introduction
The technology of autonomous mobile robots is developing rapidly and is widely used in entertainment, mining, education, medical services, military reconnaissance, agricultural automation, planetary exploration, and other fields [1]. Path planning and obstacle avoidance are critical to autonomous robot navigation and determine the application prospects of mobile robots. Meanwhile, multi-robot obstacle avoidance in dynamic and complex environments remains an open research problem.
Path planning is generally divided into global and local path planning. Global path planning involves planning an optimal or suboptimal safe path with a priori map information [2]. In contrast, local path planning operates in real time in environments with dynamic obstacles. The robot usually must acquire details about the environment, including the coordinates of static and dynamic obstacles, with the help of a local path planner [3].
Global path planning in a known environment has attracted significant interest. Numerous algorithms have been explored, including the A* algorithm [4-6], ant colony optimization [7,8], particle swarm optimization [9,10], bacterial foraging optimization [11,12], the bat algorithm [13,14], and the whale optimization algorithm [15,16]. With its simple structure, ease of implementation, and fast planning, the A* algorithm has been popular among researchers [17]. Wang et al. [5] used a bidirectional search strategy to improve the A* algorithm, which significantly enhanced the search performance by simultaneously conducting the iterative search in both the forward and reverse directions. Zhang et al. [6] enhanced the node expansion method of the A* algorithm based on the jump point search (JPS) strategy, which significantly reduced the memory overhead and the search scale. We further mention some research on intelligent optimization algorithms. Miao et al. [8] introduced an angle guidance factor and an obstacle exclusion factor in the transfer probability of ant colony optimization, balancing the global search ability and convergence speed of the algorithm. Song et al. [9] combined an adaptive fractional-order, velocity-improved PSO algorithm with the continuous high-degree Bezier curve to plan smooth paths for mobile robots. Hossain et al. [11] searched for the shortest path in a dynamic environment based on the bacterial foraging optimization algorithm. Tang et al. [13] presented the first application of the bat algorithm to a collaborative multi-robot search task in an unknown environment. They used adaptive inertial weights and the Doppler effect to improve the frequency formulation and avoid premature convergence. Yan et al. [15] proposed a whale optimization algorithm based on forward-looking sonar to solve the 3D path planning problem for UUVs, with strong stability and search capability.
The studies mentioned above [4-16] improved the efficiency of path planning. However, they offered few practical solutions to the obstacle avoidance problem of mobile robots in actual dynamic environments. Ensuring the safety of robots with the help of local path planning is an effective solution when the environment is dynamic and full of uncertainties [18], and the main popular local path planning algorithms are the dynamic window approach (DWA) [19] and the artificial potential field method (APF) [20]. DWA is a highly efficient, real-time obstacle avoidance algorithm that transforms the path planning problem into a constrained optimization problem in the velocity space and controls the robot motion by outputting the optimal real-time speed [19]. However, DWA faces problems such as local optima and a low success rate in avoiding dynamic obstacles. Therefore, Chang et al. [21] modified and extended the evaluation function of DWA and used reinforcement learning to adaptively adjust parameters, which further enhanced the planning effect. Lin et al. [22] improved the avoidance rate of DWA against dynamic obstacles by using a fuzzy control scheme to evaluate the danger level of moving obstacles through a collision risk index and relative distance. Furthermore, there has been significant interest in APF. Cheng et al. [23] introduced optimal control theory to reformulate the UAV path planning problem as a constrained optimization problem with APF. Orozco et al. [24] solved path planning problems in dynamic environments by combining membrane computing with a genetic algorithm and APF. In general, it is popular to combine local path planning algorithms with global ones to cope with environments of increasing complexity and uncertainty. The hybrid algorithms allow the robot to combine global path optimality with avoidance of randomly appearing obstacles to a relatively large extent [25]. Ji et al. [26] combined the A* algorithm with an adaptive DWA for global path planning research that solves robot motion in a complex environment. Wang et al. [27] combined the improved PSO algorithm with APF for USVs to solve the dynamic path planning problem in complex offshore regions.
The studies mentioned above [19-27] explored the global or local path problems from different perspectives, but paid less attention to multi-robot obstacle avoidance. The aim of multi-robot path planning is to find a conflict-free path from the start to the target for each robot. The motion of mobile robots is disturbed not only by known factors in the global environment, but also by dynamic obstacles and other autonomous robots, which makes it necessary and practical to design an obstacle avoidance system for multiple autonomous robots. In research on multi-robot strategies, the main approaches are either centralized or distributed. The centralized approach considers the cost or objective function and the constraints of all robots together, thus obtaining the paths of individual robots in a global search. It prioritizes completeness and pays less attention to the individual robot [28]. One of the more popular ways to employ the centralized approach is formation control, where the mission planning information and formation information are integrated into a leader robot, while the other robots act as followers. The leader coordinates the actions of each follower to maintain the formation from the start to the end. Dai et al. [29] proposed a multi-robot formation switching strategy incorporating a priority model, where the leader robot with the highest priority is responsible for planning a safe path and guiding the follower robots, and the follower robots switch into an obstacle avoidance formation by calculating the desired distance and angle. Sang et al. [30] combined A* and APF for the USV formation problem, using the A* algorithm to plan the globally optimal path, dividing it into multiple sub-target points, and using the improved APF for path tracking and formation obstacle avoidance.
In distributed multi-robot motion planning, each robot independently determines its own trajectory towards the goal without colliding with static obstacles or other robots. The navigation problem for a distributed multi-robot system is divided into path planning and movement phases: planning a globally optimal path for each robot and maintaining the safety of multi-robot movement. Das et al. [31] added the consideration of path deviation and energy consumption optimization by embedding the social and cognitive behavior of an improved particle swarm algorithm (IPSO) into the Newtonian gravity of an enhanced gravity search algorithm (IGSA). They proposed IPSO-IGSA to implement path planning for multiple robots in dynamic environments and improve search capability by simultaneously updating particle positions using IPSO velocity and IGSA acceleration. In subsequent research, the authors [31] further investigated the multi-robot collision-free planning problem by mixing IPSO and evolutionary operators (EOPs) [32]. Further, some scholars [33] set different priorities for each robot by the prioritization method, thus reducing the possibility of robot collisions.
This study proposes a distributed multi-robot navigation and obstacle avoidance method in unknown environments and applies it to path planning and navigation. The main contributions are as follows:
1) In global path planning: A jump point search strategy and a bidirectional alternating search strategy are introduced into the conventional A* algorithm, and the heuristic function is designed based on its characteristics. The resulting algorithm, called BAJPSA*, efficiently obtains the robots' globally optimal paths.
2) In local path planning:
➢ First, considering dynamics and environmental constraints, the robot kinematic model is constructed based on DWA. Then, the adaptive navigation strategy and path deviation evaluation function are proposed to improve the path tracking accuracy and optimality of our robots.
➢ Second, according to the potential collision situations between the robot and dynamic obstacles, we improve the obstacle recognition method and design three obstacle avoidance rules, which increase the robot's success rate in avoiding dynamic obstacles that move faster or are larger.
➢ Finally, the distributed multi-robot system is extended from the above single-robot obstacle avoidance algorithm. Focusing on the motion conflicts among multiple robots, we propose a collision recognition strategy and fuse it with a task prioritization strategy to coordinate the robots' motion and obstacle avoidance.
This paper is organized as follows: Section 2 describes our BAJPSA* algorithm. Section 3 describes our multi-robot motion planning algorithm. Section 4 discusses the experiments. Section 5 concludes the paper and discusses future work.

Conventional A* algorithm
The A* algorithm is a classical heuristic search algorithm that, at each step of the search, selects the node with the smallest evaluation value as the next node to expand [34]. The evaluation function is expressed as:
$$f(n) = g(n) + h(n) \tag{1}$$
where $n$ is the current node; the evaluation function $f(n)$ gives the total cost of the current node; $g(n)$ is the actual cost from the starting point to the current node $n$; and $h(n)$ is the heuristic function used to estimate the cost from the current node $n$ to the target point.
The A* algorithm is a search method based on grid traversal [35], such that it must establish a suitable motion environment for mobile robots.A 2D grid map is shown in Figure 1, where Figure 1(a) is the most widely investigated environment for robots with a priori knowledge, containing black obstacle regions and passable white regions.
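As an illustration of Eq (1) on a 2D grid map, the following is a minimal sketch of grid-based A*, not the implementation used in this study. It assumes an 8-connected grid in which `1` marks an obstacle cell and `0` a passable cell, step costs of 1 (straight) and √2 (diagonal), and a Euclidean heuristic; the function name `astar` and the grid encoding are illustrative choices.

```python
import heapq
import math


def astar(grid, start, goal):
    """Grid A* with f(n) = g(n) + h(n); grid[r][c] == 1 is an obstacle (assumed encoding)."""
    def h(node):
        # Euclidean distance to the goal, used as the heuristic h(n).
        return math.hypot(node[0] - goal[0], node[1] - goal[1])

    moves = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    g = {start: 0.0}                 # actual cost g(n) from the start
    parent = {start: None}
    open_heap = [(h(start), start)]  # entries are (f(n), node)
    closed = set()

    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node in closed:
            continue                 # stale heap entry
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g[goal]
        closed.add(node)
        for dr, dc in moves:
            nxt = (node[0] + dr, node[1] + dc)
            if not (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])):
                continue
            if grid[nxt[0]][nxt[1]] == 1 or nxt in closed:
                continue
            cost = g[node] + math.hypot(dr, dc)   # 1 for straight, sqrt(2) for diagonal
            if cost < g.get(nxt, math.inf):
                g[nxt] = cost
                parent[nxt] = node
                heapq.heappush(open_heap, (cost + h(nxt), nxt))
    return None, math.inf


grid = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
path, cost = astar(grid, (0, 0), (2, 2))
```

In this small example the center cell is blocked, so the returned path goes around it. Note that this sketch lets diagonal moves pass obstacle corners; a real robot map may need to forbid that.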

Conventional jump point search (JPS) algorithm
The JPS algorithm was proposed by Daniel Harabor and Alban Grastien [36,37] in 2011. It builds on the A* algorithm and finds paths on a uniform-cost grid graph by defining and computing heuristic values only for those nodes that satisfy the jumping rules. Harabor et al. [36] showed that it can be an order of magnitude or more faster than the A* algorithm, with significantly reduced memory overhead and computational effort. The main steps of JPS include two parts [36]: (1) pruning rules, which filter out and eliminate the nodes in the grid map that do not need to be expanded; (2) jumping rules, which identify the jump nodes in the grid map and evaluate them.

Pruning rules
The set of neighboring nodes around node $x$ is defined as $neighbors(x)$. The cost of moving one grid cell is 1 for a straight move and $\sqrt{2}$ for a diagonal move. The function of the pruning rules is to recursively prune the set of neighbors around each node, that is, to prune all nodes that can be reached optimally by a path that does not visit the current node. Besides, the pruning is performed entirely online, involves no preprocessing, and has no memory overhead.

[Figure 1. 2D grid maps. Legend: known static obstacles; unknown dynamic obstacles.]

Situation 1: $neighbors(x)$ contains no obstacle
1) Straight moves
When node $x$ is not adjacent to an obstacle and the algorithm expands along a straight direction, any node $n \in neighbors(x)$ that satisfies Eq (2) is pruned.

2) Diagonal moves
When node $x$ is not adjacent to an obstacle and the algorithm expands along a diagonal direction, any node $n \in neighbors(x)$ that satisfies Eq (4) is pruned.
Situation 2: $neighbors(x)$ contains an obstacle
When $neighbors(x)$ contains an obstacle, Eq (2) cannot prune all non-natural neighbors because of the obstacle. Therefore, the concept of the forced neighbor is introduced. A node $n \in neighbors(x)$ is forced if: 1) $n$ is not a natural neighbor of node $x$; 2) $n$ satisfies the rule of Eq (3).

Jumping rules
Node $y$ is the jump point from node $x$, heading in direction $\vec{d}$, if $y$ minimizes the value $k$ such that $y = x + k\vec{d}$ and one of the following conditions holds:
(1) Node $y$ is the target node.
(2) Node $y$ has at least one neighbor that is a forced neighbor.
(3) The direction $\vec{d}$ is diagonal, and a jump point satisfying Condition 1 or Condition 2 can be reached from node $y$ by moving horizontally or vertically.
Figure 3 shows an example of a jump point identified by Condition 3. The dashed line indicates the JPS algorithm searching along the diagonal direction after the search in the straight direction fails, and the solid lines indicate the path formed by the current node and the jump points. According to Condition 2 of the jumping rules, the two nodes that each have a forced neighbor are jump points. According to Condition 3, one of these jump points can be reached along the horizontal direction from a node on the diagonal, so that node is also a jump point.
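The jumping rules can be sketched as a small recursive scan. The code below is a minimal illustration under stated assumptions, not the implementation used in this study: it follows the commonly used JPS formulation in which diagonal moves may pass obstacle corners, uses `(x, y)` as column and row indices, and encodes obstacles as `1`. The names `walkable` and `jump` are illustrative.

```python
def walkable(grid, x, y):
    """True if (x, y) is inside the map and not an obstacle (grid[y][x] == 0)."""
    return 0 <= y < len(grid) and 0 <= x < len(grid[0]) and grid[y][x] == 0


def jump(grid, x, y, dx, dy, goal):
    """Return the first jump point found from (x, y) in direction (dx, dy), or None."""
    while True:
        x, y = x + dx, y + dy
        if not walkable(grid, x, y):
            return None                      # hit an obstacle or the map boundary
        if (x, y) == goal:
            return (x, y)                    # Condition 1: target node
        if dx != 0 and dy != 0:
            # Condition 2 on a diagonal move: a blocked cell creates a forced neighbor.
            if (walkable(grid, x - dx, y + dy) and not walkable(grid, x - dx, y)) or \
               (walkable(grid, x + dx, y - dy) and not walkable(grid, x, y - dy)):
                return (x, y)
            # Condition 3: a jump point is reachable horizontally or vertically.
            if jump(grid, x, y, dx, 0, goal) is not None or \
               jump(grid, x, y, 0, dy, goal) is not None:
                return (x, y)
        elif dx != 0:
            # Condition 2 on a horizontal move.
            if (walkable(grid, x + dx, y + 1) and not walkable(grid, x, y + 1)) or \
               (walkable(grid, x + dx, y - 1) and not walkable(grid, x, y - 1)):
                return (x, y)
        else:
            # Condition 2 on a vertical move.
            if (walkable(grid, x + 1, y + dy) and not walkable(grid, x + 1, y)) or \
               (walkable(grid, x - 1, y + dy) and not walkable(grid, x - 1, y)):
                return (x, y)


grid = [[0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]]
goal = (4, 2)
east_jump = jump(grid, 0, 1, 1, 0, goal)      # stops below the obstacle at (2, 0)
diagonal_jump = jump(grid, 0, 0, 1, 1, goal)  # identified through Condition 3
```

In this example, the eastward scan along row 1 stops at `(2, 1)` because the obstacle at `(2, 0)` forces the neighbor `(3, 0)`, and the diagonal scan from `(0, 0)` stops at `(1, 1)` because that jump point lies horizontally ahead of it.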

Bidirectional alternating search strategy
Bidirectional search defines a forward search from the starting point to the target point and a reverse search from the target point to the starting point. However, it has the following two problems: 1) As shown in Figure 4, the bidirectional search is conducted from the starting point and the target point at the same time, which may result in two different paths being searched. 2) Theoretically, the forward and reverse searches advance simultaneously toward the target and starting points and meet at their geometric center [38], in which case the algorithm has the highest search efficiency. However, the obstacle density and the distance between jump points differ between the two directions, so the paths may not meet at the midpoint.
For the above reasons, we use a bidirectional alternating search strategy, in which the forward and reverse searches alternate and the reverse search starts only after the forward search has found a jump point. In this way, the forward and reverse searches meet as close to the midpoint as possible, which benefits the search efficiency of our algorithm. The specific steps of the BAJPS strategy are as follows:
Step 1: Create two OPEN lists and two CLOSE lists: OPEN_1 and CLOSE_1 store the jump points to be checked and the expanded jump points in the forward expansion process, respectively, and OPEN_2 and CLOSE_2 store the jump points to be checked and the expanded jump points in the reverse expansion process. Add the starting node S to OPEN_1, add the target node T to OPEN_2, and set both CLOSE lists to empty.
Step 2: Alternate forward and reverse iterative jump point searches, starting with the forward search.
(1) If there is at least one node in OPEN_1, select the lowest-cost node $n$ based on the evaluation function $f(n)$. If node $n$ is the target point, the search terminates and the path is returned; otherwise, node $n$ is removed from OPEN_1 and added to CLOSE_1.
(2) Starting from node $n$, continue to search for jump point $n_1$ in the direction of its natural successors. Horizontal and vertical search directions are executed preferentially; diagonal directions are considered only when obstacles are encountered or map boundaries are reached.
A. If no jump point is found or the returned node $n_1$ is in CLOSE_1, it is ignored.
B. If the returned node $n_1$ is not in OPEN_1, add it to OPEN_1 and calculate its $g$, $h$, and $f$ values. Set node $n$ as the parent node of node $n_1$.
C. If node $n_1$ is already in OPEN_1, recompute its $g$ value and check whether it is below the previous value. If so, set node $n$ as the parent node of node $n_1$ and recalculate its $f$ value.
Step 3: The reverse search for jump point $n_2$, together with its corresponding parent node, begins as soon as the forward search has completed and obtained its jump point.
Step 4: The forward and reverse jump points are searched alternately, and the search finishes when the same jump point appears in both CLOSE lists.
Step 5: Starting from the common jump point found in Step 4, connect the jump points stored in the forward and reverse directions in sequence to obtain the final route.
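The alternation and the stopping test in Steps 1 to 5 can be sketched as follows. This is a simplified illustration rather than the BAJPS implementation: to keep it short it expands ordinary 4-connected grid neighbors instead of jump points, uses the Manhattan distance as the heuristic, and alternates after every single expansion. All function names are illustrative, and because the search stops at the first common closed node, the returned path is not guaranteed to be the shortest one.

```python
import heapq
import math


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def free_neighbors(grid, node):
    r, c = node
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
            yield (nr, nc)


def make_side(origin, target):
    """One search direction with its own OPEN list, CLOSE set, costs and parents."""
    return {"open": [(manhattan(origin, target), origin)], "closed": set(),
            "g": {origin: 0}, "parent": {origin: None}, "target": target}


def expand_once(grid, side):
    """Close the lowest-f node of one direction and relax its successors."""
    while side["open"]:
        _, node = heapq.heappop(side["open"])
        if node in side["closed"]:
            continue
        side["closed"].add(node)
        for nxt in free_neighbors(grid, node):
            cost = side["g"][node] + 1
            if nxt not in side["closed"] and cost < side["g"].get(nxt, math.inf):
                side["g"][nxt] = cost
                side["parent"][nxt] = node
                heapq.heappush(side["open"], (cost + manhattan(nxt, side["target"]), nxt))
        return node
    return None


def chain(parent, node):
    out = [node]
    while parent[node] is not None:
        node = parent[node]
        out.append(node)
    return out


def bidirectional_alternating(grid, start, goal):
    sides = [make_side(start, goal), make_side(goal, start)]   # Step 1
    turn = 0                                                   # Step 2: forward first
    while True:
        node = expand_once(grid, sides[turn])
        if node is None:
            return None                                        # one direction is exhausted
        if node in sides[1 - turn]["closed"]:                  # Step 4: common node
            forward = chain(sides[0]["parent"], node)[::-1]    # start ... node
            backward = chain(sides[1]["parent"], node)[1:]     # after node ... goal
            return forward + backward                          # Step 5
        turn = 1 - turn                                        # Step 3: switch direction


grid = [[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
route = bidirectional_alternating(grid, (0, 0), (2, 0))
```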

Improving the heuristic function
The traditional A* algorithm uses the Euclidean distance, Manhattan distance, or Chebyshev distance to calculate the heuristic function [39], and the distance functions are as follows:
$$h_E(n) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \tag{5}$$
$$h_M(n) = |x_1 - x_2| + |y_1 - y_2| \tag{6}$$
$$h_C(n) = \max(|x_1 - x_2|, |y_1 - y_2|) \tag{7}$$
where $(x_1, y_1)$ and $(x_2, y_2)$ denote the coordinates of the current and the target nodes, respectively. The A* algorithm with the Manhattan distance performs a four-directional search. In contrast, the Euclidean distance suits expansion over the broader eight-neighborhood, but the obstructive effect of obstacles within the environment makes the heuristic value of the evaluation function smaller than the actual value. Therefore, we combine the Euclidean distance and the Chebyshev distance to design a heuristic function that is more consistent with our BAJPSA* algorithm, as in Eq (8).
The improved heuristic function can appropriately reduce the weight of the Chebyshev distance according to the JPS to improve the solution of the optimal path.The obtained heuristic value is closer to the actual path cost and further reduces the number of nodes to be evaluated, which improves the search efficiency of the algorithm.
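The three distance functions are straightforward to compute. The sketch below also includes a purely hypothetical weighted blend of the Euclidean and Chebyshev distances to show how two heuristics can be combined; the function `blended` and its weight `w` are assumptions made for illustration and are not the heuristic of Eq (8).

```python
import math


def euclidean(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def chebyshev(a, b):
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def blended(a, b, w=0.5):
    """Hypothetical convex combination of two heuristics (assumed form, not Eq (8))."""
    return w * euclidean(a, b) + (1.0 - w) * chebyshev(a, b)


current, target = (0, 0), (3, 4)
values = (euclidean(current, target), manhattan(current, target),
          chebyshev(current, target), blended(current, target))
```

For the nodes `(0, 0)` and `(3, 4)` the Chebyshev distance is the smallest of the three and the Manhattan distance the largest, so any convex blend of the Euclidean and Chebyshev values lies between those two.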
The pseudo-code of the path planning process based on the BAJPSA* algorithm is as follows:
Algorithm 1. Bidirectional Alternating Jump Point Search A* (BAJPSA*)
Calculate the f(n) value of all nodes in Open_1 and Open_2 according to Equations (1) and (8);
Close_1 and Close_2 ← the node with the smallest f(n) value in Open_1 and Open_2, respectively;
Positive node and Reverse node = the node with the smallest f(n) value, respectively;
If Positive node == Reverse node
    The optimal path is obtained and the algorithm ends;

Multi-robot motion planning
Research on mobile robots that rely on multi-sensor fusion to sense the surrounding environment, combined with appropriate local path planning algorithms to avoid moving obstacles or pursue dynamic goals, has been among the most popular topics in robotics in recent years [40]. The APF method is favored by scholars owing to its high flexibility and smooth planned trajectories [41]. However, it suffers from problems such as path oscillation and difficulty in ensuring path optimality when facing actual, more complex dynamic environments. This study proposes an improved dynamic window approach (DWA) in Section 3.4, with strong real-time performance and flexibility, to help our multi-robot system adapt to more complex and changing environments.

Robot kinematic model
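Considering the two-wheel differential robot kinematic model shown in Figure 5, $v(t)$, $\omega(t)$ and $\theta(t)$ are the linear velocity, angular velocity and direction of motion of the robot at the current moment $t$, respectively. Then, the motion state of the robot at the moment $t + 1$ can be expressed as:
$$\begin{cases} x(t+1) = x(t) + v(t)\,\Delta t\,\cos\theta(t) \\ y(t+1) = y(t) + v(t)\,\Delta t\,\sin\theta(t) \\ \theta(t+1) = \theta(t) + \omega(t)\,\Delta t \end{cases} \tag{9}$$
where $(x(t), y(t))$ is the position of the robot at moment $t$ and $\Delta t$ is the sampling time.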
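The discrete motion update of a differential-drive robot can be sketched in a few lines. The code assumes the standard first-order model in which the robot moves along its current heading for one sampling period and then rotates; the names `motion_step` and `simulate` and the `(x, y, theta)` state layout are illustrative.

```python
import math


def motion_step(state, v, w, dt):
    """Advance the pose (x, y, theta) by one sampling period dt."""
    x, y, theta = state
    return (x + v * dt * math.cos(theta),
            y + v * dt * math.sin(theta),
            theta + w * dt)


def simulate(state, v, w, dt, steps):
    """Roll a constant velocity pair (v, w) forward and return the visited poses."""
    trajectory = [state]
    for _ in range(steps):
        state = motion_step(state, v, w, dt)
        trajectory.append(state)
    return trajectory


first = motion_step((0.0, 0.0, 0.0), 1.0, 0.5, 0.1)
arc = simulate((0.0, 0.0, 0.0), 1.0, 0.5, 0.1, 10)
```

DWA uses exactly this kind of forward simulation to predict where each sampled velocity pair would take the robot over a short horizon.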

Speed sampling
DWA describes obstacle avoidance as a constrained optimization problem in the velocity space. The constraints mainly include the nonholonomic constraints of the differential robot, the limitations imposed by environmental obstacles, and the dynamics constraints of the robot structure. As shown in Figure 6, the search space of the robot is bounded by its maximum and minimum speeds, motor performance, and braking distance, which confine the motion speed $(v, \omega)$ to a certain range.
Volume 20, Issue 1, 145-178.
Figure 6. Schematic diagram of robot constrained in velocity space.
According to the velocity limits of the robot, $V_m$ is defined as the set of linear and angular velocities of the robot, reflecting the maximum range of the search space, and the velocity constraint of the robot is:
$$V_m = \{(v, \omega) \mid v \in [v_{min}, v_{max}],\ \omega \in [\omega_{min}, \omega_{max}]\} \tag{10}$$
In practice, the robot is limited by the motor torque. Within one sampling period it cannot reach the maximum or minimum reachable linear velocity $v$ and angular velocity $\omega$, so the search range of the dynamic window is further reduced. Given the linear velocity $v_c$ and angular velocity $\omega_c$, the velocity set $V_d$ in the $\Delta t$ sampling period under the motor constraint is:
$$V_d = \{(v, \omega) \mid v \in [v_c - \dot{v}_b \Delta t,\ v_c + \dot{v}_a \Delta t],\ \omega \in [\omega_c - \dot{\omega}_b \Delta t,\ \omega_c + \dot{\omega}_a \Delta t]\} \tag{11}$$
where $v_c$ and $\omega_c$ are the linear and angular velocities at the current moment, respectively; $\dot{v}_b$ and $\dot{\omega}_b$ are the maximum linear and angular decelerations, respectively; $\dot{v}_a$ and $\dot{\omega}_a$ are the maximum linear and angular accelerations, respectively; and $\Delta t$ is the sampling time.
The trajectory of the whole robot can be subdivided into several straight lines or circular arcs. To keep the robot safe, the current speed must be able to decelerate to zero before hitting the obstacle under the maximum deceleration condition. Then, the braking distance of the robot is constrained as follows:
$$V_a = \{(v, \omega) \mid v \le \sqrt{2 \cdot Dist(v, \omega) \cdot \dot{v}_b},\ \omega \le \sqrt{2 \cdot Dist(v, \omega) \cdot \dot{\omega}_b}\} \tag{12}$$
where $Dist(v, \omega)$ is the distance between the simulated trajectory (based on the velocity group $(v, \omega)$) and the nearest obstacle. The simulated speed must satisfy $0 - v^2 = -2 \cdot Dist(v, \omega) \cdot \dot{v}_b$ and $0 - \omega^2 = -2 \cdot Dist(v, \omega) \cdot \dot{\omega}_b$ in the limiting case to guarantee the robot's safety to a greater extent. In summary, according to the three constraints of the robot search space, the input range for velocity control can be expressed as follows:
$$V_r = V_m \cap V_d \cap V_a \tag{13}$$
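The three velocity constraints can be combined in code as a box (speed limits intersected with the reachable window) plus a per-sample braking check. The sketch below is an illustration under assumptions: the parameter names, the tuple layouts, and the use of `abs(w)` for the angular braking test are choices made here, and the obstacle distance is passed in as a number rather than computed from a map.

```python
import math


def dynamic_window(v_c, w_c, limits, rates, dt):
    """Intersect the speed limits with the velocities reachable within one period dt.

    limits = (v_min, v_max, w_min, w_max)
    rates  = (dv_b, dv_a, dw_b, dw_a): linear deceleration/acceleration and
             angular deceleration/acceleration magnitudes (assumed layout).
    """
    v_min, v_max, w_min, w_max = limits
    dv_b, dv_a, dw_b, dw_a = rates
    return (max(v_min, v_c - dv_b * dt), min(v_max, v_c + dv_a * dt),
            max(w_min, w_c - dw_b * dt), min(w_max, w_c + dw_a * dt))


def admissible(v, w, dist, dv_b, dw_b):
    """Braking constraint: the robot must be able to stop within distance dist."""
    return v <= math.sqrt(2.0 * dist * dv_b) and abs(w) <= math.sqrt(2.0 * dist * dw_b)


def sample_velocities(window, n_v, n_w):
    """Evenly sample (v, w) pairs inside the window (n_v, n_w >= 2)."""
    v_lo, v_hi, w_lo, w_hi = window
    return [(v_lo + i * (v_hi - v_lo) / (n_v - 1), w_lo + j * (w_hi - w_lo) / (n_w - 1))
            for i in range(n_v) for j in range(n_w)]


window = dynamic_window(0.5, 0.0, (0.0, 1.0, -1.0, 1.0), (0.5, 0.5, 1.0, 1.0), 0.1)
samples = sample_velocities(window, 3, 3)
safe = [s for s in samples if admissible(s[0], s[1], 0.5, 0.5, 1.0)]
```

With an obstacle 0.5 m away and a linear deceleration of 0.5 m/s², the braking check admits speeds up to about 0.71 m/s, so every sample in this window passes; at a distance of 0.1 m the limit drops to about 0.32 m/s and a speed of 0.5 m/s is rejected.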

Evaluation function
The robot's linear velocity $v(t)$ and angular velocity $\omega(t)$ are sampled and combined with its kinematic model to simulate several trajectories within $V_r$. The evaluation function selects the trajectory with the highest evaluation value, and the corresponding velocity group $(v, \omega)$ is passed to the robot motion. The traditional evaluation function is as follows:
$$G(v, \omega) = \sigma(\alpha \cdot Head(v, \omega) + \beta \cdot Dist(v, \omega) + \gamma \cdot Vel(v, \omega)) \tag{14}$$
where $Head(v, \omega)$ is the navigation function, which indicates the azimuthal deviation between the end direction of the trajectory and the current target point; $Dist(v, \omega)$ is the obstacle avoidance function, which gives the distance between the trajectory and the nearest obstacle; $Vel(v, \omega)$ is the evaluation function of the robot motion speed at the current moment; $\sigma$ is the normalization process; and $\alpha$, $\beta$ and $\gamma$ are the weighting coefficients of the corresponding evaluation functions.
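Trajectory selection with a weighted, normalized score can be sketched as follows. The code assumes that normalization divides each term by its sum over all candidate trajectories, which is one common choice in DWA implementations and may differ from the normalization used in this study; the function name `select_best` and the candidate layout are illustrative.

```python
def select_best(candidates, alpha, beta, gamma):
    """Pick the candidate with the highest weighted, normalized score.

    candidates: list of (head, dist, vel) raw scores, one tuple per trajectory,
    where larger values are assumed to be better for every term.
    """
    totals = [sum(c[k] for c in candidates) for k in range(3)]

    def norm(value, total):
        # Sum-normalization (assumed); guards against an all-zero column.
        return value / total if total > 0 else 0.0

    scores = [alpha * norm(head, totals[0]) + beta * norm(dist, totals[1])
              + gamma * norm(vel, totals[2])
              for head, dist, vel in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


candidates = [(1.0, 1.0, 0.5), (0.5, 2.0, 1.0), (0.2, 0.5, 0.2)]
balanced_best, _ = select_best(candidates, 1.0, 1.0, 1.0)
heading_best, _ = select_best(candidates, 5.0, 1.0, 1.0)
```

With equal weights the second candidate wins because it is both faster and farther from obstacles; raising the heading weight shifts the choice to the first candidate, which points more directly at the target.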

Improved DWA
The most widely used method [26] takes the turning points of the global path as the key waypoints that guide the robot's motion. However, this is not suitable for cases such as power inspection robots, where the global path tracking accuracy must be strictly guaranteed. Besides, traditional DWA is ineffective at avoiding dynamic obstacles in an unknown environment and is highly susceptible to collision with such obstacles. To increase the obstacle avoidance and global path tracking capability of our multi-robot systems in a dynamic environment, we enhance the evaluation function of conventional DWA and propose a solution to the multiple conflicts that exist between dynamic obstacles and multiple robots.

1) Improvement of the target point tracking method in the Head(v, ω) evaluation function
In related studies [42], the most critical nodes that provide navigation information for robots are the turning points of the global path, and the path tracking accuracy is poor. We investigated the method of Yang et al. [40], which extracted the nodes of cubic B-spline paths as key navigation points, and designed an extraction function to reorganize the path $Route$ generated by BAJPSA* into $NewRoute$, as shown in Figure 7 and Eq (15). The key navigation point extraction function is designed in the following way: First, connect the adjacent path points $Route(i)$ and $Route(i - 1)$ to form a line equation; then solve for the sequence of node coordinates $(x, y)$ on that line that satisfy the desired node distance $n_{ds}$, starting from the point $Route(i - 1)$, and store them into $NewRoute$ in turn until all line equations composed from the path $Route$ have been processed. Here $n$ is the number of critical nodes of $Route$ (including the start point, endpoint and turning points).
To avoid continuous acceleration and deceleration and to improve the accuracy of the robot's trajectory tracking, we set the desired distance $d_1$. When the distance $Dis(\cdot)$ from the endpoint $X(v, \omega)$ of the optimal trajectory evaluated by Eq (14) to the target point $Goal(t)$ at moment $t$ is less than the expected distance $d_2$, the critical navigation point at the moment $t + 1$ is obtained from $NewRoute(\cdot)$ in advance.
$$Goal(t + 1) = NewRoute(round(k \cdot d_1 / n_{ds})), \quad Dis(X, Goal(t)) < d_2 \tag{16}$$
where $Goal(t + 1)$ represents the navigation information point of the robot at moment $t + 1$; that is, when the distance between the endpoint $X(v, \omega)$ of the optimal trajectory and the target point is less than $d_2$, the next navigation point changes from $NewRoute(round(k(1) \cdot d_1 / n_{ds}))$ to $NewRoute(round(k(2) \cdot d_1 / n_{ds}))$; $k$ is a sequence of consecutive positive integers; $round(\cdot)$ is a rounding operation; and the number of path node intervals can be estimated from the desired distance $d_1$ and the desired internode distance $n_{ds}$.
[Figure 7. Key navigation point extraction. Labels: Route(i-1); the first target node; the second target node.]
Taking Figure 7 as an example, we can analyze the action of Eqs (15) and (16) in more detail: the original path contains three path nodes $Route(i - 1, i, i + 1)$; the new path $NewRoute$, with a large amount of node information, is then generated by Eq (15), and the distance between every two nodes is $n_{ds}$. Next, we set the desired distance $d_1$ in Eq (16) to extract the robot's motion navigation points from the new path $NewRoute$. Furthermore, when the condition $Dis(X, Goal(t)) < d_2$ is satisfied, the robot receives the new navigation information in advance.
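The path densification and the early switching of navigation points described above can be sketched as follows. This is a minimal illustration under assumptions: the helper names `densify`, `nav_point` and `advance_k` are invented here, the route is a list of 2D points, and the index is clamped to the last node so that the final target is never overshot. Python's built-in `round` rounds exact halves to the nearest even integer, which may differ from the rounding operation intended in Eq (16).

```python
import math


def densify(route, n_ds):
    """Insert nodes every n_ds along each segment of the route, keeping all turning points."""
    new_route = []
    for (x0, y0), (x1, y1) in zip(route, route[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        count = math.ceil(seg / n_ds - 1e-9)      # nodes placed before the segment end
        for j in range(count):
            t = j * n_ds / seg
            new_route.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    new_route.append(route[-1])
    return new_route


def nav_point(new_route, k, d1, n_ds):
    """Navigation point number k, spaced roughly d1 apart along the dense route."""
    index = min(round(k * d1 / n_ds), len(new_route) - 1)
    return new_route[index]


def advance_k(k, trajectory_end, goal, d2):
    """Switch to the next navigation point once the trajectory end is within d2 of the goal."""
    close = math.hypot(trajectory_end[0] - goal[0], trajectory_end[1] - goal[1]) < d2
    return k + 1 if close else k


route = [(0, 0), (2, 0), (2, 1)]
dense = densify(route, 0.5)
first_goal = nav_point(dense, 1, 1.0, 0.5)
k_next = advance_k(1, (0.9, 0.0), first_goal, 0.2)
second_goal = nav_point(dense, k_next, 1.0, 0.5)
```

Here the three-node route becomes seven nodes spaced 0.5 apart, the first navigation point is `(1.0, 0.0)`, and because the predicted trajectory end is within 0.2 of it, the robot is already handed the next point, `(2.0, 0.0)`.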
2) New Path(v, ω) evaluation function
As shown in Figure 8, the experimental results in the relevant literature indicate that most robots tend to deviate from the global path to some extent near turning points, primarily due to the complexity of the environment. To make our robot consider the degree of global deviation during local path selection, we propose a new $Path(v, \omega)$ function, added to the original evaluation function, to ensure that the robot moves along the global path as much as possible. The improved evaluation function is given in Eq (17), and the pseudo-code of the improved dynamic window approach is shown in Algorithm 2.

Speed sampling of robot;
Simulate motion trajectories;
Use the improved evaluation function (17) to select the optimal trajectory;
Robot follows the optimal trajectory to move;
end
end

Considering the complex dynamic environment and the characteristics of multi-robot work, our robot must solve not only the path fitting problem at turning points, but also the path offset problem during dynamic obstacle avoidance. To this end, we design three $Path(v, \omega)$ functions to correct the global path tracking capability of the robot by considering the distance relationship between the robot and the obstacles as well as the global path.
Situation 1: If the robot is far from the obstacle but deviates from the global path to a lesser extent, global path tracking is guaranteed as a priority.
where $Path = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$; $(x_1, y_1)$ and $(x_2, y_2)$ denote the local path coordinates planned by the robot according to the kinematic model and the global path coordinates obtained by our BAJPSA* algorithm, respectively; $\min(Dist(v, \omega))$ denotes the closest distance from the end of the predicted trajectory to the edge of the obstacle; the corresponding $Path$ term denotes the relative distance from the robot to the global path; $D_1$ denotes the desired obstacle avoidance distance of the robot from the obstacle in the case of a small deviation from the global path; and $D_2$ denotes the maximum error of the robot from the global path.
Situation 2: If the robot is close to the obstacle and deviates from the global path to a small extent, the weight of the $Path(v, \omega)$ function is set to 0, so that the robot's obstacle avoidance is guaranteed preferentially in Eq (17).
Situation 3: If the robot is far from the obstacle and deviates from the global path to a large extent, the robot is prompted to move closer to the global path by increasing the evaluation metric of $Path(v, \omega)$ in Eq (17).
where $D_3$ denotes the desired distance of the robot from the obstacle in the case of a large deviation from the global path.
The traditional DWA does not identify whether an obstacle is dynamic or static when performing trajectory selection, so in the example shown in Figure 9 it mistakenly identifies all red trajectories as collision trajectories and discards them. Based on the evaluation function, the robot will most likely select the path with the best score among the green trajectories. However, dynamic obstacles (pedestrians, vehicles, etc.) are in constant motion, and the robot will keep selecting a green trajectory in the subsequent path selection process. The final result is an awkward situation in which the robot either collides with the obstacle or gets stuck in a local optimum of following the obstacle's motion. A typical collision scenario is shown in Figure 10, where a conventional robot lacks effective recognition of dynamic obstacles and cannot make timely decisions, and the collision occurs at moment 1. To improve the safety and reliability of robot motion, our robot adds an appropriate recognition area for such moving obstacles, which reduces the risk of conflict to some extent.
In the natural environment, dynamic obstacles have different sizes, so we consider a circular recognition area that can accommodate the whole object. Considering the effects of the grid environment and the movement speed of obstacles, the dynamic obstacles in this paper are square and no larger than one grid cell (1 m), and the robot's circular recognition radius $R$ is as follows:
$$R = \lambda \cdot r$$
where $r$ is the radius of the circle that just contains the dynamic obstacle; $\lambda$ is a positive number greater than 1, whose specific value is obtained experimentally. Since the side length of the grid is 1 m, the recognition radius satisfies $R \le 1$ m, to avoid the situation in which the robot cannot effectively search for a path around dynamic obstacles when global obstacles are dense.
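The recognition area can be sketched as a circle around the obstacle that predicted trajectories must not enter. The code below is an illustration under assumptions: it takes the enclosing radius of a square obstacle as half of its diagonal, clamps the scaled radius to the 1 m bound, and uses invented names (`recognition_radius`, `enters_region`, `lam`, `cap`).

```python
import math


def recognition_radius(side, lam, cap=1.0):
    """Scaled recognition radius for a square obstacle of the given side length.

    The enclosing circle of a square has radius side * sqrt(2) / 2; the result is
    clamped to cap (the grid side length) as an assumed way to enforce the bound.
    """
    r = side * math.sqrt(2.0) / 2.0
    return min(lam * r, cap)


def enters_region(trajectory, center, radius):
    """True if any predicted point lies inside or on the recognition circle."""
    return any(math.hypot(p[0] - center[0], p[1] - center[1]) <= radius
               for p in trajectory)


radius = recognition_radius(1.0, 1.2)
blocked = enters_region([(0.0, 0.0), (1.0, 0.0)], (1.5, 0.0), 0.6)
clear = enters_region([(0.0, 0.0), (1.0, 0.0)], (1.5, 0.0), 0.4)
```

For a 1 m square obstacle and a scale factor of 1.2 the recognition radius is about 0.85 m; any larger factor that would exceed 1 m is cut back to the bound.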

2) Research on dynamic obstacle avoidance strategy
Referring to the research of Liang et al. [43] on obstacle avoidance scenarios for unmanned boats on the sea surface, we similarly considered multiple types of motion conflicts between the terrestrial robot and dynamic obstacles, and developed the conflict types and obstacle avoidance rules shown in Figure 11 and Table 1, respectively. After expanding the robot's recognition area of dynamic obstacles, the frontal and rear-end collision problems can be better solved. However, the lateral collision problem (including left collision and right collision) still presents the dilemma shown in Figure 9, for which the following motion constraints are imposed on the robot:
Step 1: When the quantization criteria of the robot and obstacle motion directions satisfy the lateral collision scenario, the robot and the dynamic obstacle are judged to be in potential motion conflict if the shortest distance   from the end of the robot's predicted trajectory group  to the obstacle identification region is less than the desired obstacle avoidance distance   . If the condition is satisfied, proceed to Step 2. If not, the robot performs obstacle avoidance according to our improved DWA.
Step 2: The robot simulates the trajectory group in   time period, discarding those speed groups (, ) and trajectories  that touch the static obstacle and dynamic obstacle recognition areas.
Step 3: Safe driving distance judgment. Evaluate the optimal trajectory () according to the evaluation function, and calculate the distance  from the end position of the optimal trajectory () to the dynamic obstacle recognition area. If  <   still holds at this time, the conflict cannot be lifted, and the robot cannot avoid the obstacle successfully. Let the velocity group (, ) corresponding to the optimal trajectory group  be 0. The robot will then stop quickly under the braking constraint.
Step 4: If the distance  from the end position of the optimal trajectory () to the dynamic obstacle recognition region is greater than the desired obstacle avoidance distance   , the conflict is lifted and the robot resumes motion according to our DWA.
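The four steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the `Trajectory` class, the function names, the representation of static obstacles as a set of occupied grid cells, and the convention that a higher score is better are all hypothetical. The direction-based lateral-scenario test of Table 1 is assumed to have been passed already.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Point = Tuple[float, float]


@dataclass
class Trajectory:
    v: float             # linear velocity of the sampled velocity pair
    w: float             # angular velocity of the sampled velocity pair
    points: List[Point]  # predicted positions over the forecast period
    score: float         # evaluation value (assumed: higher is better)


def dist_to_region(p: Point, center: Point, radius: float) -> float:
    """Distance from a point to the circular recognition area (0 if inside)."""
    return max(0.0, math.hypot(p[0] - center[0], p[1] - center[1]) - radius)


def touches(traj: Trajectory, center: Point, radius: float,
            static_cells: Set[Tuple[int, int]]) -> bool:
    """True if any predicted point enters the recognition area or a static cell."""
    for p in traj.points:
        if dist_to_region(p, center, radius) == 0.0:
            return True
        if (math.floor(p[0]), math.floor(p[1])) in static_cells:
            return True
    return False


def resolve_lateral_conflict(trajs: List[Trajectory], center: Point,
                             radius: float, d_safe: float,
                             static_cells: Set[Tuple[int, int]]
                             ) -> Tuple[Optional[Tuple[float, float]], str]:
    # Step 1: potential conflict if some trajectory end is closer than d_safe.
    d_min = min(dist_to_region(t.points[-1], center, radius) for t in trajs)
    if d_min >= d_safe:
        return None, "no_conflict"  # caller falls back to the regular DWA
    # Step 2: discard trajectories touching static obstacles or the region.
    feasible = [t for t in trajs
                if not touches(t, center, radius, static_cells)]
    if not feasible:
        return (0.0, 0.0), "stop"
    # Step 3: if even the best trajectory ends too close, brake to a stop.
    best = max(feasible, key=lambda t: t.score)
    if dist_to_region(best.points[-1], center, radius) < d_safe:
        return (0.0, 0.0), "stop"
    # Step 4: conflict lifted (equality is treated as lifted, an assumption).
    return (best.v, best.w), "resume"
```

Returning a status string alongside the command keeps the caller free to decide how braking is applied; the braking constraint itself is not modeled here.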

Multi-robot prioritized obstacle avoidance strategy
The path planning problem of multiple mobile robots extends the path planning problem of a single mobile robot. We plan a globally optimal path for each mobile robot in the environment with the BAJPSA* algorithm. The DWA obstacle avoidance strategy mainly applies to static environments or local small-scale dynamic environments (low-speed motion with disturbing obstacles, etc.) [44]. When applied to multi-robot systems, it does not efficiently resolve motion conflicts between robots [45]. Therefore, we studied the robot's obstacle avoidance strategy for dynamic obstacles through extensive experiments and proposed the dynamic obstacle avoidance rules of Section 3.4.2. In addition, the priority method [46], one of the current mainstream techniques for coordinated collision avoidance, can resolve local conflicts between robots to achieve collision avoidance coordination. To yield better results in global multi-robot motion planning, we combine the BAJPSA* algorithm, the improved DWA, and the dynamic obstacle avoidance strategy with the multi-mobile robot priority strategy. The algorithm's complexity is reduced by decomposing the multi-robot path planning problem into sequentially ordered dynamic path planning problems for single mobile robots.
When there are multiple moving robots, the limited environment space is not sufficient for all of them to avoid obstacles. A local optimum can arise if the robots are simply treated as dynamic obstacles under the dynamic obstacle avoidance strategy presented above. Therefore, we introduce a prioritized strategy and a deceleration mechanism. Different robots have different priorities. When a robot with lower priority encounters one with higher priority and a motion conflict arises, the lower-priority robot decelerates and stops in advance. The robot with higher priority treats the stopped robot as a static obstacle and avoids it. As shown in Figure 12, assume that the priority of AGV1, AGV2 and AGV3 decreases in that order. When there is a collision conflict among all three robots, the lower-priority AGV2 and AGV3 decelerate to zero. After AGV1 leaves the conflict area, there is still a collision conflict between AGV3 and AGV2. AGV3 stops and waits, resuming motion when AGV2 departs and the collision conflict is lifted. The constraints introduced by the robot's kinematic model and environmental obstacles are considered in our study, so the challenge of the multi-robot priority obstacle avoidance strategy lies in identifying the conflict and coordinating the moves. Taking the higher-priority AGV1 and the lower-priority AGV2 as an example, we investigated the multiple conflict types in Table 2 and obtained the following collision conflict judgments and solutions:
Step 1: If the actual distance  12 of the two robots is less than the desired obstacle avoidance distance   , there is a potential collision risk.
Step 2: Establish a local coordinate axis with the lower priority robot AGV2 as the center, calculate the angle  between the line of the two robots and the positive direction of the x-axis, and then calculate the relationship between the magnitude of α and the directional angle  of AGV1.
Step 3: Determine whether | − | < 90°; if this condition is satisfied, the risk of collision is extremely high.To ensure the safety of multi-robot movement, AGV2 with lower priority decelerates within a short time, according to Eq (22).The AGV1 with higher priority treats AGV2 as an unknown static obstacle and performs obstacle avoidance motion through DWA.When the distance relationship between the two robots is  12 >   and | − | ≥ 90°, AGV2 resumes motion.
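The distance and angle test in Steps 1 to 3 can be sketched as below. This is a hedged illustration: the function names, the angle wrap-around handling, and the choice to measure the connecting line from the higher-priority robot toward the lower-priority one (so that a small angle difference means AGV1 is heading toward AGV2) are assumptions, since the text does not state the direction convention. The deceleration profile of Eq (22) is not modeled.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def angle_diff_deg(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def conflict_geometry(high_xy: Point, high_heading_deg: float,
                      low_xy: Point) -> Tuple[float, float]:
    """Return (distance between robots, angle difference in degrees)."""
    dx = low_xy[0] - high_xy[0]
    dy = low_xy[1] - high_xy[1]
    dist = math.hypot(dx, dy)
    # Assumed convention: line directed from the high-priority robot
    # toward the low-priority robot.
    line_angle = math.degrees(math.atan2(dy, dx))
    return dist, angle_diff_deg(line_angle, high_heading_deg)


def low_priority_should_stop(stopped: bool, dist: float, diff_deg: float,
                             d_safe: float) -> bool:
    """Update the stop flag of the lower-priority robot."""
    if dist < d_safe and diff_deg < 90.0:
        return True       # Steps 1 and 3: high collision risk, decelerate
    if dist > d_safe and diff_deg >= 90.0:
        return False      # conflict released, resume motion
    return stopped        # otherwise keep the previous state
```

Keeping the previous state when only one of the two release conditions holds follows the wording that the lower-priority robot resumes motion once both the distance and angle conditions are met.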
Most of our experiments show that the proposed prioritized obstacle avoidance strategy applies to most situations. This is mainly because robots are judged to be in conflict only when the above distance and angle conditions are both satisfied, while robots outside the conflict range perform the corresponding obstacle avoidance strategies based on our DWA.
[Flowchart text: Start; put the two end nodes into Open_1 and Open_2, respectively; select the node with the minimum value of f(n) in each of Open_1 and Open_2, remove the two nodes from Open_1 and Open_2, and put them into Close_1 and Close_2, respectively; determine whether the jump point in the positive direction reaches the target point.]
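The alternating use of two open lists and two closed lists described in the flowchart text can be illustrated with the simplified Python sketch below. It is only a sketch under stated assumptions: it uses a 4-connected grid with a Manhattan heuristic, omits jump point pruning and the improved heuristic of BAJPSA*, and stops at the first node closed by both searches, so the returned path is valid but not guaranteed to be the shortest. All names are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def neighbors(cell: Cell, grid: List[List[int]]):
    """Yield free 4-connected neighbors (0 = free, 1 = obstacle)."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
            yield (nr, nc)


def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def join_paths(parent: Tuple[Dict, Dict], meet: Cell) -> List[Cell]:
    """Concatenate start -> meet (forward tree) and meet -> goal (reverse tree)."""
    path = []
    node = meet
    while node is not None:
        path.append(node)
        node = parent[0][node]
    path.reverse()
    node = parent[1][meet]
    while node is not None:
        path.append(node)
        node = parent[1][node]
    return path


def bidirectional_alternating_astar(grid: List[List[int]], start: Cell,
                                    goal: Cell) -> Optional[List[Cell]]:
    if start == goal:
        return [start]
    targets = (goal, start)                       # each side aims at the other end
    open_lists = ([(manhattan(start, goal), 0, start)],
                  [(manhattan(goal, start), 0, goal)])
    g = ({start: 0}, {goal: 0})
    parent = ({start: None}, {goal: None})
    closed = (set(), set())
    side = 0                                      # 0 = forward, 1 = reverse
    while open_lists[0] and open_lists[1]:
        _, cost, node = heapq.heappop(open_lists[side])
        if node in closed[side]:
            continue                              # stale heap entry
        closed[side].add(node)
        if node in closed[1 - side]:
            return join_paths(parent, node)       # the two searches met
        for nxt in neighbors(node, grid):
            new_cost = cost + 1
            if new_cost < g[side].get(nxt, float("inf")):
                g[side][nxt] = new_cost
                parent[side][nxt] = node
                f = new_cost + manhattan(nxt, targets[side])
                heapq.heappush(open_lists[side], (f, new_cost, nxt))
        side = 1 - side                           # alternate search direction
    return None                                   # one side exhausted: no path
```

Alternating after every expansion keeps the two search frontiers roughly balanced, which is the intuition behind the reduced number of expanded nodes reported for the bidirectional search.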

Fusion algorithm obstacle avoidance process
We propose a multi-mobile-robot obstacle avoidance strategy that guarantees global optimality and safety. It fuses the BAJPSA* algorithm for planning global paths with the DWA local obstacle avoidance algorithm, which incorporates a prioritized obstacle avoidance strategy. The flow of the fusion algorithm is shown in Figure 13.
The relevant parameters of this paper are as follows: the sampling period ∆ is 0.1 s; the expected node distance   is 0.018 ; the robot's expected tracking distance   is 1.8 ; the predicted trajectory to the target point expected distance   is 1  ; the expected distance parameters   and   between the robot and the obstacle are 0.4  and 0.7  respectively; the maximum error   of the robot from the global path is 1  ;the expected obstacle avoidance distance   of the robot to dynamic obstacles is 1.5  ; the expected obstacle avoidance distance   between the multiple robots is 2 .The robot model parameters are: maximum linear velocity  / ; maximum angular velocity  / ; linear acceleration . /  ; angular acceleration  /  ; linear velocity resolution ./ ; angular velocity resolution  /   and the evaluation function coefficients are:  = .,  = .,  = .,  = ; the forecast time period   is 3.0 .

Global path planning experiments with improved BAJPSA*
To verify the effectiveness of our BAJPSA*, two sets of maps with scales of 30×30 and 100×100 were selected for simulation and compared with the A* and JPS algorithms. The experimental results and simulation data are shown in Figures 14 and 15 and Table 3. The green grid depicts the nodes searched by the algorithm, the blue grid shows the forced neighbor nodes of JPS and BAJPSA*, and the red path and the orange path in Figures 14 and 15(c) are the paths of our BAJPSA* forward and reverse searches, respectively.
The green node area in Figures 14 and 15 and the number of OPEN and CLOSE list nodes in Table 3 show that our BAJPSA* dramatically reduces the number of extended nodes compared to the A* algorithm, and the improvement is particularly noticeable in the large-scale 100×100 map. There is a 92.5% reduction in the number of extended nodes and a 91.3% reduction in the running time of our BAJPSA* algorithm compared to A*, with a slight increase of 0.585 m in path length. Our BAJPSA* also has an advantage over the JPS algorithm, mainly in search time, which benefits from the bidirectional alternating search mechanism and the improved heuristic function. In the 30×30 map, BAJPSA* reduces the search time by 50% and the number of expansion nodes by 60%. Similarly, in the 100×100 large-scale map, the BAJPSA* search time is reduced by 79%, and the number of expansion nodes is reduced by 30%.
To determine the effect of our improved algorithm, the path marked in black, with a global path length of 17.3616 m, is obtained with our BAJPSA* algorithm, as shown in Figure 16. A robot with the two-wheel differential motion model is built for trajectory tracking based on our improved DWA. The test results in Table 4 and the robot angle variation and path error plots in Figure 17 show the experimental data of three robots with different parameters in detail. The specific analysis is as follows.
In Path_1, we do not consider the robot's new ℎ(, ) evaluation function, such that the weight  = 0. The final length of the robot's traveled path is 17.7138 m, the robot deviates from the global path by 0.2255 m on average at each movement step, and the robot takes 50.8668 s to move from the starting point (1.5, 2.5) to the endpoint (16.5, 10.5).
In Path_2, we set  = 0.2, and the final path length of the robot is 17.3676 m, while the average deviation of the robot from the global path is 0.0577 m. Compared with the conventional DWA of Path_1, we sacrifice some algorithm efficiency because we simultaneously consider the four indicators of Eqs (17)-(20). Moreover, the experimental results show that the actual robot motion of Path_2 basically follows the global path, except near the obstacles.
In Path_3, we further increase the optimization effect of the ℎ(, ) indicator, and Path_3 is obtained by setting  = 0.4, which is improved slightly in terms of the path length and path offset indicators compared with Path_2.
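The two tracking indicators reported for Path_1 to Path_3, traveled path length and average deviation from the global path, can be computed as in the following Python sketch. The exact deviation definition used in the experiments is not given in this text, so the sketch assumes it is the mean, over all recorded robot positions, of the distance to the nearest segment of the global path. Function names and sample data are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from point p to the line segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Projection parameter clamped to the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def mean_path_deviation(trajectory: List[Point],
                        global_path: List[Point]) -> float:
    """Average distance of recorded positions to the global path polyline."""
    segments = list(zip(global_path, global_path[1:]))
    total = sum(min(point_segment_distance(p, a, b) for a, b in segments)
                for p in trajectory)
    return total / len(trajectory)


def path_length(points: List[Point]) -> float:
    """Sum of straight-line distances between consecutive positions."""
    return sum(math.hypot(q[0] - p[0], q[1] - p[1])
               for p, q in zip(points, points[1:]))
```

With positions sampled once per control period, `mean_path_deviation` corresponds to an average error per movement step, which is how the deviation values are reported here.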

Simulation effect test of dynamic obstacle avoidance capability
To test the effectiveness of our proposed dynamic obstacle avoidance strategy, the following three sets of experiments were conducted for three collision types: frontal, rear-end, and lateral, as shown in Figures 18-21, respectively.The robot model parameters are the same as those in Section 4.2.1.
First, we conducted a frontal collision experiment, where the movement speed of the dynamic obstacles was set to 0.2 and 0.4 m/s in Figures 18(a) and (b), respectively. (In Section 3.4.2, we analyzed the actual speed of the robot as   =   ∩   ∩   ; considering the limited search space of the robot, the speed of obstacles in this experiment was less than 0.5 times the maximum speed of the robot.) The results show that the robot, with its large recognition area for moving obstacles, can avoid obstacles in frontal motion conflict while keeping a certain safety distance. Figure 18(b) shows the environment with high-speed moving obstacles: the path length of the robot after completing obstacle avoidance is 9.8702 m, and it takes 15.62 s to move from the starting point (3.5, 1.5) to the endpoint (3.5, 10.5). The faster the obstacle moves, the higher the requirement for the robot's obstacle avoidance performance. Thus, compared with the low-speed moving obstacle in Figure 18(a), the path length of the robot's passage increases by 1.9845 m, and the movement duration increases by 1.38 s.
Next, we performed a rear-end collision experiment with an obstacle moving at 0.1 m/s. The test results showed that our robot detected the slow-moving obstacle in front and successfully avoided it. The final travel length of the robot was 13.4653 m, and it took 19.82 s.
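The velocity constraint mentioned in the frontal collision experiment, where the robot's speed is the intersection of three sets, can be sketched as follows. The sketch assumes these are the three sets of the standard DWA formulation (velocities within the robot's limits, velocities from which the robot can stop before the nearest obstacle, and velocities reachable within one control period); the symbols are not recoverable from this text, and the function name and all numeric values are hypothetical. For brevity, only the linear velocity is limited by the stopping condition.

```python
import math
from typing import Optional, Tuple


def dynamic_window(v: float, w: float, v_max: float, w_max: float,
                   acc_v: float, acc_w: float, dt: float,
                   obstacle_dist: float
                   ) -> Optional[Tuple[float, float, float, float]]:
    """Return (v_lo, v_hi, w_lo, w_hi), or None if the window is empty."""
    # Velocities allowed by the robot's limits.
    vs = (0.0, v_max, -w_max, w_max)
    # Velocities reachable within one control period from (v, w).
    vd = (v - acc_v * dt, v + acc_v * dt, w - acc_w * dt, w + acc_w * dt)
    # Largest linear velocity from which the robot can still brake in time.
    v_stop = math.sqrt(2.0 * obstacle_dist * acc_v)
    v_lo = max(vs[0], vd[0])
    v_hi = min(vs[1], vd[1], v_stop)
    w_lo = max(vs[2], vd[2])
    w_hi = min(vs[3], vd[3])
    if v_lo > v_hi or w_lo > w_hi:
        return None
    return v_lo, v_hi, w_lo, w_hi
```

An empty window (`None`) means that no reachable velocity allows the robot to stop before the obstacle, which is the kind of situation the stop rule of the dynamic obstacle avoidance strategy is meant to handle.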
Finally, we set up obstacles with different motion directions and speeds for three side collision tests and added the traditional DWA for comparison to show the advantages of our obstacle avoidance strategy. In the first group, the speed of obstacle movement is 0.25 m/s, and in the second and third groups it is 0.5 m/s. The experimental results are shown in Figures 19-21. In the first collision experiment shown in Figure 19, the obstacle moves relatively slowly, so both the traditional DWA and our DWA successfully avoid it. The travel path lengths of the conventional robot and our robot are 7.4440 m and 7.6449 m, respectively. Keeping the obstacle's direction of movement, we increased its speed to 0.5 m/s and carried out the second collision experiment shown in Figure 20. The travel path lengths of the conventional robot and our robot are 8.1640 m and 7.1840 m, respectively. The experimental results in Figure 20 show that the robot collides with the obstacle under the traditional DWA obstacle avoidance strategy. Our robot can identify conflicting obstacles and slow down to avoid obstacles that could not otherwise be avoided successfully. Compared with the first experiment, although our robot has an additional waiting time of 1.68 s, the length of the traveled path is reduced by 0.4609 m, and safety is guaranteed. For the third test, we changed the direction of movement of the obstacle. The experimental results of Figure 21 show that the traditional DWA suffers from the local optimum problem of following the obstacle's movement, as shown in Figure 21(a), and the conflict is resolved only when the obstacle stops. The final path traveled by the conventional robot was 14.8600 m long and took 19.16 s. Compared with our robot, the driving distance increases by 1.09 times, and the obstacle avoidance time increases by 4.17 s.

Multi-robot motion planning experiments
The four experiments with three robots, all shown in Figures 22-28, were performed in four different environments: a globally known static environment, one with unknown static obstacles, one with unknown static and dynamic obstacles, and a dynamic environment with a large-scale sea area, respectively.The effectiveness and stability of our proposed fusion algorithm of the BAJPSA* algorithm, DWA dynamic obstacle avoidance strategy, and multi-robot priority obstacle avoidance strategy have been verified.

Globally known environment
The simulation results for the environment with known global obstacle information are shown in Figures 22-23, where the blue, pink, and red lines are the paths of the three robots, AGV1, AGV2 and AGV3, respectively; the black grid depicts the static obstacles, for which the robots have a priori information. Figure 22 shows the real-time change of the robots during the planning process. The linear velocity and angular variation curves of the robots are shown in Figures 23(a) and (b), respectively; Figure 23(c) shows the offset of each robot's motion path compared with the global path, where the starting and ending points of AGV1 are (14.5, 7.5) and (1.5, 10.5), respectively; the starting and ending points of AGV2 are (5.5, 6.5) and (13.5, 7.5), respectively; and the starting and ending points of AGV3 are (5.5, 6.5) and (13.5, 10.5), respectively.
Figures 22 and 23(a) show that AGV3 decelerates at the 50th motion control node when it detects a collision risk with AGV2, and completely stops at the 80th control node. Then, AGV2 starts decelerating from the 85th control node and completely stops at the 95th control node, as there is a collision conflict with AGV1. AGV1 has the highest priority, so AGV2 and AGV3 in the stopped state are treated as static obstacles, and AGV1 performs obstacle avoidance based on DWA. Furthermore, Figure 23(c) shows that the actual path of AGV1 deviates most from the global path. Once the collision risk between AGV2 and AGV1 is released, AGV2 starts to move at the 130th control node. At this time, AGV3 is still in collision conflict with AGV2. Thus, AGV3 resumes motion only at the 160th control node.
Figure 23 shows that AGV3, which has the lowest priority, remains stopped the longest and exhibits less significant changes in motion direction and path deviation. The additional obstacle avoidance performed by AGV1 and AGV2 around the lower-priority robots produced more noticeable changes in motion angle and global path deviation. However, the whole motion was relatively smooth. The multi-robot motion experiment took 130.91 s. The travel distances of AGV1, AGV2 and AGV3 were 13.54 m, 8.89 m and 9.63 m, respectively, and the average errors of motion deviation from the global path at each motion moment (0.1 s) were 0.3919 m, 0.2056 m, and 0.0120 m, respectively.

Environments containing unknown static obstacles
The multi-robot global paths based on the BAJPSA* algorithm in the global static environment are shown in Figure 24(a), where the starting and ending points of AGV1 are (3.5, 2.5) and (12.5, 14.5), respectively; the starting and ending points of AGV2 are (2.5, 15.5) and (12.5, 2.5), respectively; and the starting and ending points of AGV3 are (9.5, 15.5) and (7.5, 1.5), respectively. Then, randomly distributed unknown static obstacles (the red grid) were added to conduct the following simulation experiments.
The results shown in Figures 24 and 25 indicate that our multi-robot systems can successfully avoid random static obstacles when moving along the global path in an environment that includes unknown factors. The collision risk between robots is resolved successfully with the improved prioritization strategy. Figures 24(b) and 25(a) show that when the lower-priority AGV2 comes into collision conflict with the higher-priority AGV1, AGV2 decelerates from the 140th control node, and its velocity reaches zero at the 160th control node. The conflict is released once AGV1 moves away from AGV2, and AGV2 resumes motion at the 180th control node and successfully avoids the random obstacles.
The variations in motion speed, travel direction, and global path offset of each robot are more pronounced than in Section 4.3.1 because unknown disturbing obstacles were considered. The travel distances of AGV1, AGV2 and AGV3 are 16.36 m, 19.46 m and 14.57 m, respectively. The average global path offsets of the robots are 0.5353 m, 0.7853 m, and 0.4287 m, respectively. The robot movement took 160.68 s of program running time.

Environments containing unknown dynamic obstacles
We added several dynamic obstacles to the unknown static obstacles of Section 4.3.2 to further test the applicability of the single-robot dynamic obstacle avoidance strategy and the multi-robot prioritized obstacle avoidance strategy in the given scenario, where the starting and ending points of AGV1 are (1.5, 7.5) and (15.5, 12.5), respectively; the starting and ending points of AGV2 are (15.5, 8.5) and (1.5, 12.5), respectively; and the starting and ending points of AGV3 are (13.5, 14.5) and (3.5, 6.5), respectively. The yellow squares shown in Figure 26 are dynamic obstacles of which the robots have no a priori knowledge, and the red enclosures are the recognition regions that the robots assign to them. The study of dynamic obstacle motion speed and robot recognition area in Section 3.4.2 indicates that they are positively related. Therefore, we set the movement speeds of the three dynamic obstacles of small, medium, and large sizes to 0.5, 0.39, and 0.30 m/s, respectively; the radii of the recognition areas are 0.55, 0.40 and 0.35 m, respectively; and the preset motion paths are shown in the last panel of Figure 26.
As shown in Figure 26(b), the three robots and the three unknown dynamic obstacles meet at the center of the map. Figures 26(b) and 27(a) show that AGV1 and AGV2 detect dynamic obstacles in front of them and cannot avoid them because of the limited environment and the rapid movement of the obstacles. According to the dynamic obstacle avoidance rules for motion conflicts in Section 3.4.2, both AGV1 and AGV2 start decelerating around the 125th control node and stop entirely at the 150th control node. After the dynamic obstacles leave, both AGV1 and AGV2 resume motion around the 185th control node. Subsequently, the collision risk with AGV1 is detected by AGV3 and AGV2 at the 170th and 200th control nodes, respectively. According to Section 3.4.3, AGV3 and AGV2 decelerate and wait until the conflict is removed under the multi-robot priority obstacle avoidance strategy.
This experimental simulation result demonstrates the effectiveness of our improved single-robot dynamic obstacle avoidance strategy combined with the multi-robot priority avoidance strategy in an environment with random static and unknown dynamic obstacles. As we find from Figures 27(b
Figure 28 shows that our fusion algorithm is equally effective in the large-scale map environment. As indicated in Figure 28(b-4), AGV3 encounters a dynamic obstacle traveling in the same direction. It overtakes on the left side to avoid the obstacle traveling ahead, following the dynamic obstacle avoidance rules of the single robot. In Figure 28(b-6), AGV1 successfully avoids the random static obstacle and traverses the map's narrow area. In Figure 28(b-7), AGV2 and AGV3 successfully avoid the random static obstacles.

Conclusions
To solve the path planning problem of distributed multiple robots in dynamic environments, we propose a BAJPSA* algorithm fused with an adaptive DWA, which works in two stages.
In the first stage, we plan the globally optimal path for each robot with the BAJPSA* algorithm, and the simulation results demonstrate the effectiveness of BAJPSA* in global path planning. In the second stage, we perform the local path planning. The adaptive navigation strategy and path deviation evaluation function are proposed to improve the path tracking capability of the traditional DWA. Next, we categorize and discuss multiple unknown static and dynamic obstacle environments with motion conflict scenarios, and propose dynamic obstacle avoidance rules for the single robot. Then, we extend the single-robot case to distributed multi-robot systems with decision rights, discuss multiple classes of motion conflict situations, and achieve cooperative multi-robot avoidance by fusing the prioritized avoidance rules. The simulation results demonstrate the effectiveness of this algorithm for multi-robot path planning in unknown dynamic environments.
In this study, we focus on unknown static and highly dynamic environments; more complex factors (non-flat terrain, large-scale robots, etc.) will be gradually considered in future work. We can also test this algorithm on a robot platform and further study the multi-robot cooperative efficiency problem.

Figure 7. Schematic diagram of key target point extraction method.

Figure 13. Flow chart of multi-robot path planning.
(a) increases by 1.9845 , and the movement duration increases by 1.38 .
Figure 28(b-9) shows the final travel paths of the three robots. The traces of the robots are smooth and fit the global path. The experimental simulation results demonstrate the effectiveness of our multi-robot obstacle avoidance strategy, which fuses BAJPSA* with the improved DWA, in a large-scale environment. The motion distances of AGV1, AGV2 and AGV3 are 693 m, 388.5 m and 667 m, respectively, and the average errors of motion deviation from the global path at each motion moment (0.1 s) are 4.052 m, 0.895 m and 0.755 m, respectively.
$y = x + k\vec{d}$ represents that node $y$ can be reached by taking $k$ unit moves from node $x$ in diagonal direction $\vec{d}$. $z = y + k_i\vec{d_i}$ represents that node $z$ can be reached by taking $k_i$ unit moves from node $y$ in straight direction $\vec{d_i}$.

Table 1. Collision quantification criteria and avoidance rules.  represents the movement direction of the obstacle towards the robot;  represents the movement direction of the robot relative to the obstacle.
Volume 20, Issue 1, 145-178.

Table 2. Multi-robot motion conflict detection and resolution scenarios.

Table 3. Comparison of global path planning results.

Table 4. Robot motion planning results.