Research status of multi-robot systems task allocation and uncertainty treatment

Multi-robot coordination algorithms have become an active research topic in robotics in recent years, with a wide range of applications and good prospects. This paper analyzes and summarizes the current state of research on multi-robot coordination algorithms at home and abroad. It discusses these algorithms from two perspectives, task allocation and dealing with uncertainty, and presents the advantages and disadvantages of each commonly used method.


Introduction
Multi-robot systems have become an active research topic in robotics, and their high fault tolerance and robustness have led to wide use. A multi-robot system nevertheless still faces problems such as task allocation and resource competition, which are collectively referred to as the coordination of teammate robots. In a coordinated system, a complex problem is divided into a set of sub-problems, the optimization problem is solved through competition and cooperation among the robots, and the operational capacity of the system improves. In the study of multi-robot coordination algorithms, task allocation is an important guarantee that the robots can complete their tasks, and it is an important part of task planning. A multi-robot system is also subject to many sources of uncertainty, such as communication delay, dynamic environment change, limited sensor precision and system component failure. These uncertainties affect task allocation in the multi-robot system, so research on handling uncertainty is receiving increasing attention. This paper therefore reviews research on multi-robot systems in exploration and summarizes the latest research trends in multi-robot cooperation, task allocation and dealing with uncertainty.

Task Allocation
The goal of task allocation across a multi-robot system is to maximize the benefit of task execution and to minimize the cost for the entire system. The design criteria for task allocation methods are real-time performance, robustness, reliability, optimization performance, communication requirements, computing requirements, and scalability. There are two ways to allocate tasks: emergent allocation and intentional cooperative allocation. In emergent allocation, coordination arises from the interactions among the robots and between the robots and the environment, which makes it more suitable for large-scale multi-robot systems. However, this approach cannot accurately predict the behavior of an individual robot in the system and cannot guarantee efficiency. Intentional cooperative allocation is achieved through explicit cooperation among the robots; it makes human-computer interaction easy to implement and has the potential to make better use of the capabilities of heterogeneous robot teams. Intentional cooperative allocation can be divided into centralized allocation and distributed allocation. Centralized allocation has a simple structure and strong computing performance and is easy to implement, but it easily causes communication congestion and has difficulty with real-time processing. Centralized allocation methods include integer programming, basic search algorithms, methods based on satisfactory decision-making, heuristic search algorithms, graph-theoretic methods and swarm intelligence methods [1,2]. Distributed allocation balances the communication load and is more suitable for dynamic environments that involve frequent task changes and local autonomy [3]. Distributed allocation methods include the distributed cooperative architecture ALLIANCE [4], the vacancy chain method [1], the contract net protocol and auction algorithms.
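To make the trade-off of centralized allocation concrete, the following is a minimal sketch, assuming a single planner that knows a full robot-by-task cost matrix and assigns exactly one task per robot. The function name `centralized_allocation` and the cost values are hypothetical and are not taken from the cited works. The exhaustive search finds the optimal pairing, but it enumerates n! pairings, which illustrates why a central planner can struggle with real-time processing as the team grows.

```python
from itertools import permutations

def centralized_allocation(cost):
    """Return (best_assignment, best_total) for a square cost matrix.

    cost[r][t] is the (hypothetical) cost for robot r to execute task t.
    best_assignment[r] is the index of the task given to robot r.
    """
    n = len(cost)
    best_assignment, best_total = None, float("inf")
    # The central planner enumerates every one-to-one robot/task pairing.
    for perm in permutations(range(n)):
        total = sum(cost[r][perm[r]] for r in range(n))
        if total < best_total:
            best_assignment, best_total = perm, total
    return best_assignment, best_total

# Illustrative 3-robot, 3-task cost matrix (made-up numbers).
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = centralized_allocation(cost)
print(assignment, total)
```

In practice a polynomial-time assignment solver or an integer programming solver would replace the brute-force loop; the sketch only shows the structure of the problem the central planner solves.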
The market economy method [5-10] is the main method for coordinating a multi-robot system. It addresses the control of a distributed system built from decentralized subsystems. The market economy algorithm optimizes control in a selfish way: each subsystem aims to maximize its own local index, with the aim of obtaining the optimal solution for the whole system. The advantage of the market economy approach is the simplicity of the calculation [1]. It uses a bidding-auction approach to allocate tasks, assigning each exploration task to the highest bidder. Auctions are usually divided into single auctions, multiple auctions, and combined (combinatorial) auctions. Dias, Zlot, Kalra and Stentz [7] summarized the early work and compared the time complexity and communication complexity of the three auction methods. In terms of how the auction is implemented, a centralized auction can achieve the optimal task allocation, whereas a decentralized auction requires only a small amount of computation from each robot and is also robust [8]. Guillaume, Lounis, Aurelie, Abdel-Illah and Philippe [11] proposed a control architecture based on a topological representation of the environment and a protocol based on sequential simultaneous auctions (SSA), which reduces the planning complexity of coordinating the robots' policies. The policies are computed individually using Markov decision processes oriented by several goal task positions to reach. For dynamic task allocation in multi-robot search and retrieval tasks, Wei, Koen and Catholijn [12] put forward a prediction method based on an extension of the auction algorithm. The method uses a winner determination mechanism to allocate tasks to each robot. The robots implicitly coordinate their activities through team reasoning that leads to consensus about the task allocation.
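The bidding-auction idea can be sketched as a sequential single-task auction. This is an illustrative sketch under stated assumptions, not the protocol of any cited paper: the function `sequential_auction`, the robot and task names, and the bid rule (negative straight-line travel distance, so the closest robot is the highest bidder) are all hypothetical.

```python
import math

def sequential_auction(robots, tasks):
    """Auction tasks one at a time; each task goes to the highest bidder.

    robots: dict name -> (x, y) start position (hypothetical data)
    tasks:  dict name -> (x, y) task location
    A bid is the negative travel distance from the robot's planned
    position, so the closest robot submits the highest bid.
    """
    position = dict(robots)
    allocation = {name: [] for name in robots}
    for task, location in tasks.items():
        bids = {r: -math.dist(position[r], location) for r in position}
        winner = max(bids, key=bids.get)
        allocation[winner].append(task)
        # The winner will be at the task location when it bids again.
        position[winner] = location
    return allocation

robots = {"r1": (0, 0), "r2": (10, 0)}
tasks = {"a": (1, 0), "b": (9, 0), "c": (2, 0)}
allocation = sequential_auction(robots, tasks)
print(allocation)
```

Each robot only needs to evaluate its own bid, which matches the small per-robot computation attributed to decentralized auctions; the result is not guaranteed to be globally optimal.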
Based on the framework of a multi-agent system, Liao, Chen and Zhou [13] developed a mathematical model of multi-UAV cooperative target assignment. Taking into account the effect on target allocation of the value of the UAVs, their fighting force and initial positions, the value of the targets, and battlefield threats, they proposed an agent-based target allocation algorithm using a distributed cooperative auction, which achieves cooperative target allocation for a multi-UAV system. The algorithm has good optimization and time characteristics and can meet real-time requirements on the battlefield. However, it does not consider dynamic target assignment under uncertain information. By analyzing the uncertainty of situation information in multi-UAV air combat, Chen, Zhao and Xu [14] established a combat dominance function for a multi-UAV dynamic game based on fuzzy information and proposed a dynamic target assignment algorithm based on a distributed cooperative auction. The dynamic extensive-form game was transformed into a static strategic game, and the fuzzy structured element method was combined with the particle swarm optimization algorithm to give the mixed-strategy Nash equilibrium under fuzzy information. The proposed algorithm can meet the optimality requirements of a project under specified time or resource constraints. To enable multiple UAVs to arrive at multiple targets simultaneously under uncertain information, Zhao, Li and Wang [15] combined the auction algorithm with an interval consistency algorithm. The auction algorithm is used to allocate multi-target attack tasks among the UAVs, and a control method based on interval consistency is proposed for the simultaneous arrival of the UAVs under uncertain information.
In a combined (combinatorial) auction, each robot bids on all subsets of the tasks, so the workload of communication, bid evaluation, task clearing and so on grows exponentially. In recent years, improvements to the combined auction method have taken two directions. One is a hybrid auction structure [5], and the other is to auction tasks after grouping them [6,9,10]. When a priori information about the tasks or the environment is known, the problem can be simplified by grouping the tasks with a clustering algorithm before exploration. Jones, Dias and Stentz [6] used two heuristics (clustering and opportunistic path planning) to simplify the problem and then coupled them through a hierarchical auction. Two tasks were assigned to two different kinds of robots. Auctioning tasks after grouping them in this way can further balance the tasks among the robots. The method proposed by Nanjanath and Gini [9] re-auctions the remaining tasks each time a member of the multi-robot system completes a task, in order to handle dynamic tasks and improve robustness. Zhang, Collins and Shi [10] reported results for the coordination of heterogeneous multi-robot systems with different moving speeds and different task completion times. They proposed a random clustering auction for coordination, based on a Markov chain search process and the simulated annealing algorithm, which yields a coordination performance index that lies between those of the globally optimal and the random allocation strategies.
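The exponential growth can be checked directly: a robot that bids on every non-empty subset of m tasks must evaluate 2^m - 1 bundles. The short sketch below counts them and shows how grouping tasks beforehand shrinks the bid space. The helper `all_bundles` and the hand-made clusters are hypothetical; a real system would obtain the groups from a clustering algorithm.

```python
from itertools import combinations

def all_bundles(items):
    """Enumerate every non-empty subset of items a robot could bid on."""
    return [
        bundle
        for size in range(1, len(items) + 1)
        for bundle in combinations(items, size)
    ]

tasks = ["t1", "t2", "t3", "t4"]
bundles = all_bundles(tasks)             # 2**4 - 1 bundles per robot

# After grouping (a made-up grouping here), robots bid on subsets of
# clusters rather than subsets of individual tasks.
clusters = [("t1", "t2"), ("t3", "t4")]
cluster_bundles = all_bundles(clusters)  # 2**2 - 1 bundles per robot

print(len(bundles), len(cluster_bundles))
```

With ten ungrouped tasks the same enumeration already yields 1023 bundles per robot, which is why grouping or hybrid structures are attractive.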
Many papers study methods that use the same information processing mode as the market economy algorithm. Cabrera-Mora and Xiao [16] proposed exploration coordination based on a flooding algorithm. Nieto-Granda, Rogers III and Christensen [17] proposed coordination algorithms for redundant robots in the system. In addition, some scholars have studied the coordination of multi-robot systems at the logical level using finite-state machines [18,19] or Petri nets [20-22]. These methods can handle robots with different motion capabilities [18] or different functions [20], but because coordination takes place entirely at the logical level, the behavior they can achieve is relatively limited. However, they can participate in the control loop of the multi-robot system through supervisory control [23].

Dealing with Uncertainty
Another trend in research on coordination algorithms for multi-robot systems is the growing attention paid to the uncertainties in the robots themselves. These uncertainties include communication delays, dynamic changes in the environment, limited sensor precision, failure of system components, and so on. Common methods for handling uncertainty include the Markov decision process (MDP), the fuzzy control method [14], and rough set theory [24]. The typical approach studies the coordination of multi-robot systems on the basis of the MDP model from decision theory. The MDP is an effective tool [25-28] for dealing with uncertainties in robot systems.
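As a reminder of how an MDP captures uncertain action outcomes, here is a minimal value-iteration sketch for a finite MDP. The two-state example (a robot that is either "far" from its goal or "at_goal", with a move that succeeds with probability 0.8) is entirely hypothetical and is not drawn from the cited studies.

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-9):
    """Compute optimal state values and a greedy policy for a finite MDP.

    transition[s][a] is a list of (probability, next_state) pairs.
    reward[s][a] is the immediate reward for taking action a in state s.
    """
    values = {s: 0.0 for s in states}

    def q(s, a):
        # Expected return of taking action a in state s, then acting optimally.
        return reward[s][a] + gamma * sum(p * values[s2] for p, s2 in transition[s][a])

    while True:
        delta = 0.0
        for s in states:
            best = max(q(s, a) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {s: max(actions, key=lambda a: q(s, a)) for s in states}
    return values, policy

states = ["far", "at_goal"]
actions = ["move", "wait"]
transition = {
    "far": {"move": [(0.8, "at_goal"), (0.2, "far")], "wait": [(1.0, "far")]},
    "at_goal": {"move": [(1.0, "at_goal")], "wait": [(1.0, "at_goal")]},
}
reward = {
    "far": {"move": -1.0, "wait": -1.0},
    "at_goal": {"move": 0.0, "wait": 0.0},
}
values, policy = value_iteration(states, actions, transition, reward)
print(values, policy)
```

The unreliable move is still preferred over waiting because the expected cost of retrying is lower than the cost of never reaching the goal.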
Work in this area includes the study by Huang and Luo [26] on multi-robot active targeting, in which each robot selects as its movement target the point that maximizes the difference between benefit and cost within the group. In this study, information entropy represents the uncertainty of the robot positions, and the benefit is defined as the reduction in a robot's position uncertainty after it moves to the target point. Capitan, Spaan, Merino and Ollero [25] modeled the behavior of UAVs with a partially observable Markov decision process (POMDP) and applied these models to multi-robot systems for environmental monitoring and target tracking. To have several UAVs monitor multiple targets [29], a POMDP was employed to reason about the uncertainty in the dynamic model of the application, so that the multi-UAV team could complete its tasks in real time. For environmental monitoring and moving-target tracking, an auction algorithm was used to solve the decision problem.
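The two ingredients mentioned above, a belief that is updated from observations and entropy as a measure of uncertainty, can be illustrated with a small discrete sketch. It assumes a belief over four grid cells and a made-up sensor likelihood; it is a generic Bayes update with Shannon entropy, not the specific formulation used in [25] or [26].

```python
import math

def entropy(belief):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in belief if p > 0)

def bayes_update(belief, likelihood):
    """Posterior over cells given a prior belief and p(observation | cell)."""
    unnormalized = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical data: uniform prior over four cells, and a sensor reading
# that is much more likely if the robot is in cell 0.
prior = [0.25, 0.25, 0.25, 0.25]
likelihood = [0.9, 0.1, 0.1, 0.1]
posterior = bayes_update(prior, likelihood)

# Benefit of the observation, measured as the reduction in entropy.
benefit = entropy(prior) - entropy(posterior)
print(posterior, benefit)
```

A robot comparing candidate target points could, under these assumptions, rank them by such an expected entropy reduction minus the cost of travelling there.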
To promote cooperation among the UAVs, their different behaviors are analyzed using a factorization strategy to model the dynamic situation, and a decentralized data fusion system is used to obtain the belief over the joint factor state observed by all team members. The advantage of the proposed algorithm is that its computational complexity does not grow with the number of UAVs or targets. Moreover, the complexity of the auction and of the decentralized data fusion depends only on the size of each UAV's local communication neighborhood. Lozenguez, Adouane, Beynier, Mouaddib and Martinet [27] adopted a topological map representation instead of a grid, because a grid would cause explosive growth in the state space of each robot's goal-oriented Markov decision process. Kim, Baik and Lee [30] studied task allocation for a heterogeneous UAV team performing search-and-destroy tasks in a dynamic and uncertain environment. They put forward a distributed task allocation strategy based on resource benefits that handles dynamic, unpredictable events robustly and balances the load across the UAV team. The method is reported to keep resource levels high and to improve the team's response capability, and the scheme is reported to be superior to the real-time task allocation method. In the work above, the action policy of the decision model is selected by an auction mechanism [25,27,29] and entropy is used as the measure of robot uncertainty [25,26], but stratified social entropy [31] is not used in this literature.
In the study of the multi-UAV close-formation problem [32], the control inputs are optimized with a differential evolution strategy, and a Markov chain is used to prove that the solution is optimal. The exploration process of a mobile robot is a POMDP with continuous state [33], and approximate methods for solving the POMDP have been given. Besides the foregoing, methods related to robot applications include the task shortening method [34], the evolutionary strategy method [35], etc. In addition, when UAVs perform cooperative target localization, the transmission of measurement data is delayed. Wang, Qin, Bai and Cui [36] proposed a nonlinear filter based on solving the Fokker-Planck equation to address this problem. According to the arrival time of the measurements, the proposed filter is divided into two parts. Non-delayed measurements are fused in the first part, where the Fokker-Planck equation is used in its forward form to propagate the conditional probability density function. Delayed measurements are fused in the second part, where the Fokker-Planck equation is used in a backward form. Bayes' formula is applied in both parts during the measurement update.
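For reference, the forward Fokker-Planck equation and the Bayesian measurement update have the following general textbook forms. The state model and the symbols f, G, Q, z_k are assumptions introduced here for illustration and are not taken from [36]; the backward-form propagation used for delayed measurements is not reproduced.

```latex
% Assumed state model (notation not from [36]):
%   dx = f(x,t)\,dt + G(x,t)\,d\beta, \qquad E[d\beta\,d\beta^{\top}] = Q\,dt
% Forward Fokker-Planck equation for the conditional density p(x,t):
\frac{\partial p}{\partial t}
  = -\sum_{i} \frac{\partial}{\partial x_i}\bigl[f_i(x,t)\,p\bigr]
  + \frac{1}{2}\sum_{i,j} \frac{\partial^2}{\partial x_i\,\partial x_j}
    \bigl[(G Q G^{\top})_{ij}\,p\bigr]

% Measurement update by Bayes' formula when measurement z_k arrives:
p(x \mid z_{1:k})
  = \frac{p(z_k \mid x)\,p(x \mid z_{1:k-1})}
         {\int p(z_k \mid \xi)\,p(\xi \mid z_{1:k-1})\,d\xi}
```

Between measurements the density is propagated by the first equation, and each arriving measurement reweights it through the second.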

Conclusions
Multi-robot systems have many potential applications. From the perspectives of task allocation and dealing with uncertainty, this paper reviewed the state of the art in multi-robot coordination algorithms and pointed out the advantages and disadvantages of each method.

Acknowledgments
This article is funded by the National Natural Science Fund (No. 61672304) and the Natural Science Fund of