Distributed Imaging Satellite Mission Planning Based on Multi-Agent

As the core technology in the field of imaging satellite application, imaging satellite mission planning has received increasing attention. Aiming at this problem, this paper proposes a distributed imaging satellite mission planning method (DISMPM) based on multi-agent systems. Firstly, a distributed imaging satellite mission planning model with variable collaborative division of labor is constructed based on multi-agent system theory. The model defines the intelligence level of satellites in the satellite cluster and the interaction mode between satellites. Secondly, a cooperation mechanism between satellite agents is established based on the blackboard model, and two negotiation strategies, worker serial activation and worker parallel activation, are proposed. To improve negotiation efficiency, the targets are pre-assigned before negotiation. Finally, a hybrid discrete multi-verse optimization algorithm is proposed to solve the mission planning problem of the worker agents in the model. Simulation experiments show that the average fitness values obtained by DISMPM with the serial activation strategy and DISMPM with the parallel activation strategy are up to 20% higher than those obtained by the best-performing centralized algorithm in this paper, and the average number of negotiations of both DISMPM variants is reduced by more than 80% compared with the contract network algorithm, effectively reducing system traffic and the communication burden. This proves that DISMPM is suitable for distributed imaging satellite mission planning.


I. INTRODUCTION
Imaging satellites are remote sensing satellites that detect the Earth's surface and lower atmosphere and obtain image information using the optical imaging remote sensors (panchromatic, hyperspectral, infrared, etc.) or synthetic aperture radar (SAR) imaging remote sensors they carry in orbit. As an effective means of obtaining space information, imaging satellites have unique advantages such as a wide observation range, long observation time, freedom from regional restrictions, and no risk to personnel. At present, imaging satellites are widely used in military reconnaissance, territorial mapping, urban planning, disaster prevention and control, environmental protection, resource detection, crop monitoring, and many other fields.
The associate editor coordinating the review of this manuscript and approving it for publication was Qiang Li .
Imaging satellite mission planning refers to allocating the limited observation windows of imaging satellites to different imaging missions, subject to the constraints of satellite use, in order to improve the efficiency of satellite imaging and reduce operating cost. When an imaging satellite is in orbit, there are many observable targets in each orbital lap. Each target has different observation windows, which are distributed over different orbits of different satellites. In order to make the best use of valuable satellite resources and maximize the imaging income, a reasonable plan is required of which targets are to be imaged by which satellite, at which time, and at which side-view angle. Related studies show that imaging satellite mission planning is a typical non-deterministic polynomial hard (NP-hard) problem [1]. As the scale of missions and resources increases, the complexity of solving the problem rises sharply. From the perspective of solution architecture, centralized and distributed imaging satellite mission planning together constitute the main technical system of multi-imaging-satellite mission planning.
In centralized imaging satellite mission planning, the planning department integrates imaging requirements and satellite resource status, centrally arranges imaging missions for all satellites, and uploads the relevant instructions to the imaging satellites in orbit; the satellites are then instructed to perform specific imaging missions. As the most widely used planning approach at present, this direction has relatively rich research results, mainly focusing on problem modeling and solving algorithms. In terms of problem modeling, researchers have modeled satellite mission planning problems from different perspectives and proposed different types of models, including constraint satisfaction problem models [2], [3], knapsack problem models [4], graph-theory-based models [5], [6], etc. In terms of solving algorithms, the algorithms currently applied in satellite mission planning can be divided into exact algorithms [7], [8], heuristic algorithms [9], [10], and intelligent optimization algorithms [11], [12], [13], [14]. Exact algorithms can obtain the optimal solution, but their solution efficiency is low; when solving large-scale mission planning problems, techniques such as Lagrangian relaxation, column generation, and branch-and-price are needed to decompose the problem and improve efficiency. Heuristic algorithms guide the construction of solutions by designing heuristic rules based on planning domain knowledge; most studies design heuristic rules by combining mission income, the overlapping degree of time windows, and the resource consumption ratio [15].
However, there are some unavoidable problems in centralized imaging satellite mission planning.
(1) High solution complexity. With the increase of problem scale, the complexity of global mission planning based on all satellite states will increase dramatically. High complexity will increase the solving time, and more importantly, it puts forward higher requirements on the optimization ability of the algorithm.
(2) Untimely dynamic response. Centralized satellite mission planning relies heavily on ground resources. When a new mission arrives, the planner must either scrap the existing plan and start over, or modify the existing plan. When new missions arrive routinely, the planning scheme must be modified repeatedly. It is therefore difficult for centralized mission planning to respond promptly to newly arriving missions.
(3) Weak component encapsulation [16]. Centralized mission planning requires the planning department to know the technical parameters of all satellites. In practice, however, different satellites are usually managed by different agencies, and the technical parameters of the satellites are not disclosed to each other. In this case, the centralized satellite mission planning method is difficult to apply.
In view of this, the distributed imaging satellite mission planning emerges. Distributed imaging satellite mission planning is based on multi-agent system theory, which models satellites as agents with intelligence. Satellite agents can realize reasonable assignment of missions through negotiation and plan missions they undertake based on their own states. Compared with centralized imaging satellite mission planning, distributed imaging satellite mission planning has the following advantages: First, in distributed mission planning, mission assignment can be achieved among satellites through negotiation algorithms. The satellites can perform mission planning autonomously according to their own states, which effectively reduces the computational complexity.
Secondly, distributed mission planning is in the mode of ''online planning''. When an imaging mission arrives, each satellite can decide whether to undertake the mission according to its own capabilities and benefits. Satellites are less dependent on ground resources and can respond promptly to newly arrived missions.
Finally, distributed mission planning is based on multi-agent system theory. When a new satellite is added, it only needs to be modeled as a new satellite agent and registered in the multi-agent system; multi-satellite negotiation then proceeds normally and the original negotiation protocol does not need to be changed. Similarly, when a satellite fails, its satellite agent is simply deregistered from the multi-agent system.
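The register/deregister behavior described above can be sketched with a toy agent registry. This is an illustrative assumption, not code from the paper: the class, method names, and the use of callables as stand-in agents are all hypothetical.

```python
class SatelliteCluster:
    """Toy multi-agent registry: satellites join or leave the negotiation
    simply by registering or deregistering, with no change to the protocol."""

    def __init__(self):
        self.agents = {}

    def register(self, name, agent):
        # adding a satellite = modeling it as a new agent and registering it
        self.agents[name] = agent

    def deregister(self, name):
        # a failed satellite is simply removed from the system
        self.agents.pop(name, None)

    def broadcast(self, message):
        # every currently registered agent receives the message
        return {name: agent(message) for name, agent in self.agents.items()}


cluster = SatelliteCluster()
cluster.register("sat1", lambda msg: f"sat1 got {msg}")
cluster.register("sat2", lambda msg: f"sat2 got {msg}")
cluster.deregister("sat2")          # simulate a satellite failure
replies = cluster.broadcast("plan")  # only sat1 is still negotiating
```

The negotiation logic (the `broadcast` call here) never needs to change as agents come and go, which is the encapsulation advantage claimed above.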
In this paper, we propose a distributed imaging satellite mission planning method (DISMPM) based on multi-agent system. The main contributions of this paper are as follows: (1) A distributed imaging satellite mission planning model with variable collaborative division of labor is constructed based on multi-agent system theory. The model defines the intelligence level of satellites in the satellite cluster and the interaction mode between satellites.
(2) The cooperation mechanism between satellite agents is established based on blackboard model. Two negotiation strategies of worker serial activation and worker parallel activation are proposed. In order to improve the efficiency of negotiation, the targets are pre-assigned before negotiation.
(3) A hybrid discrete multi-verse optimization algorithm is proposed to solve the mission planning problem of worker agent in the model.
Simulation experiments show that DISMPM can effectively improve the quality of the planning scheme compared with centralized mission planning algorithms, and effectively reduce system traffic compared with the contract network algorithm. This proves that DISMPM is suitable for distributed imaging satellite mission planning.
The rest of this paper is organized as follows. Section II briefly reviews related work on distributed imaging satellite mission planning. Section III describes the imaging satellite mission planning problem. Section IV constructs the architecture of distributed imaging satellite mission planning based on multi-agent systems. Section V proposes the hybrid discrete multi-verse algorithm, and Section VI presents the experimental results and corresponding analysis. Finally, Section VII draws the conclusion.

II. RELATED WORK
At present, most researchers establish solution frameworks for distributed imaging satellite mission planning based on multi-agent system theory. Experiments have shown that the distributed solution framework effectively reduces solution complexity and can obtain better planning schemes, giving it certain advantages over the centralized framework [17], [18], [19]. The operational basis of the distributed imaging satellite solution framework is the cooperation protocol between satellites, through which a reasonable assignment of missions can be realized. The contract network protocol is a common cooperative protocol in distributed artificial intelligence and is widely used in distributed imaging satellite mission planning. References [20] and [21] introduce the contract network protocol and design corresponding algorithms that effectively solve the mission allocation problem. Reference [22] improves the contract network protocol by introducing an all-mission bidding mechanism and a second-bid winning mechanism. Based on the traditional contract network protocol, [23] integrates three kinds of contracts, namely sale, exchange, and replacement, to realize multi-satellite collaboration and interaction, further improving the rationality of mission allocation. In order to reduce the communication burden of a distributed satellite mission planning system, [24] designs a bidder pre-screening mechanism: the system estimates which candidates can complete a mission before bidding and sends the bidding notice only to the relevant workers. Reference [25] proposes an improved contract network protocol (ICNP) with a credit mechanism; experimental results show that ICNP can effectively reduce the cost of inter-satellite communication.
For the emergency observation problem, [26] establishes a three-layer dynamic interactive architecture model based on multi-agent systems and designs a mission planning algorithm based on an extended contract network protocol. Reference [27] proposes a secondary mission allocation strategy to reduce conflicts between missions and accelerate the allocation process. Reference [28] proposes a mission assignment algorithm based on a trustworthiness mechanism for agile satellite dynamic mission planning and proves its effectiveness through experiments. Reference [29] realizes mission collaboration in a distributed satellite system based on a Belief-Desire-Intention extended contract network protocol, improving the basic contract network technique in three aspects: mission tendering, bidding, and bid evaluation. Single-satellite scheme clustering and evolution operators are added to the contract network algorithm to improve the optimization of multi-satellite Earth observation schemes [21].
Research on multi-agent systems has been very active, and some researchers have explored games in multi-agent systems. Reference [30] provides a novel game-theoretical approach for multi-agent systems; the game method provides another way to solve the distributed mission planning problem. Reference [31] introduces game theory into multi-satellite collaborative mission planning and designs various negotiation mechanisms, such as a utility-based regret game, a smoke-signal game, and a broadcast game. Reference [32] solves the distributed mission planning problem through a networked game model based on a game-negotiation mechanism; in this model, each satellite is viewed as a ''rational'' player who continuously updates its own ''action'' through cooperation with neighbors until a Nash equilibrium is reached. The analysis of the above literature shows that the contract network protocol is generally used as the operating mechanism of multi-agent-based distributed imaging satellite mission planning. However, the contract network protocol has some problems. On the one hand, missions are usually processed one by one, with only one mission bid out per negotiation; as the number of missions increases, the system generates a large amount of communication. On the other hand, the order of mission processing has a great impact on the final planning result: missions at the top of the list get observation opportunities first, while missions at the bottom cannot seize already-allocated time windows and thus get fewer opportunities, which is not conducive to the overall optimization of the problem.

III. PROBLEM DESCRIPTION
Imaging satellite mission planning refers to the problem of determining which satellites should perform which imaging missions at which times so as to maximize the comprehensive benefit, based on the requirements users put forward for multiple imaging satellites. Imaging satellite mission planning is thus a complex combinatorial optimization problem involving imaging missions and satellite resources. In this section, we describe the problem concretely in terms of problem constraints and the objective function.

A. BASIC SYMBOL DEFINITION
In order to facilitate the description of the problem, the relevant symbol definitions are first given, as shown in Table 1.
B. CONSTRAINTS
(2) The actual observation time window of a target must lie within the time window during which the target is visible to the satellite.
(3) The same satellite must satisfy the attitude adjustment time constraint between two consecutive targets.
(4) Energy constraint. The duration of mission observation cannot exceed the available power-on time of the satellite in the current orbital lap.
(5) Data storage constraint. The storage capacity occupied by the mission cannot exceed the storage capacity available in the satellite's current lap.
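Constraints (2)-(5) can be sketched as a feasibility check over a single satellite's schedule. All field names, the `Observation` record, and the `feasible` helper are illustrative assumptions, not notation from the paper.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    start: float        # observation start time (s)
    end: float          # observation end time (s)
    vis_start: float    # start of the visible time window (s)
    vis_end: float      # end of the visible time window (s)
    energy: float       # power-on time consumed (s)
    storage: float      # storage consumed (GB)


def feasible(schedule, max_energy, max_storage, transition_time):
    """Check one satellite's schedule (sorted by start time) against the
    visibility, attitude-transition, energy, and storage constraints."""
    # (2) each observation must lie inside its visible time window
    if any(o.start < o.vis_start or o.end > o.vis_end for o in schedule):
        return False
    # (3) consecutive observations must leave time for attitude adjustment
    for prev, nxt in zip(schedule, schedule[1:]):
        if nxt.start - prev.end < transition_time:
            return False
    # (4) total observation duration within the available power-on time
    if sum(o.energy for o in schedule) > max_energy:
        return False
    # (5) total storage within the capacity available this lap
    return sum(o.storage for o in schedule) <= max_storage
```

A planner would call such a check on every candidate schedule before accepting it.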

C. OBJECTIVE FUNCTION
The imaging satellite mission planning problem is a typical multi-objective optimization problem with different optimization objectives for different application scenarios.
In some scenarios, imaging satellite mission planning considers multiple optimization objectives at the same time, such as mission income, mission completion rate, mission timeliness, and resource utilization. When the optimization objectives become complex, multi-objective optimization techniques represented by Pareto methods have great application potential: they provide a group of non-dominated solutions from which the planner can choose the most suitable one. This paper focuses on proposing a solution method for distributed mission planning of imaging satellites and considers only two optimization objectives, the mission income and the mission completion rate. Therefore, the form of sub-objective weighting is adopted. The basic equation for calculating mission income is as follows.
The satellites involved in the planning in this paper have a certain degree of autonomy and can plan their own missions. During negotiation, satellites are allowed to cancel previously scheduled targets in order to achieve greater imaging income. In order to prevent a target from being arranged or cancelled repeatedly during negotiation and to speed up convergence, a memory table is designed to record, for each target, the number of times it has been arranged or cancelled. If a target is arranged and then cancelled in a negotiation, its income decreases. The improved equation for calculating the income of the missions is as follows.
income = Σ_{m=1}^{M} Σ_{n=1}^{N} x_n^m · val_n · λ^{q_n}   (7)

where λ is the income penalty coefficient with λ ∈ (0, 1), and q_n is the number of times target n has been arranged. The mission completion rate is the ratio of the number of completed missions to the total number of missions, which can be calculated by the following equation.
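The penalized income of Eq. (7) can be sketched in a few lines. The function name, data layout, and default penalty value are illustrative assumptions.

```python
def mission_income(assignments, values, counts, penalty=0.8):
    """Penalized mission income in the spirit of Eq. (7): each scheduled
    mission's value val_n is discounted by penalty**q_n, where q_n counts
    how many times target n has been arranged (and possibly cancelled).

    assignments: target id -> True if scheduled in the current plan
    values:      target id -> base income val_n
    counts:      target id -> arrangement count q_n from the memory table
    """
    return sum(values[n] * penalty ** counts[n]
               for n, scheduled in assignments.items() if scheduled)
```

A target that has been arranged and cancelled twice (q_n = 2) contributes only penalty-squared times its base value, discouraging repeated rescheduling, as described above.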
The objective function of comprehensive consideration of the mission income and the mission completion rate is as follows.
where α denotes the weight of the first planning objective and satisfies 0 < α < 1. In fact, α can be determined by the real mission planning background: a larger weight on mission income reflects priority for important missions, while a larger weight on the mission completion rate reflects priority for satisfying user demand. We can vary the weight within a certain range while keeping other variables unchanged, and select the weight corresponding to the optimal scheme after several experiments. In this paper, the imaging missions are set up through simulation and there is no real mission planning background. Therefore, we set the weights of both the mission income and the mission completion rate to 0.5.
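The weighted objective can be sketched as follows. Normalizing the income by an assumed upper bound `income_max` is our addition, since the normalization is not spelled out here; the function name is also illustrative.

```python
def objective(income, income_max, completed, total, alpha=0.5):
    """Weighted objective combining normalized mission income and the
    mission completion rate, with 0 < alpha < 1 as in the paper.
    income_max is an assumed normalization bound (e.g. the income if every
    mission were scheduled with no penalty)."""
    completion_rate = completed / total
    return alpha * (income / income_max) + (1 - alpha) * completion_rate
```

With alpha = 0.5, as used in this paper, both sub-objectives contribute equally to the fitness value.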

IV. DISTRIBUTED IMAGING SATELLITE MISSION PLANNING ARCHITECTURE BASED ON MULTI-AGENT SYSTEM
On the basis of imaging satellite mission planning, distributed mission planning focuses on how to achieve mission allocation and planning through interactive negotiation between satellites. This section establishes the architecture of distributed imaging satellite mission planning based on multi-agent systems, focusing on the multi-agent-based satellite mission planning model and the interactive negotiation strategies between satellite agents.

A. DISTRIBUTED IMAGING SATELLITE MISSION PLANNING MODEL CONSTRUCTION
An agent is a software or hardware entity capable of independent activity, possessing perception, autonomy, and the ability to interact. A multi-agent system is a distributed autonomous system composed of independent agents that cooperate to complete a certain mission. Given the relatively independent character of individual satellites, a single satellite is mapped to a satellite agent, and multiple satellite agents constitute a multi-agent system. In a distributed satellite cluster system, the roles of satellites differ, and so does their autonomy. According to [33], the autonomous capability of a satellite can be divided into levels I_1, I_2, I_3, and I_4, as shown in Fig.1. An I_4 satellite agent is only responsible for receiving and executing missions; most satellites currently in orbit belong to this type. An I_3 satellite agent has local planning ability and can plan the missions it undertakes based on its own state. An I_2 satellite agent can additionally interact with other satellite agents and resolve conflicts through negotiation, thus improving the overall planning income. An I_1 satellite agent has the strongest autonomous capability: it can obtain the information of other satellite agents, assign missions to appropriate satellite agents, and generate the mission planning scheme of the entire satellite cluster.
Based on the analysis of the four autonomous capability levels of satellite agents, this paper adopts a distributed multi-agent system with variable cooperative division of labor [16] as the organizational structure of the satellite cluster. In this structure, the cooperative role of each satellite agent is variable, but at any time only one satellite agent has I_1 intelligence, while the other satellite agents have I_2 intelligence, as shown in Fig.2. In this model, communication links can be established between satellite agents, as shown in Fig.2(a). When the role of a satellite agent is I_1 (for example, satellite agent 1), in order to improve negotiation efficiency and reduce communication between satellite agents, the other satellite agents communicate only with satellite agent 1, as shown in Fig.2(b).

B. SATELLITE AGENT COOPERATION MECHANISM BASED ON BLACKBOARD MODEL
The cooperation mechanism between satellite agents is the basis of the operation of distributed imaging satellite mission planning model based on multi-agent system. Based on the characteristics of the established multi-agent system model of distributed satellite cluster, this paper uses the blackboard model to establish the negotiation mechanism between satellite agents.

1) BLACKBOARD MODEL
The blackboard model is a multi-knowledge source system, whose concept was first proposed by Newell in 1962. The blackboard model has three main components: knowledge source, blackboard and control mechanism, as shown in Fig.3.
Knowledge source (KS): A knowledge source is an independent knowledge base that describes the knowledge of the problem to be solved. Knowledge sources can also be viewed as experts, which are the subjects that solve the problem. A blackboard model usually includes multiple knowledge sources, which cannot communicate with or call each other directly but interact directly with the blackboard and cooperate to solve the problem.
Blackboard: The blackboard is a shared work space for solving problems. It is mainly used for storing data required by knowledge sources and transferring information. It is a global work area in the system. What's on the board is called a hypothesis, which is an explanation of one aspect of a problem at a particular level of information. In the process of solving, each knowledge source will record the generated part of the solution on the blackboard, and constantly modify the blackboard.
Control mechanism: The control mechanism is the reasoning mechanism of blackboard model to solve the problem, which is composed of the supervision procedure and the scheduling procedure. It activates relevant knowledge sources according to the solving state of the problem on the blackboard and the solving skills of each knowledge source, and selects appropriate knowledge sources according to certain strategies to solve the problem, so that the knowledge source can timely change according to the blackboard.
The main idea of the blackboard model is that multiple experts use the same blackboard to solve a problem collaboratively. The blackboard is a shared problem-solving workspace visible to all experts. Problem solving begins when the problem and initial data are recorded on the blackboard. The control mechanism selects appropriate experts according to the solution state of the problem on the blackboard. After each expert works on the problem, its results are recorded on the blackboard. All experts share information through the blackboard and use the solving experience of other experts to guide their own solving process. When an expert finds that there is enough information on the blackboard to support further progress, it writes new results on the blackboard. The additional information may in turn support other experts, and the process repeats until the problem is solved.
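The blackboard loop described above can be sketched with a shared dictionary as the blackboard and (precondition, action) pairs as knowledge sources. The toy problem and all names here are illustrative, not from the paper.

```python
def control_loop(blackboard, experts, max_rounds=10):
    """Minimal blackboard control mechanism: repeatedly activate any expert
    whose precondition holds on the current blackboard state, until the
    problem is marked solved or no expert can make progress."""
    for _ in range(max_rounds):
        progressed = False
        for can_act, act in experts:
            if can_act(blackboard):
                act(blackboard)          # the expert writes new results
                progressed = True
        if blackboard.get("solved") or not progressed:
            break
    return blackboard


# Toy problem: one expert doubles the input, another marks it solved once
# the intermediate result appears on the blackboard.
experts = [
    (lambda bb: "x" in bb and "y" not in bb,
     lambda bb: bb.update(y=2 * bb["x"])),
    (lambda bb: "y" in bb,
     lambda bb: bb.update(solved=True)),
]
result = control_loop({"x": 21}, experts)
```

Note how the second expert acts only after the first has written its partial result, which is exactly the "additional information supports further solving" behavior described above.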

2) SATELLITE AGENT COLLABORATION STRATEGY BASED ON BLACKBOARD MODEL
In the distributed satellite system architecture based on multi-agent systems, each satellite participating in the planning is regarded as an agent. The blackboard function is realized by the satellite agent with I_1 intelligence, which is called the manager agent. Each knowledge source is realized by a satellite agent with I_2 intelligence, which is called a worker agent.

a: TARGET PRE-ASSIGNMENT
In order to accelerate the worker agents' solution of the problem, this paper pre-assigns the observation targets, exploiting the manager agent's possession of global information about the satellite cluster. As shown in Fig.4, square nodes represent the targets to be observed, and circular nodes represent the worker agents; a worker agent that has an observation window for a target is connected to that target by an edge. Target pre-assignment traverses the target set and assigns a worker agent to each target according to a certain rule; it can adopt random assignment or a heuristic method. Once a target is assigned to a specific worker agent, it becomes an imaging mission undertaken by that worker agent. In order to maximize the income of the worker agents' initial mission planning, this paper designs two heuristic factors, worker load and target conflict degree, to guide the pre-assignment process. Specifically, when assigning a target to be observed, the manager agent gives priority to the worker agent with low worker load and low target conflict degree.
Worker load measures the number of missions currently undertaken by a worker agent and is computed from num_cur^m, the number of missions currently undertaken by satellite m. The maximum worker load over all worker agents is load_max = max{load_m | m = 1, 2, ..., M}, and the minimum worker load is load_min = min{load_m | m = 1, 2, ..., M}. The normalized worker load of a worker agent is then

nload_m = (load_m − load_min) / (load_max − load_min)
Target conflict degree refers to the degree of conflict that the current target, once assigned to a worker agent, causes with the targets already arranged on that worker agent and with the targets the worker agent is capable of undertaking. The targets that a worker agent is capable of undertaking are those that have visible time windows on the worker agent. The target conflict degree is calculated as

con_{m,n} = Σ_k c_{n,k}^m

where the summation runs over the set of targets that satellite m has undertaken or is capable of undertaking; if the current task n conflicts with task k in this set, then c_{n,k}^m = 1, otherwise c_{n,k}^m = 0. The maximum target conflict degree of task n among all worker agents is con_max = max{con_{m,n} | m = 1, 2, ..., M}, and the minimum is con_min = min{con_{m,n} | m = 1, 2, ..., M}. The normalized target conflict degree of task n on a worker agent is:

ncon_{m,n} = (con_{m,n} − con_min) / (con_max − con_min)   (13)

According to the worker load and the target conflict degree, the heuristic information for target pre-assignment can be calculated as follows.
λ_{m,n} = β · nload_m + (1 − β) · ncon_{m,n}

In the process of target pre-assignment, the manager agent computes this heuristic information and assigns the current mission to a worker agent with a smaller λ_{m,n} value, where β is the weight of the worker load. In fact, both a high worker load and a high target conflict degree make target pre-assignment less effective. We therefore treat worker load and target conflict degree as affecting the pre-assignment scheme equally and set both weights to 0.5.
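The pre-assignment rule can be sketched as a greedy loop over targets. The data layout is an illustrative assumption, and resolving the normalization tie (max equal to min) to zero is our choice, a detail not specified here.

```python
def preassign(targets, workers, loads, conflicts, beta=0.5):
    """Greedy target pre-assignment: for each target n, evaluate the
    heuristic beta * nload_m + (1 - beta) * ncon_{m,n} over the worker
    agents that can observe it, and assign the target to the worker with
    the smallest value. Normalization follows Eq. (13).

    targets:   iterable of target ids, traversed in order
    workers:   target id -> list of worker ids with a visible window
    loads:     worker id -> current mission count (updated in place)
    conflicts: (worker id, target id) -> conflict degree con_{m,n}
    """
    def norm(v, lo, hi):
        return 0.0 if hi == lo else (v - lo) / (hi - lo)

    assignment = {}
    for n in targets:
        cand = workers[n]                        # workers visible to target n
        lmin, lmax = min(loads[m] for m in cand), max(loads[m] for m in cand)
        cmin = min(conflicts[(m, n)] for m in cand)
        cmax = max(conflicts[(m, n)] for m in cand)
        best = min(cand, key=lambda m: beta * norm(loads[m], lmin, lmax)
                   + (1 - beta) * norm(conflicts[(m, n)], cmin, cmax))
        assignment[n] = best
        loads[best] += 1                         # the worker's load grows
    return assignment
```

Updating the load after each assignment lets the heuristic spread targets across lightly loaded workers as the traversal proceeds.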

b: WORKER AGENT NEGOTIATION STRATEGY
After the manager agent completes the target pre-assignment, each worker agent calls the mission planning algorithm to plan the missions to be undertaken, and sends the mission planning scheme with the highest imaging revenue value to the manager agent as the optimal mission planning scheme. The manager agent integrates the mission planning schemes of each worker agent, forms the global mission planning scheme, and releases the set of unplanned missions. By interacting with the manager agent, each worker agent adjusts the individual solution to some extent, further plans unplanned missions, and continuously improves the imaging revenue value of individual planning. In this paper, the following two strategies are adopted to realize the interaction between each worker agent and the manager agent.
Strategy 1: worker agent serial activation strategy

After the manager agent releases the unplanned missions, the worker agents are activated in a certain order. After an activated worker agent obtains the unplanned mission sequence, it extends its original planned mission sequence and calls the mission planning algorithm to re-plan the missions and update its mission planning scheme, as shown in Fig.5. Usually, at the end of a round of negotiation, each worker agent holds a population of executable planning schemes. When the unplanned mission sequence released by the manager agent is obtained, the original scheme population needs to be reconstructed. Population reconstruction takes individuals from the population and forms new individuals by appending the acquired unplanned mission sequence. After a new round of planning, the population composed of these new individuals forms the updated scheme population of the worker agent. Fig.5 shows the updating process for one individual in the original scheme population, which extends the individual by directly appending the unplanned mission sequence released by the manager agent to the end of its sequence. In this paper, worker agents are allowed to abandon previously planned missions in order to obtain greater mission income. After the current worker agent completes its mission planning, if the income of the newly generated optimal mission planning scheme is higher than that of the worker agent's previous scheme, the worker agent sends the optimal scheme to the manager agent. The manager agent updates the planned sequence and unplanned sequence of the current worker agent according to this scheme, and activates the next worker agent. The above steps are repeated; when all worker agents have been activated, a round of negotiation is completed.
The manager agent generates the global mission planning scheme after this round of negotiation, and calculates the objective function value of the global planning scheme, namely the fitness value. If the fitness value of the global planning scheme does not increase for G max consecutive times, the algorithm exits and the current optimal global planning scheme is output.
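The serial activation strategy with the G_max stopping rule can be sketched as follows. Here `plan_missions` stands in for a worker agent's mission planning algorithm (in the paper, the hybrid discrete multi-verse optimizer), and the toy `demo_plan` and `demo_fitness` functions are purely illustrative.

```python
def serial_negotiation(workers, missions, plan_missions, fitness,
                       g_max=5, max_rounds=50):
    """Sketch of the serial activation strategy: workers are activated one
    by one; each re-plans its owned missions plus the currently unplanned
    ones, and the manager accepts the new plan only if its income improves.
    Negotiation stops once the global fitness has not improved for g_max
    consecutive rounds."""
    plans = {w: ([], 0.0) for w in workers}      # worker -> (missions, income)
    unplanned = set(missions)
    best_fit, stall = float("-inf"), 0
    for _ in range(max_rounds):
        for w in workers:                        # serial activation order
            owned, income = plans[w]
            new_owned, new_income = plan_missions(owned, sorted(unplanned))
            if new_income > income:              # accept only improvements
                unplanned |= set(owned) - set(new_owned)   # abandoned missions
                unplanned -= set(new_owned)                # newly planned ones
                plans[w] = (new_owned, new_income)
        fit = fitness(plans)
        if fit > best_fit:
            best_fit, stall = fit, 0
        else:
            stall += 1
            if stall >= g_max:                   # G_max rounds without gain
                break
    return plans, best_fit


def demo_plan(owned, unplanned):
    """Toy stand-in for a worker's planner: keep owned missions and take
    up to two unplanned ones; income is simply the mission count."""
    new = list(owned) + list(unplanned)[:2]
    return new, float(len(new))


def demo_fitness(plans):
    return float(sum(len(p[0]) for p in plans.values()))
```

With two workers and ten toy missions, each round each worker absorbs two more missions until the pool is exhausted, after which the fitness stalls and the G_max rule terminates the negotiation.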
Strategy 2: worker agent parallel activation strategy

Under this strategy, all worker agents are activated at the same time after the manager agent releases the unplanned mission sequence. Each worker agent adds the unplanned mission sequence to its original planned mission sequence and calls the mission planning algorithm to update its own planning scheme. As in Strategy 1, every worker agent may give up previously planned missions in order to obtain greater mission income. When all worker agents have sent their optimal plans to the manager agent, unlike in Strategy 1, the manager agent cannot directly update each worker agent's planned mission sequence according to its current optimal scheme. Because the worker agents are activated simultaneously, their mission planning executes concurrently, and different worker agents may plan the same mission at the same time. Directly updating the mission sequence of each worker agent would therefore result in repeated execution of some missions.
In this paper, the planning schemes of the worker agents are sorted based on the scheme disturbance rate and the income increase rate. The scheme disturbance rate refers to the degree of change of the new scheme relative to the original scheme; this paper considers only one kind of change: missions that could be executed before can no longer be executed in the new scheme. The income increase rate refers to the extent to which the imaging income of the new planning scheme improves on that of the original planning scheme. The lower the disturbance rate and the higher the income increase rate, the higher the ranking of the planning scheme.
The manager agent updates the mission planning sequence of each worker agent according to the ranking. The planned mission sequence of the first-ranked worker agent can be updated directly. When updating the planned mission sequences of the other worker agents, the mission planning scheme submitted by each worker agent must first be processed: missions already planned by higher-ranked worker agents are removed, and the income of the scheme is recalculated. If the income of the processed scheme is still higher than that of the worker agent's original scheme, the planned mission sequence is updated; otherwise, the original scheme is kept. After all worker agents have been traversed, a round of negotiation is completed. The manager agent generates the global mission planning scheme after this round of negotiation and calculates the fitness value of the global planning scheme. If the fitness value of the global planning scheme does not increase for G_max consecutive rounds, the algorithm exits and outputs the current optimal global planning scheme.
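The manager-side ranking and conflict resolution of Strategy 2 can be sketched as follows; the scheme fields and the `value` callback are illustrative assumptions, not the paper's data structures:

```python
def rank_and_merge(schemes, value):
    # Each scheme is a dict with: worker, old (set of missions), new (set of
    # missions), old_income, new_income. value(m) returns mission m's income.
    def disturbance(s):
        # fraction of previously executable missions dropped by the new scheme
        return len(s["old"] - s["new"]) / max(len(s["old"]), 1)

    def increase(s):
        # relative income improvement of the new scheme over the original one
        return (s["new_income"] - s["old_income"]) / max(s["old_income"], 1e-9)

    # lower disturbance first, then higher income increase
    ranked = sorted(schemes, key=lambda s: (disturbance(s), -increase(s)))
    claimed, result = set(), {}
    for s in ranked:
        missions = s["new"] - claimed            # drop missions taken by higher ranks
        inc = sum(value(m) for m in missions)
        if inc > s["old_income"]:                # still an improvement: accept
            result[s["worker"]] = missions
            claimed |= missions
        else:                                    # otherwise keep the original plan
            result[s["worker"]] = s["old"]
            claimed |= s["old"]
    return result
```

For example, if two workers both claim mission 3, the higher-ranked scheme keeps it and the other reverts to its original plan.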

V. HYBRID DISCRETE MULTI-VERSE OPTIMIZATION ALGORITHM
After each worker agent obtains the planning missions, how to carry out the mission planning is a key point. In this paper, a hybrid discrete multi-verse optimization algorithm is designed and applied to the mission planning phase of worker agent.

A. STANDARD MULTI-VERSE OPTIMIZATION ALGORITHM
The multiverse algorithm [34] originates from the multiverse theory, which holds that there are other universes besides the one we live in and that universes interact with each other. The multi-verse algorithm chooses three main concepts from the multi-verse theory as its inspiration: white holes, black holes and wormholes.
Every universe has an inflation rate, and the universe with a high inflation rate tends to produce white holes and send objects in the universe into them. The universe with a low inflation rate is more likely to produce black holes and receive objects from other universes through them. No matter the rate of inflation, all universes can exchange objects with other universes through wormholes. The multi-verse algorithm regards a universe as a feasible solution to a problem and regards the inflation rate of the universe as the fitness value of the solution. The algorithm can be divided into two phases: object transfer phase between universes based on the white hole/black hole mechanism and the movement to the optimal universe phase based on the wormhole mechanism.
During the object transfer phase between universes based on the white hole/black hole mechanism, objects can move between universes through white/black hole tunnels. When a white/black hole tunnel is built between two universes, the universe with a higher inflation rate is thought to have white holes, while the universe with a lower inflation rate is thought to have black holes. Objects are then transferred from the white hole in the source universe to the black hole in the target universe, raising the inflation rate of the target universe. During each iteration, a universe producing white holes is selected according to the sorted inflation rates of the universes through the roulette principle, and then the current universe is updated by the following equation.
$$
x_i^j =
\begin{cases}
x_k^j, & r_1 < NI(U_i) \\
x_i^j, & r_1 \ge NI(U_i)
\end{cases}
$$

where $x_i^j$ is the $j$th component of the $i$th universe, $NI(U_i)$ is the normalized inflation rate of the $i$th universe, $r_1$ is a random number in $[0, 1]$, and $x_k^j$ is the $j$th component of the $k$th universe selected by the roulette mechanism.
In the phase of moving to the optimal universe based on the wormhole mechanism, the algorithm believes that each universe has a certain probability of generating wormholes and randomly transporting objects through them in space. The algorithm assumes that wormhole tunnels are always built between the current universe and the current optimal universe. The mechanism is described as follows.
$$
x_i^j =
\begin{cases}
\begin{cases}
x^j + TDR \cdot \big((ub_j - lb_j) \cdot r_4 + lb_j\big), & r_3 < 0.5 \\
x^j - TDR \cdot \big((ub_j - lb_j) \cdot r_4 + lb_j\big), & r_3 \ge 0.5
\end{cases}, & r_2 < WEP \\
x_i^j, & r_2 \ge WEP
\end{cases}
$$

where $x^j$ represents the $j$th component of the current optimal universe, $r_2$, $r_3$ and $r_4$ are random numbers in $[0, 1]$, $lb_j$ and $ub_j$ are the lower and upper limits of the $j$th dimension respectively, $WEP$ is the probability of wormhole existence, and $TDR$ is the step size of the object moving toward the current optimal universe. The update equations of both are as follows.
$$
WEP = WEP_{\min} + l \cdot \frac{WEP_{\max} - WEP_{\min}}{L}
$$

where $WEP_{\min} = 0.2$, $WEP_{\max} = 1$, $l$ is the current iteration number, and $L$ is the maximum iteration number. To allow more development in the later iterations, WEP increases linearly with the number of iterations.
$$
TDR = 1 - \frac{l^{1/p}}{L^{1/p}}
$$

where $p$ is the development precision, set to 6. TDR decreases over the iterations to allow more precise local searches around the optimal universe.
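The two schedules can be computed as follows, using the parameter values given in the text ($WEP_{\min} = 0.2$, $WEP_{\max} = 1$, $p = 6$):

```python
def wep(l, L, wep_min=0.2, wep_max=1.0):
    # wormhole existence probability grows linearly with the iteration number l
    return wep_min + l * (wep_max - wep_min) / L

def tdr(l, L, p=6):
    # travelling distance rate shrinks so late moves stay near the best universe
    return 1 - (l ** (1 / p)) / (L ** (1 / p))
```

WEP thus rises from 0.2 to 1 over the run, while TDR falls from 1 toward 0, trading exploration for exploitation.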

B. DISCRETIZATION OF MULTIVERSE OPTIMIZATION ALGORITHMS
Like many other intelligent optimization algorithms, the standard multi-verse algorithm is only suitable for optimizing in a continuous solution space. However, mission planning of imaging satellite is a discrete optimization problem, and its solution space is discrete. In this section, the main phases of the standard multi-verse algorithm are improved and discretized to make them suitable for solving the imaging satellite mission planning problem.

1) ENCODING AND DECODING
In the multi-verse algorithm, each universe represents a feasible mission planning scheme, and each component in the universe represents the serial number of an imaging mission, as shown in Fig.6. Since the missions have already been decomposed onto individual worker agents, the planning scheme shown in Fig.6 represents the planning scheme of a single worker agent. The encoding method in this paper only specifies the relative execution order of imaging missions on a worker agent; whether a mission can actually be executed and the time window it occupies are determined during decoding. The specific steps are as follows: the mission numbers of the current planning scheme are taken out in turn, the mission's available time windows on the current satellite are traversed, and constraint detection is carried out on the windows successively. The period for the actual execution of the mission is fixed on the first time window that passes constraint detection, and the mission's decision variable is set to 1. If no time window passes constraint detection, the mission's decision variable is set to 0.
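Assuming time windows are given as `(start, end)` pairs and the constraint check is supplied by the caller, the decoding procedure can be sketched as:

```python
def decode(order, windows, feasible):
    # order: mission numbers in their planned execution order
    # windows: mission number -> candidate time windows as (start, end) pairs
    # feasible(schedule, window): constraint detection, supplied by the caller
    schedule, executed = {}, {}
    for m in order:
        executed[m] = 0                      # decision variable defaults to 0
        for w in windows.get(m, []):
            if feasible(schedule, w):
                schedule[m] = w              # first window passing the check wins
                executed[m] = 1
                break
    return schedule, executed
```

A simple non-overlap predicate already reproduces the behavior described above: a mission whose first window conflicts with the existing schedule falls through to its next candidate window.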

2) OBJECT TRANSFER PHASE BETWEEN UNIVERSES BASED ON THE WHITE HOLE/BLACK HOLE MECHANISM
Object transfer phase based on the white hole/black hole mechanism is the exploration phase of the algorithm, which aims to make the algorithm jump out of the local optimal solution and explore in the whole feasible solution space. At this phase, the standard multi-verse algorithm has the following two problems: Firstly, the standard multi-verse algorithm uses the roulette strategy to select the universes that produce white holes. However, the universe chosen in the roulette strategy has a high probability of being the best universe. The algorithm does not make good use of the information of other universes. Secondly, when updating each component of the current universe, the standard multi-verse algorithm will use the roulette strategy to select a universe that produces white holes. If the conditions are met, the corresponding components will be directly replaced. As a result, the components of the updated current universe come from multiple universes. This updating mechanism is suitable for low dimensional continuous optimization problems, but may not be suitable for high dimensional discrete optimization problems similar to satellite mission planning. In fact, as a better solution, there must be some information worth using in the universe producing white holes. For satellite mission planning problems, this information refers to some specific mission sequences. The single dimensional inheritance strategy of the standard multi-verse algorithm cannot inherit the information of a better universe, and even leads to a large number of non-integer elements in the current solution, which requires a lot of time to legalize the solution.
Based on the above analysis, the mutation operator of the differential evolution algorithm is used to improve this phase. Both DE/current/1 and DE/current-to-best/1 take the current individual as the basis vector, which realizes the development of the current individual and conforms to the characteristics of the first phase of the multi-verse algorithm. The difference is that the former uses a random individual to guide the current individual, which helps maintain population diversity but, lacking guidance from excellent individuals, converges slowly. The latter uses the current best individual to guide the current individual to search in the most promising direction; the guidance of the best individual accelerates convergence, but it also reduces population diversity and can trap the algorithm in a local optimum. This paper draws on the idea of the elite population [35] and adopts the mutation strategy DE/current-to-elite/1 based on elite population guidance. Its equation is expressed as follows.
$$
V_i = X_i + F \cdot (X_{elite} - X_i) + F \cdot (X_{r1} - X_{r2})
$$

where $X_i$ is the current individual, $F$ is the scaling factor, $X_{r1}$ and $X_{r2}$ are distinct individuals randomly selected from the population, and $X_{elite}$ is an individual randomly selected from the elite population, which consists of the top $q \cdot N$ individuals in the population. $N$ is the population size, and the parameter $q$, a number in $[0, 1]$, controls the size of the elite population. The value of $q$ has a great influence on the performance of the mutation operator: when $q$ approaches 0, $X_{elite}$ tends to be the globally optimal individual, and when $q$ approaches 1, $X_{elite}$ tends to be a random individual in the population. In the early iterations, a large $q$ is needed to maintain population diversity, while in the late iterations, a small $q$ is needed to accelerate convergence. Based on this, the value of $q$ changes adaptively with the number of iterations, according to the following update equation.
$$
q = 0.9 - 0.8 \cdot \frac{l}{L}
$$

where $l$ is the current iteration number and $L$ is the maximum iteration number. As the number of iterations increases, the value of $q$ gradually decreases from 0.9 to 0.1.
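A sketch of the DE/current-to-elite/1 mutation with the adaptive elite pool, assuming real-valued individuals for brevity (the paper applies the discretized form to mission sequences) and taking higher fitness as better; the linear decay of q is an assumed schedule consistent with the stated endpoints:

```python
import random

def adaptive_q(l, L, q_max=0.9, q_min=0.1):
    # q decays from 0.9 to 0.1 over the run (a linear schedule is assumed here)
    return q_max - (q_max - q_min) * l / L

def mutate_current_to_elite(pop, fit, i, F, q):
    # DE/current-to-elite/1: V = X_i + F*(X_elite - X_i) + F*(X_r1 - X_r2)
    order = sorted(range(len(pop)), key=lambda k: fit[k], reverse=True)
    elite = pop[random.choice(order[:max(1, int(q * len(pop)))])]
    r1, r2 = random.sample(range(len(pop)), 2)
    return [x + F * (e - x) + F * (a - b)
            for x, e, a, b in zip(pop[i], elite, pop[r1], pop[r2])]
```

With q near 0 the elite pool shrinks to the single best individual; with q near 1 it spans the whole population, recovering the diversity-preserving behavior of random guidance.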

3) THE MOVEMENT TO THE OPTIMAL UNIVERSE PHASE BASED ON THE WORMHOLE MECHANISM
The phase of moving toward the optimal universe based on the wormhole mechanism is the development phase of the algorithm; its essence is to search around the current optimal solution in both the positive and negative directions. The standard multi-verse algorithm is designed for continuous solution spaces, so the optimization strategy of this second phase, like that of the first phase, cannot be applied directly to discrete optimization problems such as satellite mission planning. Direct application would generate a large number of non-integer elements in the current solution, which cannot be decoded; forcibly legalizing the solution not only consumes time but also fails to reflect the character of the original strategy. In the differential evolution algorithm, both the DE/best/1 and DE/best/2 mutation strategies take the current optimal solution as the basis vector, realizing local development around the optimal solution, which matches the optimization idea of the second phase of the standard multi-verse algorithm. Moreover, they are similar in form to the improved strategy of the first phase, so the discrete strategy designed in this paper can be applied directly without generating illegal solutions. This paper therefore adopts the DE/best/1 and DE/best/2 mutation strategies to improve this phase. The improved phase of moving toward the optimal universe based on the wormhole mechanism is described as follows.
When $r_2 < WEP$, the current universe moves toward the optimal universe using DE/best/1 if $r_3 < 0.5$ and DE/best/2 otherwise:

$$
V_i = X_{best} + F \cdot (X_{r1} - X_{r2}) \qquad \text{(DE/best/1)}
$$

$$
V_i = X_{best} + F \cdot (X_{r1} - X_{r2}) + F \cdot (X_{r3} - X_{r4}) \qquad \text{(DE/best/2)}
$$

where $X_i$ is the current individual, $X_{best}$ is the current optimal individual, $F$ is the scaling factor, $r_2$ and $r_3$ are random numbers in $[0, 1]$, and $X_{r1}$, $X_{r2}$, $X_{r3}$ and $X_{r4}$ are distinct individuals randomly selected from the population.

4) DISCRETIZATION PROCESSING
Equations (19) and (21) are suitable for continuous optimization problems and cannot be directly applied to discrete optimization problems such as satellite mission planning. Drawing on the research in literature [36], this paper takes $Z = X + F \cdot (Y_1 - Y_2)$ as an example to redefine the operators in the update equation.
Step1: Subtract $Y_1$ and $Y_2$ to obtain the difference vector $\Delta$. The specific operation rule is as follows: compare the corresponding elements of $Y_1$ and $Y_2$ in turn; if they are not equal, take the element of $Y_1$ as the element at the corresponding position of $\Delta$; if they are equal, set the element at the corresponding position of $\Delta$ to 0.
Step2: Perform the $\cdot$ operation on $F$ and $\Delta$ to obtain the vector $\Phi$. Randomly generate a sequence of the same length as $\Delta$, with each element drawn from $[0, 1]$. If rand $< F$, take the element of $\Delta$ as the element at the corresponding position of $\Phi$; otherwise, set the element at the corresponding position of $\Phi$ to 0.
Step3: $Z = X + \Phi$ is computed as follows: set $k = 1$; if $\phi(k) = 0$, set $k = k + 1$; otherwise, find the position $k'$ of the mission numbered $\phi(k)$ in $X$, exchange the missions at positions $k$ and $k'$ in $X$, update $X$, and set $k = k + 1$; if $k \le N$ ($N$ is the length of the vector), repeat the previous step; otherwise, output $Z = X$.
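The three steps above can be sketched directly in Python; the function names and the use of 0 as the empty marker are illustrative choices, assuming mission numbers are positive integers:

```python
import random

def discrete_difference(y1, y2):
    # Step 1: keep Y1's element where the sequences differ, otherwise 0
    return [a if a != b else 0 for a, b in zip(y1, y2)]

def scale(delta, F, rng=random.random):
    # Step 2: keep each element with probability F, otherwise zero it out
    return [d if rng() < F else 0 for d in delta]

def apply_moves(x, phi):
    # Step 3: for each non-zero element of phi, swap it into that position of X
    z = list(x)
    for k, m in enumerate(phi):
        if m == 0:
            continue
        k2 = z.index(m)              # position k' of mission m in the current Z
        z[k], z[k2] = z[k2], z[k]
    return z
```

For example, with X = [3, 1, 2] and Φ = [1, 0, 0], mission 1 is swapped to the front, giving Z = [1, 3, 2]; the result is always a valid permutation, so no legalization step is needed.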

C. THE FLOW OF HYBRID DISCRETE MULTI-VERSE OPTIMIZATION ALGORITHM
Based on the above description, the general flow of the hybrid discrete multi-verse optimization algorithm is as follows.
Step1: Initialize the multi-verse population and the algorithm parameters;
Step2: Calculate the inflation rate (fitness value) of each universe, normalize the inflation rates, and update the best universe;
Step3: Traverse every universe individual, calculate WEP, and generate random numbers r_1, r_2 and r_3;
Step4: If r_1 < NI(U_i), perform the object transfer phase based on the white/black hole mechanism to generate universe U'_i. If the inflation rate of U'_i is greater than that of U_i, update the current universe; otherwise, do not update it. If r_1 ≥ NI(U_i), go to the next step;
Step5: If r_2 < WEP: when r_3 < 0.5, the current universe moves toward the optimal universe based on the DE/best/1 strategy to generate universe U'_i; when r_3 ≥ 0.5, it moves toward the optimal universe based on the DE/best/2 strategy to generate U'_i. If the inflation rate of U'_i is greater than that of U_i, update the current universe; otherwise, do not update it. If r_2 ≥ WEP, go to the next step.
Step6: If the maximum number of iterations is reached, output the current optimal solution and end the algorithm; otherwise, return to Step2.
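Putting Steps 1-6 together, the overall flow can be sketched on a toy continuous maximization problem (the paper's version instead manipulates mission permutations through the discretized operators of Section V-B; the population size, bounds, and schedules below are illustrative):

```python
import random

def hybrid_mvo(fitness, dim, n=20, L=100, F=0.5, seed=1):
    # Toy continuous sketch of the hybrid discrete multi-verse flow (maximization).
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    fit = [fitness(x) for x in pop]
    for l in range(L):
        wep = 0.2 + l * (1.0 - 0.2) / L                     # Step 3: WEP schedule
        best = pop[max(range(n), key=lambda i: fit[i])][:]
        lo, hi = min(fit), max(fit)
        ni = [(f - lo) / (hi - lo + 1e-12) for f in fit]    # normalized inflation
        for i in range(n):
            r1, r2, r3 = rng.random(), rng.random(), rng.random()
            if r1 < ni[i]:
                # Step 4: exploration via DE/current-to-elite/1, adaptive q
                q = 0.9 - 0.8 * l / L
                order = sorted(range(n), key=lambda k: fit[k], reverse=True)
                elite = pop[rng.choice(order[:max(1, int(q * n))])]
                a, b = rng.sample(range(n), 2)
                cand = [x + F * (e - x) + F * (pa - pb)
                        for x, e, pa, pb in zip(pop[i], elite, pop[a], pop[b])]
            elif r2 < wep:
                # Step 5: development via DE/best/1 (r3 < 0.5) or DE/best/2
                a, b, c, d = rng.sample(range(n), 4)
                cand = [best[j] + F * (pop[a][j] - pop[b][j]) for j in range(dim)]
                if r3 >= 0.5:
                    cand = [cand[j] + F * (pop[c][j] - pop[d][j]) for j in range(dim)]
            else:
                continue
            cf = fitness(cand)
            if cf > fit[i]:                                 # greedy replacement
                pop[i], fit[i] = cand, cf
    k = max(range(n), key=lambda i: fit[i])
    return pop[k], fit[k]
```

On a simple sphere objective the sketch steadily improves the best universe, illustrating how the two DE-based phases cooperate under the WEP gate.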

D. ANALYSIS OF THE ALGORITHM COMPLEXITY
The standard multi-verse algorithm mainly includes parameter setting, population initialization, the object transfer phase based on the white hole/black hole mechanism (exploration phase), and the phase of moving toward the optimal universe based on the wormhole mechanism (development phase). Let the population size be N and the maximum number of iterations be T_max. It is important to note that not all individuals in the population perform the exploration and development phases in every iteration; only individuals that meet the conditions set in the algorithm do so. Here, we consider the most complex case, in which all individuals in each generation participate in both phases. The computational complexity of executing the exploration phase and the development phase once each is therefore O(2N). The total computational complexity of the algorithm is then O(2N · T_max) + O(N) + O(1) = O(N · T_max), where O(2N · T_max) is the complexity of T_max iterations of the two search phases, O(N) is the complexity of population initialization, and O(1) is the complexity of parameter setting. The hybrid discrete multi-verse algorithm proposed in this paper inherits the basic architecture and the two main optimization phases of the standard multi-verse algorithm, and mainly improves the formulas of those two phases.
In fact, compared with the original formulas, the improved formulas introduce no extra computational burden, in particular no additional loop operations, which have the greatest influence on the computational complexity of the algorithm. Therefore, the computational complexity of the hybrid discrete multi-verse algorithm is of the same order as that of the standard multi-verse algorithm and can likewise be expressed as O(N · T_max). The hybrid discrete multi-verse algorithm thus does not improve the solution quality at the cost of increased computational complexity.

VI. SIMULATION EXPERIMENT
In order to test the effectiveness of the proposed distributed imaging satellite mission planning method (DISMPM) based on multi-agent system, mission sets of different sizes are used in this section for simulation experiments. Due to the complexity of satellite mission planning problems, there is no recognized benchmark set for imaging satellite mission planning; the mission sets and imaging satellites used in this paper are constructed by calling STK interface functions in MATLAB. A Walker constellation consisting of 12 imaging satellites is established. The satellites adopt sun-synchronous orbits and are evenly distributed in 4 orbital planes. In practice, imaging satellites usually adopt a sun-synchronous orbit, which makes the sun shine at the same angle every time the satellite flies over a given place, excluding the influence of solar illumination on imaging; this helps demonstrate the extensibility of the proposed method to real-world problems. The satellite's position in space is defined by six orbital parameters: semimajor axis (SA), eccentricity (E), inclination (I), right ascension of the ascending node (RAAN), argument of periapsis (AP), and true anomaly (TA). The orbital parameters of the seed satellite are shown in Table 2. A total of six mission sets are generated, with sizes of 100, 150, 200, 250, 300 and 400 respectively. All missions are randomly generated in the range of 70°E, and the mission incomes are randomly generated in the range of 1 to 10. The simulation scenario starts at 18 Apr 2023 04:00:00 (UTCG) and lasts for one day.

A. CONVERGENCE ANALYSIS OF DISMPM
When the mission scale is 300, DISMPM is run to record the convergence of the global fitness value and the income value of each satellite under the two strategies (DISMPM1 uses the worker agent serial activation strategy, and DISMPM2 uses the worker agent parallel activation strategy); the results are shown in Fig.7. Fig.7 (a) and Fig.7 (c) respectively show the iteration curves of the global fitness value with the number of negotiations under the serial and parallel activation strategies. Fig.7 (b) and Fig.7 (d) respectively show the iteration curves of the income value of a single worker agent with the number of negotiations under the serial and parallel activation strategies. It can be seen that when the mission scale is 300, the normalized global fitness values of DISMPM1 and DISMPM2 converge to about 0.95 after several negotiations, which shows the effectiveness of the distributed mission planning method proposed in this paper. DISMPM deploys mission planning nodes on each satellite involved in planning, decomposing large-scale imaging missions onto individual satellites. This effectively reduces the complexity of solving the problem and the pressure on the optimization ability of the algorithm. In addition, thanks to the mission pre-assignment strategy in this paper, DISMPM1 and DISMPM2 have relatively high global fitness values at the start of negotiation. In the subsequent negotiation process, local adjustment of missions is realized between satellite agents through interaction, so that the missions undertaken by each agent become more reasonable and the global fitness value is further improved. It can also be seen that the planning income value of some worker agents always stays at a high level, which further verifies the effectiveness of the mission pre-assignment strategy designed in this paper.
This strategy enables some worker agents to obtain more appropriate missions in the initial mission pre-assignment, and thus higher planning income values. Since the planning income of a single worker agent in both strategies adopts the ''non-decreasing'' updating mode, it gradually converges to a relatively stable value as the number of negotiations increases. Accordingly, the global fitness value also converges to a fixed value as the number of negotiations increases, until the algorithm meets the exit condition. This indicates that the proposed DISMPM can continuously improve the planning scheme of each worker agent and finally ensure the convergence of the fitness value of the global plan.

B. COMPARISON WITH CENTRALIZED PLANNING
In order to verify the solution performance of the proposed method, it is compared with the centralized improved genetic algorithm (CIGA) [37], the centralized improved discrete particle swarm optimization algorithm (CIDPSO) [38] and the centralized simulated annealing algorithm (CSA) under different problem scales. Each algorithm is run 20 times on each problem scale to record the best fitness value (BF) and the average fitness value (AF), and the experimental results are shown in Table 3 and Fig.8.
It can be seen that the average fitness values and the best fitness values of the two methods proposed in this paper are better than those of the other three centralized solving algorithms at all problem scales. In particular, as the problem scale gradually increases, satellite resources become strained, conflicts between missions grow, the solving complexity of the centralized algorithms increases sharply, and their optimization ability decreases significantly. The distributed solution method decomposes the large-scale planning problem to the level of a single satellite, which reduces the complexity of solving the problem to a certain extent and preserves optimization ability compared with the centralized algorithms. In general, the optimization capabilities of the different methods, from strong to weak, are DISMPM2, DISMPM1, CIDPSO, CIGA and CSA. It can also be seen from Fig.8 that DISMPM2 has better optimization capability than DISMPM1. In fact, the rationality of mission redistribution affects the quality of the planning scheme. DISMPM1 uses a serial activation strategy, which means that only one worker agent is activated when the manager agent releases unplanned missions. After the currently activated worker agent plans the unplanned missions, the manager agent needs to update the sequence of unplanned missions before activating the next worker agent. This can cause problems. From a global perspective, mission M may be more suitable for worker agent B to complete, but worker agent A is activated before worker agent B, and worker agent A adds mission M to its planning scheme based on the ''non-decreasing'' principle of planning income. As a result, although mission M is more suitable for worker agent B, worker agent B has no opportunity to plan it. DISMPM2 uses a parallel activation strategy.
After the manager agent releases the unplanned missions, all worker agents can simultaneously plan all unplanned missions based on their own states. This strategy gives worker agent B the opportunity to perform mission M, thus achieving a more reasonable mission redistribution. Therefore, DISMPM2 has better optimization capability than DISMPM1.

C. COMPARISON WITH CONTRACT NETWORK ALGORITHM
Another important evaluation index of distributed imaging satellite mission planning scheme is the system traffic. Since the satellites need to achieve reasonable assignment of missions through negotiation, communication links need to be established between satellites for communication. Excessive traffic will put great communication pressure on the system. In this paper, the traffic volume of the system is measured by the number of negotiations, and the number of negotiations of DISMPM is compared with that of the contract network algorithm (CA). Each algorithm is run 20 times on each problem scale to record the average number of negotiations (ANN) and the minimum number of negotiations (MNN), and the experimental results are shown in Table 4 and Fig. 9.
As can be seen from Table 4 and Fig.9, the number of negotiations of the proposed DISMPM is obviously smaller than that of the contract network algorithm at all problem scales. Across the different problem scales, the average number of negotiations of DISMPM1 is reduced by 89.0%, 92.7%, 91.5%, 91.9%, 91.4% and 92.1%, respectively, compared with the CA algorithm, and the average number of negotiations of DISMPM2 is reduced by 88.6%, 91.6%, 89.8%, 87.4%, 86.8% and 88.2%, respectively. As the problem scale grows, the complexity of solving the problem also increases, which leads to more negotiations. In the contract network algorithm, missions are processed individually: in a round of negotiation, the manager agent decides the ownership of the current mission according to the global income, and even if all worker agents participate, only one worker agent has the opportunity to perform the mission. Therefore, the number of negotiations in the contract network algorithm is tied to the mission scale and equals the number of missions. This strategy works well when the mission scale is small; however, when the mission scale increases, even though the contract network algorithm can obtain a good planning scheme, the large communication burden on the system makes the strategy difficult to apply in practice. DISMPM processes all unplanned missions at the same time in each round of negotiation, and all worker agents participating in the negotiation have the opportunity to plan unplanned missions. Compared with the contract network algorithm, DISMPM has higher negotiation efficiency, effectively reduces the number of negotiations, and reduces the communication burden on the system.
It can also be seen from Fig.9 that DISMPM1, with the serial worker activation strategy, requires fewer negotiations than DISMPM2, with the parallel worker activation strategy. This is because under the serial activation strategy, worker agents plan the current unplanned missions in turn and never compete for the same mission, which makes negotiation more efficient. Under the parallel activation strategy, however, different worker agents may compete for the same mission and plan it at the same time. As a result, other missions that could otherwise be planned cannot obtain visible window resources in the current round of negotiation, which increases the number of negotiations.

VII. CONCLUSION
With the development of intelligent satellites, distributed imaging satellite mission planning will be more widely used. In this work, we propose a method named DISMPM to solve the distributed mission planning problem of imaging satellites. Our main contributions are as follows. Firstly, a distributed imaging satellite mission planning model with variable collaborative division of labor is constructed based on multi-agent system theory; the model defines the intelligence level of the satellites in the satellite cluster and the interaction mode between satellites. Secondly, the cooperation mechanism between satellite agents is established based on the blackboard model, and two negotiation strategies, worker serial activation and worker parallel activation, are proposed; to improve the efficiency of negotiation, the targets are pre-assigned before negotiation. Finally, a hybrid discrete multi-verse optimization algorithm is proposed to solve the mission planning problem of the worker agents in the model. In the simulation experiments, comparative experiments demonstrate the effectiveness of DISMPM in terms of two indexes: the quality of the planning scheme and the system traffic. In terms of planning quality, DISMPM effectively improves the quality of the planning scheme compared with the centralized solution methods; in particular, its advantages become more obvious as the problem scale increases. The average fitness values obtained by DISMPM with the serial activation strategy and DISMPM with the parallel activation strategy are up to 20% higher than those obtained by the best-performing centralized algorithm in this paper. In terms of system traffic, the average number of negotiations of the two DISMPM methods is reduced by more than 80% compared with the contract network algorithm, effectively reducing the system traffic and the communication burden.