Modeling and Solution of Course of Action Generation Based on IN-DEA

Course of action (COA) generation is the process of designing the task-planning flow and is one of the key supporting techniques for a military organization executing an aerial operation. Influence nets (IN) have an outstanding advantage in describing the causal logic between random variables, so this paper introduces IN into a fast-building mechanism for COA models. A COA generation model based on influence nets is established, and a probability propagation method based on influence intensity is designed. An improved differential evolution algorithm (IDEA) is designed to solve the optimal action-policy selection model. Finally, simulation results for an aerial attack campaign case show that optimal action-policy selection under various influence constants improves the capability of cause-effect modeling, and that the improved differential evolution algorithm has good convergence and optimization capability.


Introduction
A Course of Action (COA) is an overall, ordered set of behaviors by which the joint commanding organization of the armed services rationally plans the combat troops' actions, allocates the task sequence, and schedules military resources to achieve its operational objectives according to the analysis of battlefield-related information; it is a crucial component of the joint operational plan. Because the action plan is essentially a process of optimizing the COA to achieve the overall mission, it can be seen as a directed evolution process toward the combat mission: according to the combat process, a series of COAs is selected so that the strength of the organization or cluster can achieve the desired evolutionary result. Probabilistic networks can represent the interrelationships between variables in a simple and effective way and have become a powerful tool with broad application prospects in uncertainty reasoning systems [1]. In COA generation, probabilistic networks have prominent advantages in describing the association relations among COA elements, so they have gradually become the main tool for COA modeling and generation in recent years. The most representative probabilistic networks are Markov networks (MN), Bayesian networks (BN) and influence networks (IN). In the practice of COA modeling, BN and MN have two limitations: probabilistic reasoning is difficult, and a large number of conditional probabilities must be specified. The former leads to low time efficiency in computing variable probabilities, and the latter makes modeling complex when little information is available [2]. IN, by contrast, quantifies cause-effect relations with influence intensities rather than full conditional probability tables, which eases both reasoning and modeling. The desired combat goal is the ultimate operational goal of the joint air operation, and the desired combat effect is to maximize the joint probability of achieving the operational goals; the expected goal set under a given battlefield situation is defined accordingly.

Generation Model of COA Based on the Influence Network
In order to effectively describe the influencing model based on influence intensity, the enemy and friendly actions in joint air operations are treated in a unified way. For a given state vector x_n, let the occurrence probability of Action B be P(B | x_n), and let P(B) denote the baseline probability of Action B. The influence intensity h_n(x_n) ∈ [−1, 1] quantifies the influence, with h_n(x_n) = 1, 0, −1 corresponding to P(B | x_n) = 1, P(B), 0, respectively. Using the influence intensity h_n(x_n) and P(B), the conditional probability of the occurrence of Action B is deduced as:

P(B | x_n) = P(B) + h_n(x_n) · (1 − P(B)), if h_n(x_n) ≥ 0;
P(B | x_n) = P(B) · (1 + h_n(x_n)), if h_n(x_n) < 0.

Therefore, the influence network model can be formally described as (V, E, C, A, B), where V is the node set of the model and each node is a 0-1-valued random variable; E is the directed-edge set of the model and describes the cause-effect influence between nodes; and C is the set of cause-effect influence intensities between nodes, in which g represents the influence intensity of Action A on Action B when A is implemented or not (true or false), respectively.
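As an illustrative check of the piecewise mapping above, a minimal sketch in Python (assuming the standard influence-net form, with h = 1, 0, −1 anchoring P(B | x) at 1, P(B), 0 as stated in the text):

```python
def conditional_probability(p_b, h):
    """Map a baseline probability P(B) and an influence intensity
    h in [-1, 1] to the conditional probability P(B | x_n).

    h = +1 forces the probability to 1, h = 0 leaves P(B) unchanged,
    and h = -1 forces it to 0, matching the three anchor values.
    """
    if not -1.0 <= h <= 1.0:
        raise ValueError("influence intensity must lie in [-1, 1]")
    if h >= 0:
        return p_b + h * (1.0 - p_b)   # positive influence raises P(B)
    return p_b * (1.0 + h)             # negative influence lowers P(B)
```

For example, a baseline P(B) = 0.5 with intensity h = 0.5 yields P(B | x) = 0.75, halving the remaining distance to certainty.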
In this way, the cause-effect relationship model of the variables in the course of setting up a COA through the influence network quantifies and models the cause-effect relations among the basic actions, the enemy actions, the stage goals and the expected operational targets, using influence intensity values to quantify and describe the degree of cause-effect influence. In the model, the combat-target node and the stage-target nodes influence each other in the course of the combat, and the current state of a node is associated with its state at the previous moment. Under the constraints of the entity resources and the time resource, the best COA π* is the one that best achieves the combat goal, described as follows: under the implementation of the best COA π*, the initial state TC(t_0) is finally converted to the goal state TC(t_T); if an action program π is followed to implement the combat, the consumption of combat resources must be within the total resource limit R_TC, and the completion time must be within the set combat time limit T_TC.

Probability Propagation Calculation in the Influence Network
Probabilistic propagation calculation is essentially a probabilistic reasoning mechanism [3]. The use of this reasoning mechanism requires that the nodes linked by probability conduction be independent. The probabilities of the root nodes change as the combat proceeds; at each new stage, the new probabilities influence the other nodes. Therefore, it is necessary to update the root-node probabilities and then calculate and update the probabilities of the other nodes layer by layer.
Step 1: For the constructed influence network G = (V, E), select a root node r ∈ V and use the Dijkstra method to calculate the minimum distance d(r, v) from r to each node v, which assigns each node its layer number L.
Step 2: If t_k < t_T, update the layer number L and the prior probabilities of the root nodes of the influence network, update the influence intensity values between the root nodes and their child nodes, and go to Step 3. Otherwise, end the calculation and output the independent probability values and joint probability values of the child nodes.
Step 3: Calculate the conditional probability distributions of the child nodes. Given the influence values h(x), the influence of the random state vector x_n on a target child node B is h_n(x_n), and the conditional probability P(B | x_n) of the child node B is calculated from the aggregated positive and negative influence intensity values.
Step 4: Calculate the occurrence probability of each child node. Given the conditional probability P(B | x_n) of the child node B and the prior probabilities P(x_n) of its parent nodes, and assuming the occurrence probabilities of the parent nodes are mutually independent, the total probability formula gives P(B) = Σ_{x_n} P(B | x_n) P(x_n).
Step 5: Determine whether the occurrence probabilities of every layer of nodes have been calculated; if not, set L ← L + 1 and return to Step 2; if completed, update the time period to [t_k, t_{k+1}] and return to Step 2.
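The steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the layer structure is taken as given (a BFS/Dijkstra pass with unit edge weights would produce it, as in Step 1), and the rule for pooling several parent intensities into one aggregated value is an assumption.

```python
from itertools import product
from math import prod

def cond_prob(p_b, h):
    """Map baseline P(B) and aggregated intensity h in [-1, 1]
    to the conditional probability P(B | x)."""
    return p_b + h * (1.0 - p_b) if h >= 0 else p_b * (1.0 + h)

def aggregate(hs):
    """Pool several parent intensities into one value (an assumption:
    positive and negative intensities are combined separately, then summed)."""
    h_pos = 1.0 - prod(1.0 - h for h in hs if h > 0)
    h_neg = prod(1.0 + h for h in hs if h < 0) - 1.0
    return max(-1.0, min(1.0, h_pos + h_neg))

def propagate(layers, parents, influence, prior, baseline):
    """Layer-by-layer sweep (Steps 2-5). layers[0] holds the root nodes;
    influence[(p, c)] = (h_true, h_false) is the intensity of parent p
    on child c when p occurs / does not occur."""
    prob = dict(prior)                      # root-node priors (Step 2)
    for layer in layers[1:]:
        for child in layer:
            pa = parents[child]
            total = 0.0
            # total probability over every joint parent state (Step 4),
            # assuming parents occur independently
            for state in product([True, False], repeat=len(pa)):
                p_state = prod(prob[p] if s else 1.0 - prob[p]
                               for p, s in zip(pa, state))
                hs = [influence[(p, child)][0 if s else 1]
                      for p, s in zip(pa, state)]
                total += cond_prob(baseline[child], aggregate(hs)) * p_state
            prob[child] = total
    return prob
```

For a single edge A → B with P(A) = 0.6, P(B) = 0.4 and intensities (0.5, −0.5), the sweep gives P(B) = 0.6·0.7 + 0.4·0.2 = 0.5.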

Coding Form of the Individual
The feasible strategy set is composed of the basic actions; each individual encodes one feasible strategy drawn from this set.
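As an illustration only, one plausible encoding consistent with the case scenario (10 basic actions, six stages) is a 0-1 matrix; the source does not specify the exact coding form, so this layout is an assumption:

```python
import numpy as np

N_ACTIONS = 10   # basic actions A1..A10 from the case scenario
N_STAGES = 6     # combat stages [t0, t1] .. [t5, t6]

def random_individual(rng):
    """One candidate strategy as a 0-1 matrix: entry (i, k) = 1 means
    basic action A(i+1) is scheduled in stage k. The matrix form is
    illustrative, not the paper's stated encoding."""
    return rng.integers(0, 2, size=(N_ACTIONS, N_STAGES))
```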

Fitness Function
For the evaluation of the preferred COA, the ultimate goal of the joint air operations is to achieve the desired target intention; therefore, the program that maximizes the probability of the target effect is the best COA. In an uncertain environment, in order to accurately estimate the natural state of the enemy actions E, new random observations can be used to obtain new signals, yielding M simulation samples f_1, …, f_M of the joint probability value f. The joint probability can then be evaluated by calculating the mean μ and variance σ² of these samples. For the joint air operations it is desirable to maximize the joint probability of the combat target, so a signal-to-noise ratio (SNR) integrating the mean μ and the variance σ² is defined to replace the single joint probability f, and this SNR is used as the fitness function.
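A sketch of such a fitness evaluation. The exact SNR formula is not legible in the source; the Taguchi larger-the-better form below is an assumption that rewards a high mean joint probability and penalizes spread across the M samples:

```python
import numpy as np

def snr_fitness(f_samples):
    """Larger-the-better SNR over M simulated joint-probability samples
    f_1..f_M. The Taguchi form used here is an assumption; it grows with
    the mean of the samples and shrinks as their spread increases."""
    f = np.asarray(f_samples, dtype=float)
    return -10.0 * np.log10(np.mean(1.0 / f ** 2))
```

A COA whose simulated joint probability is consistently high thus scores better than one with the same mean but larger variance.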

Differential Mutation
Regardless of whether the individual is feasible before and after the mutation, the mutant individual is generated by perturbing a base individual with a scaled difference vector. The size of the disturbance is positively correlated with the absolute value of the difference vector: when the population approaches the optimal value, the difference vectors shrink and the disturbance also becomes smaller.
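The mutation formula itself is not reproduced in the source; the standard DE/rand/1 scheme, which has exactly the stated property that the disturbance shrinks with the difference vectors, can be sketched as:

```python
import numpy as np

def de_mutate(pop, F, rng):
    """DE/rand/1 mutation (the standard scheme; the paper's exact variant
    is not shown). Each mutant is a base vector x_r1 plus the scaled
    difference F * (x_r2 - x_r3); as the population contracts around an
    optimum, the differences and hence the disturbance shrink."""
    n = len(pop)
    out = np.empty_like(pop)
    for i in range(n):
        r1, r2, r3 = rng.choice([j for j in range(n) if j != i],
                                size=3, replace=False)
        out[i] = pop[r1] + F * (pop[r2] - pop[r3])
    return out
```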

Cross Operation
In the formula, rand(0,1) is a random value between 0 and 1, and CR is a crossover constant whose value is a real number in [0, 1]. The probability of crossover increases as the CR value increases. The crossover operation ensures that at least one element is exchanged between individuals, thus creating new individuals and preventing the evolution of the population from stalling.
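A sketch of binomial crossover with the guaranteed exchange of at least one element (standard DE practice; the forced index below is an implementation detail assumed here):

```python
import numpy as np

def de_crossover(target, mutant, CR, rng):
    """Binomial crossover: each gene comes from the mutant with
    probability CR; one randomly chosen index is always taken from
    the mutant, so at least one element is exchanged."""
    d = target.size
    mask = rng.random(d) < CR
    mask[rng.integers(d)] = True    # guarantee one mutant gene
    return np.where(mask, mutant, target)
```

With CR = 0 the trial vector still differs from the target in exactly one position, which is what keeps the population from stagnating.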

Selection Operation with Learning
The selection operation in IDEA is a preference model based on a "greedy mechanism", in which the fitness value of an individual is the only reference standard. If and only if the fitness value of the trial individual v_i is better than that of the target individual is the new individual selected into the next generation; otherwise, the target individual is retained for the iterative calculation and continues to undergo mutation and crossover. IDEA's selection strategy is strict: the parent individual and the offspring candidate compete against each other and the better one is kept, so the offspring individuals are never inferior to the parent individuals, driving the population toward the optimum along the shortest path. After ranking, the common genes of the leading chromosomes are identified to characterize the fine individuals, and the gene bits of the top-ranked chromosomes are marked. When a random number rand(0,1) is less than the individual learning probability p_l, the dominant common genes replace the corresponding gene bits of the lower-ranked individuals, enhancing the convergence rate of the algorithm.
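The greedy selection plus the learning step can be sketched as follows; the choice of top_k leading chromosomes and the "identical gene bit" criterion for common genes are assumptions, since the source describes the learning operator only qualitatively:

```python
import numpy as np

def select_with_learning(pop, trials, fitness, p_l, rng, top_k=3):
    """Greedy one-to-one selection, then learning: gene positions on
    which the top_k ranked survivors agree are copied into lower-ranked
    survivors with probability p_l. top_k and the agreement criterion
    are illustrative assumptions."""
    fit_pop = np.array([fitness(x) for x in pop])
    fit_tri = np.array([fitness(x) for x in trials])
    keep = fit_tri > fit_pop                        # greedy: better trial survives
    new_pop = np.where(keep[:, None], trials, pop)
    new_fit = np.where(keep, fit_tri, fit_pop)
    order = np.argsort(-new_fit)                    # rank, best first
    elite = new_pop[order[:top_k]]
    common = np.all(elite == elite[0], axis=0)      # shared gene bits
    for idx in order[top_k:]:
        if rng.random() < p_l:                      # learning probability p_l
            new_pop[idx, common] = elite[0, common]
    return new_pop
```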

Case Scenario
In order to organize and apply all kinds of air power in a unified way, the scenario is to destroy most of the important targets on an island reef, force the enemy to abandon the island, and thus remove the barrier to the friendly forces' landing operations. In order to ensure that the operational intent is achieved, the air force will carry out the following actions: A1, air cover for the air attack; A2, air cover for the maritime attack; A3, air-to-ground assault on high-value fixed island targets; A4, air-to-ground destruction of fixed ground air-defense weapon targets; A5, air-to-ground destruction of highly maneuverable air-defense targets; A6, air-to-ground assault on high-value island targets; A7, air command and control operations; A8, regional cover; A9, electronic jamming of the enemy; A10, organization of air refueling. The enemy counterattack may include: E1, enemy aircraft carry out air interception against us; E2, enemy ships organize attacks on our aircraft; E3, the island air-defense weapon system attacks our aircraft; E4, the enemy organizes electronic counter-countermeasures; E5, air interception by the enemy's second echelon. The expected battle effects under the joint action of the basic actions and the enemy actions are: S1, air-to-air combat effect; S2, air-to-sea combat effect; S3, attack effect against ground air-defense firepower; S4, gathering of the air formation; S5, degree of destruction of the island objective; S6, return and rendezvous. Focusing on the operational intent, the final target effects are summarized as: D1, task completion; D2, battle loss. Combining staff experience and the knowledge of experts in the field, the influence network is used to set up the COA generation model, as shown in Figure 1. According to the operational guidance and the joint analysis of the planning staff and domain experts [4], the combat is divided into six stages [t0, t1], [t1, t2], …, [t5, t6] for organization and implementation.
Subsequently, the enemy conducts electronic counter-countermeasures, and the jamming effect worsens in the stages [t1, t2] and [t2, t3]. This action is not organized in the stage [t3, t4], so the influence intensity remains the same as before. The objective is to maximize the joint probability of the target functions for D1 and D2: f1(s) is the independent occurrence probability of node D1, f2(s) is the independent occurrence probability of D2, and f3(s) is the joint probability that node D1 occurs and D2 does not occur, as shown in Fig. 2 and Fig. 3. The final objective function value 0.8897 is better than the value 0.7875 of π1. Focusing on the generation of COA π2, over 30 runs, as shown in Fig. 4 and Fig. 5, IDEA reaches the optimum under the given parameter settings. Although IDEA and DEA require more running time than GA, and several iterations are needed to reach the optimum when approaching the global optimal solution, which is determined by DEA's own mechanism [5], the global search ability of IDEA and DEA is stronger than that of GA, and they find the global optimal solution more quickly in the feasible solution space. Compared with the standard DEA, IDEA achieves better average convergence efficiency through the feature learning of the dominant individuals.