Evasion-Pursuit Strategy against Defended Aircraft Based on Differential Game Theory



Introduction
In the traditional pursuit-evasion scenario, guidance laws were investigated for two players: an evading target and a pursuing attacker. Zarchan studied a variety of guidance laws for this pursuit-evasion scenario [1]. A new impact time and angle control guidance law against stationary and nonmaneuvering targets was investigated for the missile [2]. A novel extended proportional guidance law was designed to intercept a maneuvering target [3]. An adaptive integral sliding mode guidance law was derived in a three-dimensional scenario [4,5]. In these papers, the acceleration of the target was a known bounded external disturbance to the missile. A two-phase optimal guidance law was derived to improve the estimation accuracy and terminal performance for impact-angle-constrained engagements [6]. Yang et al. [7] presented a time-varying biased proportional guidance law in which two time-varying bias terms were applied to divide the trajectory into an initial phase and a terminal phase. Recently, various pursuit-evasion scenarios involving multiple players have been investigated. The guidance laws for two missiles attacking one target were analyzed [8,9]. References [10-12] described a scenario in which multiple missiles attacked one target, and cooperative guidance laws were derived.
When a missile attacks an aircraft, the aircraft can launch a defender to protect itself while it evades the attacking missile. The problem that includes a target aircraft, a defender, and an attacking missile is known as the active defense scenario. It is difficult for a missile using a traditional guidance law to hit an aircraft that has launched a defender, because the three-player engagement differs from the typical one-to-one engagement. In recent years, strategies for the active defense scenario have been widely studied, especially the cooperative guidance laws between the target and the defender.
A scenario in which the defender and a fixed or slowly moving target constituted the defended system was investigated [13-15]. In these papers, the optimal defense guidance laws were derived under the condition that the positions and the trajectories of the target and the attacker were known to the defender. Rusnak [16] investigated a scenario in which a lady evaded a bandit that pursued her while a bodyguard intercepted the bandit before the bandit captured the lady. In that paper, the optimal strategies were derived based on differential game theory and optimal control theory. A line of sight (LOS) guidance law was investigated to intercept the attacker and protect the target [17]. In that paper, the defender was located on the line of sight between the attacker and the target, and it could intercept the attacker with less control effort by using the LOS guidance law than by using a traditional guidance law. The cooperative optimal guidance laws between the target and the defender were studied [18-20]. In these papers, the target launched one defender, the guidance laws were derived by differential game theory, and the defender and the target helped each other to intercept the attacker. Oyler et al. [21] studied pursuit-evasion games in the presence of obstacles that inhibited the motions of the players. The dominance regions were presented and analyzed to provide a complete solution to the game.
Unlike previous research, two defenders were launched from the target to protect it [22,23]. In Reference [23], the cooperation between the defenders and the target was one-way, which meant that one defender received information from the target and the other defender sent information to the target. In Refs. [24-26], the cooperative guidance law for protecting the target was investigated by using nonlinear methods, and the defender could intercept the attacker with high heading angle errors. The conditions under which the attacker wins the game in the active defense scenario were investigated by using differential game theory [27]. Rubinsky and Gutman [28] investigated a three-player scenario in which the attacker evaded a defender and continued to pursue a target. In that scenario, the target and the defender were independent, and the derived guidance law is suited only to cases in which the zero-effort-miss (ZEM) distance between the attacker and the target is not large. An evasion and pursuit guidance law for the attacking missile was analyzed [29], and the control efforts of the defender and the target were known to the attacking missile. In that paper, the attacking missile chose an appropriate lateral acceleration to maneuver before it met the defender and then used the optimal pursuit guidance law to hit the aircraft.
Previous studies have mostly focused on the cooperative guidance law between the aircraft and the defender, whereas attacking guidance laws that allow the attacker to win the game are relatively rare. Refs. [27-29] presented attacking guidance laws for the attacking missile. However, in these papers, the control efforts of the target and the defender were assumed to be known to the missile, although they are hard to obtain in reality. The method presented in Reference [28] is unsuited to the condition in which the ZEM distance between the attacker and the target is large and the ZEM distance between the attacker and the defender is small.
In this paper, a new strategy is investigated for the attacker to hit the target. In this scenario, the attacker considers the miss distance between the target and the attacker and the miss distance between the defender and the attacker at the same time. The target and the defender are independent, and they use their optimal strategies. With the derived guidance law, the attacker does not need to obtain the control efforts of the target and the defender.

Problem Formulation
The problem consists of three players: an attacker (A), a target (T), and a defender (D), and the scenario is described in Figure 1. LOS is the line of sight. $R$ and $V$ represent the range and the velocity. $\gamma$ represents the flight path angle. $\lambda$ represents the angle between the line of sight and the X axis. The lateral acceleration is denoted by α. The subscripts A, T, and D represent the corresponding players. The subscripts AT and AD represent the corresponding parameters between the players.
Neglecting the gravitational force, the geometric relations for the rates of the ranges are obtained by
The LOS rate relations are expressed as follows:
The dynamics of each player are considered to be a linear time-invariant system that can be described by the following equations [24]:
Here, $\delta_i$ is the state vector of the internal state variables of each agent with $\dim(\delta_i) = n_i$, and $u_i'$ represents its controller.
The path angle relations satisfy the following equation:
It is assumed that the problem occurs in the endgame phase and that the defender has separated from the target; thus, the problem can be linearized along the initial lines of sight. The relative displacement between two players normal to $\mathrm{LOS}_0$ is denoted as $y_i$ ($i = AT, AD$). The accelerations of the attacker and the target normal to $\mathrm{LOS}_{AT}$ are denoted by $u_{AL}^{AT}$ and $u_{TL}^{AT}$. The acceleration of the defender normal to $\mathrm{LOS}_{AD}$ is denoted by $u_{DL}^{AD}$. Thus, we can obtain
where
We solve the problem under the condition that the players A, T, and D obey ideal dynamics. Thus, $a_i$ ($n \times n$) … It can be obtained that $u_A$, $u_T$, and $u_D$ satisfy the following form:
The state vector of the linearized engagement is expressed as follows:
The equations of motion corresponding to equation (9) are given by
The equations can be written in the following form:
where

International Journal of Aerospace Engineering
The intercept times are considered to be fixed because the problem occurs in the endgame phase, and they can be given by
After $t_f^{AD}$, the defender will disappear. The time-to-go $t_{go}$ can be described by

3. Strategy for the Attacker
3.1. Order Reduction. The order of the problem needs to be reduced so that it can be solved conveniently. The well-known zero-effort-miss (ZEM) distance between the attacker and the target can be expressed as follows:
Similarly, the ZEM distance between the attacker and the defender can be expressed as follows:
where $\Phi(t_f^{AT}, t)$ and $\Phi(t_f^{AD}, t)$ are the transition matrices with respect to equation (11), and $D_{AD}$ and $D_{AT}$ are expressed as follows:
Equations (15) and (16) can be presented by
The dynamics of $Z_{AT}(t)$ and $Z_{AD}(t)$ can be obtained by
3.2. One-to-One Optimal Strategies. In the attacker-target engagement, the attacker needs to pursue the target. The cost function to solve the problem is expressed by
Because of …, the cost function can be rewritten in the following form:
Similarly, in the attacker-defender engagement, the attacker needs to evade the defender. The cost function to solve the problem is expressed by
The Hamiltonian functions corresponding to equations (23) and (22) are given by
The adjoint equation and transversality condition are as follows:
Thus, the solution can be obtained as follows:
Substituting equations (26) and (20) into equation (24), it can be obtained in the following form:
The optimal strategies for the attacker-target engagement are as follows:
The optimal strategies for the attacker-defender engagement are as follows:
where superscript max represents the maximal value.
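The order reduction above can be illustrated with a minimal Python sketch. It assumes ideal (zero-lag) dynamics and treats each engagement separately with the state $x = [y, \dot{y}]$, for which the transition matrix of a double integrator and the row vector $D = [1, 0]$ give the familiar form $Z = y + \dot{y}\, t_{go}$. The function names and the clipping of the time-to-go at zero are illustrative assumptions, not the paper's own definitions.

```python
def time_to_go(t_f, t):
    """Time-to-go for a fixed intercept time t_f (assumed clipped at zero after t_f)."""
    return max(t_f - t, 0.0)


def zero_effort_miss(y, y_dot, t_go):
    """ZEM for ideal dynamics, computed as D * Phi(t_f, t) * x with x = [y, y_dot]."""
    # Transition matrix of a double integrator over the remaining time t_go.
    phi = [[1.0, t_go], [0.0, 1.0]]
    # D selects the displacement normal to the initial line of sight.
    d = [1.0, 0.0]
    x = [y, y_dot]
    phi_x = [sum(phi[r][c] * x[c] for c in range(2)) for r in range(2)]
    return sum(d[r] * phi_x[r] for r in range(2))


if __name__ == "__main__":
    # Hypothetical numbers: 10 m displacement, -2 m/s closing normal velocity.
    t_go = time_to_go(t_f=3.0, t=0.0)
    print(zero_effort_miss(10.0, -2.0, t_go))
```

With these hypothetical inputs the two-state problem collapses to the single scalar $Z$, which is the quantity the one-to-one strategies act on.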

3.3. Optimal Trajectories for the Attacker. In the attacker-target engagement, the optimal pursuit strategy for the attacker in equation (28) is
It is assumed that $u_A^{\max}$, $u_T^{\max}$, and $u_D^{\max}$ satisfy $u_A^{\max} > u_T^{\max}$ and $u_A^{\max} > u_D^{\max}$ in the scenario. $Z_{AT}(t)$ satisfies the following form:
The kill radius of the attacker is $R$; $Z_{AT}(t)$ satisfies
The positive and negative pursuit boundary trajectories are given by
Figure 2 presents the optimal pursuit trajectories. The positive and negative boundary trajectories are marked with triangles. In the engagement, the attacker uses the optimal pursuit guidance law, and the target uses the optimal evasion guidance law corresponding to equation (28). If $Z_{AT}(t)$ lies on the boundary trajectories, the final miss distance between the attacker and the target will be $R$. If $Z_{AT}(t)$ lies within the zone between the positive and negative boundary trajectories, the final miss distance between the attacker and the target will be less than $R$; thus, the attacker can hit the target successfully. Otherwise, the aircraft evades the attacker successfully.
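The role of the pursuit boundary can be sketched numerically. The snippet below is a hedged illustration, not the paper's equation: it assumes ideal dynamics and that, under optimal play by both sides, $|Z_{AT}|$ shrinks at the rate $(u_A^{\max} - u_T^{\max})\, t_{go}^{AT}$, so that integrating back from a final miss of $R$ gives a boundary of the form $R + \tfrac{1}{2}(u_A^{\max} - u_T^{\max})(t_{go}^{AT})^2$. The function names and the numbers are hypothetical.

```python
def pursuit_boundary(t_go, kill_radius, u_a_max, u_t_max):
    """|Z_AT| at time-to-go t_go that ends in a miss equal to kill_radius.

    Assumes d|Z_AT|/dt = -(u_a_max - u_t_max) * t_go under optimal play.
    """
    return kill_radius + 0.5 * (u_a_max - u_t_max) * t_go ** 2


def attacker_can_hit(z_at, t_go, kill_radius, u_a_max, u_t_max):
    """True if Z_AT lies on or between the positive and negative boundaries."""
    return abs(z_at) <= pursuit_boundary(t_go, kill_radius, u_a_max, u_t_max)


if __name__ == "__main__":
    # Hypothetical values: R = 5 m, u_A^max = 30 m/s^2, u_T^max = 10 m/s^2.
    for z in (40.0, 50.0):
        print(z, attacker_can_hit(z, t_go=2.0, kill_radius=5.0,
                                  u_a_max=30.0, u_t_max=10.0))
```

Because the boundary grows with the square of the time-to-go, a large ZEM is tolerable early in the endgame but not late, which matches the funnel shape described for Figure 2.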
The defender is launched from the target; thus, $\kappa$ is always a positive value. Similarly, $Z_{AD}(t)$ satisfies the following form:
The kill radius of the defender is $M$; $Z_{AD}(t)$ satisfies
Figure 3 presents the optimal evasion trajectories under the condition that $u_A^{\max} \kappa > u_D^{\max}$. The positive and negative boundary trajectories are marked with triangles. In the engagement, the attacker uses the optimal evasion guidance law, and the defender uses the optimal intercept guidance law corresponding to equation (29). If $Z_{AD}(t)$ lies on the boundary trajectories, the final miss distance between the attacker and the defender will be $M$. If $Z_{AD}(t)$ lies outside the zone between the positive and negative boundary trajectories, the final miss distance between the attacker and the defender will be larger than $M$; thus, the attacker can evade the defender successfully. Otherwise, the defender intercepts the attacker successfully.
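A corresponding sketch for the evasion case is given below. It is an illustration under stated assumptions rather than the paper's formula: ideal dynamics are assumed, and under optimal play with $u_A^{\max} \kappa > u_D^{\max}$ the magnitude $|Z_{AD}|$ is taken to grow at the rate $(\kappa u_A^{\max} - u_D^{\max})\, t_{go}^{AD}$. The function names and the numbers are hypothetical.

```python
def final_miss_after_evasion(z_ad, t_go, kappa, u_a_max, u_d_max):
    """Predicted |Z_AD(t_f)| when the attacker evades and the defender pursues optimally."""
    margin = kappa * u_a_max - u_d_max
    if margin <= 0.0:
        # The sketch only covers the condition stated for Figure 3.
        raise ValueError("requires kappa * u_a_max > u_d_max")
    return abs(z_ad) + 0.5 * margin * t_go ** 2


def attacker_evades(z_ad, t_go, kill_radius, kappa, u_a_max, u_d_max):
    """True if the predicted final miss exceeds the defender's kill radius M."""
    miss = final_miss_after_evasion(z_ad, t_go, kappa, u_a_max, u_d_max)
    return miss > kill_radius


if __name__ == "__main__":
    # Hypothetical values: M = 5 m, kappa = 0.8, u_A^max = 30, u_D^max = 15 (m/s^2).
    print(final_miss_after_evasion(2.0, 1.0, 0.8, 30.0, 15.0))
    print(attacker_evades(2.0, 1.0, 5.0, 0.8, 30.0, 15.0))
```

In contrast to the pursuit case, the safe region here is the outside of the boundary, and it widens as the time-to-go increases because the attacker has more time to open the miss distance.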
It can be noted that if the signs of $Z_{AT}$ and $Z_{AD}$ are the same, the optimal strategies of the attacker in equations (28) and (29) point in opposite directions. This means that when the attacker pursues the target, it approaches the defender. Figure 4 shows the time evolution of the ZEMs for the situation in which the attacker evades the defender before the engagement time $t_f^{AD}$ and then pursues the target. If the attacker evades the defender before $t_f^{AD}$, the absolute value of $Z_{AT}$ increases sharply and easily leaves the zone between the positive and negative pursuit boundary trajectories. Thus, it is difficult for the attacker to pursue the target successfully after $t_f^{AD}$. Figure 5 shows the time evolution of the ZEMs for the situation in which the attacker pursues the target throughout the endgame phase. In this case, the value of $Z_{AD}$ easily enters the zone between the positive and negative evasion boundary trajectories. Thus, the attacker can be intercepted easily by the defender because it only pursues the target and ignores the defender.
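The conflict between the two one-to-one strategies can be reproduced qualitatively with a small linearized simulation. The sketch below is an assumption-laden illustration, not the nonlinear simulation of the paper: it assumes ideal dynamics, the ZEM dynamics $\dot{Z}_{AT} = t_{go}^{AT}(u_T - u_A)$ and $\dot{Z}_{AD} = t_{go}^{AD}(u_D - \kappa u_A)$ with that particular sign convention, bang-bang play by the target and the defender, and hypothetical initial values and acceleration limits.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pure_pursuit(z_at, z_ad, defender_alive):
    """Attacker ignores the defender and always pursues the target."""
    return sign(z_at)


def evade_then_pursue(z_at, z_ad, defender_alive):
    """Attacker evades while the defender exists, then pursues the target."""
    return -sign(z_ad) if defender_alive else sign(z_at)


def simulate(policy, z_at=50.0, z_ad=20.0, t_f_at=6.0, t_f_ad=3.0,
             u_a_max=30.0, u_t_max=10.0, u_d_max=15.0, kappa=0.8, dt=1e-3):
    """Euler-integrate the assumed ZEM dynamics.

    Returns (Z_AD at t_f_ad, Z_AT at t_f_at).
    """
    for k in range(round(t_f_at / dt)):
        t = k * dt
        defender_alive = t < t_f_ad
        u_a = u_a_max * policy(z_at, z_ad, defender_alive)
        u_t = u_t_max * sign(z_at)            # target evades: drives |Z_AT| up
        if defender_alive:
            u_d = -u_d_max * sign(z_ad)       # defender pursues: drives |Z_AD| down
            z_ad += (t_f_ad - t) * (u_d - kappa * u_a) * dt
        z_at += (t_f_at - t) * (u_t - u_a) * dt
    return z_ad, z_at


if __name__ == "__main__":
    print("pure pursuit      :", simulate(pure_pursuit))
    print("evade then pursue :", simulate(evade_then_pursue))
```

Under these assumptions, pure pursuit drives both ZEMs toward zero, so the attacker is intercepted, whereas evading first leaves a large $Z_{AT}$ that cannot be removed in the remaining time. This is the same trade-off that Figures 4 and 5 are described as showing.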

3.4. Optimal Pursuit Strategy for the Attacker. If the attacker wants to win the game, it needs to evade the defender and pursue the target. Thus, the cost function is designed as in equation (37), where $\alpha$ and $\beta$ are nonnegative weights. The Hamiltonian function of the problem is in the following form:
Parameters satisfy
Substituting equations (39) and (20) into equation (38),
Figure 4: Time evolution of the ZEMs for the attacker evading the defender before $t_f^{AD}$.
The open-loop optimal strategies can be expressed as follows:
The closed-loop optimal strategies of $u_T^{\Theta}$ and $u_D^{\Theta}$ are solved as in equation (42), where superscript max represents the maximal value. The closed-loop optimal strategy of the attacker is difficult to obtain. The open-loop optimal strategy is
The strategy is designed for the attacker to evade the defender and pursue the target as follows:
Equation (44) can be rewritten as follows:
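As a hedged illustration of how a switching strategy of this kind could be evaluated, the Python sketch below compares the two control parameters named in the caption of Figure 10, $(\alpha/\beta) Z_{AD}(t)\, t_{go}^{AD} \kappa$ and $Z_{AT}(t)\, t_{go}^{AT}$, and commands maximal acceleration according to the sign of their difference. The exact form of equation (44) is not reproduced here, so the switching function, the sign convention (a positive command reduces a positive $Z_{AT}$), and the function name are all assumptions.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def attacker_command(z_at, z_ad, t_go_at, t_go_ad, kappa, alpha, beta, u_a_max):
    """Assumed bang-bang evasion-pursuit command for the attacker.

    The pursuit term pulls the command toward sign(Z_AT); the evasion term,
    weighted by alpha / beta, pulls it toward -sign(Z_AD).
    """
    if beta <= 0.0:
        raise ValueError("beta must be positive to form the ratio alpha / beta")
    pursuit_term = z_at * t_go_at
    evasion_term = (alpha / beta) * z_ad * t_go_ad * kappa
    return u_a_max * sign(pursuit_term - evasion_term)


if __name__ == "__main__":
    # Hypothetical state: both ZEMs positive, defender engagement still ahead.
    for ratio in (1.0, 10.0):
        u = attacker_command(z_at=50.0, z_ad=20.0, t_go_at=6.0, t_go_ad=3.0,
                             kappa=0.8, alpha=ratio, beta=1.0, u_a_max=30.0)
        print(ratio, u)
```

In this sketch a larger ratio $\alpha/\beta$ makes the evasion term dominate for longer, and once $t_{go}^{AD}$ reaches zero the command reduces to pure pursuit, which is consistent with the change of control direction discussed for the simulation results.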

Nonlinear Simulation
The initial condition is shown in Table 1. Figures 6, 7, 8, and 9 show the time evolutions of $Z_{AD}(t)$, $Z_{AT}(t)$, the three players' trajectories, and the time-to-go obtained by nonlinear simulation for different values of $\alpha/\beta$. Figure 10 shows the values of the control parameters corresponding to $(\alpha/\beta) Z_{AD}(t)\, t_{go}^{AD} \kappa$ and $Z_{AT}(t)\, t_{go}^{AT}$. In the simulation phase, the initial line of sight is updated in real time, and $t^{AD}$ … It is shown that when the time approaches the engagement time $t_f^{AD}$, the absolute value of $Z_{AD}(t)$ increases substantially because at this time, the LOS changes quickly, and $t_{go}^{AD}$ increases sharply. It is noted that at the initial time, $Z_{AD}(t)$ and $Z_{AT}(t)$ increase because the absolute value of $(\alpha/\beta) Z_{AD}(t)\, t_{go}^{AD} \kappa$ is larger than that of $Z_{AT}(t)\, t_{go}^{AT}$, and the attacker tries to minimize the cost function. As time goes on, the absolute value of $Z_{AT}(t)\, t_{go}^{AT}$ increases more quickly, and its influence on the cost function becomes greater. Thus, the control direction of the attacker changes, which leads to the decrease of $Z_{AD}(t)$ and $Z_{AT}(t)$. From the trajectories, it can be concluded that the attacker can evade the defender and hit the target by using the derived strategy; the results are shown in Figure 8 and Table 2.

Conclusion
The scenario in which the attacker attacks an actively defended aircraft is investigated. In this scenario, the target evades the attacker, and the defender intercepts the attacker by using optimal guidance laws. The optimal one-to-one guidance law is derived for the attacker. If the attacker evades the defender by using the optimal evasion guidance law before $t_f^{AD}$, $Z_{AT}$ easily leaves the zone between the positive and negative pursuit boundary trajectories. Thus, it is difficult for the attacker to pursue the target successfully after $t_f^{AD}$. If the attacker pursues the target throughout the endgame phase, the value of $Z_{AD}$ easily enters the zone between the positive and negative evasion boundary trajectories, and the attacker can be intercepted by the defender.
Thus, a new strategy is derived for the attacker to win the game in the active defense scenario. In this problem, the target evades the attacker, and the defender intercepts the attacker by using the derived closed-loop optimal strategies. Although the closed-loop strategy of the attacker is difficult to obtain from the presented cost function, a practical strategy is designed for it based on the open-loop solution. The attacker can accomplish the task of evading the defender and pursuing the target by using the derived strategy.

Figure 5: Time evolution of the ZEMs for the attacker pursuing the target in the total endgame phase.

Figure 6: Time evolutions of $Z_{AD}(t)$.

Figure 7: Time evolutions of $Z_{AT}(t)$.

Figure 10: Time evolutions of the control parameters $(\alpha/\beta) Z_{AD}(t)\, t_{go}^{AD} \kappa$ and $Z_{AT}(t)\, t_{go}^{AT}$.

Figure 11: Meanings of the lines for different values of $\alpha/\beta$.

Table 2: Simulation results for different values of $\alpha/\beta$.