Game control of air strikes against a recoverable ground facility

This paper addresses the development of minimax game control in pure strategies for two adversaries, an aviation group and a repair team, acting against each other: the first destroys a facility and the second repairs it. An optimal control algorithm based on the average risk criterion does not guarantee the control outcome, because both adversaries observe and control the structure of the facility. The optimal solutions are based on the theory of systems with random jump structure and on game theory. The controlled facility is described by a dynamic stochastic system with random jump structure that has a finite number of possible states. Transitions from one state to another occur at random moments of time and are controlled by the two opposing parties (the aviation group and the repair team), which pursue strictly opposite goals. Changes in the state of the structure are observed through the players' indicators, which operate with errors. The optimality criterion of the control is a functional of the facility state: one party seeks to minimize it and the other seeks to maximize it. The players control the structure in pure strategies, each choosing from a finite number of possible strategies. The optimal controls are sought in the class of deterministic functions of the observations preceding the current moment.


Introduction
In recent decades, military aviation has played a significant role in local wars and armed conflicts, particularly at the initial stage of armed confrontation. This is due to its unique combat properties, which ensure the effective fulfillment of assigned tasks by delivering missile and bomb strikes against the enemy's territory, the facilities forming its military and economic potential, command posts, and groupings of troops (armed formations). The effectiveness of the strikes is largely determined by the nature of the targets, the combat capabilities of the weapons and their carriers, the chosen tactics, the developing combat situation, the state of the enemy's countermeasures, and the rate of recovery of the targets being hit.
References [1, 2] considered the optimization of algorithms for controlling the intensity of air strikes against a ground facility. In the course of the hostilities, the aircraft carried out strikes against the facility, and when it became unserviceable, the defending party began to eliminate the inflicted damage and brought the facility back into a serviceable condition. Guided by the data on the facility state obtained through aerial reconnaissance, the attacking party struck the facility again. The intensity of air strikes by the attacking party either increased to its maximum level or decreased to its minimum level, depending on the reconnaissance results. In practice, the defending party uses all available means and efforts to minimize the damage caused by the attacking party and to restore the facility as quickly as possible. Therefore, the defending party also increases or decreases the intensity of restoration activities, depending on the facility state. Thus, the two parties, acting against each other, pursue opposite goals: one attempts to destroy the facility and the other intends to restore it. The information about the state of the facility available to the parties is not accurate and requires additional data processing. This paper develops a minimax game control algorithm in pure strategies [1, 3-7], applied by both opposing parties on the basis of the values of the facility state indicators.

Problem statement
The problem under consideration is solved by methods of the theory of systems with random jump structure (SCS) [1, 3, 8-10] in the following mathematical formulation. At every discrete moment $k = 0, 1, 2, \ldots$ the structure of the facility is in one of two mutually exclusive states: $s_k = 1$ ("the facility is destroyed") or $s_k = 2$ ("the facility is restored"). Transitions from one state to the other occur at random times and are described by a Markov chain with controlled transition probabilities
$$P(s_{k+1} = 2 \mid s_k = 1) = d_k, \qquad P(s_{k+1} = 1 \mid s_k = 2) = g_k, \qquad d_k \in [d_k^{\min}, d_k^{\max}], \quad g_k \in [g_k^{\min}, g_k^{\max}],$$
where $d_k$ is the facility restoration probability, $g_k$ is the facility destruction probability, and $[d_k^{\min}, d_k^{\max}]$, $[g_k^{\min}, g_k^{\max}]$ are the ranges of their possible values. As a rule, information about the state of a facility is inaccurate and incomplete. Therefore, the states of the structure are determined by indicators that operate with errors and are described by conditional Markov chains with the following transition probabilities: $\pi^A_{k+1}(r_{k+1} \mid r_k, s_{k+1})$ is the probability of transiting from state $r_k$ into state $r_{k+1}$ with fixed $s_{k+1}$, and $\pi^B_{k+1}(\rho_{k+1} \mid \rho_k, s_{k+1})$ is the probability of transiting from state $\rho_k$ into state $\rho_{k+1}$ with fixed $s_{k+1}$; $r_k, s_k = 1, 2$. Here, $r_k$ and $\rho_k$ are the output signals of the indicators, and the indices A and B indicate whether the indicator belongs to player A (the recoverable facility) or player B (the air strike formation).
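The controlled two-state chain described above can be illustrated with a minimal Python sketch. It assumes that a destroyed facility (state 1) is restored with probability d and a restored facility (state 2) is destroyed with probability g at each step; the function names `transition_probs`, `step`, and `simulate` are hypothetical and are not taken from the paper.

```python
import random

DESTROYED, RESTORED = 1, 2  # state labels s_k as used in the text


def transition_probs(state, d, g):
    """Return {next_state: probability} for one step of the controlled chain.

    Assumption: d is the restoration probability (1 -> 2) and
    g is the destruction probability (2 -> 1).
    """
    if not (0.0 <= d <= 1.0 and 0.0 <= g <= 1.0):
        raise ValueError("d and g must be probabilities in [0, 1]")
    if state == DESTROYED:
        return {DESTROYED: 1.0 - d, RESTORED: d}
    if state == RESTORED:
        return {DESTROYED: g, RESTORED: 1.0 - g}
    raise ValueError("state must be 1 (destroyed) or 2 (restored)")


def step(state, d, g, rng):
    """Sample the next state given the current state and both controls."""
    probs = transition_probs(state, d, g)
    return DESTROYED if rng.random() < probs[DESTROYED] else RESTORED


def simulate(s0, d_seq, g_seq, seed=0):
    """Simulate the chain from s0 under given control sequences."""
    rng = random.Random(seed)
    states = [s0]
    for d, g in zip(d_seq, g_seq):
        states.append(step(states[-1], d, g, rng))
    return states
```

With d = 1 the destroyed facility is always restored on the next step, and with g = 0 a restored facility is never destroyed, which gives a quick sanity check of the transition convention.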
The probabilities $\pi^A(\cdot)$ and $\pi^B(\cdot)$ are determined by the following formulas:
where $\Delta t = t_{k+1} - t_k$; $T$ is the time constant characterizing the inertia of the indicator; $\bar{\pi}^i_{k+1}(\cdot)$ are the known steady-state probabilities of true (where $x_{k+1} = s_{k+1}$) and false (where $x_{k+1} \neq s_{k+1}$) indicator values; besides,
The criteria for the optimal control actions of the opposing parties are defined by the following formulas:
where $\delta(\cdot, 1)$ is the Kronecker delta. The task is to find the optimum control algorithms for the values $d_k$, $g_k$, which depend deterministically on the observations $r_{0,k}$, $\rho_{0,k}$ over the interval $[0, k]$.
(1), therefore formulas (2), (3) are transformed to read as follows:

Optimum player control algorithms
Then, the optimality criteria obtain the following physical meaning: the expressions in square brackets, which are the efficiency indicators, are sums of partial indices taken with the weighting coefficients 1, $\lambda^i_k$, $\mu^i_k$, $i = A, B$. The first and primary partial index is the probability of the faulty (destroyed) state of the facility, $s_{k+1} = 1$, predicted from the indicator values over the interval $[0, k]$: $p^A_{k+1}(1)$ for player A and $p^B_{k+1}(1)$ for player B. The second and third partial indices characterize the restrictions imposed on the control actions of the players by tactical, structural, or informational considerations.
Player A minimizes the efficiency indicator by controlling the facility restoration probability $d_k$, assuming that player B will maximize it by controlling the facility destruction probability $g_k$. Conversely, player B maximizes the efficiency indicator by controlling $g_k$, assuming that the defending party (player A) will minimize it.
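The minimax and maximin choices in pure strategies can be illustrated on a small payoff table. The sketch below is only an illustration under stated assumptions: the payoff is taken to be the one-step predicted probability of the destroyed state, computed by the total probability formula with restoration probability d and destruction probability g, and the penalty terms weighted by the coefficients λ and µ are ignored. The helper names `payoff_table`, `minimax_row`, and `maximin_col` are hypothetical.

```python
def payoff_table(p_destroyed, d_values, g_values):
    """Predicted probability of the destroyed state for each (d, g) pair.

    Assumption: 1 -> 2 with probability d, 2 -> 1 with probability g,
    so P(s_{k+1} = 1) = (1 - d) * p + g * (1 - p).
    Rows correspond to d (player A), columns to g (player B).
    """
    return [
        [(1.0 - d) * p_destroyed + g * (1.0 - p_destroyed) for g in g_values]
        for d in d_values
    ]


def minimax_row(payoff):
    """Player A: pick the row whose worst case (maximum over columns) is smallest."""
    worst = [max(row) for row in payoff]
    i = min(range(len(worst)), key=worst.__getitem__)
    return i, worst[i]


def maximin_col(payoff):
    """Player B: pick the column whose worst case (minimum over rows) is largest."""
    worst = [min(col) for col in zip(*payoff)]
    j = max(range(len(worst)), key=worst.__getitem__)
    return j, worst[j]
```

For example, with `p_destroyed = 0.5` and candidate values `[0.25, 0.75]` for both controls, both players select the larger value and the two guaranteed results coincide, so this illustrative table has a saddle point in pure strategies.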
The predicted probabilities $p^A_{k+1}(1)$, $p^B_{k+1}(1)$ at step $k + 1$ are related to the posterior probabilities $p^A_k(1) = P(s_k = 1 \mid r_{0,k})$ and $p^B_k(1) = P(s_k = 1 \mid \rho_{0,k})$ at step $k$ by the known formulas [2-4]
Substituting (6), (7) into (4), (5), we get the following:
Solving the equations (8), (9), we get the algorithm of the optimum facility structure regulator for player A:
where $d^*_k$ is the optimum control by player A; $g^A_k$ is the optimum control by player B, as supposed by player A on the basis of the values of the structure indicator $r_{0,k}$ belonging to player A. The posterior probability $p^A_k(1)$ is determined by the algorithm of the optimum classifier of the structure [1, 3]
In general, the optimum minimax control algorithm for player A (the attacked facility) consists of a structure regulator (10), (11) and a structure classifier (12). Formulas (10)-(12) form a closed system of recurrent equations, the input of which is the value of the structure indicator $r_k$, and the output of which is the optimum control $d^*_k$ for the probability of restoring the destroyed facility.
The optimum maximin control algorithm for player B (the air strike formation) consists of similar recurrent equations as follows:
where $g^*_k$ is the optimum control by player B; $d^B_k$ is the optimum control by player A, as supposed by player B on the basis of the values of the structure indicator $\rho_{0,k}$ belonging to player B.
The input signals for the algorithm are the values of the structure indicator $\rho_k$, and its output is the optimum control $g^*_k$ for the facility destruction probability. The obtained control algorithms (10)-(11) and (13)-(14) are of relay type; their physical meaning is explained below using the algorithm of player B as an example. Based on the aerial reconnaissance reports (the values of the structure indicator $\rho_k$), the posterior probability $p^B_k(2)$ of the serviceable (restored) state of the facility is determined. This probability is compared with the value of $\mu^B_k$. If the probability $p^B_k(2)$ exceeds the level defined by $\mu^B_k$, the intensity (or power) of air strikes is increased to a level that raises the probability $g^*_k$ of the facility transiting from the restored state $s_k = 2$ into the destroyed state $s_k = 1$ to its maximum ($g^*_k = g^{\max}_k$). If $p^B_k(2) \le \mu^B_k$, the intensity or power of air strikes is reduced to a level characterized by the value $g^*_k = g^{\min}_k$. Similar control is applied to the intensity of facility restoration by the repair team, based on the values of the posterior probability $p^A_k(1)$. Therefore, if the probability of the structure state that is undesirable for the player ($p^A_k(1)$, $p^B_k(2)$) exceeds a certain threshold level ($\lambda^A_k$, $\mu^B_k$), an "energetic" control mode is activated, increasing the likelihood of a transition to the desired state. If the probability of the undesirable state does not exceed the specified threshold level, the "economical" control mode is activated.
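The relay logic described above, combined with a classifier step, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's formulas: the prediction uses the total probability formula with restoration probability d and destruction probability g, the classifier is a plain Bayes update in which `lik_destroyed` and `lik_restored` stand for the indicator transition probabilities of the new reading under each state, and all function names are hypothetical.

```python
def predict_destroyed(p_destroyed, d, g):
    """Predicted P(s_{k+1} = 1) from the posterior P(s_k = 1).

    Assumption: 1 -> 2 with probability d, 2 -> 1 with probability g.
    """
    return (1.0 - d) * p_destroyed + g * (1.0 - p_destroyed)


def classify(p_pred_destroyed, lik_destroyed, lik_restored):
    """Bayes update of P(s = 1) given the likelihood of the new indicator reading."""
    num = lik_destroyed * p_pred_destroyed
    den = num + lik_restored * (1.0 - p_pred_destroyed)
    if den == 0.0:
        raise ValueError("the indicator reading has zero probability under both states")
    return num / den


def relay(p_undesired, threshold, u_min, u_max):
    """Relay rule: 'energetic' mode above the threshold, 'economical' otherwise."""
    return u_max if p_undesired > threshold else u_min


def player_b_step(p_destroyed_prev, d_assumed, g_prev,
                  lik_destroyed, lik_restored, mu_b, g_min, g_max):
    """One cycle for player B: predict, classify, then choose g by the relay rule."""
    p_pred = predict_destroyed(p_destroyed_prev, d_assumed, g_prev)
    p_post = classify(p_pred, lik_destroyed, lik_restored)
    # The restored state (s = 2) is the undesirable one for player B.
    g_next = relay(1.0 - p_post, mu_b, g_min, g_max)
    return p_post, g_next
```

The same `relay` function serves player A when it is called with the posterior probability of the destroyed state, the threshold λ, and the limits of the restoration probability.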
The weighting (priority) coefficients $\lambda^A_k$, $\mu^A_k$, $\lambda^B_k$, $\mu^B_k$ are pre-assigned based on the conditions of a specific practical problem and then refined by parametric optimization in the course of simulation modeling.