Game-Theoretic Power Allocation and the Nash Equilibrium Analysis for a Multistatic MIMO Radar Network

We investigate a game-theoretic power allocation scheme and perform a Nash equilibrium analysis for a multistatic multiple-input multiple-output radar network. We consider a network of radars, organized into multiple clusters, whose primary objective is to minimize their transmission power, while satisfying a certain detection criterion. Since there is no communication between the distributed clusters, we incorporate convex optimization methods and noncooperative game-theoretic techniques based on the estimate of the signal-to-interference-plus-noise ratio (SINR) to tackle the power adaptation problem. Therefore, each cluster egotistically determines its optimal power allocation in a distributed scheme. Furthermore, we prove that the best response function of each cluster regarding this generalized Nash game belongs to the framework of standard functions. The standard function property together with the proof of the existence of the solution for the game guarantees the uniqueness of the Nash equilibrium. The mathematical analysis based on Karush–Kuhn–Tucker conditions reveals some interesting results in terms of the number of active radars and the number of radars that over satisfy the desired SINRs. Finally, the simulation results confirm the convergence of the algorithm to the unique solution and demonstrate the distributed nature of the system.


I. INTRODUCTION
Recent advances in digital signal processing and the constant development of computational capabilities suggest that it may be feasible for next-generation radar systems to incorporate multiple-input multiple-output (MIMO) technology. The superiority of a MIMO radar over other radar schemes lies in its waveform diversity, which in essence means that a MIMO radar can simultaneously emit several diverse, possibly linearly independent waveforms via multiple antennas, in contrast to existing radar systems that transmit scaled versions of the same, predefined waveform [1]. In particular, there are two principal types of MIMO radar: those that incorporate colocated antennas [2] and systems equipped with widely separated antennas (bistatic, multistatic) [3]. MIMO radar technology provides direct applicability of adaptive beamforming [4], waveform design and power allocation, higher angular resolution, the ability to acquire the target's geometrical characteristics through the radar cross section (RCS), and multiple target detection [1]. However, in order to combat multiple source interference in a radar field, while achieving high detection performance using minimum power consumption, the system should adopt an optimal resource allocation strategy. A centralised approach to resource allocation is possible using, for example, convex optimization techniques. Nevertheless, centralised control may not be desirable, or may be difficult to implement, in a multistatic radar network, and thus autonomous decentralised resource allocation schemes are preferred. A natural and efficient tool to achieve this is game theory, which provides a framework for analyzing coordination and conflict between rational but selfish players.
Recently, game-theoretic techniques have been extensively explored within the radar research community to tackle several issues and to improve and optimize various radar parameters. Specifically, the authors in [5] and [6] formulated a noncooperative game to address the power optimization problem with a predefined SINR constraint. Furthermore, to extend the study in [5], a signal-to-disturbance ratio (SDR) estimation technique was applied in [7]. Three different game-theoretic techniques were applied in [8] to address a distributed beamforming and power allocation problem for a radar system in the presence of multiple targets. Specifically, a strategic noncooperative game, a partially cooperative game and a Stackelberg game were applied to obtain the optimal resource allocation strategy, while satisfying a certain SINR criterion for each of the targets. A two-player, noncooperative, zero-sum game was considered in [9] to investigate the interaction between a radar and a jammer. Noncooperative MIMO radar and jammer games were also applied in [10], where the utility functions were formulated using the mutual information criterion. The authors in [11] studied the problem of polarimetric waveform design by forming a zero-sum game between a target and a radar engineer. Moreover, in [12], the power allocation problem of a distributed MIMO radar was tackled using a cooperative game approach through maximizing the Bayesian-Fisher information matrix (B-FIM) and exploiting the Shapley value solution. Potential game theory techniques were exploited in [13] for optimal waveform design and maximization of the detection performance. A proof of the uniqueness of the Nash equilibrium of a potential game waveform design problem was presented in [14]. Finally, the authors in [15] proposed a water-filling method for optimal power distribution using a Stackelberg game-theoretic framework.
In this paper, motivated by the results in [5] and [7], we revisit the power allocation problem of a distributed, multistatic radar network, where multiple MIMO radars are organized into clusters. It should be emphasized that this problem is particularly attractive to tracking radars, where we have some belief about the approximate location of the target but require fine detection to retrieve further information regarding the target's exact position and characteristics. The primary goal of each cluster is to secure a certain detection criterion, in terms of the signal-to-interference-plus-noise ratio (SINR), while allocating the minimum possible power to each radar. Hence, we formulate a generalized Nash game (GNG), where there is no communication between the clusters of the network, despite the fact that they belong to the same organization. Such a scheme could be deployed in a scenario where the opponent incorporates electronic warfare methods to intercept information about the location of the radars. In this case, in order to apply the game-theoretic algorithm, we require estimation of the SINR, as there is no coordination between the clusters and thus no information on the inter-cluster channel gains.
The main contribution of this work lies in the proof of the uniqueness of the Nash equilibrium of the game-theoretic power allocation problem described above. Specifically, we demonstrate that the best response function of each cluster in this GNG belongs to the family of standard functions by using convex optimization techniques and by analyzing the Lagrangian dual of the initial optimization problem. Moreover, through the game-theoretic analysis, we have characterized the behavior of the radars in a cluster. Specifically, the theoretical results based on the Karush-Kuhn-Tucker conditions showed that, in a cluster, the number of radars that exactly achieve the desired SINR is equal to the number of radars that are actively transmitting. This powerful result has facilitated the proof of uniqueness of the Nash equilibrium. Furthermore, the simulation results confirm the convergence of the algorithm to the unique Nash equilibrium.
This paper is organized as follows. Section II introduces the decentralized radar network as the system model. In Section III we present the game-theoretic formulation of the problem and the definition of the generalized Nash game (GNG) considered in this paper. The analysis on the existence and uniqueness of the Nash equilibrium is performed in Section IV. The SINR estimation technique utilized in this work is described in Section V. Finally, the simulation results and the concluding remarks are presented in Sections VI and VII, respectively.
Notation: We use bold lower-case letters and bold upper-case letters to denote column vectors and matrices, respectively. $\mathbf{a}^H$ gives the Hermitian of the vector $\mathbf{a}$ and $\mathbf{a}^T$ denotes its transpose. $\mathbf{A}(i,j)$ corresponds to the element located on the $i$th row and $j$th column of matrix $\mathbf{A}$. $\mathbf{I}_M$ stands for the $M \times M$ identity matrix. The Euclidean norm is denoted by $\|\cdot\|$. An $N \times 1$ vector of ones is indicated by $\mathbf{1}_N$. Finally, any inequalities among vectors are considered element-wise.

II. SYSTEM MODEL
We consider a decentralized, multistatic radar network that consists of $K$ separate clusters $C = \{C_1, \ldots, C_K\}$, each consisting of $M$ radars, i.e., $C_k = \{R_{k1}, \ldots, R_{kM}\}$ for all $k = 1, \ldots, K$. Such a radar network with a possible target is shown in Fig. 1. The primary aim for each radar in every cluster is to attain a predefined detection criterion while consuming the minimum possible transmission power. In the considered framework of noncooperative games, each cluster performs the power minimization autonomously. There is communication and coordination among the radars within the same cluster, whereas there is no coordination between different clusters in the network. Consequently, each cluster possesses full information regarding the channel gains of its respective radars, whereas it has no knowledge of the inter-cluster cross-channel gains. Nevertheless, this scenario is not competitive: since all clusters belong to the same organization, the radars should avoid intentionally causing interference to the rest of the clusters of the network.
In order to identify the desired target, each one of the $M$ radars in the $k$th cluster transmits the respective element of the predesigned waveform vector $\boldsymbol{\psi}_k(t) = [\psi_{k1}(t), \ldots, \psi_{kM}(t)]^T$ of size $M \times 1$, which satisfies the orthogonality condition $\int_{T_0} \boldsymbol{\psi}_k(t)\boldsymbol{\psi}_k^H(t)\,dt = \mathbf{I}_M$, where $T_0$ is the radar pulse width and $t$ refers to the time index within the radar pulse. Hence, we exploit the waveform diversity of the MIMO architecture, since the waveforms corresponding to different radars of the same cluster are orthogonal, i.e., $\int_{T_0} \psi_{ki}(t)\psi_{kj}^{*}(t)\,dt = 0$ for $i \neq j$. On the other hand, waveforms emitted from radars belonging to different clusters may not be orthogonal and thus could induce considerable inter-cluster interference. We assume that each cluster determines the presence of a target by applying binary hypothesis testing to the received signal, based on the generalized likelihood ratio test (GLRT) [5]. The sampled pulses of the received signal at radar $R_{ki}$, i.e., radar $i$ in cluster $k$, under the two hypotheses $H_0$ (target absent) and $H_1$ (target present), are complex $N \times 1$ vectors $\mathbf{x}_{ki}$. In these vectors, $\mathbf{s}_{kij} = \sqrt{p_{kj}}\,\psi_{kj}(n)\,\mathbf{a}_{kij}$ denotes the desired received signal at radar $R_{ki}$ corresponding to the transmission of radar $R_{kj}$, which incorporates the Doppler shift introduced by the target. The parameter $\alpha_{kji}$ denotes the channel gain from radar $R_{kj}$ to radar $R_{ki}$, including the geometrical signature of the target, i.e., its radar cross section (RCS); $\mathbf{a}_{kij} = [1, e^{j2\pi f_{D,k,i,j}}, \ldots, e^{j2\pi(N-1)f_{D,k,i,j}}]^T$ is the Doppler steering vector; $f_{D,k,i,j}$ denotes the normalized Doppler shift at radar $R_{ki}$ originating from the target's movement when reflecting the transmitted signal from radar $R_{kj}$; $N$ is the number of signal return samples that the radars receive at each time step of duration $T_0$; and $p_{kj}$ stands for the transmission power of radar $R_{kj}$.

The inter-cluster interference experienced by radar $R_{ki}$ is due to the emissions from the radars $R_{\ell j}$ belonging to all other clusters $\ell \neq k$. In this term, $\beta_{\ell jki}$ describes the direct cross-channel gain from radar $R_{\ell j}$ to radar $R_{ki}$, which depends on the respective characteristics of the antennas and the distance between the radars. Since all the radars are considered stationary in the proposed model, there is no relative Doppler frequency regarding the direct cross-channel interference; hence, the Doppler steering vector associated with the waveform vector transmitted from the radars in clusters other than $k$ is the $N \times 1$ vector of all ones $\mathbf{1}_N$. The term $\varphi_{\ell jki}$ stands for the target reflection gain at radar $R_{ki}$ originating from the signal transmitted from radar $R_{\ell j}$, $\mathbf{a}^{i}_{\ell jki} = [1, e^{j2\pi f^{i}_{D,\ell,j,k,i}}, \ldots, e^{j2\pi(N-1)f^{i}_{D,\ell,j,k,i}}]^T$ describes the Doppler steering aspects of the target at radar $R_{ki}$ arising from the reflected signal from radar $R_{\ell j}$, and $f^{i}_{D,\ell,j,k,i}$ is the corresponding Doppler frequency shift. The term $\varphi^{c}_{\ell jki}$ denotes the clutter reflection gain at radar $R_{ki}$ originating from the signal transmitted from radar $R_{\ell j}$, $\mathbf{a}^{c}_{\ell jki} = [1, e^{j2\pi f^{c}_{D,\ell,j,k,i}}, \ldots, e^{j2\pi(N-1)f^{c}_{D,\ell,j,k,i}}]^T$ describes the Doppler steering vector of the clutter at radar $R_{ki}$ arising from the reflected signal from radar $R_{\ell j}$, and $f^{c}_{D,\ell,j,k,i}$ is the corresponding Doppler frequency shift.

The last components of the received signal in (2) are the noise $\mathbf{n}$ and the clutter introduced by the waveforms transmitted by the radars in cluster $k$. Here $c_{kji}$ includes the signal propagation loss and the geometrical characteristics of the clutter, in other words its RCS; $\mathbf{a}^{c}_{kij} = [1, e^{j2\pi f^{c}_{D,k,i,j}}, \ldots, e^{j2\pi(N-1)f^{c}_{D,k,i,j}}]^T$ denotes the Doppler steering vector at radar $R_{ki}$ associated with the clutter; $f^{c}_{D,k,i,j}$ denotes the normalized Doppler shift at radar $R_{ki}$ arising from the clutter's movement when reflecting the transmitted signal from radar $R_{kj}$; and each element of $\mathbf{n}$ is white Gaussian noise (WGN) with variance $\sigma_n^2$. The received signal $\mathbf{x}_{ki}$ is subsequently sent to a bank of matched filters, matching each of the orthogonal waveforms and incorporating the Doppler effect as $\psi_{ki}(n)\,\mathbf{a}_{kij}$, $i = 1, \ldots, M$. Subsequently, the corresponding energy at the output of the matched filter is accumulated. The expected energy of the signal originating from the target direction for radar $R_{ki}$ then follows from $\alpha_{kji} \sim \mathcal{CN}(0, h_{kji})$, where $h_{kji}$ denotes the variance of the desired channel gain, which includes the information on the target's RCS.

As observed from Fig. 1 and equation (2), the detection of a target is degraded by direct inter-cluster interference, in addition to the clutter effect and the noise power. The expected power of the accumulated interference and noise for radar $R_{ki}$ is therefore modeled in terms of the following quantities. $\sigma_n^2$ denotes the noise power. The clutter gain satisfies $c_{kji} \sim \mathcal{CN}(0, \nu_{kji}\xi_{kji})$, where $\nu_{kji}$ defines the variance of the accumulated clutter channel gains, embedding information on the clutter's RCS, and $\xi_{kji}$ accounts for the correlation factor associated with the difference of the Doppler frequencies between the target and the clutter. For the rest of this paper, we include the Doppler correlation factor $\xi_{kji}$ in the term $\nu_{kji}$ for simplicity. Similarly, $\beta_{\ell jki}$ is zero-mean complex Gaussian, and its variance is the product of $\mu_{\ell jki}$, the variance of the accumulated direct cross-channel gain, and a non-zero correlation factor between the waveform emitted from radar $R_{\ell j}$ and the matched filtering waveform at radar receiver $R_{ki}$. Without loss of generality, we absorb this waveform correlation factor and denote the whole term by $\mu_{\ell jki}$ in the sequel. The inter-cluster target reflection gain $\varphi_{\ell jki}$ is zero-mean complex Gaussian with variance equal to the variance of the accumulated inter-cluster target reflection gain, which includes information on the target's RCS, multiplied by the correlation factor between the target-reflected waveform emitted from radar $R_{\ell j}$ and the matched filtering waveform at radar receiver $R_{ki}$. Likewise, $\varphi^{c}_{\ell jki}$ is zero-mean complex Gaussian with variance equal to the variance of the accumulated inter-cluster clutter reflection gain multiplied by the correlation factor between the clutter-reflected waveform emitted from radar $R_{\ell j}$ and the matched filtering waveform at radar receiver $R_{ki}$. We assume that the variances of the inter-cluster target and clutter reflection gains are significantly smaller than the variance of the direct cross-channel gain, and hence they are neglected for the rest of this work. Also, as we do not assume any prior knowledge of the interference coming from radars in other clusters but only estimate it, these terms can also be absorbed in $\mu_{\ell jki}$.
Based on the above definitions, the expected SINR for the $i$th radar in the $k$th cluster is the ratio of the expected target signal energy to the expected power of the accumulated interference and noise:
$$\gamma_{ki} = \frac{\sum_{j=1}^{M} h_{kji}\, p_{kj}}{\sum_{j=1}^{M} \nu_{kji}\, p_{kj} + \sum_{\ell \neq k}\sum_{j=1}^{M} \mu_{\ell jki}\, p_{\ell j} + \sigma_n^2}. \tag{5}$$
In order to design an efficient detector for the hypothesis testing, we utilize the GLRT. The clutter and interference contribution is treated as Gaussian noise, so that the probability density functions of $\mathbf{x}_{ki}$ under the hypotheses $H_0$ and $H_1$ are Gaussian with noise variances $\sigma^2_{H_0}$ and $\sigma^2_{H_1}$, respectively, and the density under $H_1$ depends on $\boldsymbol{\alpha}_{ki} = [\alpha_{k1i}, \ldots, \alpha_{kMi}]^T$. The maximum likelihood (ML) estimate of the noise variance under the hypothesis $H_0$, when there is no target present, is $\hat{\sigma}^2_{H_0} = \|\mathbf{x}_{ki}\|^2/N$. Subsequently, by keeping $\sigma^2_{H_1}$ fixed and differentiating $f_{H_1}$ with respect to $\alpha_{kji}$, the ML estimate for $\alpha_{kji}$, $\forall i = 1, \ldots, M$, is given by $\hat{\alpha}_{kji} = \mathbf{s}_{kij}^H\mathbf{x}_{ki}/N$. After obtaining the ML estimate for $\alpha_{kji}$, we substitute it in (7) and maximize with respect to $\sigma^2_{H_1}$ to derive the ML estimate for $\sigma^2_{H_1}$. The GLRT is then formulated in terms of these estimates, with $\lambda_{ki}$ denoting the detection threshold for the hypothesis testing for each radar $i = 1, \ldots, M$ in cluster $k$. The detector performance is generally assessed in terms of the probabilities of detection $P_d$ and false alarm $P_{fa}$ for each radar. It is shown in [16] that, as the number of samples approaches infinity, the performance of the GLRT is similar to that of the Neyman-Pearson detector. Consequently, the threshold for hypothesis testing $\lambda_{ki}$ can be obtained from the desired probability of false alarm $P_{fa}$ [6], [17]-[19]. The probability of detection, however, depends on both the threshold and the SINR associated with the received signal. Hence, for a given $P_{fa}$ and a desired $P_d$, it is possible to determine the desired SINR $\gamma^*_{ki}$ [6], [17]-[19]. We therefore formulate our game-theoretic resource allocation problem as minimizing transmission power while achieving a desired SINR, as presented in the next section.

III. GAME-THEORETIC FORMULATION
As described in the previous sections, the main goal for each cluster is to decide the optimal power allocation for its respective radars, while attaining a specific detection criterion. As we observe from the SINR equation (5), although increased power allocation at a specific cluster improves its detection performance, it induces higher interference to the environment and consequently to the remaining radars of the network. Therefore, by exploiting noncooperative game-theoretic techniques, we model this interaction as a generalized Nash game. The set of clusters $C = \{C_1, \ldots, C_K\}$ are considered to be the players of the game. The action set of the $k$th player is $\mathcal{P}_k = \mathcal{P}_{k1} \times \cdots \times \mathcal{P}_{kM}$, where $\mathcal{P}_{ki}$ is the set of admissible transmission powers of radar $R_{ki}$. The acceptable strategy set of the GNG depends both on the action set of the $k$th player $\mathcal{P}_k$ and on the actions of all other players $\mathcal{P}_{-k}$, and is defined as
$$S_k(\mathbf{p}_{-k}) = \{\mathbf{p}_k \in \mathcal{P}_k : \gamma_{ki} \geq \gamma^*_{ki},\ \forall i = 1, \ldots, M\}, \tag{9}$$
where $\mathbf{p}_{-k}$ denotes the power allocation adopted by all other players except player $k$. Let us also define $\mathbf{p}_k = [p_{k1}, \ldots, p_{kM}]^T$ as the power allocation vector of cluster $k$. It is evident from equation (5) that the SINR is a function of the power allocation of all $K$ players. Thus, the interdependency of the admissible strategies is clearly stated through the constraints in (9). The game model is completed by defining the utility function as $u_k(\mathbf{p}_{-k}, \mathbf{p}_k) = \sum_{i=1}^{M} p_{ki}$, which represents the total transmission power of cluster $k$. At this point, we can summarize the game $G$ by its players $C$, the strategy sets $S_k(\mathbf{p}_{-k})$ and the utility functions $u_k$. In this GNG, player $k$ greedily minimizes its transmission power, while all radars belonging to cluster $k$ attain the target SINR, given the power allocation strategies of all the other players. Therefore, the best action for the $k$th player is given by the following set, denoted by $BR_k$:
$$BR_k(\mathbf{p}_{-k}) = \arg\min_{\mathbf{p}_k \in S_k(\mathbf{p}_{-k})} u_k(\mathbf{p}_{-k}, \mathbf{p}_k).$$
Recalling the action set of player $k$, the above set can be determined by solving the following convex optimization problem:
$$\min_{\mathbf{p}_k \in \mathcal{P}_k} \ \sum_{i=1}^{M} p_{ki} \quad \text{s.t.} \quad \gamma_{ki} \geq \gamma^*_{ki}, \ \forall i = 1, \ldots, M. \tag{10}$$
It is apparent that, for any cluster of radars, if the optimization problem in the absence of inter-cluster interference and noise is infeasible, then the optimization in the presence of the inter-cluster interference and noise is also infeasible. Hence, it is important to ensure that the SINR targets are set such that the following signal-to-intra-cluster interference ratio (SICIR) is achievable:
$$\frac{\sum_{j=1}^{M} h_{kji}\, p_{kj}}{\sum_{j=1}^{M} \nu_{kji}\, p_{kj}} \geq \gamma^*_{ki}, \quad \forall i = 1, \ldots, M. \tag{11}$$
It can be deduced from (11) that, for certain values of $h_{kji}$ and $\nu_{kji}$, (10) could become infeasible for high SINR targets. In such cases, when low SINR targets are required for ensuring feasibility, we could rely on increasing the dwell time of the radar on the target. In other words, by increasing the number of signal return samples $N$, a lower SINR target $\gamma^*_{ki}$ can be used for detection, as shown in [6]. Hence, in this work, we assume that the SINR targets are appropriately chosen, such that for the given target and clutter channel realizations, namely $h_{kji}$ and $\nu_{kji}$, the constraints in (11) are achievable, i.e., the convex optimization problem (10) is feasible in the absence of inter-cluster interference and noise. Moreover, if the problem is feasible in the absence of inter-cluster interference and noise, then it is also feasible in their presence, as stated by the following proposition.
Proposition 1: If the convex optimization problem in (10) in the absence of inter-cluster interference and noise is feasible, then the problem is also feasible in the presence of inter-cluster interference and noise.
Proof. The proof is based on showing that there exists a positive scaling factor for the power allocation such that the SINRs asymptotically approach the SICIRs, as demonstrated in [20]. Let the inter-cluster interference plus noise term be denoted as $r_{-ki} = \sum_{\ell \neq k}\sum_{j=1}^{M} \mu_{\ell jki}\, p_{\ell j} + \sigma_n^2$. Since $r_{-ki}$ is strictly positive by definition, the SINRs of the general model are strictly lower than the SICIRs obtained in the absence of $r_{-ki}$, namely
$$\frac{\sum_{j=1}^{M} h_{kji}\, p_{kj}}{\sum_{j=1}^{M} \nu_{kji}\, p_{kj} + r_{-ki}} < \frac{\sum_{j=1}^{M} h_{kji}\, p_{kj}}{\sum_{j=1}^{M} \nu_{kji}\, p_{kj}}$$
for every radar in the system. However, after scaling the power allocation $\mathbf{p}_k$ to $\beta\mathbf{p}_k$ for an appropriately large $\beta > 0$ and dividing both the numerator and the denominator of the left-hand side of the above inequality by $\beta$, the term $r_{-ki}/\beta$ approaches zero for arbitrarily large $\beta$, so the SINRs approach the SICIRs within any required accuracy. Since the optimization problem with SICIR constraints is feasible, the power allocation vector $\mathbf{p}_k$ is non-negative. Hence, there exists a scaled non-negative power allocation vector that also renders the problem in the presence of inter-cluster interference and noise feasible.
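The scaling argument in the proof of Proposition 1 can be illustrated numerically. The following is a minimal sketch, assuming that the SINR has the form signal / (intra-cluster clutter + r) with r > 0 the inter-cluster interference plus noise; the function names and all gain, power and interference values are hypothetical and chosen only for illustration.

```python
# Sketch: scaling a cluster's power vector by beta drives each SINR towards
# its SICIR. Assumed SINR form: signal / (intra-cluster clutter + r), where
# r > 0 is the inter-cluster interference plus noise. All numbers are
# hypothetical.


def sinr(p, h, nu, r):
    """Per-receiver SINR; h[i][j] and nu[i][j] map transmitter j to receiver i."""
    m = len(p)
    values = []
    for i in range(m):
        signal = sum(h[i][j] * p[j] for j in range(m))
        clutter = sum(nu[i][j] * p[j] for j in range(m))
        values.append(signal / (clutter + r[i]))
    return values


def sicir(p, h, nu):
    """Signal-to-intra-cluster-interference ratio: the SINR with r removed."""
    return sinr(p, h, nu, [0.0] * len(p))


def scaled_sinr(beta, p, h, nu, r):
    """SINR after scaling the whole power vector by beta > 0."""
    return sinr([beta * x for x in p], h, nu, r)


H = [[1.0, 0.5], [0.4, 1.2]]        # hypothetical target gain variances
NU = [[0.10, 0.05], [0.05, 0.10]]   # hypothetical clutter gain variances
P = [1.0, 1.0]                      # hypothetical power vector of the cluster
R = [0.5, 0.8]                      # hypothetical interference-plus-noise terms

if __name__ == "__main__":
    print("SICIR:", sicir(P, H, NU))
    for beta in (1.0, 10.0, 100.0, 10000.0):
        print("beta =", beta, "SINR:", scaled_sinr(beta, P, H, NU, R))
```

In this instance the SINRs stay strictly below the SICIRs for every finite scaling factor and approach them as the factor grows, which is the behaviour the proof relies on.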
A crucial part of a game-theoretic analysis is to investigate whether the game $G$ converges to a stable solution, where no player can benefit by unilaterally deviating from its power allocation strategy. Such a solution defines the Nash equilibrium, and for the game $G$ it is the strategy profile $(\mathbf{p}^*_{-k}, \mathbf{p}^*_k)$ for which
$$u_k(\mathbf{p}^*_{-k}, \mathbf{p}^*_k) \leq u_k(\mathbf{p}^*_{-k}, \mathbf{p}_k), \quad \forall \mathbf{p}_k \in S_k(\mathbf{p}^*_{-k}), \ \forall k = 1, \ldots, K.$$
It is evident from the constraints of (10) and the definition of the SINR (5) that each radar in cluster $k$ requires knowledge of the inter-cluster interference plus noise term, $r_{-ki}$, in order to determine its optimal power allocation. However, since we assume no communication between the clusters, it is difficult to obtain the required information, and thus we overcome this deficiency by using an estimate $\hat{\gamma}_{ki}$ of the instantaneous SINR, following an approach similar to that discussed in [6]. Section V describes the power allocation optimization based on the estimate of the SINR.

IV. EXISTENCE AND UNIQUENESS OF THE NASH EQUILIBRIUM
A. Existence
The existence of a generalized Nash equilibrium (GNE) follows from the result in [21] on abstract economies. According to this result, a GNE exists if the following hold for all players $k = 1, \ldots, K$: the set $\mathcal{P}_k$ is compact and convex; the utility function $u_k(\mathbf{p}_{-k}, \mathbf{p}_k)$ is continuous on $\mathcal{P}$ and quasi-convex in $\mathbf{p}_k$; and, for every $\mathbf{p}_{-k}$, the set-valued function $S_k$ is continuous with closed graph and the set $S_k(\mathbf{p}_{-k})$ is nonempty and convex. For our problem, these requirements can be straightforwardly established using analytic notions; hence, there exists a GNE for our game.

B. Uniqueness of the Solution through Duality Analysis
The main contribution of this paper lies in the analysis and the derivation of the proof of the uniqueness of the Nash equilibrium for the strategic noncooperative game $G$. According to the result in [22], and since the existence of a GNE is secured, our primary objective is to prove that the best response of each cluster is a standard function, which is a sufficient condition for the uniqueness of the solution. By exploiting the convexity of the optimization problem (10), we derive the respective Lagrangian, the Karush-Kuhn-Tucker (KKT) conditions and the Lagrangian dual problem. The analysis of the KKT conditions is necessary for the equilibrium analysis, as some of the radars may achieve the desired SINR with strict inequality, i.e., they could over-satisfy the SINR requirement. First, we reproduce the definition of a standard function [22] as follows: a function $\mathbf{F}(\mathbf{x})$ is standard if, for all $\mathbf{x} \geq 0$, the following properties hold:
• Positivity: $\mathbf{F}(\mathbf{x}) > 0$.
• Monotonicity: if $\mathbf{x} \geq \mathbf{x}'$, then $\mathbf{F}(\mathbf{x}) \geq \mathbf{F}(\mathbf{x}')$.
• Scalability: for all $a > 1$, $a\mathbf{F}(\mathbf{x}) > \mathbf{F}(a\mathbf{x})$.
In order to prove that the best response function of each cluster is a standard function, we consider the optimization problem of the $k$th cluster as defined in (10). By rearranging the constraints in matrix form and explicitly imposing constraints for non-negative radar power, we have the following minimization problem for the $k$th cluster:
$$\min_{\mathbf{p}_k} \ \mathbf{1}_M^T\mathbf{p}_k \quad \text{s.t.} \quad \mathbf{G}\mathbf{p}_k + \mathbf{r}_{-k} \leq \mathbf{0}, \quad -\mathbf{p}_k \leq \mathbf{0}, \tag{12}$$
where $\mathbf{r}_{-k} = [r_{-k1}, \ldots, r_{-kM}]^T$ is the inter-cluster interference plus noise vector, which is determined by the power allocations of the other clusters through the cross-channel matrices and by the noise power, and the $M \times M$ matrix $\mathbf{G}$ is determined by the target and clutter channel gains of cluster $k$ and by the target SINRs.

For the multistatic scenario considered in this paper, it is possible that some of the radars in a cluster do not transmit any signal; however, they may use the signals generated by peer radars within the same cluster as signals of opportunity to achieve the desired SINR and to detect the target. When all radars are active, it is straightforward to establish uniqueness of the GNE, as will be shown in the forthcoming analysis. When at least one of the radars in a cluster is inactive, however, the establishment of the Nash equilibrium requires further analysis in terms of the KKT conditions, as presented later in this section. Hence, we define the Lagrangian $L$ associated with the problem (12) as:
$$L(\mathbf{p}_k, \boldsymbol{\lambda}_a, \boldsymbol{\lambda}_b) = \mathbf{1}_M^T\mathbf{p}_k + \boldsymbol{\lambda}_a^T(\mathbf{G}\mathbf{p}_k + \mathbf{r}_{-k}) - \boldsymbol{\lambda}_b^T\mathbf{p}_k, \tag{13}$$
where $\boldsymbol{\lambda}_a = [\lambda_1, \ldots, \lambda_M]^T$ and $\boldsymbol{\lambda}_b = [m_1, \ldots, m_M]^T$ are the Lagrange multipliers associated with the inequality constraints of (12). Let $(\mathbf{p}^*_k, \boldsymbol{\lambda}^*_a, \boldsymbol{\lambda}^*_b)$ be the primal and dual optimal points of (12). Since the problem is convex, the KKT conditions must be satisfied [23]. In particular, in addition to primal and dual feasibility, we have:
$$\mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}^*_a - \boldsymbol{\lambda}^*_b = \mathbf{0}, \tag{14}$$
$$\lambda^*_i\Big(\sum_{j=1}^{M}\mathbf{G}(i,j)\,p^*_{kj} + r_{-ki}\Big) = 0, \quad i = 1, \ldots, M, \tag{15}$$
$$m^*_i\, p^*_{ki} = 0, \quad i = 1, \ldots, M. \tag{16}$$
In order to investigate all of the potential outcomes of the game $G$, we consider three different cases with respect to the values of the Lagrange multipliers $\boldsymbol{\lambda}_a$ associated with the SINR constraints. Firstly, we study the case when all of the radars achieve the SINR target with equality. In this case, all of the corresponding Lagrange multipliers are non-zero and the uniqueness is proved straightforwardly using the definition of the standard function. The second case is when all of the Lagrange multipliers are zero. It is shown that this case is impossible. The final case is when certain radars achieve the desired SINR with equality while the remaining radars in the cluster over-satisfy the SINR target. In this case, we have both zero and non-zero Lagrange multipliers, and we resort to the analysis of the Lagrange dual problem, the Lagrangian function and the KKT conditions to establish the GNE. The mathematical analysis of the proof of the uniqueness of the solution considering all possible cases is presented below.

Case 1: $\lambda_i \neq 0$, $\forall i = 1, \ldots, M$. In this case, the set of equalities (15) from the KKT conditions implies that all of the SINR inequality constraints are active and must be satisfied with equality. Hence, by reformulating (15) in matrix form, we have $\mathbf{G}\mathbf{p}^*_k = -\mathbf{r}_{-k}$. Following Proposition 1, we assume that the optimization problem (12) is always feasible $\forall \mathbf{r}_{-k} > 0$; hence $\mathbf{G}$ must be invertible and $\mathbf{p}^*_k = -\mathbf{G}^{-1}\mathbf{r}_{-k} > 0$. This case corresponds to the scenario when all of the radars are active and actually transmit signals. As a result, with the interference vector $\mathbf{r}_{-k}$ expressed through the power allocations $\mathbf{p}_{-k}$ of the other clusters, the best response function can be stated as:
$$BR_k(\mathbf{p}_{-k}) = \mathbf{p}^*_k = -\mathbf{G}^{-1}\mathbf{r}_{-k}. \tag{17}$$
Lemma 1: The best response function (17) is a standard function.
Proof. Following [24], the best response strategy (17) satisfies the following necessary properties for all $\mathbf{p}_{-k} \geq 0$: a) Positivity: The best response of the $k$th cluster $\mathbf{p}^*_k$ is always positive, as $\mathbf{p}^*_k = -\mathbf{G}^{-1}\mathbf{r}_{-k} > 0$ for every $\mathbf{r}_{-k} > 0$. This concludes the proof on the uniqueness for Case 1, where all of the SINR constraints are satisfied with equality.
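As a numerical illustration of the Case 1 best response, the sketch below evaluates $\mathbf{p}^*_k = -\mathbf{G}^{-1}\mathbf{r}_{-k}$ for a two-radar cluster and checks the three standard-function properties on one instance. It is not a proof. The entries of G are assumed here to be `nu[i][j] - h[i][j] / gamma[i]`, the interference vector is assumed to be `MX @ p_other + NOISE`, and all names and numbers are hypothetical.

```python
# Sketch of the Case 1 best response p_k* = -G^{-1} r_{-k} for M = 2 radars.
# Assumptions (illustrative only): G(i, j) = nu[i][j] - h[i][j] / gamma[i],
# r_{-k} = MX @ p_other + NOISE, and every number below is hypothetical.


def solve2(a, b):
    """Solve the 2x2 linear system a x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]


H = [[1.0, 0.05], [0.05, 1.0]]      # target gain variances of the cluster
NU = [[0.05, 0.05], [0.05, 0.05]]   # clutter gain variances of the cluster
GAMMA = [2.0, 2.0]                  # target SINRs
MX = [[0.02, 0.01], [0.01, 0.03]]   # cross-channel gains from the other cluster
NOISE = 0.1                         # noise power

G = [[NU[i][j] - H[i][j] / GAMMA[i] for j in range(2)] for i in range(2)]


def interference(p_other):
    """Inter-cluster interference plus noise seen by each receiver."""
    return [sum(MX[i][j] * p_other[j] for j in range(2)) + NOISE
            for i in range(2)]


def best_response(p_other):
    """Power vector that meets both SINR targets with equality: G p = -r."""
    r = interference(p_other)
    neg_g = [[-g for g in row] for row in G]
    return solve2(neg_g, r)


if __name__ == "__main__":
    base = best_response([1.0, 1.0])
    print("best response:", base)
    print("positivity:", all(x > 0 for x in best_response([0.0, 0.0])))
    bigger = best_response([2.0, 3.0])
    print("monotonicity:", all(bigger[i] >= base[i] for i in range(2)))
    doubled = best_response([2.0, 2.0])
    print("scalability:", all(2.0 * base[i] > doubled[i] for i in range(2)))
```

The scalability check passes here because doubling the interfering powers does not double the noise part of $\mathbf{r}_{-k}$; whether the properties hold for other gain values is not established by this sketch.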
Case 2: $\lambda_i = 0$, $\forall i = 1, \ldots, M$. Assuming $\lambda_1 = \lambda_2 = \ldots = \lambda_M = 0$, then from (14) we have that $m_1 = \ldots = m_M = 1$. By substituting in (16), we obtain $p_{k1} = \ldots = p_{kM} = 0$, which indicates that all the radars in cluster $k$ are inactive. Consequently, the constraints of the optimization problem (12) reduce to $r_{-k1}, \ldots, r_{-kM} \leq 0$, which is a contradiction, since the inter-cluster interference plus noise terms are always positive, i.e., $r_{-k1}, \ldots, r_{-kM} > 0$. As a result, at least one of the radars in the cluster must be active in order for the optimization problem (12) to be feasible.
Theorem 1: In the case when exactly n radars in cluster k achieve the SINR constraints with equality, then at least M − n radars in cluster k remain inactive and do not generate any signals.
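Proof. In order to investigate the interdependence between the number of radars that satisfy the SINR constraint with equality and the number of radars that are active and actually generate illuminating waveforms in cluster $k$, a careful analysis of the Lagrange multipliers $\boldsymbol{\lambda}_b$ is essential. Hence, we obtain the Lagrange dual function $g$ as:
$$g(\boldsymbol{\lambda}_a, \boldsymbol{\lambda}_b) = \inf_{\mathbf{p}_k} L(\mathbf{p}_k, \boldsymbol{\lambda}_a, \boldsymbol{\lambda}_b) = \inf_{\mathbf{p}_k}\Big[(\mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a - \boldsymbol{\lambda}_b)^T\mathbf{p}_k + \boldsymbol{\lambda}_a^T\mathbf{r}_{-k}\Big]. \tag{18}$$
It is straightforward from (18) that the Lagrangian is an affine function of $\mathbf{p}_k$ and is bounded below only when $\mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a - \boldsymbol{\lambda}_b = \mathbf{0}$. Thus, it follows that
$$g(\boldsymbol{\lambda}_a, \boldsymbol{\lambda}_b) = \begin{cases} \boldsymbol{\lambda}_a^T\mathbf{r}_{-k}, & \mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a - \boldsymbol{\lambda}_b = \mathbf{0}, \\ -\infty, & \text{otherwise}. \end{cases} \tag{19}$$
Next, we formulate the Lagrange dual problem as:
$$\max_{\boldsymbol{\lambda}_a, \boldsymbol{\lambda}_b} \ \boldsymbol{\lambda}_a^T\mathbf{r}_{-k} \quad \text{s.t.} \quad \boldsymbol{\lambda}_b = \mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a, \quad \boldsymbol{\lambda}_a \geq \mathbf{0}, \quad \boldsymbol{\lambda}_b \geq \mathbf{0}. \tag{20}$$
By excluding the case when $g$ is infinite, changing the sign of the objective function and exploiting the fact that, from (20), $\boldsymbol{\lambda}_b = \mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a$, we can rewrite the aforementioned maximization problem as the following minimization problem:
$$\min_{\boldsymbol{\lambda}_a} \ -\boldsymbol{\lambda}_a^T\mathbf{r}_{-k} \quad \text{s.t.} \quad \mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a \geq \mathbf{0}, \quad \boldsymbol{\lambda}_a \geq \mathbf{0}. \tag{21}$$
Proposition 2: For any feasible optimization problem (12), at least one of the elements in each row of matrix $\mathbf{G}$ must be negative.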
Proof. If every element in any row of $\mathbf{G}$ is positive, the left-hand side of the corresponding SINR constraint $\mathbf{G}\mathbf{p}_k + \mathbf{r}_{-k} \leq \mathbf{0}$ in (12) will always be positive, since $\mathbf{p}_k \geq \mathbf{0}$ and $\mathbf{r}_{-k}$ is strictly positive. Hence, the constraints $\mathbf{G}\mathbf{p}_k + \mathbf{r}_{-k} \leq \mathbf{0}$ are violated and the convex problem (12) is rendered infeasible.
The overall aim of the dual problem (21) is to obtain the largest possible $\boldsymbol{\lambda}_a$ in order to minimize the cost, while satisfying $\boldsymbol{\lambda}_b = \mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a \geq \mathbf{0}$. However, $\boldsymbol{\lambda}_a$ cannot grow unbounded, because this would violate the constraint $\mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a \geq \mathbf{0}$, since at least one element per row of $\mathbf{G}$ (or column of $\mathbf{G}^T$) is negative. Consequently, in order to minimize the objective (i.e., maximize $\boldsymbol{\lambda}_a^T\mathbf{r}_{-k}$), $\boldsymbol{\lambda}_a$ will grow until exactly $n$ elements of the vector $\mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a$ are equal to zero. In other words, due to the $n$ degrees of freedom (i.e., the number of non-zero elements of the Lagrange multiplier vector $\boldsymbol{\lambda}_a$), it is possible to obtain $\boldsymbol{\lambda}_a$ such that exactly $n$ rows of the constraints $\mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a \geq \mathbf{0}$ are satisfied with equality and the rest with strict inequality (there are $n$ linear combinations to set $n$ constraints equal to zero). Subsequently, evaluating $\boldsymbol{\lambda}_b = \mathbf{1}_M + \mathbf{G}^T\boldsymbol{\lambda}_a$ from (20) at this $\boldsymbol{\lambda}_a$ gives (22), from which it is evident that exactly $n$ elements of the Lagrange multiplier vector $\boldsymbol{\lambda}_b$ are equal to zero and the remaining $M - n$ elements are positive. Due to the complementary slackness condition in (16), at least $M - n$ values of the vector $\mathbf{p}_k$ are zero, i.e., at least $M - n$ radars in the respective cluster would opt to remain silent and would not transmit any signals.
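The statement of Theorem 1 can be checked on a small instance. The sketch below solves problem (12) for a two-radar cluster by enumerating the vertices of the feasible region (a simple substitute for a general linear-programming solver, not the method used in this paper) and then lists the active radars and the SINR constraints met with equality. The matrices and interference vectors are hypothetical.

```python
# Sketch: solve min 1^T p s.t. G p + r <= 0, p >= 0 for M = 2 by vertex
# enumeration, then count active radars and tight SINR constraints.
# The matrices G and vectors r used below are hypothetical.
from itertools import combinations


def solve_cluster_lp(G, r, tol=1e-9):
    """Return the minimum-sum feasible vertex, or None if infeasible."""
    # Every constraint is written in the form a . p >= b.
    rows = [([-G[0][0], -G[0][1]], r[0]),
            ([-G[1][0], -G[1][1]], r[1]),
            ([1.0, 0.0], 0.0),
            ([0.0, 1.0], 0.0)]
    best = None
    for (a1, b1), (a2, b2) in combinations(rows, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel boundaries define no vertex
        p = [(b1 * a2[1] - a1[1] * b2) / det,
             (a1[0] * b2 - a2[0] * b1) / det]
        feasible = all(a[0] * p[0] + a[1] * p[1] >= b - tol for a, b in rows)
        if feasible and (best is None or sum(p) < sum(best) - tol):
            best = p
    return best


def summarize(G, r, tol=1e-7):
    """Return (p, indices of active radars, indices of tight SINR constraints)."""
    p = solve_cluster_lp(G, r)
    active = [j for j in range(2) if p[j] > tol]
    tight = [i for i in range(2)
             if abs(G[i][0] * p[0] + G[i][1] * p[1] + r[i]) <= tol]
    return p, active, tight


G_ONE_ACTIVE = [[-1.0, -0.1], [-0.8, -0.2]]      # radar 2 is an inefficient transmitter
G_BOTH_ACTIVE = [[-0.45, 0.025], [0.025, -0.45]]

if __name__ == "__main__":
    print(summarize(G_ONE_ACTIVE, [1.0, 1.0]))
    print(summarize(G_ONE_ACTIVE, [1.0, 0.5]))
    print(summarize(G_BOTH_ACTIVE, [0.1, 0.1]))
```

In the first matrix, one SINR constraint is tight and one radar stays silent for both interference vectors, and the silent radar is the same in both runs, in line with Theorem 1 and Corollary 1 for this instance.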
Corollary 1: The indices of the radars that are inactive in a cluster are determined only by the target and clutter channel characteristics of the corresponding cluster and the target SINR, and are independent of the actions of the other clusters and the corresponding cross-clutter channel interference.
Proof. It follows directly from the proof of Theorem 1 and equation (22) that the indices of the radars that remain silent in a cluster depend solely on the matrix $\mathbf{G}$, whose elements are functions of the channel gains and the target SINR of the corresponding cluster.
The finding in Corollary 1 is very important for the Nash equilibrium analysis. When a subset of radars is inactive in a cluster, the action set of that cluster, in terms of power allocation, reduces to the power allocation of the radars that will eventually be active. In other words, determining the indices of the inactive radars is not part of the action set of the game, as these indices are not influenced by the actions of the other clusters. Hence, the best response function used in the standard function analysis should include only the power allocation of the active radars. Furthermore, the distributed nature of Corollary 1 strengthens the decentralized approach of the considered model.
By revisiting equation (15) from the KKT conditions of the convex optimization problem (12), the SINR constraints corresponding to $\lambda_1, \dots, \lambda_n \neq 0$ are satisfied with equality. All other SINR constraints are satisfied with strict inequality. Hence, the optimal power allocation can be obtained using only the SINR equations that achieve equality. In this setting, the other radar receivers automatically satisfy their SINR constraints with inequality. Therefore, we consider only the active antennas and the receivers that achieve the SINR target with equality, and obtain the following reduced-dimensional matrix equation:

$$\mathbf{G}^{red}\mathbf{q}^*_k + \mathbf{r}^{red}_{-k} = \mathbf{0}, \qquad (23)$$

where $\mathbf{q}^*_k = [p_{11}, \dots, p_{1n}]^T$, $\mathbf{r}^{red}_{-k} = [r_{-k1}, \dots, r_{-kn}]^T$ and the reduced square $n \times n$ matrix $\mathbf{G}^{red}$ collects the entries of $\mathbf{G}$ associated with these $n$ receivers and the $n$ active radars. It is straightforward that the solution of this set of $n$ equations depends solely on the matrix $\mathbf{G}^{red}$, which is determined by the channel gains of cluster $k$ and by the target SINR. Hence, as the problem is always feasible (Proposition 1) $\forall\, \mathbf{r}^{red}_{-k} > \mathbf{0}$, $\mathbf{G}^{red}$ must be invertible and the best response function of cluster $k$ in this case can be defined as:

$$BR_k(\mathbf{p}_{-k}) = \mathbf{q}^*_k = -\left(\mathbf{G}^{red}\right)^{-1}\mathbf{r}^{red}_{-k}. \qquad (24)$$

When $\mathbf{G}^{red}$ from (24) is full rank and $n$ radars in cluster $k$ attain the SINR with equality, exactly $n$ radars in cluster $k$ will be active and actually transmitting, whereas the remaining $(M - n)$ radars will remain inactive. However, it is theoretically possible to have certain channel gains, clutter gains and target SINR such that $n$ radars attain the SINR with equality while fewer than $n$ radars are active. This happens when $\mathbf{G}^{red}$ is rank deficient or when any column of $\mathbf{G}^{red}$ is collinear with $\mathbf{r}^{red}_{-k}$. In the latter case, for example, all $n$ radars may achieve the SINR with equality while only one radar is transmitting. Although this happens with almost zero probability, the following lemma remains applicable to this scenario with a reduced-size $\mathbf{G}^{red}$. Hence, without loss of generality, we consider the case of full-rank $\mathbf{G}^{red}$.
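As a small worked instance of the reduced system, the sketch below solves $\mathbf{G}^{red}\mathbf{q} + \mathbf{r}^{red}_{-k} = \mathbf{0}$ for $n = 2$. The matrix `G_red`, the vector `r_red` and the helper names are hypothetical values chosen for illustration; only the equality-constraint structure is taken from the text.

```python
def solve_2x2(A, b):
    """Solve A x = b for a 2x2 matrix A using Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if abs(det) < 1e-12:
        raise ValueError("G_red is (numerically) singular")
    x0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    x1 = (A[0][0] * b[1] - b[0] * A[1][0]) / det
    return [x0, x1]


def best_response(G_red, r_red):
    """Powers of the active radars: solve G_red q + r_red = 0."""
    return solve_2x2(G_red, [-ri for ri in r_red])


# Hypothetical reduced matrix (negative diagonal, positive off-diagonal)
# and hypothetical interference-plus-noise terms.
G_red = [[-1.0, 0.2], [0.3, -1.0]]
r_red = [0.1, 0.2]

q_star = best_response(G_red, r_red)

# Residual of the equality constraints; it should vanish up to rounding.
residual = [
    sum(g * q for g, q in zip(row, q_star)) + ri
    for row, ri in zip(G_red, r_red)
]
print(q_star, residual)
```

For larger $n$, a general linear solver would replace the 2x2 helper; the rank-deficient case discussed above corresponds to the `ValueError` branch.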
Lemma 2: The best response function (24) is a standard function.
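Proof. Following Lemma 1, the best response strategy (24) satisfies the following necessary properties for all $\mathbf{p}_{-k} \geq \mathbf{0}$:

a) Positivity: the best response of the $k$th cluster, $\mathbf{q}^*_k$, is always positive, as $\mathbf{r}^{red}_{-k} > \mathbf{0}$ and $\mathbf{q}^*_k$ contains only the powers of the active radars.

b) Monotonicity: for $\mathbf{p}'_{-k} \geq \mathbf{p}_{-k}$, we have from Lemma 1 that $\mathbf{r}'_{-k} \geq \mathbf{r}_{-k}$ element-wise and consequently $\mathbf{r}'^{red}_{-k} \geq \mathbf{r}^{red}_{-k}$. As a result, $BR_k(\mathbf{p}'_{-k}) \geq BR_k(\mathbf{p}_{-k})$.

c) Scalability: using the same approach as in Lemma 1, for all $a > 1$ we must show that $a\,BR_k(\mathbf{p}_{-k}) > BR_k(a\,\mathbf{p}_{-k})$. Indeed, $\mathbf{r}_{-k}$ consists of the inter-cluster interference, which scales with $\mathbf{p}_{-k}$, plus a strictly positive noise term, which does not; hence $a\,\mathbf{r}^{red}_{-k}(\mathbf{p}_{-k}) > \mathbf{r}^{red}_{-k}(a\,\mathbf{p}_{-k})$ element-wise and the inequality follows from (24).

Lemma 2 completes the proof of the uniqueness of the Nash equilibrium of the GNG $\mathcal{G}$, considering all possible cases.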
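The three standard-function properties can be spot-checked numerically. The sketch below assumes, for illustration only, that the interference-plus-noise term is affine in the other clusters' powers, $\mathbf{r}_{-k} = \mathbf{C}\mathbf{p}_{-k} + \sigma^2\mathbf{1}$; the matrices `G_RED` and `C`, the noise power `SIGMA2` and the test points are hypothetical. A numerical check at a few points is not a proof.

```python
# Hypothetical model data for one cluster with two active radars.
G_RED = [[-1.0, 0.2], [0.3, -1.0]]   # reduced constraint matrix
C = [[0.05, 0.02], [0.01, 0.04]]     # assumed cross-cluster gains
SIGMA2 = 0.01                        # assumed noise power


def interference_plus_noise(p_other):
    """Assumed affine model r = C p_other + sigma^2."""
    return [sum(c * p for c, p in zip(row, p_other)) + SIGMA2 for row in C]


def best_response(p_other):
    """q = -G_RED^{-1} r, using the closed-form inverse of a 2x2 matrix."""
    r = interference_plus_noise(p_other)
    g11, g12 = G_RED[0]
    g21, g22 = G_RED[1]
    det = g11 * g22 - g12 * g21
    return [-(g22 * r[0] - g12 * r[1]) / det,
            -(-g21 * r[0] + g11 * r[1]) / det]


p = [0.5, 1.0]          # other clusters' powers
p_bigger = [0.8, 1.5]   # element-wise larger powers
a = 2.0                 # scaling factor, a > 1

positivity = all(q > 0 for q in best_response(p))
monotonicity = all(
    hi >= lo for hi, lo in zip(best_response(p_bigger), best_response(p))
)
scalability = all(
    a * q > q_scaled
    for q, q_scaled in zip(best_response(p),
                           best_response([a * x for x in p]))
)
print(positivity, monotonicity, scalability)
```

In this toy model the scalability gap equals $(a - 1)$ times the noise contribution to the best response, which is why a strictly positive noise power matters.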

V. SINR ESTIMATION
In order to obtain the power allocation values, each radar needs to perform the optimization in (12). This requires an estimate of the inter-cluster interference plus noise variance $\mathbf{r}_{-k}$. However, using (17), this estimate can be written in terms of the estimate of the SINR through a matrix $\hat{\mathbf{G}}$, which is constructed from the estimates $\hat{\gamma}_{ki}$ of the instantaneous SINR; these are obtained as in (26), using an approach similar to that in [6]. Hence, by substituting the estimated inter-cluster interference plus noise term into the constraints of (12), the power minimization problem for cluster $k$ at time $t$ can be reformulated as (27), where $\mathbf{p}^{(t-1)}_k$ is the power allocation vector at the previous iteration $(t-1)$ and $\mathbf{p}^{(t)}_k$ is the power allocation at the current iteration provided by the optimization problem (27). By utilizing the SINR estimation (26), the proposed system can perform power minimization in a fully distributed manner, without the need for any communication among the clusters.

It should be highlighted that this power allocation problem is particularly attractive for tracking radars, where we have a certain belief about the approximate location of the target but require fine detection to track its exact location. In this case, we aim to obtain the optimum power allocation that maintains a particular SINR and hence a particular probability of detection. However, when no target is present, (26) yields an estimated SINR of approximately $-\frac{1}{N}$ on average. This is because the waveform is matched to a particular delay and Doppler shift corresponding to the approximate location and velocity of the target. Hence, in the absence of a target, the term $\sum_{j} |\mathbf{s}^H_{kj}\mathbf{x}_{ki}|^2$ becomes small, so the dominant term $\|\mathbf{x}_{ki}\|^2$ in the numerator and denominator cancels and we obtain $-\frac{1}{N}$. As this is a negative number, the optimization problem will be reported as infeasible and, as a result, we have to resort to a standard power allocation, where each radar is allocated some minimum level of power to perform general surveillance. In the next section, we present simulation results to support the mathematical analysis.

VI. SIMULATION RESULTS
In this section, we present simulation and numerical results to illustrate the convergence of the algorithm to the unique solution and to demonstrate the distributed structure of the network. Initially, we consider a network consisting of two clusters, each with six radars. In every time step, each radar receives $N = 32$ signal samples. We also set the maximum number of iterations to $T = 30$ to investigate the convergence of the game. For a predefined target channel gain $h_{kji}$, we set the values of the cross-channel and clutter gains as $l_{jki} = h_{kji}/20$ and $\nu_{kji} = h_{kji}/10$. The channel gains for the simulations were drawn from a uniform distribution in the range $[0, 1]$. Finally, the Doppler shift is taken to be $f_{D,k,i,j} = 0.1$ for all $k = 1, \dots, K$, $i = 1, \dots, M$, and the noise power is set to $\sigma_n^2 = 0.01$.

Before the initialization of the game, we should first decide the detection criterion for all radars, namely the desired SINR $\gamma^*_{ki}$. We denote the covariance matrix of the inter-cluster plus clutter plus noise interference at radar $R_{ki}$ as $E[\mathbf{n}\mathbf{n}^H] = \mathbf{B}$, where $\mathbf{n} = \mathbf{i}_{ki} + \mathbf{d}_{ki}$ and $\mathbf{B}$ is positive definite. For a given $\mathbf{B}$, we may use (15) and (16) of [19] to determine the desired SINR for specific probabilities of false alarm and detection, $P_{fa}$ and $P_d$. In the considered model, we set the desired probabilities of false alarm and detection to $P_{fa} = 0.0099$ and $P_d = 0.999$, respectively, and we obtain the corresponding detection threshold and SINR target, $\lambda_{ki} = 0.001$ and $\gamma^*_{ki} = 2.1599$, respectively, for every radar. In order to study the convergence of the GNG, Figures 2 and 3 demonstrate the power allocation update of all the radars in the network for two different initial power allocations in cluster 2.
The channel gains remain the same in both simulations. First, it is evident that the number of active radars in both clusters is the same in both examples, regardless of the initial power allocation. Furthermore, the power values in both simulations converge to the same Nash equilibrium, as expected. The efficiency of the algorithm is evident, as the process converges to the optimal power allocation within 6 iterations. This result confirms Theorem 1, suggesting convergence to the unique Nash equilibrium regardless of the initial strategy.

In order to assess the efficiency of the proposed power allocation technique, we compare the results of the proposed method with the case when uniform power allocation is considered among the radars of the same cluster. Uniform power allocation has been studied in [25], [26] when a fixed system power budget is considered. By imposing an additional constraint in the optimization problem (10), which allocates the power uniformly among the radars in the same cluster, we obtain the resource allocation for the uniform power allocation GNG. To facilitate a fair comparison, we set the same SINR target in both cases and simulated three different radar system scenarios: the first consisting of two clusters each with two radars, the second consisting of two clusters each with six radars, and the last consisting of three clusters each with three radars. Table I presents the total power consumption in each cluster for each scenario, comparing the proposed GNG to the uniform resource allocation case. It is apparent that the proposed game-theoretic technique outperforms the uniform power allocation in all cases, in terms of the total power consumption in each of the clusters. In order to illustrate the aforementioned result, Fig. 6 presents a histogram of the total power consumption at cluster 1, comparing the two methods for the three different radar network cases. It is again evident that the total power consumption of the proposed scheme is much lower than that of the uniform power allocation when achieving the same set of SINRs, for all system scenarios simulated.

Fig. 6: Total power consumption at cluster 1, comparing the proposed GNG with the uniform power allocation GNG, for different system scenarios.
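The convergence behaviour reported for Figures 2 and 3 can be mimicked with a toy best-response iteration. The sketch below is not the simulation setup of this section: it assumes two clusters with two active radars each, a fixed set of active radars (in the spirit of Corollary 1), a closed-form best response $\mathbf{q} = -\mathbf{G}^{-1}\mathbf{r}$ and an affine interference model $\mathbf{r} = \mathbf{C}\mathbf{p}_{-k} + \sigma^2\mathbf{1}$. All matrices and names are hypothetical.

```python
# Hypothetical per-cluster data (reduced constraint matrix and cross gains).
G1 = [[-1.0, 0.2], [0.3, -1.0]]
C1 = [[0.05, 0.02], [0.01, 0.04]]
G2 = [[-1.0, 0.1], [0.25, -1.0]]
C2 = [[0.03, 0.02], [0.02, 0.05]]
SIGMA2 = 0.01  # noise power, as in the simulation settings


def best_response(G, C, p_other):
    """Closed-form best response q = -G^{-1} (C p_other + sigma^2) for 2x2 G."""
    r = [sum(c * p for c, p in zip(row, p_other)) + SIGMA2 for row in C]
    g11, g12 = G[0]
    g21, g22 = G[1]
    det = g11 * g22 - g12 * g21
    return [-(g22 * r[0] - g12 * r[1]) / det,
            -(-g21 * r[0] + g11 * r[1]) / det]


def play_game(p2_init, iterations=30):
    """Alternate best responses, starting from cluster 2's initial powers."""
    p2 = list(p2_init)
    p1 = [0.0, 0.0]
    for _ in range(iterations):
        p1 = best_response(G1, C1, p2)
        p2 = best_response(G2, C2, p1)
    return p1, p2


# Two different initial power allocations in cluster 2.
eq_a = play_game([0.0, 0.0])
eq_b = play_game([5.0, 5.0])

# Largest difference between the two end points over all four radars.
gap = max(
    abs(x - y)
    for pa, pb in zip(eq_a, eq_b)
    for x, y in zip(pa, pb)
)
print(eq_a, eq_b, gap)
```

With these weak cross gains the iteration is strongly contractive, so both starting points reach the same fixed point well before 30 iterations; stronger coupling would slow the convergence.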
The final example considers a scenario in which estimates of the interference arising from other clusters (instead of the true values) are used in the game-theoretic algorithm. We assume a network of two clusters, each consisting of four radars. Figure 7 shows the power allocation of the radars in the first cluster throughout the convergence process, using the true SINR and the SINR estimated from (26). It is evident that the estimation is sufficiently accurate and that the convergence based on the estimated SINR follows the convergence trajectories of the power allocation game obtained using the true SINR values.

VII. CONCLUSION
We have studied game-theoretic power allocation for a distributed MIMO radar system. By defining a GNG and exploiting convex optimization techniques and duality properties, we presented an extended Nash equilibrium analysis, concluding with the proof of the existence and uniqueness of the solution. Through this analysis, we also derived important properties of the system. In particular, we proved that the number of active radars in a cluster that actually transmit signals is exactly the same as the number of radars in the same cluster that satisfy the SINR constraint with equality. In addition, the number of active radars and the optimal strategy of a cluster depend only upon the channel gains and the target SINR and are totally independent of the other players' power allocation. This contribution strengthens the decentralized and distributed nature of the system. Finally, the simulation results support the mathematical analysis of the convergence and the study of the existence and uniqueness of the Nash equilibrium.

Fig. 1: A distributed MIMO radar network with K clusters and their corresponding channel gains.

Fig. 4: Convergence of power allocation of player 1 for different starting strategies when K = 4 and M = 3, first simulation (different linestyles correspond to different initial strategies for player 1).

Fig. 5: Convergence of power allocation of player 1 for different starting strategies when K = 4 and M = 3, second simulation (different linestyles correspond to different initial strategies for player 1).

Fig. 7: Power allocation in the second cluster using the true and the estimated value of the SINR when K = 2 and M = 4.

TABLE I: Total power consumption in each cluster for three different system realizations.