Strong Tracking PHD Filter Based on Variational Bayesian with Inaccurate Process and Measurement Noise Covariance

Assuming that the measurement and process noise covariances are known, the probability hypothesis density (PHD) filter is effective in real-time multi-target tracking; however, noise covariance is often unknown and time-varying for an actual scene. To solve this problem, a strong tracking PHD filter based on Variational Bayes (VB) approximation is proposed in this paper. The measurement noise covariance is described in the linear system by the inverse Wishart (IW) distribution. Then, the fading factor in the strong tracking principle uses the optimal measurement noise covariance at the previous moment to control the state prediction covariance in real-time. The Gaussian IW (GIW) joint distribution adopts the VB approximation to jointly return the measurement noise covariance and the target state covariance. The simulation results show that, compared with the traditional Gaussian mixture PHD (GM-PHD) and the VB-adaptive PHD, the proposed algorithm has higher tracking accuracy and stronger robustness in a more reasonable calculation time.


Introduction
Multi-target tracking is one of the core issues in information fusion research, which has been widely used in civil and military fields such as aerospace, electronic information, and control engineering. It mainly estimates the number and state of the target through the data obtained by sensors, where the target is possible to be born, die, and derive at any time. Traditional multi-target tracking methods require data association when tracking multiple targets, such as joint probabilistic data association (JPDA) [1][2][3] and multi-hypothesis tracking (MHT) [4,5], but they can only deal with a fixed number of targets. As the number of targets increases, the calculation amount of these algorithms increases exponentially, which seriously affects the real-time performance.

Related Work
In recent years, the multi-target tracking algorithm based on random finite set (RFS) theory does not require complex data association when tracking multiple targets, so it has attracted widespread attention from scholars [6][7][8]. The probability hypothesis density (PHD) propagates the first-order statistical moment of the RFS, which is developed to alleviate the computational intractability as a recursion [9]. At present, the existing closedform solutions of PHD filters mainly include particle filter PHD [10][11][12] and Gaussian mixture PHD filter [13,14]. However, these algorithms only have better performance in multi-target tracking systems where the noise variance is known.
Due to the different characteristics of sensors, it is difficult to accurately obtain statistical information of noise in practical applications. For the problem of unknown noise statistics, the variational Bayesian (VB) approximation method is widely used to estimate the state of linear Gaussian systems. Zhang et al. proposed an improved PHD filter. It introduced the VB method into the PHD recursion and derived the closed-form solution of the improved PHD filter for the linear Gaussian multi-target model by using the inverse gamma (IG) and Gaussian mixture distribution [15]. An adaptive cubature Kalman-VB-PHD (ACK-VB-PHD) filter was deduced by Yuan et al., which was based on the PHD filter and used the Cubature Kalman filtering (CK) to approximate the nonlinear measurement model. The measurement noise covariance distribution was described by the inverse Wishart (IW) distribution. Then, it iteratively estimated the measurement noise covariance and multi-target state joint posterior density through the VB approximation technique [16]. Li et al. took the prior gamma distribution for noise parameters in PHD filter, so that the intensity of prediction and update can be represented by a mixture of Gaussian gamma terms. When the target state and noise parameters were coupled in the likelihood function, the VB method was used to derive the approximate distribution [17]. However, using gamma distribution as the prior distribution of measurement noise limits covariance matrix of measurement noise to being a diagonal matrix. Additionally, the above algorithms solved one problem. Only the inaccurate measurement noise covariance was considered.
In real scenes, not only the measurement noise but also the process noise are usually unknown and time-varying. Then, some scholars put forward the use of the VB method to estimate the measurement noise while increasing the estimation of the unknown process noise. Huang et al. proposed a novel variational Bayesian-based adaptive Kalman filter (VBAKF). It selected the IW prior distribution for the prediction error covariance and the measurement noise covariance and used the VB method to derive the target state, the state prediction covariance matrix, and the amount measurement noise covariance matrix [18]. Li et al. proposed a robust Poisson multi-Bernoulli mixture (PMBM) filter to jointly estimate the target motion state, the corresponding state covariance, and the measurement noise covariance. The IW distribution was selected as the conjugate prior distribution of Gaussian. This algorithm introduced VB to ensure the conjugacy to obtain the approximate posterior density of the augmented state [19]. However, due to the simultaneous variation iteration of the measurement noise covariance matrix and predicted error covariance matrix, the above algorithms are time-consuming.

Contributions
Based on the above research, this paper proposes the strong tracking PHD-based on a VB filter in the case of the inaccurate process noise covariance and measurement noise covariance slowly time-varying. This article mainly has the following characteristics: (a) Strong tracking filter is introduced under the framework using VB approximation.
Then, the prediction state covariance is modified in real time by the fading factor from strong tracking. The factor not only enhances the function of observation in Kalman filtering but also corrects the influence of inaccuracy of process noise on state covariance. (b) This paper uses IW distribution to model only measurement noise, compared with the literature [18,19] that uses IW distributions as prior distributions of prediction state covariance and measurement noise covariance. It reduces the time consumed by the iterative process and combines the modified predictive state covariance, thus balancing the computation time and tracking accuracy. The kinematic state obeys Gaussian distribution, and then, the augmented state is modeled as the GIW joint distribution. (c) Three scenarios with different parameters are designed to evaluate the performance of filters from the four aspects of Optimal Sub-pattern Assignment (OSPA) error, localisation error, cardinality error, and calculation time. The proposed algorithm is compared to the original GM-PHD filter and the VB adaptive PHD filter with optimal tracking performance and robustness.

Paper Organization and Notation
The main distributions in the remaining chapters of this article are as follows. Section 2 briefly describes the traditional Gaussian mixture PHD filter. Section 3 presentes the proposed filter in this paper. The basic principle of the VB approximation is given in Section 3.1. Section 3.2 introduces the strong tracking algorithm on the VB basis. Then, Section 3.3 shows the derivation details, which are the GIW implementation of the proposed filter. Section 4 shows the design of a multi-target tracking simulation experiment to verify the tracking accuracy and reasonable calculation time of the filter under different parameter scenarios. Section 5 summarizes the contents of this article as a whole.
In this paper, matrices are indicated in bold, such as m. The Kronecker product of matrices is denoted by ⊗. m T is the transposition of matrix m. m and tr(m) calculate the two norm and trace of matrix m, respectively. The Kullback-Leibler divergence between q(x) and p(x) is expressed as KLD(q(x) p(x)) p(x) dx.

Traditional Gaussian Mixture PHD Filter
In the monitoring area, there are M(k) target states and N(k) measurement values at time k; the multi-target state set and multi-target observation set are denoted as X k = x k,1 , · · · , x k,M(k) and Z k = z k,1 , · · · , z k,N(k) , respectively. Assuming that the multitarget state set at time step k − 1 is X k−1 , then the current time X k and Z k are expressed as follows: where S k|k−1 (x) is the RFS of multi-target states from X k−1 at time k − 1 to k,B k|k−1 (x) and Γ k represent RFS derived and newly born, and Θ k (x) and κ k are the RFS of observations generated by X k and clutter, respectively. PHD filter is obtained by the first moment approximation of the posterior multi-target state, but its recursion does not admit closed form solutions [10]. In the linear Gaussian multi-objective system, the implementation of the Gaussian mixture is used to obtain the analytical solution of Bayesian integration and the process can more clearly show how the Gaussian component is analytically propagated to the next moment. Assume that each target follows the linear Gaussian transition model and the linear Gaussian observation model; the posterior PHD at time k subject to the Gaussian mixture of the form is as follows: where P D,k is the detection probability. Updated state m (i) k and state covariance P (i) k are obtained by prior information on the time k and Kalman gain.
The GM-PHD filter usually uses the given process noise covariance Q k and measurement noise covariance R k , but it is difficult to reflect the time-varying situation of the real environment under fixed noise statistics. In the process of multi-target tracking, the GM-PHD filter uses an inaccurate Q k to generate an inaccurate predicted state covariance P k|k−1 . At this time,P k|k−1 and inaccurate R k cause the Kalman gain K k to be inaccurate, which eventually affects the Gaussian parameter m k and P k , resulting in an inaccurate X k .

VB Approximation
When the noise statistics in the actual scene are unknown and slowly time-varying, the probability density of the inaccurate R k and the single target state X k are estimated jointly by the VB approximation. Assuming that they are independent of each other, the joint posterior probability density function (PDF) according to Bayesian rule is as follows: where g k (Z k |X k , R k ) is the likelihood function of R k and X k . Due to the unknown posterior PDF of R k , it is difficult to obtain an analytical solution. For the convenience of calculation, the VB approximation is used to find the free-form approximate parameter distribution of p(X k , R k |Z 1:k ) [20], which can be written as follows: where q(·) represents the approximate posterior PDF of p(·). q(X k ) and q(R k ) are obtained by minimizing the KLD between the approximate posterior PDF and the true posterior PDF: The variational parameters of q(X k ) and q(R k ) are coupled. The method of literature [21] is used to solve the fixed-point iteration, and the iteration converges to the local optimum, which are calculated as follows: where N is the number of variational iterations and IW(·) is the IW PDF with the dof parameter u k . Selecting the IW distribution as the conjugate prior of Gaussian distribution can avoid the restriction that the measurement noise covariance matrix must be a diagonal matrix due to the use of the inverse gamma (IG) distribution.

Strong Tracking Principle With VB Approximation
In order to improve the tracking ability of uncertain system model and to enhance the accuracy of posterior probability density PDF obtained by VB approximation, the fading factor η k of strong tracking principle is needed. Its main function is to adjust the state prediction covariance P k|k−1 through η k to correct the gain K k in real time, to force the residual sequence to be orthogonal, and to resist the performance degradation caused by the uncertain process noise. The revised state prediction covariance is as follows: and η k is defined as follows: where β is the weakening factor, which makes the estimation result smoother, and tr(·) represents the matrix trace. The output residual sequence covariance V k is given by the following: where γ k is the residual sequence. The forgetting factor ζ improves the influence of the residual sequence and enhances its role in the filter, usually taken in 0.9 ≤ ζ ≤ 1 [22]. In Equation (12), R k is fixed at each moment in the iterative process of the traditional strong tracking algorithm, but inaccurate R k affects the accuracy of η k in the changing environment [23].To solve the above problem, the same form q(R k ) = IW(R k ; u k , U k ) is introduced after the Bayesian inference, and the measurement noise covariance R k can be expressed as follows: d R is the dimension of measurement noise covariance. Since R k changes slowly and the range of change is not drastic, the estimated value at the last moment still has great reference value, and R k as the real-time factor of η k can correct P k|k−1 more accurately. The modified state prediction covariance P * k|k−1 not only improves the tracking performance but also reduces the influence of process noise on the estimation results and improves the robustness of uncertain systems.

The GIW Implementation of The VB-Based Strong Tracking PHD Filter
Combining the above theories, this paper introduces the principle of strong tracking on the basis of the VB adaptive PHD filter and then the GIW implementation of the VB-based strong tracking PHD filter is derived for linear multi-target uncertain systems. The prediction state covariance is corrected in real time by the fading factor. The GIW joint distribution is selected to variation approximate the posterior PHD of x k and R k produced by the measurement set. Assuming that the augmented state of a single target is The GIW mixed form of the posterior intensity PHD at time k − 1 can be expressed as follows: Using a one-step prediction of the posterior intensity PHD at a time by the multiobjective Bayesian, the components of the predicted intensity are the same as the Equation (1).
where P S,k is target survival probability, v S,k|k−1 (·) is the surviving target predicted intensity at time k, the spawning intensity v β,k|k−1 (·) and the birth intensity δ k (·) have the same composition as v S,k|k−1 (·), and the GIW parameters of v S,k|k−1 (φ S,k|k−1 ) are derived as follows: where ρ is a real number, which usually takes the value ρ = 1 − exp(−4) [18]. The residual sequence covariance of Equation (13) is calculated according to γ S,k , and then, the improved state prediction covariance P * k|k−1 is finally obtained by combining Equations (12) and (14).

Update
The GIW mixed form of the prediction intensity PHD at time k can be further written as follows: Then, the updated intensity PHD forms representatives at the same time: where where n ∈ {1, 2, ..., N}, N is the maximum number of variational iterations. Until m are updated and output as the input value of pruning and merging. As time progresses, the GIW mixture component increases. Therefore, after each update, it is necessary to prune the GIW terms for which the existence probability is lower than the threshold L. Then, the merging distance of the remaining terms is less than the threshold U, and we can extract the target state finally [24].

GIW-stPHD Algorithm Implementation
In order to more intuitively, introduce the detailed steps of the GIW-stPHD filter. The algorithm flow chart is summarized, as shown in Figure 1 below. The pseudocode of the GIW-stPHD filter is shown in Algorithm 1.
k−1 , total step K, measurement set Z k , number of iterations N 2: for k = 1 : K do 3: Calculating Equations (19) and (20) to get the state prediction value and residual sequence 4: Equations (12)- (14) to obtain the modified covariance of predicted states P (j) * k|k−1 5: Combine Equations (21) and (22) to predict IW distribution parameters 6: Set initial value of variational iteration m if m break 13: end if 14: end for 15: if w where K is the total simulation time, q = 1m 2 /s 2 ,r = 10m 2 . T is the sampling interval 1s. The survival probability P S,k is 0.99 for the target from the previous moment to the next moment, and clutter rate is λ=1. The three filters use the same nominal process noise covariance Q 0 = I 4 and measurement noise covariance R 0 = I 2 . I 4 and I 2 are four-dimensional and two-dimensional identity matrices, respectively.
The four aspects of OSPA error, localisation error, calculation time, and cardinality error were used as evaluation criteria to compare the performance of the three filters. The OSPA distance [25] has two subsets of X and Z.The dimensions are m and n, respectively. When m ≤ n, it can be defined as follows: Distance sensitivity parameter is 1 ≤ p < ∞, and correlation sensitivity parameter is c > 0. This article used p = 2 and c = 100 for simulation. Π n refers to all permutations and combinations of {1, ..., n}. When m > n,d . This OSPA error in [25] was decomposed into two components, each of which accounts for localization error and cardinality error c p (n − m).

Simulation Scenario
In this section, three scenes with different parameters were designed to demonstrate the effectiveness of the proposed GIW-stPHD filter, and the simulations were performed on the 2-D plane. There were four targets in the monitoring range, which appeared at time [1 41 50 71] (s) until time [40 60 70 100] (s), disappearing in sequence, and had uniform motion during survival. In the simulation process, the tracking performance of the GM-PHD filter, the GIW implementation of the VB-based PHD filter (GIW-vbPHD), and the proposed GIW-stPHD were mainly compared. Figure 2 shows the true trajectory of all targets.   Table 1 provides different parameters for the three scenes. In scene 1, the detection probability P d,k and tuning parameter τ remained unchanged, and the number of variational iterations N kept increasing. Scene 2 and scene 3 were the changes in P d,k and τ, respectively.  4 show the comparison of the OSPA error and localisation error between the GM-PHD filter, GIW-vbPHD filter, and GIW-stPHD filter when the detection probability and tuning parameter remained unchanged and the variation iteration numbers N were 2, 5, and 8, respectively. It can be seen that all three algorithms jump at the target rebirth time and the average error is higher when there are multiple targets at 50 s-60 s than at other times. As a whole, it is obvious that the GIW-stPHD algorithm is superior to the other two algorithms. In the simulation environment with noise is unknown and time-varying, the traditional GM-PHD filter uses constant and inaccurate nominal noise without involving variational iteration. Therefore, its tracking performance in the comparison algorithm is not affected and has large errors when the number of variation iterations changes.   In order to more intuitively describe the tracking performance of the algorithm under different variational iteration times, the average value of the three different algorithms under 500 MC is given in Table 2. The more iteration times the higher tracking accuracy, but the iteration to a certain number of values remain unchanged. Taking the OSPA error as an example, the error of the proposed algorithm is reduced by 21.1% compared with the GIW-vbPHD algorithm when the number of iterations is 2, reduced by 26.7% at N = 5. The GIW-stPHD and GIW-vbPHD algorithms are reduced by 7.7 % and 14.3%, respectively, when they are increased from 2 to 5 times. The error remains basically unchanged at N = 7 in scene 1, indicating that the tracking accuracy is basically saturated. At this time, even at the expense of computing time, it cannot bring higher tracking accuracy. Therefore, N = 5 is a relatively suitable number of variational iterations.

Filters
OSPA Error (m) Localisation Error (m)     Table 3 gives a more detailed description of the different tracking of the three filters. The localisation errors of the GIW-stPHD filter are reduced by 31.7%, 14.34%, and 3.5% compared with the GIW-vbPHD filter in the detection probability of 0.98, 0.88, and 0.79, respectively. With the decrease of the detection probability, the error of the filter gradually becomes larger. Under different detection probabilities, the tracking accuracy of this proposed algorithm is better than that of the other two algorithms. The greater the detection probability P d,k = 0.98, the more obvious the advantage of the proposed filter.   Table 4 clarifies that the error of GIW-stPHD filter is the smallest in the scene τ = 5, and its OSPA error and localisation error are 28.6% and 34.3% lower than those of GIW-vbPHD filter, respectively.  To sum up, in order to better analyze the overall performance of the proposed GIW-stPHD filter, scene 1 with N = 7, scene 2 with P d,k = 0.88, and scene 3 with τ = 5 were selected as case A, case B, and case C, respectively, from the above scenes. In these three different cases, it was compared with the GM-PHD filter and the GIW-vbPHD filter in the two aspects of computational time and cardinality error.
It can be seen from Figure 9 that the cardinality error of the GM-PHD filter, the GIW-vbPHD filter and the proposed GIW-stPHD filter decreases in the three cases. The GM-PHD algorithm takes the shortest time because it does not need to perform variational iteration, but the tracking accuracy is also the worst. Taking case A in Table 5 as an example, the time consumed by GIW-stPHD algorithm is shortened by 0.57s and the error is reduced by 2.5 compared with that of GM-vbPHD algorithm under the same iteration number.

Simulation Complex Scenario
The tracking performances of the GIW-stPHD filter are evaluated under long-term conditions of multiple targets in challenging scenario. Under the same total simulation step, the appearance time and disappearance time of the four targets changed to [1 20 40 60] (s) and [80 80 80 100] (s) , respectively. The number of simultaneous targets increases sequentially. Three filters ran 500 MC simulations under different values of the above three parameters. The parameters N = 5, P d,k = 0.98, τ = 3 were used as the reference group. Three experimental groups reduced the number of variational iterations and detection probability to N = 2 and P d,k = 0.88, respectively, and changed the tuning parameter to τ = 5.
The OSPA errors and localisation errors of the three filters are shown in Figures 10  and 11, and these errors are the averages of the 500 MC experiments. As the number of simultaneous targets increases, the errors corresponding to the three filters are constantly rising, until the error reaches the maximum value at 80s. When the number of variational iterations decreases to 2, the errors of the proposed filter and the GIW-vbPHD filter increase correspondingly compared with the reference group. With the detection probability of 0.88, the errors of the three comparison algorithms increased significantly. The OSPA error and positioning error of the experimental group for which the tuning parameter changed to 5 are smaller than those of the reference group. The proposed algorithm has minimal errors compared with the comparison filters, which is clearly shown in the figures.
In addition, Table 6 shows that the proposed filter consumes similar time to the GIW-vbPHD filter when the number of variational iterations is 2. However, in other cases, it consumes less time than the GIW-vbPHD filter, with an average of 0.91s. The GIW mixture component is multiplied when the number of targets increases. This phenomenon increases the workload of pruning and merging steps after updating. Therefore, the computational complexity of the filters increase as the number of targets increases.

Summary of Simulation Results
Through the simulation experiments of the two scenarios under different parameters, the following conclusions are summarized. First, the tracking effect and robustness of the GIW-stPHD filter are optimal when N = 5, P d,k = 0.98 and τ = 5 are selected. Combining the above OSPA error and localisation error tables, the fluctuation of detection probability has the greatest impact on filtering performance. The detection probability increased from 0.88 to 0.98, which changed by 11.36%, and the OSPA error changed by 43.08% accordingly. When the tuning parameter changed by 25%, the OSPA decreased by 11.9%. The number of iterations N = 2 doubled, thereby reducing the OSPA error by 16.75%. Then, the proposed algorithm achieved an appealing compromise between tracking accuracy and reasonable computing time in the comparison scene in Figure 9 and Table 5. Finally, the simulation results in the complex scenario show that the proposed filter has a stronger adaptability under harsh experimental conditions.

Conclusions
This paper mainly proposes the GIW-stPHD filter to solve inaccurate process noise and measurement noise in Gaussian linear systems. The algorithm uses the IW distribution as the conjugate prior distribution to model the measurement noise covariance. Then, the fading factor in the strong tracking principle is used to further modify the predicted state covariance to resist the influence of the uncertainty process noise covariance. The VB approach iteratively approximates the posterior probability density. The simulation experiment results show that the proposed GIW-stPHD filter, compared with the GM-PHD filter and the GIW-vbPHD filter, have obvious advantages in the four aspects OSPA error, localisation error, cardinality error, and calculation time under different parameter scenarios. Due to the limitation of the PHD filter itself, it is difficult to resist the strong clutter in the environment. This aspect will be the focus of breakthroughs in the followup work.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest:
The authors declare no conflict of interest.