Particle‐filter‐based human target tracking in image domain for through‐wall imaging radar

This study addresses the tracking of hidden human targets using time-division multiple-input-multiple-output through-wall imaging radar (TWIR). An efficient image-domain tracking algorithm is proposed. Specifically, the authors first utilise the back-projection algorithm and the phase coherence factor (PCF) to obtain multi-frame high-quality images. Then a tracking algorithm via an amplitude-distribution-based particle filter is proposed. Experimental data validate the effectiveness of this algorithm for hidden target tracking.


Introduction
Moving human target tracking plays an important role in time-division multiple-input-multiple-output (TD-MIMO) through-wall imaging radar (TWIR) [1], and the tracking-after-imaging technique has become more attractive in recent years because it does not need to solve the two-variable fourth-order over-determined system of equations [2].
Owing to the large size of the actual target and the limited radar resolution, the target in the image may occupy multiple range bins as well as azimuth bins, so the assumption of an ideal point target model is invalid [2]. Moreover, it is hard to compensate accurately for the electromagnetic propagation delay of a non-uniform wall in a through-wall scenario, which leads to serious defocusing of the final radar images [3]. Besides, as the human target moves flexibly, its contours in the radar image change significantly over time because of the variability of the human body [4]. Together, these factors may lead to poor imaging quality, which in turn may cause failures in the follow-up tracking.
Visual tracking offers many effective algorithms for tracking the human body, such as the particle filter [5], which can be brought into tracking after imaging. However, the size and the shape of the target in a radar image vary with time, while the target features in visual tracking, such as a face, are generally obvious and stable [6]. Besides, only amplitude and phase information is available in radar imaging, which differs from optical imaging [7]. Therefore, visual tracking algorithms cannot be directly applied to tracking after imaging in TWIR.
In this paper, we utilise an amplitude-distribution-based particle filter to address human target tracking in the image domain with TWIR. We first give a concise signal model of the TD-MIMO through-wall imaging radar. Then, the back-projection (BP) algorithm is utilised to obtain multi-frame images and the phase coherence factor (PCF) is used to suppress the grating-lobe artefacts. Next, an adaptive method is presented to construct the target template. Finally, we propose an image-domain tracking algorithm via an amplitude-distribution-based particle filter. The experimental data demonstrate the robustness of the presented tracking algorithm.
The rest of the paper is arranged as follows. In Section 2, the signal model and BP-based imaging are presented. In Section 3, we present an adaptive method of target template construction and propose a tracking algorithm using the amplitude-distribution-based particle filter. Experimental results and analysis are given in Section 4, and conclusions are summarised in Section 5.

BP-based imaging for TWIR
Consider a human target located at $T(x_{tar}, y_{tar})$ and hidden behind a wall, as shown in Fig. 1. The thickness of the wall is $d_w$ and its relative dielectric constant is $\varepsilon$. A TD-MIMO array consisting of $M$ transmitting antennas and $N$ receiving antennas is placed in front of and parallel to the wall at a distance $d_3$. Suppose the $i$-th transmitting antenna radiates the signal $s(t)$. The electromagnetic wave then propagates along the path shown in Fig. 1, and the echo received by the $j$-th receiving antenna is expressed as

$$y_{ij}(t) = \sigma_T\, s(t - \tau_{ij}) + \psi_{ij}(t),$$

where $\sigma_T$ denotes the scattering coefficient of the target, $\psi_{ij}(t)$ represents the noise, and $\tau_{ij}$ denotes the target's echo delay, which is determined by the propagation path in Fig. 1 and the electromagnetic wave velocity $c$. The imaging area is separated into $X \times Y$ pixels. For the pixel $\mathbf{x}_h = (x_h, y_h)$, its value can be calculated by [7]

$$I(\mathbf{x}_h) = \sum_{i=1}^{M} \sum_{j=1}^{N} y_{ij}(\tau_{ijh}),$$

where $\tau_{ijh}$ represents the focusing delay of the pixel $\mathbf{x}_h$. The images generated by TWIR suffer from grating-lobe artefacts that smear the image [7], as shown in Fig. 2a.
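The BP focusing described above can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: free-space propagation (wall refraction is ignored), a single noise-free point scatterer, a hypothetical 2 x 4 antenna layout, and a coarse 20 MHz frequency step to keep the example fast. All function names are illustrative.

```python
import cmath
import math

C = 3.0e8  # free-space propagation speed in m/s (assumption: the wall is ignored)

def round_trip_delay(tx, rx, point):
    """Free-space delay along tx -> point -> rx."""
    return (math.dist(tx, point) + math.dist(point, rx)) / C

def simulate_echo(tx, rx, target, freqs, sigma_t=1.0):
    """Noise-free stepped-frequency echo of one point scatterer."""
    tau = round_trip_delay(tx, rx, target)
    return [sigma_t * cmath.exp(-2j * math.pi * f * tau) for f in freqs]

def back_projection(echoes, pixels, freqs):
    """Sum all channels coherently at each pixel after compensating its focusing delay."""
    image = []
    for pixel in pixels:
        acc = 0j
        for (tx, rx), spectrum in echoes.items():
            tau = round_trip_delay(tx, rx, pixel)
            acc += sum(s * cmath.exp(2j * math.pi * f * tau)
                       for s, f in zip(spectrum, freqs))
        image.append(abs(acc))
    return image

# Hypothetical set-up: 1.1 GHz to 2.1 GHz in 20 MHz steps, 2 transmitters, 4 receivers.
freqs = [1.1e9 + 20e6 * n for n in range(51)]
tx_positions = [(-0.5, 0.0), (0.5, 0.0)]
rx_positions = [(-0.3, 0.0), (-0.1, 0.0), (0.1, 0.0), (0.3, 0.0)]
target = (0.0, 3.0)
echoes = {(tx, rx): simulate_echo(tx, rx, target, freqs)
          for tx in tx_positions for rx in rx_positions}
pixels = [(x, y) for y in (2.0, 3.0, 4.0) for x in (-1.0, 0.0, 1.0)]
image = back_projection(echoes, pixels, freqs)
brightest = pixels[image.index(max(image))]
```

In this toy set-up the brightest pixel coincides with the simulated target position. A through-wall implementation would replace `round_trip_delay` with a delay that accounts for the wall thickness and dielectric constant.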
The PCF is utilised to suppress the grating-lobe artefacts [7,8]. It is synthesised from the normalised complex exponential terms of all channels. The PCF value of the $q$-th pixel in the image is

$$\mathrm{PCF}(q) = \left[ 1 - \mathrm{std}\!\left( e^{j\theta\left(y_k(\tau_{q,k})\right)} \right) \right]^{p},$$

where $\theta(y_k(\tau_{q,k}))$ is the phase of $y_k(\tau_{q,k})$, $y_k(\cdot)$ is the echo return of the $k$-th channel, $\tau_{q,k}$ is the propagation delay between the $q$-th pixel and the $k$-th channel, $p \geq 1$ is the sensitivity factor, and $\mathrm{std}(\cdot)$ denotes the standard deviation. As shown in Fig. 2b, the grating-lobe artefacts are suppressed after PCF weighting.
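As a rough illustration of PCF weighting, the sketch below computes the factor for one pixel. It assumes the commonly used form PCF = (1 - std(exp(j*theta)))^p, where the standard deviation is taken over the unit phasors of all channels. The function name and the handling of zero-valued samples are assumptions made for this example.

```python
import math

def phase_coherence_factor(samples, p=1.0):
    """PCF of one pixel from the complex samples y_k(tau_{q,k}) of all channels."""
    # Normalised complex exponentials exp(j*theta); zero samples carry no phase and are skipped.
    phasors = [s / abs(s) for s in samples if abs(s) > 0.0]
    if not phasors:
        return 0.0
    mean = sum(phasors) / len(phasors)
    # For unit-modulus values the variance is E|z|^2 - |E z|^2 = 1 - |mean|^2.
    std = math.sqrt(max(0.0, 1.0 - abs(mean) ** 2))
    return (1.0 - std) ** p

coherent = phase_coherence_factor([1 + 0j, 2 + 0j, 0.5 + 0j])   # identical phases
incoherent = phase_coherence_factor([1j, -1j])                  # opposite phases
```

Channels with identical phases give a factor of one, so the pixel is kept, whereas widely scattered phases drive the factor towards zero, which is how grating lobes are attenuated when the BP image is multiplied by the PCF map.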
However, some problems remain in the images after PCF weighting. Due to the extended nature of the human target, the corresponding target image may occupy several pixels. Meanwhile, due to inaccurate compensation for the non-uniform wall, image defocusing may give the target image several maxima. These issues affect the robustness of common tracking methods.

Tracking algorithm in image domain
In this section, we first propose an adaptive method for target template construction and then propose a tracking algorithm based on a particle filter via amplitude distribution. Fig. 3 shows the process of the proposed tracking algorithm in the image domain.

Target template construction
According to the literature in video tracking, the target template, which contains the pixels of the object of interest, should be created before tracking and is usually drawn by the user [6]. In order to achieve automatic processing, we propose an adaptive method for target template construction. Then we employ amplitude distribution to represent the characteristic of the target template.
There are two steps for the adaptive method of target template construction.
First, we extract the local maxima of the PCF-weighted images $I_k(\mathbf{x})$, $k = 1, 2, \ldots, K$. To detect the potential targets, we utilise dilation to extract the local peaks in the image. The dilated image can be expressed as

$$Y_k(\mathbf{x}) = I_k(\mathbf{x}) \oplus E,$$

where $E$ denotes the oval disc determined by the theoretical resolution, $Y_k(\mathbf{x})$ is the dilated image, and $\oplus$ denotes the dilation operation. The pixels of equal amplitude are extracted by comparing $Y_k(\mathbf{x})$ with $I_k(\mathbf{x})$. We set a threshold to remove the pixels with amplitudes close to zero and preserve the pixels of the potential targets.
Second, we use biaxial projection to construct the target template as follows:
(1) Initialise a square region whose centre is at a maximum and whose side length is the theoretical range resolution.
(2) Calculate the sum of all pixels in the square region along the x-axis and y-axis, respectively.
To explain the detailed process, we provide an example in Fig. 4. Assume that the threshold $T_s$ has a value of 3. In this example, only the edge value in the negative x direction is <3.
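The peak extraction by dilation described in the first step can be sketched as follows. The image is a plain list of rows, the structuring element is an ellipse with hypothetical half-axes `a` and `b` given in pixels, and the threshold value is chosen only for this example.

```python
def dilate(image, a, b):
    """Grey-scale dilation with an elliptical structuring element (half-axes a, b in pixels)."""
    rows, cols = len(image), len(image[0])
    offsets = [(dr, dc) for dr in range(-a, a + 1) for dc in range(-b, b + 1)
               if (dr / a) ** 2 + (dc / b) ** 2 <= 1.0]
    dilated = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dilated[r][c] = max(image[r + dr][c + dc] for dr, dc in offsets
                                if 0 <= r + dr < rows and 0 <= c + dc < cols)
    return dilated

def local_maxima(image, a, b, threshold):
    """Pixels that equal their dilated value and exceed the near-zero threshold."""
    dilated = dilate(image, a, b)
    return [(r, c) for r in range(len(image)) for c in range(len(image[0]))
            if image[r][c] == dilated[r][c] and image[r][c] > threshold]

example = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.3, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
peaks = local_maxima(example, a=1, b=1, threshold=0.1)
```

Flat zero regions also equal their dilated value, which is why the threshold is needed to discard them before the templates are built.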
A target image may have several maxima, and a rectangular template may enclose another maximum, as shown in Fig. 5a. These maxima are considered to come from the same extended target. To reduce the computational load, the peaks with large values are used first to create target templates, and the smaller peaks inside an existing rectangular template are removed. The inscribed ellipse of the rectangle is used as the target template. Let $\{\mathbf{x}_i^*\}_{i=1,2,\ldots,N_t}$ represent the set of pixels inside the ellipse (Fig. 5b), where $N_t$ denotes its size.
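Assume that the amplitude range of the $k$-th normalised image $I_k(\mathbf{x})$ is divided into $m$ ranges. The features of the target template $\{\mathbf{x}_i^*\}_{i=1,2,\ldots,N_t}$ can be represented by its amplitude distribution $\mathbf{q} = \{q_u\}_{u=1,2,\ldots,m}$ [5], as shown in Fig. 5c, where $q_u$ is

$$q_u = C \sum_{i=1}^{N_t} k\!\left(\|\mathbf{x}_i^*\|_2^2\right) \delta\!\left[b(\mathbf{x}_i^*) - u\right],$$

where $b(\mathbf{x}_i^*)$ is the range index of the amplitude of the pixel $\mathbf{x}_i^*$ and $\delta$ denotes the Kronecker delta function. $C$ is a normalisation constant that ensures $\sum_{u=1}^{m} q_u = 1$, from which

$$C = \frac{1}{\sum_{i=1}^{N_t} k\!\left(\|\mathbf{x}_i^*\|_2^2\right)},$$

and $k(\|\mathbf{x}_i^*\|_2^2)$ is the weight of the pixel $\mathbf{x}_i^*$. The Epanechnikov kernel [5] is usually chosen, which is

$$k(r) = \begin{cases} 1 - r, & r < 1, \\ 0, & \text{otherwise.} \end{cases}$$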
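A kernel-weighted amplitude distribution of this kind can be sketched as below. The sketch assumes an image normalised to [0, 1] and stored as a list of rows, integer half-axes in pixels, and pixel offsets scaled by the square root of the sum of the squared half-axes before the kernel is applied. The function names are illustrative.

```python
def epanechnikov(r2):
    """Epanechnikov profile evaluated on a squared, normalised distance."""
    return 1.0 - r2 if r2 < 1.0 else 0.0

def amplitude_distribution(image, centre, half_axes, m=16):
    """Kernel-weighted amplitude histogram of the pixels inside an elliptical region."""
    (cr, cc), (hr, hc) = centre, half_axes
    h2 = float(hr * hr + hc * hc)          # squared scale of the region
    hist = [0.0] * m
    for r in range(max(0, cr - hr), min(len(image), cr + hr + 1)):
        for c in range(max(0, cc - hc), min(len(image[0]), cc + hc + 1)):
            dr, dc = r - cr, c - cc
            if (dr / hr) ** 2 + (dc / hc) ** 2 > 1.0:
                continue                   # pixel lies outside the ellipse
            u = min(int(image[r][c] * m), m - 1)   # amplitude range index
            hist[u] += epanechnikov((dr * dr + dc * dc) / h2)
    total = sum(hist)                      # plays the role of 1 / C
    return [v / total for v in hist]

flat = [[0.5] * 5 for _ in range(5)]
q_flat = amplitude_distribution(flat, centre=(2, 2), half_axes=(2, 1), m=4)
```

Pixels near the centre of the ellipse contribute more than pixels near its boundary, so the distribution is less sensitive to contour changes of the target.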

Tracking using particle filter via amplitude distribution
Because the particle filter handles non-linear and non-Gaussian problems, it can readily cope with human target tracking [9]. In this section, the Bhattacharyya distance is used to indicate the similarity between the target's amplitude distribution and the amplitude distribution at every sample position. The amplitude-distribution-based particle filter consists of four stages: propagation, observation, estimation and resampling.
In the $k$-th frame, the probability distribution of the target position is approximated by a sample set $S_k = \{s_k^n,\ n = 1, \ldots, N_s\}$, where each sample $s_k^n = (\mathbf{y}_k^n, \pi_k^n)$ contains its position $\mathbf{y}_k^n$ and weight $\pi_k^n$.

Propagation:
The positions of the samples are predicted in this stage. Suppose that the velocity of the target is $\mathbf{v}$, which can be estimated from the previous three target positions. For each sample, the state vector in the $(k-1)$-th frame can be constructed as

$$\mathbf{X}_{k-1}^n = \left[ (\mathbf{y}_{k-1}^n)^T,\ \mathbf{v}^T \right]^T.$$

According to a dynamical model, the state vector in the current frame can be predicted as

$$\mathbf{X}_k^n = F\,\mathbf{X}_{k-1}^n + \Gamma\,\mathbf{w}_k,$$

where $F$ denotes the transition matrix and $\Gamma$ is the system noise matrix. $\mathbf{w}_k$ is a white Gaussian noise vector with covariance matrix $Q_k = E[\mathbf{w}_k \mathbf{w}_k^T] = \sigma_w^2 I$. The position of the sample after propagation is $\mathbf{y}_k^n$.
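The propagation stage can be sketched under a constant-velocity assumption. The velocity estimate (a central difference over three positions) and the way the noise enters (added directly to the predicted position) are assumptions made for this illustration and stand in for the specific transition and noise matrices.

```python
import random

def estimate_velocity(last_three, dt):
    """Hypothetical estimate: central difference over the previous three positions."""
    (x0, y0), _, (x2, y2) = last_three
    return ((x2 - x0) / (2.0 * dt), (y2 - y0) / (2.0 * dt))

def propagate(positions, velocity, dt, sigma_w, rng=random):
    """Constant-velocity prediction of each sample position plus Gaussian noise."""
    vx, vy = velocity
    predicted = []
    for x, y in positions:
        wx = rng.gauss(0.0, sigma_w)   # system noise on x
        wy = rng.gauss(0.0, sigma_w)   # system noise on y
        predicted.append((x + vx * dt + wx, y + vy * dt + wy))
    return predicted

v = estimate_velocity([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)], dt=1.0)
noise_free = propagate([(0.0, 0.0), (1.0, 2.0)], (1.0, -0.5), dt=2.0, sigma_w=0.0)
```

With `sigma_w` set to zero the samples simply shift along the velocity vector. A positive value spreads them so that the sample set can follow manoeuvres that the velocity estimate does not capture.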

Observation:
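The weights of the samples are determined in this stage. In the radar image, the amplitude distributions of the samples are used as the observations. Each sample defines an elliptic area with centre $\mathbf{y}_k^n$ and half axes $h_x$ and $h_y$. The amplitude distribution inside the elliptic region is represented by $\mathbf{p}(\mathbf{y}_k^n) = \{p_u(\mathbf{y}_k^n)\}_{u=1,2,\ldots,m}$, where $p_u(\mathbf{y}_k^n)$ is calculated as

$$p_u(\mathbf{y}_k^n) = C_h \sum_{i=1}^{N_h} k\!\left( \left\| \frac{\mathbf{y}_k^n - \mathbf{x}_i}{h} \right\|_2^2 \right) \delta\!\left[ b(\mathbf{x}_i) - u \right],$$

in which $h = \sqrt{h_x^2 + h_y^2}$ is the scale of the elliptic region and $N_h$ denotes the number of pixels inside it. $C_h$ is a normalisation constant that ensures $\sum_{u=1}^{m} p_u = 1$, given by

$$C_h = \frac{1}{\sum_{i=1}^{N_h} k\!\left( \left\| (\mathbf{y}_k^n - \mathbf{x}_i)/h \right\|_2^2 \right)}.$$

The samples whose amplitude distributions resemble the target template more closely are favoured when estimating the target position. Therefore, a similarity measure based on amplitude distributions is needed. The Bhattacharyya coefficient is used to describe the similarity between two amplitude distributions $\mathbf{p}(\mathbf{y})$ and $\mathbf{q}$ [5], which is

$$\rho[\mathbf{p}(\mathbf{y}), \mathbf{q}] = \sum_{u=1}^{m} \sqrt{p_u(\mathbf{y})\, q_u}.$$

The Bhattacharyya distance between the two distributions is defined as

$$d(\mathbf{y}) = \sqrt{1 - \rho[\mathbf{p}(\mathbf{y}), \mathbf{q}]}.$$

Using a Gaussian function of the Bhattacharyya distance $d(\mathbf{y}_k^n)$ with variance parameter $\sigma$, the weight $\pi_k^n$ of the sample $s_k^n$ can be approximated as

$$\pi_k^n = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{d^2(\mathbf{y}_k^n)}{2\sigma^2} \right). \tag{15}$$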
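The similarity measure and the resulting sample weight can be illustrated with a short sketch. Distributions are plain lists that sum to one, the weight is the unnormalised Gaussian of the Bhattacharyya distance, and the default value of `sigma` is taken from the experimental section. The function names are illustrative.

```python
import math

def bhattacharyya_coefficient(p, q):
    """Similarity of two discrete distributions; 1 means identical, 0 means disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def bhattacharyya_distance(p, q):
    """Distance derived from the coefficient; clipped to guard against rounding."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))

def sample_weight(distance, sigma=0.2):
    """Gaussian weighting of the distance: small distances give large weights."""
    return math.exp(-distance ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)

template = [0.5, 0.5, 0.0, 0.0]
close = [0.5, 0.5, 0.0, 0.0]
far = [0.0, 0.0, 0.5, 0.5]
w_close = sample_weight(bhattacharyya_distance(close, template))
w_far = sample_weight(bhattacharyya_distance(far, template))
```

A sample whose distribution matches the template receives the maximum weight, while a sample with no overlapping amplitude ranges receives a weight that is smaller by several orders of magnitude for `sigma = 0.2`.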

Estimation:
The target's position in the $k$-th frame is estimated as the weighted average of the samples, given by

$$\hat{\mathbf{y}}_k = \sum_{n=1}^{N_s} \pi_k^n\, \mathbf{y}_k^n,$$

where the weights $\pi_k^n$ are normalised to sum to one. An example of one iteration step of our algorithm is shown in Fig. 6.
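The estimation stage reduces to a weighted mean, as the following minimal sketch shows. It normalises the weights internally, so unnormalised weights may be passed in; the function name is illustrative.

```python
def estimate_position(positions, weights):
    """Weighted average of the sample positions; weights are normalised internally."""
    total = sum(weights)
    x = sum(w * px for w, (px, _) in zip(weights, positions)) / total
    y = sum(w * py for w, (_, py) in zip(weights, positions)) / total
    return (x, y)

estimate = estimate_position([(0.0, 0.0), (2.0, 4.0)], [1.0, 3.0])
```

Here the second sample carries three quarters of the total weight, so the estimate lies three quarters of the way from the first sample to the second.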

Resampling:
The degeneracy phenomenon, in which a large proportion of particles have negligible weights after a few iterations, is an inevitable problem in the particle filter [9]. It means that a large part of the computation is wasted on particles with almost zero weight. In the resampling stage, a resampling strategy is therefore utilised to copy the samples with large weights and eliminate those with small weights. A common measure of degeneracy is the effective sample size $N_{eff}$ recommended in [9], given by

$$N_{eff} = \frac{1}{\sum_{n=1}^{N_s} \left( \pi_k^n \right)^2},$$

where $\pi_k^n$ is the normalised weight obtained via (15). When the degeneracy phenomenon happens (i.e. when $N_{eff}$ is less than a threshold $N_{th}$), the samples should be resampled. Systematic resampling [9] is our preferred solution and its operation is described in Algorithm 1 (see Fig. 7). After resampling, the weights are reset to $\pi_k^n = 1/N_s$. The tracking algorithm via the amplitude-distribution-based particle filter is summarised in Algorithm 1 (Fig. 7).
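The degeneracy test and systematic resampling can be sketched as follows. This is a generic textbook-style version and not a transcription of Algorithm 1. It assumes normalised weights, and the function names are illustrative.

```python
import random

def effective_sample_size(weights):
    """N_eff = 1 / sum of squared normalised weights."""
    return 1.0 / sum(w * w for w in weights)

def systematic_resample(samples, weights, rng=random):
    """Draw len(samples) samples using one random offset and evenly spaced pointers."""
    n = len(samples)
    cumulative, acc = [], 0.0
    for w in weights:
        acc += w
        cumulative.append(acc)
    u0 = rng.random() / n                  # single random offset in [0, 1/n)
    resampled, i = [], 0
    for j in range(n):
        u = u0 + j / n                     # j-th evenly spaced pointer
        while i < n - 1 and cumulative[i] <= u:
            i += 1
        resampled.append(samples[i])
    return resampled, [1.0 / n] * n        # weights are reset to 1 / N_s

weights = [0.0, 0.0, 1.0, 0.0]
n_eff = effective_sample_size(weights)
if n_eff < len(weights) / 2:               # threshold N_th = N_s / 2
    new_samples, new_weights = systematic_resample(["a", "b", "c", "d"], weights,
                                                   random.Random(0))
```

In this degenerate example only the third sample has weight, so all four resampled particles are copies of it and each receives a weight of 0.25.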

Trajectory management
The M/N logic is employed as the trajectory initiation strategy: a tentative trajectory is confirmed if the number of credible estimations is more than $M_c$ in $N_c$ frames. In this paper, the similarity value $\rho(\hat{\mathbf{y}}_k)$ is the criterion for judging whether the estimation $\hat{\mathbf{y}}_k$ is credible. In addition, if the position is not updated in $M_s$ consecutive frames, the trajectory is terminated.
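The confirmation and termination rules can be illustrated with a small sketch. The default values follow the experimental section; the sliding-window interpretation of "in $N_c$ frames" and the function names are assumptions made for this example.

```python
def is_confirmed(similarities, rho_th=0.8, m_c=12, n_c=20):
    """Confirm a tentative track if more than m_c of any n_c consecutive frames are credible."""
    credible = [1 if rho >= rho_th else 0 for rho in similarities]
    for end in range(1, len(credible) + 1):
        window = credible[max(0, end - n_c):end]
        if sum(window) > m_c:
            return True
    return False

def is_terminated(updated_flags, m_s=10):
    """Terminate a track whose position was not updated in the last m_s frames."""
    return len(updated_flags) >= m_s and not any(updated_flags[-m_s:])

good_track = is_confirmed([0.9] * 13 + [0.1] * 7)
weak_track = is_confirmed([0.9] * 12 + [0.1] * 8)
lost_track = is_terminated([True] * 5 + [False] * 10)
```

A track with 13 credible estimations in 20 frames is confirmed, whereas one with exactly 12 is not, because the rule requires strictly more than $M_c$.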

Scenario and parameters
We employ an ultra-wideband radar to detect a closed room in which a human target is walking in circles. The experiment scenario is shown in Fig. 8. The distance between the array and the surface of the front wall is 10 m. A step-frequency signal is used as the transmitting signal, with an initial frequency of 1.1 GHz, a terminal frequency of 2.1 GHz and a frequency step of 2 MHz. The room size is 10 m × 10 m, which is separated into 301 × 301 pixels. The parameters of target template construction are $m = 16$ and $T_s = 0.03A$, where $A$ is the local maximum. The threshold $\sigma_T$ is 0.8. The size of the sample set is $N_s = 75$. The parameters of the particle filter are as follows.
The variance $\sigma$ is 0.2. The threshold of resampling is $N_{th} = N_s/2$. As for track management, $M_c = 12$, $N_c = 20$ and $M_s = 10$, and the estimation $\hat{\mathbf{y}}_k$ is credible if the similarity value $\rho(\hat{\mathbf{y}}_k) \geq 0.8$.

Results
For comparison purposes, we consider the mean-shift algorithm. As shown in Fig. 9b-c and f-g, the trajectory of the mean-shift algorithm is terminated in the 104th frame and a new trajectory is initialised in the following frames by the trajectory management, while our proposed algorithm keeps up with the target image.
From Fig. 9, we can also see that the trajectories of the mean-shift algorithm are discontinuous and rough because the algorithm is intrinsically limited to exploring local maxima. In contrast, the trajectory of our proposed algorithm is continuous and smooth. Besides, with the track management, false tentative tracks are not confirmed because they cannot satisfy the track confirmation conditions.

Conclusions
In this paper, we have proposed a robust tracking algorithm for hidden human targets based on a particle filter via amplitude distribution. The experimental results have demonstrated that the proposed algorithm tracks the human target better than the mean-shift algorithm: its trajectory remains continuous and smooth, whereas the mean-shift trajectory is terminated and re-initialised.