Tracking Objects Based on Multiple Particle Filters for Multipart Combined Moving Directions Information

Object tracking is an important procedure in the computer vision field, as it estimates the position, size, and state of an object along a video's timeline. Although many algorithms with high accuracy have been proposed, object tracking in diverse contexts is still a challenging problem. This paper presents methods to track the movement of two types of objects: arbitrary objects and humans. Both problems estimate the state density function of an object using particle filters. For videos from a static or relatively static camera, we adjusted the state transition model by integrating the movement direction of the object. We also propose partitioning the object that needs tracking. To track a human, we partitioned the human into N parts and then tracked each part. During tracking, if a part deviated from the object, it was corrected by centering rotation, and the part was then combined with the other parts.


Introduction
Object tracking in videos is a technique that has applications in many fields. For example, in the biomedical field [1,2], the object-tracking technique is applied to automatically track cells as they are born, duplicate, move, and die. Other examples are autopilot systems, where the technique is used to observe and track the vehicles around the driving car [3,4], and footballer tracking [5][6][7]. A highly accurate vehicle-tracking program is indispensable for safety. Moreover, tracking technology is usually combined with identification and recognition systems to create a complete tactic for real-life applications.
Tracking objects in video is difficult due to many challenges that all need to be considered and solved. The first challenge is that we do not know, in advance, the object that we need to track; there may be no information about that object. In the absence of information, the object description given to the program must be highly general.
Another challenge is that the tracked object is heterogeneous in color, which varies by each part of the object. For example, to track human movement, the head is characterized by the hair color (black or yellow), while the body and legs are described by the colors of the shirt and pants that the person is wearing. Because of the challenges and difficulties mentioned above, no comprehensive tracking algorithm can be adopted for all problems.
In this paper, we present a method to modify the state model according to the direction of motion, predicting that an object appears in the same direction of motion with higher probability. In addition, we explore the effectiveness of tracking partially obscured objects by tracking their visible sections. To do this, we divided the object into multiple sections and tracked these sections independently. When some parts of the object are obscured, our approach should still successfully track the object's movement. We also present the experimental particle filter model and a suggestion for integrating information on the direction of the object's movement, the N-particle filter model, which tracks each part and then combines them. The rest of the paper is organized as follows: the most relevant work that motivated this paper is reviewed in Section 2. Section 3 describes, in detail, our method of multiple particle filters for multipart combined moving-direction information. Section 4 summarizes the results from our method. Section 5 is the discussion of our paper.

Related Work
The correlation filters approach is a powerful tool in digital signal processing [8,9].
This algorithm class utilizes the property of the Fourier transform of turning convolution in the spatial domain into function multiplication in the Fourier domain [10][11][12][13]. The original idea of the correlation filter was to solve the problem of locating an object in an image: if the object of interest appears in the image, its position, including the axis coordinates, is determined. The tool to solve this problem is the Average of Synthetic Exact Filters [10]. The next correlation filter is the Minimum Output Sum of Squared Error filter, studied by Bolme et al. [13]. This tracking method is very powerful and can cope with situations such as changing light and changes in the size and shape of objects.
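The Fourier-domain property described above can be illustrated with a minimal sketch: cross-correlating a template with an image by elementwise multiplication with the conjugate spectrum, then locating the object at the response peak. The function `correlate_fft` and the toy image are ours for illustration, not from the paper.

```python
import numpy as np

def correlate_fft(image, template):
    """Cross-correlate a template with an image via the Fourier domain.

    Correlation in the spatial domain becomes elementwise
    multiplication with the conjugate spectrum in the Fourier domain.
    """
    F = np.fft.fft2(image)
    H = np.fft.fft2(template, s=image.shape)  # zero-pad template to image size
    return np.real(np.fft.ifft2(F * np.conj(H)))

# Toy example: locate a small bright patch in a dark image.
img = np.zeros((32, 32))
img[10:13, 20:23] = 1.0          # object occupies rows 10-12, cols 20-22
tmpl = np.ones((3, 3))           # template matching the patch
resp = correlate_fft(img, tmpl)
peak = np.unravel_index(np.argmax(resp), resp.shape)
print(peak)  # response peaks at (10, 20), the object's top-left corner
```

The same multiplication trick is what makes correlation-filter trackers fast enough for real-time use.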
Avidan et al. [14] at Mitsubishi Electric Research Labs considered object tracking as a binary classification problem distinguishing background pixels from object pixels using the AdaBoost technique. The method's idea was to train weak classification functions to classify the background and object and then combine them into a strong classifier based on the AdaBoost mechanism. However, the author realized that if an object is not rectangular, pixels inside the containing rectangle but not on the object will be labeled as belonging to the object. These pixels are considered alien elements, and AdaBoost is sensitive to alien elements [15]. In addition, some other limitations of the approach are as follows: it has not solved the situation of an object being completely obstructed for a long time, and the feature space used in the algorithm does not yet utilize the spatial information of the image. The approach based on random-process filtering has been studied for a long time in the field of mathematical statistics, and many impressive results have been discovered [16][17][18]. Most of the algorithms based on this approach rely on the Bayes optimal solution for the hidden Markov filtering problem [19][20][21].
That means building a hidden Markov model plays a key role: the more accurate the model, the more accurately the Bayesian solution estimates the state of the object. The work in [20] uses a color-histogram feature to construct a particle filter to track objects. The work in [22] uses gentle AdaBoost to construct an observation model updated over time.
Recently, the Siamese network-based trackers have received significant attention for their well-balanced tracking accuracy and efficiency.

Methodology
Problem 1. The first-frame coordinates (x_1, y_1, ω_1, h_1) are given, and we need to infer the object coordinates (x_k, y_k, ω_k, h_k) in the subsequent frames.
To filter the state (x_k, y_k, ω_k, h_k) of the object in the subsequent frames, we rely on hidden Markov model theory with the construction of two models: state transition and observation. The state transition models in the literature are quite similar and are all Gaussian motion; the main difference between the algorithms lies in the observation model. Using particle filters allows us to better handle color clutter in the background, as well as to track completely obstructed objects. The particle filter operates by prediction and update, where prediction estimates the object status at time n + 1 based on the observations up to time n. Based on the particle filter operating mechanism in Figure 1, we present the approximation of the posterior density function p(x_k | y_{1:k}) at time k.
From the posterior distribution at time k − 1, p(x_{k−1} | y_{1:k−1}), we calculate the prior distribution for time k (without observing y_k) by using the Chapman–Kolmogorov equality [16]: p(x_k | y_{1:k−1}) = ∫ p(x_k | x_{k−1}) p(x_{k−1} | y_{1:k−1}) dx_{k−1}, where p(x_k | x_{k−1}) already exists in the state transition model and p(x_{k−1} | y_{1:k−1}) is the posterior of step k − 1.
After observing y_k, we update the prior density function from the prediction step at time k: p(x_k | y_{1:k}) = p(y_k | x_k) p(x_k | y_{1:k−1}) / p(y_k | y_{1:k−1}), where p(y_k | x_k) already exists in the observation model and p(x_k | y_{1:k−1}) is the prior at time k calculated in the previous step.
As a result, we obtain a weighted sample set {x_k^(i), w_k^(i)} representing the posterior density function at time k. When an object is in motion, it usually moves along a specific trajectory. Therefore, to predict the object's location, we propose integrating the direction of motion, which is discussed in detail in Section 3.1. In addition, different parts of an object carry their own distinctive characteristics of shape, color, light absorption, and reflection capacity. If we use a single particle filter, it can yield false tracking results. Besides, if a part of the object has the same color and brightness level as any other object in the frame, the tracking may be distorted. To fix this problem, we propose dividing the object into many parts, each of which has uniform properties. We then track the movement of each part under the constraints that these parts move in the same direction and maintain similar area and shape.
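The prediction/update recursion above can be sketched as a minimal bootstrap particle filter for a scalar state. This is an illustrative toy, not the paper's implementation: we assume a Gaussian random-walk transition and a Gaussian observation likelihood, and the noise parameters `sigma_q` and `sigma_r` are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, observation, sigma_q=1.0, sigma_r=1.0):
    """One predict/update cycle of a bootstrap particle filter.

    Prediction: propagate particles through the (here, Gaussian
    random-walk) state transition p(x_k | x_{k-1}).
    Update: reweight by the observation likelihood p(y_k | x_k) and
    normalize; the weighted set approximates p(x_k | y_{1:k}).
    """
    particles = particles + rng.normal(0.0, sigma_q, size=particles.shape)
    likelihood = np.exp(-0.5 * ((observation - particles) / sigma_r) ** 2)
    weights = weights * likelihood
    weights = weights / weights.sum()
    # Resampling keeps the particle set from degenerating over time.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# Track a slowly drifting scalar state from noisy observations.
true_x = 0.0
particles = rng.normal(0.0, 1.0, 500)
weights = np.full(500, 1.0 / 500)
for _ in range(30):
    true_x += 0.5
    y = true_x + rng.normal(0.0, 0.5)
    particles, weights = particle_filter_step(particles, weights, y)
estimate = float(np.sum(weights * particles))
print(abs(estimate - true_x) < 3.0)  # the estimate stays close to the true state
```

The estimated state in the last line is the weighted average of the particles, matching the EstimatedStatus computation used later in the algorithms.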

Moving Direction Information.
For the videos filtered from the dataset in which the camera was relatively stable, we modified the state transition model in the hidden Markov model by integrating the direction of the object's movement. This means that instead of using the Gaussian motion state model, we projected these Gaussian functions into several different directions with different ratios before drawing a new sample. Because each object moves along a specific trajectory, the direction of the object's motion remains constant for a certain period of time. Specifically, we consider the direction of motion as a separate component. At each assessment, we update the direction of motion. We use this direction of motion to influence the particle filter at the prediction step, so that the particle filter predicts the object appearing in the same direction with higher probability.
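One simple way to realize the direction-aware prediction step described above is to shift the mean of the Gaussian proposal along the recently estimated motion direction. The helper `propose_with_direction` and its `step`/`sigma` parameters are our illustrative assumptions, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(1)

def propose_with_direction(particles, prev_center, curr_center, step=1.0, sigma=2.0):
    """Direction-aware prediction step (illustrative sketch).

    Instead of a zero-mean Gaussian random walk, the proposal mean is
    shifted along the object's recent motion direction, so particles
    land with higher probability where the object is expected to
    reappear.
    """
    direction = np.asarray(curr_center, float) - np.asarray(prev_center, float)
    norm = np.linalg.norm(direction)
    if norm > 0:
        direction = direction / norm  # unit motion direction
    drift = step * direction
    return particles + drift + rng.normal(0.0, sigma, size=particles.shape)

# The object moved from (10, 10) to (14, 10): the direction is +x.
particles = np.tile([14.0, 10.0], (1000, 1))
moved = propose_with_direction(particles, (10, 10), (14, 10), step=4.0)
print(moved[:, 0].mean())  # mean x is pushed toward ~18; mean y stays near 10
```

When the camera is static, this bias concentrates the particle mass ahead of the object instead of spreading it symmetrically.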

Multiparts of an Object.
Considering the problem where the object's shape changes little, if the object includes many parts with dissimilar colors and contrast, using one particle filter for tracking will lead to incorrect tracking. Therefore, the object needs to be separated into n parts, each area having similar color, grayscale, and contrast, with each part tracked separately. In this way, the parts that are affected by the environment and other object artifacts will cause incorrect position identification and will need to be adjusted. For example, a human object can normally be represented by a 3-partition structure, as illustrated in Figure 3. This structure divides the human object based on the gray-level changes among the black head, the white-shirt body, and the black-pants legs. The resulting human object is divided into 3 parts, with borders represented by different gray levels, each part using a particle filter to track, and the parts are combined based on the best-matching part.
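The 3-partition structure above amounts to cutting the bounding box top-to-bottom by a fixed ratio. A minimal sketch, assuming the 1 : 5 : 3 head : body : legs ratio used in the experiments (the function `split_box_vertical` is ours for illustration):

```python
def split_box_vertical(x, y, w, h, ratios=(1, 5, 3)):
    """Split a bounding box (x, y, w, h) into vertical parts.

    Illustrative sketch of the 3-part human structure H: the box is
    cut top-to-bottom in the given ratio (e.g. head : body : legs
    = 1 : 5 : 3), and each part then gets its own particle filter.
    """
    total = sum(ratios)
    parts, top = [], y
    for r in ratios:
        part_h = h * r / total
        parts.append((x, top, w, part_h))
        top += part_h
    return parts

head, body, legs = split_box_vertical(0, 0, 30, 90)
print(head, body, legs)  # heights 10, 50, 30 for a 90-pixel-tall box
```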

Build Model.
The adjustment of deviated areas takes two steps: (i) the center of the similar area is changed, which allows the incorrect position of the similar area to be adjusted to the correct position on the object; (ii) the size ratio of the similar area allows it to scale its height to the height of the original object. Thus, when one part of the object is obscured or similar to another, we can still restore and track it.

Fine Tuning Parts.
Once the initial region of the object is defined, the object is divided into n parts in a structure H, as shown in Figure 3. We then used one particle filter to track each part S_1, S_2, ..., S_n. At each time point, the particle filters tracking the parts can diverge from each other. Therefore, a modification model is needed to correct this issue. We used the collected assessment data of each part to evaluate which tracked part behaves best. We kept this best-tracked part fixed and applied the rotation algorithm to the n − 1 other parts using the fixed part as the origin. The adjustment of the N particle filters when tracking an object in frame k is described below.
Step 1: we calculate the rotation angle from the center of each section to the remaining n − 1 centers based on frame 0.
Figure 1: Demonstration of the operating mechanism of a particle filter.
For example, the angles of rotation from the center of S_2^0 to the centers of the remaining parts S_1^0 and S_3^0 are θ_12^0 and θ_32^0, according to Figure 4(a).
Step 2: we suppose that S_1^k, S_2^k, ..., S_n^k are the estimates and S_1^0, S_2^0, ..., S_n^0 are the parts of the object in the original image. The distance is calculated as distance_i = ‖HOG(S_i^k) − HOG(S_i^0)‖_2, i = 1, ..., n. However, because the division of parts may not be equal, to determine which is best, we multiply each distance by a coefficient and then compare K_1 · distance_1, K_2 · distance_2, ..., K_n · distance_n. The smallest value is considered the best estimate. The best estimate at the kth frame is denoted S_min^k.
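The best-part selection in Step 2 can be sketched as follows. For self-containment we stand in a tiny gradient-orientation histogram for the HOG descriptor; the helpers `grad_hist` and `best_part` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def grad_hist(patch, bins=9):
    """Tiny stand-in for a HOG descriptor: a normalized histogram of
    gradient orientations over the whole patch (illustration only)."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-9)

def best_part(estimated_parts, reference_parts, K):
    """Step 2 sketch: pick the part whose descriptor is closest to its
    reference, after weighting each distance by the coefficient K_i."""
    scores = [k * np.linalg.norm(grad_hist(est) - grad_hist(ref))
              for est, ref, k in zip(estimated_parts, reference_parts, K)]
    return int(np.argmin(scores))

rng = np.random.default_rng(2)
refs = [rng.random((20, 20)) for _ in range(3)]
ests = [refs[0] + rng.normal(0, 0.5, (20, 20)),  # part 0 has drifted badly
        refs[1].copy(),                          # part 1 is tracked perfectly
        refs[2] + rng.normal(0, 0.5, (20, 20))]  # part 2 has drifted badly
print(best_part(ests, refs, K=(1.0, 1.0, 1.0)))  # part 1 is chosen
```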
Step 3: when the best part S_min^k has been selected, we perform the center rotation of the remaining n − 1 parts relative to S_min^k with the defined rotation angle. The result is the new center coordinates of the n − 1 parts: we find the new center coordinates of part S_i^k, i = 1, ..., n − 1, by rotating the center of S_i^k around the center of S_min^k through the angle (θ_{i,min}^k − θ_{i,min}^0) as follows:

Algorithm 1: Particle filter integrated with motion direction. The particles in pf are translated by the motion vector v, and the final step estimates the state of the object in the kth image as the weighted average of the new particle set: EstimatedStatus = Σ_{i=1}^{N_s} weight_i · particle[i].
Figure 3: Some structure H partition objects (parts S_1, S_2, S_3).
Step 4: the particle set of each part S_i^k, i = 1, ..., n − 1, differing from the best part S_min^k, is translated to the position of the new part S_i^k′ created in Step 3.
Step 5: the sizes of the parts S_1^k, S_2^k, ..., S_n^k are scaled according to the ratio of S_1^0, S_2^0, ..., S_n^0. That is, we calculate h_i^k and w_i^k, i = 1, ..., n − 1, of the sections. First, we find the horizontal dimension of each part S_i^k as follows: w_i^k = (Σ_{j=1}^n w_j^k)/n, i = 1, ..., n − 1.
Next, we find the height dimension of each part S_i^k in the same way. Step 6: the particle set of each part S_i^k, i = 1, ..., n − 1, is translated toward the best part (here S_2^k) by the distance d_{i,min}^k. For example, as shown in Figure 4(b), the particles of S_1^k are translated toward the best part S_2^k by about d_12^k. We present the multipart particle filter algorithm in Algorithms 2 and 3.
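The geometric correction in Steps 3 and 4 can be sketched as rotating each deviated center around the best part's center. The helpers below are our illustration: we assume centers as 2D points and choose the rotation sign so that a drifted center returns to its reference geometry.

```python
import numpy as np

def rotate_center(center, pivot, angle):
    """Rotate a part's center around the best part's center (Step 3
    sketch): the deviated center is swung through the given angle."""
    c, p = np.asarray(center, float), np.asarray(pivot, float)
    ca, sa = np.cos(angle), np.sin(angle)
    d = c - p
    return p + np.array([ca * d[0] - sa * d[1], sa * d[0] + ca * d[1]])

def adjust_parts(centers, best_idx, angles_now, angles_ref):
    """Correct every non-best center by rotating it about the best
    part's center through the difference between the reference and
    current inter-part angles (sign chosen so the center returns to
    its reference geometry); the particle sets would then be
    translated to the corrected centers (Step 4)."""
    pivot = centers[best_idx]
    out = list(centers)
    for i, c in enumerate(centers):
        if i != best_idx:
            out[i] = rotate_center(c, pivot, angles_ref[i] - angles_now[i])
    return out

centers = [np.array([0.0, 10.0]), np.array([0.0, 0.0])]  # part 0 drifted above part 1
# Originally part 0 sat at angle 0 (to the right of part 1); now it sits at pi/2.
fixed = adjust_parts(centers, best_idx=1,
                     angles_now=[np.pi / 2, 0.0], angles_ref=[0.0, 0.0])
print(np.round(fixed[0], 6))  # part 0 is rotated back to (10, 0)
```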

Environment.
Installation environment: we experimented on a computer running Windows 10 Pro 64-bit with 8 GB RAM and an Intel Core(TM) i5-3210M CPU @ 2.5 GHz, using the Matlab programming language, version R2016a.

Data Set.
In 2013, Wu et al. [33] gathered many video sources related to tracking and created ground truth for these videos to form the TB-100 dataset. Because TB-100 is a compilation of data from many sources, the contexts of the videos are very different and diverse in attributes such as the type of object to track, color or black-and-white video, and still or moving cameras. The video datasets used to support the findings of this study have been deposited at http://www.visual-tracking.net.
Challenges in the dataset include the following:
IV, illumination variation: the brightness of the subject varies significantly.
SV, scale variation: the ratio of the rectangle containing the object in the first image to that in the current image is out of the range [1/t_s, t_s], t_s > 1 (t_s = 2).
OCC, occlusion: the object is partially or completely obscured.
DEF, deformation: nonrigid objects that change shape.
MB, motion blur: the subject is blurred due to camera movement.
FM, fast motion: the ground-truth motion is greater than t_m pixels (t_m = 20).
IPR, in-plane rotation: the object rotates in the image plane.
OPR, out-of-plane rotation: the object rotates out of the image plane.
OV, out of view: part of the object leaves the image.
BC, background clutter: the background near the object has the same colors or lines as the object.
LR, low resolution: the number of pixels in the rectangle containing the object (in the ground truth) is less than t_r (t_r = 400).
The abovementioned challenges are distributed over the dataset, as shown in Figure 5.
Algorithm 2: Particle filter for the random process (x_n, y_n, ω_n, h_n). Input: the particle sample set pf (based on Algorithm 4) and the kth image (k starts from the second image). Output: the new particle set representing p((x, y, ω, h) | image k) and the estimated state of the object in the kth image. Step 1: for each ith particle, propose (x_new, y_new, ω_new, h_new) by Algorithm 8 and update its weight, weight_i = weight_i · likelihood. Step 2: calculate the sum of the weights, sw = Σ_{i=1}^{N_s} weight_i. Step 5: estimate the state of the object in the kth image as the weighted average of the new particle set, EstimatedStatus = Σ_{i=1}^{N_s} weight_i · particle[i].
Algorithm 3 (MultiPart). Step 1: initialize the N particle sets pf_1, pf_2, ..., pf_n. Step 2: for i = 1 to n, take the pattern D_i for part i according to Algorithm 4 and train the strong classifier F_i = gentleAdaboost(D_i) according to Algorithm 5. Step 3: while the video is not over, get the observation image obs; for i = 1 to n, use a particle filter to estimate the ith state according to Algorithm 1 or Algorithm 2; choose the best part i_0 = arg min(K_1 · distance_1, K_2 · distance_2, ..., K_n · distance_n); using the center of part i_0 and the structure H, rotate the remaining n − 1 centers; translate the other n − 1 particle sets to the newly rotated centers; scale the particle sets according to the structure ratio H; translate the other n − 1 particle sets toward part i_0; then, for i = 1 to n, take a new sample D_i* based on the section rotated from i_0 and update the strong classifier F_i according to Algorithm 6.

Evaluating.
We use the evaluation criteria presented on the benchmark site [33] to evaluate the tracking algorithm.
Method 1 (R_1), evaluation based on Euclidean distance (precision plot): we measure the Euclidean distance d from the center estimated by the algorithm to the actual center of the object (ground truth); if d is less than or equal to a threshold t_0, the frame is counted as successful, according to Figure 6(a). Method 2 (R_2), evaluation based on the level of overlap (success plot): the overlap score is defined as S = |r_t ∩ r_a| / |r_t ∪ r_a|, in which r_t is the bounding rectangle determined by the algorithm and r_a is the ground-truth rectangle, according to Figure 6(b).
We calculate the ratios R_1 and R_2 as the number of successful frames divided by the total number of frames.
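The two criteria can be sketched directly. These helper functions are our illustration of the standard precision and success computations (the threshold t_0 = 20 in the comment is a common benchmark choice, not stated in this excerpt):

```python
import numpy as np

def center_error(est_center, gt_center):
    """Euclidean distance between the estimated and ground-truth
    centers (precision-plot criterion R1)."""
    a = np.asarray(est_center, float)
    b = np.asarray(gt_center, float)
    return float(np.linalg.norm(a - b))

def overlap_ratio(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes
    (success-plot criterion R2)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def success_rate(flags):
    """Ratio of successful frames to total frames."""
    return sum(flags) / len(flags)

# A frame is "successful" under R1 if center_error <= t0 (e.g. t0 = 20).
print(center_error((10, 10), (13, 14)))               # 5.0
print(overlap_ratio((0, 0, 10, 10), (5, 0, 10, 10)))  # 50/150 = 1/3
print(success_rate([True, True, False, True]))        # 0.75
```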

Result.
The results below are for tracking people when the camera does not fluctuate much, the rotation angle is conserved, and the proportions of the person's body parts are not too small.
We name Program 1 MultiPart3, which uses the MultiPart algorithm, dividing the object into 3 parts in a ratio of 1 : 5 : 3; Program 2 is MultiPart3_direction, which uses the MultiPart algorithm with the moving direction of the object, dividing the object into 3 parts in a ratio of 1 : 5 : 3; Program 3 is MultiPart2, which uses the MultiPart algorithm, dividing the object into 2 parts in a ratio of 5 : 3; and Program 4 is MultiPart2_direction, which uses the MultiPart algorithm with the moving direction of the object, dividing the object into 2 parts in a ratio of 5 : 3. These programs, compared with the DiMP [31] and GradNet [32] algorithms, are shown in Table 1 and Figure 7.
The MultiPart2 algorithm uses 2 particle filters in a ratio of 5 : 3 for tracking, letting the larger upper portion (head and body), which is more informative, changes less over time, and is "denser" than the legs, dominate. Its average accuracy (R1 = 92.2%, R2 = 87.9%) is slightly higher than that of the GradNet algorithm (R1 = 85.9%, R2 = 86.3%). From these results, we can see that tracking the part with the most information gives good average results compared with tracking the whole object. However, for the videos (Dancer and Dancer2) in which the tracked person wears a skirt or long dress covering the feet, tracking with 2 particle filters at a ratio of 5 : 3 gives a lower result than tracking the whole object; tracking the object intact is best in this case.
For the videos mentioned above, the MultiPart3 algorithm divides the object into 3 parts in a 1 : 5 : 3 ratio, and after each image, the parts are adjusted according to the rotation technique. The MultiPart3_direction algorithm approximates the average accuracy of the MultiPart3 algorithm because, in the Dancer data, the human object jumps up and down suddenly, and since refining the data takes a number of frames, the determination of the motion direction is wrong.

Conclusions
This paper presented several methods for object tracking in videos, mainly related to particle filters. To solve the general problem, we built a hidden Markov model and applied particle filters. For tracking human videos in normal conditions, where the human scale is preserved, we used 3 particle filters to track each part of the body, or we tracked the part of the body containing the most information and then inferred the whole body. Experimental results show that dividing the object into (n + m) parts works even when n parts of the object are partially obscured: the remaining m parts are tracked normally, and the occlusion does not affect the tracking of the subject in the video. A future direction of this paper is to change the observation model. We found that the gentle AdaBoost training process is time consuming, whereas algorithms using correlation filters have the advantage of being fast and highly accurate. For future studies, we suggest integrating correlation filters into the observation model to shorten the execution time. In addition, we plan to track the parts of the object in parallel to shorten the execution time. Table 2 describes the notations.

Data Availability
Public data were used in this research. The TB-100 data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.