Video Tracking for Visual Degraded Aerial Vehicle with H-PMHT

This paper describes a novel approach for automatic video tracking of visually degraded air vehicles in daylight against a sky background. The proposed video object tracking method is based on the Histogram Probabilistic Multi-Hypothesis Tracker (H-PMHT) algorithm. H-PMHT is an expectation-maximization based algorithm developed for tracking objects in heavily cluttered environments by using intensity-modulated data streams. In its basic form, H-PMHT is suited to the linear-Gaussian point spread function case. However, recent studies have indicated that the algorithm is also applicable to non-linear and non-Gaussian target shapes. H-PMHT is therefore a suitable alternative for tracking applications with sonar, high-resolution radars, IR and UV sensors, and cameras. In this work, H-PMHT is used for video tracking of visually degraded air vehicles. For this purpose, RGB video data is processed with a reciprocal pixel intensity measurement so that it meets the requirements of the tracking process. A simulation study is conducted to demonstrate the video tracking performance of H-PMHT against visually degraded air vehicles. The results obtained with H-PMHT are also compared with those of the Interacting Multiple Model Probabilistic Data Association algorithm with amplitude information (IMMPDA-AI).


Introduction
Conventional single-point tracking algorithms are not fully compatible with video object tracking scenarios, because objects have time-varying shapes and spread over many pixels. In conventional algorithms, tracking is performed by processing point-measurement data [1]-[3]. To realize video object tracking with point-measurement data, the pixel-spreading object is replaced by a single point, such as its mid-point or center of gravity. In both cases the point measurement does not represent the pixel-spreading nature of the object exactly, which increases the measurement error. Using the whole data stream may be an effective approach for obtaining more accurate track estimates, and it also provides an optimal approach. In this context, the Histogram Probabilistic Multi-Hypothesis Tracker (H-PMHT) offers an optimal and, furthermore, track-before-detect (TkBD) algorithm.
H-PMHT [4] is essentially an EM-based algorithm for handling data streams and tracking objects in high-clutter environments. It uses histogram intensity data directly and provides satisfactory results for one-dimensional [5], [6] and two-dimensional [7] applications. In the applications reported in [5]-[7], the spread of the target intensities is almost linear-Gaussian. For non-linear and non-Gaussian applications, a particular solution based on particle filters is presented in [8]. In addition, the video tracking applications presented in [9], with specially processed video data, and in [10], with a simulated image sequence, were realized with a modified H-PMHT called H-PMHT with random matrices (H-PMHT-RM).
In this study, H-PMHT is used for video tracking of visually degraded air vehicles in flight. Visually degraded air vehicles have low-emissive, environmentally compatible dark paint and peculiar structures for absorbing and scattering light. In the look-up case, detecting air vehicles in daylight is almost inevitable whether or not their visibility has been degraded, since the sky is generally much brighter than the vehicle [11]. Therefore, the innovation of this study does not lie in the detection of the air vehicle and its conventional tracking, but in processing the video data and using it for tracking purposes. Visually unprocessed air vehicles produce higher pixel intensity ratios than visually degraded ones and, in some conditions, depending on the position of the sun and the reflection of its light, even higher pixel intensity ratios than the background. In contrast, visually degraded air vehicles normally produce lower pixel intensity ratios than the sky background. This phenomenon complicates processing for tracking algorithms based on intensity ratios, including H-PMHT.
To detect the target location from the data streams, H-PMHT searches for clusters of pixels with high intensity ratios and decides whether they originate from a target or from clutter. However, visually degraded aerial vehicles in daylight have lower intensity histogram data than the background, which is composed of sky and clouds. This condition arises especially when air vehicles are tracked from ground to air or from air to air. It results in a low pixel intensity ratio for targets and a high pixel intensity ratio for the background. To obtain H-PMHT-compatible measurement data, video of the aerial vehicles is first taken in true color (RGB), and the reciprocal pixel intensity measurement (RPIM) technique is then applied to this data. In this way, proper data streams composed of aerial vehicles and sky background are obtained for processing with H-PMHT. The obtained data has a non-linear and non-Gaussian target distribution. Thus, not only is a technique for tracking visually degraded aerial vehicles presented, but the performance of standard H-PMHT against objects with a non-linear, non-Gaussian, and time-varying point spread function is also evaluated. A simulation study is conducted for different conditions and cases, and its results are presented in the related section. The results obtained with H-PMHT are also compared with those of IMMPDA-AI, which is not only a trustworthy probabilistic algorithm but also uses amplitude information [12], [13].

Measurement Data Configuration
In this study, the aim is to develop a video tracking application for visually degraded aerial vehicles. For this purpose, RGB videos of different visually degraded aerial vehicles were shot with an ordinary camera. In the calculations, each frame is processed separately in time sequence. The first operation on the RGB images is to convert them into intensity images. The simplest way to obtain an intensity image is the averaging method, in which the average of the three color channels is taken as the intensity (grayscale) value. In this case, however, the resulting image turned out rather dark. The weighted method provides a solution to this problem. Two considerations are taken into account: red has the longest wavelength of the three colors, and green is more soothing to the eye than red and blue. The conversion is therefore adjusted by decreasing the contribution of red, increasing the contribution of green, and selecting a proper contribution for blue. The NTSC standard for RGB-to-grayscale conversion is thus defined as in (1), and this general conversion method is used to obtain the intensity image in this work.

$$I(x, y) = 0.2989\,R(x, y) + 0.587\,G(x, y) + 0.114\,B(x, y) \tag{1}$$

After the pixel intensity ratios of the RGB measurement data have been obtained, their reciprocals are calculated and multiplied by 255 to bring the reciprocal intensity ratios onto a common scale, as given in (2). The resulting data is called the reciprocal pixel intensity measurement (RPIM) and is suitable for use with H-PMHT. As an illustration, an RGB image captured from the video data is shown in Fig. 1, and the related intensity image is shown in Fig. 2. The RPIM image is shown as scaled image data in Fig. 3. RPIM is composed of data streams and projects the physical features of the aerial vehicle without distortion in shape or size. In this case, the spread functions of the targets are non-linear, non-Gaussian, and time-varying.

Basic Structure of H-PMHT
The H-PMHT algorithm is introduced in [4] together with its theory and derivations. Only a general structural outline of H-PMHT is given here. Before the derivation is outlined, the parametric TkBD structure of H-PMHT is described. In classical methods, the thresholding, clustering, extracting, and tracking steps are performed consecutively. In the TkBD method, in contrast, all of these steps occur concurrently [14]. TkBD combines target detection and estimation by removing the detection algorithm from the process and supplying the whole sensor frame directly to the tracker. This improves track accuracy and allows the tracker to follow low-SNR targets [10]. H-PMHT incorporates the TkBD capability into the algorithm, which enables it to extract extended object tracks directly from an image sequence.
In the derivation of H-PMHT, the intensity of the sensor data is transformed into histogram data by a quantization process. Quantization is an intermediate step used only for derivation purposes. In the final step of the derivation, the limit of the quantization is taken and the original data is used in the implementation. The quantized data is interpreted as the number of measurements (shots) that fall within each cell. The total number of measurements is obtained by summing over all cells. The total measurement arises from the intensities of background scattering and of the objects in the sensor region. The probability mass function of these discrete measurements is modeled as a multinomial distribution. Expression (3) gives the probability that an individual histogram shot falls in a given cell.
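The quantization step and the multinomial model can be illustrated with a minimal sketch. The quantum size `quantum`, the helper names, and the example cell probabilities are assumptions made for illustration. Because the derivation takes the limit of the quantization, this step does not appear in an actual implementation.

```python
import math
import numpy as np


def quantize(intensity, quantum):
    """Turn cell intensities into integer shot counts (hypothetical quantum size)."""
    return np.floor(np.asarray(intensity, dtype=float) / quantum).astype(int)


def multinomial_log_pmf(counts, probs):
    """Log-probability of the cell counts under a multinomial model."""
    counts = np.asarray(counts).ravel()
    probs = np.asarray(probs, dtype=float).ravel()
    # log of N! / (n_1! * ... * n_S!)
    log_coeff = math.lgamma(counts.sum() + 1) - sum(math.lgamma(c + 1) for c in counts)
    occupied = counts > 0  # empty cells contribute a factor of 1
    return log_coeff + float(np.sum(counts[occupied] * np.log(probs[occupied])))


intensity = np.array([2.5, 0.4, 1.0])
counts = quantize(intensity, quantum=0.5)   # shots per cell
total_shots = int(counts.sum())             # summation over all cells
log_p = multinomial_log_pmf(counts, [0.6, 0.1, 0.3])
```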
$B_\ell(t)$ is the border of cell $\ell$ with respect to its dimension, and $\theta_t$ denotes the parameter vector of the sample probability density function (PDF) at time $t$. $f(\tau|\theta_t)$ represents a sample PDF defined over the whole sensor output space ($\tau \in \mathbb{R}^{\dim(C)}$), and it is the superposition of a background clutter model ($m = 0$) and $M$ target models. An additional classification is applied to all cells ($\ell = 1, ..., S$). Here $\ell = 1, ..., L(t)$ denotes the displayed cells, and the remaining cells ($\ell = L(t) + 1, ..., S$) are said to be truncated. Displayed cells are separated from truncated ones by comparison with a predetermined threshold. If the intensity ratio of a cell is higher than the threshold level, the cell is treated as a displayed cell. The threshold level may be set to zero, in which case all cells in the sensor area are treated as displayed cells. Selecting a threshold level higher than the noise floor reduces the number of displayed cells. In this way, cells can be eliminated and processing time decreases. At low SNR values a low threshold level may be an advantage, whereas at high SNR values a high threshold level is convenient. The sample PDF (4) is assumed to be a mixture density in which $\Pi_t = \{\pi_{tk}\}$ are the mixing proportions. The mixing proportions form a probability vector, that is, $\pi_{tk} \geq 0$ and $\sum_{k=0}^{M} \pi_{tk} = 1$. $\pi_{tk}$ represents the fraction of the total power due to the targets $k = 1, ..., M$ and the background noise $k = 0$. $G_k(\tau; X_t)$ models the cell-to-cell variations of the targets for $k = 1, ..., M$ and of the background noise for $k = 0$.
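The mixture structure and the split into displayed and truncated cells can be made concrete with a small sketch. It is not the paper's model: the Gaussian target spread, the uniform background, the unit-size pixel cells, and all function names are assumptions chosen only to give a runnable example.

```python
import math
import numpy as np


def displayed_mask(intensity, threshold=0.0):
    """True for displayed cells, False for truncated ones."""
    return np.asarray(intensity, dtype=float) > threshold


def gaussian_cell_probs(center, sigma, shape):
    """Integrate an axis-aligned Gaussian over unit-size pixel cells.

    ASSUMPTION: Gaussian target spread with independent row and column axes.
    """
    def axis_probs(mu, n):
        edges = np.arange(n + 1, dtype=float)
        cdf = np.array([0.5 * (1.0 + math.erf((e - mu) / (sigma * math.sqrt(2.0))))
                        for e in edges])
        return np.diff(cdf)  # probability mass inside each cell

    return np.outer(axis_probs(center[0], shape[0]), axis_probs(center[1], shape[1]))


def mixture_cell_probs(proportions, component_probs):
    """Weighted sum of the per-component cell probabilities."""
    proportions = np.asarray(proportions, dtype=float)
    if np.any(proportions < 0) or not np.isclose(proportions.sum(), 1.0):
        raise ValueError("mixing proportions must be non-negative and sum to 1")
    return sum(p * c for p, c in zip(proportions, component_probs))


shape = (20, 20)
background = np.full(shape, 1.0 / (shape[0] * shape[1]))  # ASSUMED uniform clutter, k = 0
target = gaussian_cell_probs(center=(10.5, 10.5), sigma=2.0, shape=shape)  # k = 1
cell_probs = mixture_cell_probs([0.5, 0.5], [background, target])

# Toy intensity frame: a bright 3x3 patch on a dim background.
frame = np.ones(shape)
frame[9:12, 9:12] = 10.0
mask = displayed_mask(frame, threshold=2.0)
displayed_prob = cell_probs[mask].sum()  # probability mass over displayed cells only
```

Raising `threshold` shrinks `mask`, which is the cell elimination that reduces processing time.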
Once the probability that an individual histogram point measurement (shot) falls in a cell has been obtained, the derivation of the H-PMHT algorithm can begin. H-PMHT stems from PMHT [15], [16], and all derivations of PMHT are based on the Expectation Maximization (EM) method. H-PMHT can therefore be outlined in terms of expectation (E) and maximization (M) steps. In H-PMHT, the EM process treats the assignment of the histogram shots to the model components and the precise locations of the shots as missing data. It also accounts for unobserved cells, which are notionally sensor pixels for which no data was collected. The H-PMHT algorithm determines the probability of the missing data in the E-step and then refines the state estimates in the M-step. The initialization and iteration steps of H-PMHT are given below according to the E- and M-steps, without the derivation details.

Iteration Steps of H-PMHT Algorithm
The H-PMHT algorithm consists of steps that are repeated iteratively for each batch sequence $t = 1, ..., T$. Some of these iteration steps come from the E-step derivations, and the remainder come from the M-step derivations. Throughout the iterations, the dynamic matrix $F$ and the measurement matrix $H$ are assumed to be constant, that is, time invariant. The iteration steps that arise from the E-step are considered first.
Step-1 (Expectation): The aim of this step is to find the Total Sensor Probabilities (TSPs). To this end, the Target Cell Probabilities (TCPs) are first calculated for batch length $t = 1, ..., T$, for all cells $\ell = 1, ..., S$, and for all target models including the background, $k = 0, 1, ..., M$. Different equations are used to calculate the TCPs for the background and for the targets.
Total cell probabilities (7) are obtained by summing the products of the TCPs and the mixing proportions over all target models, including the background. Lastly, the TSPs are obtained from the displayed cells.
Step-2 (Expectation): Expected measurements are calculated for batch length $t = 1, ..., T$ and for all cells $\ell = 1, ..., S$.
This time, synthetic measurement covariance matrices are calculated for batch length $t = 0, ..., T - 1$.
Step-5 (Maximization): The mixing proportions are calculated.
At this point the EM process is complete. The following steps of the iteration perform the smoothing process and obtain the estimated values.
Step-6: At this step of the iteration, a Kalman smoother is applied. This portion of the algorithm is composed of a forward and a backward filter. The forward filter is calculated for $t = 0, ..., T - 1$ and is applied to the synthetic measurements in order to refine the target state estimates. At this point the dummy expectation is taken as $y^{(i+1)}_{0|0}(k) = 0$ and the dummy covariance is taken as $P^{(i+1)}_{0|0}(k) = 0$. The backward filter is calculated for $t = T - 1, ..., 1$. At the end of this step, the estimated target states are obtained for the selected batch period.
Step-7: In the last step, the estimated measurement and target covariance matrices are obtained. First, the cell-level measurement covariance is calculated. The estimated measurement covariance matrix is then calculated. The last operation of the iteration is to obtain the estimated target covariance matrices for all target models except the background.
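The forward and backward passes of Step-6 can be sketched with a generic Kalman filter followed by a Rauch-Tung-Striebel backward recursion. This is a textbook smoother, not the paper's exact equations: the function name, the process noise `Q`, the initial values, and the toy constant-velocity model are assumptions. In H-PMHT the inputs `zs` and `Rs` would be the synthetic measurements and synthetic measurement covariances produced by the earlier steps.

```python
import numpy as np


def kalman_smoother(zs, F, H, Q, Rs, x0, P0):
    """Forward Kalman filter followed by a Rauch-Tung-Striebel backward pass."""
    x_pred, P_pred, x_filt, P_filt = [], [], [], []
    x, P = x0, P0
    for z, R in zip(zs, Rs):
        # Forward filter: predict with the constant dynamic matrix F.
        xp = F @ x
        Pp = F @ P @ F.T + Q
        # Forward filter: update with the (synthetic) measurement z.
        S = H @ Pp @ H.T + R
        K = Pp @ H.T @ np.linalg.inv(S)
        x = xp + K @ (z - H @ xp)
        P = (np.eye(len(x)) - K @ H) @ Pp
        x_pred.append(xp)
        P_pred.append(Pp)
        x_filt.append(x)
        P_filt.append(P)

    # Backward filter: refine each estimate using the later ones.
    xs, Ps = list(x_filt), list(P_filt)
    for t in range(len(zs) - 2, -1, -1):
        G = P_filt[t] @ F.T @ np.linalg.inv(P_pred[t + 1])
        xs[t] = x_filt[t] + G @ (xs[t + 1] - x_pred[t + 1])
        Ps[t] = P_filt[t] + G @ (Ps[t + 1] - P_pred[t + 1]) @ G.T
    return np.array(xs), np.array(Ps)


# Toy one-dimensional constant-velocity target observed without noise.
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # position and velocity
H = np.array([[1.0, 0.0]])               # only position is measured
Q = 0.01 * np.eye(2)
zs = [np.array([float(t)]) for t in range(1, 11)]
Rs = [np.array([[1.0]]) for _ in zs]
states, covs = kalman_smoother(zs, F, H, Q, Rs, x0=np.array([0.0, 1.0]), P0=np.eye(2))
```

Passing one covariance per time step in `Rs` reflects that the synthetic measurement covariance changes from frame to frame.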

Simulation Study
In this study, the standard H-PMHT algorithm is used with RPIM data specific to visually degraded aerial vehicles. The study is carried out in two-dimensional space for location, with intensity information added to each two-dimensional cell, which can be regarded as a third dimension. Nevertheless, this work is treated as a two-dimensional application, so the assumptions in [7] can be adopted. The most important of these assumptions, worth recalling here, is that the x and y axes are statistically independent. Furthermore, no prior information about the point spread function of the objects is available in this study. Mostly they do not have a linear-Gaussian distribution, although they are likely to keep their original shapes.
To carry out the process, RPIM data is first obtained from different real-life videos of degraded air vehicles in daylight against a sky background. In the RPIM data, background clutter mostly stems from dark-grey clouds and from the dark-blue sky behind grey shading clouds. The tracking process is then applied to each RPIM data set by using the H-PMHT algorithm. To benchmark the results, each scenario is also processed with a trustworthy probabilistic algorithm, IMMPDA-AI. This algorithm combines the IMM estimator and the PDA technique [2], [3], and the addition of amplitude information makes its results suitable for comparison with those of H-PMHT. In this way, the performance of H-PMHT on RPIM data is compared with that of a proven probabilistic technique that also uses the intensity data of RPIM.
Since IMMPDA-AI is a point-measurement tracking technique, the RPIM data is remodeled to meet this requirement without compromising a fair comparison. To establish the measurement data for IMMPDA-AI, the mean value of the intensity ratio over the total sensor area ($E\{SA\}$) is calculated first. A threshold is selected 10% higher than this mean value, and the resulting value is taken as the amplitude information threshold ($AI_{thr}$). This brings the number of point measurements to a reasonable level for comparison. The whole sensor area is then divided into 5 × 5 pixel units, and the mean intensity value of each pixel unit ($E\{PU\}$) is calculated.
The mean intensity value of each pixel unit is compared with the threshold. If the mean intensity of a pixel unit is at least as high as the amplitude information threshold, $E\{PU\} \geq AI_{thr}$, the unit is taken as a measurement; otherwise it is not. For each measurement, the mean intensity of the related pixel unit, $E\{PU\}$, is taken as the amplitude information.
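The extraction of point measurements described above can be sketched as follows. The function and argument names are hypothetical, and reporting the centre of each pixel unit as the measurement position is an assumption, since the text does not state which point of the unit is used.

```python
import numpy as np


def point_measurements(rpim, block=5, margin=0.10):
    """Block-average an RPIM frame and keep the units that pass the threshold."""
    rpim = np.asarray(rpim, dtype=float)
    threshold = (1.0 + margin) * rpim.mean()        # 10% above the sensor-area mean
    rows, cols = rpim.shape[0] // block, rpim.shape[1] // block
    trimmed = rpim[:rows * block, :cols * block]    # drop incomplete border units
    unit_means = trimmed.reshape(rows, block, cols, block).mean(axis=(1, 3))
    r, c = np.nonzero(unit_means >= threshold)      # units accepted as measurements
    # ASSUMPTION: the measurement position is the centre of the pixel unit.
    centers = np.stack([(r + 0.5) * block, (c + 0.5) * block], axis=1)
    amplitudes = unit_means[r, c]                   # amplitude information
    return centers, amplitudes, threshold


# Toy 10x10 frame with one bright 5x5 unit in the upper-right corner.
frame = np.ones((10, 10))
frame[0:5, 5:10] = 10.0
centers, amplitudes, threshold = point_measurements(frame)
print(centers, amplitudes, threshold)
```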
To carry out the tracking process for visually degraded aerial vehicles, RPIM data is obtained for different environmental conditions and aerial vehicle types. For this purpose, video data was recorded for different aerial vehicles under different daylight conditions. The video was taken at 640 × 480 resolution with an ordinary 14.1-megapixel camera, and no mast was used to prevent shooting vibration. After the RPIM data had been obtained from the original video data, a sensor area of 200 × 200 pixels was selected for tracking purposes.
The performances of H-PMHT and IMMPDA-AI were analyzed with different real-life scenarios, each containing a single helicopter or aircraft with visually degraded paint. After numerous videos of visually degraded aerial vehicles had been recorded, the diversity of the evaluation parameters and the requirements were taken into account for scenario selection. In this way, various scenarios were constructed to evaluate the performance of the algorithms with respect to environment, speed, and target geometry. Since environmental conditions directly affect the signal-to-noise ratio (SNR), different SNR values were used, all of which are considered relatively low. In addition, different aerial vehicles were taken into account, which yielded different velocities, different pixel-spreading levels of the target, and different deviations from the linear-Gaussian shape. Each scenario contained 21 scans: one for initialization and the remaining 20 for the tracking process.
The initialization equation for the measurement covariance is given in (21), where the measurement error variance $\rho^2$ is 16 pixel$^2$. The initialization equation for the target covariance is given in (22), where the number of frames between samples of video data, $\Delta$, is 2 frames and the scale factor $\sigma$ is 1 pixel. The initial mixing proportions are selected as $\pi^{(0)}_{t0} = 1/2$ for the background and $\pi^{(0)}_{t1} = 1/2$ for the target. The same initialization values are used for all scenarios. The selected scenario parameters are summarized in Tab. 1.
Before the results of the tracking processes for the considered scenarios are given, the trajectory of the target centroids and the corresponding estimates of the H-PMHT and IMMPDA-AI algorithms are presented for a particular scenario. The target trajectory throughout the simulation and the estimates obtained with H-PMHT and IMMPDA-AI for the second scenario are given in Fig. 4.
To give a general idea of the error profiles, the RMS values of the estimation errors of the H-PMHT and IMMPDA-AI algorithms for the above trajectory are shown in Fig. 5. The performance of the algorithms is primarily evaluated by hit on the target (HoT) and RMS estimation error. CPU time is added as a third component of the assessment. The computer used in the simulations has an Intel Core i5-3470 CPU with 4 cores at 3.20 GHz and 4 GB of RAM, runs Windows 7 Professional, and uses a 64-bit instruction set. Tab. 2 gives the simulation results of the algorithms for each scenario for evaluation and comparison. In the table, the column "Dev. Trg. Cent." (deviation from target centroid) gives the mean values of the RMS estimation errors of the algorithms with reference to the target centroids throughout the scenarios.
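The two accuracy measures can be computed along the following lines. The RMS error is taken with respect to the target centroids, as stated above. The text does not define HoT precisely, so the sketch assumes that a scan counts as a hit when the rounded position estimate falls on a pixel belonging to the target; the per-scan boolean target masks are hypothetical inputs.

```python
import math
import numpy as np


def rms_error(estimates, centroids):
    """RMS distance between the position estimates and the target centroids."""
    diff = np.asarray(estimates, dtype=float) - np.asarray(centroids, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


def hit_ratio(estimates, target_masks):
    """ASSUMED HoT: fraction of scans whose estimate lands on a target pixel."""
    hits = 0
    for (row, col), mask in zip(estimates, target_masks):
        r, c = int(round(row)), int(round(col))
        inside = 0 <= r < mask.shape[0] and 0 <= c < mask.shape[1]
        hits += bool(inside and mask[r, c])
    return hits / len(target_masks)


# Two toy scans: the first estimate lies on the target, the second misses it.
mask = np.zeros((5, 5), dtype=bool)
mask[3, 4] = True
estimates = [[3.0, 4.0], [0.0, 0.0]]
centroids = [[0.0, 0.0], [0.0, 0.0]]
error = rms_error(estimates, centroids)
hot = hit_ratio(estimates, [mask, mask])
```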
In establishing Tab. 2, the displayed-cell intensity ratio threshold for the sensor measurements used in H-PMHT and the amplitude ratio threshold for the point measurements used in IMMPDA-AI were both taken as the sum of the mean intensity ratio of the sensor area and 10% of the maximum intensity (23). Tab. 2 shows that CPU time is rather high for H-PMHT and relatively low for IMMPDA-AI. These operational differences follow directly from the structures of the algorithms: IMMPDA-AI is a conventional point tracking algorithm, whereas H-PMHT conducts the tracking process over the whole sensor area. In fact, IMMPDA-AI is a time-tested, real-time, and proven algorithm, while H-PMHT is still a conceptual algorithm that has not yet matured enough for real-time applications. However, the results obtained with H-PMHT are superior to those of IMMPDA-AI with regard to HoT ratios and RMS estimation error values.
An additional analysis was carried out to determine the effect of the threshold values on the performance of the algorithms. In this analysis, the same threshold values were used for both algorithms for the sake of a fair comparison. Four scenarios were selected to examine the effect of the threshold on the performance of the algorithms with respect to SNR and target geometry. The limits of the analysis are defined as $E\{I_{SA}\} < Trh < I_{Trh}$ for lower and $I_{Trh} < Trh < AI_{max}$ for higher values of the intensity ratio threshold. The performance improves slightly for higher values of the threshold, up to the upper limit (the maximum value of the amplitude information). Beyond this value IMMPDA-AI does not work, because no data remains. The maximum RPIM intensity is higher than $AI_{max}$, so H-PMHT continues to work in this region until too little measurement data remains. On the other hand, the performance deteriorates as the threshold decreases within $E\{I_{SA}\} < Trh < I_{Trh}$. For H-PMHT, the deterioration rate at lower values depends on SNR and target area. For low values of SNR and target area the deterioration rate is high, whereas for high values the deterioration rate is low. Only slight changes occur in the performance of IMMPDA-AI at the lower threshold values.
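The threshold rule used for Tab. 2 and its effect on the number of retained cells can be sketched as follows. The synthetic 200 × 200 frame, its intensity range, and the function name are assumptions made for illustration. The scale factors 0.9 and 1.1 correspond to thresholds 10% below and 10% above the nominal one.

```python
import numpy as np


def intensity_threshold(rpim, fraction=0.10):
    """Mean intensity of the sensor area plus a fraction of the maximum intensity."""
    rpim = np.asarray(rpim, dtype=float)
    return float(rpim.mean() + fraction * rpim.max())


rng = np.random.default_rng(0)
frame = rng.uniform(1.0, 3.0, size=(200, 200))  # synthetic background (assumption)
frame[95:105, 95:105] = 12.0                    # synthetic bright target patch

base = intensity_threshold(frame)
for scale in (0.9, 1.0, 1.1):
    kept = int((frame > scale * base).sum())    # cells that survive the threshold
    print(f"threshold = {scale * base:.2f}, displayed cells = {kept}")
```

In this toy frame the lower threshold lets background cells through in addition to the target patch, which mirrors the deterioration reported for low threshold values.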
The analysis results are given in Tab. 3 under two selected thresholds: 90% of $I_{Trh}$ for the lower interval and 110% of $I_{Trh}$ for the higher interval. These values also characterize the behavior of the higher and lower portions. The deterioration threshold ratios for H-PMHT at higher values are also given in the