Feature Weighting and Retrieval Methods for Dynamic Texture Motion Features

Feature weighing methods are commonly used to find the relative significance among a set of features that are effectively used by the retrieval methods to search image sequences efficiently from large databases. As evidenced in the current literature, dynamic textures (image sequences with regular motion patterns) can be effectively modelled by a set of spatial and temporal motion distribution features like motion co-occurrence matrix. The aim of this paper is to develop effective feature weighting and retrieval methods for a set of dynamic textures while characterized by motion co-occurrence matrices.


Introduction
Dynamic textures are image sequences on natural scenarios possessing regular motion patterns.The motion assembly by a flock of flying birds, water streams, fluttering leaves, and waving flags are some of the most common examples of dynamic textures.Multimedia objects e.g., image sequences, are presented in terms of their features for the purpose of indexing and retrieval.A good feature has the property that it is similar among the same class of objects and dissimilar among different classes of objects.Feature weighing methods are commonly used to find the relative significance among a set of features.
Retrieval methods, on the other hand are used to search image sequences efficiently from large databases by using appropriate feature set.The growing popularity of the Internet, the introduction of new consumer products for digital image and video creation, and the emergence of digital standards for television broadcasting have resulted in a greater demand for efficient retrieval of multimedia data.Feature weighting and retrieval methods while applied on first order features deserve some special consideration.
Motion Co-occurrence Matrix (MCM), a commonly used fist order feature for analysing dynamic textures 3,7,11 , is computed from some motion related measures along a clique from time-space domain (Section 2.1).The conventional feature weighting approach considers the features altogether and computes feature weights using a heuristic search as otherwise an exhaustive search is highly time consuming under such a scenario for large feature sets.Considering domain information, however, MCMs can be organized into a hierarchy and weights can be computed at each level in the hierarchy.Such an approach facilitates using an exhaustive search at levels with a smaller number of features, and thus improving the overall classification accuracy.We thus propose in this paper a hierarchical approach for feature weighting that provides a mechanism to control motion measure specific MCM, clique and domain weights at different levels.Note that the conventional feature weighting approach is unable to control MCM weights domain wise and leaves room for some classification accuracy improvement as observed in the result section (Table 1).
Fast retrieval of image sequences from large databases is commonly performed using hierarchical retrieval methods.To perform hierarchical retrieval it is necessary to group the image sequences beforehand, based on some similarity measures.This phase is known as clustering.The goal is to group similar videos into a hierarchy of clusters and compute the cluster centres so that during retrieval, the query image sequence can be compared with the cluster centres instead of all the image sequences in the database.The cluster centre is commonly computed using the average of feature vectors of a set of videos in the cluster.Such averaging of MCMs however represents a mixed dynamic texture and commonly used distance measures between single and mixed textures suffer from the volumetric problem 1 where the MCM of the mix can represent none of the component textures.So, MCM averaging is not suitable to represent a cluster for retrieving dynamic textures based on their MCM.We thus propose a new cluster centre computation procedure and develop a Multi Feature Centre Retrieval (MFCR) method.As evidenced from experimental results, our proposed MFCR method provides better retrieval performances over the conventional hierarchical retrieval methods.
This paper is organized as follows.Some background and related works are briefed in Section 3. Some contemporary and our proposed feature weighting approaches are elaborated in Section 4. Our proposed retrieval method is elaborated in Section 5 along with some other retrieval methods applicable on MCM.Comparative study on experimental results of feature weighting and retrieval methods is presented in Section 6. Finally Section 7 concludes the paper.

Background and Related Works
MCM is the most commonly used first order motion distribution statistics in all the dynamic texture characterization techniques and for the sake of clarity of future discussion we first define MCM in this Section.There exist a number of motion based dynamic texture characterization techniques [1][2][3][4][5][6][7][8][9][10][11] .MCM as a first order feature is used in an explicit manner in Optimal Time Space Ratio technique 3,11 .The functionality of OTSR is thus also elaborated in this Section.

OTSR Technique
A dynamic texture has one temporal dimension and two spatial dimensions and in OTSR three instances of MCM namely temporal co-occurrence matrix Distance between dynamic textures is computed along each of these three cliques using KL-divergence and are fused into a single distance measure using a weighted Euclidian summation at an optimal weight.Assignment of explicit weights to time and space domains facilitates the use of any feature weighting technique to obtain the best classification results at optimal weight and hence the technique is named Optimal Time-Space Ratio (OTSR) technique.As first order features are directly used in OTSR, it provides a good platform for the application of our proposed feature weighting and retrieval methods and the reported experimental results in this paper are applied on an extended OTSR model as detailed next.

Feature Set of Extended OTSR Technique
Only a minimal set of three MCMs are considered in OTSR.In order to obtain more generalized feature weighting and retrieval methods, we need to extend the basic feature set of OTSR technique by incorporating more MCMs.Consequently, some modifications to the parameter notations of the basic OTSR technique are made below.
Assuming that each image processing element in the motion frame sequence M is associated with a motion vector ) , ( ζ be a vector-to-scalar mapping function.In the extended OTSR technique, the following two vector-to-scalar mapping functions are considered ) ( whereas in the basic OTSR technique we used only . Both magnitude and direction of motion vectors are quantized into motion measures using a linear quantization method.In order to distinguish MCMs corresponding to magnitude and direction, we

Feature Weighting Methods
The two most common groups of feature weighting methods are the filter and wrapper methods.Filter methods 16,27,29 determine the relevance of features to describe the data using statistical techniques and do not use feedback from subsequent learning or classification algorithms.Conversely, wrapper methods 14,17,26 use feedback that incorporate a Maximum Likelihood (ML) approach where weight assignment is optimized using classification accuracy as a performance measure.It is in this context that in this paper we focus on wrapper methods for feature weighting.Finding optimal weights of a large set of features using an exhaustive search is a highly time consuming process.Heuristic search algorithms are common in the literature for feature weighting under such circumstances, and Genetic Algorithm (GA) is the most commonly used method because it performs a randomized search for optimal feature weights and it is not susceptible to becoming stuck in local minima.We thus explore the GA for feature weighting in this paper.
The conventional Flat Feature Weighting (FFW) approach considers the entire feature set for computing weights without factoring in domain information and thus is able to use only heuristic search methods like GA.We thus develop a Hierarchical Feature Weighting (HFW) approach, which provides a mechanism to control MCM, clique, and domain weights at different levels.We have used both exhaustive search and GA based randomized search for feature weighting at different levels in the hierarchy.We first elaborate GA feature weighting method in this section followed by the proposed HFW approach.

Feature Weighting Using Genetic Algorithm
GA 14 performs the optimal weight search using two operators -crossover and mutation.The crossover operator has the effect of merging solutions whilst preserving the already successful feature weights.The mutation operator makes sure that GA does not get stuck to local minima.GA encodes feature weights in a string of binary digits.Each binary digit is known as a gene and the string of genes is known as a chromosome.A group of b consecutive genes in a chromosome is decoded to calculate the feature weight.For a total of n features the chromosome length is nb .
In order to find the optimal feature weights, GA starts with a large set (population) of randomly generated chromosomes.The fitness of each chromosome in the current population is evaluated from the classification accuracy of a dataset with feature weights decoded from corresponding chromosome.A global variable stores the feature weights corresponding to the chromosome with maximum fitness.A set of chromosomes, whose fitness is better than the remaining chromosomes in the current population, are used to fill in a mating pool for subsequent crossover and mutation operations.The crossover operator randomly selects two parent chromosomes from the mating pool and interchanges a portion of the gene stream between them to generate new child chromosomes (Fig 2 ).A mutation operator is then applied on the new chromosomes where every gene (bit) in a chromosome is altered with a certain probability.These newly generated chromosomes constitute the GA population for the next generation.During the evolution of GA, the classification accuracy of the underlying data set is used as a performance measure of the weight sets.GA stops when there is no improvement of average fitness of the population for a consecutive number of generations.

Hierarchical Feature Weighting (HFW)
Weight of a feature computed by any feature weighting approach is applied to the corresponding feature distance between two textures.We thus elaborate the distance measure first followed by the hierarchical feature weighting approach.
The primary dissimilarity between two textures i and j is computed in terms of motion magnitude MCM and motion direction MCM along each clique η using KL divergence (a commonly used distance metric for dynamic textures 3,7,12 ).Let the distance along clique η for magnitude MCM and direction MCM be denoted by respectively.These distances are fused into a single distance measure for each clique where are the weights of the magnitude and direction measures respectively for clique η and The optimized clique distances are then fused into two domain distance measures, one for each of the two domains as and .It is clearly evident from the above hierarchical distance measure that explicit clique and domain weights are maintained at different levels to provide a mechanism to control corresponding weights.
In order to compute feature weights, three levels of weighting are involved, using hierarchical distance measure.We use the wrapper method to optimize the distance measure at each level, i.e., the weight ratio is searched for which the dataset is classified with the highest accuracy.At the first stage, m η α and d η α are optimized for each clique η where distance between two textures is computed using (2).We use an exhaustive search up to two decimal points at this stage to find the optimal weights between magnitude and direction MCMs corresponding to each clique.Although two weights are involved, due to the where the distance between two textures is computed using (5).At this level again we use an exhaustive search up to two decimal points, as the degree of freedom is 1.
The HFW approach has the following benefits over the FFW approach: (i) As the search space is divided into smaller groups in HFW, we are able to use the exhaustive weight search at top and bottom level that FFW fails to use because of huge search space.Exhaustive search is better than randomized search and as a result HFW achieves better classification accuracy than FFW approach; (ii) for the same reason HFW is faster than FFW.If weighting needs to be adjusted adaptively, HFW is clearly a better alternative to FFW; and (iii) HFW provides a mechanism to control weights among cliques as well as domains that FFW fails to provide.HFW clearly portrays relative significance among a set of cliques within a domain to make the selection process of a minimal set of features easy.

Retrieval Methods
Time efficient retrieval of visual documents from large databases has become an important research area in recent times because of the proliferation of video and image data in digital form.The most commonly used video retrieval scheme in contemporary visual document retrieval systems 18,19,22,24,28 is Query By Example 13 (QBE), where given a query video, similar examples are retrieved by comparing the low level features of the query video with the corresponding precomputed features of the videos in the database.Such exhaustive search for similar videos in the database is highly time consuming and thus restricts the use of QBE for many practical applications.One approach to solve this problem is to create an indexing scheme by grouping similar videos beforehand into similar clusters so that at the time of the query, only the relevant set of clusters are examined.These time efficient retrieval methods are commonly known as hierarchical retrieval methods.In this section we elaborate two existing such retrieval methods, HREC (Hierarchical Retrieval using Exhaustive Centre 7 ) and HRRC (Hierarchical Retrieval using Representative Centre 15 ).
As mentioned in Section 1, to perform hierarchical retrieval similar videos are grouped into a hierarchy of clusters beforehand and the cluster centres are computed so that during retrieval, the query video is guided to the proper cluster by comparing its features with those of the cluster centres.The basic hierarchical clustering algorithm used in all retrieval methods is presented in Fig 3 .Hierarchical clustering starts by placing all the videos in the database into distinct clusters.These clusters are iteratively merged into new clusters based on some similarity measure DClust and placed higher in the hierarchy (Fig 4).The merging process ends when only the root cluster is left.
During the retrieval tree building process, the cluster centre is computed for each newly formed cluster.A cluster centre is a hypothetical video that in general is represented by a feature vector computed over the feature vectors of the videos in that cluster.During retrieval, similarity between the feature vector of the query video and that of a cluster centre is computed using some similarity measure DQuery .The search process starts at the root cluster by comparing the query video with its two children.Based on similarity, the appropriate cluster is chosen from these two for searching in the next level.The process continues until the search reaches a leaf video node.
HREC, HRRC and our proposed MFCR method differs in defining DClust , cluster centre and DQuery , as detailed next.

The HREC Method
During the retrieval tree building process, the distance DClust between two clusters k C and l C , is defined in terms of the distances of the videos as is the distance between videos i and j .
The cluster centre for cluster k C is represented by a hypothetical video whose feature vector is obtained by averaging the feature vectors of all the videos in k E .During retrieval, the distance between a query video i and a cluster k C is computed as where k c is the hypothetical video representing cluster k C .

The HRRC Method
During the retrieval tree building process, the distance DClust between two clusters k C and l C , is defined in terms of the distances of the videos as is the distance between videos i and j .
The cluster centre for cluster k C is represented by a hypothetical video whose feature vector is obtained by averaging the feature vectors of a set of the k R representative videos in k E that are selected as During retrieval, the distance between a query video i and a cluster k C is computed using (7).

The MFCR Method
The feature vector computation process for cluster centre in existing HREC and HRRC methods is not suitable when each feature in the vector is a MCM.Average of a set of MCMs actually represents a mixed texture 1 , and distance computation between single and mixed textures in terms of MCM suffers volumetric problem.Thus during retrieval the distance between the feature vector of a query video and that of the cluster centre is faulty when MCM is used as feature.We thus propose a new approach in the MFCR method to compute the cluster centre.
In the proposed MFCR method the distance DClust between two clusters k C and l C is computed using (8).The cluster centre k c for cluster k C is associated with a set of k R representative videos instead of a hypothetical mixed video.The k R representative videos for cluster k C are selected using (9).During retrieval, the distance between a query video i and a cluster k C is computed as As distance is computed between feature vectors of single videos in (10), the volumetric problem is resolved in the proposed MFCR method.A cluster centre k c is associated with a set of k R feature vectors and hence we use the name Multi Feature Centre Retrieval method.For the sake of simplicity we have used the same number of representatives for all the clusters.Thus considering R representative videos, a cluster .With an increased number of representative videos, the probability of a query video belonging to the class of any representative video will be high, leading to higher retrieval accuracy, although at the cost of higher search time (as more distances need to be computed).Thus our proposed MFCR provides a user-controlled threshold (number of cluster representatives R ) to control between retrieval accuracy and search time.In terms of storage overhead, MFCR is better than HREC and HRRC.In MFCR, when clusters are created we need to store only references to the representative videos in the database for cluster centre as we use their features directly whereas in both HREC and HRRC methods additional storage is required for the feature vector for the cluster centre that are generated by averaging multiple feature vectors.

Experimental Results
We conducted a set of feature weighting and retrieval experiments using the features of Extended OTSR technique.We used the dynamic texture dataset consisting of 124 video clips of Szummer image sequences 20,25 .Motion vectors were computed using block motion computation method 21 where π 2 is the maximum possible vector angle and 25 , , 1 , 0 K = i .For feature weighting experiments using both the FFW and HFW approach, we used the half of the dataset for learning feature weights and the full dataset for evaluation.In order to conduct the classification we used 1-NN classification method.For the FFW approach we used GA as the underlying feature weighting method.For the GA weighting method we considered 10,000 chromosomes in the GA population.The crossover operation is performed by dividing a chromosome into two equal halves and interchanging gene streams between two right corresponding halves of two different chromosomes.For mutation, a gene is altered if the corresponding random number generated using uniform distribution is less than 0.1.We used the same GA setup for the second level weighting in the HFW approach.
For retrieval experiments, the retrieval tree was built up using all the videos in the dataset.For retrieval purpose all the videos in the database were supplied as a query.We used a 1-NN classifier to evaluate the results of the retrieval methods.The feature vector of each video consists of 18 MCMs.To calculate the distance between videos we used hierarchical distance measure ∇ defined in (5) where the MCMs were weighted using the weights computed by the HFW approach.

Feature Weighting Experiments
Feature weights obtained by FFW and HFW are presented in Fig 5 and Fig 6 respectively.The performances of the feature weighting approaches are presented in Table 1.It can be observed from Table 1 that with feature weighting, classification accuracy improves by 9.1% and 10% for FFW and HFW approaches compared to that when no weighting in used.It is also observed that the HFW approach achieves better classification accuracy than the FFW approach.This can be attributed to the fact that two exhaustive search steps are used in HFW whereas FFW uses a randomized search.Note that the accuracy improvement is only 0.83%.This is because an exhaustive search is used to find weights between only two features at bottom level and two domains at top level compared to a significant number of cliques that are weighted at Step 2 in HFW that makes the influence more biased towards GA.However the influence of GA in FFW is more than that in the HFW approach.
It is evidenced from In OTSR, . It can also be observed from the middle level (Step 2) of HWF that the former clique triplet is better than the later.Finally the time-space weight ratio obtained at the top level (Step 3) of HFW portrays the importance of spatial domain over temporal that also agrees with the fact that an image sequence has two spatial and one temporal dimension.

Retrieval Experiments
In Fig 7 we report the retrieval accuracy results obtained using our proposed MFCR method for different number of cluster representatives using 1-NN classifiers.An important observation can be summarized from Fig 7.
With an increased number of cluster representatives, classification accuracy gets better in general.This is because, with an increased number of representative videos, the probability of a query video belonging to the class of any representative video will be high, leading to a higher retrieval accuracy.
Although the accuracy of MFCR increases with an increase in the number of representatives per cluster, this is not without cost.Fig 8 shows the average retrieval time computed over different dynamic texture classes.It can be observed that retrieval time increases with the increased number of representatives per cluster.This is because with a higher number of representatives more distances need to be computed, thus increasing retrieval time.Thus the number of cluster representatives can act as a good threshold to control between accuracy and retrieval time.An increased number of cluster representatives provide better classification accuracy at the cost of retrieval time.
In Table 2 a comparison of the proposed MFCR method with the HREC and the HRRC methods is provided.For MFCR we have used ⎡ ⎤ 7 124 log 2 = representatives per centre where 124 is the total number of videos in the database.It can be observed that the MFCR method outperforms the other two methods.The reason for this better accuracy level can be explained form from the fact that HREC and HRRC perform feature vector averaging, which is not suitable for features like MCM because of the volumetric problem.Thus query videos are misled to wrong clusters, resulting in poor retrieval accuracy.Our proposed MFCR method eliminates the volumetric problem and results in better retrieval accuracy.

Conclusions
In this paper we have developed effective feature weighting and retrieval methods to apply on MCMs.We have proposed a hierarchical approach for weighting a set of motion co-occurrence features to exploit the inherent domain influence by dividing the search space in such a way that the degree of freedom at the top and bottom level is low enough to use an exhaustive search.Consequently, better classification results have been achieved using this approach compared to the conventional flat feature weighting approach.We have modified the conventional hierarchical retrieval process so that the motion co-occurrence features can be used.
More precisely, we have replaced the feature computation process of cluster centres at each level by associating a set of feature vectors instead of a single average feature vector, as such average features neither represent the cluster nor are suitable for computing similarity between motion distribution features.As evidenced from the experimental results, our proposed method provides better retrieval results over the conventional hierarchical retrieval methods.Step 1 Step 2 Step 3 Use of first order motion features like MCM in OTSR facilitates the use of block motion leading to real time dynamic texture recognition.
denote the MCM corresponding to clique η by ζ higher number of cliques in the extended OTSR technique.MCMs are computed for temporal cliques t (Fig 1).Thus in the extended OTSR technique, each dynamic texture is identified by a that the remaining cliques are ignored as their respective MCMs are identical to the MCMs of the cliques considered.
optimized domain distances are fused into a single distance measure as ) the exhaustive search.At the second stage, η λ 's are optimized for each domain using the distance measures in (3) and (4).As the degree of freedom of the optimization process at temporal and spatial domains are exhaustive search is no longer feasible, hence we opt to use the randomized heuristic GA search technique.Finally domain weights t ω and s ω are optimized cluster centre k c

Fig 5 and
Fig 6 that the HFW approach provides a clear picture of MCM, clique and domain weights at different levels that FFW fails to provide.It can be observed from bottom level (Step 1) weights of HFW that on an average magnitude MCMs are getting more weights than direction MCMs and thus direction MCMs can be ignored for practical applications.

Fig 1 :
Fig 1: Spatiotemporal neighbourhood of point ) 0 , 0 ( t M .The temporal and spatial neighbouring points are in frames 1 − t M and t M .

Fig 5 :
Fig 5:Feature weights obtained using the FFW approach using the GA feature weighting method.Along each clique, the weight of magnitude and direction MCMs are marked in red and blue respectively.

Fig 6 :
Fig 6: Feature weights using the HFW approach --Step1: Along each clique weight of magnitude and direction MCMs are marked in red and blue respectively; Step2: Spatial and temporal clique weights are marked in green and blue respectively, and Step3: Spatial and temporal domain weight marked in green and blue respectively.

Fig 7 :
Fig 7: Retrieval accuracy proposed MFCR method at a different number of cluster representatives.Results are reported using 1-NN classifier.

Fig 8 :
Fig 8: Average retrieval time of proposed MFCR method at a different number of cluster representatives.

Table 1 :
Performance of different feature weighting approaches in terms of classification accuracy.