PERFORMANCE ANALYSIS OF SELECTED FEATURE DESCRIPTORS USED FOR AUTOMATIC IMAGE REGISTRATION

ABSTRACT: Automatic detection and extraction of corresponding features is crucial in the development of an automatic image registration algorithm. Different feature descriptors have been developed and implemented in image registration and other disciplines. These descriptors affect the speed of feature extraction and the number of extracted conjugate features, which in turn affect the processing speed and overall accuracy of the registration scheme. This article reviews the performance of the most widely implemented feature descriptors in an automatic image registration scheme. Ten (10) descriptors were selected and analysed under seven (7) conditions, viz. invariance to rotation, scale and zoom, and their robustness, repeatability, localization and efficiency, using UAV-acquired images. The analysis shows that although four (4) descriptors performed better than the other six (6), no single feature descriptor can be affirmed to be the best, as different descriptors perform differently under different conditions. The Modified Harris and Stephens Corner Detector (MHCD) proved to be invariant to scale and zoom and excellent in robustness, repeatability, localization and efficiency, but it is variant to rotation. The Scale Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF) and Maximally Stable Extremal Regions (MSER) algorithms proved to be invariant to scale, zoom and rotation, and very good in terms of repeatability, localization and efficiency, though MSER proved to be less robust than SIFT and SURF. The implication of these findings is that the choice of feature descriptor must be informed by the imaging conditions faced by the image registration analyst.


INTRODUCTION
This article provides an empirical review of the strengths and weaknesses of the most widely implemented feature descriptors used in the automatic registration of overlapping images. The analysed descriptors are the Scale Invariant Feature Transform (SIFT), the Speeded Up Robust Features (SURF), the Modified Harris and Stephens Corner Detector (MHCD), the Maximally Stable Extremal Regions (MSER) and the Features from Accelerated Segment Test (FAST). The others are the Smallest Univalue Segment Assimilating Nucleus (SUSAN), Fast Retina Keypoint (FREAK), Hessian, Difference of Gaussian and Hessian-Laplace algorithms. The review first provides a broad overview of feature detection and extraction, followed by a summary of the characteristics of some of the feature descriptors. It then analyses the qualities of the selected descriptors under seven (7) conditions, namely invariance to rotation, scale and zoom, and their robustness, repeatability, localization and efficiency, using UAV-acquired images. Finally, the procedures for implementing the three descriptors adjudged to outperform the others are detailed, and the experimental findings of the performance evaluation are presented.

Feature detection and extraction
In image processing, images are generally represented by the features that can be extracted from them. These features are broadly categorised into global features and local features, while their extraction can also be categorised into high-level and low-level feature extraction (Nixon and Aguado, 2008).
The global feature representation depicts the image as one multi-dimensional feature vector that describes the whole image. More specifically, it produces a single vector whose values measure various aspects of the image such as tone, texture, pattern and shape (Hassaballah et al., 2016). Though global features are generally fast and simple to compute and require only a small amount of memory, they are also notably limited: they are variant to transformations and are very sensitive to occlusion and blur.
In local feature representation, images are distinctively represented based on their local structures using local features, which are also known as key points or interest points. These can be described as specific and unique patterns that are distinct from the pixels within their neighbourhood (Tuytelaars and Mikolajczyk, 2007) and are generally associated with one or more properties of the image (Li et al., 2015). They are points with a well-defined position in the image space and an unambiguous mathematical description, and they are stable under perturbations such as variations in brightness (Mubarak, 1997). Examples of such features are regions, edges and corners. When compared to global features, local features are notable for superior performance, distinctiveness and better stability (Jégou et al., 2012), though they require a significant amount of memory because many local features can be found in a single image. These advantages make local feature representation more suitable for object recognition and image matching (Hassaballah et al., 2016).
Ideally, local features are expected to have the following qualities or characteristics: distinctiveness, locality, accuracy, quantity, efficiency, repeatability, invariance and robustness, the last of which reflects their low sensitivity to noise or blur (Ehab and Murad, 2017). These qualities are also expected to be inherent in feature detection and extraction algorithms, which detect and extract these features and prepare them for further use in image registration. Such algorithms are also referred to as feature descriptors: methods that compute abstractions of the information in an image pair and decide, for every image point, whether an image feature of a particular type is present or not.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition)

Characteristics of selected feature descriptors
The major characteristics of some of the selected feature descriptors are presented in Table 1, while the result of the performance evaluation of the ten (10) descriptors is presented in Table 2. The performance evaluation shows that while some of the descriptors are invariant to all three of scale, rotation and zoom (SIFT, SURF, MSER, etc.), others are invariant to only some of them. The analysis also shows that only MHCD is excellent in terms of robustness, repeatability, efficiency and localization. The other algorithms are also very good under these four (4) conditions, except for FAST and FREAK.

Implementing MHCD, SIFT and SURF
The basic steps involved in implementing the MHCD, SIFT and SURF algorithms are discussed in the following subsections.
1.3.1 Modified Harris corner detection (MHCD) algorithm:
While MHCD is partially invariant to affine intensity change, it is not invariant to spatial scale. The activity diagram depicting the algorithmic stages of the MHCD algorithm is presented in Figure 1, while the step-by-step procedure of its implementation is as follows:
Step 1. Computation of horizontal and vertical derivatives of the stereo image.
Step 2. Computation of three images corresponding to the three terms in the second-moment (structure) matrix.
Step 3. Convolving these three images with a large Gaussian window.
Step 4. Computation of scalar corner response using one of the corner response measures.
Step 5. Finding local maxima above some predefined threshold as detected interest points.
Step 6. Computation of SURF descriptor around detected interest points.
Step 7. Matching the corresponding points based on the descriptor difference.
Step 8. Filtering out the outliers from matched points using RANSAC algorithm.
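Steps 1 to 5 of the MHCD procedure can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this study: the function names, the small Gaussian window (sigma = 1), the constant k = 0.04 and the choice of the det - k*trace^2 response measure are all assumptions made for illustration.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalised 1-D Gaussian weights of length 2*radius + 1."""
    w = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]

def smooth(img, kernel):
    """Separable convolution; pixels outside the image repeat the border value."""
    h, w, r = len(img), len(img[0]), len(kernel) // 2
    clamp = lambda v, hi: min(max(v, 0), hi)
    tmp = [[sum(kernel[k + r] * img[y][clamp(x + k, w - 1)] for k in range(-r, r + 1))
            for x in range(w)] for y in range(h)]
    return [[sum(kernel[k + r] * tmp[clamp(y + k, h - 1)][x] for k in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def harris_response(img, sigma=1.0, k=0.04):
    """Corner response map (assumed measure: det - k * trace^2)."""
    h, w = len(img), len(img[0])
    # Step 1: horizontal and vertical derivatives (central differences).
    ix = [[(img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]) / 2.0
           for x in range(w)] for y in range(h)]
    iy = [[(img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]) / 2.0
           for x in range(w)] for y in range(h)]
    # Step 2: the three product images Ix*Ix, Iy*Iy and Ix*Iy.
    ixx = [[a * a for a in row] for row in ix]
    iyy = [[b * b for b in row] for row in iy]
    ixy = [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(ix, iy)]
    # Step 3: smooth each product image with a Gaussian window.
    g = gaussian_kernel(sigma, radius=max(1, int(3 * sigma)))
    sxx, syy, sxy = smooth(ixx, g), smooth(iyy, g), smooth(ixy, g)
    # Step 4: scalar corner response at every pixel.
    return [[sxx[y][x] * syy[y][x] - sxy[y][x] ** 2 - k * (sxx[y][x] + syy[y][x]) ** 2
             for x in range(w)] for y in range(h)]

def local_maxima(resp, threshold):
    """Step 5: interior pixels above threshold that are 3x3 neighbourhood maxima."""
    h, w = len(resp), len(resp[0])
    points = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            v = resp[y][x]
            if v > threshold and all(v >= resp[y + dy][x + dx]
                                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                points.append((x, y))
    return points

# Synthetic test image: a bright square on a dark background.
image = [[1.0 if 5 <= x <= 10 and 5 <= y <= 10 else 0.0 for x in range(16)]
         for y in range(16)]
response = harris_response(image)
corners = local_maxima(response, threshold=0.005)
```

On this synthetic square the detected points are expected to coincide with its four corners, while flat regions give a zero response. Steps 6 to 8 (description, matching and outlier filtering) are not covered by this sketch.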

1.3.2 Scale invariant feature transform algorithm (SIFT):
The SIFT descriptor is a vector of 128 values, each in the range [0, 1]. Each feature point is associated with a location, an orientation and a scale (Lowe, 2004). It is invariant to image rotation, scale and intensity change, and to moderate affine transformations. Figure 2 presents the activity diagram showing the implementation stages of the SIFT algorithm, while the step-by-step procedure is as follows:
Step 1. Detection of key points: Locally distinct points over different image pyramid levels were detected by:
a. Applying Gaussian smoothing,
b. Using Difference-of-Gaussians (DoG) to find extrema (over smoothing scales),
c. Maxima suppression at edges.
Step 2. Computation of the SIFT descriptor, which transformed image content into features that are invariant to scaling, image translation and rotation, by:
i. Computing image gradients in a local 16 x 16 area at the selected scale,
ii. Creating an array of orientation histograms; a 4 x 4 array of histograms with 8 orientations each gives 128 dimensions (yields best result).
Step 3. Matching of the corresponding points based on the descriptor difference.
Step 4. Filtering out outliers from matched points using RANSAC algorithm.
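The layout of the descriptor in Step 2 can be made concrete with a simplified sketch. It only shows how a 16 x 16 patch yields 4 x 4 cells x 8 orientation bins = 128 values. The function name `sift_like_descriptor` is hypothetical. The sketch omits the Gaussian weighting, the interpolation between bins, the rotation to the dominant orientation and the clipping step of Lowe (2004), so it is not a full SIFT descriptor.

```python
import math

def sift_like_descriptor(patch):
    """Build a 128-value gradient-orientation descriptor from a 16x16 patch.

    Simplified illustration: 4x4 cells of 4x4 pixels, 8 orientation bins per
    cell, magnitude-weighted votes, final normalisation to unit length.
    """
    if len(patch) != 16 or any(len(row) != 16 for row in patch):
        raise ValueError("patch must be 16 x 16")
    desc = [0.0] * 128
    for y in range(16):
        for x in range(16):
            # Image gradient by central differences (borders clamped).
            dx = patch[y][min(x + 1, 15)] - patch[y][max(x - 1, 0)]
            dy = patch[min(y + 1, 15)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0:
                continue
            # Orientation in [0, 2*pi) quantised into 8 bins of 45 degrees.
            angle = math.atan2(dy, dx) % (2.0 * math.pi)
            obin = int(angle / (2.0 * math.pi) * 8) % 8
            cell = (y // 4) * 4 + (x // 4)      # index of the 4x4-pixel cell
            desc[cell * 8 + obin] += magnitude  # magnitude-weighted vote
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc

# Example: a horizontal intensity ramp has gradients pointing along +x only,
# so each of the 16 cells votes into a single orientation bin.
ramp = [[float(x) for x in range(16)] for _ in range(16)]
descriptor = sift_like_descriptor(ramp)
```

Because the vector is normalised to unit length and all votes are non-negative, every entry lies in the range [0, 1], which matches the value range stated for the SIFT descriptor above.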
1.3.3 Speeded up robust feature (SURF) detection and extraction algorithm:
Figure 3 presents the activity diagram for the implementation of the SURF algorithm, while the procedural steps of its implementation are as follows:
Step 1. Creation of an integral image.
Step 2. Extraction of key points by:
a. Creating approximation of Hessian matrix,
b. Calculating responses of kernel used,
c. Finding local maxima across scale space.
Step 3. Determination of the SURF descriptor size to be used.
Step 4. Obtaining the dominant orientation.
Step 5. Extraction of the SURF descriptor.
Step 6. Matching the corresponding points based on the descriptor difference.
Step 7. Filtering out the outliers from matched points using RANSAC algorithm.
Table 1. Major characteristics of some of the selected feature descriptors (partial)
[...] It is invariant to rotation, affine transformation changes and illumination. It performs optimally in feature extraction but with a slow execution time.
6. Principal Component Analysis (PCA)-SIFT: It reduced SIFT's execution time for matching (executes faster) but proved to be less effective in feature detection compared to SIFT.
Key: Where † means Yes, X means No or None, ¶ means good, ¶ ¶ means better and ¶ ¶ ¶ means best.
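The integral image created in Step 1 of the SURF procedure (subsection 1.3.3) is what makes the box-filter responses of Step 2 cheap to evaluate. The following is a minimal sketch of that data structure, assuming a row-major list-of-lists image; the function names `integral_image` and `box_sum` are illustrative and not taken from any particular library.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum above this row plus the running sum of the current row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def box_sum(ii, x0, y0, x1, y1):
    """Sum of pixels with x0 <= x < x1 and y0 <= y < y1, using four lookups."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]

pixels = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
table = integral_image(pixels)
whole = box_sum(table, 0, 0, 3, 3)         # all nine pixels
lower_right = box_sum(table, 1, 1, 3, 3)   # 5 + 6 + 8 + 9
```

Once the table is built, the cost of a box sum is independent of the box size, which is why enlarging the filters to cover coarser scales does not slow the detector down.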
Based on the performance evaluation result shown in Table 2, SIFT, SURF and MHCD exhibited more of the qualities required in image registration. These algorithms are all known to be invariant to zoom, noise, scale, rotation and illumination (Krishna and Varghese, 2015). Hence, the detailed algorithmic procedures for implementing these three (3) selected algorithms are provided in Figures 1-3. To achieve this, the mathematical description of the three (3) algorithms was first highlighted, as presented in subsections 1.3.1, 1.3.2 and 1.3.3 for MHCD, SIFT and SURF respectively, and they were then implemented following the procedures described in the process flow or activity diagrams (Figures 1-3). Their transformation homography was estimated using the Random Sample Consensus (RANSAC) algorithm for two reasons. First, it uses only the minimum number of input data points required to generate candidate solutions, before enlarging this set with consistent data points in its estimation of the model parameters (Ajayi, 2014; Fischler and Bolles, 1981). Second, it copes effectively with a large percentage of outliers or mismatches in the input data set. RANSAC was also used to exclude outliers from the matched points. The activity diagrams were composed within the Microsoft Enterprise Architecture software environment. The implementation phase was divided into input, processing and output stages for the three algorithms.
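The two properties of RANSAC described above (candidate models from minimal samples, then enlargement with consistent points) can be sketched as follows. To keep the example short it estimates a 4-parameter similarity transform q = a*p + b, with points held as complex numbers, rather than the full homography used in this study; the minimal sample is therefore two correspondences instead of four. All function names, the tolerance and the iteration count are assumptions.

```python
import random

def fit_two_points(p0, p1, q0, q1):
    """Exact similarity q = a*p + b through two correspondences (complex numbers)."""
    a = (q1 - q0) / (p1 - p0)
    return a, q0 - a * p0

def refit(src, dst, idx):
    """Least-squares similarity over the consensus set given by indices idx."""
    n = len(idx)
    pm = sum(src[k] for k in idx) / n
    qm = sum(dst[k] for k in idx) / n
    num = sum((src[k] - pm).conjugate() * (dst[k] - qm) for k in idx)
    den = sum(abs(src[k] - pm) ** 2 for k in idx)
    a = num / den
    return a, qm - a * pm

def ransac_similarity(src, dst, iterations=200, tol=1.0, seed=0):
    """Return ((a, b), inlier_indices) for matched point lists src and dst."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        # Minimal sample: two correspondences define one candidate model.
        i, j = rng.sample(range(len(src)), 2)
        if src[i] == src[j]:
            continue
        a, b = fit_two_points(src[i], src[j], dst[i], dst[j])
        # Consensus set: correspondences the candidate predicts within tol.
        inliers = [k for k in range(len(src)) if abs(a * src[k] + b - dst[k]) <= tol]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 2:
        return None, best
    # Final estimate from all consistent points; mismatches stay excluded.
    return refit(src, dst, best), best

# Ten correct matches (90 degree rotation plus a shift) and three mismatches.
true_a, true_b = 1j, 3 + 4j
src = [complex(x, y) for x in range(5) for y in range(2)] + [7 + 7j, 8 + 1j, 9 + 5j]
dst = ([true_a * p + true_b for p in src[:10]]
       + [true_a * p + true_b + (50 + 50j) for p in src[10:]])
model, inliers = ransac_similarity(src, dst)
```

In this synthetic example the three displaced matches are rejected and the transform is recovered from the remaining ten. A homography would follow the same loop with a four-point minimal solver in place of `fit_two_points`.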

EXPERIMENTAL ANALYSIS
Apart from the weighted analysis of the ten (10) selected feature descriptors presented in Table 2, the three (3) feature descriptors discussed in subsection 1.3 were also implemented in an experimental image registration scheme using a UAV-acquired overlapping image pair with 80% overlap, presented in Figures 4a and 4b. Figure 4a presents the base or reference image, while Figure 4b presents the sensed or floating image. Each image of the pair is 3000 x 4000 pixels in size and covers part of the Main Campus of the Federal University of Technology, Minna, Nigeria. The result shows that the three feature descriptors proved to be invariant to rotation, as observed from the parameter vectors recorded in their estimated homography, which show a rotation angle equal to zero. The efficiency of the three feature descriptors was also tested with respect to their processing speed and the number of automatically extracted features or point correspondences. The result of this analysis is presented in Table 3, while the inliers of the automatically extracted conjugate points using the three descriptors are presented in Figures 5a, 5b and 5c for MHCD, SURF and SIFT respectively. The experimental result (Table 3) shows that the SIFT algorithm was more robust than the MHCD and SURF algorithms in the automatic detection and extraction of point correspondences. It automatically extracted 1067 point correspondences, approximately 6.20 times as many as the MHCD algorithm (172) and 1.59 times as many as the SURF algorithm (671). This observation agrees with the findings of Vivek and Kanchan (2014) and Panchal et al. (2013), who submitted that the SIFT model is very powerful in the automatic extraction of corresponding features.
Also, though SIFT extracted the highest number of corresponding points, it proved to be very slow in processing or registering the images because it expended more processing run time than the other implemented algorithms. The MHCD outperformed SIFT and SURF in terms of speed. It proved to be 1.60 times faster than SIFT and approximately 2 times faster than SURF. This is also consistent with the findings of Juan and Gwun (2009) and El-gayar et al. (2013).
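The ratios between the correspondence counts can be checked with a few lines of arithmetic. The counts are those reported for SIFT, SURF and MHCD; the variable names are illustrative only.

```python
# Point correspondences extracted by each descriptor, as reported in the text.
counts = {"SIFT": 1067, "SURF": 671, "MHCD": 172}

# How many times as many correspondences SIFT extracted as each other method.
ratios = {name: round(counts["SIFT"] / n, 2)
          for name, n in counts.items() if name != "SIFT"}

for name, ratio in ratios.items():
    print(f"SIFT / {name} = {ratio:.2f}")
```

The quotients are 1067 / 172 = 6.20 and 1067 / 671 = 1.59 to two decimal places, in agreement with the figures quoted.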

CONCLUSIONS
The basic characteristics of the ten (10) selected feature descriptors have been reviewed in this article, and their performance was evaluated under seven different conditions. The analysis shows that each of the descriptors has different qualities which make it suitable for different image registration conditions. From the selected feature descriptors, MHCD, SIFT and SURF were further discussed in detail, with emphasis on their algorithmic implementation procedures, and an experimental analysis was conducted using these three algorithms on UAV-acquired overlapping images. The result of the experiment shows that the three feature descriptors are indeed invariant to zoom, noise, scale, rotation and illumination. It also shows that while MHCD is very fast, it automatically extracts the fewest key points of the three feature descriptors, whereas SIFT automatically extracts the highest number of key points, though it expends more processing time. Finally, the choice of feature descriptor for an image registration task should be based on the peculiarities of the imaging conditions, as no single feature descriptor can be acclaimed to be significantly better than the others.

ACKNOWLEDGEMENT
This project was funded in part by the research grant awarded to the author by the Aubrey Barker Fund (ABF), UK and the Surveyors Council of Nigeria (SURCON). This support is gratefully acknowledged.