UAV AND LIDAR IMAGE REGISTRATION: A SURF-BASED APPROACH FOR GROUND CONTROL POINTS SELECTION

Multisource remote sensing image data provides synthesized information to support many applications, including land cover mapping, urban planning, water resource management, and GIS modelling. Effectively utilizing such images, however, requires proper image registration, which in turn relies heavily on accurate ground control point (GCP) selection. This study evaluates the performance of the interest point descriptor SURF (Speeded-Up Robust Features) for GCP selection from UAV and LiDAR images. The main motivation for using SURF is that it is invariant to scaling, blur and illumination, and partially invariant to rotation and viewpoint changes. We also consider features generated by the Sobel and Canny edge detectors as complements to potentially increase the accuracy of feature matching between the UAV and LiDAR images. From our experiments, the red channel (Band-3)


INTRODUCTION
Multisource remote sensing utilizes data from various image sources in order to improve upon single-source remote sensing. However, multiple sources of images are only useful after proper image registration, which can be achieved through accurate selection of ground control points (GCPs). Generally, image registration techniques are divided into two categories: (i) parametric (i.e. using all required parameters of the remote sensing platform), and (ii) non-parametric (i.e. only a set of GCPs is considered) (Bouchiha & Besbes, 2013). This work focuses on the latter category, which is further divided into manual and automatic registration methods (Wong & Clausi, 2007).
In manual registration, the selection of control points (CPs) is performed by a human operator, which can be prone to inaccuracies due to human error (Eastman et al., 2007). Moreover, manual registration is impractical for complex images where the human eye might not be able to decide on suitable CPs. Automatic techniques can potentially avoid these human limitations. They fall into two categories, namely area-based methods (ABM) and feature-based methods (FBM) (Zitova & Flusser, 2003; Kalantar et al., 2017a; Kalantar et al., 2018). Also referred to as correlation-like or template matching methods (Fonseca & Manjunath, 1996), ABM calculates statistics within a small pixel window in both the sensed image and the reference image (Hong & Zhang, 2007).
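As an illustration of the window statistics that area-based methods rely on, the following minimal sketch computes the normalized cross-correlation (NCC) between two small pixel windows. NCC is used here only as a common example of a correlation-like measure; the function name `ncc` and the toy windows are hypothetical and are not taken from the cited works.

```python
from math import sqrt


def ncc(window_a, window_b):
    """Normalized cross-correlation of two equally sized pixel windows."""
    a = [p for row in window_a for p in row]
    b = [p for row in window_b for p in row]
    if len(a) != len(b):
        raise ValueError("windows must have the same number of pixels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance-like numerator and the product of the two spreads.
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) *
               sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0


# Toy reference window and a brighter, higher-contrast copy of it.
reference = [[10, 20, 30], [20, 40, 60], [30, 60, 90]]
brighter = [[2 * p + 5 for p in row] for row in reference]
ncc_same = ncc(reference, brighter)
```

Because NCC subtracts the window means and divides by the spreads, a linear change in brightness and contrast leaves the score at its maximum, which is why measures of this kind tolerate simple illumination differences between the sensed and reference images.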
The Harris corner detector (Harris & Stephens, 1988) is a popular feature detector based on the eigenvalues of the second moment matrix. However, Harris corners are not scale invariant, making them less suitable for dynamically varying remote sensing footage. Lowe (2004) proposed using SIFT, owing to its effectiveness in domain-specific image registration, image mosaicking and image retrieval (Lowe, 2004; Ledwich & Williams, 2004; Harandi et al., 2015; Yang & Guo, 2008). SIFT's main limitation, however, is its high computational cost (Harandi et al., 2015). Attempts have been made to speed up the algorithm, such as the work by Mikolajczyk & Schmid (2005). Here, the authors attempted to make SIFT lighter, more robust and more discriminative by proposing a variant based on the Gradient Location-Orientation Histogram (GLOH). GLOH, however, requires many a priori samples to generate a projection matrix in the feature space. In addition, shape context (Belongie et al., 2002) has been used to compute the correspondences between points.
Another common interest-point detector is SURF (Bay et al., 2008), which is invariant to rotation, scale and illumination and has a low computational cost, making it suitable for real-time applications. Conceptually, it is very similar to SIFT but with a shorter computation time. Bouchiha and Besbes (2013) proposed a SURF-based automatic registration approach for remote sensing. They used three different image datasets where one reference image has more than four sensed images. Two sensors were utilized, namely Landsat TM and Landsat ETM+. The reported results were robust to rotation, scale and illumination changes, and the approach can be used to register remotely sensed images obtained under varying conditions. Brook & Ben-Dor (2011) proposed a novel method for automatic image registration based on topology (AIRTop) for change detection and multi-sensor (airborne and spaceborne) fusion. SURF was used to extract landmark structures (roads and buildings). A remote sensing image registration method using an optimized variant of SURF was also proposed in Panchal et al. (2013). Two test sets were used: (i) the first with Landsat 7 ETM+ as the reference image and the Landsat 4-7 sets combined as the sensed images, and (ii) the second with SIR-C Radar as the reference image and the Landsat 4-5 MSS as the sensed image. Results showed that this variant of SURF was robust and accurate in performing registration.
To the best of our knowledge, no work has attempted to apply interest point detection algorithms to the registration of UAV and LiDAR images. Therefore, this paper investigates and evaluates the use of the SURF algorithm for UAV and LiDAR image registration. We foresee that an interest point detector would be able to accurately identify GCPs in minimal time (hence SURF over SIFT). In the following section, the datasets and the study area are presented, including a brief explanation of the SURF algorithm as well as the overall workflow of our proposed approach. Section 4 presents the experimental results, followed by discussions in Section 5. This paper is concluded in Section 6 with remarks regarding future research.
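To make the role of the second moment matrix concrete, the sketch below evaluates the commonly used Harris corner response, R = det(M) - k * trace(M)^2, for a 2x2 second moment matrix M. The entries `sxx`, `sxy`, `syy` stand for already smoothed products of image gradients, and the value k = 0.04 is a typical choice assumed here for illustration; neither is specified in the text above.

```python
from math import sqrt


def eigenvalues_2x2(sxx, sxy, syy):
    """Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    half_trace = (sxx + syy) / 2
    root = sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return half_trace + root, half_trace - root


def harris_response(sxx, sxy, syy, k=0.04):
    """Harris response; k is an assumed, typical sensitivity constant."""
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2


# Hypothetical gradient statistics for three kinds of neighbourhood.
corner = harris_response(10.0, 0.0, 10.0)  # two large eigenvalues
edge = harris_response(10.0, 0.0, 0.0)     # one large eigenvalue
flat = harris_response(0.0, 0.0, 0.0)      # no gradient energy
```

Two large eigenvalues give a positive response (a corner), one large eigenvalue gives a negative response (an edge), and a flat region gives zero. Note that the response depends on the window over which the gradients are accumulated, which is one way to see why the detector is not scale invariant.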

STUDY AREA AND DATA USED
The sensed image used in this study was obtained on February 16th 2016 from a fixed-wing UAV platform over the study area located between 102° 19' 55" E to 103° 27' 08" E and 02° 50' 36" N to 02° 39' 22" N, over Universiti Putra Malaysia (UPM), Malaysia (Fig. 1). The UAV image has three bands [green (G), red (R), blue (B)] with a ground resolution of 0.068 m/pix. These images show several geometric and photometric changes, such as scale (scale factor between 1 and 5), rotation (varying between 25° and 175° with a step of 25°) and changing illumination. The other dataset was acquired by an airborne LiDAR (Light Detection and Ranging) system on March 8, 2013 over the same study area. Along with the LiDAR point clouds, orthophotos were captured with a camera at a spatial resolution of 13 cm. The laser scanner has a scanning angle of 60° with a camera angle of ±30°. The posting density of the LiDAR data was 3-4 pts/m² (average point spacing = 0.41 m).
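The two ground resolutions quoted above imply a nominal scale difference between the UAV image and the LiDAR orthophoto, which is one reason a scale-invariant detector is attractive here. The short calculation below is only a back-of-the-envelope sketch; the variable names and the 100 m example extent are hypothetical.

```python
# Ground sampling distances taken from the dataset description (metres/pixel).
uav_gsd_m = 0.068
lidar_ortho_gsd_m = 0.13

# Nominal scale factor between the two images (about 1.91).
nominal_scale = lidar_ortho_gsd_m / uav_gsd_m


def pixels_across(extent_m, gsd_m):
    """Number of pixels needed to cover a ground extent at a given resolution."""
    return extent_m / gsd_m


# A hypothetical 100 m wide feature as seen in each image.
uav_pixels = pixels_across(100.0, uav_gsd_m)
lidar_pixels = pixels_across(100.0, lidar_ortho_gsd_m)
```

The same ground feature therefore spans roughly twice as many pixels in the UAV image as in the LiDAR orthophoto, before any additional scale change applied in the experiments.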

METHODOLOGY
Figure 2 shows the overall workflow of the proposed work. First, the UAV and LiDAR data are preprocessed in three main steps: (i) orthorectification, (ii) mosaicking, and (iii) grayscale conversion (Kalantar et al., 2017b). Then, two edge images are generated using the Sobel and Canny edge detectors, respectively. The SURF algorithm is then used to detect the pertinent features, followed by the computation of their descriptors. Matching pairs are then identified between the UAV and LiDAR datasets. The performance of SURF is scrutinized for each input dataset based on the matching results and visualizations. This entails evaluating both the preprocessed images and the edge-filtered images.
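The grayscale conversion and edge filtering steps can be sketched as follows. This is a minimal pure-Python illustration, not the implementation used in this study: the band index passed to `to_single_band` is an assumption about how the bands are stored, the averaging in `to_average_gray` is one simple way to combine three bands, and only the Sobel operator is shown (Canny adds smoothing, non-maximum suppression and hysteresis thresholding on top of gradients like these).

```python
from math import hypot

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def to_single_band(image, band):
    """Keep one band of a multi-band image (band index is an assumption)."""
    return [[pixel[band] for pixel in row] for row in image]


def to_average_gray(image):
    """Grayscale as the plain average of all bands."""
    return [[sum(pixel) / len(pixel) for pixel in row] for row in image]


def sobel_magnitude(gray):
    """Gradient magnitude from the 3x3 Sobel kernels; borders are left at 0."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = sum(SOBEL_X[i][j] * gray[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * gray[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            out[r][c] = hypot(gx, gy)
    return out


# Toy 4x4 three-band image with a vertical step edge between columns 1 and 2.
image = [[(v, v // 2, 0) for v in (0, 0, 100, 100)] for _ in range(4)]
single = to_single_band(image, 0)
average = to_average_gray(image)
edges = sobel_magnitude(single)
```

In this toy image the step is 100 grey levels in the single band but only 50 in the band average, which illustrates how averaging bands can reduce the contrast that a detector has to work with.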

Pre-Processing
UAV images were preprocessed using Agisoft PhotoScan software, which allows generating georeferenced dense point clouds, textured polygonal models, digital elevation models (DEMs), and orthomosaics from a set of overlapping images with the corresponding referencing information (Kalantar et al., 2017b). The LiDAR data, on the other hand, were preprocessed using ArcGIS 10.3 software. The LiDAR point cloud was interpolated into a digital terrain model (DTM), and the aerial orthophoto images were rectified and mosaicked based on the LiDAR point cloud.

The principles of SURF
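As explained in Teke & Temizel (2010), SURF is based on the approximated Hessian matrix:

$$H(\mathbf{x}, \sigma) = \begin{bmatrix} L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma) \\ L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma) \end{bmatrix}$$

where $L_{xx}(\mathbf{x}, \sigma)$ is the convolution of the second-order derivative of the Gaussian filter at scale $\sigma$ with the image at point $\mathbf{x}$, and similarly for $L_{xy}(\mathbf{x}, \sigma)$ and $L_{yy}(\mathbf{x}, \sigma)$. The first step of the SURF descriptor is to fix a reproducible orientation according to the information from a circular region around the interest point. Secondly, a square region aligned to the selected orientation is constructed, from which the SURF descriptor is extracted. The Haar-wavelet responses in the x and y directions are calculated so that the descriptor is invariant to rotation.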
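The approximation of the Hessian in SURF rests on box filters evaluated with an integral image, as described by Bay et al. (2008). The sketch below shows the two building blocks in pure Python: constant-time box sums from an integral image, and the approximated determinant with the relative weight of about 0.9 on the mixed term. The function names are hypothetical, and the box-filter responses `dxx`, `dyy`, `dxy` are passed in as plain numbers rather than computed from a full filter layout.

```python
def integral_image(gray):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(gray), len(gray[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += gray[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def hessian_determinant(dxx, dyy, dxy, weight=0.9):
    """Approximated determinant: Dxx * Dyy - (weight * Dxy)^2."""
    return dxx * dyy - (weight * dxy) ** 2


gray = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ii = integral_image(gray)
total = box_sum(ii, 0, 0, 3, 3)         # all nine pixels
lower_right = box_sum(ii, 1, 1, 3, 3)   # pixels 5, 6, 8, 9
response = hessian_determinant(4.0, 3.0, 1.0)
```

Each box sum needs only four table look-ups regardless of the box size, which is the main source of the speed advantage of SURF over SIFT mentioned earlier.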

Implementation
We implemented the SURF algorithm in Matlab (version R2015) on a personal computer with a Core i5 CPU (2.4 GHz) and 4 GB of RAM. UAV and LiDAR orthophoto images were used as inputs. Both images were converted to grayscale, followed by Sobel and Canny edge detection. The SURF features were then extracted from the images, followed by computation of the descriptors at the interest points. These descriptors were subsequently used to match the UAV and LiDAR features. A geometric transformation was then used to locate the matched features in the scene. Finally, after removing the outliers, the matched features were displayed. The average time required for executing the code was approximately 6 seconds.
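The descriptor matching step can be illustrated with a small nearest-neighbour matcher. This is a sketch under stated assumptions rather than the Matlab routine used in this study: the ratio test and its threshold `max_ratio` are a common way to reject ambiguous matches and are assumed here, and the two-dimensional descriptors are toy values (SURF descriptors normally have 64 elements).

```python
from math import dist


def match_descriptors(desc_a, desc_b, max_ratio=0.6):
    """Nearest-neighbour matching with a ratio test (assumed strategy).

    Returns (index_in_a, index_in_b) pairs whose best distance is clearly
    smaller than the second-best distance.
    """
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((dist(da, db), j) for j, db in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # the ratio test needs two candidates
        (best, j), (second, _) = ranked[0], ranked[1]
        if second > 0 and best / second <= max_ratio:
            matches.append((i, j))
    return matches


# Toy descriptors standing in for UAV (a) and LiDAR orthophoto (b) features.
uav_desc = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
lidar_desc = [(0.05, 0.95), (0.9, 0.1)]
pairs = match_descriptors(uav_desc, lidar_desc)
```

The third UAV descriptor lies almost equally far from both LiDAR descriptors, so it is rejected as ambiguous. The remaining pairs would then be passed to the geometric transformation step, where outliers are removed.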

RESULTS
In all, ten (10) different experiments were conducted with respect to three different geometric transformations, namely projective, similarity, and affine (Table 1). The initial observation from the table is that the best registration is achieved by using band 3 (red channel) from both the UAV and LiDAR orthophoto images, with an accuracy of 100%. Note, however, that this is not the registration accuracy but only an indication of the accuracy of feature matching. To assess the actual registration accuracy, the number of detected features without outliers in each dataset was carefully analysed. This is because the geometric transformation can only be executed (and its accuracy estimated) if at least four accurate matching features are found.
The results show that the highest number of features was identified from the Canny filtered images. This was achieved using the band 2 channels from the UAV and LiDAR orthophoto images. However, very few of the features in the Canny images were matched between the UAV and LiDAR orthophoto images. The highest accuracy rate achieved with these images is 40, using band 2 with the similarity transformation method. In addition, the experiments indicate that the Sobel filtered images are not suitable for registration of UAV and LiDAR orthophoto images, because very few features were identified. When comparing the three geometric transformations, the similarity method was the most efficient across all the experiments conducted in the current study. However, the best feature matching was achieved by the projective method when using band 3 for both datasets. In this experiment, five (5) features could be correctly matched between the UAV and LiDAR orthophoto images. Based on these results, the edge filters are not recommended for UAV-LiDAR data registration. The use of a single band instead of the average of the three available bands is more suitable, as it gives the highest feature matching accuracy. Specifically, the use of band 3 (red channel) was found to be the most efficient and practical.
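The requirement on the number of matched features can be expressed as a simple check. In the sketch below, the minimum numbers of point pairs follow from the degrees of freedom of each transformation (4 for similarity, 6 for affine, 8 for projective, with two equations per pair). The definition of matching accuracy as the percentage of matches kept after outlier removal is an assumption made for illustration, and the counts in the example are hypothetical rather than values from Table 1.

```python
# Minimum point pairs needed to solve each transformation model.
MIN_PAIRS = {"similarity": 2, "affine": 3, "projective": 4}


def can_estimate(transform, inlier_count):
    """True if enough outlier-free pairs exist to fit the given model."""
    return inlier_count >= MIN_PAIRS[transform]


def matching_accuracy(inlier_count, matched_count):
    """Assumed definition: percentage of matches that survive outlier removal."""
    if matched_count == 0:
        return 0.0
    return 100.0 * inlier_count / matched_count


# Hypothetical counts for two experiments.
enough = can_estimate("projective", 4)
too_few = can_estimate("projective", 3)
accuracy = matching_accuracy(3, 4)
```

A high matching accuracy is therefore not sufficient on its own: an experiment with only three inliers may score well on accuracy yet still be unable to support a projective transformation.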

DISCUSSION
Registration of two different data sources such as UAV and LiDAR is a challenging task for automatic procedures because the two images differ in spatial resolution and in the amount of information they contain. Based on the results, band 3 seems to provide the highest accuracy for matching features. This further validates results from a previous study by Teke and Temizel (2010), who also found that single bands (i.e. red and green) perform better than the average of the three RGB bands. Although sensitive to image features, the red band seems to highlight important scene features (e.g. Figure 3), increasing the number of matched points. Conversely, as the projective transformation uses a more complex equation than the similarity and affine methods and requires a minimum of four identified features, it performs better than the other methods in terms of identifying a larger number of features without outliers (Table 1). The similarity transformation, on the other hand, scales, rotates, and translates the data. It also maintains the aspect ratio of the transformed features, which was seen to be more effective than affine methods that can skew the data. Furthermore, the Canny edge filter highlights the edges in the images so that more features can be detected. However, despite the larger number of features, matching them between the UAV and LiDAR orthophoto images did not achieve high accuracy rates. This might be due to the complex features present in the study area. In contrast, a study by Pandya et al. (2013) showed that the use of Sobel increased the accuracy rate of matching features. In their study, a simple photo captured by a normal handheld camera was used. Based on the current study, we suggest the use of a single band such as the red channel for feature detection and image registration of remote sensing datasets.
LiDAR can be a significant complementary data source for UAV images, serving many applications such as crop mapping, object detection and recognition, and urban planning. However, proper registration of UAV and LiDAR data is necessary for accurate information extraction. Several methods are available in the literature to register UAV and LiDAR datasets, such as SIFT, SURF, and GIS-based registration. As each method has its own advantages and disadvantages, evaluating them is important to determine the best method for UAV and LiDAR data registration. This study evaluates the SURF algorithm with the aim of providing guidelines for practical projects.
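The difference between the similarity and affine models discussed above can be made concrete with a small sketch. A similarity transformation can be written with complex numbers as w = a * z + b, where the magnitude of a is the scale and its angle is the rotation, so two point pairs are enough to recover it. The function names, the example coordinates and the shear matrix below are all hypothetical.

```python
from cmath import phase
from math import degrees


def fit_similarity(src, dst):
    """Scale, rotation (degrees) and translation from two point pairs."""
    z0, z1 = complex(*src[0]), complex(*src[1])
    w0, w1 = complex(*dst[0]), complex(*dst[1])
    a = (w1 - w0) / (z1 - z0)   # encodes scale and rotation
    b = w0 - a * z0             # encodes translation
    return abs(a), degrees(phase(a)), (b.real, b.imag)


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix ((m00, m01, tx), (m10, m11, ty)) to a point."""
    (m00, m01, tx), (m10, m11, ty) = matrix
    x, y = point
    return (m00 * x + m01 * y + tx, m10 * x + m11 * y + ty)


# Two hypothetical matched points: the target is scaled by 2, rotated by
# 90 degrees and shifted by (10, 5).
scale, rotation_deg, translation = fit_similarity(
    [(0, 0), (1, 0)], [(10, 5), (10, 7)])

# An affine matrix with a shear term, which a similarity cannot represent.
shear = ((1, 0.5, 0), (0, 1, 0))
sheared_corner = apply_affine(shear, (0, 1))
```

The similarity model has a single scale factor, so shapes keep their aspect ratio, whereas the shear term in the affine example moves the point (0, 1) sideways to (0.5, 1) and turns a square into a parallelogram. With only a handful of matched features, those extra degrees of freedom can absorb matching errors instead of real geometry.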

CONCLUSION
This study evaluated the SURF algorithm for the registration of UAV and LiDAR orthophoto images. In addition, two edge filters (Sobel and Canny) were utilized in the preprocessing stage, with the hope of identifying more useful features. Results show that the best registration can be achieved by merely considering band 3 (the red band) from the input data. Although filtering the RGB images into edge images increased the number of features extracted by SURF, the additional features seemed only to add noise and were ineffective overall. Grayscale images, on the other hand, could be matched better than edge images. Overall, this study suggests that using the SURF method for UAV and LiDAR image registration is effective. Our future plan is to consider transforming the input images using high-pass image filtering techniques.

Figure 1. The study area located over UPM.

Figure 2. Overall workflow of this study.

Figure 5. Best matching features by using band 3 of UAV and LiDAR orthophoto and projective transformation method.

Table 1. Results of the experiments analysed in the current study for selecting the optimal case for registration of UAV and LiDAR datasets.