ASSESSMENT OF OBJECT SEGMENTATION IN AERIAL IMAGE USING GEO-HAUSDORFF DISTANCE

Aerial images record large-range earth objects with ever-improving spatial and radiometric resolution. They have become a powerful tool for earth observation, land-coverage survey, geographical census, etc., and help delineate the boundaries of different kinds of objects on the earth both manually and automatically. In light of the geo-spatial correspondence between the pixel locations of an aerial image and the spatial coordinates of ground objects, there is an increasing need for super-pixel segmentation and high-accuracy positioning of objects in aerial images. Besides the commercial software packages eCognition and ENVI, many algorithms have been developed in the literature to segment objects in aerial images. How to evaluate the segmentation results, however, remains a challenge, especially in the context of the geo-spatial correspondence. The Geo-Hausdorff Distance (GHD) is proposed to measure the geo-spatial distance between the results of different object segmentations, which may be produced either manually as ground truth or by automatic algorithms. Based on an early-breaking and random-sampling design, the GHD calculates the geographical Hausdorff distance with nearly linear complexity. Segmentation results of several state-of-the-art algorithms, including those of the commercial packages, are evaluated on a diverse set of aerial images. The images have different signal-to-noise ratios around the object boundaries, and the boundaries are hard to trace correctly even for human operators. The GHD value is analyzed to comprehensively measure the suitability of different object segmentation methods for aerial images of different spatial resolution. By critically assessing the strengths and limitations of the existing algorithms, the paper provides valuable insight and guidelines for extensive research in automating object detection and classification of aerial images in the nation-wide geographic census. It is also promising for the optimal design of operational specifications of remote sensing interpretation under the constraint of limited resources.


INTRODUCTION
Aerial images record large-range earth objects with ever-improving spatial and radiometric resolution. After decades of development, aerial images are widely applied in many domains, such as land-coverage survey and geographical census. Object segmentation is a fundamental process in these applications. It aims to partition an image into separate regions, which ideally correspond to different real-world objects, and it is a critical step towards content analysis and image comprehension.
In light of the geo-spatial correspondence between the pixel locations of an aerial image and the spatial coordinates of ground objects, there is an increasing need for super-pixel segmentation and high-accuracy positioning of objects in aerial images. Besides the commercial software packages eCognition and ENVI, many algorithms have recently been developed in the literature to segment objects, including methods based on region growing and merging, on watershed and the HIS model (Liu et al., 2013), and on graphs (Felzenszwalb and Huttenlocher, 2004), as well as state-of-the-art algorithms such as MCG (Arbelaez et al., 2014), gPb-UCM (Browet et al., 2011), IS-CRA (Ren and Shakhnarovich, 2013) and LEP (Zhao, 2015). These methods were proposed for different applications and approach segmentation from different perspectives, so a summary of segmentation methods plays an important role in performance comparison. Users have a long list of segmentation algorithms to choose from; algorithm researchers, on the other hand, find it challenging to achieve satisfactory performance through parameter tuning. The question that arises is therefore how to evaluate the quality of different algorithms and, at the same time, the performance of a single algorithm under different parameters.
Many evaluation methods for image segmentation have been proposed over the last decades. Most of them require access to a ground truth reference, i.e., a manually segmented reference image. Examples include the directional Hamming distance (Zhang et al., 2013), bipartite graph matching (Zhang et al., 2013), the bidirectional consistency error (Unnikrishnan et al., 2007a), the probabilistic Rand index (Unnikrishnan et al., 2007b), precision-recall for regions (Unnikrishnan et al., 2007a) and precision-recall for boundaries (Unnikrishnan et al., 2007a). Conversely, in unsupervised objective evaluation the quality score is based solely on the segmented image, i.e., no comparison with a ground truth image is required. Erdem et al. (Johnson and Xie, 2011) proposed Vest, and in 2011 Johnson et al. proposed an unsupervised image segmentation method using a multi-scale approach (Liu et al., 2013); based on the image gray-level distribution, Liu Jinping et al. use an unsupervised method for flotation froth image segmentation.
Although many evaluation methods have been proposed, how to evaluate the segmentation results remains a challenge, especially in the context of the geo-spatial correspondence. The measure used to grade a segmentation is the cornerstone of the evaluation. The proposed GHD can help researchers discover the weak and strong points of their segmentation. For different segmentation algorithms, the GHD compares the segmentation results with an annotated ground truth. Setting up the ground truth is difficult, however: manually delineating the object boundary for each segmentation is subjective, laborious and time-consuming, and therefore impracticable. Besides comparing with the ground truth, this paper therefore also uses a fine-tuned over-segmentation result as an alternative reference for evaluating different segmentations in practical use. It is demonstrated that the over-segmentation-based evaluation is effective and greatly facilitates the quality check of object segmentation from aerial images.
The remainder of this paper is organized as follows. Section 2 gives a detailed analysis of the state-of-the-art image segmentation methods and software tools used in the experiments. Section 3 presents the characteristics of the GHD measure in detail. Section 4 presents the experimental validation: we analyze the GHD between the results of different algorithms and software tools and both the ground truth and the over-segmented result, and show qualitative results illustrating the merits of the GHD. Finally, conclusions and future directions for research in automatic evaluation are discussed in Section 5.

OBJECT SEGMENTATION REVIEW
Image segmentation is a technique that divides an image into non-overlapping regions and extracts the targets of interest. It is the key step from image processing to image analysis. On the one hand, it is the basis of target representation and has an important influence on feature measurement. On the other hand, image segmentation, together with the object representation, feature extraction and parameter measurement built on it, transforms the original image into a more abstract and more compact form, which makes higher-level image analysis and understanding possible.
There are thousands of methods for image segmentation, and many new methods appear every year. Generally speaking, the existing segmentation algorithms can be divided into the following categories: threshold segmentation, edge detection, region-based, clustering-based and texture-based segmentation methods. Although there is no general theory of segmentation and the current algorithms mostly aim at specific problems, a basic consensus has been reached on the general rules for image segmentation. In this paper, we focus on the evaluation of two classic commercial software packages, eCognition and ENVI; two typical segmentation algorithms, EGB and MeanShift; the newly proposed algorithm LEP; and the over-segmentation algorithm TurboPixel. The following is a review of these six methods.

eCognition image segmentation
eCognition is the first commercial remote sensing software based on object information. It uses a fuzzy classification algorithm built on an expert decision-making system, in contrast to traditional commercial remote sensing software, which classifies images based solely on spectral information, and it is regarded as a revolutionary classification technology (Yang et al., 2014). The object-oriented classification method can greatly improve the automatic identification accuracy of high-spatial-resolution data, which meets the needs of scientific research and engineering applications.
Multi-resolution segmentation is a commonly used segmentation procedure in eCognition for object-oriented image analysis. In this paper we use it to produce the image object boundaries for the later experiments. Multi-resolution segmentation starts by defining small groups of pixels as segments and merges similar neighboring segments in subsequent steps until a heterogeneity threshold, set by the scale parameter, is reached (Benz et al., 2004). The final segments have the geometrical shape and boundary of the real-world objects present in the image. As the segments grow, the spectral homogeneity decreases until the segments match the size of the real-world objects. The spectral homogeneity of a segment was measured as the spectral angle (Kruse et al., 1993) between each pair of pixels within the segment. The index measuring the spectral homogeneity of segments was used to select the segmentation scale parameter.
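As an illustration of the spectral-angle homogeneity index mentioned above, the following minimal Python sketch computes the angle between two pixel spectra and averages it over all pixel pairs of a segment. The function names and the choice of the mean as the aggregate are assumptions made for this sketch; the text does not specify how the pairwise angles are combined.

```python
import math
from itertools import combinations


def spectral_angle(p, q):
    """Angle in radians between two pixel spectra (equal-length sequences)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_q)))
    return math.acos(cos_angle)


def mean_pairwise_spectral_angle(pixels):
    """Hypothetical homogeneity index: mean angle over all pixel pairs.

    Smaller values indicate a spectrally more homogeneous segment.
    """
    pairs = list(combinations(pixels, 2))
    if not pairs:
        return 0.0
    return sum(spectral_angle(p, q) for p, q in pairs) / len(pairs)
```

Because the angle depends only on the direction of the spectral vectors, two pixels that differ by a pure brightness scaling have an angle of approximately zero.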

ENVI Feature Extraction
ENVI Feature Extraction (FX) uses the Mumford-Shah active contour model and introduces a threshold to form a multi-scale segmentation system (Blei and Jordan, 2003). The operation of ENVI FX can be divided into two parts, object discovery and feature extraction, and the tool is organized into three independent workflows: rule-based, example-based and segmentation-only. Depending on the data source and the type of feature to be extracted, the data can be preprocessed, for example by spatial resolution adjustment, spectral resolution adjustment, multi-source data combination or spatial filtering. FX segments the image based on the brightness, texture and color of adjacent pixels. It uses an edge-based segmentation algorithm, which is very fast and needs only one input parameter to generate multi-scale segmentation results. By controlling the boundary differences at different scales, multi-scale segmentations from fine to coarse can be generated. The Scale Level determines the scale of the segmentation. The Merge Level addresses the problem that parts of a feature are divided into the wrong class. The Texture Kernel Size can be adjusted according to the size of the data area and the magnitude of the texture differences.

EGB
EGB is based on pairwise region comparison. This segmentation algorithm makes simple greedy decisions and yet produces segmentations that obey the global properties of being neither too coarse nor too fine with respect to a particular region comparison function (Felzenszwalb and Huttenlocher, 2004). It defines a predicate for measuring the evidence for a boundary between two regions using a graph-based representation of the image. An important characteristic of the method is its ability to preserve detail in low-variability image regions while ignoring detail in high-variability regions. The method runs in O(m log m) time for m graph edges and is also fast in practice, generally running in a fraction of a second.
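The greedy merging idea can be sketched in a few lines of Python. This is a simplified illustration written for this review, not the reference implementation: it assumes the graph is given as a list of (weight, u, v) edges, uses the threshold k/|C| from Felzenszwalb and Huttenlocher (2004) as the region comparison function, and the names segment_graph and k are chosen here for illustration. Sorting the edges dominates the cost, which gives the O(m log m) behavior.

```python
def segment_graph(num_nodes, edges, k):
    """Greedy graph-based segmentation sketch.

    edges: iterable of (weight, u, v) tuples between node indices.
    k: scale parameter; larger values favor larger regions.
    Returns one component label per node.
    """
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes  # largest edge weight merged into each component

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for w, u, v in sorted(edges):
        a, b = find(u), find(v)
        if a == b:
            continue
        # Boundary predicate: merge only if the connecting edge is no heavier
        # than the internal variation of both components plus k / |C|.
        if w <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a
            size[a] += size[b]
            internal[a] = w  # edges arrive in ascending order of weight
    return [find(i) for i in range(num_nodes)]
```

For example, a chain of six pixels with gray values 0, 1, 2, 50, 51, 52 and edge weights equal to the absolute gray-value difference splits into two regions for k = 5, because the weight-48 edge exceeds the internal variation of both sides.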

LEP
LEP focuses on the behavior of human subjects in segmenting images. It investigates the effort made by human subjects, proposes an empirical method to estimate the boundary tracing loads, and then establishes a model for natural image segmentation based on the least effort principle (Zhao, 2015). The algorithm formulates the hierarchies exhibited in human segmentation processes and the monotonicity observed in the region merging processes as two constraints. Adopting the monotonic merging strategy, the experimental results show that the algorithm segments natural images from scratch with fairly high efficiency.

MeanShift
MeanShift is a general nonparametric technique proposed for the analysis of a complex multi-modal feature space and for delineating arbitrarily shaped clusters in it (Comaniciu and Meer, 2002). The algorithm is built on the probability density distribution of the data and is a kind of nonparametric sampling. Both gray-level and color images are accepted as input, and the algorithm takes only the resolution of the analysis as its parameter, which resolves the problem that many techniques rely on the user correctly guessing the values of the tuning parameters.
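The mode-seeking step at the heart of MeanShift can be illustrated with a one-dimensional sketch. It assumes a flat kernel and scalar features, whereas the segmentation algorithm of Comaniciu and Meer (2002) works in a joint spatial-range feature space; the function name and the parameters bandwidth, max_iter and tol are chosen here for illustration, with bandwidth playing the role of the resolution of the analysis.

```python
def mean_shift_mode(points, start, bandwidth, max_iter=100, tol=1e-6):
    """Move `start` uphill to a local density mode of 1-D `points`.

    Uses a flat kernel: each step replaces the estimate by the mean of
    all points within `bandwidth` of the current estimate.
    """
    x = start
    for _ in range(max_iter):
        window = [p for p in points if abs(p - x) <= bandwidth]
        if not window:
            break  # no support near x; keep the current estimate
        new_x = sum(window) / len(window)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x
```

Points whose trajectories converge to the same mode are assigned to the same cluster, so the number of clusters follows from the data and the bandwidth rather than being fixed in advance.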

TurboPixel
TurboPixel segmentation (Shiming et al., 2010) is a powerful tool for image over-segmentation. Its efficiency and the lattice-like structure of uniformly sized superpixel regions make it very useful as a basic partition of different regions. These small regions are over-fragmented with respect to a complete object area, so it is meaningful to merge them into the true region of the object. In this paper, we calculate the intersection, set a threshold, and combine the over-segmented regions to extract the boundary of the target region.

GEO-HAUSDORFF DISTANCE (GHD) MEASURE
Hausdorff distance (HD) is a measure of dissimilarity between two point sets (Huttenlocher et al., 1993). The HD is an important measure that is commonly used in many domains such as image processing and pattern matching, as well as in evaluating the quality of clustering. The HD is a max-min distance and takes into account the spatial position of each individual point. Therefore it is capable of incorporating the real-world spatial coordinates of the object boundary.
For two point sets A and B, with arbitrary points x ∈ A and y ∈ B, the directed HD from A to B is calculated by
$$h(A, B) = \max_{x \in A} \min_{y \in B} \lVert x, y \rVert$$
where ‖., .‖ is any norm, e.g., the Euclidean distance function.
Note that in general h(A, B) ≠ h(B, A), and thus the directed Hausdorff distance is not symmetric. Figure 1 illustrates this distance measure. The Hausdorff distance H is the maximum of the directed Hausdorff distances in both directions and thus it is symmetric. H is given by
$$H(A, B) = \max\big(h(A, B),\, h(B, A)\big)$$
To measure the spatial distance between two point sets, many researchers have developed variations of the HD for their applications. Inspired by the latest improvement of the HD in (Taha and Hanbury, 2015), the geo-spatial HD (GHD) is proposed to efficiently calculate the exact HD with optimized runtime and memory requirements. The GHD combines the early-breaking and random-sampling strategies during the calculation of the geo-spatial distance. In the following we highlight the prime characteristics of the proposed GHD in more detail.
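The max-min definition can be written down directly as a brute-force reference. The sketch below assumes points are coordinate tuples and uses the Euclidean distance; the function names are chosen for illustration. It scans all pairs and therefore costs O(|A|·|B|) distance evaluations, which is the baseline that the GHD aims to improve on.

```python
import math


def directed_hausdorff(a, b):
    """Directed HD h(A, B): largest distance from a point of A to its nearest point of B."""
    return max(min(math.dist(x, y) for y in b) for x in a)


def hausdorff(a, b):
    """Symmetric HD H(A, B): maximum of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (4.0, 0.0)]
print(directed_hausdorff(A, B))  # 1.0
print(directed_hausdorff(B, A))  # 3.0
print(hausdorff(A, B))           # 3.0
```

The two directed values differ in this small example, which shows why the symmetric distance takes the maximum over both directions.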

Actual geographic coordinates
With high-resolution aerial images, algorithms can obtain fairly good segmentations that support a number of location-based services and applications; nevertheless, the geographical coordinates of the segmented objects are generally ignored by most segmentation evaluation measures. The GHD is therefore calculated between two point sets whose coordinates have been transformed to the real-world coordinates of the aerial image objects. The earth is close to a standard ellipsoid, with an equatorial radius of 6378.140 km, a polar radius of 6356.755 km and an average radius of 6371.004 km. If we assume that the earth is a perfect sphere, its radius is the average radius of the earth, denoted as R. Taking the 0-degree meridian as the reference, the distance between any two points on the surface can be calculated from their latitudes and longitudes. Here we ignore the error caused by the topography of the earth's surface, so the result is only a theoretical estimate. Let the longitude and latitude of the first point A be (LonA, LatA) and those of the second point B be (LonB, LatB). With the 0-degree lines as reference, an east longitude is taken as (Longitude) and a west longitude as (−Longitude); a north latitude is taken as (90 − Latitude) and a south latitude as (90 + Latitude). The two points after this treatment are denoted (MLonA, MLatA) and (MLonB, MLatB).
According to spherical trigonometry, the distance between the two points can be calculated as
$$C = \sin(MLatA)\,\sin(MLatB)\,\cos(MLonA - MLonB) + \cos(MLatA)\,\cos(MLatB)$$
$$Distance = R \cdot \arccos(C)$$
with arccos(C) expressed in radians. Here, R and Distance have the same unit: if R = 6371.004 km is used as the radius, Distance is in kilometers.
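A minimal Python sketch of this spherical distance is given below, assuming the spherical law of cosines applied to the transformed coordinates described above. It takes signed decimal degrees as input (east and north positive, west and south negative), so that 90 − latitude covers both the north and the south case; the function name and argument order are chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.004  # average earth radius used in the text


def geo_distance_km(lon_a, lat_a, lon_b, lat_b, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in signed degrees."""
    # Transformed latitudes (angle from the north pole), in radians.
    mlat_a = math.radians(90.0 - lat_a)
    mlat_b = math.radians(90.0 - lat_b)
    dlon = math.radians(lon_a - lon_b)
    c = (math.sin(mlat_a) * math.sin(mlat_b) * math.cos(dlon)
         + math.cos(mlat_a) * math.cos(mlat_b))
    # Clamp to [-1, 1] so round-off cannot push acos out of its domain.
    c = max(-1.0, min(1.0, c))
    return radius * math.acos(c)
```

One caveat on the design: the arccos form loses floating-point precision when the two points are very close together, so for separations of a few meters a numerically more stable formulation such as the haversine formula may be preferable.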

Early breaking
To calculate the exact HD between point sets A and B, for each point x in A every point in B is scanned to find the minimum distance; the same calculation is done for all points in A, and the GHD is then the maximum of these minimum distances. However, it is not always necessary to scan every point in the inner loop, i.e., to scan all points in B to find the minimum. Since the GHD aims to find the maximum of the minimums, the inner loop can break as soon as a distance is found that is below the temporary HD, because in this case the temporary HD will definitely not change in the rest of the loop. The algorithm can therefore break the inner loop and continue with the next point of the outer loop.
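The early-breaking rule can be sketched as follows. This is an illustrative version in the spirit of Taha and Hanbury (2015), not the authors' code; the names directed_hd_early_break and cmax (the temporary HD) are chosen here, and the Euclidean distance is used by default, with any other distance function, for example a geographic one, passed in through the hypothetical dist parameter.

```python
import math


def directed_hd_early_break(a, b, dist=math.dist):
    """Exact directed HD h(A, B) with early breaking of the inner loop."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    cmax = 0.0  # temporary HD: largest nearest-neighbor distance seen so far
    for x in a:
        cmin = math.inf
        for y in b:
            d = dist(x, y)
            if d < cmax:
                # x already has a neighbor closer than the temporary HD,
                # so x cannot raise the maximum: skip the rest of B.
                break
            cmin = min(cmin, d)
        else:
            # Inner loop finished without a break: cmin is the true
            # nearest-neighbor distance of x and is at least cmax.
            cmax = cmin
    return cmax
```

The result is identical to the brute-force max-min value; only the number of distance evaluations changes, and the saving depends on how early a sufficiently close point of B is met.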

Random sampling
Owing to the spatial locality of point sets, the distances from adjacent points to a point in the other point set are approximately equal. If the inner loop does not break for the current point, it is therefore unlikely to break early for its neighbor either, and it is better to continue with a point that is spatially far from the current one. The GHD uses random sampling instead of sequential scanning to improve performance. Random sampling aims to avoid similar distances in successive iterations. This is achieved by randomly changing the order of the point sets in the inner and outer loops.
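Combining the random ordering with the early break gives the following sketch of the full symmetric computation. It is an illustration under stated assumptions rather than the authors' implementation: the function name hausdorff_randomized and the seed parameter are chosen here, the point sets are shuffled once per direction, and the Euclidean distance is the default, with a geographic distance function substitutable through the hypothetical dist parameter.

```python
import math
import random


def hausdorff_randomized(a, b, dist=math.dist, seed=None):
    """Exact symmetric HD using random ordering plus early breaking."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    rng = random.Random(seed)

    def directed(p, q):
        p, q = list(p), list(q)
        # Shuffle both loops so successive iterations see unrelated points.
        rng.shuffle(p)
        rng.shuffle(q)
        cmax = 0.0
        for x in p:
            cmin = math.inf
            for y in q:
                d = dist(x, y)
                if d < cmax:
                    break  # x cannot raise the temporary HD
                cmin = min(cmin, d)
            else:
                cmax = cmin
        return cmax

    return max(directed(a, b), directed(b, a))
```

Shuffling affects only the order in which pairs are visited, so the returned value equals the brute-force HD for any seed; the ordering influences the running time only.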

EXPERIMENTAL RESULTS
This section presents the evaluation results of several state-of-the-art segmentation algorithms and software tools. The paper uses a GF-2 aerial image of Yinchuan, China, with an elaborate ground truth of the object boundaries. Three segmentation methods, LEP, EGB and MeanShift, and two commercial software packages, ENVI and eCognition, are used in the experiments. The version of eCognition is eCognition Developer 8.7 and the version of ENVI is 5.1. According to the choice of reference, the experiments are conducted on both the ground truth and the alternative over-segmentation.
Section 4.1 presents a qualitative comparison with the ground truth, and Section 4.2 describes the experiments on comparison with the over-segmented results from TurboPixel. Finally, this section gives a short summary of the result analysis.

Supervised evaluation
The supervised evaluation is based on the ground truth, which is elaborately provided by a human operator. In this case, the reference for evaluating different segmentation algorithms is traditionally regarded as reliable and acceptable for most scenarios. With the ground truth as reference, different algorithms are evaluated and the GHD results are presented in Table 1 and Figure 2. The GHD represents the max-min geographic distance between the two point sets of the object boundary obtained by manual delineation and by the automatic algorithm. With the ground truth segmentation as reference, comparison of the GHD values of all algorithms for different objects shows that the segmentation accuracy is about 0.001 km to 0.1 km. For the GHD values of the same object, the difference between algorithms is small. The difference of the GHD values in the group waterbody 2 is ±0.0011 km. Analogously, the average difference between the algorithms in the group road is nearly ±0.0045 km. For vegetation and waterbody 3, the discrepancy is about ±0.009 km and ±0.0258 km, respectively. In some other groups, such as waterbody 1, the maximum discrepancy is 0.103342 km and the minimum value is 0.00141 km. The best segmentation comes from eCognition, while the worst one is from LEP. A detailed comparison between the two methods is shown in Figure 3. The table lists the GHD values between the ground truth reference and the algorithm results. It can be seen that the EGB and MeanShift methods outperform the other algorithms for different object groups.

Unsupervised evaluation
It is well known that manually generating a reference image is a difficult, subjective and time-consuming task. Moreover, for most images, especially natural images, we usually cannot guarantee that one manually generated segmentation is better than another. In this sense, comparisons based on such reference images are somewhat subjective.
The key advantage of unsupervised segmentation evaluation is that it does not require segmentation to be compared against a manually-segmented reference image.The ability to evaluate segmentation independently of a manually-segmented reference image not only enables evaluation of any segmented image, but also enables the unique potential for self-tuning.
Table 1 shows a general trend: the GHD values of the various methods computed against the ground truth (GHD-gt) and against the over-segmented result (GHD-overseg) are nearly the same. The discrepancy is about ±0.0003 km. In certain groups the difference is somewhat larger; for instance, in the group waterbody 1 the maximum difference is 0.076 km and the minimum difference is 0.00084 km. Apart from this group, the difference between GHD-overseg and GHD-gt in the other groups is nearly equal to 0.002 km. Although the discrepancy between GHD-overseg and GHD-gt varies in magnitude, the ranking of the different algorithms remains stable with respect to the ranking of the GHD in the order of the second column.
In the groups road and vegetation, the ranking is the same as for GHD-gt, and for the other groups the alteration is also very small. Hence, generally speaking, the GHD with the over-segmentation as reference is helpful for guiding the choice of automatic segmentation algorithms for aerial objects.
To sum up, the GHD measure calculates the real geographic distance between two point sets of an object boundary. The GHD value intuitively shows the performance of the segmentation results. In most cases, with the over-segmented result as an alternative to the ground truth reference, the GHD measure makes real-time feedback possible while the segmentation method is changed or its parameters are tuned. Although the over-segmented results depend on the parameter setting, they can be well tuned for different objects and have little influence on our evaluation.

CONCLUSION
Aerial object segmentation is the crucial first step of object detection. Its accuracy and reliability are critical for the subsequent classification and other applications. This paper first summarizes some state-of-the-art segmentation algorithms in the literature and pinpoints the weak and strong points of the segmentation and object proposal algorithms. The Hausdorff measure has the advantage that it involves the spatial position of each individual point, which makes it capable of incorporating spatial properties into the measurement. The proposed GHD measure integrates the HD with the geo-spatial registration of different segmentations. It combines the early-breaking and random-sampling strategies with the HD, which makes it efficient and intuitive. It also makes unsupervised evaluation uniquely suitable for the automatic control of online object segmentation from aerial images. Experimental results demonstrate that the GHD measure is promising for selecting among numerous algorithms and for tuning the parameters of a single algorithm.

Figure 2: Overlapped results of different segmentation

Figure 1: Hausdorff distance between A and B

Figure 4: Lake segmentation results, with the GHD values shown in Table 1

Table 1: GHD values of different algorithms for different objects
More details of a group of object boundaries, the original image, the ground truth and the over-segmented result are shown in Figures 4, 5, and 6.

Road segmentation results, with the GHD values shown in Table 1
Vegetation segmentation results, with the GHD values shown in Table

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLI-B4, 2016. XXIII ISPRS Congress, 12-19 July 2016, Prague, Czech Republic. This contribution has been peer-reviewed. doi:10.5194/isprsarchives-XLI-B4-187-2016