RECONSTRUCTION OF BUILDING OUTLINES IN DENSE URBAN AREAS BASED ON LIDAR DATA AND ADDRESS POINTS

The paper presents a comprehensive method for the automated extraction and delineation of building outlines in densely built-up areas. The novel element of the approach is the use of geocoded building address points. They provide information about building location and thus greatly reduce the complexity of the task. The reconstruction is carried out on 3D point clouds acquired by an airborne laser scanner. The method consists of three steps: building detection, delineation and contour refinement. The algorithm is tested on a data set covering an old market town and its surroundings. The results are discussed and evaluated by comparison with reference cadastral data.


INTRODUCTION

Motivation
Two-dimensional building outline reconstruction is expected to be a fully automated process that produces output with a high level of detail. The development of numerous disciplines dealing with spatial data, such as the real estate industry or GIS, has increased the demand for building footprints. Consequently, there is strong current interest in the automatic outlining of existing buildings followed by change detection (Champion et al., 2008). In addition, the extraction of building boundaries can be an important step towards 3D building modelling. Objects are commonly reconstructed from data captured by laser scanning. The increasing accessibility and capability of LIDAR sensors allow the acquisition of very dense point clouds, which leads to detailed modelling. The reconstruction process starts with object detection, a complex task that is crucial for the entire modelling. Many methods proposed in the literature rely on additional data, such as spectral images or topographic databases (Awrangjeb et al., 2010; Haala et al., 1998; Vosselman and Dijkman, 2001). Such information significantly improves the determination of building boundaries and its accuracy. However, for some areas, especially in developing countries, additional information sources are difficult to obtain. For this reason, the method presented below utilizes building address points, which give initial information about building location. The points are easily accessible from open web portals.

Geocoded address data
Each address point is assigned to one building and has an arbitrary location within the planar building outline. The point position is given by x and y coordinates. The set of points used in this work for algorithm testing was obtained from a regional agency cadastral database. It is worth mentioning that there are several ongoing projects that aim to implement free, participatory, community-oriented geocoding services. Among them are, for example, OpenGeocoding (http://www.opengeocoding.org), Open Addresses (http://openaddresses.org) and OpenStreetMap (http://www.openstreetmap.org). The main objective of these projects is to provide a worldwide free address database with a focus on areas where 2D databases are incomplete (Behr and Rimayanti, 2008). Web-based services at a worldwide level aim to collect geocoded address data, such as building postal addresses and coordinates, in order to make them freely available. The use of such information facilitates building reconstruction and the completion of topographic databases.

Aims
The objective of this work is to develop a comprehensive method for the automated extraction and delineation of complex building outlines in densely built-up areas. Objects are extracted from a raw 3D point cloud acquired by airborne LIDAR sensors. The method differs from other building detection algorithms in that it makes use of building address points. They give initial information about building location and serve as seed points during building detection. The presented reconstruction approach focuses on areas where buildings are tightly adjacent to each other and create complex, irregular outlines. In such scenes in particular, fully automated and exact building extraction poses a challenge. The incorporated address points simplify the detection process and greatly reduce the complexity of the entire modelling task.
The building outlines are determined in three steps. First, individual buildings are detected based on the position of their address points. Second, the detected regions of interest are delineated. Third, the initial contours are refined. The examples presented in this paper were computed using 3D point clouds with a density of 12 points/m². The data was captured by a full-waveform airborne LIDAR system in the old town of Brzeg (Poland). The results show the high potential of the presented approach.

Related work
The building reconstruction problem has been studied for many years (Weidner and Foerstner, 1995; Rottensteiner and Briese, 2002; Dorninger and Pfeifer, 2008; Haala and Kada, 2010); however, it remains an open issue in current research. Most of the building detection techniques proposed in the last few years relied on aerial images. The dominance of image-based techniques can be explained by the insufficient accuracy and overly large point spacing of earlier ALS systems. Improvements in laser scanning technology have enabled the acquisition of very dense 3D point clouds and thus triggered the development of numerous methods that use LIDAR data.
The reconstruction of building outlines comprises three parts: building detection, followed by contour tracing and regularization. Numerous approaches to building detection transform the ALS data into a planar grid structure (Alharthy and Bethel, 2002; Rottensteiner and Briese, 2002). This facilitates computation, since the extraction of 2D features is more accurate using 2D inputs than 3D data (Kaartinen and Hyyppa, 2006). In such methods buildings are usually identified from a normalized digital surface model, computed as the difference between the digital surface model and the digital terrain model (Weidner and Foerstner, 1995). Although these methods provide good results, the building outlines are strongly affected by the poor resolution of the interpolated DSM. Determining the outline directly from ALS data has the potential to deliver better accuracy of the reconstructed objects. An example of such an approach is presented in Sampath and Shan (2007), where the data is separated into building and non-building points by a slope-based algorithm. The detected building points are then segmented in order to obtain single clusters. In Matikainen et al. (2010) the building detection method is based on region-based segmentation and classification of laser points.
The final part of outline reconstruction, building boundary tracing and regularization, can also be solved in various ways. Vosselman (1999) reconstructs buildings by applying the Hough transform to dense height data. The building outline is regularized using the main orientation of the building. The orientation is determined by the direction of the ridge line, computed as the horizontal intersection between roof faces. Sampath and Shan (2007) propose a procedure that utilizes the Jarvis algorithm. The contour is regularized in a hierarchical least squares adjustment. In order to delineate building footprints, Neidhart and Sester (2008) perform a Delaunay triangulation. They propose three versions of outline simplification: a modified Douglas-Peucker algorithm, a graph-based approach and a RANSAC algorithm.
A review of the existing approaches to building boundary reconstruction is given in Vosselman and Maas (2010). An evaluation of different algorithms for the detection of building footprints and their changes is given in Champion (2009).

OUTLINE RECONSTRUCTION METHOD
The workflow of the building outline extraction is presented in Fig. 1. The method consists of three main steps: building detection based on address points, identification of the initial boundary and regularization of the contour. The input to the reconstruction algorithm consists of a data set provided by airborne LIDAR sensors and a list of building address points.

Derivation of building footprints
In the pre-processing step a raster height image is interpolated from the original data. The image resolution depends on the point density. This pre-processing step simplifies the neighbourhood relations within the data and thus improves the run time of the algorithm. Building detection is carried out by region growing, which uses the pixels associated with consecutive address points as seeds. This process yields not only a building mask, as is common in other approaches, but also the groups of pixels that constitute individual objects. The use of initial information about building position significantly improves the run time of building detection. It also prevents classification errors in which compact groups of trees are assigned to buildings. As output, the method provides a set of separated building clusters composed of adjacent pixels. At this stage, a building cluster may contain pixels that belong to trees adjacent to the building or overhanging it. In order to remove these outliers, the detected pixels are mapped onto the original point cloud, which is then segmented according to local normal vectors and connectivity.
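The seeded detection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper does not state the growing criterion, so the height-difference threshold `max_step`, the 4-connectivity, and the helper names `address_to_pixel` and `grow_building_region` are all assumptions made for the example.

```python
from collections import deque


def address_to_pixel(x, y, x_min, y_max, cell):
    """Map an address point (x, y) to a (row, col) index of the height image.

    Assumes a north-up raster whose upper-left corner is (x_min, y_max).
    """
    col = int((x - x_min) / cell)
    row = int((y_max - y) / cell)
    return row, col


def grow_building_region(height, seed, max_step=0.5):
    """Grow a 4-connected region from `seed` over the height image.

    A neighbour joins the region when its height differs from the pixel
    it is reached from by at most `max_step` (an assumed criterion).
    """
    rows, cols = len(height), len(height[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if (nr, nc) in region:
                continue
            if abs(height[nr][nc] - height[r][c]) <= max_step:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Toy height image: one 2 x 2 building and an isolated high pixel (a tree).
height_image = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 10.0, 10.0, 0.0, 0.0],
    [0.0, 10.0, 10.2, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 9.0],
]
seed_pixel = address_to_pixel(1.2, 2.6, x_min=0.0, y_max=4.0, cell=1.0)
building = grow_building_region(height_image, seed_pixel)
```

Because growth starts only from address points, the isolated high pixel in the corner is never visited, which mirrors the remark that compact groups of trees are not mistaken for buildings.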

Initial boundary extraction
Once the building regions are extracted from the image, they can be used to determine the bands of pixels that constitute the building boundaries (cf. Fig. 2b). The boundary pixels are detected by connected components analysis. The computation is carried out on the resampled image; hence, the precision of the extracted building boundaries is degraded by the interpolation.
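As a rough illustration of how a band of boundary pixels can be obtained from a building region, the sketch below uses a simple neighbourhood test instead of the connected components analysis named in the text. The function name `boundary_pixels` and the choice of 4-connectivity are assumptions for the example.

```python
def boundary_pixels(region):
    """Return the pixels of `region` that touch at least one non-region pixel.

    `region` is a set of (row, col) tuples; 4-connectivity is assumed.
    """
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {
        (r, c)
        for (r, c) in region
        if any((r + dr, c + dc) not in region for dr, dc in neighbours)
    }


# A 3 x 3 block: every pixel except the centre lies on the boundary.
block = {(r, c) for r in range(3) for c in range(3)}
outline = boundary_pixels(block)
```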
In order to maintain the level of detail provided by laser scanning, the detected pixels are mapped onto the original LIDAR point cloud. Because one pixel may contain more than one point, the mapping process delivers a set of 2D points (projected on the plane) that make up the outline contaminated by outliers. Thus, similar to Neidhart and Sester (2008), straight line detection is performed by Random Sample Consensus (RANSAC) (Fischler and Bolles, 1981). The algorithm makes it possible to estimate the parameters of a model in a noisy data set. Each set of building boundary points is processed separately. First, a hypothesis is advanced based on the line equation computed from two randomly chosen points. The hypothesis is tested by checking all the points in the data set. The points that lie close to the candidate line are added to the consensus set. The accepted distance from a point to the line is equal to twice the point spacing. The hypothesis is accepted when the size of the consensus set exceeds a predefined threshold. The line equation is then updated using all the points that fit the line. They are stored as a boundary line segment and excluded from the data. Then the whole process is repeated in order to detect the next line segment. The algorithm stops when no line composed of the required number of points remains.
The boundary lines are detected based on their equations. As a consequence, non-adjacent segments on both sides of building protrusions or insets are merged together. The algorithm therefore has to be modified in order to separate non-adjacent segments that share the same line equation. Depending on the connectivity threshold (the accepted distance between subsequent points within one line segment), erroneous connections are eliminated and the line is divided into new, shorter intervals. The line parameters are updated based on the new set of points. The result of the boundary line detection algorithm is presented in Fig. 2c.
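The sequential line detection and the connectivity-based splitting described above can be sketched as below. This is an illustrative reading under stated assumptions, not the authors' code: the number of trials per line (`iterations`), keeping the largest consensus set among those trials, the total least squares refit in `fit_line`, and all function names are choices made for the example. Only the tolerance of twice the point spacing and the minimum consensus size come from the text.

```python
import math
import random


def line_through(p, q):
    """Normalised line (a, b, c) with a*x + b*y + c = 0 through two points."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    if norm == 0.0:
        return None  # coincident points define no line
    a, b = a / norm, b / norm
    return a, b, -(a * p[0] + b * p[1])


def fit_line(points):
    """Total least squares line through a point set (principal axis)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # line direction
    a, b = -math.sin(theta), math.cos(theta)        # unit normal
    return a, b, -(a * mx + b * my)


def distance(line, p):
    """Perpendicular distance from point p to a normalised line."""
    a, b, c = line
    return abs(a * p[0] + b * p[1] + c)


def ransac_lines(points, spacing, min_points, iterations=200, seed=0):
    """Detect lines one by one; each accepted line removes its points."""
    rng = random.Random(seed)
    tol = 2.0 * spacing  # accepted distance: twice the point spacing
    remaining = list(points)
    segments = []
    while len(remaining) >= max(2, min_points):
        best = []
        for _ in range(iterations):
            p, q = rng.sample(remaining, 2)
            line = line_through(p, q)
            if line is None:
                continue
            consensus = [i for i, pt in enumerate(remaining)
                         if distance(line, pt) <= tol]
            if len(consensus) > len(best):
                best = consensus
        if len(best) < min_points:
            break  # no line with the required number of points is left
        members = [remaining[i] for i in best]
        segments.append((fit_line(members), members))  # refit on all inliers
        taken = set(best)
        remaining = [pt for i, pt in enumerate(remaining) if i not in taken]
    return segments


def split_by_gaps(line, points, max_gap):
    """Split the points of one line wherever neighbours are too far apart."""
    a, b, _ = line
    ordered = sorted(points, key=lambda p: b * p[0] - a * p[1])
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if math.dist(prev, cur) > max_gap:
            groups.append([])
        groups[-1].append(cur)
    return groups


# Toy outline: one horizontal and one vertical wall meeting at (10, 0).
wall_a = [(0.5 * i, 0.0) for i in range(21)]
wall_b = [(10.0, 1.0 + 0.5 * i) for i in range(15)]
found = ransac_lines(wall_a + wall_b, spacing=0.1, min_points=10)
parts = split_by_gaps((0.0, 1.0, 0.0),
                      [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)], max_gap=2.0)
```

Sorting the inliers along the line direction before looking for gaps is one simple way to apply the connectivity threshold; the paper does not say how the subsequent points are ordered.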

Outline improvement
The outlines provided by the previous step are composed of unstructured line intervals. In order to produce realistic building shapes, the boundaries have to be simplified, appropriately merged and finally adjusted. The first task, line simplification, starts with determining the sequence of line intervals within the building boundary. Each interval is associated with a set of points; hence, the two opposite points with the maximum distance from the interval's centre of gravity are easily determined. In order to establish the topological relation between all such points, a binary search tree is generated. Based on the closest neighbours of their end points, the line intervals are ordered clockwise and enumerated. Once the line topology is established, consecutive lines are examined for their mutual orientation. Nearly parallel segments are joined, which reduces the number of lines that define the building outline. According to a set threshold (dependent on the desired generalization level), the generalization process removes unwanted small details from the boundary while maintaining the essentials of the shape.
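The joining of nearly parallel consecutive segments can be illustrated with the short sketch below. It assumes that the intervals are already ordered along the contour and are given by their end points; the angular threshold `max_angle_deg`, the function names, and the choice to join the start of the first segment with the end of the second are assumptions for the example. The wrap-around between the last and the first segment is not handled.

```python
import math


def direction_deg(segment):
    """Undirected direction of a segment in degrees, in [0, 180)."""
    (x1, y1), (x2, y2) = segment
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def angle_diff(a, b):
    """Smallest difference between two undirected directions, in [0, 90]."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def merge_nearly_parallel(segments, max_angle_deg=10.0):
    """Join consecutive segments whose directions differ by a small angle."""
    merged = [segments[0]]
    for seg in segments[1:]:
        last = merged[-1]
        if angle_diff(direction_deg(last), direction_deg(seg)) <= max_angle_deg:
            merged[-1] = (last[0], seg[1])  # keep outer end points only
        else:
            merged.append(seg)
    return merged


# Two slightly bent pieces of one wall followed by a perpendicular wall.
ordered = [((0, 0), (5, 0.1)), ((5, 0.1), (10, 0)), ((10, 0), (10, 6))]
simplified = merge_nearly_parallel(ordered)
```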
The next part consists of segment merging and line adjustment. The process is based on the main orientation, which is determined as the mean direction of the longest building segments. Each segment is labelled as parallel or rectangular according to the difference between its own direction and the main orientation. Consecutive lines are then examined in order to find potential gaps within the contour. A gap is detected when two successive lines have the same label. Depending on the distance between their end points, the lines are either merged or separated by a new, perpendicular interval, which is inserted halfway between their ends. In the next step the line equations are adjusted in order to form regular building outlines. According to the label of each segment, a rectangularity or parallelism constraint is enforced. Adjusted lines shorter than the assumed generalization threshold are removed from the contour. The final result of the boundary regularization is illustrated in Fig. 2d.
Each line of a contour is either parallel or rectangular to the main direction. This assumption is motivated by the fact that buildings mainly consist of parallel and rectangular facades.
Although the presented algorithm gives very precise results in most cases, it fails when adjacent buildings create a boundary shape with arbitrary angles. Thus, additional research on this task will be necessary in order to improve the implementation.
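A possible way to compute the main orientation and the parallel or rectangular labels is sketched below. The paper only says that the mean direction of the longest segments is used; the length-weighted circular mean on quadrupled angles (so that perpendicular walls reinforce rather than cancel each other), the number of segments `n_longest`, the 45 degree decision boundary and all names are assumptions for the example.

```python
import math


def length(segment):
    (x1, y1), (x2, y2) = segment
    return math.hypot(x2 - x1, y2 - y1)


def direction_deg(segment):
    """Undirected direction of a segment in degrees, in [0, 180)."""
    (x1, y1), (x2, y2) = segment
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def main_orientation(segments, n_longest=3):
    """Length-weighted mean direction of the longest segments, in [0, 90)."""
    longest = sorted(segments, key=length, reverse=True)[:n_longest]
    s = sum(length(g) * math.sin(4.0 * math.radians(direction_deg(g)))
            for g in longest)
    c = sum(length(g) * math.cos(4.0 * math.radians(direction_deg(g)))
            for g in longest)
    return (math.degrees(math.atan2(s, c)) / 4.0) % 90.0


def label(segment, main_deg):
    """'parallel' or 'rectangular' with respect to the main orientation."""
    d = abs(direction_deg(segment) - main_deg) % 180.0
    d = min(d, 180.0 - d)  # folded into [0, 90]
    return "parallel" if d <= 45.0 else "rectangular"


def gap_candidates(labels):
    """Indices i where segment i and its successor share the same label."""
    return [i for i in range(len(labels) - 1) if labels[i] == labels[i + 1]]


# A 10 x 6 rectangle traversed along its contour.
contour = [((0, 0), (10, 0)), ((10, 0), (10, 6)),
           ((10, 6), (0, 6)), ((0, 6), (0, 0))]
main = main_orientation(contour, n_longest=2)
labels = [label(seg, main) for seg in contour]
```

For a regular contour the labels alternate, so `gap_candidates` returns an empty list; two equal labels in a row mark a place where the text calls for either merging the lines or inserting a perpendicular interval.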

Study area and data
The proposed approach was tested on LIDAR data supplemented by a list of building address points. The data was collected with an approximate density of 12 points per m². The scene is located in the old town of Brzeg (Poland) and covers the market square and its surroundings (an overview of the area is shown in Fig. 4a). The size of the area is about 0.5 km². It comprises 361 individual buildings that form 105 clusters of adjacent buildings. The area corresponds to a very dense urban settlement with buildings of various sizes and shapes. This complex urban configuration additionally complicates the task of building boundary reconstruction.

Quality assessment
The correctness was verified by comparing the extraction results with the building contours obtained from the cadastre. The quality was estimated using area-based accuracy measures (Song and Haithcoat, 2005). The resulting indexes are as follows. Matched overlay (the ratio of the overlapping parts of the reconstructed buildings to the total area of the reference building regions): 90%. The overlay with cadastral information is illustrated in Fig. 3a. Area omission error (the total area of non-detected building parts divided by the total area of the reference objects): 10% (marked as blue regions in Fig. 3b). Area commission error (the total area of incorrectly detected building parts divided by the total area of the detected objects): 8% (marked as red regions in Fig. 3b).
The visual check of the results reveals that the indexes are strongly degraded by improper handling of closed building clusters and by false enforcement of regular angles. All these errors arise from the last reconstruction step, boundary regularization. Therefore, applying a more robust approach in future research (for example the one presented by Guercke and Sester, 2011) should significantly increase the quality of the overall results.
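The three area-based indexes defined above can be computed as in the following sketch. It assumes that both the reconstructed and the reference footprints have been rasterised onto the same grid and are given as sets of cell indices; this representation and the function name `area_measures` are assumptions for the example, and the toy numbers are not the paper's data.

```python
def area_measures(detected, reference):
    """Area-based quality indexes from two sets of equally sized grid cells.

    matched overlay = |detected AND reference| / |reference|
    omission error  = |reference NOT detected| / |reference|
    commission error = |detected NOT reference| / |detected|
    """
    matched = len(detected & reference) / len(reference)
    omission = len(reference - detected) / len(reference)
    commission = len(detected - reference) / len(detected)
    return matched, omission, commission


# Toy case: a 10 x 10 reference footprint and a detection shifted by one cell.
reference_cells = {(r, c) for r in range(10) for c in range(10)}
detected_cells = {(r, c) for r in range(10) for c in range(1, 11)}
matched, omission, commission = area_measures(detected_cells, reference_cells)
```

By these definitions the matched overlay and the omission error always sum to one, which is consistent with the reported 90% and 10%, while the commission error is normalised by the detected area and is therefore independent of the other two.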

Results and discussion
The input data is presented in Fig. 4b. Figure 4c illustrates the results of building detection. The buildings are extracted, based on their address points, from the height image interpolated with a resolution of 0.5 m. The visualisation shows that the algorithm provides good results. Although the gridded image facilitates the detection process and efficient computation, its level of detail is degraded by the interpolation. Hence, the image is used only to detect an approximate set of boundary points (real boundary points and outliers) in the original data. These initially extracted boundaries are presented in Fig. 4d. The final results of the building outline reconstruction, computed from the original LIDAR data and adjusted, are illustrated in Fig. 4e. Figure 4f shows the reconstructed outlines superimposed on the orthophoto of the area. It can be seen that most of the buildings are outlined very precisely. Although the algorithm is generally promising, in some cases it returns poor results. The most important problem arises from the right angle constraint. The visual check suggests that the regularization step improves the shapes of the more standard objects. However, when the objects do not feature parallelism and rectangularity, the final results are severely distorted. In such cases it is especially hard to maintain a trade-off between the regularity constraint and the level of freedom. For complex shapes (e.g. churches or castles) the adjustment step degrades the initial boundaries. Another problem is observed for building regions that enclose an empty space. In such situations only the outer boundary is extracted. Finally, it is not individual buildings but their clusters that are reconstructed. No automatic tool can determine the border between neighbouring buildings when there is no gap between them. However, for clusters with differences in roof structure, the results could be partially improved by analysing the normal vectors in a local neighbourhood.

CONCLUSION
The paper has presented a fully automatic framework for the efficient reconstruction of building outlines from LIDAR data based on geocoded address information. The presented results were obtained without any manual refinement. In the first step, separate building regions were marked in the interpolated height image based on their initial location given by the address points. Then, the pixels detected as boundary were projected onto the original data in order to deliver a set of boundary points contaminated by outliers. This data set serves as the input to the RANSAC algorithm, which detects straight lines and delivers the initial boundary. Finally, the boundaries were regularized according to the parallelism and rectangularity constraints that usually characterize a building.
The presented approach was applied to a dense residential area with complex building shapes. The work presented in this paper is still in progress, and an improved regularization approach would significantly increase the performance of the whole algorithm.
The idea of using building address points for building outline reconstruction is new and has shown good potential. The number of web portals that freely share geocoded information is increasing rapidly with the development of the information society. Although such data, collected in different ways, cannot be treated as completely reliable, it may be sufficient to serve as an initial hint for further computation and analysis. Moreover, it offers an opportunity to easily connect the reconstructed buildings with all the information available in open databases. The work presented in this paper focused on the methodology of building reconstruction using initial information about building location. In further work, real open source information will be utilized. The quantitative accuracy analysis indicates a matched overlay of 90% between the reconstructed buildings and the reference cadastral data.
The task is co-financed by the European Union under the European Social Fund.

Figure 1. The workflow of outline extraction.
the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B3, 2012 XXII ISPRS Congress, 25 August - 01 September 2012, Melbourne, Australia

Figure 2b. Building region outlines detected in the height image.

Figure 2c. Set of refined straight lines resulting from the modified RANSAC algorithm.

Figure 3. Comparison with reference data: (a) building footprints from the cadastre (green) and reconstructed building outlines (black); (b) omission errors (blue) and commission errors (red).

Figure 4c. Detected separated clusters of adjacent buildings.

Figure 4f. Resulting outlines superimposed on an orthophoto image.

Figure 4b. Input data: LIDAR point cloud (after interpolation) and address points.