IMAGE-BASED MODELING TECHNIQUES FOR ARCHITECTURAL HERITAGE 3D DIGITALIZATION: LIMITS AND POTENTIALITIES

3D reconstruction from images has undergone a revolution in the last few years. Computer vision techniques use photographs from a collected dataset to rapidly build detailed 3D models. The combined application of different algorithms (MVS) and the different techniques of image matching, feature extraction and mesh optimization form an active field of research in computer vision. The results are promising: the obtained models are beginning to challenge the precision of laser-based reconstructions. Among the available tools we can mainly distinguish desktop and web-based packages. The latter exploit the power of cloud computing to carry out semi-automatic data processing, leaving the user free to carry out other tasks on their computer, whereas desktop systems require long processing times and computationally heavy approaches. Computer vision researchers have explored many applications to verify the visual accuracy of 3D models, but approaches to verify metric accuracy are few, and none addresses Autodesk 123D Catch applied to Architectural Heritage Documentation. Our approach to this challenging problem is to compare 3D models produced by Autodesk 123D Catch with 3D models produced by terrestrial LIDAR, considering different object sizes, from details (capitals, moldings, bases) to large scale buildings, for practitioners' purposes.


INTRODUCTION
The diffusion of image-based 3D modeling techniques through free, low cost and open source software has increased drastically in the past few years, especially in the domain of Cultural Heritage (Architecture, Archeology, Urban planning). Some software performs 3D reconstruction using only structured photo datasets (ARC3D, 123D Catch, Hyp3D, my3Dscanner), while other software uses both structured and unstructured photo datasets, for example downloaded from Flickr.com (VisualSfM, PhotoSynth). This approach has been developed so that non-expert operators can view images and browse photo collections through Structure from Motion (SfM) techniques. Furthermore, some algorithms can be downloaded; thus, it is possible to modify them in order to overcome the limits identified during their application (Vu et al., 2009). Other software works like a real "black box" (Nguyen et al., 2012): the algorithms are secret and the user cannot understand the "formula" that produces the outcome. Nevertheless, web-based software (ARC3D, 123D Catch, Hyp3D, my3Dscanner) offers another opportunity: it uses the power of cloud computing to carry out semi-automatic data processing, instead of considerably slowing down the computer during data processing as desktop systems do.

In this paper we focus on architectural heritage digitalization using 123D Catch by Autodesk, one of the most widely used web-based packages. Among all the available web-based software (ARC3D, 123D Catch, Hyp3D, my3Dscanner) we chose Catch for its ease of use, the visual quality of the reconstructed scene and the possibility to interact with and refine the results (by manual stitching of homologous points on triplets of images). Furthermore, the Catch 3D mesh is suitable for all 3D modeling software. 123D Catch by Autodesk is a free (at the present time) web-based service in beta release that supersedes Autodesk's earlier Photofly technology preview project, launched in the summer of 2010 using technology developed by Realviz (now Acute 3D).

Our goal is to identify a methodology for using the software and to verify and demonstrate its metric reliability. The aim of this work is to give a simple guide to practitioners so that they will be able to work in the architectural survey field without using expensive technologies and software and without highly specific expertise. Our applications address small and large scale buildings in the field of architectural heritage. We investigated in depth the metric reliability of 123D Catch models, comparing them with terrestrial laser scanner acquisitions or reliable Ground Control Points (GCP), as well as the surface reconstruction quality and the detail quality in relation to the number of images and their resolution. After a description of related work, we describe the correct use of the software and demonstrate its visual and metric accuracy.

Previous work
Most studies aimed at testing and exploring the potentialities and limits of 123D Catch address the digitalization of small objects (Nguyen et al., 2012), archaeological finds such as fragments and furnishings (Kersten, Lindstaed, 2012), parts of small archaeological sites (Lo Brutto, Meli, 2012; Dellepiane et al., 2013) and statues, such as the excellent outcomes obtained by Kersten (Kersten, Lindstaed, 2012) on the Moai statues of Easter Island. In the architectural field, instead, there is a lack of systematic studies: Kersten carried out some applications on very simple buildings, and Manferdini (Manferdini, Galassi, 2013) worked mainly on 123D Catch verifications. Hence, rather than comparing 123D Catch with other image-based modeling tools, we consider it more useful to investigate the metric and visual accuracy of 123D Catch architectural meshes (Santagati, Inzerillo, 2013) in order to fill this gap and provide some methodological directions.

123D CATCH: METHODOLOGICAL APPROACH
Among all the web-service packages currently available, 123D Catch by Autodesk is the only one that allows the user to improve the result of the 3D scene reconstruction through the manual stitching of homologous points on triplets of images and the resubmission of the scene to the service. The approach underlying 123D Catch technology is well described in (Vu, H-H. et al., 2009). Exploiting the photogrammetric approach and Computer Vision algorithms, 123D Catch is able to reconstruct the internal parameters of the digital camera and the position in space of the homologous points from a number of correspondences between suitably taken sequences of photographic images. Through pixel-to-pixel correspondence, the 3D coordinates of all points of the scene are found and the polygonal model is reconstructed.

The main steps to use 123D Catch are to:
- Capture a photographic sequence of an object such that the angle between one shot and the next is about 5-10 degrees and the overlap is about 70%;
- Use the iPhone, iPad, web, or desktop app to upload the photos to the Autodesk cloud (the user can decide whether to wait for the 3D reconstruction or to be notified by email);
- Improve the results by manual stitching of homologous points on triplets of images and submit the scene to the cloud again;
- Create a video, share with others, or even fabricate your project with 123D's 3D printing or laser cutting services.

Furthermore, for use in the cultural heritage visualization field it is necessary to scale and post-process the obtained model to fix all the imperfections (noise and holes) in mesh quality, also by using open source software such as Meshlab (Cignoni, P. et al., 2008). All those advantages cut down the costs not only in terms of required equipment but also in terms of the man-hours and machine-hours to be considered when starting a digitalization project.

The reconstruction process begins by estimating parameters for the sequence of dataset photos. The sequence is very important for reaching satisfactory outcomes: if the sequence changes, the result changes too. The pictures must be taken along a continuous path around the object and submitted to Catch in the same order. The right sequence for taking the pictures is shown in figures 2-3. As stated above, it is necessary to capture a photo dataset sequence of an object such that the angle between one shot and the next is about 5-10 degrees and the overlap is about 70%. This is a strong condition for ensuring a good result. So if the building is hemmed in by bottlenecks, it is not possible to create a well-structured set of pictures. Another strong condition is that the building must be framed in its entirety, and therefore shot from a reasonable distance. This excludes buildings that are located in narrow streets or that have obstacles (such as statues, trees, etc.) preventing the required shooting distance. Figure 4 demonstrates a case where the 123D Catch result is not usable.

Given a set of matching images, the goal of this stage is to recover simultaneously the geometry of the scene and the Structure from Motion (SfM) (Furukawa and Ponce, 2007; Remondino et al., 2012; Snavely, 2008; Wu, 2011). SfM includes the extrinsic (position, orientation) and intrinsic parameters of the camera for the captured images. 123D Catch processes only photos taken by a single camera. If the lens or camera gear is changed, a wide-angle lens is used, or images downloaded from the web are used, Catch's algorithm is not able to process the mesh because it does not recognize the homologous points. Photo datasets have to follow the parameters shown in the previous section. The number of pictures to take depends on the object to be processed and the amount of detail required. Catch allows the user to choose an output quality for the mesh. There are three choices: mobile, which is fast and suitable for viewing on mobile devices; standard, which is the one recommended by Catch, with a high resolution textured mesh, and is the best for visualization on the desktop; and maximum, which is a very high density mesh, suitable for manipulation in external applications.
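As a rough planning aid, the acquisition parameters above (a 5-10 degree step between shots and about 70% overlap) can be turned into a shot count and a camera spacing. The following is a minimal sketch under simplifying assumptions: a circular path around the object, and a flat façade whose footprint width per photo is known. The function names are hypothetical and are not part of 123D Catch.

```python
import math


def shots_for_arc(step_degrees: float, arc_degrees: float = 360.0) -> int:
    """Number of shots needed to cover an arc with one shot every step_degrees."""
    if step_degrees <= 0:
        raise ValueError("step_degrees must be positive")
    return math.ceil(arc_degrees / step_degrees)


def baseline_for_overlap(footprint_width_m: float, overlap: float = 0.70) -> float:
    """Distance between consecutive camera stations along a flat facade.

    footprint_width_m is the width of facade covered by one photo (assumed known);
    overlap is the fraction shared by two consecutive photos.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint_width_m * (1.0 - overlap)


if __name__ == "__main__":
    # A full circuit around an isolated object at the two ends of the 5-10 degree range.
    print(shots_for_arc(10.0))  # 36
    print(shots_for_arc(5.0))   # 72
    # Camera spacing when each photo covers 10 m of facade with 70% overlap.
    print(round(baseline_for_overlap(10.0, 0.70), 2))
```

Under these assumptions a complete circuit needs between 36 and 72 photos, which gives an order of magnitude for the dataset size before going on site.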

METRIC ACCURACY
As stated in the introduction, our methodological approach is focused on testing 123D Catch on different architectural objects to verify its reliability and give to practitioners some milestone on which to build.Thus, our chosen case studies span from architectural detail to large scale architecture.

Entire architectural object
In this section, we obtain the 123D Catch 3D model of an entire architectural object. The third strong condition, discussed in section 2.2, restricted our choice of building: it was really difficult to find a building isolated from its context and proportionate to the road so that it could be photographed in its entirety. We tested Catch on a small chapel in Catania, the Auteri Chapel. We chose this building for its simple cylindrical shape.

Table 2. Auteri Chapel dataset
Table 2 reports all the information about the Auteri Chapel dataset: dimension of the object, mesh quality, number of vertices and triangles (faces). The visual accuracy of the 123D Catch 3D model is very good apart from the roof covering, which we could not photograph. Figure 6 shows the excellent sharpness of the cylindrical surfaces of the building, the niches and the columns.

For the large-scale case study, the comparison outcomes reveal a very good overlap. The average error is 0.03 m. In the vertex quality evaluation, some of the blue zones correspond to gaps in the laser scan mesh. Hence, considering that the terrestrial laser scanner used has an accuracy of 0.006 m, we can still assert that we get good outcomes on large-scale architecture. Nevertheless, practitioners should pay attention to all the erroneous reconstructions produced by Catch. Furthermore, both horizontal and vertical cross sections, carried out with a step of 0.50 m, confirm the quality of the 123D Catch model.

Entire architectural object Auteri Chapel
Figure 12: San Nicola façade mesh alignment error evaluation histogram: red (good), blue (bad)

Figure 13: San Nicola façade mesh alignment error evaluation through horizontal cross sections

Input Image Resolution:
The last test we carried out deals with the influence of image resolution on metric accuracy. We worked on the San Nicola datasets, considering four different datasets according to number of images and resolution, as shown in Table 5.
Table 5. Datasets for image resolution tests
Since in the previous section we demonstrated that mesh quality affects the metric accuracy of the model, it seemed sufficient to perform only a visual comparative analysis. The comparisons were made in Meshlab, visualizing the meshes in both textured and smooth mode. The presence of holes was considered of secondary importance compared to the quality of the mesh's graphic detail.
Figure 14 clearly shows some common 3D reconstruction deficiencies: holes and lack of sharpness. The latter matters more for achieving high visual and metric mesh accuracy. 123D Catch handles a smaller photo dataset more leanly, quickly and reliably. On the other hand, a larger photo dataset guarantees more resolution, sharpness and geometric accuracy, especially in the details.
123D Catch tends to close mesh holes automatically without considering the actual geometry of the object. The reliability of the holes closed automatically by Catch is not acceptable.
We can conclude that it is preferable to provide a photo dataset with high resolution images.

CONCLUSIONS AND FUTURE WORKS
We described the overall design of image-based reconstruction algorithms and evaluated a number of 3D reconstruction models. We can conclude that 123D Catch is an excellent tool for Image Based Modeling. However, to give a reliable overview, we have noted tips, advantages and disadvantages of Catch.

Suggestions:
Metric accuracy was significantly affected by mesh quality. Therefore, the two parameters that control mesh quality must be managed carefully, namely the resolution of the dataset and the number of images.
The number of pictures must be appropriately selected depending on size and level of detail and according to the parameters that regulate photogrammetry. For large scale architecture it is better to make the photo datasets as large as possible to ensure a metric accuracy of a few centimeters.

Advantages:
To obtain reliable processing in terms of metric accuracy, it is necessary to use cameras with a resolution of 6-12 Mpixel. Therefore, it is possible to use non-professional cameras without specific lenses. All those advantages cut down the costs not only in terms of required equipment but also in terms of the man-hours and machine-hours to be considered when starting a 3D digitalization project. As a matter of fact, when planning a laser scanner project, all those ratios weigh very heavily on the intervention.

Disadvantages:
• Photo datasets must be structured.
• The building or object to capture should be shot in its entirety. 123D Catch is not able to manage the overlap between two frames in height.

This latter condition strongly limits the use of this tool in several architectural applications. Most of the time, one is not in the optimal condition to capture a good photo dataset. Practitioners need as much information as possible on the entire building in order to produce horizontal and vertical cross-sections, elevations, etc., for their professional activity. Since our study aims to verify 123D Catch and give practitioners a guide to using such a powerful low cost tool, we can state that, even though visual and metric accuracy are excellent, the need to capture the building in its entirety considerably reduces the possible case studies and hence its full use by practitioners.
Otherwise, it is an excellent tool for other application fields such as: research investigations; archaeological survey; survey of museum visual art collections; survey of architectural elements. Testing the visual and metric accuracy and reliability of 123D Catch on both the small and the large scale was a critical step, so far lacking in the literature.

Figure 1. Datasets used for visual accuracy tests

Figure 2: The photographic sequence needed for the acquisition of an architectural element.

Figure 3: 123D Catch: civil building (A) in Palermo

For testing we chose an architectural detail and the façade of the church of San Nicola l'Arena in Catania, and the Auteri Chapel in Catania. For metric comparison we used both reliable direct surveys and point clouds acquired with the TOF (Time of Flight) laser scanner HDS 3000 by Leica Geosystems of the Laboratory of Architectural Photogrammetry and Survey "Luigi Andreozzi" (University of Catania).

Figure 5: Datasets used for metric accuracy tests

The 3D model made in 123D Catch was exported in obj format. The metric comparison was carried out in Meshlab, which is able to scale, align and process both point clouds and meshes. The alignment was carried out by gluing and scaling the Catch mesh onto the laser mesh using the ICP algorithm. The alignment outcomes between the two meshes were verified by applying the Hausdorff distance filter and visualized through the vertex quality filter. The results are shown in a red-green-blue scale, where red means good and blue means bad. Furthermore, we carried out a series of vertical and horizontal cross-sections on the two aligned meshes in the JRC Reconstructor environment to better quantify and visualize the gaps between the two meshes. For the Auteri Chapel, the comparison was carried out in a CAD environment by overlaying the cross-sections created in JRC Reconstructor on CAD drawings.
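To make the distance-based comparison more concrete, the idea of measuring how far one aligned model lies from the other can be sketched on plain lists of 3D points. This is a brute-force illustration only: it is not Meshlab's Hausdorff distance filter, it ignores mesh faces, and the function names and sample coordinates are hypothetical.

```python
import math


def nearest_distances(source, target):
    """For every point in source, the distance to its nearest point in target."""
    return [min(math.dist(p, q) for q in target) for p in source]


def one_sided_stats(source, target):
    """Mean and maximum of the source-to-target nearest-point distances."""
    distances = nearest_distances(source, target)
    return {"mean": sum(distances) / len(distances), "max": max(distances)}


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: the larger of the two one-sided maxima."""
    return max(one_sided_stats(a, b)["max"], one_sided_stats(b, a)["max"])


if __name__ == "__main__":
    # Two tiny, already aligned point sets (coordinates in metres, made up).
    image_based = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    laser_scan = [(0.0, 0.0, 0.01), (1.0, 0.0, 0.03)]
    stats = one_sided_stats(image_based, laser_scan)
    print(round(stats["mean"], 4), round(stats["max"], 4))
    print(round(hausdorff_distance(image_based, laser_scan), 4))
```

The mean of the one-sided distances plays the role of an average error, while the maximum highlights the worst local deviation; the brute-force search is only practical for small point sets.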

Figure 6: Auteri Chapel 123D Catch model, on the right smooth visualization

As previously verified, the 123D Catch 3D reconstruction encountered some problems due to the lack of continuity of the images in the dataset, caused by obstacles. For the metric accuracy test we used horizontal and vertical cross sections compared with Ground Control Points (GCP) obtained from a reliable direct survey. Therefore, the 123D Catch model was scaled and referenced so as to overlap the direct survey drawings. Figure 7 shows the metric accuracy evaluation considering two significant horizontal cross-sections: one at 0.75 m and the other at 1.50 m. Except for the areas where the 3D reconstruction is not geometrically exact, the gaps are about 0.01 m, thus confirming what was previously tested.
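Scaling an image-based model to survey measurements amounts to multiplying its coordinates by the ratio between a surveyed distance and the same distance measured on the unscaled model. The sketch below assumes that a single pair of corresponding points is known; the function names and all numbers are illustrative and are not taken from the Auteri Chapel survey.

```python
import math


def scale_factor(model_point_a, model_point_b, surveyed_distance_m):
    """Ratio between a surveyed distance and the same distance on the unscaled model."""
    model_distance = math.dist(model_point_a, model_point_b)
    if model_distance == 0:
        raise ValueError("the two model points must be distinct")
    return surveyed_distance_m / model_distance


def scale_points(points, factor):
    """Apply a uniform scale about the origin to a list of 3D points."""
    return [tuple(coordinate * factor for coordinate in point) for point in points]


if __name__ == "__main__":
    # Two points 2.0 model units apart that the direct survey places 5.0 m apart.
    factor = scale_factor((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 5.0)
    print(factor)  # 2.5
    print(scale_points([(2.0, 0.0, 0.0), (0.0, 4.0, 0.0)], factor))
```

In practice several reference distances would normally be used and their ratios compared, since a single measurement carries its own error into the whole model.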

Figure 7: Auteri Chapel reconstruction error evaluation on horizontal profiles: yellow 123D Catch model, red GCP survey

3.2 Large scale architecture

3.2.1 Part of the lateral entrance:
We chose a part of the lateral entrance of the church of San Nicola l'Arena in Catania. Table 3 summarizes all the information about dimension, resolution, number of images and mesh quality in terms of number of vertices and triangles (faces). The shots were taken with a Coolpix L22. The dataset includes 20 images and its visual appearance is very good. We observe that the 123D Catch model is more detailed than the point cloud one. The outcomes of the comparison denote a very good overlap. The average error is 0.0125 m. Furthermore, both the horizontal and vertical cross sections, carried out with a step of 0.05 m, reveal a very good quality of the 123D Catch model (figure 10).

Figure 8: Part of the lateral entrance, 123D Catch model

Table 3. Part of lateral entrance dataset
Architectural element: Part of lateral entrance
Dimension of the object: 8x12 m
Number of images: 21
Resolution: 8.5 Mpixel
123D Catch mesh: 1,768,595 triangles; 886,705 vertices
Laser scan point cloud: 169,363 vertices
Average error: 0.0125 m

Figure 14: San Nicola 136 dataset - on the left High Resolution photos 3D model, on the right Low Resolution photos 3D model

Table 1. Datasets used for visual accuracy tests

Table 4. San Nicola dataset

In (Santagati, Inzerillo, 2013) we obtained two datasets with 74 and 136 photos. Here we report the results of the 136 photos dataset, whereas in (Santagati, Inzerillo, 2013) we verified an error of 0.10 m on the model built from the 74 photos dataset. Table 4 reports all the information regarding dataset dimension, resolution and number of images. During Catch processing it was necessary to manually stitch 50 images in addition to the 86 processed automatically. Because of the heaviness of the terrestrial laser scanner mesh, we produced a light mesh in JRC Reconstructor, preserving all sharp edges, and used it for mesh alignment, whereas the comparison was carried out with the original point cloud mesh. Dealing with large scale architecture, we verified whether and how the number of images affected both the visual and the metric accuracy of the model.

The achieved outcomes are promising. Nevertheless, among the issues that still remain open, we suggest:
• Comparison on the same datasets with other available SfM tools, both online and desktop;
• Possibility of integrating different SfM packages;
• Possibility of using this tool to fill gaps in laser scan point clouds without losing metric accuracy.