AN APPROACH TO BUILD A COMPLETE DIGITAL REPORT OF THE NOTRE DAME CATHEDRAL AFTER THE FIRE, USING THE AIOLI PLATFORM

With the development of digital technologies in documentation methods for cultural and architectural heritage, platforms and tools have emerged as solutions to collect, manage and produce data for multidisciplinary monitoring and study. In the context of an incident such as the fire at Notre-Dame de Paris in April 2019, the considerable mobilization of architects, scientists and researchers has led the "digital data" working group to conceive a digital ecosystem capable of collecting and integrating existing data, producing new data, sharing and archiving it, and finally structuring and semantically enriching it. One particular aspect of this work is an approach to building a complete digital report of the Notre-Dame de Paris cathedral. Based on photogrammetric acquisitions of the cathedral, from which point clouds were generated, projects were created in the aïoli platform. The structuring of the collected data (condition report, architectural descriptions, chemical and physical analyses…) then led to the construction of semantic annotations, accompanied by description sheets and attachments. The vast multidisciplinary studies conducted on the cathedral, as well as the large number of projects and annotations built over the course of 4 years, constitute a singular collection of data that will be the object of cross-comparison and interpretation in the coming years.


Operational context
Since the fire at the Notre-Dame de Paris cathedral on April 15th 2019, the scientific community has mobilized to assist the restoration worksite. The cathedral is not only a monument of French architectural heritage, but is also renowned internationally: its reconstruction and analysis are of interest not only to scientists, architects and engineers, but also to the public. As a consequence, this event was a unique opportunity for research on cultural heritage to take part in an experimental framework: the scientific worksite on Notre-Dame de Paris, established by the CNRS and the French Ministry of Culture. This framework allows researchers and professionals from various fields, including architecture, archaeology, history, computer science, anthropology, chemistry, and physics, to gather and build data on the cathedral and the broader field of cultural heritage science.

Objectives of the digital ecosystem and of this work
Considering the extent of the case study, and the numerous people and fields of work involved, the "digital data" working group aims to create a "digital ecosystem" (De Luca et al., n.d.) in an effort to collect and integrate existing data, produce new data, share and archive it, and finally structure and semantically enrich it. This working group currently involves 12 research units and 2 consortia from the TGIR HumaNum, for a total of approximately 35 researchers and engineers. Various tools, such as Esmeralda, ArcheoGRID, 3DHop, Opentheso, NDPviewer and Aïoli (further details on https://www.notredame.science/outils/), are currently being developed and enable documentary categorization, semantic categorization, and 3D spatialization and annotation. The ultimate aim is to interconnect these tools to analyze and correlate the multi-dimensional data they produce.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVIII-M-2-2023 29th CIPA Symposium "Documenting, Understanding, Preserving Cultural Heritage: Humanities and Digital Technologies for Shaping the Future", 25-30 June 2023, Florence, Italy

This particular work therefore takes place within the larger framework of the digital data working group and the digital ecosystem. Current works allow building a multi-focused approach on key issues (data structuring, photogrammetric acquisition, software engineering, semantic annotation…). The work presented here is mainly focused on the issue of 2D-3D semantic annotation. With the previously cited goals in mind, it aims to create a complete digital version of the condition report and scientific analyses for the cathedral. It uses the aïoli platform to structure, spatialize and semantically enrich the architectural analysis before and after the fire, as well as the scientific analyses, through the construction of hybrid 2D-3D projects and semantic annotations.
The first step was the generation of 3D point clouds by photogrammetry, which serve as the base material; they were then annotated and semantically enriched in the platform.

RELATED WORKS
As stated previously, this work is closely tied to the other tools and the vision developed in the digital data working group. However, it also draws upon previous research and experiments focused on creating digital and 3D condition reports. In the field of architectural heritage, reality-based 3D digitization has been employed for documenting conservation and restoration efforts (Apollonio et al., 2018). More recent works have explored the use of machine learning for image and point cloud segmentation by utilizing user annotations (Teruggi et al., 2020; Croce et al., 2021). In the CNRS-MAP laboratory, several significant experiments have been conducted on reality-based 2D-3D annotation while developing the Aïoli platform (Manuel et al., 2016), such as annotating weathering phenomena (Roussel et al., 2019) and documenting multimodal scientific imagery for restoration (Pamart et al., 2022). However, these experiments were limited to ordinary numbers of images and ordinary physical dimensions.

DATA AND IMAGE MANAGEMENT
The photogrammetric approach has resulted in the capture of over 58,000 images after the fire, amounting to a total of 1.36 terabytes of data. An additional collection of 40,000 images was gathered for documentation purposes, with varying degrees of compatibility with photogrammetric processing. The challenge lies in structuring a corpus of 98,000 images in accordance with photogrammetric processing and semantic annotation specifications. The data comprises a heterogeneous collection of images captured over different acquisition campaigns, with varying sizes and accessibility of the observed zones, and covering a temporal range from 2002 to 2021. The images were captured under different acquisition configurations (terrestrial or aerial, manual or robotized, with sparse or dense overlapping, etc.) and come from different sources: companies working on the restoration site (AGP, Bestrema, Mercurio, etc.), research laboratories involved in the Notre-Dame scientific action (MAP, LRMH, etc.), and donors such as Professor Andrew Tallon (Vassar College New York) and Didier Groux (A-Bime). Therefore, although the sheer number of images is a great asset for constructing a digital report based on photogrammetric acquisitions, managing and processing the data requires a carefully designed methodology.

The aïoli platform
The annotation projects were built on the latest version of the aïoli platform (Manuel et al., 2018), which is based on a web service and allows multi-user collaborative documentation of 3D reality-based digitizations of heritage artefacts. Under development in the MAP laboratory (Models and Simulations for Architecture and Cultural Heritage) with the support of the CNRS and the French Ministry of Culture, the web application allows users to upload a photogrammetric dataset, which is then processed fully automatically to build a dense 3D point cloud, relying on the MicMac processing pipeline (Pierrot-Deseilligny et al.). Once processed, the dense point cloud is used as a common geometric framework for propagating and correlating 2D annotations across the entire image set. This 2D-to-3D-to-2D projection process, implemented as a web service using Docker, is replicated hundreds of times during a work session as a background task, letting users build 2D-3D annotations organized in hierarchical structures and multiple thematic layers. This digital environment also allows the 2D-3D annotations to be enriched with custom description sheets and multimedia attachments. One of the latest advancements in the platform is the ability to import previously generated MicMac or Metashape projects, using specialized image orientation conversion procedures (Pamart et al., 2022). In the course of this work, we also implemented a special feature to align photogrammetric datasets to topographic references in order to manage the overall spatial referencing of the cathedral's data corpus. This reference frame is also used for merging all 3D spatialized data into web viewers (based on the ThreeJS and PotreeJS libraries), to ensure data and metric consistency with other tools from the project's digital ecosystem.
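The 2D-to-3D-to-2D propagation can be illustrated with a minimal sketch. It is not the aïoli or MicMac implementation: the function names, the toy camera model (a pinhole without rotation or distortion) and the rectangular region are all assumptions made for illustration. The idea is that cloud points whose projection falls inside a region drawn on one image are selected (2D to 3D), then re-projected into the other images (3D to 2D).

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]
# A projector maps a 3D point to a pixel, or None when the point is not visible.
Projector = Callable[[Point3D], Optional[Pixel]]


def propagate_annotation(
    cloud: Sequence[Point3D],
    source_projector: Projector,
    region_contains: Callable[[Pixel], bool],
    target_projectors: Dict[str, Projector],
) -> Tuple[List[int], Dict[str, List[Pixel]]]:
    """Hypothetical helper: carry a 2D region from one image to other images."""
    # 2D -> 3D: keep the cloud points whose projection falls inside the region.
    selected: List[int] = []
    for index, point in enumerate(cloud):
        pixel = source_projector(point)
        if pixel is not None and region_contains(pixel):
            selected.append(index)

    # 3D -> 2D: re-project the selected points into every other image.
    propagated: Dict[str, List[Pixel]] = {}
    for image_name, projector in target_projectors.items():
        pixels = []
        for index in selected:
            pixel = projector(cloud[index])
            if pixel is not None:
                pixels.append(pixel)
        propagated[image_name] = pixels
    return selected, propagated


def make_toy_projector(offset_x: float) -> Projector:
    """Toy camera (assumption): pinhole at (offset_x, 0, 0), looking along +Z, focal 100 px."""
    def project(point: Point3D) -> Optional[Pixel]:
        x, y, z = point
        if z <= 0:  # point behind the camera
            return None
        return (100.0 * (x - offset_x) / z, 100.0 * y / z)
    return project


cloud = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (5.0, 0.0, 10.0)]
selected, propagated = propagate_annotation(
    cloud,
    source_projector=make_toy_projector(0.0),
    region_contains=lambda px: -5.0 <= px[0] <= 15.0 and -5.0 <= px[1] <= 5.0,
    target_projectors={"image_b": make_toy_projector(1.0)},
)
print(selected, propagated)
```

In this toy setup the first two points fall inside the region drawn on the source image and are the only ones carried over to `image_b`. A real pipeline would also need occlusion handling, which this sketch leaves out.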

Annotation methodology
As previously mentioned, due to the substantial amount of data and its heterogeneous nature, it was not appropriate to conceive a single unified project for the entire cathedral. Instead, a set of smaller projects was developed, focusing on specific sections of the cathedral during particular time periods. The building was segmented into smaller regions, based on its architectural structure and the photogrammetric acquisitions. For instance, the cathedral's exterior was divided horizontally (at levels 1, 2, and 3) and vertically (by span). In the interior, each chapel was treated as a project, and the vaults and walls were subdivided into sections (Nave, Choir, Transept). Other projects focused on gables, sideboard walls, buttresses, or the western facade. The current collection comprises 240 aïoli projects. Among these projects, 186 have annotations: 137 pertain to the exterior of the cathedral, and 49 pertain to the interior. The exterior projects are smaller than the interior ones because a different segmentation method was used, which accounts for the difference in project numbers. This discrepancy does not imply that the interior received less attention or study. The 51 remaining unannotated projects are part of a point cloud collection constructed using image datasets that were not analyzed or accessible. As the analysis progresses, these previously generated projects can be annotated rather than recreated.

Figure 5. The projects and their annotations
Since the first experiments in creating aïoli projects and their semantic annotations in the summer of 2020, the method has evolved as the platform integrated new features and lessons were learned from previous projects. Once the import function was added, projects representing broader sections of the building (originally limited to a single span, but now extending to the entirety of the vaults) and containing a larger number of images (from 30 to 600) became easier and faster to process, thereby enabling a larger-scale segmentation of the cathedral than was originally feasible. The method and software tools employed to generate the point clouds have also evolved, as detailed below.

First method : aïoli with its MicMac photogrammetric engine
Initially, in 2020 and 2021, photogrammetric processing was carried out in the aïoli platform using a customized function called "create a spatialize/scaled model". This required uploading the photogrammetric dataset, along with a set of 2D/3D correspondence files in the .txt format. The 2D markers were selected directly from the images (for the 2D file), and the 3D markers from point clouds displayed in a general 3D viewer containing the topographic reference (laser scan). Subsequently, the photogrammetric dataset was processed using the MicMac engine integrated within the platform.

Second method : import from MetashapePro
As previously mentioned, an import function for projects built with MetashapePro has been developed in recent years at the CNRS-MAP lab. Using a Python conversion script that relies on the Metashape API, a compressed folder can be produced within the software and subsequently uploaded to aïoli as a pre-processed project conforming to the MicMac format. The ZIP file must include all dataset images, along with a list of these images containing their camera calibration, GP's, and 2D/3D correspondence, as well as the sparse and dense point clouds (in medium or high quality). This method is potentially less time-consuming than generating a project directly in aïoli, as it grants greater control over the individual processing steps and also enables the deletion of irrelevant portions of the point cloud, which is not feasible when using the first method directly in aïoli.
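Before uploading such an archive, a completeness check could be run along the following lines. This is only a sketch: the file names (`manifest.json`, `sparse_cloud.ply`, `dense_cloud.ply`, the `images/` folder) and the manifest structure are hypothetical and do not describe the actual layout produced by the conversion script or expected by aïoli.

```python
import io
import json
import zipfile
from typing import List

# Hypothetical file names, chosen for illustration only.
REQUIRED_FILES = ["manifest.json", "sparse_cloud.ply", "dense_cloud.ply"]


def missing_entries(archive_bytes: bytes) -> List[str]:
    """Return the expected archive entries that are absent from the ZIP."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        names = set(archive.namelist())
        missing = [name for name in REQUIRED_FILES if name not in names]
        if "manifest.json" in names:
            # Every image listed in the manifest must be present in the archive.
            manifest = json.loads(archive.read("manifest.json"))
            for image in manifest["images"]:
                path = "images/" + image["file"]
                if path not in names:
                    missing.append(path)
    return missing


# Build a small in-memory archive in which one listed image is missing.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "manifest.json",
        json.dumps({"images": [{"file": "IMG_0001.jpg"}, {"file": "IMG_0002.jpg"}]}),
    )
    archive.writestr("sparse_cloud.ply", "")
    archive.writestr("dense_cloud.ply", "")
    archive.writestr("images/IMG_0001.jpg", b"")

missing = missing_entries(buffer.getvalue())
print(missing)
```

Catching a missing image before upload avoids discovering the problem only after the platform has rejected or partially imported the project.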

Base documentary sources for 3D annotation
The semantic annotations developed in the digital condition report were derived not from personal observation, but from reports provided by professionals in the field. The data concerning the project's overall details, including prior restorations, nomenclature, painted decorations, furniture and statuary elements, as well as an architectural diagnosis and projected interventions, were extracted from the condition report prepared by architects Philippe Villeneuve, Remi Fromont, Pascal Prunet, and their respective teams between 2019 and 2021 (Villeneuve et al., 2019). These observations encompassed a variety of architectural elements and supports, such as masonry, stained glass, electrical networks, and arches. The annotations derived from this data constitute the majority of the annotations, representing 91% of them, and draw on thousands of pages of analysis scrutinized since the beginning of the project in 2020. The other focal point and source of data is the information generated by the scientific worksite at the cathedral. Firstly, some of the work conducted was included in the general condition reports as annexes. Examples include the radar auscultation of walls and vaults impacted by the fire (LERM laboratory), coring analysis of collapsed stones (LRMH laboratory), stratigraphic analysis of painted decorations (Parant Andalro restoration), and the deformation of vaults after the fire (MAP laboratory + Bestrema). The ongoing work involves a close collaboration with various scientific project working groups.
Through joint meetings and workshops, one of the objectives is to consider a structuring of their data (which constitutes another axis of work) and to annotate these data on the platform. This new step goes beyond the mere annotation of analyzed data and involves working together in real-time, incorporating the various actors in the process, so that the organization and representation of the source data are as accurate and useful as possible for their work. Such projects began in the second half of 2022, covering various topics, such as stratigraphic analysis of the cathedral's facades or the location of metallic elements. Finally, other information and sources, such as general data regarding the restoration site monitored in the SafeWorks platform designed by OSMOS, will be added.

Collaborative framework
To identify the sources of annotations, several user accounts were established on the platform. A general account denoted as "NDP", representing the entire cathedral, was created. The various entities, such as laboratories, enterprises, or professionals (architects, restorers, ...), are represented by dedicated accounts, with each of them annotating the information they produced. The NDP account can share any project it has created with the accounts of the laboratories or architects who conducted studies related to that project. Presently, there are seven accounts responsible for annotations: MOE (the architects in charge of the restoration), LRMH, LERM, Parant.restauration, SignesL, GTmetal, GT-num, GT-bois, GT-Décor. New accounts can be created as new actors join the production of data and contribute to this work. The annotations within the projects are organized into groups and layers. Users own their groups and layers, with each group representing a field of work (e.g., architectural condition report, radar analysis, stratigraphic analysis, etc.), and the layers separating this information by zone or type (e.g., within the architectural condition report group: masonry, arches, painted decor, future interventions, etc.). The number of layers and groups is not restricted, and users can create as many as needed. This multi-user organization of the projects allows for a multidisciplinary approach to the 3D object.
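The organization into accounts, groups and layers can be pictured as a small hierarchy. The sketch below is an illustrative data model only: the class and field names are assumptions and do not reflect the platform's internal schema, and the project and annotation names are invented placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Layer:
    """A layer separates annotations by zone or type (e.g. masonry, arches)."""
    name: str
    annotations: List[str] = field(default_factory=list)


@dataclass
class Group:
    """A group represents a field of work and belongs to one account."""
    name: str
    owner: str
    layers: List[Layer] = field(default_factory=list)


@dataclass
class Project:
    name: str
    groups: List[Group] = field(default_factory=list)

    def annotations_per_owner(self) -> Dict[str, int]:
        """Count annotations by owning account across all groups."""
        counts: Dict[str, int] = {}
        for group in self.groups:
            total = sum(len(layer.annotations) for layer in group.layers)
            counts[group.owner] = counts.get(group.owner, 0) + total
        return counts


# Placeholder content, not taken from the actual projects.
project = Project(
    name="example chapel project",
    groups=[
        Group("architectural condition report", "MOE", [
            Layer("masonry", ["annotation A", "annotation B"]),
            Layer("painted decor", ["annotation C"]),
        ]),
        Group("radar analysis", "LERM", [Layer("vaults", ["annotation D"])]),
    ],
)
counts = project.annotations_per_owner()
print(counts)
```

Keeping the owner at group level mirrors the description above, where each account annotates the information it produced while sharing the same project.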

2D/3D annotation
Once the images, point cloud, and sources of information are gathered, the next step is to link these resources with additional information. The 2D/3D hybrid annotation process in aïoli enables the tracing of a contour on a selected image where the subject of observation is clearly visible. The projective relation then interprets the orientation and calibration data for the digital images, converting a 3D point to a 2D pixel position for the selected image. This process allows for correspondences between the 2D and 3D environments. For the presented work, the annotation method is interactive (based on manual selection), as the observations such as form, location, and sometimes color are already specified in the written and illustrated condition report. The aim is to integrate already identified observations rather than creating new observations since we are not professionals in the multiple fields of work present. However, the possibility of utilizing such methods to generate a new layer of information regarding the architectural components and segmentation is currently being experimented with in this context. Through careful observation and examination of the base documentary material, observations are annotated, striving to replicate them as accurately as possible on the platform, with appropriate contour drawing and color. In more challenging projects, which may date back several decades or occur in a specific context, the source and location of physical or chemical analyses sometimes lack precision. Investigative work may be carried out by cross-referencing data such as text descriptions with location plans to deduce a possible location and nomenclature of the observed item. Currently, after three years of work, the digital condition report contains over 280 groups, 635 layers, and 9,000 annotations across the seven annotation accounts.
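The projective relation mentioned above can be sketched under simplifying assumptions: an ideal pinhole camera with no lens distortion, described by a world-to-camera rotation, a camera center, a focal length in pixels and a principal point. The function names are hypothetical and the camera values are invented; aïoli relies on the orientation and calibration computed by the photogrammetric pipeline, which is richer than this model. The second function tests whether a projected pixel lies inside a hand-drawn contour, using the even-odd ray casting rule.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project_point(
    point: Vec3,
    rotation: Sequence[Sequence[float]],  # 3x3 world-to-camera rotation
    center: Vec3,                         # camera center in world coordinates
    focal_px: float,
    principal_point: Pixel,
) -> Optional[Pixel]:
    """Ideal pinhole projection; returns None for points behind the camera."""
    # Express the point in the camera frame: X_cam = R * (X_world - C).
    d = [point[i] - center[i] for i in range(3)]
    x_cam, y_cam, z_cam = (
        sum(rotation[r][i] * d[i] for i in range(3)) for r in range(3)
    )
    if z_cam <= 0:
        return None
    return (
        focal_px * x_cam / z_cam + principal_point[0],
        focal_px * y_cam / z_cam + principal_point[1],
    )


def inside_contour(pixel: Pixel, contour: Sequence[Pixel]) -> bool:
    """Even-odd ray casting test against a closed polygonal contour."""
    x, y = pixel
    inside = False
    n = len(contour)
    for i in range(n):
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point((0.5, 0.25, 2.0), identity, (0.0, 0.0, 0.0), 1000.0, (960.0, 540.0))
contour = [(1200.0, 650.0), (1220.0, 650.0), (1220.0, 680.0), (1200.0, 680.0)]
print(pixel, inside_contour(pixel, contour))
```

Applying the contour test to the projection of every cloud point yields the subset of 3D points covered by the drawn region.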

Description sheets
To add further documentation and additional information to the annotations, description sheets can be added with various types defined, such as text, note, decimal, term from thesaurus, hyperlink, boolean, or date, with the field name left free. In this work, the description sheets were mainly text or note fields, with the approach varying depending on the types of layers and disciplines observed. For the architect's condition report, the addition of fields allows for the creation of a caption to explain the chosen color or observed alteration, as well as providing additional information on the observations, such as the likely causes of the present alterations, special precautions to be taken, or restorations considered. These fields also specify the necessary information to understand the source of the annotation, such as the report or work from which it is extracted and the date. The MOE owner's projects typically have a first field "comment" representing the caption, a second field called "general" including written information added in the report, and a field called "source of the annotation." Regarding the work based on research laboratory studies and the established structuring of their data, the objective was to consider the fields of the description sheets like columns of a spreadsheet to integrate precise information into each field, such as a sampling date, report date, author, location, or detailed stratigraphic analysis by layer. As a consequence, the description sheets are more numerous but contain less text. In any case, the structuring and nomenclature for the sheets, in terms of field titles and content, is crucial to create continuity between annotations, layers, and even projects, facilitating data analysis and query construction in the future.
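Treating description sheet fields like spreadsheet columns suggests checking each sheet against a declared set of typed fields. The following sketch assumes a hypothetical in-memory representation (a dictionary per sheet and a schema mapping field names to type names); the type names echo those listed above, but the mapping to Python types and the field names in the example are illustrative choices, not the platform's data model.

```python
from datetime import date
from typing import Any, Dict, List

# Assumed mapping from sheet field types to Python types.
FIELD_TYPES = {
    "text": str,
    "note": str,
    "decimal": float,
    "thesaurus_term": str,
    "hyperlink": str,
    "boolean": bool,
    "date": date,
}


def validate_sheet(schema: Dict[str, str], sheet: Dict[str, Any]) -> List[str]:
    """List the discrepancies between a sheet and its expected fields."""
    problems = []
    for field_name, type_name in schema.items():
        if field_name not in sheet:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(sheet[field_name], FIELD_TYPES[type_name]):
            problems.append(f"wrong type for field: {field_name}")
    for field_name in sheet:
        if field_name not in schema:
            problems.append(f"unexpected field: {field_name}")
    return problems


# Hypothetical schema for a laboratory sampling sheet.
lab_schema = {"sampling_date": "date", "author": "text", "location": "text"}

# Placeholder sheet: the date was typed as free text and the location is absent.
sheet = {"sampling_date": "2021", "author": "example author"}
problems = validate_sheet(lab_schema, sheet)
print(problems)
```

A check of this kind is one way to keep field titles and content consistent across annotations, layers and projects, which the text identifies as a precondition for later queries.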

Attachments
In addition to the description sheets, the platform allows for the attachment of additional information linked to specific annotations. These attachments can be of various types, including images, videos, audio sources, text or PDF reports, and are intended to provide users with access to the original source data, thereby enhancing the knowledge already available in the platform. Final users (having access only to the platform web viewer) cannot download this information, as it is meant for consultation purposes only.
In this project, selected items from the architectural condition report were added as attachments, such as historical photographs, documentary photographs, specific maps, or charts. In some cases, specific pages were extracted from the reports and linked to provide more context than the image itself. In the case of scientific analysis, the entire report was sometimes linked since the information could be dispersed throughout the document.
A future feature could enable users to link an entire report and redirect them to specific pages, allowing them to have a clear vision of the base information for the specific annotation while also enabling them to explore the report.

DATA ANALYSIS AND FUTURE WORKS
The nomenclature and data structuring used in the projects to designate annotations, layers, groups and description sheets is a key factor in making the collected data exploitable.
At present, queries allow us to extract numerical data, by searching for a specific term or set of terms within the titles of annotations or description sheets. For example, we can find the term "diag" 3438 times in the names for the annotations, or the word "fissure" (meaning "crack") 218 times. Similarly, searching for comments or projected interventions can shed light on the extent and proportion of proposed actions as well as the recurrence of certain observations. For instance, the mention "à remplacer" (meaning "to be replaced") appears 2540 times in the description sheets, of which it is associated 2202 times with the word "pierre" (meaning "stone"). Similarly, the term "remplacer" (meaning "replace") is associated 156 times with "Pign" (meaning "gable wall") or 53 times with "NF" (meaning "nave"), therefore providing us with information on the location and object for the proposed action. The different elements and their relations, such as annotations, groups, layers, description sheets, alignment with thesaurus, can be visualized in graphs. This analysis of data and constructed annotations has not yet truly begun, as the project creation and annotation process is still ongoing. More complex queries could be considered by combining analytical data with spatial, dimensional, or morphological data. Moreover, with the projective relation in aïoli, every pixel in the image set is directly connected to the data within the image, such as color and gradient, as well as to the data of the point cloud, including 3D point coordinates and normals. As a result, this projective relation opens up possibilities for interaction between images and point clouds. Likewise, the annotations, which are expected to exceed 10,000 in number, can serve as a distinctive database for experiments identifying phenomena of alteration using deep learning methods.
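The term and co-occurrence counts described above amount to simple text queries. The sketch below shows one possible form of such a query on a list of description sheet texts; the function name is hypothetical, the sample texts are invented placeholders rather than actual annotations, and the real queries run against the platform's data, not a Python list.

```python
from typing import Iterable


def count_matches(texts: Iterable[str], *terms: str) -> int:
    """Count the texts that contain every given term, ignoring case."""
    lowered_terms = [term.casefold() for term in terms]
    return sum(
        1
        for text in texts
        if all(term in text.casefold() for term in lowered_terms)
    )


# Invented placeholder texts, for illustration only.
sheets = [
    "Pierre à remplacer",
    "Joint à reprendre",
    "pierre fissurée, à remplacer",
    "Enduit à remplacer",
]

to_replace = count_matches(sheets, "à remplacer")
stone_to_replace = count_matches(sheets, "à remplacer", "pierre")
print(to_replace, stone_to_replace)
```

Passing several terms gives the co-occurrence count, in the same spirit as associating "à remplacer" with "pierre" in the figures reported above.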

LIMITS
Working with a tool that is still under development in a research laboratory entails certain limitations, including the absence of useful features and the presence of bugs that may require workarounds. In particular, the lack of a finalized thesaurus integration function has led to the use of free-form fields for description sheets. This requires careful attention to ensure consistency in the way elements are named, both in terms of word choice and text formatting, to enable later analysis and interpretation of the data. Furthermore, this annotation method is limited in its ability to integrate elements whose locations are unknown. Semantic enrichment is based on the annotation of specific positions within the aïoli projects and the 3D referential of the cathedral. As a result, elements that have been removed from their environment or analyzed with less precision regarding their positioning may be challenging to include. Moreover, the construction of this work is contingent upon the acquisition of source data for a given area. The absence of acquisition data may render project creation impossible, despite the vast collection of available images. Additionally, certain areas of interest in the building, hidden or inaccessible, may not have been photographed yet. However, these spaces will be photographed in the future, enabling the construction of new projects and annotations. It is essential to approach this task with a long-term perspective.
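The consistency issue raised by free-form fields can be partly monitored with a normalization step that groups labels differing only in case, accents or spacing. The sketch below is one possible approach, with hypothetical function names and invented sample labels; stripping accents is a deliberate simplification that may merge words that are distinct in French, so the groups it reports are candidates for review rather than confirmed duplicates.

```python
import re
import unicodedata
from collections import defaultdict
from typing import Dict, Iterable, List


def normalize_label(label: str) -> str:
    """Fold case, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return re.sub(r"\s+", " ", without_accents).strip().casefold()


def find_variants(labels: Iterable[str]) -> Dict[str, List[str]]:
    """Group the labels that normalize to the same key; keep groups of 2+."""
    groups = defaultdict(set)
    for label in labels:
        groups[normalize_label(label)].add(label)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


# Invented sample labels, for illustration only.
labels = ["À remplacer", "a remplacer", "à  remplacer ", "fissure"]
variants = find_variants(labels)
print(variants)
```

Running such a check periodically on field titles and values could flag diverging spellings early, before they complicate later queries.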

CONCLUSION
This article describes an approach to create a comprehensive digital report of Notre-Dame de Paris after the fire, through the photogrammetric acquisition of the cathedral and the gathering of analyses and condition reports from professionals. The photogrammetric point clouds are built using MetashapePro or the aïoli software, with the aid of markers to ensure a spatial reference system consistent with an overall topographic reference. 2D/3D annotations are added in the aïoli platform and semantically enriched through description sheets (with customizable fields) and attachments. The ongoing work includes the integration of additional elements such as projects for other areas of the cathedral, new temporal slots (during the restoration and beyond), new thematic groups and layers for various fields of study, the description of the architectural morphology, and multidisciplinary data analysis and comparison.
The digital data working group in the scientific worksite on Notre-Dame de Paris will continue this work over the next few years.