VALIDATION OF FULL-RESOLUTION DINSAR-DERIVED VERTICAL DISPLACEMENT IN CULTURAL HERITAGE MONITORING: INTEGRATION WITH GEODETIC LEVELLING MEASUREMENTS

: Towards revealing the potential of satellite Synthetic Aperture Radar (SAR) Interferometry (InSAR) for efficient detection and monitoring of Cultural Heritage (CH) encouraging resilient built CH, this study is devoted to the validation of InSAR-derived vertical displacements with a full-resolution perspective taking advantage of high-precision geodetic levelling measurements. Considering the Cathedral of Como, northern Italy, as the case study, two different Persistent Scatterer Interferometry (PSI) techniques have been applied to Cosmo-SkyMed high-resolution SAR images acquired in both ascending and descending orbit tacks within the time interval of 2010-2012. Besides using the simplified approach for obtaining the vertical displacement velocity from Line of Sight (LOS) velocity, a weighted, localized, multi-track Vertical Displacement Extraction (VDE) approach is proposed and evaluated, which uses the technical outcome of Differential InSAR (DInSAR) and spatial information. The results, using a proper PSI technique, showed that the accuracy level of extracted vertical displacement velocities in a full-resolution application is ca. 0.6 [mm/year] with a dense concentration of InSAR-Levelling absolute errors lower than 0.3 [mm/year] which are reliable and reasonable levels based on the employed validation framework in this study. Also, the weighted localized VDE can significantly decrease the InSAR-Levelling errors, adding to the reliability of the InSAR application for CH monitoring and condition assessment in practice.


INTRODUCTION
Built heritage and archaeological sites are prone to tangible and intangible damages caused by progressive ground deformation due to earthquakes, anthropogenic/natural subsidence, landslide, etc. The high physical and systemic vulnerability against this geohazard (Themistocleous et al., 2016) and the unique socioeconomic values (Rudokas et al., 2019) of these cultural assets particularly demand proper frameworks for assessment and monitoring purposes. In this context, the framework requires a high level of interdisciplinarity and reliability for better identification of the current condition and evolutionary processes to support the planning of conservation and mitigation measures towards resilient Cultural Heritage (CH).
Among the Structural Health Monitoring (SHM) techniques for measuring deformations, performing on-site measurements is characterized by being expensive, time-consuming, and demanding in terms of necessary expert operators (Yang et al., 2022;Scaioni et al., 2018), which leads to low operational frequency. The limited number of measurement points (e.g. in the case of optical levelling measurements and LVDT sensors) and low level of precision (e.g. in the case of Terrestrial Laser Scanning (TLS) (Wojtkowska et al., 2021)) are other disadvantages of these techniques.
On the other hand, Synthetic Aperture Radar (SAR) Interferometry (InSAR), as a powerful remote sensing technology, has been proven to be able to partially, but effectively, overcome these challenges in different scales of application. Persistent Scatterer InSAR (PS-InSAR) (Ferretti et al., 2000(Ferretti et al., , 2001 was the first pioneering technique introduced in the concept of Persistent Scatterer (PS) Interferometry (PSI), which was followed by further developments, such as Interferometric Point Target Analysis (IPTA) (Werner et al., 2003), PS Pairs (PSP) (Costantini et al., 2008), and Quasi-PS-InSAR (namely partially coherent targets) (Perissin and Wang, 2012). The availability of Multi-Temporal (MT) SAR images, acquired in relatively short time spans and from slightly different satellite positions, makes these techniques able to estimate useful information (such as height relative to the ground, deformation trend, and seasonal displacements) corresponding to the sensorfriendly targets on the earth surface through Differential InSAR (DInSAR). For a comprehensive and complete review, refer to (Crosetto et al., 2016;Ho Tong Minh et al., 2020).
Among the broad field of InSAR applications, Archeological and Cultural Heritage (ACH) and historic areas are involved in a special class of application which has increasingly gained the attention of researchers in the last decade. The low to medium resolution (C-band) SAR images (such as ERS1/2 (Tapete et al., 2012;Eskandari and Scaioni, 2023), Envisat (Tapete et al., 2013), and Sentinel-1 (Tzouvaras et al., 2019;Moise et al., 2021), etc.) are reported to have the capability of deformation detection from large-scale historic sites to local-scale of multiple buildings (in some cases, few measure points on a single building with a large plan (Karila et al., 2013)). This limited number of target points may represent the overall phenomenon progressing over the heritage of interest, therefore, they may be sufficient for monitoring, decision-making, and taking actions (Tapete and Cigna, 2017b). However, the satellite missions such as Cosmo-SkyMed (CSK) and TerraSAR-X launched by the European Space Agency (ESA) are providing high and very high-resolution SAR images (X-band), practically from the early 2010s, which have opened the doors to monitoring and condition assessment of built CH with a high level of detail. Here, a brief review of the recent relevant applications is provided.
In the case of using CSK, Cavalagli et al. have established the relation between the InSAR-derived information and the outcomes of on-site monitoring systems implemented on some built heritage of Gubbio, Italy (Cavalagli et al., 2019;Cavalagli et al., 2021). Bozzano et al. have used the deformation history of the PSs detected on a historic Monument in Rome, Italy, to retrieve a comprehension of the deformations and damages to the asset, supported by geological data (Bozzano et al., 2020). By considering two case studies of CH, Berto et al. have evaluated the reliability of PS information in this class of application (Berto et al., 2021). Tapete et al. have fully investigated the CSK capabilities and applications developed and implemented from 2017 to 2019, specifically for the detection and assessment of archaeological and heritage sites, using SAR images acquired in both StripMap and Spotlight image modes (3m and 1m spatial resolution, respectively) (Tapete and Cigna, 2019). The recent exploitation of TerraSAR-X images integrated with other techniques for monitoring CH can be seen in (Chen et al., 2021;Chen et al., 2022). For a complete review of the primitive applications of space-borne InSAR for ACH, interested readers may refer to (Tapete and Cigna, 2017a;Chen et al., 2017;Lasaponara and Masini, 2020).
According to the literature, there is always the need for further investigation into the reliability and precision of information obtained by InSAR. Furthermore, addressing how to overcome the encountered challenges and evaluating the further potential of InSAR for monitoring ACH play an important role in encouraging the transition of satellite InSAR monitoring from research to practice. In this letter, using different PSI algorithms implemented in the SARPROZ software package (Perissin et al., 2011), the reliability of InSAR-detected vertical displacements (properly extracted using simplified and proposed vertical component extraction techniques) over Cathedral of Como, Northern Italy, is evaluated through integration with the precise levelling measurements. The results particularly contribute to the performance assessment of fully-exploited InSAR using highresolution SAR images for detection and monitoring of cultural heritage, by assessing the modified framework for validation purposes and determining an accuracy level for this type of deformation monitoring.

Cathedral of Como, Italy
The Cathedral of Como (in Italian: Duomo di Como) placed at the historic centre of Como, northern Italy, is considered as the case study of this work. Como town located adjacent to the Lake of Como is known to be subjected to various rates of natural/anthropogenic subsidence (or uplift) which has been evaluated and monitored through optical levelling operations in the past and satellite InSAR in recent years, to shed light on the triggering factors of the phenomenon (Nappo et al., 2020), the corresponding effects on the built historic area (Nappo et al., 2021), and hydro-geotechnical settings of the area (Bajni et al., 2019) via multidisciplinary approaches. Also, through a statistical approach, the PSI measurements using the archive of medium-resolution ERS 1/2 SAR images have been validated using the precise historic vertical displacement rates collected from the urban-scale levelling network of the Como area (Eskandari and Scaioni, 2023).

Raw Data: InSAR and Levelling
To perform the multitemporal InSAR analysis, SAR stacks of Single-Look Complex (SLC) products acquired by Cosmo-SkyMed (CSK) satellite mission in the Stripmap HIMAGE mode (spatial resolution of 3 meters) orbiting in both tracks (ascending and descending) are used, and the corresponding details are listed in Table 1. The external data implemented for the DInSAR processing include DTM 5x5 2015 extracted from Geoportal of Lombardy Region (as the Digital Elevation Model (DEM) for topographic removal) and the mean daily temperature corresponding to scene acquisition (for estimation of thermal fluctuations).
According to the archive of levelling measurements obtained within the network of Como Cathedral, available at the repository of the Survey Department of Politecnico di Milano, highprecision levelling surveys were started in November 1985 on 46 benchmarks inside and outside the cathedral, which were followed by installation and measuring more benchmarks in the nearby area. With almost annual campaigns, the surveys have been carried out until May 2012. Due to the practical availability of CSK SAR images from 2010 over this area and, at the same time, the aim of this study, only the measurements acquired in the last three levelling campaigns (12 Apr 2010, 18 Apr 2011, 09 May 2012) have been employed to determine the vertical deformation rate at the benchmarks at which the measurement has been performed in all these three surveys (Figure 1(a)).

MT-DInSAR processing
In the context of DInSAR, one of the most challenging components which acts like an obstacle in the proper estimation of useful information is the atmospheric artefacts. One of the effective ways to remove this component is to use the PSP technique in order to, first, estimate and then remove the atmospheric phase screen (APS) from the whole scene. The alternative approach, which is implemented in this study, to tackle this issue is to limit the scene under study by focusing on the small portion containing the interested targets instead of processing the whole scene. The basic assumption is that the atmospheric delay is spatially uniform over this selected area, therefore, the differential phase between any arbitrary PS and reference point does not include the atmospheric component. This "Small Area" idea can significantly decrease the processing cost and can be effective when a single CH building and nearby surrounding site are concerned. However, it is important to check the validity of the assumption by controlling the outcomes.
Selecting a set of pixels as Persistent Scatterer Candidates (PSC), which are characterized by high quality in terms of being a good target for SAR sensors, is an initial step in a standard PS-InSAR. Firstly, it is done by neglecting the pixels which are detected as sidelobes around the pixels with a strong signal called Local Maxima (LM). Then, among these selected LMs, a second criterion is applied to sort out the best candidates, and this criterion can be chosen based on the value of parameters such as Amplitude Dispersion , Amplitude Stability Index , and/or Spatial Coherence ɣ . Through this cancelling-out procedure which aims at decreasing the processing time and obtaining more reliable information, several points with the potential of representing valuable information will be neglected.
Hence, in this study, a full-resolution perspective (considering all the pixels in the scene) has been considered to determine the potential of these commonly-neglected pixels in terms of estimated information. Here, due to utilizing the Small Area approach, the computational expenses are hammered, and thanks to Temporal Coherence ɣ (as one of the most crucial outcomes of MT-PS processing indicating the integrity and quality of estimated information at the corresponding pixel), the unreliable measurement points can be detected and then cancelled out. This can significantly increase the spatial distribution of points with reliable information in the matter of the real height of the targets (which is beneficial for 3D reconstruction of CH), seasonal fluctuations (showing the effect of climate change all over the CH), and displacement velocity.
Another important aspect is a proper selection of the image connection graph by which the SAR images are connected to establish the interferometric pairs in an MT approach, and each connection is characterized by a mean interferometric coherence ɣ . A standard PS-InSAR analysis works with a single master configuration (namely STAR graph), while other approaches such as Small-Baseline Subset (SBAS) (Berardino et al., 2002;Lanari et al., 2004) use redundant multiple connections (and sometimes full graph of connections in the case of QPS). Both configurations, besides other possible connection graphs, are available in the SARPROZ software package. Here, considering the full-resolution perspective, two processing techniques have been performed:


: Using a single master configuration with a STAR graph, and the master has been selected in the way of optimizing the values of the perpendicular (normal) and temporal baselines.  : limiting the values of ɣ , , and in a full graph configuration; and then, weighting the remained connections with mean ɣ calculated in a × window centred at the pixel under processing.
The latter is mainly used in the case of Distributed Scatterer (DS) processing (such as QPS), through which the phase time series of weak pixels, being placed in a neighbourhood of similar pixels, is enhanced utilizing different techniques, such as spatial phase filtering or phase model estimation (Even and Schulz, 2018). This approach is commonly employed for non-urban areas with non-complex surfaces and particular applications such as landslide detection and monitoring. However, in the case of built ACH with particular geometric complexity, using this procedure may lead to undesirable results. Thus, the phase enhancement in is not performed in this work and the phase time series are directly called from the unfiltered, spatially wrapped, and weighted interferograms for MT-PS Processing (similarly as ).

Vertical Displacement Extraction
Regarding precise levelling data, the time series obtained by the last three campaigns were already the vertical displacement of the benchmarks concerning the reference point located at the front entrance of the building. In order to obtain the annual vertical displacement rate ( ), here, a Least-Squares (LS) linear curve fitting is performed. On the other hand, the displacement velocities obtained by DInSAR are in the direction of Line Of Sight (LOS) corresponding to the viewing geometry and look angle (with respect to the nadir) of satellite imagery. Different studies have been carried out to extract the 3D displacement component of the sensed ground target from LOS Velocity ( ) (Pepe and Calò, 2017). One branch uses external displacement information, such as GNSS (Susaki et al., 2020) and GPS (Polcari et al., 2016). On the other hand, the data-driven approaches are devoted to the extraction of Up-Down ( ), East-West ( ), and North-South ( ) components in a multi-track and multi-angle approach (using both ascending and descending from different satellites) considering their dependency on the viewing geometry and orbital parameters (Dai et al., 2015). An important issue in this branch is that most of the well-known satellites carrying SAR sensors are orbiting in a sunsynchronized near-polar track, which means that the azimuth direction is nearly parallel to the N-S direction, therefore, extracting is not relevant using data-driven approaches, and in this study, this component is neglected. Defining the projection trigonometry as , = cos − sin cos where and are the incident angle and satellite heading angle (with respect to the north), respectively, only two points with different satellite parameters is enough to retrieve the unknown displacement rates. In Equation (1), depending on whether ascending or descending data stacks from on satellite mission are used, corresponds to ascending or descending.
Due to the full-resolution perspective considered in this study, besides the meter-level-accuracy estimation of the height and rough positioning of the data points, a localized approach is implemented here through which a 3D portion is considered and all the data points falling within this space are involved in the extraction of displacement components. Therefore, when using ascending and descending data stacks acquired by constellations of a single satellite mission, for an arbitrary 3D portion Ω including and InSAR-derived data points from ascending and descending datasets, respectively, the matrix version of Equation (1) Each equation in this over-determinate linear system corresponds to a data point, which is characterized by temporal coherence ɣ showing the quality of the information estimation. Furthermore, despite the complex geometry of the building under study, it is assumed that in a localized space, the neighbouring data points falling within the Ω make a planar surface. Hence, the absolute distance ∆ of each data point from the fitted planar surface in the Ω is used as an identifier for spatial integrity of the data points. Here, a diagonal weight matrix composed of the contributions from ɣ and ∆ for an arbitrary Ω is utilized: where is the identity matrix, ⊙ is the matrix-to-vector Hadamard (elementwise) product, and ≥ 0 is the influence power of the temporal coherence in the weighting procedure. Assuming that Equation (2) is expressed as . = , the LS method can be used to solve the weighted over-determinate linear system of and the deformation components of the interested 3D portion are retrieved. In the case that there are less than 4 data points within Ω , the component related to ∆ will be equal to 1, since the minimum points required for LS planar surface fitting is 4 points. The effectiveness of this weighting procedure will be discussed in Subsection 4.2.
The procedure described above is suitable for Ω within which there are at least one data point from each ascending and descending dataset. A simplified method can be adopted to retrieve the vertical component of the LOS velocity which assumes that the E-W component is negligible, as well as N-S contribution. Therefore, for an arbitrary Ω , from Equation (1), it can be concluded that = , / cos (5) This simplification of LOS projection is successfully used by different researchers (Bayramov et al., 2021) and has been also adopted in this study.

Validation Stages
The benchmarks for levelling operations have been placed on the ground level. Due to the viewing characteristics of satellite SAR imagery and the height of the buildings in the area of study, most of the levelling benchmarks and the corresponding nearby neighbourhood close to ground level are not simultaneously covered by both ascending and descending tracks. Hence, after merging ascending and descending InSAR-derived datasets, two major stages for the validation process are employed: i) using simplified cosine correction of LOS velocity (Equation (5)) for the data points falling within Ω defined by a circular buffer of 5-meter geodesic radius cantered at levelling benchmark and a height interval of [199][200][201][202][203][204][205][206][207][208] meters above sea level (asl) forming a cylindershaped space; then obtaining the weighted average of vertical velocity at each height sub-interval. Here, for weighting, the value of of each data point is multiplied by the corresponding ɣ where = 5 is chosen.
ii) Using the weighted multi-track displacement components extraction (Equation (4)) for the data points falling within Ω defined by a circular buffer of 5-meter geodesic radius XY-cantered at levelling benchmark and Zcentered at a series of height values starting from 199 m asl to the highest value of height over the area of study with a spacing step of 1 m. The height of each cylindershaped Ω is considered 1 m. It is important to note that the quantities of in this stage are derived by previously weighted equations.
In each stage, the value of for each Ω is compared to vertical the displacement rate of the corresponding levelling benchmark in terms of absolute error.

Preliminary Challenges, Solutions, and Results
After cropping and co-registration of both ascending and descending SAR images, the stacks have been manually geocoded, and then two techniques and have been performed. From an InSAR processing point of view, the main encountered challenges and the implemented solutions in different steps are: 1) selection of Ground Control Point (GCP) and the corresponding pixel in manual geocoding, which may result in a wrong estimation of residual height (with respect to the DEM) as a crucial component in DInSAR context. It was seen that choosing a GCP close to the building of interest may result in a more accurate estimation of information; 2) selection of Reference Point (RP) for MT PS processing, which may not affect the results where differential displacements are concerned. However, in cases such as comparison with the measurements obtained by other technologies (like this study), a wrong selection of RP can lead to very high values of the difference between the two techniques. Therefore, the closest InSAR data point to the RP of levelling measurements (Benchmark 1 in Figure 1(a)) is not necessarily the best option for RP of MT PS Processing (Purple point in Figure 1(b) and (c)); 3) validity evaluation of choosing Small Area hypothesis, which is done by checking if there are high-quality InSAR data points with high values of ɣ far from the selected RP. If so, basically, the atmospheric artefacts have not affected the differential phases between RP and the interested data point, which demonstrates the correctness of spatial uniformity of APS. The threshold of Temporal Coherence ɣ ≥ 0.65 is selected for all the InSAR results hereafter, meaning that the data points characterized by lower values of ɣ are neglected. The results of are shown in Figure 1 (c), which are similar to the results of in this level of interpretation. Figure 1(b) shows the results of a standard PS analysis after applying the Local Maxima criterion for PSC selection (neglecting sidelobes). Comparing (b) and (c) proves the significant enhancement in spatial density of data points which is beneficial for 3D reconstruction of ACH and point cloud generation and processing. Recalling, once again, the necessity of evaluation of the information estimation quality through DInSAR, the deformation trend (velocity in mm per year) parameter obtained by different techniques is discussed in the next subsection.

Validation outcomes
Following the strategy developed in subsection 3.3, the outcome of Validation stage (i) is shown in Figure 2 (a) and (b) respectively for and . Here, the points which could be chosen as Local Maxima (LM) are excluded from the fullresolution analysis, and Sidelobes (SL) and ML points have been assessed separately. As mentioned previously, the Ω blocks in this stage are Z-cantered at different heights (forming several subintervals) at each benchmark. The weighted average absolute errors at each Ω (excluding LM points) are shown with the times-sign-shaped data points, while the mean value of all the subinterval errors is shown with diamond blue shapes. The weighted average of ML absolute error is shown with the red circle.
At first glance, it is obvious that except for some rare violations, the errors are approximately limited to 0.6 [mm/year] for both employed DInSAR techniques. Inspecting the dense concentration of errors, from a statistical point of view, this approximate limit goes down to 0.3 and 0.5 [mm/year] for and , respectively. Furthermore, Figure 3, illustrating the correlation between the errors in both techniques, shows that the is more realistic than when processing the ML pixels (low number of red circles at the right side of the dashed grey line). Such a claim can barely be made for mean SL points based on Figure 3, since these types of data points have different deformation detection capabilities at different benchmarks. However, considering the expensive processing time for due to a higher number of interferometric pairs (sometimes 10 times more with respect to ) and calculation of spatial coherence for interferogram weighting through DInSAR processing, and also by considering the denser concentration of mean SL data points in Figure 2(a) close to zero, it can be concluded that is a better option with higher performance for exploiting the full-resolution potential of SAR images in this class of applications.
Since then, the validation outcomes have been concerning the simplified cosine correction of LOS velocity (Equation (5)). On ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume X-M-1-2023 29th CIPA Symposium "Documenting, Understanding, Preserving Cultural Heritage: Humanities and Digital Technologies for Shaping the Future", 25-30 June 2023, Florence, Italy the performance of weighted localized Vertical Displacement Extraction (VDE) (Equation (4)), Figure 4 shows the absolute errors in the case of using VDE with and without weighting (taking advantage of ɣ and ∆ if available) in the benchmarks where this type of VDE is possible at one or multiple Ω blocks. Comparing hollow red circles with filled green points, weighting may increase the errors by a very small amount in a few instances, however, the several significant drops from very high values of the error to lower values of 0.3 [mm/year] through weighted VDE is remarkable. Similar to Figure 2(a), the dense concentration of errors using weighted VDE can be seen at values lower than 0.3 [mm/year] which reveals the high quality of the detected displacements. Most of the Ω are Z-centred at approximately 220 [m asl] at (or close) to the edges of the main roof (where the walls meet the first-level roof) approximately 20 meters upper than ground level. From benchmark 26 to 32 located on the southern side of the cathedral, almost all the Ω can be found on ground level or at the 220-meter level. The InSAR-derived vertical velocity close to benchmarks 47 and 49 at the northern side of the building, which could not be assessed through the Validation stage (i), are representing very low values of errors while obtained 20 m higher than ground level. It is important to note that the InSAR-detected vertical velocities at benchmarks 54 and 57 installed on the ground level of interior columns holding the dome of the cathedral are estimated in Ω located on the dome at the height of ca. 243 [m asl] with high precision. The same for interior benchmarks 51 and 58 can be mentioned, where Ω are located on the main roof of the building. Comparing the Validation stage (i) and (ii) where the Ω blocks are Z-centred within the height interval of stage (i), it was seen that no correlation between absolute errors of the stage (i) and horizontal East-West velocities of the stage (ii) could be found, which means that the errors in simplified cosine correction of the stage (i) does not correspond to the presence of horizontal displacement.

CONCLUSIONS
In this work, Cosmo-SkyMed SAR images have been used to determine the vertical displacement rates of the Cathedral of Como, using different DInSAR algorithms and different vertical component extraction techniques. The performance of the algorithms and techniques in exploiting the potential of fullresolution perspective (using all the pixels in the SAR image) has been assessed by comparing the outcomes with high-precision levelling measurements acquired on the cathedral and nearby area. Different challenges and obstacles in the proper framework of InSAR analysis and validation are discussed in detail. As the key outcome, the results showed that using a single-master configuration for DInSAR processing ( algorithm) is more efficient in terms of higher accuracy and lower processing time. In this case, the simplified cosine correction of LOS velocities shows reasonable results, and the observed errors may not be affected by possible horizontal movements. Using both temporal coherence and the distance from the fitted planar surface in the localized block as the weight of the Localized Vertical Displacement Extraction procedure can significantly decrease the high values of error in such a framework of validation. Besides the significant contribution of this work to the reliability assessment of InSAR displacement monitoring of cultural heritage by defining an accuracy level, the other parameters (such as residual height, seasonal fluctuations, and geographical positioning) estimated through DInSAR processing and postprocessing are still required to be evaluated in such a reliable framework. This can be a prospective view for future study.