Unsupervised weathering identification of grottoes sandstone via statistical features of acoustic emission signals and graph neural network

Weathering features of sandstone heritage can be recognized using artificial intelligence (AI)-based surrogate models, and most such models perform classification tasks based on precise labels. In many cases, however, validated prior knowledge of the weathering is lacking, or the historical data for complex weathering conditions are unlabeled. To this end, an unsupervised graph neural network (GNN) based on the statistical features of acoustic emission (AE) signals is constructed. First, taking unweathered sandstone as a reference, we define four weathering levels of sandstone, from I to IV, based on pore indicators. We select 11 statistical features that are highly correlated with the pore characteristics of the sandstone. The GNN is then constructed and trained on 2880 sets of statistical features of measured AE signals. Compared with autoencoder (AEs), local outlier factor (LOF), and isolation forest (IF) models, the GNN achieves the best identification performance on all four evaluation criteria. Each iteration of the GNN fits the feature information of the signals and their neighbors. With data dimensionality reduction techniques, once the GNN stops iterating, unweathered AE signals can be easily distinguished from weathered ones by comparing the reconstruction error of each signal. Furthermore, as the number of nearest neighbors k increases, the AUC of the GNN also increases and then stabilizes when k is in the range 50–100. When k is small, the hidden layers of the network aggregate less information about the neighborhood features of the signals and cannot clearly distinguish between unweathered and weathered signals. As the network becomes deeper, the feature values of the signals become increasingly similar, so their reconstruction errors in the output layer also become more similar, making it difficult to distinguish unweathered AE signals from weathered AE signals via the GNN. Meanwhile, the GNN uses more AE features and considers the similarity between features. This can greatly reduce the errors caused by wave velocity measurement and greatly improve the robustness of AE detection. Hence, the presented GNN model addresses the limitations of relying solely on P-wave velocity measurements to assess the degree of sandstone weathering at stone cultural heritage sites.


Introduction
Stone cultural heritage sites are found in many countries and hold notable historical and cultural significance [1]. Among these, grottoes represent a category that is not only the most complete and content-rich but also the most fragile [2][3][4]. Immovable cultural heritage such as grottoes is subject to the long-term influence of natural forces and human activities, with the weathered layer in the shallow surface of the sandstone being the most common and notable form of deterioration [5,6]. The weathering degree of large exposed stone heritage is an important indicator for its preservation and a significant parameter that should be determined before the heritage is restored [7][8][9].
Weathering diseases result from the long-term, coupled action of multiple factors, mainly freeze-thaw [10], chemical reactions from pollutants or precipitation [11,12], temperature [13], moisture [14], and the coupling of these factors. In general, weathering diseases can be classified into different types. Huang and colleagues [15] first classified weathering diseases into four types based on the weathering characteristics of sandstone heritage: powdery weathering, sheet-like weathering, strip-like weathering, and plate-like weathering. These weathering characteristics may be induced by physical, chemical, or biological factors. In general, however, the internal pore structure of sandstone changes with the degree of weathering (e.g., increases in the total bulk porosity and in the proportion of large pores). Many effective methods exist for evaluating the weathering degree of grotto heritage [16], including high-density electrical methods, spectroscopic techniques, hardness testers, and ultrasonic methods [17][18][19][20][21][22]. The indicators obtained by these non-destructive or micro-destructive methods can define the degree of weathering of the sandstone, but they cannot directly describe the properties of the sandstone itself (e.g., elastic modulus, pore structure) [23][24][25]. These properties are generally tested by cumbersome (non-portable) or destructive methods, such as the uniaxial compression test (UCT) for the elastic modulus, and nuclear magnetic resonance (NMR) or mercury intrusion porosimetry (MIP) for the pore parameters [26,27]. Such methods are generally difficult to implement in situ at grotto cultural heritage sites. They also require complete a priori knowledge, as well as manual involvement in feature extraction, which leads to low accuracy in weathering detection. The human errors that experts make in weathering recognition can accelerate the weathering process in building stones. Other factors can also lead to misjudgement. Taking the acoustic emission (AE) detection of stone cultural relics as an example, if the tests are not repeated often enough, the measured P-wave velocity values vary significantly. This variability may be attributed to the inherent heterogeneity of the sandstone itself and to potential inaccuracies in portable devices. Thus, universal approaches that minimize subjectivity must be developed. Moreover, a nondestructive and ready-to-use method is needed to determine the weathering degree of grotto sandstone.
In the past two decades, artificial intelligence (AI)-based surrogate models have been investigated in stone cultural heritage conservation for identifying disease features [28][29][30]. Bewes et al. [31] predicted sex from skeletal remains using a deep learning (DL) model; Cintas et al. [32] used a pottery profile classification method. Damage or disease features have also been assessed in limited components of historic monuments [33,34]. Hatır et al. [35][36][37][38] used vision-based DL methods to classify the types of deterioration with high accuracy. The majority of these DL methods are supervised approaches, and supervised identification of disease features inevitably requires labels. However, validated prior knowledge of the weathering is often lacking, or the historical data for complex in-situ weathering conditions are unlabeled. Unsupervised learning methods have been extensively studied and applied in many fields, and statistical analysis methods are used within them [e.g., restricted Boltzmann machines (RBM) [39], autoencoders (AEs) [40]]. In most cases, the similarity between samples or estimates of their probability density function are analyzed to obtain the intrinsic structure of the sample data, from which the samples are then correctly classified [41]. At present, a few unsupervised ML methods have also been applied to the identification of weathering or diseases in sandstone cultural heritage [42]. The results obtained by these methods often differ significantly from the true weathering labels, and they can identify only surface weathering, not the shallow weathering of sandstone cultural heritage. Detection signals that are more non-destructive, easy and ready to use, and inexpensive should be studied in combination with AI techniques.
To this end, sandstone from the Yungang Grottoes (Datong, China) is selected as a typical object, and a graph neural network (GNN) based on the statistical features of the AE signals is constructed. The GNN converts the Euclidean structured data into graph-like data, which allows AE features that were previously independent of each other to be related. GNNs have also been studied in other fields (e.g., damage detection in practical engineering, medicine, industrial outlier detection) [43][44][45].

Characteristics analysis for weathering signals of Yungang sandstone
The stratigraphic structure of the area where the Yungang Grottoes are located is relatively simple. It belongs to the upper Middle Jurassic formations of the Mesozoic era and the middle to upper Quaternary formations. These include the Middle Pleistocene (residual-alluvial), Upper Pleistocene (alluvial-pluvial), Holocene (alluvial-colluvial), and the Middle Jurassic Yungang Formation. The grotto area has a typical continental semi-arid climate; the average temperature of the coldest month, the average temperature of the hottest month, the annual average temperature, and the annual average precipitation are −13.3 °C, 23.1 °C, 9.1 °C, and 423.8 mm, respectively. The rock samples used in this study were taken from the exposed strata in the eastern part of the Yungang Grottoes area, which date back to the same period as the Yungang Grottoes. X-ray diffraction (XRD) analysis determined that the Yungang sandstone is a feldspathic quartz sandstone consisting of 62.15% quartz, 11.48% feldspar, and 4.76% calcite, along with illite, kaolinite, and other clay minerals. The sandstone samples were processed into cubes of 5 × 5 × 5 cm³, and samples with similar physical and mechanical properties, such as longitudinal wave (P-wave) velocity, porosity (φ), Leeb hardness, and apparent density, were obtained. Table 1 shows the basic properties of 39 selected unweathered sandstones. To obtain sandstone with different degrees of weathering, a freeze-thaw cycle test is used to accelerate sandstone weathering. The freeze-thaw cycle procedure is as follows: (a) All samples are dried to a constant weight in an oven at 105 °C, then placed in a sealed container to cool to room temperature. (b) With the exception of the unweathered sandstone in the control group (0 cycles), 39 samples undergo vacuum saturation. The specific steps are as follows: place the samples in a desiccator filled with distilled water (ensuring that the samples are fully submerged); seal the container and connect the vacuum pump to evacuate the air from the desiccator; after 4 h of evacuation, allow the samples to remain in distilled water at atmospheric pressure for an additional 4 h. (c) Considering that the minimum temperature in the Yungang area may reach −20 °C or below, the minimum temperature for the freeze-thaw cycle in our experiment is set to −20 °C and the thawing temperature to 20 °C, with the freezing and thawing phases lasting 4 h each. The freeze-thaw cycle for the saturated sandstone samples is as follows: place the samples in a refrigerator set to −20 °C and freeze for 4 h; then thaw them in distilled water for 4 h. Each 8-h period constitutes one cycle. After every 5 cycles, all samples are dried in an oven at 105 °C for 12 h and then allowed to cool to room temperature.
Many previous studies have used various identification methods to obtain weathering characteristics, such as X-ray fluorescence (XRF) to track changes in chemical elements, or XRD and petrographic methods to track changes in the relative content of major minerals. However, these methods can hardly reveal precise changes in pore characteristics, especially for weathering under freeze-thaw cycles. This section mainly discusses the relationship between the pore characteristics (porosity and pore size distribution) and the acoustic emission characteristics (P-wave velocity and frequency spectral characteristics); sandstone at different degrees of weathering can be characterized quickly and directly by these two indicators. Nuclear magnetic resonance (NMR) and an ultrasonic oscilloscope are used to detect the sandstone pores and the acoustic emission characteristics, respectively. The main experimental instruments are presented in Fig. 2.

Analysis of weathered sandstone filtered AE signals
In the vast majority of current research, the reduction in P-wave velocity (VP) is generally used to quickly evaluate the degree of weathering of sandstone heritage. This study also conducted rapid wave velocity identification on 39 samples. Due to the randomness of AE detection, the P-wave velocity data contain a large number of outliers. To eliminate sample and operational errors as far as possible, we conducted measurements along the three directions of the cubic sandstone sample: 16 points are measured in each direction, and 3 values are recorded at each point. We took the average as the P-wave velocity of each individual sample. Figure 3 shows the results for P-wave velocity and porosity. Because a large amount of human error arises when the P-wave velocity is measured, it should be noted that Fig. 4a shows the result after removing outliers. In Fig. 3a, the green box plots include all P-wave velocity data and the blue dots represent the average value at each freeze-thaw cycle count; similarly, in Fig. 3b, the blue box plots include all porosity data and the green dots represent the average value. It can be seen that, with increasing cycles, the P-wave velocity decreases and the porosity increases monotonically. Specifically, after 60 cycles, the average P-wave velocity has decreased by about 30% and the porosity has increased by about 40%. This result is similar to some previous studies on freeze-thaw weathering of Yungang sandstone [13]. At the same time, other physical and mechanical properties also change, such as decreases in static elastic modulus (or uniaxial compressive strength, UCS), surface hardness, and other indicators (as reported by Hong [5]). The weathering characteristics of sandstone can be described when these indicators are combined, but the microscopic changes inside the sandstone (e.g., pore structure, mineral composition) brought about by external weathering cannot be reflected by most of these macro indicators. Meanwhile, although there is a strong correlation between P-wave velocity and the total porosity of the sample, the internal microstructure of the sandstone (e.g., pore size distribution) can hardly be reflected. Hence, we will focus on other Figure 4b shows the normalized AE time series signals for 0, 20, 40, and 60 freeze-thaw cycles, and Fig. 4c shows the corresponding Fourier amplitude spectra. The displayed signals are the average signals of one sandstone sample, and the jumping time (blue area) used for calculating the wave velocity is also displayed in the time series data. It can be seen that, in addition to the amplitude and propagation time of the time series signal (as shown in Fig. 4b), there are also significant differences in the frequency-domain spectra. From Fig. 4c, it can be seen that, with increasing cycles, the dominant frequency of the signal (the frequency corresponding to the maximum amplitude) also decreases. The average dominant frequency at 0 cycles is 68.8 kHz. After 40 and 60

Pore characteristics of weathered sandstone and determination of weathering degrees
To explain the variation pattern of the AE signals, the NMR signals of saturated sandstone, which represent the pore size distribution characteristics, are analyzed correspondingly. The pore parameters were measured using the MacroMr12-150H-I nuclear magnetic resonance analyzer from NiuMag. Figure 5a is a schematic diagram of sandstone NMR identification; we adopt a layered identification method to observe the pore characteristics in the interior and at the shallow surface. Figure 5b and c show the NMR relaxation time (T2)-amplitude curves and the corresponding cumulative signal curves. The superscript "C" in the legend denotes the cumulant of the NMR signal. From the cumulant signals in Fig. 5b and c, it can be seen that the cumulant level at the center of the sandstone is lower than that at the edge, indicating that the porosity inside after weathering is slightly lower than that outside. However, the variation patterns inside and outside are basically consistent throughout the freeze-thaw cycles. For the pore size distribution, both the maximum and minimum values of T2 (with nonzero amplitude) increase with increasing cycles and the range of T2 also expands significantly, indicating that the pore size distribution of the sandstone shifts towards larger pores. The maximum T2 values of the sandstone for 0, 20, 40, and 60 freeze-thaw cycles are 109.7 ms, 126.1 ms, 155.23 ms, and 191.2 ms at the center, and 102.34 ms, 155.23 ms, 155.23 ms, and 219.6 ms at the edge, respectively. At the same time, the porosity occupied by larger pores also increases. From the NMR curves for 40 and 60 cycles, it can be seen that the NMR signals begin to increase with the number of cycles when T2 is within 3–5 ms, but the increases are not significant. When T2 is greater than 5 ms, the cumulant of the NMR signals increases significantly, and the increase stops at around 110 ms (marked with red lines). It is worth noting that the signal amplitudes of the NMR curves for 40 and 60 cycles increase within 1–100 ms, while the signal amplitudes below 1 ms decrease simultaneously (signals below 0.6 ms disappear), indicating that the pore size shifts of micropores and mesopores in Yungang sandstone may be synchronous.
Comparison with the AE signals shows that: (a) as the pore size increases, the dominant frequency of the AE signal decreases to varying degrees; (b) as the pore size shifts from small to large pores, the low-frequency bandwidth increases; (c) the larger the porosity occupied by larger pores, the higher the proportion of low-frequency energy in the AE signals. Specifically, we examine the relationship between two key pore parameters (porosity and pore size distribution) and the AE characteristics (especially the frequency characteristics). First, higher porosity typically reduces the propagation velocity of ultrasound in sandstone, because the medium within the pores (air or fluid; air in this case, as the sandstone is dry) has a lower acoustic impedance than the mineral components of the sandstone. When ultrasound propagates through rock with high porosity, attenuation is more pronounced and the wave velocity is reduced. Consequently, the intensity and frequency components of the received ultrasound signals are affected.
In rocks with high porosity, lower frequency components are more likely to be received, while higher frequency components, due to faster attenuation, are more difficult to detect.This leads to a shift of the dominant frequency from high to low.Secondly, non-uniform pore distribution, especially when the proportion of larger pores increases, can result in greater scattering and attenuation of the wave.In turn, it will lead to an increase in the frequency bandwidth of the received AE signals.This effect is more pronounced in low-frequency regions (in this work, it can be regarded as (40 kHz, 60 kHz]).
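The two frequency-domain quantities discussed above, the dominant frequency and the share of energy in the low-frequency band, can be illustrated with a short sketch. It assumes a uniformly sampled signal and a plain discrete Fourier transform; the function names, the 1 MHz sampling rate, and the synthetic two-tone signal are illustrative assumptions rather than the processing chain used in this study. Only the (40 kHz, 60 kHz] band is taken from the text.

```python
import cmath
import math

def amplitude_spectrum(signal, sample_rate_hz):
    """One-sided DFT amplitude spectrum of a real, uniformly sampled signal."""
    n = len(signal)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        freqs.append(k * sample_rate_hz / n)
        amps.append(abs(coeff))
    return freqs, amps

def dominant_frequency(freqs, amps):
    """Frequency at the maximum amplitude, ignoring the DC bin."""
    k = max(range(1, len(amps)), key=lambda i: amps[i])
    return freqs[k]

def band_energy_ratio(freqs, amps, low_hz=40e3, high_hz=60e3):
    """Share of spectral energy (DC excluded) in the band (low_hz, high_hz]."""
    total = sum(a * a for a in amps[1:])
    band = sum(a * a for f, a in zip(freqs, amps) if low_hz < f <= high_hz)
    return band / total if total else 0.0

# Synthetic stand-in for an AE record (assumption): a 50 kHz tone plus a
# weaker 80 kHz tone, sampled at 1 MHz so both fall on exact DFT bins.
fs = 1_000_000
sig = [math.sin(2 * math.pi * 50e3 * i / fs)
       + 0.3 * math.sin(2 * math.pi * 80e3 * i / fs) for i in range(200)]
freqs, amps = amplitude_spectrum(sig, fs)
f_dom = dominant_frequency(freqs, amps)
low_band_share = band_energy_ratio(freqs, amps)
```

For measured records, a windowed FFT from a numerical library would normally replace the direct DFT, which is used here only to keep the sketch dependency-free.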
Considering that the weathering status of sandstone can be reflected directly via NMR signals (pore characteristics), we define normalized weathering indicators related to the pore characteristics. (a) The ratio of the total bulk porosity before and after weathering is defined as the porosity ratio γN. (b) The maximum effective T2 value of weathered sandstone shifts significantly to the right (larger pore size), indicating a broadening of the pore size distribution. For this purpose, the ratio of the effective T2 interval (ΔT2) before and after sandstone weathering is defined as κNm, the maximum pore size ratio. Meanwhile, the geometric mean of T2 is a key parameter of the T2 spectra, which represents the overall pore distribution level of the sandstone [46]; a larger value indicates a larger average pore size. Thus, we define the ratio of the geometric mean of T2 before and after weathering as the average pore size ratio κNgm. (c) The NMR T2 spectra of weathered sandstone show a significant increase in amplitude within the large pore range, and the cumulative NMR signal is significantly greater than that of unweathered sandstone, indicating an increase in the proportion of pores within the large pore size range. For this purpose, the ratio of the cumulative NMR signal of weathered sandstone within a specific T2 range of interest is defined, which is called the porosity development ratio μN. In the definitions of the porosity ratio γN, maximum pore size ratio κNm, average pore size ratio κNgm, and porosity development ratio μN, T2initial and T2cut are the initial and cut-off effective relaxation times, corresponding to the minimum and maximum pore size signals; T2i is the i-th T2 component, A(T2i) is the corresponding amplitude at T2i, and At is the total signal amplitude. In addition, pores are generally classified by size into micropores (≤ 0.1 μm), mesopores (0.1–1 μm), and macropores (1–10 μm). In previous studies [47], the numerical relationship between pore size D (μm) and relaxation time T2 (ms) of Yungang sandstone was roughly D = 0.09T2. This article selects υ = (5, 110) ms as the range of T2 to describe the weathering process. During the mesopore stage, the pore size shift of Yungang sandstone is more pronounced [the pore size range is about (0.45, 9.90) μm], and this range can be used to characterize the change in pore size of Yungang sandstone from micropores to mesopores, serving as an important indicator for quantifying weathering; SN is the cumulant of the NMR signal in υ. To quantify the characteristics of weathered pores more conveniently, we define a weathering degree classification coefficient α based on the four normalized pore indicators, with weight coefficients C1, C2, C3, and C4. The total bulk porosity, average pore size, and pore development coefficient can directly quantify the pore size characteristics of sandstone, e.g., the proportion of porosity within a larger range of pore sizes. For this reason, C1, C3, and C4 are given the same weight, 0.3. The maximum relaxation time reflects only the maximum pore size but not the corresponding NMR signal height (i.e., the proportion of pores with the maximum pore size). Therefore, C2 is set to 0.1, the smallest weight. In fact, although the expansion trend of the pore size is reflected by this indicator, the maximum pore size corresponds to a small porosity, which can be ignored, so it cannot be used as a precise quantitative indicator. Table 2 presents the mean values of the normalized indicators for sandstone weathering based on pore characteristics under different cycles, as well as the calculated weathering degree classification coefficient α.
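The pore indicators can be sketched in code under explicit assumptions. The sketch takes each ratio as weathered over unweathered, takes ΔT2 as the span of T2 values with nonzero amplitude, uses the standard amplitude-weighted geometric mean of T2, and combines the four ratios as a weighted sum α = C1·γN + C2·κNm + C3·κNgm + C4·μN. The weights (0.3, 0.1, 0.3, 0.3), the band υ = (5, 110) ms, and D = 0.09T2 come from the text; the weighted-sum form, the function names, and the toy spectra are assumptions for illustration.

```python
import math

def t2_geometric_mean(t2_ms, amp):
    """Amplitude-weighted geometric mean of T2 (zero-amplitude bins drop out)."""
    total = sum(amp)
    return math.exp(sum(a * math.log(t) for t, a in zip(t2_ms, amp)) / total)

def effective_t2_interval(t2_ms, amp):
    """Span between the smallest and largest T2 with nonzero amplitude."""
    nonzero = [t for t, a in zip(t2_ms, amp) if a > 0]
    return max(nonzero) - min(nonzero)

def band_cumulant(t2_ms, amp, band=(5.0, 110.0)):
    """Cumulative NMR signal inside the open T2 band (in ms)."""
    return sum(a for t, a in zip(t2_ms, amp) if band[0] < t < band[1])

def weathering_coefficient(phi0, phi_w, spec0, spec_w,
                           weights=(0.3, 0.1, 0.3, 0.3)):
    """Return alpha and the four ratios (assumed weathered / unweathered).

    spec0 and spec_w are (t2_ms, amplitude) pairs of equal-length lists.
    The unweathered spectrum must have nonzero signal inside the band.
    """
    gamma = phi_w / phi0
    kappa_m = effective_t2_interval(*spec_w) / effective_t2_interval(*spec0)
    kappa_gm = t2_geometric_mean(*spec_w) / t2_geometric_mean(*spec0)
    mu = band_cumulant(*spec_w) / band_cumulant(*spec0)
    ratios = (gamma, kappa_m, kappa_gm, mu)
    alpha = sum(c * r for c, r in zip(weights, ratios))  # assumed weighted sum
    return alpha, ratios

def pore_diameter_um(t2_ms):
    """Approximate pore size from the relation D = 0.09 * T2 quoted above."""
    return 0.09 * t2_ms

# Toy spectra (made-up numbers, not measured data).
t2 = [0.5, 2.0, 10.0, 50.0, 100.0]
fresh = (t2, [1.0, 3.0, 4.0, 2.0, 0.0])
weathered = (t2, [0.0, 2.0, 5.0, 4.0, 1.0])
alpha_same, _ = weathering_coefficient(5.0, 5.0, fresh, fresh)
alpha_weathered, ratios_weathered = weathering_coefficient(5.0, 6.0, fresh, weathered)
```

Under these assumptions an unweathered sample compared with itself gives α = 1, because the weights sum to 1, and larger values indicate stronger weathering.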
It can be seen that, as the number of freeze-thaw cycles increases, the value of α also increases, indicating more severe sandstone weathering. In the experiment described above, we found that, except for the propagation time (i.e., VP), the AE signals of weathered sandstone within 10 cycles change only slightly and differ little from those of unweathered sandstone. Therefore, we assign sandstones within 10 cycles to the same weathering level, namely unweathered. The α of all samples is calculated and the samples are classified into four levels: I, II, III, and IV. The relationship between the weathering level and the range of α values is shown in Table 3. An interpolation method is also used to provide the value ranges of porosity, relaxation time range, geometric mean of relaxation time, and porosity development ratio for each level. These levels also serve as the true labels in the subsequent verification of the GNN model based on AE statistical features. In Sect. "Feature extraction of AE signals via statistical features", we will discuss the relationship between statistical features in the received AE signal. "mean", "max", "min", and "SD" denote the mean, maximum, minimum, and standard deviation, respectively; the subscript "υ" denotes the low-frequency band, namely (40 kHz, 60 kHz], where υu and υl are the upper and lower boundaries of υ, respectively; the subscript "cut" denotes the cut-off frequency, and ωcut is fixed at 100 kHz.
Figure 6 shows the Pearson correlations between the 11 AE statistical features and the pore indicators. The linear relationships of the jumping time and the frequency-domain statistical features with the pore indicators are stronger, while the correlations between the other time-domain statistical features and the pore indicators are relatively weaker. Nevertheless, the absolute values of the correlation coefficients of all selected statistical features are greater than 0.71 (except for the time indicator TI3), indicating a close relationship between them and the pore parameters. There are many other statistical features in the time-frequency domain, such as the maximum and minimum values, as well as other characteristics that indicate the aggregation and dispersion of the dominant frequency band of the AE signals. However, this article lists only the features that are highly correlated with sandstone weathering. By calculating the 11 indicators in the time and frequency domains for AE datasets with different degrees of weathering, we obtain an indicator set (also known as a feature set). In the next section, we use this set as the input (dataset X) of the GNN model for weathering identification.
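A correlation screen of this kind can be sketched as follows. The sketch computes the Pearson coefficient between every feature column and every pore indicator column; the feature and indicator names and all numbers are made-up placeholders, not values from this study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_table(features, indicators):
    """Map (feature name, indicator name) to the Pearson coefficient."""
    return {(f, p): pearson(fv, pv)
            for f, fv in features.items()
            for p, pv in indicators.items()}

# Hypothetical columns: one value per sample (toy numbers only).
features = {"feature_a": [4.0, 3.0, 2.0, 1.0],
            "feature_b": [1.0, 2.0, 2.5, 4.0]}
indicators = {"indicator_p": [1.0, 2.0, 3.0, 4.0]}
table = correlation_table(features, indicators)

# Keep features whose absolute correlation passes a chosen threshold.
selected = sorted(f for (f, p), r in table.items() if abs(r) > 0.71)
```

The 0.71 threshold mirrors the value quoted above; in practice the cut-off is a modelling choice.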

A graph neural network for weathering identification of sandstone grottoes
Since the weathering characteristics of grottoes sandstone based on AE data are not obvious, traditional weathering identification techniques struggle to accurately identify weathering without the assistance of additional detection methods.To address this issue, a graph neural network utilizing AE data is proposed for weathering evaluation.The model consists of three main processes: graph construction, the graph neural network, and weathering identification.

Construct graph
Considering that normal Euclidean structured data (e.g., time series AE data, frequency domain data, etc.) are independent of each other and lack any inherent connection relationships, it becomes necessary to convert these unconnected relationships into connected ones through graph construction.In graph-like structured data, connections do exist.By constructing a graph, the original unconnected data can be transformed into a connected relationship.Generally, graph construction mainly considers the similarity between samples, where higher similarity implies a greater weight in the connections between samples [48].
Graph construction consists of calculating the similarities between samples in the AE data, assigning weights, and outputting the graph (i.e., the adjacency matrix). It can be summarized in three steps. First, Sim(Xi, Xj) is defined to measure the similarity between signals, where Xi, Xj ∈ X and Xi ≠ Xj [Sim(Xi, Xi) = Sim(Xj, Xj) = 1]. The closer Sim(Xi, Xj) is to 1, the more similar the two signals are. Commonly used similarity measures include Euclidean distance and cosine similarity.
Second, the k signals that are most similar to Xi are selected as its neighbors, forming the neighbor set of Xi, denoted Nk(Xi). The higher the similarity between Xj and Xi, the higher the weight on the edge connecting them. The weight on the edge connecting Xj to Xi is calculated using Eq. (4). Finally, each signal in dataset X is taken as a vertex, and the vertices are combined with the connection relations between them to form the directed graph, which is represented by the adjacency matrix A [Eq. (5)]. The diagonal values of A are set to 1, indicating that each subsample is connected to itself. In this way, the feature information of the subsample itself is effectively prevented from being lost during the training of the subsequent model. The flowchart of graph construction is shown in Fig. 7.
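The three steps can be sketched as below. Cosine similarity is used as one of the commonly mentioned measures, and each edge weight is assumed to be the neighbor's similarity divided by the sum of similarities over the k neighbors; this weighting is an illustrative assumption, since the exact weight formula is not reproduced here. The function names and the toy feature vectors are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def build_adjacency(X, k):
    """Directed k-nearest-neighbor graph as a dense adjacency matrix.

    Row i holds the weights of the edges from the k most similar signals
    to signal i; the diagonal is set to 1 (self-connection).
    Assumes positive similarities, e.g. non-negative feature vectors.
    """
    n = len(X)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sims = [(cosine_similarity(X[i], X[j]), j) for j in range(n) if j != i]
        sims.sort(reverse=True)          # most similar first
        neighbors = sims[:k]
        total = sum(s for s, _ in neighbors)
        for s, j in neighbors:
            A[i][j] = s / total          # assumed normalized weight
        A[i][i] = 1.0
    return A

# Toy feature vectors: two signals near one axis, two near the other.
X_toy = [[1.0, 0.1], [1.0, 0.2], [0.1, 1.0], [0.2, 1.0]]
A1 = build_adjacency(X_toy, k=1)
A2 = build_adjacency(X_toy, k=2)
```

A dense matrix is enough for a sketch; with thousands of signals and k of 50 to 100, a sparse representation would be the more natural choice.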

Graph neural network
Euclidean data can be used as input to traditional neural network structures such as convolutional neural networks (CNNs) and artificial neural networks (ANNs). However, these types of neural networks cannot handle non-Euclidean structured data, such as graph-like structured data. Therefore, a graph neural network (GNN) is introduced. A recurrent neural structure is used to propagate neighbor information until a stable equilibrium point is reached, allowing the representation of the target node to be learned. This process facilitates subsequent weathering identification. After feature extraction by the GNN, each graph node contains not only its own information but also the feature information of its neighbors [49]. A simple form of the forward model is adopted, in which X denotes the data set of features, the adjacency matrix A is constructed from X, W denotes the weight matrix between layers, and h_ac denotes the activation function. The ReLU function is used as the activation function to mitigate vanishing gradients: when the input value is greater than 0, the derivative of the ReLU function is always 1. The weights W(0), W(1) and biases b(0), b(1) are trained using gradient descent. The loss function L of the graph neural network is computed from Z, the output of the GNN. In this work, we perform batch gradient descent using the full dataset in every training iteration. The GNN is a feed-forward multilayer graph neural network with one hidden layer positioned between the input and output layers. At each layer of the network, the GNN incorporates the graph structure, and signals propagate their feature values to each other based on their connection relationships. The dataset X is reconstructed by the graph neural network into a matrix Z, in which each signal contains not only information about itself but also information about its neighbors.
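A forward pass of this kind can be sketched as follows. The sketch assumes the common two-layer form Z = A · ReLU(A X W(0) + b(0)) · W(1) + b(1), with a linear output layer and an unnormalized adjacency matrix, and it takes the loss as the mean squared error between X and Z. These choices, and all function names, are assumptions for illustration; training by gradient descent is not shown.

```python
def matmul(P, Q):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def add_bias(M, b):
    """Add the bias vector b to every row of M."""
    return [[m + bj for m, bj in zip(row, b)] for row in M]

def relu(M):
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in M]

def gnn_forward(A, X, W0, b0, W1, b1):
    """One hidden graph layer followed by a linear graph output layer."""
    H = relu(add_bias(matmul(matmul(A, X), W0), b0))   # hidden representation
    Z = add_bias(matmul(matmul(A, H), W1), b1)         # reconstruction of X
    return Z

def mse_loss(X, Z):
    """Mean squared reconstruction error over all signals and features."""
    n, d = len(X), len(X[0])
    return sum((x - z) ** 2
               for rx, rz in zip(X, Z) for x, z in zip(rx, rz)) / (n * d)

# Sanity check with identity matrices: the input is reproduced exactly.
I2 = [[1.0, 0.0], [0.0, 1.0]]
X_demo = [[1.0, 2.0], [3.0, 4.0]]
Z_demo = gnn_forward(I2, X_demo, I2, [0.0, 0.0], I2, [0.0, 0.0])
loss_demo = mse_loss(X_demo, Z_demo)

# With a graph that mixes the two signals, the reconstruction changes.
A_mix = [[1.0, 1.0], [1.0, 1.0]]
Z_mix = gnn_forward(A_mix, X_demo, I2, [0.0, 0.0], I2, [0.0, 0.0])
```

Multiplying by A before each weight matrix is what lets a node's reconstruction depend on its neighbors' features as well as its own.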
The training process of GNN is shown in Fig. 8.
In the training stage, the GNN learns the features of Xi and its neighbors and reconstructs them in the output layer so as to minimize the mean squared error between X' and Z. Since the majority of signals in the dataset are from unweathered sandstone, the GNN reconstructs unweathered signals more easily in order to minimize the reconstruction error of the overall dataset.

Weathering identification
The weathering factor of the i-th data record, WFi, is a measure of the probability that the sandstone is weathered. WFi is defined as the average reconstruction error over all features. The WF is evaluated for all data using the trained GNN. After the WF of each signal is calculated and sorted, the top-n signals with the largest weathering factors are output as weathered sandstone. By jointly learning the features of the sandstone signal and its neighbors, and then reconstructing them in the output layer of the GNN, weathered sandstone signals that are mixed into the unweathered sandstone region or lie around the dense region exhibit a larger WF and can thus be detected by the GNN. The process of weathering identification is illustrated in Fig. 9.
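The scoring and ranking step can be sketched as below. The per-signal error is assumed to be the squared difference between input and reconstruction, averaged over the features, in line with the mean squared error used for training; the function names and the toy matrices are illustrative assumptions.

```python
def weathering_factors(X, Z):
    """Per-signal reconstruction error averaged over all features."""
    return [sum((x - z) ** 2 for x, z in zip(rx, rz)) / len(rx)
            for rx, rz in zip(X, Z)]

def top_n_weathered(wf, n):
    """Indices of the n signals with the largest weathering factor."""
    order = sorted(range(len(wf)), key=lambda i: wf[i], reverse=True)
    return order[:n]

# Toy input features and a toy reconstruction (made-up numbers).
X_in = [[1.0, 1.0], [2.0, 2.0], [5.0, 5.0]]
Z_out = [[1.0, 1.0], [2.0, 2.2], [3.0, 3.0]]
wf = weathering_factors(X_in, Z_out)
flagged = top_n_weathered(wf, 1)
```

Here the third signal is reconstructed worst and is therefore the one flagged; choosing n is equivalent to choosing a threshold on the sorted WF values.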
In Sect. "Case study for weathering identification of Sandstone grottoes", we conduct a case study on the performance of the GNN based on the architecture presented in this section, including different datasets, comparisons with different algorithms, the influence of parameters, and the generalization ability to new data.

Evaluation methods
There are 2880 measured AE signals in total: 2160 for unweathered sandstone (all level I) and 720 for weathered sandstone (240 each for levels II–IV). To investigate the recognition performance of the GNN on AE data in which different degrees of weathering are mixed, we set up three cases with different weathering combinations. The datasets of the three cases used for the GNN model are shown in Table 4.
Meanwhile, multiple performance criteria are needed to evaluate the GNN in identifying sandstone weathering. The receiver operating characteristic (ROC) curve reflects the balance between sensitivity and specificity and is widely used as a performance evaluation criterion for classifiers in ML classification tasks. Many previous studies have suggested preferring models with a larger area under the ROC curve (AUC value). Therefore, in combination with other frequently used criteria, we choose AUC, ACC (accuracy), DR (identification rate), and FAR (false alarm rate) as performance evaluation criteria for the models. Higher AUC, ACC, and DR values and a lower FAR indicate better model performance. The calculation formulas for the evaluation criteria are as follows:

$$\mathrm{AUC} = \frac{S - n_w (n_w + 1)/2}{n_w\, n_f}, \qquad S = \sum_{i=1}^{n_w} r_i$$

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{DR} = \frac{TP}{TP + FN}, \qquad \mathrm{FAR} = \frac{FP}{TN + FP}$$

where n_w is the actual number of weathered sandstone signals and n_f is the actual number of unweathered sandstone signals; S is the rank sum of the actual weathered signals, where r_i is the rank of the i-th weathered sandstone signal; TP is the number of weathered sandstone signals correctly labeled as "weathered" by the algorithm; TN is the number of unweathered sandstone signals correctly labeled as "unweathered"; FP is the number of unweathered sandstone signals incorrectly labeled as "weathered"; and FN is the number of weathered sandstone signals incorrectly labeled as "unweathered".
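A minimal sketch of these criteria is given below, assuming the standard rank-sum form of the AUC, with signals ranked in ascending order of their score and tied scores sharing an average rank. The tie handling and the function names are assumptions for illustration; labels use 1 for weathered and 0 for unweathered.

```python
def rank_sum_auc(scores, labels):
    """AUC from the rank sum S of weathered signals (ties get average ranks)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1            # average 1-based rank of the tie group
        for t in range(pos, end + 1):
            ranks[order[t]] = avg
        pos = end + 1
    n_w = sum(labels)                         # number of weathered signals
    n_f = len(labels) - n_w                   # number of unweathered signals
    s = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (s - n_w * (n_w + 1) / 2) / (n_w * n_f)

def confusion_metrics(pred, labels):
    """Return (ACC, DR, FAR) from binary predictions and true labels."""
    tp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 1)
    acc = (tp + tn) / len(labels)
    dr = tp / (tp + fn)
    far = fp / (tn + fp)
    return acc, dr, far
```

Here `scores` would be the weathering factors of all signals, and `pred` the binary labels obtained by marking the top-n signals as weathered.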

Performance and comparative analyses
To verify the effectiveness of the GNN in evaluating multiple weathering conditions, we selected three types of state-of-the-art unsupervised identification models based on statistical signal characteristics for comparison with the proposed GNN: (a) neural network-based autoencoders (AEs); (b) the local outlier factor (LOF); and (c) the isolation forest (IF). Because the parameter settings differ for each type of model, the settings of the GNN and the three comparison models are summarized in Table 5.
The GNN is compared with AEs, LOF and IF. For each model and each weathering case, we adjust the parameters, run the model 50 times, and take the best result as the final performance. The criteria values of the different models are shown in Fig. 10. The GNN achieves the best identification performance on all four evaluation criteria: AUC, ACC, DR, and FAR. This demonstrates that the GNN is effective in identifying weathered signals in the dataset, including those whose statistical features are difficult to detect with other models. Essentially, AEs and LOF can be regarded as degenerate forms of the GNN, as can be seen from the internal parameters of the models in Table 5. Compared with AEs, each iteration of the GNN fits not only the feature information of the signals themselves but also that of their neighbors. Additionally, the execution time of a model is a crucial metric, since slow models are difficult to use at scale in field applications. Table 6 shows the actual execution time of the four algorithms. The GNN is slightly slower than AEs but faster than LOF and IF. The extra cost relative to AEs is due to the additional graph construction required by the GNN for the same number of network layers.
In addition, dimensionality reduction techniques can be used to visualize input features during machine learning training. t-distributed stochastic neighbor embedding (t-SNE) is a visualization model that maps high-dimensional data to a low-dimensional space such that data points that are separated in the high-dimensional space remain separated in the low-dimensional space. Furthermore, by mapping the data into a two- or three-dimensional space, t-SNE can indicate whether the dataset has good separability, that is, whether distances within homogeneous groups are small and distances between heterogeneous groups are large. The visualization results of the original dataset and of the dataset identified by the GNN are shown in Fig. 11. We also display the original labels (defined in Table 3) and the unlabeled situation of cases A-C in the dimensionality-reduced data. In Fig. 11a, some data of weathered sandstone are mixed with the data of unweathered sandstone. This usually occurs because the AE features of weathered sandstone are not obvious, which makes them difficult to identify manually or with other algorithms. In Fig. 11c, the data of weathered sandstone deviate more from those of unweathered sandstone. The main reason for this difference is that the GNN aggregates the neighboring AE statistical feature values in the hidden layers, which changes the distribution pattern of the data and enables weathered-sandstone features originally hidden in the unweathered region to be detected. Further comparison with Fig. 11b and d, which contain the true labels, shows that the data of the other three levels of weathered sandstone also gradually separate, indicating that the GNN can roughly classify the weathering level of sandstone from the statistical characteristics of the AE signals.

Parameters analysis and cross validation
We also investigate the influence of the number of nearest neighbors k and the number of layers on the performance of the GNN. The results are shown in Fig. 12. From Fig. 12a, we can see that as the number of nearest neighbors k increases, the AUC of the GNN gradually increases and then tends to stabilize. When k is small, the hidden layers of the network aggregate less information about the neighborhood features of the signals and cannot clearly distinguish between unweathered and weathered signals. Further results can be seen in Fig. 12b. To ensure that the GNN is blind to the samples when performing weathering identification, we used random sampling and randomly ordered the samples. From the results in Fig. 13, it can be seen that the GNN model still achieves good identification results with a small number of samples. The average AUC values on the three datasets are 0.992, 0.989 and 0.982, respectively, showing that the GNN can identify the AE signals of weathered sandstone effectively.

Fig. 11 Feature visualization of the dataset before and after identification by GNN

Identification performance verification of GNN via new weathered sandstone
To further investigate the identification performance of the GNN on weathered sandstone, we selected three 3 × 3 × 3 cm³ sandstone samples with different degrees of weathering (as shown in Fig. 14), collected from the same location and with the same initial lithology as the 5 × 5 × 5 cm³ samples. We also conducted weathering experiments using freeze-thaw cycles and named the samples TS1, TS2, and TS3, with porosities of 7.68%, 9.25%, and 10.03%, respectively. NMR tests of the three samples were also carried out, and Fig. 15 shows the NMR T2 spectra. As shown in Fig. 15, there is no significant increase in porosity in the interval below 5 ms, where the porosity of unweathered sandstone is even higher than that of weathered sandstone. When the relaxation time is between 5 and 100 ms (i.e., the pore size is between 0.5 μm and 10 μm), we believe that the sandstone has undergone varying degrees of weathering in this interval. However, the 3 × 3 × 3 cm³ samples show more detailed signals within a relaxation time of 0.01-1 ms, whereas the 5 × 5 × 5 cm³ samples have missing signals between 0.01 and 0.5 ms. This may be due to differences in NMR testing accuracy caused by sample size. A relaxation time of 0.01-1 ms corresponds to relatively small pores, ranging from 0.001 μm to 0.1 μm. In general, the trend of pore characteristic changes after weathering of the 3 × 3 × 3 cm³ samples is basically consistent with that of the 5 × 5 × 5 cm³ samples. Meanwhile, the AE signals of the three samples were collected, and Fig. 16 shows the normalized AE signals of each sample, taken as the mean of 6 measurements. It should be noted that we randomly selected 6 points on the 3 sandstone samples for testing. It can be observed that although the dominant frequencies of the three sandstones vary only slightly, samples with higher porosity exhibit a broader low-frequency bandwidth.
The measured P-wave velocities are listed in Table 7, and the identification results obtained by inputting the AE time-frequency domain statistical features into the GNN model are shown in Table 8. From Table 7, it is evident that measurement randomness prevents stable P-wave velocities from being obtained, even on the same sandstone sample, so that sandstone with high porosity can show a higher P-wave velocity than sandstone with low porosity. Moreover, the mean values of the three samples do not differ significantly. This variability may be attributed to the inherent heterogeneity of the sandstone, potential inaccuracies of the portable AE device, and human error. Nevertheless, the GNN correctly identifies the weathering state of the three sandstone samples (see Table 8), even though their degrees of weathering are not significantly different. The GNN adopts more AE features and considers the similarity between features, which can greatly reduce the errors caused by P-wave velocity measurement and improve the robustness of AE detection.
When conducting AE detection on laboratory samples, the degree of weathering of each sample is known, so outliers of P-wave velocity caused by the factors mentioned above can easily be eliminated. At a stone cultural heritage site, however, the degree of weathering is unknown, and measured outliers cannot easily be eliminated. This reveals a limitation of many previous studies on weathering assessment of sandstone cultural heritage, which rely solely on P-wave velocity measurements (a single indicator) to assess the degree of sandstone weathering. Such measurements can often be inaccurate because of the variability of discrete data points, leading to potential misjudgements of the weathering degree. GNN-based AE detection for weathering identification can be used in situ at stone cultural heritage sites with complex and uncertain weathering conditions, because the model defines the weathering degrees based only on the pore characteristics of weathered sandstone. Since its operation requires only a portable AE device, large-scale on-site weathering identification can be achieved. Finally, a simpler and more practical operating procedure for GNN-based AE detection is summarized: (a) obtain the original AE signals and calculate the P-wave velocities; (b) determine the degree of dispersion of the P-wave velocities and, if the dispersion within the same detection area is large, calculate the AE statistical features; (c) input the obtained AE statistical features into the GNN and determine whether the sandstone heritage has weathered.
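Step (b) of the procedure above can be sketched as a simple dispersion check. The text does not specify how dispersion is measured or what counts as "large", so the coefficient of variation and the threshold value used here are purely hypothetical choices for illustration.

```python
import statistics

def velocity_dispersion(velocities):
    """Coefficient of variation of P-wave velocities in one detection area."""
    return statistics.stdev(velocities) / statistics.mean(velocities)

def needs_gnn_check(velocities, cv_threshold=0.05):
    """True if the dispersion is large enough to warrant GNN identification.

    cv_threshold is a hypothetical value, not taken from the source.
    """
    return velocity_dispersion(velocities) > cv_threshold
```

For example, `needs_gnn_check([2500.0, 3000.0, 3500.0])` flags the area for feature extraction and GNN identification, whereas three identical readings do not.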

Conclusion
Sandstone from the Yungang Grottoes (Datong, China) is selected as a typical object, and a graph neural network (GNN) based on the statistical features of the AE signals is constructed. Unsupervised weathering identification is realized based on the AE statistical features of Yungang sandstone. The main conclusions are as follows: (a) Taking unweathered sandstone as a reference, a linear interpolation method was used to define 4 weathering levels of sandstone, ranging from I to IV, based on four pore indicators (porosity ratio γ_N, maximum pore size ratio κ_Nm, average pore size ratio κ_Ngm and porosity development ratio μ_N). The absolute values of the correlation coefficients of all selected statistical features are greater than 0.71 (except for the time indicator TI3), indicating a close relationship between them and the pore parameters. The GNN can directly identify weathered sandstone based on the features of AE signals. However, the model still has an obvious limitation: it cannot directly determine specific weathering metrics, such as porosity. For other porous materials, such as sandstones from different regions or sandstones weathered in various environments, the model's effectiveness remains consistent and does not vary significantly. This consistency arises because the model is designed to distinguish the degree of sandstone weathering based solely on pore characteristics, without accounting for other complex factors such as variations in mineral content or chemical composition. This limitation is an inherent aspect of the model. In subsequent research, we will focus on training the model on data from other rock types, seeking more direct descriptions of weathering indicators, and linking them quantitatively with non-destructive testing.
, but has not yet been introduced in the field of conservation for stone cultural heritage. Hence, unsupervised weathering identification is realized based on the AE statistical features of Yungang sandstone. The other sections of this paper are arranged as follows: in Sect. "Characteristics analysis for weathering signals of Yungang sandstone", freeze-thaw cycling tests of 39 Yungang sandstone cubes (5 × 5 × 5 cm³) are carried out, and the AE signal and saturation NMR signal characteristics of the sandstone are obtained. The correlation between the AE statistical features and the NMR signal characteristics is analyzed, and the true labels of pore characteristics are established; in Sect. "A graph neural network for weathering identification of Sandstone grottoes", the GNN based on the statistical features of the AE signals is constructed, and the architecture and training process of the network are introduced; in Sect. "Case study for weathering identification of Sandstone grottoes", state-of-the-art outlier identification models based on signal characteristics are selected for performance comparison with the proposed GNN, and parameter analysis and cross-validation are carried out to examine the factors influencing performance and the robustness of the model. The overall technical flowchart of this work is shown in Fig. 1.

Fig. 1 Overall technical flowchart for this work

pore characteristics of sandstone to analyze the representative characteristics of acoustic emission and NMR signals, and attempt to establish a more detailed connection between AE signals and pore indicators. The originally received AE signal is the time series of the impulse signal after internal filtering by the sandstone. The schematic diagram of P-wave propagation in sandstone is shown in Fig. 4a. The impulse excitation emitted by the ultrasonic transducer is a 5-period sinusoidal signal with a Gaussian window function; the sandstone-filtered unstable signal then arrives at the receiver terminal and also includes the propagation time inside the sandstone. The propagation time can be judged from the jump point in the filtered signal. Since there is no receiving probe inside or across the sandstone, the total average information inside the sandstone is reflected in this time series. The acoustic emission detection was performed using the Pundit 200 AE oscilloscope manufactured by Proceq. The emission frequency and impulse voltage of the AE transducer are fixed at 54 kHz and 50 V to ensure that the energy emitted by each transducer is constant. Before analysis, all time series signals should be normalized. Meanwhile, considering that the frequency of AE signals penetrating sandstone is mainly within (0, 100] kHz, a band-pass filter with a frequency band of (40, 80] kHz can be set to filter out noise from other frequency components.
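The excitation and normalization described above can be sketched as follows: a 5-period, 54 kHz sinusoid shaped by a Gaussian window, followed by peak normalization. The sampling rate, the window width (one sixth of the burst duration), and the function names are assumptions for illustration; the band-pass filtering step is not included.

```python
import math

def gaussian_tone_burst(freq_hz=54e3, cycles=5, fs_hz=2e6):
    """Gaussian-windowed sinusoid; fs_hz and the window width are assumed."""
    duration = cycles / freq_hz
    n = round(duration * fs_hz)
    center = duration / 2
    sigma = duration / 6     # assumed: burst spans roughly +/- 3 sigma
    signal = []
    for i in range(n):
        t = i / fs_hz
        window = math.exp(-0.5 * ((t - center) / sigma) ** 2)
        signal.append(window * math.sin(2 * math.pi * freq_hz * t))
    return signal

def normalize(signal):
    """Scale a time series so that its largest absolute value is 1."""
    peak = max(abs(v) for v in signal)
    return [v / peak for v in signal]
```

The same `normalize` step would be applied to each received time series before statistical features are extracted.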

Fig. 3 Results of P-wave velocity and porosity in weathered sandstones

Fig. 4 Normalized time- and frequency-domain AE signals of weathered sandstones

Fig. 5 NMR T2 spectra and their cumulative signals of weathered sandstones

Fig. 6 Correlation of AE and NMR based normalized index in weathered sandstones

Fig. 10 The criteria values of different algorithms

As shown in Fig. 12b, as the depth of the network increases, the feature values of the signals become more and more similar, and so do their reconstruction errors in the output layer of the network. This makes it difficult to distinguish unweathered signals from weathered signals and decreases the performance of the GNN. Furthermore, K-fold cross-validation can detect whether a model is overfitting. To analyze the effectiveness of the proposed GNN model, we performed five-fold cross-validation on each of the 3 datasets. Specifically, 432 AE signals of unweathered sandstone from each dataset and 48, 96, and 144 AE signals of weathered sandstone from datasets A, B, and C, respectively, are selected each time, and the performance of the proposed model is measured by the AUC value. The results are shown in Fig. 13.
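The fold sizes quoted above follow from splitting each class evenly into five parts. A minimal sketch of such a stratified split is shown below; the shuffling seed and the function name are assumptions, and the source does not state how the folds were actually drawn beyond random sampling.

```python
import random

def stratified_folds(labels, n_folds=5, seed=0):
    """Split sample indices into folds with equal class counts per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)                      # random order within each class
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)    # deal indices round-robin
    return folds
```

With 2160 unweathered and 240 weathered signals, each of the five folds contains 432 unweathered and 48 weathered signals; with 480 or 720 weathered signals, the folds contain 96 or 144 of them.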

Fig. 12 The influence of the number of nearest neighbors k and layers on GNN

Fig. 15 NMR T2 spectra and porosity of weathered sandstones (TS1, TS2 and TS3)

(b) The GNN based on the statistical features of the AE signals is constructed and trained with 2880 sets of measured AE statistical features. Compared with AEs, LOF and IF, the GNN achieves the best identification performance on the four evaluation criteria (AUC, ACC, DR and FAR). Compared with AEs, each iteration of the GNN fits not only the feature information of the signals themselves but also that of their neighbors. At the same time, since unweathered AE signals are more similar to each other, their reconstruction errors tend to be similar in each iteration of the network. Hence, when the GNN stops iterating, unweathered AE signals can easily be distinguished from weathered AE signals by comparing the reconstruction error of each signal. Overall, the GNN combines the advantages of AEs and LOF while also improving the training speed of the model, making it well suited to fast in situ data training. (c) Furthermore, as the number of nearest neighbors k increases, the AUC of the GNN gradually increases and then tends to stabilize when k is between 50 and 100. When k is small, the hidden layers of the network aggregate less information about the neighborhood features of the signals and cannot clearly distinguish between unweathered and weathered signals. As the depth of the network increases, the feature values of the signals become more and more similar, and so do their reconstruction errors in the output layer of the network, which makes it difficult to distinguish unweathered signals from weathered signals and decreases the performance of the GNN. (d) The GNN adopts more AE features and considers the similarity between features. This can greatly reduce the errors caused by P-wave velocity measurement and improve the robustness of AE detection. Hence, the presented GNN model addresses the limitation of relying solely on P-wave velocity measurements to assess the degree of sandstone weathering at stone cultural heritage sites.

Table 1
Physical and mechanical properties of 39 unweathered Yungang sandstone

Table 2
Normalized pore characteristics and weathering degrees classification coefficient

The discrete Euclidean signals (e.g., the original AE signals and their Fourier transforms) of heritage can be enhanced by extracting time-frequency domain statistical features. In this article, we selected a total of 11 time-frequency domain statistical features that are strongly correlated with the porosity indexes, including 5 time series indicators (TI) and 6 frequency domain indicators (FI). For the time series signal, statistical features such as the root mean square (RMS) value (TI2), peak-to-peak value (TI3), skewness (TI4), and kurtosis (TI5) of the time series are selected. In addition, it can be seen from Sect. "Analysis of weathered sandstone […] where x(n) is the discrete time-domain series in the original dataset, and F(ω) is the corresponding Fourier transform; ω is the frequency of the AE signals, and N is the number of data points; "Stationary" represents the amount of data corresponding to a stationary process with zero mean, corresponding to the propagation time
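The four named time series indicators can be computed as in the sketch below. The population (biased) moment definitions of skewness and kurtosis are assumptions, since the exact formulas are not reproduced in this text; the function name and dictionary keys are hypothetical.

```python
import math

def time_indicators(x):
    """RMS, peak-to-peak, skewness and kurtosis of a discrete time series."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n   # fourth central moment
    return {
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "peak_to_peak": max(x) - min(x),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }
```

The series must not be constant, since the skewness and kurtosis are undefined when the second central moment is zero.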

Table 3
Defined weathering levels

Table 5
Parameter settings

Table 6
Actual execution time of four algorithms (seconds)

Table 8
GNN identification results