Estimation of Prediction Intervals for Performance Assessment of Buildings Using Machine Learning

This study utilizes artificial neural networks (ANNs) to estimate prediction intervals (PIs) for the seismic performance assessment of buildings subjected to long-period ground motion. To address uncertainty quantification in structural health monitoring (SHM), the quality-driven lower upper bound estimation (QD-LUBE) method, unlike traditional methods, has been adopted for probabilistic assessment of damage at both local and global levels. A distribution-free machine learning model has been adopted for enhanced reliability in quantifying uncertainty and ensuring robustness in post-earthquake probabilistic assessments and early warning systems. The distribution-free model quantifies uncertainty with higher accuracy than previous methods such as the bootstrap method. This research demonstrates the efficacy of the QD-LUBE method in complex seismic risk assessment scenarios, thereby contributing significant enhancements to building resilience and disaster management strategies. The study also validates its findings through fragility curve analysis, offering comprehensive insights into structural damage assessment and mitigation strategies.


Introduction
Uncertainty quantification (UQ) is an emerging domain, alongside artificial intelligence, for natural events where accurate prediction is extremely difficult. To compensate for this uncertainty, a considerable margin over and above the actual requirement against natural disasters is added in structural design, which results in a huge cost investment. Nominal design plans and point estimations carry insufficient information and are unable to address the uncertainty challenges [1]. The main causes of uncertainty include data mismatch, input uncertainty, and parameter uncertainty. Instead of relying on a point forecast value, incorporating an uncertainty margin such as the prediction interval (PI) can help make decisions more credible and reliable [2].
One specific example where the challenges of uncertainty are evident is the design of high-rise buildings, situated even in intra-plate regions, that face threats from long-period ground motions originating from distant earthquakes. The slow attenuation of long-period waves, coupled with potential amplification by soft soil sites, renders these structures susceptible to resonance-induced seismic damage. These vulnerabilities have been evidenced during seismic events such as the 1985 Michoacán earthquake [3], the 2011 Tohoku earthquake [4], and the 2015 Nepal earthquake [5], whereby high-rise buildings experienced excessive vibrations and severe damage to their non-structural components, notably in Mexico City and Tokyo. This underscores the critical importance of understanding and mitigating the impact of long-period ground motions on high-rise buildings, both structurally and functionally. Given the complex nature of long-period ground motions and the dearth of dependable seismic records, accurate prediction of the structural response of high-rise buildings remains a challenge without real-time monitoring systems supported by sensors and a robust communication infrastructure. The deployment of an early warning system (EWS), as discussed in [6], becomes imperative, with access to reliable building data, to ensure decisions are based on the certainty of both the data and the model to mitigate casualties and losses.
The increasing frequency and intensity of earthquakes worldwide highlight the urgent need for advanced methodologies in seismic analysis and SHM. While droughts and floods have historically accounted for significant casualties, the rise in seismic activity, particularly in densely populated urban areas with vertical housing and rapid urbanization, has emerged as a primary concern. Records since the early 1900s indicate a consistent occurrence of major earthquakes, with an average of 16 significant events annually, including at least one of magnitude 8.0 or greater. The United States Geological Survey (USGS) reports approximately 20,000 earthquakes globally each year, an average of 55 per day [7]. In the past 40-50 years, USGS records show that, on average, major earthquakes occurred more than a dozen times every year. Notably, in 2011 alone, 23 major earthquakes of magnitude 7.0 or higher occurred, surpassing the long-term annual average. In other years, the total was well below the annual long-term average of 16 major earthquakes; the lowest-ranking year is 1989 with only 6 major earthquakes, followed by 1988 with 7. Table 1 shows the top-ranked earthquake countries. These seismic events pose severe threats to structures, particularly those situated near fault lines and seismic zones, resulting in substantial casualties and property losses amounting to tens of billions of dollars annually. Post-earthquake probabilistic performance assessment (PPPA) is crucial for promptly and accurately evaluating building safety, particularly in ensuring safe shelter after seismic events. Typically, this assessment is time-consuming and is carried out by licensed engineering experts [9]. Buildings are categorized into safety levels such as inspected, restricted use, and unsafe, based on these assessments [10,11]. However, the scarcity of experts poses challenges, as exemplified by the Tokyo metropolitan government's 110,375 certified experts tasked with assessing over 1.9 million buildings [11,12]. This shortage becomes more acute during aftershocks or subsequent earthquakes, as demonstrated by the two intense earthquakes that struck the Kyushu area within 28 hours [13]. Thus, swift and reliable post-earthquake assessment of building structures becomes even more critical in safeguarding human lives. Occupants must be promptly notified of the assessed damage state of the building to facilitate safe evacuation.

Background and Related Works
In recent research, a novel model for sensor-based EWS and PPPA has been introduced [6], employing the Vanmarcke approximation based on a two-state Markov assumption for extreme value detection. This approach outperforms previous heuristic techniques, demonstrating its superiority. Moreover, advancements in artificial intelligence (AI), particularly machine learning (ML) techniques employing artificial neural networks (ANNs), have garnered significant attention in seismic analysis. These techniques exhibit remarkable accuracy in predicting the transient behavior of buildings, facilitating real-time applications such as EWS and PPPA, as well as informing performance-based design strategies for buildings [14,15]. Notably, AI methodologies have been employed for nonlinear mapping in data modeling, utilizing bootstrapped ANNs for rapid seismic damage evaluation of structural systems [16]. Additionally, AI techniques have been instrumental in stripe-based fragility analysis of multi-span concrete bridges [15], where uniform design-based Gaussian process regression was implemented.
While significant strides have been made in SHM utilizing AI techniques, the crucial aspect of UQ remains largely unexplored in seismic analysis. The oversight of uncertainty can lead to substantial misinterpretations in real-world applications, particularly in scenarios where sudden severe hazards occur. Addressing this gap, researchers have proposed modifications to neural networks (NNs) to account for uncertainty [17,18]. Further advancements include the development of the lower upper bound estimation (LUBE) method [19], which integrates delta, Bayesian, bootstrap, and mean-variance estimation (MVE) techniques directly into the NN loss function. While the LUBE technique has gained traction across various domains, such as energy demand and wind speed forecasting, challenges arise during the simulation and implementation phases, notably the risk of converging to a degenerate minimum in which all prediction interval widths are diminished to zero. To mitigate this issue, [17] introduces the quality-driven PI method (QD) and the quality-driven ensemble (QD-Ens.), employing standard gradient descent (GD) training methods for NNs, thereby enhancing robustness and reliability in predictive modeling.
Utilizing the QD and QD-Ens. methods, estimating the characteristics of extreme value distribution functions becomes more convenient. Typically, deterministic hazard analysis specifies mean-plus-one-standard-deviation values [20]. The QD-LUBE method, known for its high accuracy, rapid convergence, and robustness, is applied to extreme engineering demand parameters (EDPs) such as inter-storey drift (IDR), acceleration (A), and base shear (V), providing prediction intervals (PIs) for observed extreme values. Case studies on a three-storey European Laboratory for Structural Assessment (ELSA) model demonstrate the applicability of this approach. This work signifies a new dimension in assessing building structures during and after extreme events, such as earthquakes, enhancing the reliability of probabilistic performance analysis. Correctly estimating value bounds based on analyzed data aids disaster management decision-makers in allocating resources, prioritizing life-saving measures, and formulating rehabilitation plans.

Proposed Method: QD-LUBE-Based Prediction Interval Analysis
Conventional ML techniques, such as nonlinear mapping in data modeling utilizing bootstrapped ANNs for rapid seismic damage evaluation, and stripe-based fragility analysis, only provide point predictions, i.e., a single output for every single target, and are incapable of monitoring the sampling error, prediction accuracy, and uncertainty of the model. For important decisions and design plans, point estimations cannot provide sufficient information [1]. In seismic analysis, the minimum and maximum values, or upper and lower bound values, are important for both PPPA and real-time warning systems. Estimating a credible maximum value is crucial, as the associated risk involves human and capital losses. Furthermore, the uncertainty sources involved in earthquake prediction must be precisely quantified to provide essential information for decision-makers.
Hence, in model-based forecasting, specifically with ANN or ML models of natural phenomena, decisions depend not solely on the accurate forecasting of certain variables but also on the uncertainty of the data associated with the forecast. The main causes of uncertainty are model and data mismatch, input uncertainty, and parameter uncertainty. Incorporating uncertainty margins, termed the prediction interval (PI), into the determined point forecast value can help make the decision more credible and reliable [2]. The proposed technique for earthquake damage assessment begins with data generation and acquisition, where response spectra from moderate earthquakes are modeled using the CSI SAP2000 v22 software, generating engineering demand parameters (EDPs) such as maximum inter-storey drift (MIDR), acceleration, and base shear. MIDRs are pre-processed for extreme value analysis using the peak-over-threshold method and then prepared for the QD-LUBE method. In the threshold and extreme values detection phase, the generalized Pareto distribution (GPD) is employed to identify threshold values, crucial for analyzing extreme values during earthquakes. The QD-LUBE method is then applied to predict the upper and lower bounds of these extreme EDPs, enhancing UQ. The fuzzy inference system (FIS) is subsequently used to assess performance-based damage by associating EDPs with fuzzy membership functions and evaluating them against predefined rules. This process involves fuzzification, rule evaluation, and defuzzification to produce crisp output values. Finally, local and global damage states are assessed using these outputs, with fragility curves and cost estimations based on the upper bounds of EDPs, providing a comprehensive framework for earthquake damage assessment and real-time warning systems. An overview of the workflow is given below; a flow diagram is shown in Figure 1.

Data Generation and Acquisition
In the first step, a response spectrum generated from the CSI SAP2000 software, with a 475-year return period earthquake for a three-storey ELSA model, was obtained. We considered the population of moderate earthquakes of the 475-year return period. The model was trained on fifteen moderate-intensity earthquakes to obtain EDPs including IDR, A, and V, collectively called DAV. Further, the EDP data were pre-processed to make them compatible with extreme value analysis using peak-over-threshold (EVA-POT) and subsequently prepared for the QD-LUBE method.

Threshold and Extreme Values Detection
In the case of earthquakes or other natural disasters, extreme values are the main points of interest, as they have the worst impact. Being tail-end values, normal distributions cannot capture them well, and the output of the system is normally biased toward the overall behavior of the data. For this purpose, the GPD was used to select threshold values reflecting the tail behavior of the data. This approach is used to attain the threshold values for all the earthquakes under analysis. Section 3.3 discusses complete results with certain examples.

QD-LUBE-Based PI Analysis and Uncertainty Framework
In predictive performance-based earthquake analysis, the concept of UQ has not extensively evolved yet; however, it is gaining traction in other natural hazard domains such as flood forecasting and the prediction of energy demand. The QD-LUBE method proposed by [17], with its fast processing speed, higher accuracy, ease of handling big data, and other competitive benefits over previous techniques, has been applied to attain the upper bound of the extreme values, and the model has been trained on the selected earthquake data, i.e., the EDPs. The upper bounds of the EDP extreme values computed in this step have been used in the global and local damage state assessment and the global cost estimation.
Pearce et al. [17] proposed the uncertainty framework with the QD-LUBE loss function using GD. In the model, if a data-generating function f(x) exists and is combined with additive noise, it produces observable target values y:

y = f(x) + ϵ,

where ϵ is termed irreducible noise or data noise. Some models, for example the delta method, assume ϵ is constant across the input space (homoscedastic), while others allow it to vary (heteroskedastic). The term "quality-driven" means that the framework incorporates a gradient descent method, designed through qualitative assessment, and includes model uncertainty, which differs from the conventional lower upper bound estimation (LUBE) approach. The loss function used in the proposed framework is distribution-free; in other words, it does not require the assumption of a specific distribution for the dataset. The loss function (i.e., the objective function) that is minimized to obtain the optimal neural network for a specific dataset can be defined as follows [17]:

Loss_QD = MPIW_capt + λ · (n / (α(1 − α))) · max(0, (1 − α) − PICP)²,

where MPIW_capt is the captured mean prediction interval width, PICP is the prediction interval coverage probability, λ is a Lagrangian multiplier that controls the relative importance of the width compared to the coverage of the prediction interval, n is the number of data points, (1 − α) is the desired proportion of coverage, and α is commonly set to 0.01 or 0.05. The prediction interval (PI) should be bounded by the predicted upper bound, ŷ_U, and lower bound, ŷ_L, such that:

ŷ_L,i ≤ y_i ≤ ŷ_U,i,

where y_i is the target observation of an input covariate x_i, and 1 ≤ i ≤ n. The PI of each point should be calculated such that MPIW_capt is minimized while maintaining PICP ≥ (1 − α).
To quantify MPIW_capt and PICP mathematically, the following equations can be used:

MPIW_capt = (1/c) · Σᵢ (ŷ_U,i − ŷ_L,i) · k_i,

PICP = c / n,

where c is the total number of data points captured by the PI (c = Σᵢ k_i), ŷ_U,i and ŷ_L,i are the upper and lower bounds of the point under consideration, and k_i is a binary variable (k_i ∈ {0, 1}) that represents the occurrence of the data point within the estimated PI, such that:

k_i = 1 if ŷ_L,i ≤ y_i ≤ ŷ_U,i, and k_i = 0 otherwise.

It is assumed that k can be represented as a Bernoulli random variable (i.e., k_i ~ Bernoulli(1 − α)). In addition, k_i is assumed to be independent and identically distributed. The former assumption can be used to justify that c can be represented by a binomial distribution (i.e., c ~ Binomial(n, 1 − α)). Utilizing the likelihood-based approach, the optimum neural network parameters θ are optimized to maximize L′(θ) = L′(θ | K, α), where K is the vector of length n whose elements are the k_i. The probability mass function can then be approximated using the central limit theorem, and minimizing the negative log-likelihood yields the loss function above.
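As an illustration of these definitions, the coverage and width terms can be computed directly. The following is a minimal NumPy sketch of the QD loss using the hard indicator k_i (during actual gradient-based training, Pearce et al. [17] replace it with a smooth sigmoid approximation so the loss is differentiable); the sample values and λ = 15 are illustrative assumptions, not values from this study.

```python
import numpy as np

def qd_loss(y, y_l, y_u, alpha=0.05, lam=15.0):
    """QD loss: MPIW_capt + lam * n/(alpha*(1-alpha)) * max(0, (1-alpha) - PICP)^2."""
    n = len(y)
    k = ((y >= y_l) & (y <= y_u)).astype(float)        # capture indicator k_i
    c = k.sum()                                        # number of captured points
    picp = c / n                                       # coverage probability
    mpiw_capt = np.sum((y_u - y_l) * k) / max(c, 1.0)  # mean width of capturing intervals
    penalty = (n / (alpha * (1 - alpha))) * max(0.0, (1 - alpha) - picp) ** 2
    return mpiw_capt + lam * penalty

# Toy example: intervals that capture every target incur no coverage penalty,
# so the loss reduces to the mean captured width (0.2 here).
y = np.array([0.2, 0.5, 0.9, 1.4])
loss = qd_loss(y, y - 0.1, y + 0.1)
print(loss)
```

Note how the penalty term grows with n, so under-coverage is punished more heavily on larger datasets, which is what drives the network away from the degenerate zero-width solution.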

Fuzzification and Performance-Based Assessment
The fuzzy inference system (FIS) serves a dual purpose in nonlinear mapping within fuzzy logic, addressing the inherent fuzziness and vagueness in limit states while relating earthquake damage parameters to multiple limit states simultaneously [21]. The overall FIS process flowchart is shown in Figure 2. Utilizing the Mamdani procedure, the FIS is established with fuzzification as its initial step, associating each EDP limit state with specific membership functions and determining the degree of association for each EDP value within these functions [22]. Fuzzy operators, employing the T-norm (minimum) and S-norm (maximum), are then employed to form fuzzy rules, with the number of rules contingent upon the limit states associated with the membership functions, which is 27 in our case. The antecedent of each rule is evaluated through fuzzy operators to derive a consequent, a unique number between 0 and 1 [23]. The inference engine assigns implication relations for each rule, allowing for different rule weights to represent their relative importance. This weight assignment is crucial, especially when certain EDPs are of greater significance, necessitating higher weights for the corresponding rules, although in this study all rules are given equal weight. The maximum composition operator aggregates the output fuzzy numbers from each rule, which are then transformed into crisp output numbers through the centre of area (CoA) defuzzification process. This involves dividing the total area of the membership function distributions into sub-areas, with the defuzzified value z* of a discrete fuzzy set calculated from the sample elements and their associated membership functions [22]:

z* = Σᵢ zᵢ µ(zᵢ) / Σᵢ µ(zᵢ),

where zᵢ is the sample element, µ(zᵢ) is the membership function, and n is the number of elements in the sample [22]. The concept of the evaluation ratio (ER) and its classification into "recommended", "moderate", and "not recommended" classes based on certain threshold values is utilized for building damage assessment based on structural characteristics and the earthquake response spectrum. Different ranges of ER correspond to different levels of system performance and suitability for design. These classes are mapped to three levels of the evaluation ratio as (ER > 0.7), (0.35 ≤ ER ≤ 0.7), and (ER < 0.35), respectively; the ER of a system is thus labeled "not recommended" if ER < 0.35. The "recommended" class means that system responses are within the recommended limits, and vice versa [23,24]. This study aims to compare the fuzzified original EDPs, such as MIDR, A, and V, without incorporating the PI uncertainty, against the enhanced values of the EDPs based on the PI results.
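The defuzzification and ER classification steps above can be sketched as follows, assuming discrete samples of the aggregated output membership function; the sample values are hypothetical, and the class thresholds follow the mapping stated in the text.

```python
import numpy as np

def defuzzify_coa(z, mu):
    """Centre-of-area defuzzification: z* = sum(z_i * mu(z_i)) / sum(mu(z_i))."""
    z, mu = np.asarray(z, dtype=float), np.asarray(mu, dtype=float)
    return float(np.sum(z * mu) / np.sum(mu))

def classify_er(er):
    """Map an evaluation ratio to the three classes used in the text."""
    if er > 0.7:
        return "recommended"
    if er >= 0.35:
        return "moderate"
    return "not recommended"

# Hypothetical aggregated output fuzzy set sampled at a few points.
z = [0.0, 0.25, 0.5, 0.75, 1.0]
mu = [0.1, 0.4, 0.8, 0.4, 0.1]
er = defuzzify_coa(z, mu)
print(er, classify_er(er))
```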

In this step, local damage state assessment has been performed. The fragility curves of the FEMA P-58 PACT software (version 3.1.1) were used as the basis for comparison. Global damage assessment of structures, passing the maximum values of DAV through the fuzzy inference system (FIS), is executed, followed by the global cost estimation.
In the local damage state assessment, based on the maximum inter-storey drift values used, the probability of damage states for post-earthquake performance assessment and real-time warning systems is calculated using the upper bound of the extreme values of the 15 earthquakes. Global cost estimation is calculated based on the D values passed through the FIS. Similarly, the global damage state of structures has been calculated using the maximum upper-bound values of DAV passed through the FIS.

Experimental Evaluation: Case Study Model and Validation
All the steps named in the previous section are elaborated in depth with the ELSA model. The ELSA model is a well-known standard model used as a benchmark for most structural design software. Component and material details of the ELSA three-storey model are given in Table 2 and Figure 3.


Data Acquisition
Non-linear time history analysis (NLTHA) of fifteen moderate-intensity earthquakes with a 475-year return period was used. All these earthquakes have many similar parameters, and huge infrastructure lies on their fault lines. These moderate earthquakes occurred during the years 1956 to 1980 and provide sufficient data. Important parameters of the selected earthquakes are given in Table 3. The nonlinear time series data for all the earthquakes have been formulated, pre-processed, and visualized to make them compatible with the model and with other mathematical operations. The nonlinear time series data of MIDR for the first to second floor during the San Ramon-Eastman Kodak (1980) earthquake are plotted in Figure 4. All fifteen earthquakes, as shown in Table 3, were run on this building model in a single degree of freedom (SDOF), i.e., in the Y direction only, to attain the maximum DAV values.


Extreme Values Detection
Hu et al. [6] used the Poisson assumption and compared the results based on the Vanmarcke assumption and Monte Carlo simulation using a Kalman smoother to attain extreme values. However, the extreme values shoot out due to non-flexible, static assumptions linked only with the mean deviation method, leading towards overestimations.
Due to these shortcomings of the log-normal distribution and the Vanmarcke assumption, the GPD method is used to fit the POT excesses. This allows a continuous range of possible shapes that includes both the exponential and Pareto distributions as special cases. The distribution allows us to "let the data decide" [25] which distribution is appropriate; hence, the highest level of adaptability and accuracy is achieved.
When fitting the excesses with the GPD, the primary problem is the selection of the threshold λ. If λ is too large, few excesses and insufficient data lead to excessively large estimator variance. If λ is too small, a large deviation between the excess distribution and the GPD leads to a biased estimation. Therefore, a compromise between bias and variance is needed for the selection of λ. By adopting straightforward graphical methods, including the mean residual life plot and the shape and scale parameter stability plots, to determine λ based on the average excess function, an optimal threshold value can be calculated separately at each node and for every earthquake. Figure 5 shows the plots of the extreme value data against mean excesses and shape parameters for the mean inter-storey drift between the roof and third floor for the Imperial Valley-07 earthquake. Similarly, they are plotted with respect to the cumulative distribution function and probability density function in Figure 6. The threshold has been calculated considering the mean value of the data for each earthquake.
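The peak-over-threshold extraction and GPD fit can be sketched with SciPy. The synthetic series below is a hypothetical stand-in for an EDP record (the actual data come from the SAP2000 runs described earlier), and, as in the text, the threshold is taken as the mean of the series; the mean-excess printout mimics the mean residual life check.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(42)
series = np.abs(rng.gumbel(loc=0.0, scale=1.0, size=5000))  # synthetic EDP stand-in

# Peak-over-threshold with a mean-based threshold, as described in the text.
threshold = series.mean()
excesses = series[series > threshold] - threshold

# Mean residual life check: mean excess at a few candidate thresholds;
# an approximately linear trend over u supports the GPD fit.
for u in np.quantile(series, [0.5, 0.75, 0.9]):
    print(f"u={u:.3f}  mean excess={np.mean(series[series > u] - u):.3f}")

# Fit the GPD to the excesses (location fixed at zero).
shape, loc, scale = genpareto.fit(excesses, floc=0.0)
print(f"threshold={threshold:.3f}  shape={shape:.3f}  scale={scale:.3f}")
```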

QD-LUBE-Based Prediction Interval Analysis
Preparation of Training Sets
The datasets are the relative joint accelerations, joint displacements, and the shear forces for the 15 earthquakes, as explained in Section 2. SAP2000 produces nonlinear time series. Pre-processing of the data to make them readable by the ANN is performed after the extreme value analysis. Absolute values, sorted from the minimum to the maximum value, were used. A sample shape of the data is shown in Figure 7. Moreover, the GPD distribution estimates for the IDR values calculated at nodes '113' and '112' for 'EQ25' ("Imperial Valley-07", "El Centro Array #7") are given in Table 4.
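The pre-processing described above (absolute values, sorted ascending, then scaled for the network) might look as follows; the raw values and the min-max scaling choice are illustrative assumptions.

```python
import numpy as np

raw = np.array([0.003, -0.012, 0.007, -0.001, 0.021, -0.006])  # hypothetical EDP series

x = np.sort(np.abs(raw))                        # absolute values, min to max
x_scaled = (x - x.min()) / (x.max() - x.min())  # scaled to [0, 1] for the network
print(x_scaled)
```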


Setting up the Model
The dataset was further refined and scaled to make it compatible with the model and the loss function of QD-LUBE. The key advantages of the QD-LUBE method are its intuitive objective, low computational demand, robustness to outliers, and lack of distributional assumptions. The model used is a Keras sequential model from the Python TensorFlow library, with an input layer and one intermediate layer, both having 100 neurons with ReLU activation functions, and an output layer having two neurons with a linear activation function. The Adam optimizer was used in compilation, and the confidence level was set at 95%.
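A sketch of the described network in Keras, paired with a differentiable version of the QD loss (the soft capture indicator uses a sigmoid with softness s, following Pearce et al. [17]); the λ and s values are illustrative assumptions, not values reported in this study.

```python
import numpy as np
import tensorflow as tf

alpha, lam, s = 0.05, 15.0, 160.0  # 95% confidence level; lam and s assumed

def qd_loss(y_true, y_pred):
    """QD loss on a two-column output: y_pred[:, 0] = upper bound, y_pred[:, 1] = lower bound."""
    y_u, y_l = y_pred[:, 0], y_pred[:, 1]
    y = y_true[:, 0]
    k_hard = tf.cast(y > y_l, tf.float32) * tf.cast(y < y_u, tf.float32)
    k_soft = tf.sigmoid(s * (y - y_l)) * tf.sigmoid(s * (y_u - y))  # differentiable k_i
    n = tf.cast(tf.size(y), tf.float32)
    mpiw_capt = tf.reduce_sum((y_u - y_l) * k_hard) / (tf.reduce_sum(k_hard) + 1e-6)
    picp = tf.reduce_mean(k_soft)
    penalty = (n / (alpha * (1.0 - alpha))) * tf.square(tf.maximum(0.0, (1.0 - alpha) - picp))
    return mpiw_capt + lam * penalty

# Architecture as described in the text: two 100-neuron ReLU layers,
# two linear output neurons for the upper and lower bounds.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(2, activation="linear"),  # [upper, lower]
])
model.compile(optimizer="adam", loss=qd_loss)
```

Training then proceeds with a standard `model.fit` call on the scaled extreme-value data, with the two output neurons learning the interval bounds jointly.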

Predicting the Upper and Lower Bounds for DAV
Finally, the model was run to predict the upper and lower bounds. Absolute values were used; hence, the peak value information lies in the upper bound only. The upper bounds of the selected earthquakes are shown in Figure 8. From the graph, we can see outliers which are abnormal with respect to the distribution; however, the PI can determine the upper bound on these outliers using the cumulative behavior of the distribution, and a prediction can be calculated for any next value. The same procedure was used to attain the upper-bound values of the base shear and MIDR. Table 5 shows the acceleration values, with the maximum value of the distribution treated as the point prediction; after fitting QD-LUBE, an upper bound is calculated for every earthquake. This upper bound adds a margin of uncertainty while deriving itself from the behavior of the input data above the threshold value. For comparison, the model was also trained with values greater than the median, and it was found that the threshold selection method using the GPD provides a very good approximation of the extreme, tail-end events. The LUBE method further adds to the margin of error, as a confidence level is specified in the model (in our case, 95%). Therefore, the combination of POT and QD-LUBE provides a very robust hybrid approach to UQ.

The two main parameters used in the literature [26] to evaluate the performance of a statistically based model such as the LUBE method are the normalized mean prediction interval width (NMPIW), which should be minimized, and the PICP, which is considered better the closer it is to 1. QD-LUBE has been shown to outperform both the bootstrap and LUBE methods, and this work verified that finding. The PICP of the base shear is given in Table 6. In the final step, the building performance assessment covered the following aspects:

1. Local damage state assessment using the FEMA P-58 PACT fragility specification manager, with the MIDR values calculated in Section 3.3.

2. Global damage state assessment using the DAV values after passing them through the FIS with the defined limit state membership functions.
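The NMPIW and PICP metrics mentioned above can be computed directly from the predicted bounds; a minimal sketch:

```python
import numpy as np

def picp(y, y_lower, y_upper):
    """Prediction interval coverage probability: fraction of targets
    falling inside their predicted interval (ideally close to 1)."""
    return np.mean((y >= y_lower) & (y <= y_upper))

def nmpiw(y, y_lower, y_upper):
    """Normalized mean prediction interval width: average interval width
    divided by the range of the targets (smaller is better)."""
    return np.mean(y_upper - y_lower) / (np.max(y) - np.min(y))
```

The tension between the two is deliberate: widening every interval drives PICP toward 1 but inflates NMPIW, so a good model must balance both, which is exactly what the QD loss optimizes.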

Local Damage State Assessment
The local damage assessment was performed using the FEMA P-58 PACT software. Table 7 compares the MIDR values simulated by the SAP model with the upper bounds of the prediction intervals calculated by the QD-LUBE model; Figure 9 shows the same comparison graphically. The model provided an uncertainty margin to accommodate noise, data, and model uncertainties while closely following the behavior of the tail-end data. The structural component B1044.102, a slender concrete wall 18" thick, 12' high, and 20' long, was evaluated for three earthquakes (Imperial Valley-06 at the "Chihuahua" station (EQ20) and the "El Centro Array #12" station (EQ22), and Livermore-01 at the "San Ramon-Eastman Kodak" station), as shown in Figure 10. Damage states that were on the borderline were moved to the next damage state after adding the uncertainty margins from the upper bound of the predicted model for each earthquake.
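The damage-state assignment behind this step follows the standard lognormal fragility form. A generic sketch is given below; the median θ and dispersion β values are illustrative placeholders, not the actual B1044.102 fragility parameters from the PACT database, and the MIDR values are hypothetical:

```python
import math

def p_damage_state(edp, theta, beta):
    """Probability of reaching or exceeding a damage state given an EDP
    demand (e.g. MIDR), using a lognormal fragility curve with median
    theta and logarithmic dispersion beta."""
    return 0.5 * (1.0 + math.erf(math.log(edp / theta) / (beta * math.sqrt(2.0))))

# illustrative (theta, beta) pairs for three damage states
states = {"DS1": (0.004, 0.4), "DS2": (0.010, 0.4), "DS3": (0.020, 0.4)}

midr_point, midr_upper = 0.009, 0.011  # hypothetical point value and upper bound
probs_point = {ds: p_damage_state(midr_point, t, b) for ds, (t, b) in states.items()}
probs_upper = {ds: p_damage_state(midr_upper, t, b) for ds, (t, b) in states.items()}
```

Evaluating the curves at the QD-LUBE upper bound instead of the point estimate raises every exceedance probability, which is how a borderline component gets pushed into the next damage state.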

Global Damage Assessment
For the global cost assessment, the value of Dmax is fuzzified. The results make clear that the highest impact is that of EQ27, followed by EQ17; the ELSA three-storey model suffers very minor to no damage under the other earthquakes. The results of the fuzzification are given in Table 8. For UQ, the global damage assessment of the DAV based on the PI has also been fuzzified. The evaluation ratios (ER) of EQ-17 and EQ-27 indicate the highest damage, and the structural evaluation for these earthquakes resulted in "not recommended"; hence, the structural parameters need to be modified to bring the performance assessment results into the acceptable range. Table 9 provides the normalized and un-normalized output values and the evaluation ratios after fuzzification.
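The fuzzification step can be sketched with simple triangular membership functions. The limit-state names and breakpoints below are illustrative assumptions, not the membership functions actually used in the study's FIS:

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# illustrative limit states for a normalized damage measure (e.g. Dmax or DAV)
limit_states = {
    "minor":    (-0.4, 0.0, 0.4),
    "moderate": (0.2, 0.5, 0.8),
    "severe":   (0.6, 1.0, 1.4),
}

def fuzzify(x):
    """Map a crisp damage value to membership grades in each limit state."""
    return {name: triangular(x, a, b, c) for name, (a, b, c) in limit_states.items()}
```

A crisp Dmax value thus yields a set of membership grades; the FIS rules then combine these grades into the evaluation ratio reported in Table 9.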

Conclusions
The distribution-free ensemble approach, the QD-LUBE method, has been used for uncertainty quantification in assessing critical structural model parameters such as inter-storey drift, peak ground acceleration, and base shear, and has proven to be a powerful ML tool. In this study, QD-LUBE was applied to the ELSA model, focusing on a three-storey building subjected to 16 well-known earthquakes within a single-degree-of-freedom framework.
Key aspects of the methodology included leveraging Vanmarcke's assumption for extreme value detection, which allowed us to extract peak-over-threshold exceedances. The upper bounds of the prediction intervals for selected parameters were estimated by training and testing the model on the given datasets. The process yielded robust results and demonstrated superior performance compared to the bootstrap method in terms of accuracy and reliability.
Furthermore, the findings were validated through fragility curve analysis, specifically evaluating the impact of three earthquakes on the effective drift and transitioning between damage states.To assess the overall structural damage, a fuzzy inference system was integrated, providing a comprehensive evaluation of the global damage state.
The research not only contributes to advancing the field of uncertainty quantification in structural engineering, but also showcases the efficacy of the QD-LUBE method in handling complex scenarios and providing actionable insights for seismic risk assessment and mitigation strategies.

Future Works
The LUBE method can be used to enhance the reliability of a real-time warning system by predicting the upper bound of the point prediction. In the event of an earthquake, the system would take data from sensors and ICT infrastructure. Data samples from multiple buildings can support decisions for all buildings of the same kind, with the model training itself on the behavior of extreme values to provide values with quantified certainty. Moreover, novel ML models are being considered to improve the seismic performance assessment of buildings by incorporating additional details and methodologies [27,28]. Determining a threshold in peak-over-threshold modeling using mean residual life and threshold stability plots involves significant subjectivity: identifying the linear portions of these plots is challenging because linearity is vaguely defined, leading to potential errors in selecting constant scale and shape parameter estimates. An objective method, such as a segmentation approach, is needed to accurately determine the constant portion of these parameters [29].

Figure 1 .
Figure 1. Workflow of the proposed technique.

Figure 5 .
Figure 5. Mean residual life plot and shape parameter stability plot for MIDR third floor to roof for Imperial Valley-07 earthquake.

Figure 6 .
Figure 6. CDF and PDF for mean residual threshold selection.

Figure 7 .
Figure 7. Plot showing IDR extreme values for the Livermore-Morgan Terr Park (1980) earthquake at the second to third storeys of the ELSA model.

Figure 8 .
Figure 8. The upper and lower bounds of two earthquakes using the QD-LUBE method.

Figure 9 .
Figure 9. Plot of MIDRs and their upper bound.

Figure 10 .
Figure 10. Fragility analysis using the maximum value and upper bound for three earthquakes.

Table 2 .
Steel (a) and concrete (b) properties of ELSA model.

Table 3 .
List of the selected earthquakes.

Table 5 .
Maximum point prediction and upper bound of the acceleration values.

Table 6 .
PICP of the base shear for all the selected earthquakes.

Table 7 .
MIDR values and upper bound for 15 earthquakes.

Table 9 .
ER for the engineering demand parameter (DAV) values fuzzified based on the PI.