Hybrid data‐driven hazard‐consistent drift models for SMRF

The seismic design and assessment of steel moment resisting frames (SMRFs) rely heavily on drifts. It is unsurprising, therefore, that several simplified methods have been proposed to predict lateral deformations in SMRFs, ranging from the purely mechanics‐based to the wholly data‐driven, which aim to alleviate the structural engineer's burden of conducting detailed nonlinear analyses either as part of preliminary design iterations or during regional seismic assessments. While many of these methods have been incorporated in design codes or are commonly used in research, they all suffer from a lack of consideration of the causal link between the seismic hazard level and the ground‐motion suite used for their formulation. In this paper, we propose hybrid data‐driven models that preserve this critical relationship of hazard‐consistency. To this end, we assemble a large database of non‐linear response history analyses (NRHA) on 24 SMRFs of different structural characteristics. These structural models are subjected to 816 ground‐motion records whose occurrence rates and spectral shapes are selected to ensure the hazard consistency of our outputs. Two sites with different seismic hazards are examined to enable comparisons under different seismic demands. An initial examination of the resulting drift hazard curves allows us to re‐visit the influence of salient structural modelling assumptions such as plastic resistance, geometric configurations and joint deterioration modelling. This is followed by a machine learning (ML)‐guided feature selection process that considers structural and seismic parameters as well as key static response features, hence the hybrid nature of our models. New models for inter‐storey drift and roof displacements are then developed. A comparison with currently available formulations highlights the significant levels of overestimation associated with previously proposed non‐hazard consistent models.


INTRODUCTION
Performance-based earthquake engineering (PBEE) methodologies have enabled the design process to hinge around the attainment of specified performance objectives for pre-defined levels of seismic demand. Famously introduced in the SEAOC 1 guidelines, the PBEE philosophy has been adopted in many seismic design codes, such as Eurocode 8 2 and ASCE. 3 The most widely embraced PBEE framework, known as the Pacific Earthquake Engineering Research or PEER-PBEE framework, conceived by Cornel and Krawinkler 4 and contained in FEMA P-58, 5 enables structural designs to be targeted to different stakeholders' performance objectives. In order to verify the satisfaction of those objectives, consequences are expressed in terms that are easily understood by the stakeholder, such as casualties, repair or replacement cost, downtime and environmental impact. 6 One way to present the expected structural response at a given site within the PBEE framework is by depicting the probability of exceeding various structural response levels during a specified time (usually a year). These curves are known as EDP hazard curves, where EDP stands for engineering demand parameter (EDP), which is a parameter that describes the structural response (e.g., peak deformations, peak accelerations). In the case of steel moment resisting frames (SMRFs), the EDP par excellence is the inter-storey drift, or in its absence, the roof drift. This is because SMRFs are flexible structures whose design is usually controlled by serviceability constraints on their allowable deformation. Unsurprisingly, numerous simplified models have been proposed in the scientific literature for the estimation of drifts in SMRFs.
Predictive models of structural drift can generally be divided into: (i) classical mechanics-based, (ii) hybrid and (iii) purely data-driven models. 7 Mechanics-based models, e.g., Refs. [8][9][10] are derived solely from classical mechanics theory, which makes them easy to generalise and interpret, although they are also often oversimplified and imprecise. On the opposite side of the spectrum are the purely data-driven models, e.g., Refs. [11][12][13] that connect the model predictors directly to the prediction model target using an advanced statistical or machine learning (ML) algorithm. These data-driven models are especially practical in the absence of an analytical model. 14 However, they use complex methods that are arduous in their interpretation or effectively work as 'black boxes' that can only be tested from the outside. In between these two types of models are the hybrid models, e.g., Refs. 5,[15][16][17] that aim to offer a balance between interpretability and applicability. Hybrid models are initiated by mechanics-based analysis (e.g., modal or linear static analysis) to obtain certain intermediate results. These intermediate results are then connected with the target by linear, multi-linear regression or ML models. Although dealing with a different structural system than the one studied herein and sometimes focused on residual displacements, which is not the focus of this paper, the following studies [18][19][20] also represent an example of hybrid models.
Most of the models proposed to date have been constructed on the basis of limited ground-motion records scaled up to mimic increasing seismic intensities, or record sets selected to represent a particular type of ground motion (e.g., pulse-like, long-duration). Notwithstanding the limitations of a scalar intensity measure (IM) to describe the ground shaking, 21 another crucial aspect that has been universally neglected in past modelling formulations is the preservation of the causal connection between the hazard level and the set of records employed. Since the ground-motion characteristics change as a function of the earthquake intensity, simply scaling the spectral ordinates would not account for this. A step forward to the most common approaches used in previous efforts, and the one followed herein, is to enable a hazard-consistent selection of ground motion recordings on the basis of which new predictive relationships can be developed.
In this study, we evaluate the inter-storey drift and roof displacement demands of SMRFs within a hazard-consistent framework and provide predictive models for them. To this end, non-linear response history analyses (NRHA) on 24 SMRFs with different structural characteristics are performed. The frames are subjected to 816 ground motions representative of two different sites whose rate of occurrence is estimated using the conditional scenario spectra (CSS) methodology 22 to ensure the desired hazard consistency. The results from these comprehensive NRHA are expressed in the form of EDP hazard curves, and the effects of the modelling approach and key structural characteristics are first examined. Subsequently, we use a number of ML feature selection techniques to inform the choice of structural and seismic parameters to be incorporated into our predictive models. Then, several regression models are trialled to develop predictive relationships for inter-storey drift and roof displacement. Finally, our models are compared with currently available predictive models, and the significant differences between their estimations are quantified.

OVERVIEW OF HAZARD-CONSISTENT SEISMIC DEMAND ANALYSIS
Various frameworks have been proposed for the estimation of the probability of exceedance of EDPs as a function of ground-motion intensity. However, most available studies have achieved this by means of incremental dynamic analyses (IDA), 23 which suffer from the underlying assumption that a single set of ground motions scaled to achieve various IM levels is representative of the change in ground-motion characteristics with earthquake intensity. Consequently, IDA result in rather unrealistic ground motion spectral shapes that may lead to correspondingly unrealistic structural responses that are not hazard-consistent. 22 The effects of this can be quantified by comparing the outputs of previous models generated based on IDAs against the hazard-consistent estimations produced herein, as examined later in this paper. An alternative method, and the one adopted in this study, is to obtain the EDP using a set of ground motions whose rates of occurrence are determined from CSS as proposed by Abrahamson et al. 22,24,25 The CSS presents a set of realistic earthquake spectra with an assigned rate of occurrences based on their spectral shape and intensity. Hence, the ground motion spectra in the CSS are developed to be spectrally consistent with the hazard at the selected site at all relevant periods.
To develop the CSS, we have selected ground motions from hazard disaggregation at two sites and have estimated the target uniform hazard spectra (UHS) at several hazard levels (i.e., several exceedance rates). For each hazard level, the ground motions are scaled to account for the variability of the peaks and troughs around the conditional mean spectra (CMS). The CMS ensure that each spectrum has the correct shape and provides the geometric mean response spectrum for all periods of interest. 26,27 Each ground motion is then assigned an initial rate of occurrence according to the hazard level of the UHS at the CMS conditioning period. Finally, the rate of occurrence given to each ground motion is numerically optimised such that its calculated hazard matches the target hazard curve.
The CSS method has been previously used to evaluate the drift and base shear demands in RC frames by Arteta and Abrahamson 22,25 and to assess the non-linear truss model capability to simulate the seismic behaviour of RC walls by Arteta et al. 28 The CSS was also employed to perform fragility analysis of concrete shear walls 29 and seismic risk assessment of ground-mounted rigid sliding equipment and contents. 30 However, to date, the CSS has not been applied to steel structures.
In this study, we use CSS to generate a database for the training of our drift predictive models for SMRFs, as will be described later in this paper. To this end, site-specific probabilistic seismic hazard analyses (PSHA) were carried out for Oakland (37.803 • N, 122.287 • W) and Sonora (38.158 • N, 120.000 • W), both in California. The hazard curves obtained are presented in Figure 1. One of the main features of the CSS is that the hazard at a site can be recovered over a whole range of periods by using a suite of records with specific rates of occurrence. The target and reconstructed CSS spectra for different hazard levels are compared in Figure 2, where they show a good match. This figure also shows the Eurocode 8 Type 1 Spectrum for Soil Class B matching well the high-hazard site spectrum for a return period, , of 500 years. This spectra compatibility implies the similarity of the demand at the site and the Eurocode 8 2 provisions, and justifies the selection of these spectra to test the Eurocode 8 design assumptions that follow in this paper. It can be appreciated from these figures that the two selected sites have different seismicity levels. The Oakland site has the highest seismicity of the two with a peak ground acceleration (PGA) for = 2475 years at 0.835 g, which is around 3.5 times the corresponding PGA in Sonora. 450-360(1)+450-400(2-3)+450-450(4-5)+400-450(6-7)+360-400(8-9)+360-360(10)+360-330(11-12) 24 12 6 500-360(1)+500-400(2-3)+500-450(4-5)+450-450(6-7)+400-400(8-9)+400-360(10-11)+400-330(12)

SMRF models
Twenty-four plane SMRFs are considered in this study. The building archetypes used are based on the frame designs by Karavasilis et al. 16 Table 1. A typical configuration of an SMRF is shown in Figure 3. HEB sections are used for the columns, and IPE sections for the beams. The expressions  Table 1 follow the nomenclature 'column-beam(storeys)', for example, 280-360(1-4)+260-330(5-6) means that floors 1-4 have HEB280 columns and IPE360 beams and floors 5-6 have HEB260 columns and IPE330 beams. The gravity load on the beams is set to 27.5 kN/m. It should be noted that although the steel frames are designed according to the Eurocodes by Karavasilis et al., 16 different configurations of steel sections are considered in order to obtain a family of frames that covers the more flexible and more rigid ends of the spectrum. This is the reason why three different designs are used for each unique set of s and b . The more flexible frames can be regarded as the product of lower seismicity levels or more relaxed values of serviceability drift limits, while stiffer frames can be associated with higher seismic design demands. The frames were analysed using the open-source software OpenSees. 32 The uniaxial bilinear Steel01 material model was used. The beams and columns were modelled with force-based elements comprising four integration points along their length and sections discretised in around 20-40 fibres along the flange and web, depending on their section size. The shear compliance of the panel zone was modelled using the scissors model. 33 Geometric nonlinearities were incorporated by means of the corotational transformation. The non-linear behaviour of the joints was represented by rotational springs added via zero-length elements adopting the modified-Ibarra-Krawinkler deterioration model 34 with the parameters determined in accordance with the empirical relationships proposed by Lignos and Krawinkler. 35 Table 2 shows several structural characteristics that describe the dynamic behaviour of the frames obtained from eigenvalue and pushover analyses like the fundamental period ( 1 ), the stiffness ratio ( ), plasticity resistance ratio ( ), relative storey stiffness ratio ( 3 ), first mode participation factor ( ) and global ductility parameter ( ).
The stiffness ratio ( ) controls the frame behaviour in the elastic range, 16 and is calculated as = where and are the second moment of inertia and the length of the beam (b) and column (c), respectively. As increases, the behaviour of the structure changes from flexural-dominated to shear-dominated. The influence of diminishes as the structure moves into the inelastic range and approaches the formation of a mechanism. 16 The plasticity resistance ratio ( ) is used to describe the characteristic of the structure in the inelastic range, it is defined as = RC,1,av RB,av , where RC,1,av is the average of the first storey column sections plastic moments and RB,av is the average of the beam sections plastic moments. For a certain level of ductility and beam cross-section, a higher value of will delay the mechanism formation. The relative storey stiffness ratio ( ) accounts for the variation in the stiffness or strength in the upper storeys. This parameter may influence the response of the structure since relatively flexible or weak upper storeys may result in early yielding and relatively high roof drift demand due to higher mode effects. is calculated using the inter-storey drift profile corresponding to the first mode shape of the frame obtained from eigenvalue analysis. The relative storey stiffness ratio used in this study is the 3 , which is the ratio of the maximum drift of the upper half of the frame to the maximum drift of the lower half of the frame. 17 The first mode participation factor ( ) is also used to account for higher mode effects in the drift response. A higher indicates less significant higher mode effects in the structural response. The global ductility parameter ( ) is calculated as . The ultimate displacement (Δ u ) is measured at the instant of the formation of the collapse mechanism, while the yield displacement (Δ y ) indicates the point beyond which the structural response is strongly non-linear. Currently, there is no universally accepted definition of Δ u and Δ y , which has led to a variety of interpretations of global ductility assessment results for MRFs. The definitions adopted herein are the ones in Annex B of Eurocode 8, 2 which describes (Δ u ) as the displacement where the plastic mechanism occurs and provides an equation to calculate the (Δ y ) as a function of deformation energy and yield force.

Ground-motion records
596 ground motions for the Oakland site and 220 ground motions for the Sonora site were selected from the PEER NGA-West 2 database 37 based on the disaggregation of the PSHA. The difference in the number of ground motions required at each site is a reflection of the lower variability associated with the Sonora site in comparison with Oakland, hence requiring fewer records to obtain comparably stable mean estimates. This is because of the higher seismicity at the Oakland site, then requiring a larger number of ground motions within the CSS framework to reproduce the seismic demand across a range of periods. Figure 4 presents the associated horizontal-component ground motion spectra where the rate of occurrence of each spectrum was calculated using the CSS methodology. Figure 5 presents a distribution plot matrix of key parameters of the ground-motion database employed, including the PGA, peak ground velocity (PGV), significant duration ( 5−95 ), ratio between the period of the building and mean period of the ground motion ( 1 / m ) and the spectral acceleration at the fundamental period ( ( 1 )) for all buildings and ground-motions considered. The kernel density estimate plots in Figure 5 show the probability density function of one F I G U R E 4 Scenario spectra (5% damped) of (A) Oakland and (B) Sonora sites parameter as a function of another, while the associated histograms are shown in the diagonal. It can be seen from Figure 5 that the dataset employed has a generally normal and uniform distribution of parameters.

PROBABILISTIC SEISMIC DEMAND ANALYSIS
The 24 frames described before were subjected to the 816 ground motions from both sites and analysed using Opensees 32 on the high-performance computing facility at Imperial College London. 38 These 19,584 analyses resulted in an extensive dataset of structural responses that ranged from mildly elastic to highly inelastic. These results are used to develop the EDP hazard curves, as shown in Figure 6. In this figure, the maximum inter-storey drifts of Frame 03 (3S-3B) are plotted against the rate of occurrence of the corresponding ground motions. The EDP annual rate of exceedance, [ > ], is calculated using the following expression 22 : where CSS, is the assigned rate of occurrence of the ground motions obtained from the CSS, and [ − ] is the Heaviside function.
As an example of typical results, the inter-storey drift hazard curve of Frame 03 (3S-3B) is shown in Figure 6B for Oakland. The annual rates of exceedance of 1/475 (life safety) and 1/2475 (collapse prevention) are also shown in this figure to facilitate interpretation. The corresponding maximum inter-storey drift of Frame 03 associated with the life safety and collapse prevention performance levels are 1.38 and 1.98%, respectively. These values are well below the performance level objective for conventional structures based on FEMA 356, 39 which is 2.5% transient drift for life safety and 5% for collapse prevention performance level. By contrast, the current version of Eurocode 8 2 only mentions a drift limit for immediate occupancy performance level ('damage limitation state') at 1% drift.
The process of developing the hazard curve was repeated for every frame at both Oakland and Sonora sites. These hazard-consistent representations of structural response can be used to compare the effects of key structural and nonstructural characteristics over the whole range of responses and annual rates. Below, these comparisons are offered for site seismicity, joint deterioration and geometrical structural configurations.

4.1
Influence of site seismicity Figure 7 shows that the structural demands of the Oakland site are remarkably higher than those for the Sonora site, as mentioned in Sections 2 and 3. While the same frame will experience a 2.6% drift for 2475 years return period in Oakland, it will experience 0.5% in Sonora, both of which are well below the code requirement. This significant difference in seismic demands between the different sites also emphasises that site-specific performance assessments are needed in order to ensure economical designs. This aspect will be discussed further during the development of the predictive models later in this paper.

Influence of joint deterioration modelling
Although simplified non-deteriorating models have been shown to predict reasonably well mean global structural responses, for example, 40,41 a deterioration model that captures the strength and stiffness degradation of the structural component is generally expected to be more reliable even at lower levels of earthquake demand, 42 and there have been recent calls to use the higher levels of model sophistication even for design level demand assessments. 43 To examine the accuracy of these claims, alternative models without deterioration were assessed, and their outputs were compared with the deteriorating ones. The difference in their estimations is shown in Figure 8. In general, the deteriorating frames exhibit slightly higher inter-storey drift demand than the non-deteriorating ones, although this difference is generally more noticeable at higher hazard levels ( < 1/2475). At the lower hazard level, the deteriorating frames appear to predict similar or sometimes even lower demands. Accordingly, the Sonora site, which has significantly lower seismicity than Oakland, shows a very small difference between deteriorating and non-deteriorating models at all relevant hazard levels. It follows from the previous discussion that site seismicity can inform the need to incorporate expensive deterioration effects in beams and columns. This also stresses the possibility of avoiding the complexities of deterioration modelling for assessments that are not intended to evaluate collapse scenarios, a fact that was already highlighted by Krawinkler. 44

Influence of structural configuration
The influence of different structural configurations on the inter-storey drift and roof displacement hazard curves can be discussed with reference to Figures 9-11, taking the Oakland site as an example. In Figure 9, frames with a higher plasticity resistance ratio, , have noticeable lower inter-storey drift and roof displacement demand for the same hazard level as the high delays the formation of plastic hinges. It can be seen that a 0.3% increase in results in a 0.5% increase in inter-storey drift demand at a 2475 year return period rate. The hazard curves also show the influence of the number of bays in Figure 10A,B, where frames with a different number of bays show similar demands up to 2475 years return period, although a higher number of bays are associated with slightly higher demand at higher hazard level with 0.4% difference at 10,000 years return period. This finding on the influence of the number of bays defies the commonly accepted perception that a larger number of bays is conducive to a better seismic response of structures. 45,46 Last, Figure 11A shows almost no difference in inter-storey drift demands for different number of storeys at both 475 and 2475 years return period. Although the difference in the demand is more noticeable in higher hazard levels, there is no distinguished pattern of the influence of the number of storeys in the inter-storey drift demand. On the other hand, the influence of the number of storeys becomes significant in the roof displacement demand in Figure 11B as the roof displacement is obviously higher for taller frames. These results show that the plasticity resistance ratio influences the inter-storey drift more than the number of bays and storeys. All the hazard curves obtained in this study indicate that code-compliant SMRFs are safely within the required performance level objectives of conventional structures based on FEMA 356, 39 which are 2.5% transient drift for life safety and 5% for collapse prevention. This is especially true of the Sonora site, where the low seismicity results in considerably low and uniform structural drifts. Besides, Figures 7-11 show that the inter-storey drift demands of all frames at the Oakland are bounded at around 0.04% for immediate occupancy performance level ( = 1/95). However, there is a 70 and 53% variation in the inter-storey drift demands for life safety and collapse prevention level, respectively, at the Oakland site. This increased variation at higher hazard levels, where the responses are generally inelastic, indicates that the inelastic drifts could be controlled by specific characteristics of ground motions beyond spectral content that are not well incorporated in the elastic spectral shape used in this study. 21,42 Besides, the drift plots presented above illustrate in a striking manner the inability of Eurocode 8 to ensure a consistent exceedance level for a given EDP, especially at a lower hazard. Arguably, the better the code, the more it will result in well-controlled structural designs, where similar risk levels are expected on all structures depending on their function.

MACHINE-LEARNING-BASED DRIFT MODELS
The first important step towards the formulation of the new predictive models proposed in this study is feature selection, which is conducted herein using several ML algorithms. Feature selection aims to carefully pick the features that would improve the accuracy of the model and reduce its complexity, increase training efficiency and avoid overfitting. Feature selection methods can be classified into filter, wrapper and embedded methods. Filter methods are generally used to pre-process the features by assigning a correlation score between them and the target variable. Wrapper methods train subsets of features by recursively adding or removing them to select the best-performing subset based on feedback from the ML algorithm. Last, embedded methods combine the advantages of filter and wrapper methods by performing feature selection and training of the algorithm in parallel rather than through a discrete process where the features are either kept or discarded. All three methods are used in this study. A preliminary assessment was carried out by filtering features on the basis of their Pearson's correlation coefficient, which quantifies linear dependencies between each feature and the response variable of interest. Then, three wrapper methods are used: Forward Selection, Backward Elimination and Exhaustive Feature Selection, all carried out by means of the MLXtend Python Library. 47 In the case of Forward Selection, the model is started by fitting one feature at a time and selecting the best-performing one. The remaining features are then added to the model, and the best-performing pair is chosen. This process is repeated until the best-performing feature set with the desired number of features is obtained. Conversely, in Backward Elimination, the model starts with all the features, and the least significant feature is eliminated at each iteration. On the other hand, Exhaustive Feature Selection tries every possible combination of features, which makes it computationally expensive.
In terms of Embedded Methods, Random Forest 48 was used in this study by means of the Scikit-Learn Python Library. 49 Random Forest builds decision trees over a random extraction of the observations from the dataset and a random feature extraction. The random forest algorithm calculates the relative importance of each feature based on the mean decrease in impurity (or gini importance). Thus, the algorithm measures how effective the feature is at reducing uncertainty when creating the decision trees. However, this default type of feature importance (as opposed to permutation or drop column types) has a known limitation in that it tends to escalate the importance of continuous or high-cardinality features, so that it can be unreliable for datasets whose features vary in the scale of measurement. 50,51 To overcome this limitation, permutation importance 52,53 is used here. This method permutes a specific column and measures the decrease in the accuracy of the overall regressor. The random forest with permutation importance was carried out using the Rfpimp Python Library 54 within Scikit-learn. However, Strobl et al. 51 showed that permutation importance overestimates the importance of TA B L E 3 Structural and ground motion parameters considered in this study  ( 1 ) Spectral acceleration at 1 s correlated predictor variables. Last, the most computationally expensive random forest feature importance is drop-column importance, which defines the importance of a feature as the difference between a model with all features (the baseline) and a model with the feature dropped using a given accuracy metric. 50,55 As defined by Guan et al., 7 a hybrid predictive model requires the user to perform some type of mechanics-based analysis to generate intermediate results. Subsequently, those intermediate results are used within a ML framework to develop a function approximator for the final target parameter, in our case, the drift. Hybrid models represent a good balance between the purely data-driven models, whose interpretation is difficult or even wholly concealed, and the solely mechanistic models, whose accuracy can be low. In this context, the features used in the predictive model formulated in this study can be divided into structural (some of which demand some form of structural analysis) and ground motion parameters. The initial set of features explored is listed in Table 3. More specifically, the predictive models developed below are constructed in terms of a maximum drift modification factor, mod , and a global drift modification factor, Δ mod that can be written as where max is the inter-storey drift for a given behaviour factor, , and 1,max is the maximum inter-storey drift at the first yield obtained from pushover analysis with a first-mode lateral load profile. The behaviour factor, , was estimated by means of the expressions proposed by Karavasilis et al. 16 Similarly, Δ mod is defined on the basis of the roof displacement, Δ max . Besides, Δ max is the maximum roof displacement, and Δ 1,roof is the roof displacement at the first yield obtained from pushover analysis with lateral load profile based on the first mode distribution. Since the distributions of feature values and EDPs in this study follow the power law, they were then transformed into lognormal space, as suggested by Cornell et al. 56

Feature selection
For brevity, only the results corresponding to the Oakland site are presented in this section. A similar process was followed for the Sonora site. First, Pearson's correlation coefficients were calculated between the features, and between the features and the target variable. These correlation coefficients are presented in the form of heatmaps in Figure 12. Several features have high collinearity between each other, as is the case for Δ u and Δ 1,roof , or 5−75 and 5−95 , as well as several spectral acceleration values, as expected. It is worth noting that the term collinear is used here to stress that, since Pearson's correlation is used, a low correlation coefficient does not imply that the referred feature is unimportant or unrelated to the other, but just that it is not linearly correlated. Although collinearity between features may pose as a problem, 62 it is not always the case, 63 and it is still interesting to observe which of these collinear features is more influential to the seismic drift response of SMRFs. The last column and row of the heatmaps in Figure 12 show the collinearity of each feature with the target variable, mod . Interestingly, most structural parameters ( Figure 12A) and duration-related metrics ( Figure 12B) have very low collinearity with mod . Although ground motion duration is not expected to be used independently to predict seismic responses, it has been recently argued that duration may play a role in the seismic response of yielding systems, even at low levels of demand. 42 On the other hand, the velocity-, acceleration-, ( Figure 12B) and spectral acceleration-related parameters ( Figure 12C) have prominent collinearity with mod .
With regard to so-called wrapper methods, the optimal number of features to be included in a predictive model can be informed by observing the model's performance with certain features in the forward selection and backward elimination results. No significant change in the performance after five features was found. This number of features is also considered to be practical in the context of reducing the complexity of a predictive model equation. The selected feature subsets from  Figure 13A,B, respectively. On the other hand, Figure 13C presents the corresponding fiive-feature subsets with the highest performance score obtained from exhaustive feature selection. In this last case, all possible five-feature combinations from all the parameter sets were evaluated (53,130 in total). The results from the wrapper methods consistently select the same features. Some of these selected features are not surprising. For example, ( 1 ) is widely accepted to govern the seismic response of structures, 64 and PGV is recognised as a better indicator of the structural seismic response than other peak values like PGA. 65 It is also worth noting that acceleration-related features, CAV, PGA and AI, also appear in the subsets with the highest performance scores identified by the exhaustive feature selection results as they have high collinearity with mod .
Finally, the results of the embedded method carried out by means of Random Forest are depicted in Figure 14. The random forest default feature importance and permutation feature importance are shown in this figure. Generally, consistent results are observed among the different methods where PGV, ( 1 ), 1 , and 1,max , all appear at the top rows of default and permutation importance methods. However, several unexpected features with very low collinearity with mod are also identified (e.g., 1 / m , PGD). The appearance of the mean ground-motion period, m , is in line with previous studies that highlighted its role in expected drift demands [66][67][68] and PGD may describe well drifts for taller buildings experiencing limited non-linear deformations. Besides, in contrast with the results from the other feature selection methods, the drop column importance shows that the behaviour factor ( ), maximum inter-storey drift at the first yield ( 1,max ) and several spectral accelerations for periods past the fundamental structural period have negative importance. These unforeseen results from the embedded method can be partially attributed to the collinearity between the features. 55 It should be noted that the feature importance in embedded methods is calculated by considering all features to behave individually and independently of each other and that collinearity acts differently in random forest models. In the random forest with default and permutation feature importance, the importance of collinear features is shared with each other. The amount of this share was found to be a function of how much noise there is between the collinear features by Parr et al. 55 For example, 5−75 and 5−95 are assigned equal importance as they are highly collinear and have very low noise between them. The effect of collinearity is also present in the drop column importance, as this method calculates the importance from a drop in accuracy. Hence, when one of the highly correlated features is dropped, the accuracy will hardly change, and the importance assigned to this feature will be very small, regardless of its actual importance causing, for example, some spectral accelerations (which are highly collinear between each other) to be assigned very low importance values. 55

Model formulation
The differences in feature selection outputs discussed above stress the need to carefully examine the results of ML algorithms as they may behave differently for different datasets. With these differences in mind, several subsets were trialled to arrive at the predictive model with a balance between accuracy and practicality. In the end, the five features used in the predictive model for mod at both sites (Oakland and Sonora) are PGV, ( 1 ), 1 , and 1,max . This is an important development facilitated by the observed results since it allows us to formulate models with the same functional form and coefficients that preserve the hazard-consistency attributes of the ground-motion database. As usual, the database generated as described earlier in this paper was divided into 80% training and 20% validation. Multi-linear regression was performed on the training datasets to derive the EDP predictive model for mod as ln mod = 0 + 1 ln + 2 ln ( 1 ) + 3 ln 1 + 4 ln + 5 ln 1,max where PGV is expressed in cm/s, ( 1 ) in g, and the coefficients 0 to 5 take the values presented in Table 4. The model was validated using the testing datasets of each site. Figure 15 shows the model's prediction accuracies in colour shade format in terms of kernel density distribution. 69 It can be seen that most of the data, denoted by darker shade, falls around the reference line, where the predicted target is similar to the observed data. The accuracy of the model can be evaluated by the coefficient of determination, 2 , which measures how much variability in the dependent variable can be explained by the model. 70 While 2 is a relative measure, the mean square error (MSE) is an absolute measure of the fit, obtained by calculating the sum of the square of prediction errors divided by the number of data points. The root mean square error (RMSE) square roots the MSE back to the same level of prediction error and makes it easier to interpret. In addition, to assess the relative difference, the median absolute relative derivation (MARD) 71 is also provided to highlight the central tendency of the relative deviation of the prediction from the true value. Besides, the 30% parameter proposed by Guan et al. 7 that shows the fraction of the dataset whose relative difference does not exceed 30% is also given. The accuracy measures of the model are shown in Figure 15, where they are generally acceptable and comparable to other  7,72,73 or even broader earthquake engineering settings. 74 The Sonora site has a noticeably higher model accuracy ( 2 = 0.95; MARD = 0.05) compared to Oakland ( 2 = 0.83; MARD = 0.17), which is attributed to the lower seismic demand and associated dispersion at Sonora, as discussed previously in this paper.
In a second stage, we also explored whether improved models could be obtained by means of multivariate adaptive regression splines (MARS). MARS, first introduced by Friedman, 75 constructs an ensemble of linear functions joined together by one or more hinge functions. The py-earth package 76 was used to conduct the MARS analyses in this study. The predictive model obtained with MARS has the form ln mod = 0 + 1 ln + 2a max(0, ln − 1 ) + 2b max(0, 1 − ln ) + 3a max(0, ln ( 1 ) − 2 ) + 3b max(0, 2 − ln ( 1 )) + 4 ln 1 + 5 ln + 6 ln 1,max where are regression coefficients and are the knots where the behaviour of the model changes. To illustrate these knots, Figure 16 shows the knots for PGV and ( 1 ) for Frame 24, as an example. The MARS-based predicted mod presented in these figures was calculated at fixed values of ( 1 ) (for Figure 16A) and PGV (for Figure 16B). To this end, the ( 1 ) for Frame 24 at a 2475-year return period was obtained from the UHS at 0.406 g, while PGV was calculated from the model developed by Huang and Whittaker 77 at a 2475-year return period. It can be seen from Figure 16 that the PGV presents a knot, 1 , at PGV = 604 cm/s, whereas the knot for ( 1 ), 2 , forms at ( 1 ) = 4.7 g. In both cases, the knots are located at ground-motion parameter values associated with a high earthquake intensity. Despite its sophistication, the MARS-based model does not yield a considerable increase in accuracy in comparison with the multi-linear regression model presented before. A net increase of 0.009 is observed for the 2 value between our MARS and multi-linear regression models, while the RMSE is reduced by only 0.001. Likewise, the MARD is only reduced by 0.001, and there is no increase in the 30% value. This can be due to the knots forming at high earthquake intensities where the available data are limited, for example, although the PGV knot at 604 cm/s in Figure 16A changes the tendency dramatically, there are only a few data points available for PGV > 604 cm/s. In considering the complexity of the resulting expressions and the overall predictive performance, we have opted to favour the multi-linear regression approach (Equation 4) and hence have decided not to present the MARS coefficients.

Global drift model
In addition to inter-storey drifts, it is sometimes important to get an independent estimation of roof drifts. The same steps described above for the formulation of the inter-storey drift models were followed to construct a global drift model. These steps are summarized in this section, although for reasons of brevity, the discussion is succinct.

Feature selection
A feature selection process in the same line as that explained above for mod was also carried out for Δ mod as the target variable at both sites. As before, an optimal number of features of 5 was identified for building the global drift predictive model. The selected feature subsets from forward selection, backward elimination and exhaustive feature selection are presented in Figure 17. Again, the results from the default and permutation importance from the random forest method are slightly different from the results obtained from the wrapper methods, which is attributable to the presence of the collinear features as explained in the case of mod in the previous section.

Model formulation
Several feature subset based on the feature selection results were tried to obtain a predictive model for Δ mod with a balance between accuracy and practicality. PGV, ( 1 ), 1 , and Δ 1,roof were finally selected, and the following Δ mod predictive whose coefficients 0 to 5 are presented in Table 5.
The model was validated using the testing datasets of each site. Figure 18 shows the prediction accuracy together with the 2 and RMSE values. The Sonora site, once again, has noticeably higher model accuracy ( 2 = 0.94) compared to the Oakland site ( 2 = 0.73). Interestingly, the model for Δ mod was found to have a higher bias in Oakland than its mod counterpart. Since the global drift is influenced by the number of storeys and the higher modes, its dispersion is larger than that associated with inter-storey drifts and exacerbated by the higher earthquake intensities expected in Oakland in comparison with Sonora.
As before, the MARS methodology was also tried, but it was associated with a rather small increase in accuracy for a significant increase in the complexity of its functional form. Thus, the multi-linear regression approach was selected.

COMPARISON WITH AVAILABLE MODELS
As mentioned before, several researchers have proposed a number of simplified models for the estimation of seismic drifts in SMRFs. Strictly speaking, such models are not directly comparable with our predictions since the parameter space explored is not identical, and hazard consistency has been overlooked in past investigations. However, some overlap exists, and a comparison is still possible in order to highlight the importance of incorporating hazard considerations into the prediction of local and global drift demands on SMRFs. Therefore, in this section, we compare our response predictions which can be classed as semi-probabilistic, 78 with the deterministic models proposed by Karavasilis et al., 16 Kumar et al. 17 and Bravo-Haro et al. 43 It is recognised that, unlike probabilistic approaches like the one followed in this study, there is a number of uncertainties a deterministic procedure will not be able to reflect. It should be noted, however, that all the deterministic models considered in our comparisons used the same seismic design spectrum as the one employed in this paper ( Figure 2) and that the building configurations employed are largely similar. First, the model proposed by Karavasilis et al. 16 for the estimation of Δ u as a function of the ratio is expressed as Second, the model proposed by Kumar et al. 17 for the estimation of maximum inter-storey and global drift modification factors that has the following form: where and are regression coefficients. Third, the model proposed by Bravo-Haro et al. 43 for the estimation of inter-storey drifts and roof displacement in SMRF is ln mod = 0 + 1 + ( 2 + 3 ) ln where and ℎ are regression coefficients. Karavasilis et al. 16 observed that the equal displacement rule adopted in some seismic codes (e.g., Eurocode 8 2 and NTC 2018 79 ), whereby = , leads to inaccurate estimates of and proposed explicit expressions for a more accurate determination of the behaviour factor. These expressions have been employed herein in conjunction with Equation (7) for consistency.
Also importantly, the expressions by Kumar et al. and Bravo-Haro et al. require the evaluation of the ground-motion mean period, m . Since m is scenario-dependent, these procedures can be catalogued as 'semi-probabilistic' since they allow for the incorporation of a probabilistic component within a deterministic framework. 80 However, despite being used in several applications in structural and geotechnical engineering, 60,67,81,82 only a limited number of ground motion prediction equations (GMPEs) are available for m . In this paper, the GMPE proposed by Du, 83 constructed on the basis of 8491 recordings from the NGA-West 2 database 37 and based on a physical interpretation of m was used. In terms of soil conditions, a s,30 of 760 m/s was considered, and the depth of the top of the fault rupture was set equal to 5 km, while directivity was disregarded. Furthermore, since the depth to the 1000 m/s shear wave isosurface, 1 , was not available, the recommended default value of 1 = 0 was employed. 83 The same earthquake scenarios used for estimating m were also considered for the estimation of PGV required for the hybrid ML-based models proposed in this study. To this end, the conditional ground motion model proposed by Huang and Whittaker, 77 as advised by FEMA P-58, 5 was adopted.
Typical results predicted by the models proposed by Karavasilis et al., 16 Kumar et al. 17 and Bravo-Haro et al. 43 for Frame 03 (3S-3B), Frame 09 (6S-3B) and Frame 15 (9S-3B) are presented in Figure 19 for = 1/475 and = 1/2475. Since the database employed by Kumar et al. 17 considers SMRFs of up to seven storeys, Kumar et al.'s estimates for nine-storey buildings are not presented in this figure. It is evident from Figure 19 that all available models severely overestimate both the peak inter-story drifts and roof displacements. In fact, all these models predict all frames to have collapsed based on the FEMA 356 39  On the other hand, the hazard-consistent predictive models developed in this study, also presented in Figure 19, are significantly more consistent with the hazard curves. Although providing a lower bound estimate of responses, the hazard-consistent estimations found with Equations (4) and (6) are able to replicate the hazard level scaling and provide correspondingly higher drift estimates for lower annual exceedance rates. A complete comparison between the CSS inter-storey demands for all frames and the estimations from the proposed and available models is presented in Figure 20 for inter-storey drifts demands and in Figure 21 for roof displacements. It can be clearly seen that all previous models severely overestimate the actual demand, sometimes excessively, while the proposed models can predict the demand quite well. It should also be noted that some of the available models, notably, 43 which produce the highest overestimations, were developed using IDA analyses rather than hazard-consistent PSDA, which stresses the importance of a hazard-consistent analysis. Although the bias introduced by deterministic predictive models is now generally recognised, the results presented above stress the sheer level of their overestimations and call for the use of hazard-informed predictions in seismic assessments.

CONCLUSIONS
This study has evaluated the inter-storey drift and roof displacement demands on SMRFs within a hazard-consistent PBEE framework. To this end, NRHA of 24 SMRFs were conducted using a database of 816 ground motions representative of two sites with different seismicity. We have used the CSS methodology to estimate the rate of occurrence of each ground motion, hence ensuring the hazard consistency of our outputs. Several observations can be made from the comparisons of engineering demand parameter curves (EDP-hazard curves): (i) site-specific performance assessments are needed in order to ensure economical designs although EC8 appears inherently unable to offer a consistent level or risk for a given EDP; (ii) contrary to recent suggestions based on non-hazard consistent results that call for widespread use of deteriorating models even at lower demand levels; when looked at from a hazard-consistent perspective, the inherent variability in the response overtakes any deterioration effect making it sensible to avoid the complexity of using advanced deteriorating models for assessments that are not intended to reach collapse scenarios; (iii) the plasticity resistance ratio influences the inter-storey drift demand more than the number of bays or storeys, while the latter has a greatest effect on the roof displacement demands. The insignificant influence of the number of bays at lower hazard levels, and its immateriality at higher hazard levels contradict the commonly held belief that a larger number of bays is conducive to a better seismic response.
The extensive database of drifts constructed in this study was used for the development of EDP predictive models. This was preceded by a careful feature selection process using several ML algorithms. PGV, ( 1 ), 1 , and 1,max were selected as the best predictors for inter-storey drift, while PGV, ( 1 ), 1 , and Δ 1,roof were identified for roof displacement. Although these structural and earthquake features have been recognised separately in previous studies as potentially good predictors; their relative importance, as obtained from the ML algorithms used here, can differ. The differences in the results of each ML algorithm described and discussed in this study stress the need to interpret the results of ML tools carefully.
The developed hazard-consistent predictive models were tested, and their good prediction accuracies were confirmed. The Sonora site has a higher model accuracy compared to Oakland, which is attributed to the lower seismic demand and the associated lower response dispersion at Sonora. More sophisticated MARS models were also developed, but they produced only marginal improvements. Although a more advanced ML algorithm may increase the accuracy of the models, this is outside the scope of this paper, which is to assess the influence of hazard-consistency on these models, and to strike the right balance between accuracy, interpretability and, most importantly, practicality (e.g., for hazard integration). Last, our new models were compared with available deterministic and hybrid approaches from the literature. We found that all currently available models can lead to severe overestimations of both peak inter-story drifts and roof displacements. In contrast, the hazard-consistent predictive models developed in this study were able to replicate well the structural drift response and its scaling with seismic hazard level (i.e., provide correspondingly higher drift estimates for lower exceedance rates). The results and findings of this study stress the importance of hazard consistency in PBEE design and assessment and provide the framework to implement it directly in a simple yet accurate way.

A C K N O W L E D G E M E N T
The first author would like to thank the support of Indonesia Endowment Funds for Education (LPDP).

C O N F L I C T O F I N T E R E S T
The authors declare no potential conflict of interests.

D ATA AVA I L A B I L I T Y S TAT E M E N T
The data that support the findings of this study are available from the corresponding author upon reasonable request.