Optimum distribution of seismic energy dissipation devices using neural network and fuzzy inference system

The current study proposes a framework to provide an optimal distribution of energy dissipation devices for low‐ and mid‐rise framed buildings using an artificial neural network (ANN) and fuzzy inference system (FIS). Three illustrative framed buildings to be retrofitted using steel slit dampers are presented to demonstrate the effectiveness of the proposed framework. Two hundred natural earthquake records are used to consider variability in seismic ground motions, and 33,600 nonlinear time history analyses (NLTHAs) are conducted to train the framework. Three engineering demand parameters are used to represent the strength and serviceability requirements concurrently, and the fuzziness of structural limit states is also included within the framework. The framework is fine‐tuned by carrying out sensitivity exercises based on different sample sizes to select the most appropriate training algorithm, activation function, and ANN architecture. After that, the proposed framework is compared with three different supervised machine learning classification algorithms, namely support vector machine, decision tree, and bagged ensemble. The results show the superiority of the proposed framework over the conventional machine learning algorithms in predicting the ranking and obtaining the optimum retrofit scheme for low‐ and mid‐rise framed buildings. Finally, NLTHAs are conducted to validate the results produced by the framework, which are found to be in good agreement with the NLTHA test results.


INTRODUCTION
Recent advances in deep learning techniques have proved to be a reliable alternative to traditional modeling techniques for solving structural problems (Lee, Ha, Zokhirova, Moon, & Lee, 2018). Recently, more attention has been given to the application of soft computing to earthquake engineering (Salehi & Burgueno, 2018). Falcone et al. (2020) highlighted that soft-computing techniques, such as neural networks and fuzzy logic, are emerging trends in seismic engineering and more studies are needed in this research area. In structural optimization design problems for buildings, most of the previous studies have focused on the structural safety aspect (i.e., story drift), and only a few studies (e.g., Xu et al., 2017) considered the serviceability aspect. Neglecting serviceability during the design stage may hinder the structure from its functionality. The importance of serviceability is highly increased for important facilities, such as hospitals and data centers (Cimellaro et al., 2010). Commonly, floor acceleration is used as the representative engineering demand parameter (EDP) for serviceability (Kim & Roschke, 2006).
Another important aspect that is commonly overlooked in the retrofit and optimization studies is the increase in the seismic demands on the foundation due to global retrofit (e.g., attachment of energy dissipating devices) (Kam & Pampanin, 2010). This necessitates the inclusion of the base shear as an additional EDP during the design stage. Moreover, the multiobjective nature of the engineering design process (ASCE-41, 2017) requires considering multilevel seismic design criteria (e.g., limit states: operational, fully operational, life safety, etc.).
Furthermore, most of the studies deal with damage limit states as clear boundaries, and the transition from one limit state to another is always abrupt (Javidan & Kim, 2019). However, in reality, the move from one limit state to its neighbors is gradual rather than sharp. Besides, the structural damage may belong to a particular limit state to some partial degree and another limit state by another degree at the same time.
Early studies of optimal control of structures (e.g., Adeli & Saleh, 1997) opened the door for many researchers to provide effective optimum control schemes for structures. For example, Kim and Adeli (2004) provided a hybrid feedback-least mean square algorithm for structural control. The same algorithm using a wavelet approach has been applied to the control of cable-stayed bridges (Kim & Adeli, 2005a). The wind-induced motion of high-rise buildings is considered a structural control challenge; however, Kim and Adeli (2005b) proposed a semiactive tuned liquid column damper and showed its effectiveness on a 76-story building benchmark control problem. Recent literature on structural control showed the potential of recent control algorithms (e.g., Gutierrez Soto & Adeli, 2017a; Li & Adeli, 2018) and vibration control devices with effective placement (e.g., El-Khoury & Adeli, 2013; Gutierrez Soto & Adeli, 2013). Gutierrez Soto and Adeli (2017b) integrated different replicator controllers with a multiobjective optimization algorithm to obtain Pareto-optimal values for achieving maximum structural performance with minimum energy consumption. Gutierrez Soto and Adeli (2018) used a neural dynamic optimization model and replicator dynamics to control the vibration of smart base-isolated irregular buildings.
Some studies focused on applying ANN to structural optimization problems; for example, Salajegheh and Gholizadeh (2005) improved a genetic algorithm using ANN to enhance and accelerate the optimum design of structures. Gholizadeh et al. (2008) used a wavelet radial basis function ANN to improve the performance of the genetic algorithm model for structural optimization with frequency constraints. Gholizadeh and Samavati (2011) used wavelet transforms and neural networks to improve the structural optimization process. Hashemi et al. (2016) proposed a wavelet neural network-based semiactive control model to compute the input voltage to magnetorheological (MR) dampers to generate the optimum control force of structures.
The fuzzy concepts have also been applied for structural optimization and control; for example, Gholizadeh and Salajegheh (2009) proposed a metamodel called fuzzy self-organizing radial basis function network to reduce the computational burden of the structural optimization process under seismic time history analysis. Jiang and Adeli (2008) introduced a dynamic fuzzy wavelet neuroemulator for predicting the structural response. Uz and Hadi (2014) used integrated fuzzy logic and multiobjective genetic algorithm to obtain an optimal design of a semiactive control model for adjacent buildings connected by MR damper. Bozorgvar and Zahrai (2019) proposed a semiactive control model of buildings using MR damper to determine input voltage using ANFIS and Fuzzy CoCo.
Many studies have investigated the application of soft computing such as ANN and fuzzy logic (Siddique & Adeli, 2013) to the prediction of structural response and optimization; however, more studies are still needed to investigate the efficiency of these techniques in predicting the optimum location of energy dissipation devices (EDDs) or predicting maximum drift using actual structures rather than simplified models. Recent studies (e.g., Sheibani & Ou, 2020; Xu et al., 2021) applied soft computing to effective postearthquake response and survey of structural damage.
The purpose of the current study is to propose a framework for prediction of the optimal locations of EDDs in structures considering variability in natural seismic ground motion excitations. To this end, a combined regression supervised machine-learning technique based on a multilayer feedforward neural network (ANN) and fuzzy inference system (FIS) has been developed. Three different EDPs will be considered; namely, maximum interstory drift ratio (D), floor acceleration (A), and base shear (V). These EDPs cover the structural safety and serviceability requirements and monitor any side effects due to retrofit or upgrading of the structure, such as increase of the seismic base shear demands. Also, three different limit states are incorporated in the proposed framework considering the vagueness and fuzziness inherent in these limit states. Considering multiple limit states is crucial for performance-based seismic design. The effectiveness of the proposed framework is verified through comparison with nonlinear time-history analysis (NLTHA) results and three conventional supervised classification learning techniques, which are support vector machine (SVM), fine decision tree (FDT), and classification bagged ensemble (CBE).

OUTLINE OF THE PROPOSED FRAMEWORK
Seismic retrofit of an existing structure is a complex process and, in many cases, the designer needs a quick tool for judging the suitability of a retrofit case (FC) based on the strength and serviceability limit states. The situation may be challenging if there are architectural limitations on adding retrofit devices such as EDDs on specific floors. Besides, the whole process may become more complicated if the uncertainty in the ground excitation is considered. Moreover, the large number of available FCs requires an optimization scheme to provide the optimal alternative. For example, if a 10-story building is to be investigated for having an EDD at each floor, the total number of available FCs is 2¹⁰ = 1,024 cases (assuming that each floor has the two options of having or not having the EDD). To this end, a framework is required to solve the whole problem accurately and systematically in a short time.
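To make the combinatorial size of the problem concrete, the candidate retrofit cases can be enumerated as binary damper-placement patterns. The sketch below is illustrative only; the `blocked_floors` argument is a hypothetical way of encoding the architectural limitations mentioned above, not part of the paper's framework.

```python
from itertools import product

def enumerate_retrofit_cases(n_stories, blocked_floors=()):
    """Enumerate all damper on/off patterns for a building,
    skipping any pattern that places a damper on a blocked floor
    (hypothetical constraint input for illustration)."""
    cases = []
    for pattern in product((0, 1), repeat=n_stories):
        if any(pattern[f] for f in blocked_floors):
            continue  # damper requested on an architecturally blocked floor
        cases.append(pattern)
    return cases

# 10-story building: 2**10 = 1,024 candidate retrofit cases
assert len(enumerate_retrofit_cases(10)) == 1024
```

Blocking even a single floor halves the search space, which illustrates why an automated ranking scheme is preferable to exhaustive manual assessment.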
The main idea of the proposed framework, as depicted in Figure 1, is that it takes the number of floors of the building, the FC index, and the earthquake response spectrum as inputs. An ANN is then trained to predict the D, A, and V for each FC. These values are used as crisp inputs for the FIS. The fuzziness and vagueness of a set of predefined limit states are considered in the FIS, and the output is a defuzzified crisp value. This value is the evaluation ratio for each FC, which is used for ranking the FCs and selecting the optimum one. The steps of the proposed framework are summarized as follows:

3. Conduct data set processing, such as outlier detection and removal, on the input and output data. Then, apply a linear normalization based on the minimum-maximum technique to normalize the data sets, which are divided randomly into 70% for training, 15% for validation (to avoid overfitting), and 15% for testing.
4. Train the ANN using different backpropagation optimization algorithms and architectures to obtain the highest accuracy. One ANN is trained for each output parameter individually with the same input.
5. Use the output vectors [D], [A], and [V] from the previous step as the input for the FIS. At the same time, define the limit states (LS1, LS2, and LS3) for each EDP, including the preferred weight for each limit state and the linguistic description (e.g., low, medium, and high).
6. Define the shape and number of the membership functions (a Gaussian membership function is used) for the FIS (a Mamdani-type FIS is used).
7. Define the fuzzy operator, the fuzzy rules (and their weights), and the defuzzification method (the centroid defuzzification method is used). The defuzzified value of the FIS is used as an evaluation ratio for each FC.
8. Rank the FCs based on the evaluation ratios and select the optimum FC with the highest evaluation ratio.
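The 70%/15%/15% random split in the steps above can be sketched as follows. `split_70_15_15` is a hypothetical helper name; integer arithmetic is used so the subset sizes are exact.

```python
import random

def split_70_15_15(samples, seed=0):
    """Randomly partition samples into 70% training, 15% validation,
    and 15% testing subsets (proportions from the framework steps)."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n = len(samples)
    n_tr = n * 70 // 100          # training size
    n_va = n * 15 // 100          # validation size (to monitor overfitting)
    train = [samples[i] for i in idx[:n_tr]]
    val = [samples[i] for i in idx[n_tr:n_tr + n_va]]
    test = [samples[i] for i in idx[n_tr + n_va:]]
    return train, val, test
```

For the 1,600 three-story samples mentioned later, this yields subsets of 1,120, 240, and 240 samples.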
As discussed above, the proposed framework is different from the common combinations of ANN and FIS found in the literature, which are cooperative, concurrent, and hybrid (Siddique & Adeli, 2013). These combinations focus on creating a rule base, adjusting the membership functions, or determining other FIS parameters through ANN learning methods. However, in the current study, the main focus is dedicated to adjusting the ANN parameters, algorithm, and architecture to be suitable for the prediction of the crisp inputs of the FIS. Then, the crisp output will be used as an evaluation parameter for the FC. Three ANNs are arranged in parallel and then in series with the FIS. This makes the framework a simple yet effective preliminary design tool for the designers in the early design stage, where quick and reliable tools are highly required. The computational novelty of the proposed framework lies in its simplicity and effectiveness for predicting the optimum retrofit scheme while avoiding any complexities that hinder its suitability for the early decision-making process of the structural retrofit.

Artificial neural network
Artificial neural networks (ANNs) are mathematical models inspired by the human brain. An ANN consists of several layers, each containing several neurons. The mathematical model consists of input, hidden, and output layers (Farfani et al., 2015; Rafiei & Adeli, 2017).

FIGURE 1. Flowchart of the proposed smart framework based on a supervised machine learning technique using ANN and fuzzy inference system (FIS).

In the current study, several trials have been conducted to obtain the most accurate prediction of the output vectors [D], [A], and [V]. It is found that using a separate ANN for each output parameter provides the highest prediction accuracy. In addition, different topologies, sample sizes, algorithms, and activation functions have been investigated to assess the performance of the ANNs, as explained in Section 5. The normalized mean squared error (MSE) and linear correlation factor (R) are widely accepted in the literature (e.g., Farfani et al., 2015; Gholizadeh & Salajegheh, 2009) as reliable metrics for assessing the performance of an ANN. In the current study, these two performance parameters are used and calculated as follows:

$$\mathrm{MSE} = \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} \left( d_{ij} - y_{ij} \right)^{2}$$

$$R = \frac{\sum_{i=1}^{I} \left( d_{i} - \bar{d} \right) \left( y_{i} - \bar{y} \right)}{\sqrt{\sum_{i=1}^{I} \left( d_{i} - \bar{d} \right)^{2} \sum_{i=1}^{I} \left( y_{i} - \bar{y} \right)^{2}}}$$

where I is the number of test data, J is the number of neurons in the output layer, d_ij and y_ij are the predicted and target solutions for the ith series of data at the jth neuron of the output layer, respectively, and d̄ and ȳ are the means of the predicted and target solutions d_i and y_i, respectively. As R approaches 1 and MSE approaches 0, the performance of the ANN improves. In addition, the error histogram (EH) is used to represent the errors between the target and predicted values after training.
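The two performance metrics can be computed directly from their definitions. The functions below are a minimal sketch with hypothetical names (`pred`/`target` are the predicted and target outputs), not the MATLAB implementation used in the study.

```python
import math

def mse(pred, target):
    """Mean squared error over I samples and J output neurons.
    pred and target are nested lists: one row per sample."""
    total = sum((d - y) ** 2
                for drow, yrow in zip(pred, target)
                for d, y in zip(drow, yrow))
    count = sum(len(row) for row in pred)
    return total / count

def correlation_r(pred, target):
    """Linear correlation factor R between flat lists of predicted
    and target values."""
    dbar = sum(pred) / len(pred)
    ybar = sum(target) / len(target)
    num = sum((d - dbar) * (y - ybar) for d, y in zip(pred, target))
    den = math.sqrt(sum((d - dbar) ** 2 for d in pred)
                    * sum((y - ybar) ** 2 for y in target))
    return num / den
```

A perfect predictor gives MSE = 0 and R = 1, matching the criterion stated above.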

Fuzzy inference system
FIS is a mathematical technique that employs fuzzy logic for nonlinear mapping between a given input space and the corresponding output space (Adeli & Hung, 1995). FIS can handle knowledge uncertainty and measurement imprecision efficiently (Khalifa & Frigui, 2015). The implementation of the FIS can be established using two different approaches, namely, Mamdani and Sugeno (Gholizadeh & Salajegheh, 2009). The Mamdani approach is used in the current study. Figure 2 shows the main steps of the FIS process. The first step is to take the inputs (D, A, and V) and map them to membership functions to determine the degree to which they belong to each of the fuzzy sets (fuzzification).
The second step is to apply the fuzzy operation (AND or OR). If the antecedent of a rule has more than one part, the fuzzy operator is applied to obtain one number that represents the rule antecedent. Each limit state is assigned to a membership function that associates a number in the interval [0, 1] with each input variable. At this stage, a specific weight for each rule should be assigned (assumed 1.0 in the current study).
The third step is to apply the implication method to obtain the consequent, which is a fuzzy set represented by a membership function. The consequents in the rule base of the Mamdani inference approach are formulated using expert knowledge. The logic connective "and" is represented by the T-norm (minimum), whereas the logic connective "or" is represented by the S-norm (maximum) (Pourjavad & Shahin, 2018). The inference engine assigns an implication relation (R) to each activated rule to relate the fuzzy number resulting from the logic operations to the consequent fuzzy set. In the Mamdani approach, a commonly used implication operator is the T-norm (minimum), which truncates the output fuzzy set as shown in the figure.
In step 4, the output fuzzy number of each rule is aggregated using a composition operator. The maximum operator is used for aggregation in the current study.
In step 5, the output fuzzy numbers are converted to a crisp number through the defuzzification process. In the current study, the center of area (CoA) technique is used for defuzzification. This technique provides a crisp value based on the CoA of the fuzzy set. The total area of the membership function distribution is divided into a number (n) of subareas, which are used to find the defuzzified value z* of a discrete fuzzy set. For a discrete membership function, z* is given by

$$z^{*} = \frac{\sum_{i=1}^{n} z_{i} \, \mu(z_{i})}{\sum_{i=1}^{n} \mu(z_{i})}$$

where z_i is the sample element, μ(z_i) is its membership value, and n is the number of elements in the sample (Pourjavad & Shahin, 2018).
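The five FIS steps can be condensed into a small single-input sketch: Gaussian membership functions, minimum implication, maximum aggregation, and discrete centroid (CoA) defuzzification. The rule structure and names here are illustrative assumptions, not the actual rule base of the study.

```python
import math

def gauss_mf(x, center, sigma):
    """Gaussian membership function, as used in the framework."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def mamdani_evaluate(crisp_input, universe, rules):
    """Minimal single-input Mamdani sketch. Each rule is a pair
    (antecedent_mf, consequent_mf); min implication truncates the
    consequent, max aggregates rules, centroid defuzzifies."""
    aggregated = [0.0] * len(universe)
    for antecedent, consequent in rules:
        firing = antecedent(crisp_input)                   # fuzzification
        for k, z in enumerate(universe):
            truncated = min(firing, consequent(z))          # min implication
            aggregated[k] = max(aggregated[k], truncated)   # max aggregation
    num = sum(z * mu for z, mu in zip(universe, aggregated))
    den = sum(aggregated)
    return num / den if den > 0.0 else 0.0                  # discrete CoA
```

A full implementation would take D, A, and V as three inputs and combine rule antecedents with the T-norm/S-norm operators described above; the single-input version keeps the mechanics visible.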

Classification algorithms for comparison
In the current study, three different supervised classification machine learning methods are compared with the proposed framework. These methods are SVM, CBE, and FDT. The algorithms for training, validating, and testing these models are coded in the MATLAB (MathWorks, 2020) environment.

FIGURE 2. ANN architecture and mathematical model.
The first three steps in the proposed framework are used here to prepare the input data for the classification algorithm (CA). Subprocess (1) in Figure 1 is used for training the CAs. The main parameters required for CA training (e.g., kernel function parameters, convergence rate, acceptable error, penalty factor) are selected based on accuracy indicators (e.g., confusion matrix, scatter plot, and receiver operating characteristic [ROC] curve) after a few trials. After training, the output of the CA is a class label for each FC instead of the numeric values of D, A, and V. The number of class labels for the CA is 4³ = 64 (the three limit states define four interval classes, A, B, C, and D, for each of the three EDPs). The CA is used to predict the appropriate class of the retrofit scheme. For example, if a retrofit scheme has the class label "AAA," all three EDPs (V, A, and D) are below the immediate occupancy (IO) limit state. However, if the class label is "BBD," V and A fall between the IO and life safety (LS) limit states, but D exceeds the collapse prevention (CP) limit state. Different CAs are then trained and tested, and the best one is selected for prediction of the FC label. The labels are then ranked and the optimum one is selected.
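The label space and the interval classification can be sketched as follows; the threshold values passed to `edp_class` are placeholders, since the actual IO/LS/CP limits depend on the EDP and the code provisions.

```python
from itertools import product

def all_class_labels():
    """All possible CA labels: four interval classes (A-D) for each of
    the three EDPs (V, A, D), giving 4**3 = 64 labels."""
    return ["".join(combo) for combo in product("ABCD", repeat=3)]

def edp_class(value, thresholds):
    """Assign an interval class from the three ordered limit-state
    thresholds (IO, LS, CP). Threshold values are placeholders,
    not taken from the paper."""
    for label, limit in zip("ABC", thresholds):
        if value <= limit:
            return label
    return "D"  # exceeds the CP limit state
```

Concatenating the three per-EDP classes (e.g., `edp_class` applied to V, A, and D in turn) reproduces labels such as "AAA" or "BBD" from the example above.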

Ground motion records
The seismic input motions should have three main characteristics. First, they should be a reliable representative of the seismic hazard. Second, they should be diverse enough to represent the real variability in the magnitude and profile of real ground motion events. Third, the parameters used to represent these ground motions should remain simple and readily available for ease of application. The proposed framework maintains these characteristics. For example, the peak ground accelerations (PGAs) and frequency contents (i.e., response spectrum) are used as input in the current study. Recent studies (e.g., Kim et al., 2019; Oh et al., 2020) showed that these inputs are good representatives of the ground excitations. Moreover, these inputs are considered suitable for the designer because of their availability and simplicity. Another important advantage is that the nonlinear behavior under seismic excitation is highly affected by the frequency contents of the earthquake. This means that training ANNs with the response spectrum will enhance their performance in general and their nonlinear response prediction in particular. In the current study, two versions of the response spectrum are investigated, namely a full version and a short version. In the former, many points of the response spectrum are used as input (spectral values from 0 to 5.0 s at every 0.1 s). In the latter, specific controlling points are used (such as points at natural periods of 0.2, 0.5, 1.0, 1.5, and 2.0 s). The diversity of the earthquakes in the current study is based on using real ground excitation events from the PEER (2020) database. The variation in these seismic excitations includes PGA, magnitude (Mw), source-fault mechanism, site-to-source distance, shear velocity (Vs30), and the lowest usable frequency.
A total of 1,000 earthquake events (magnitude range 3.5-8.5) have been selected from PEER (2020) to represent variability in input ground motions. It is found that around 10% of these events are strong ground motions that can be considered in the analysis, while the remaining events have very small PGAs. Based on that, a scale factor of four is used to scale up the ground motion events, and 200 earthquakes are selected with PGAs ranging from 0.017 to 1.8 g. According to previous studies, this scale factor does not distort the seismological characteristics of the ground motions. More samples are used in the range of 0.3 to 0.6 g, as this is the practical range for seismic design of buildings. Table 1 lists the input earthquake parameters used in the current study. Figure 3a shows the response spectra of the 200 input earthquakes (100 fault normal and 100 fault parallel), and Figure 3b depicts a histogram of the PGA ranges used. This selection guarantees the diversity of the input earthquakes over the expected range of occurrence. It should be noted that diversity in ground motions is required for training ANNs. After that, the framework is ready to be used for any earthquake (scaled, unscaled, natural, synthetic, etc.) with specific characteristics for optimization purposes. Figure 4 shows the plan and elevations of the analysis models. For simplicity, one of the outer frames in the short direction is used for analysis, which is reasonable because the building has a rigid diaphragm; the member sections of the models are listed in Table 2.

Characteristics of the model structures
From the OpenSEES (McKenna et al., 2011) element library, an elastic beam-column element is used for modeling beams and columns. A zero-length element is used to connect beams and columns. For modeling the plastic hinges at the ends of the elements, the nonlinear force-deformation relationship of the zero-length element is modeled according to the modified Ibarra-Krawinkler deterioration model (Lignos & Krawinkler, 2011). This model provides a bilinear hysteretic response of the material, which has been calibrated against more than 350 experimental data sets of steel beam-to-column connections.

Multivariate regression formulas are provided to estimate the deterioration parameters of the model for different connection types. In this model, the ratio of the capping moment to yield moment is 1.05. The rotations at yield, post-capping, and ultimate points are 0.025, 0.3, and 0.4, respectively. The residual strength ratio for positive and negative loading direction is taken to be 0.4. It is worth mentioning that the bilinear model has limitations in predicting realistic behavior including the structural deterioration that may inherently exist in the original model. The first-story columns are assumed to be fixed to the ground. Two percent of the critical damping is used for modal and dynamic analyses. According to the modal analysis, the fundamental periods of the three-, five-, and seven-story steel frames are found to be 0.58, 0.64, and 0.92 s, respectively.
In the current study, a steel slit damper is used for efficient seismic retrofit of existing structures (Kim et al., 2017). It consists of many vertical steel strips and is generally placed between stories where interstory drifts are relatively large. It dissipates seismic energy through the hysteretic behavior of the vertical steel strips. The slit damper is modeled using the Menegotto-Pinto-Filippou (MPF) uniaxial steel material, which represents a nonlinear hysteretic constitutive model in the OpenSEES (McKenna et al., 2011) element library. The details of the model are given elsewhere (Kolozvari et al., 2015).

VALIDATION OF THE FRAMEWORK
In this section, the components of the framework are fine-tuned to make the whole framework as accurate as possible. The framework is then assessed using accuracy metrics, such as the normalized mean square error (MSE), regression correlation factor (R), and EH. Finally, the framework is validated through a comparison with different supervised CA methods.

Accuracy of the framework
To enhance the robustness of the framework, proper sampling techniques are needed to effectively represent the entire domain of the input-output variables. Data normalization is commonly used to map the data to a uniform scale, especially if the data have widely different scales. The data have been normalized using a min-max technique to scale all values into the range 0-1. This normalization preserves the relationships among the original data values because it is a linear transformation. The following equation is used for normalization (Li et al., 2000):

$$x_{n} = \min_{2} + \frac{\left( x_{o} - \min_{1} \right) \left( \max_{2} - \min_{2} \right)}{\max_{1} - \min_{1}}$$

where x_n and x_o are the new and old values, respectively; min_1 and max_1 are the minimum and maximum of the original data range, respectively; and min_2 and max_2 are the minimum and maximum of the new data range, respectively. Figure 5 shows the normalized maximum interstory drift ratio of the three-story frame for 1,600 samples. The same is generated for the five- and seven-story frames with 6,400 and 25,600 samples, respectively. As can be seen in the figure, the training set includes data over the entire range of the output, which enhances the accuracy of the backpropagation algorithm. To select the ANN with the highest possible accuracy, different topologies, sample sizes, algorithms, and activation functions have been studied. Based on that, the following alternatives have been investigated: (a) three different numbers of hidden layers (one, two, and three); (b) four different numbers of neurons per layer (10, 20, 30, and 40); (c) three training backpropagation algorithms, namely scaled conjugate gradient (SCG), Levenberg-Marquardt (LM), and Bayesian regularization (BR); (d) three different activation functions, namely the log-sigmoid transfer function (LOGSIG), the linear transfer function (PURELIN), and the hyperbolic tangent sigmoid (TANSIG); and (e) three sample sizes from the total data set (25%, 50%, and 100%).
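The min-max normalization above is a one-line linear map; the sketch below uses hypothetical argument names mirroring the equation's symbols.

```python
def minmax_rescale(x_old, min1, max1, min2=0.0, max2=1.0):
    """Linear min-max normalization (after Li et al., 2000): maps a
    value from the original range [min1, max1] to the new range
    [min2, max2], preserving relative spacing."""
    return min2 + (x_old - min1) * (max2 - min2) / (max1 - min1)
```

Because the map is linear and invertible, predicted outputs can be de-normalized with the same function by swapping the old and new ranges.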
It is found that LM and BR training algorithms provide consistently high accuracy for all models (MSE = 0.005; R = 0.98). On the other hand, the accuracy of SCG decreases in terms of MSE and R as the number of stories increases, especially in the seven-story case.
The optimum number of layers has been investigated to obtain the highest possible accuracy in the framework. It is found that using two layers yields the lowest errors for the building models. Figure 6 shows the drift EH for the seven-story model obtained using two layers, the TANSIG activation function, the LM training algorithm, and 10 neurons per layer.
The optimum number of neurons per layer for all models is found to be either 10 or 30, as shown in Figure 7a; however, the shortest training time is achieved with 10 neurons per layer, as shown in Figure 7b.
It is found that the TANSIG and LOGSIG activation functions provide a better regression factor (R = 0.98) compared to the PURELIN function (R = 0.92) for all models. Figure 8 shows the sample size effect on the MSE and R of the building models using the LM training algorithm, two hidden layers, 10 neurons per layer, and the TANSIG activation function. The total number of NLTHA simulations conducted to produce 100% of all input data is 1,600, 6,400, and 25,600 for the three-, five-, and seven-story models, respectively (33,600 analyses in total). As can be observed in the figure, the effect of the sample size is not significant in the case of the seven-story building. This can be attributed to the large data set created for the seven-story model due to the large number of FC alternatives. On the other hand, the 25% sample size shows significantly lower accuracy compared to 50% and 100% for the three- and five-story models. Figure 9 shows the effect of the input response spectrum on the MSE and R of the building models. The results show that using the full version of the response spectrum generally provides better results than the short version, except for the D of the three-story model. The difference is significant in the case of the three-story V and A output parameters. The difference in accuracy decreases as the number of stories increases. It is noted that the short version requires less time to train the ANN compared to the full one. For example, in the seven-story model, the training times using a 100% sample size for the D, A, and V output parameters with the full-version spectrum are 10, 39, and 22 s, respectively. With the short version of the spectrum, these values are 3, 5, and 2 s, respectively.
Using one ANN for the combined V/A/D output provides an MSE and R of 0.0071 and 0.92, respectively. On the other hand, using a separate ANN for each EDP, these values become 0.0024 and 0.98, respectively. This means that it is better to use a separate ANN for each output parameter, as done in the proposed framework. Based on the previous results, the optimum ANN uses the LM training algorithm, two hidden layers, 10 neurons per layer, and the TANSIG activation function.

FIGURE 9. Comparison of the performance of the ANN based on the MSE and R parameters using the full and short versions of the input response spectrum.

Figure 10 shows the accuracy of the supervised classification methods obtained for different sample sizes in the three-story model. In general, the upper bound of accuracy is found to be 81.2%, 85.55%, and 87.8% for the three-, five-, and seven-story models, respectively. This means that the proposed framework provides much better accuracy than the best conventional CAs. As can be seen in the figure, the SVM has a slight edge over CBE and FDT for small sample sizes (e.g., 25%). For the 50% and 100% sample sizes, CBE and FDT outperform the SVM. It should be noted that the computational effort and the training time required for the CAs are similar to those required for the proposed framework with the recommended algorithms and ANN architecture.

FIGURE 10. Accuracy of different supervised classification algorithms for different sample sizes in the three-story model.

Comparison with classification methods
The total number of FCs for each model is a function of the number of stories. The numbers of FCs for the three-, five-, and seven-story models are 2³ = 8, 2⁵ = 32, and 2⁷ = 128, respectively. Each FC has a unique index that represents a different damper topology based on a vector of zeros and ones. For example, for the three-story model, FC index "1" indicates the case with no damper, and FC index "8" indicates the case with dampers installed in all stories. The Appendix shows the indexing of the three-story building as an example. Table 3 shows the top five optimum FCs obtained from the best CA method and from the proposed framework for each building model under three different earthquake records (listed in Table 4). In the CA column, a label is given to each FC, and in the proposed framework column, the FIS evaluation ratio (ER) is indicated. The ranking based on the reference method, nonlinear time-history analysis (NLTHA), is indicated in the NLTHA column.
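A binary-counting convention consistent with the example (index 1 = no dampers, index 8 = all dampers for three stories) can map an FC index to its damper-placement vector; the exact indexing used in the Appendix may differ, so this is an illustrative assumption.

```python
def fc_topology(fc_index, n_stories):
    """Map a 1-based FC index to its damper-placement vector of zeros
    and ones. Binary-counting convention assumed for illustration:
    index 1 = no dampers, index 2**n_stories = dampers on all stories."""
    bits = format(fc_index - 1, "0{}b".format(n_stories))
    return [int(b) for b in bits]
```

Under this convention, every index from 1 to 2^n maps to a distinct topology, so the index alone identifies a retrofit case in the ANN input.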
An interesting observation in the table is that different FCs have the same label in the CA column. This indicates that the CA could not predict the optimum FC, which can be attributed to the low prediction accuracy of the CAs. For the five- and seven-story models, the CA assigns the same label to the top four FCs; however, none of them is the optimum FC. On the other hand, the proposed framework predicted the optimum FC using the ER even when the differences between the FCs are small, as in the three- and five-story models. The framework shows perfect agreement with the NLTHA ranking in all models except for rank number four in the five-story model. This indicates the high accuracy of the proposed framework in predicting the optimum damper topology for the building models investigated in the current study. Figure 11 shows the optimum damper distribution for the building models obtained from the proposed framework.

FIGURE 11. Optimum damper distribution obtained from the framework.
Based on the above discussion, it can be concluded that the proposed framework provides higher accuracy than conventional CAs. In addition, it can accurately predict the optimum damper distribution using the FIS evaluation ratio. This highlights the novelty of the proposed computational framework, which fine-tunes the existing ANN to be more suitable for the optimization process and enhances its prediction accuracy by combining it with an FIS in a simple but effective way.
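The role of the FIS evaluation ratio can be illustrated with a minimal sketch. The membership functions, the two-rule base, and the use of normalized drift and acceleration as the only inputs are hypothetical simplifications for illustration, not the paper's actual FIS; the point is how fuzzy rules map demand predictions to a single scalar ER used for ranking.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def evaluation_ratio(drift, accel):
    """Weighted-average (Sugeno-style) defuzzification of two rules:
    'both demands low -> good (ER near 1)', 'any demand high -> poor (ER near 0)'.
    Inputs are assumed normalized to [0, 1]."""
    low_d, high_d = tri(drift, -0.5, 0.0, 1.0), tri(drift, 0.0, 1.0, 1.5)
    low_a, high_a = tri(accel, -0.5, 0.0, 1.0), tri(accel, 0.0, 1.0, 1.5)
    w_good = min(low_d, low_a)   # firing strength of rule 1
    w_poor = max(high_d, high_a) # firing strength of rule 2
    return (w_good * 1.0 + w_poor * 0.0) / (w_good + w_poor)

# Rank hypothetical retrofit cases by ER (higher = better):
cases = {"FC-3": (0.2, 0.3), "FC-5": (0.6, 0.4), "FC-8": (0.1, 0.1)}
ranking = sorted(cases, key=lambda k: evaluation_ratio(*cases[k]), reverse=True)
print(ranking)   # ['FC-8', 'FC-3', 'FC-5']: lowest normalized demands first
```

Because the ER is a continuous score rather than a discrete class label, two FCs with nearly identical demands still receive distinct ratios, which is what lets the framework break ties that defeat the classification algorithms.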
It is worth mentioning that the framework can be generalized to different types of earthquakes and different FC alternatives. The framework can also be extended to other structures within the range of the case studies presented.

Validation using NLTH analysis
In this section, the proposed framework is validated using NLTH analysis for the models depicted in Figure 12: (a) a four-story 2D model, (b) a symmetric three-story 3D model, and (c) an asymmetric three-story 3D model. The first two models (Figures 12a and 12b) are designed for gravity loads only based on AISC-360 (2016), with a dead load of 4.1 kN/m² and a live load of 2.5 kN/m². A nominal yield strength of 345 MPa is used for all steel elements. The third model (Figure 12c) is an asymmetric benchmark model used in many previous studies (e.g., Fajfar et al., 2006). The four-story 2D frame model has not been used in the training process of the framework. A vertical irregularity (soft story) exists in its first story according to ASCE-7 (2016). The cross sections of the first- and second-story columns are W14 × 311; the third- and fourth-story columns are W14 × 257. All beams are designed with W14 × 118 sections. The 3D symmetric model has three spans (9.15 m each) in each direction. The story heights and the element cross sections are the same as those of the three-story model shown in Figure 4. The retrofit device is added only to the outer frames in the X-direction. A vertical irregularity (soft story) also exists in the first story of this model according to ASCE-7 (2016).
In the asymmetric 3D benchmark model, the retrofit devices are placed in the X-direction between C8-C6 and C5-C1, and in the Y-direction between C7-C4 and C8-C9. The details of this model are provided in Fajfar et al. (2006). The nonlinear modeling of the elements used in Section 4.2 is applied to the case study structures. The earthquake records used for the NLTH validation (indicated with an asterisk in Table 4) have not been used in the framework training process. Table 5 compares the optimum design results obtained from the proposed method with those of the NLTH analysis. It can be observed that the predicted ranking of the four-story structure shows good agreement with the NLTH analysis ranking for all FCs except the fifth. The minimum accuracy is 97.2%, which confirms the accuracy of the ANNs.
In the case of the symmetric 3D model, the accuracy of the ANNs decreases to 93.1%. The predicted ranking matches well for the first three places; however, the fourth and fifth ranking positions do not match those obtained by the NLTH analysis.
In the asymmetric 3D model, the minimum prediction accuracy of the ANN is found to be 92.1%. The proposed method predicts the optimum ranking correctly, and the predicted ranking matches that of the NLTH analysis for the first three positions, as shown in Table 5; the fourth and fifth positions do not match. This may be attributed to the asymmetric nature of the model, which makes it difficult to predict the EDPs with very high accuracy.
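The position-by-position ranking comparison used in this validation can be sketched as follows. The example rankings are hypothetical FC indices chosen to mirror the symmetric 3D case, where the first three places agree and the fourth and fifth swap.

```python
def matching_positions(predicted, reference):
    """Return the ranking positions (1-based) where the framework's
    predicted ordering agrees with the NLTH reference ordering."""
    return [i + 1 for i, (p, r) in enumerate(zip(predicted, reference)) if p == r]

# Hypothetical top-5 FC indices: first three places match, last two swap.
predicted = [7, 3, 5, 2, 6]
reference = [7, 3, 5, 6, 2]
print(matching_positions(predicted, reference))   # [1, 2, 3]
```

Since the purpose of the framework is to identify the optimum retrofit scheme, agreement at the top of the ranking matters most, and mismatches confined to the lower positions are of limited practical consequence.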
Based on the above results, the scope of application of the proposed framework can be well defined: it encompasses 2D and 3D low-rise buildings with limited irregularity.

CONCLUSION
In the current study, a framework for optimum damper distribution was proposed using an ANN and FIS. The inputs to the framework are the number of stories, the retrofit case index, and the earthquake response spectrum; the output is an evaluation ratio for each retrofit case. The fuzziness of the limit states was considered for the serviceability and strength requirements. Two hundred natural earthquake records were used to represent the variability of the ground motions. Three-, five-, and seven-story frame buildings were investigated for the optimum retrofit scheme using steel slit dampers. A total of 33,600 nonlinear time history analyses were conducted to train the ANN and to obtain the optimum retrofit scheme. The main findings of the current study are summarized as follows:
- The Levenberg-Marquardt and Bayesian regularization training algorithms provided higher accuracy (MSE = 0.005; R = 0.98) than the SCG algorithm.
- The optimum ANN architecture, achieving the highest accuracy (MSE < 0.0055) with the least training time (less than 5 s), has two hidden layers with 10 neurons per layer.
- The TANSIG and LOGSIG activation functions provided better accuracy (R = 0.98) than the PURELIN function (R = 0.92) for all models used in the current study.
- The three- and five-story models were more sensitive to sample size than the seven-story model.
- Using the full response spectrum as an input to the framework provided higher accuracy than the short version with selected points.
- Training a separate ANN for each EDP gave more accurate predictions than combining all variables (V, A, and D) into one output: the MSE and R were 0.0071 and 0.92 for the combined output versus 0.0024 and 0.98 for the separate ANNs.
- The upper bounds of the prediction accuracy of the conventional CAs were found to be 81.2%, 85.55%, and 87.8% for the three-, five-, and seven-story models, respectively, whereas the prediction accuracy of the proposed method reached 98% for all models.
- The FIS proved to be an effective component of the framework, providing accurate evaluation ratios for very similar retrofit schemes. The proposed framework could predict and rank the optimum retrofit cases even when the differences in the evaluation ratios were small.
- The top five optimum retrofit cases were almost identical to those obtained from the nonlinear time history analysis, which indicates the efficiency and accuracy of the proposed framework. The conventional CAs could not classify the optimum cases as accurately as the proposed method.
- The application of the proposed framework to structures not used in the training phase showed high accuracy (97.2%) for low-rise 2D models with limited vertical irregularity (such as a soft story). The accuracy decreased to 93.1% for the symmetric 3D model and to 92.1% for the asymmetric 3D model. Considering the asymmetric and 3D configuration, the accuracy of the proposed method is quite satisfactory.
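The network topology identified above (two hidden layers of 10 tanh neurons and a linear output) can be sketched as follows. This is an illustrative forward pass only: the weights are random placeholders, the input dimension of 25 is an assumption, and the Levenberg-Marquardt / Bayesian regularization training used in the study is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, n_hidden=10, n_out=1):
    """Randomly initialized weights and biases for a 2-hidden-layer MLP
    (placeholder values; the study trains these with LM / Bayesian reg.)."""
    shapes = [(n_in, n_hidden), (n_hidden, n_hidden), (n_hidden, n_out)]
    return [(rng.standard_normal(s) * 0.1, np.zeros(s[1])) for s in shapes]

def forward(params, x):
    """tanh (TANSIG-like) on both hidden layers, linear (PURELIN-like) output."""
    h = x
    for i, (w, b) in enumerate(params):
        z = h @ w + b
        h = np.tanh(z) if i < len(params) - 1 else z
    return h

# e.g. a batch of 4 hypothetical input vectors (spectrum ordinates + case data):
x = rng.standard_normal((4, 25))
params = init_mlp(n_in=25)
print(forward(params, x).shape)   # (4, 1): one predicted EDP per sample
```

Training a separate network of this shape for each EDP, as the findings above recommend, amounts to repeating this construction once per output quantity instead of widening the output layer.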