Online Prediction Method for Power System Frequency Response Analysis Based on Swarm Intelligence Fusion Model

Transient frequency instability caused by faults in complex power systems is one of the greatest threats to operational safety. By analyzing the frequency response of the power system in real time and adopting control strategies promptly, power system accidents can be prevented effectively. Existing online analysis methods integrate physical-driven and data-driven methodologies, but they do not make effective use of the timing characteristics of frequency. Consequently, a swarm intelligence fusion model that integrates physical-driven and data-driven methods is proposed as an improved frequency response analysis method. The factors affecting transient frequency are separated into primary state variables and system time series data according to their time-sequence properties. To preserve the actual relationships of the electrical mechanism model, the system frequency response (SFR) model is used as the physical-driven method for the primary state variables of the system. A long short-term memory (LSTM) network is used as the data-driven method to extract timing features and correct the SFR model's prediction, with the system time series data as input. The two methods are combined in the bootstrap mode to form the fusion model, and the structure of the model is optimized using an improved sparrow search algorithm (ISSA), a swarm intelligence optimization algorithm. The model structure adapts autonomously, which yields a method for online frequency response analysis. Simulation on the New England 39-bus system verifies that the method can quickly and accurately calculate the dynamic frequency response after a large-scale disturbance.


I. INTRODUCTION
Modern power systems face significant challenges in stability analysis and control due to the increasing complexity of their network structure, load characteristics, and power supply components, caused by the interconnection of large-scale power grids, the increase in demand for electricity, and the continuous influx of various sources of energy [1], [2]. When the power system suffers a serious disturbance and experiences a large power shortage, the frequency drops sharply within a short period of time. If it is not regulated immediately, transient instability can cause cascading failures and even the collapse of the power system. Major power grid accidents in recent years show that transient faults are the main cause of large-scale power outages [3], [4]. Therefore, it is essential to promptly determine the frequency situation of the grid after a disturbance to ensure the stable operation of the power grid [5], [6].
Physical-driven methods, including single machine equivalent model methods (dominated by Average System Frequency (ASF) models and System Frequency Response (SFR) models) [7], [8] and full-state time-domain simulation methods [9] using dynamic simulation software, are widely used as basic methods for predicting power system frequency features. The single machine equivalent model reduces the complexity of the power system while preserving its analytical properties. This method is fast, but its many simplifying assumptions can lead to insufficient accuracy. The full-state time-domain simulation method establishes a detailed numerical model of the power system using high-order nonlinear equations. Although this method has higher calculation accuracy, the calculation process is complicated and the calculation speed is extremely slow. For modern smart grids, neither method can simultaneously meet the requirements for fast and accurate frequency prediction following disturbances [10].
The associate editor coordinating the review of this manuscript and approving it for publication was Youngjin Kim.
VOLUME 11, 2023. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The rise of artificial intelligence provides a feasible data-driven approach to analyzing power system frequency response [11], [12], [13]. Data-driven methods build models from data and include decision tree (DT) [14], [15], deep belief network (DBN) [16], [17], support vector machine (SVM) [18], [19], [20], [21], random forest (RF) [22], extreme learning machine (ELM) [23], convolutional neural network (CNN) [24], [25], long short-term memory (LSTM) [26], [27], [28], [29], and extreme gradient boosting (XGBoost) [30]. Unlike traditional methods, they do not require a physical model; instead, they learn the relationship between inputs and outputs from the sample data. If the samples are sufficient and accurate, the transient frequency features can be fitted accurately. However, some machine learning methods, such as DT and SVM, are limited by their shallow model structures, which do not adequately fit the implicit mapping rules of the data [31]. Additionally, a data-driven method does not capture the underlying physical connection between input and output, and the quality and size of the sample set directly affect its validity. As a result, it resembles a "black box" and its results may not be reliable [32].
The traditional physical-driven method is built on the physical properties of the power system and can therefore explain the essential behavior of the model. Data-driven methods, based on the theory of data science, learn the relationship between input and output from data. Combining the two therefore unites rules with experience and makes possible a more accurate and efficient algorithm for predicting transient frequency features [33], [34], [35], [36], [37]. Wang et al. [38] divide the factors affecting transient frequency into key factors and non-critical factors, and use the SFR model to maintain the physical connection and an ELM correction model to characterize the potential relationship between the data, enabling faster and more reliable analysis and prediction. Wen et al. [39] integrate the SFR method with a support vector regression (SVR) model based on the adaptive neural fuzzy inference system (ANFIS), which quickly yields high-accuracy frequency response predictions for wind power systems from small samples.
There are still some shortcomings in the existing power system frequency response analysis methods based on integrating physical-driven and data-driven methods.
• The data-driven methods used in existing fusion models often rely on machine learning methods based on statistical analysis, such as DBN, SVM, and ELM. These methods do not take into account the time-series features of power system data and do not fully utilize the sample data, which limits their performance in frequency response analysis.
• Training parameters for machine learning are usually selected based on subjective human experience. Screening the model parameters is time-consuming and laborious, and the ability to find the optimal parameters of the model is limited.
A swarm intelligence fusion model is proposed to address these challenges in analyzing the frequency response of power systems. This method uses the bootstrap mode to refine the output of the physical-driven method with the help of the data-driven method. As the physical-driven method, the SFR model, which computes quickly, is selected among the existing frequency prediction methods that can be applied online. Based on the temporal characteristics of the data, LSTM is chosen as the data-driven method, and the physical-driven method is embedded into the data-driven method through the bootstrap mode. Finally, the improved sparrow search algorithm (ISSA), a swarm intelligence algorithm, is used to search for and adjust the structure of the fusion model.
This method is novel in its consideration of the temporal characteristics of power system transient data. The data-driven method is fed with time-series data, which allows full utilization of the sample data so that the feature prediction is not degraded by neglecting the temporal characteristics of the samples. The fusion model structure is optimized using the ISSA algorithm and is adjusted dynamically in response to online data, which enhances the self-adaptation capability of the prediction algorithm. Together, these advantages further improve the accuracy and speed of the prediction model. According to the results of this study, the proposed method is significantly more accurate than other machine learning models on the New England 39-bus system.
Four typical fusion modes between physical-driven and data-driven methods have been applied in various power system scenarios: the parallel mode, proposed for joint data and physical modeling in power systems [40]; the serial mode, proposed for power system models with high complexity or insufficient accuracy [35]; the bootstrap mode, proposed for cases that lack the knowledge or engineering experience needed to guide the construction of power system data models [41]; and the feedback mode, proposed for mismatched power system model parameters [42], as shown in Fig. 1.
The parallel mode is suitable for scenarios where both physical-driven and data-driven methods are effective and can be used to directly select or summarize the results; the serial mode uses the results of the physical-driven method to guide the construction of a reasonable data-driven method; the bootstrap mode can use the data-driven method to refine the outputs of the physical-driven model [43]; the feedback mode corrects the physical-driven method by using the data-driven method.
In practice, frequency response analysis must be both fast and accurate. Therefore, the bootstrap mode is used for the fusion of models. The data-driven method is employed to correct the results of the physical-driven model by learning the correlation between the output of the simplified physical-driven model and the actual results in various scenarios.

B. SFR MODEL FOR THE PHYSICAL-DRIVEN METHOD
Current online applications of frequency response analysis are mainly based on the single-machine equivalent model method represented by SFR [7]. By ignoring frequency oscillations between generator units, this method reduces the complex power system to a single machine with a concentrated load, which simplifies solving the dynamic response of the power system. SFR does not require stepwise integration to solve the system frequency dynamics and retains the analytical characteristics of frequency, so key information such as the maximum frequency deviation, the time at maximum frequency deviation, and the steady-state frequency deviation can be extracted from the analytical expressions. Hence, it is widely used for under-frequency load shedding.
The SFR model assumes a uniform frequency across the network and combines the equivalent rotor equations of motion of all generators in the network into a single-machine model, ignoring the network's effect. It uses a single prime mover-governor to represent the aggregated dynamics of all the prime mover-governors in the system. The total mechanical power output of this prime mover-governor is applied to the equivalent rotor, which aggregates the rotors of all the generators in the system.
Most current generating units are reheat turbine units, as shown in Fig. 2. Each rotating mass in this system has its own governor and its own acceleration power, which together determine its dynamic performance.
The corresponding control flow diagram is shown in Fig. 3, where $P_{SP}$ is the ratio of the load disturbance power in the system to the total system load; $F_H$ is the fraction of total power generated by the high-pressure (HP) turbine; $K_m$ is the mechanical power gain coefficient; $\omega$ is the system frequency offset; $P_m$ is the mechanical power output of the turbine; $P_e$ is the electromagnetic power of the generator; $P_a = P_m - P_e$ is the acceleration power of the generator; $H$ is the total inertia time constant of the generator; $D$ is the equivalent damping coefficient of the generator; $R$ is the frequency modulation coefficient of the governor; and $T_R$ is the reheat time constant of the prime mover.
According to the frequency response model in Fig. 4, the transfer function can be derived as follows.
where $\omega_n$ is the natural (undamped) oscillation frequency and $\zeta$ is the damping ratio. Substituting (2) gives (5). It can be seen from (5) that a variation in either $P_{SP}$ or $P_e$ produces a response of the same nature, but with different signs and phases.
In general, load shedding after disturbances is of primary concern; thus $P_{SP} = 0$, and $P_d = -P_e$ is defined as the "disturbance power." $P_d > 0$ indicates a sudden increase in power generation or a sudden decrease in load, such as a loss-of-load accident; $P_d < 0$ indicates a sudden decrease in power generation or a sudden increase in load, such as a unit tripping accident.
Based on the above definition, the frequency response model in Fig. 3 can be simplified as shown in Fig. 4, and equation (5) reduces to (6). For the step disturbance $P_d(t) = P_{step}\,\varepsilon(t)$, the transform is $P_d(s) = P_{step}/s$, where $P_{step}$ is the disturbance magnitude in per unit on the system rating $S_{SB}$ and $\varepsilon(t)$ is the unit step function. Substituting this into (6) yields (7).
For ease of analysis, (7) is decomposed, and the inverse transform of each term is taken from the Laplace transform table. This gives the time-domain solution of the frequency response model. The equations above apply to any reheat generating unit in the system. Since all equations are expressed on a given rating $S_B$, all units can be combined into a single large unit that represents all generating units in the entire system.
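The equations referred to above did not survive in this text. As a hedged reference, the reduced-order SFR model of [7] is usually written in the following form, using the symbols defined for Fig. 3. It is an assumption that this matches the elided equations term by term; $\Delta\omega$ denotes the frequency offset called $\omega$ in the text, and $\alpha$, $\omega_r$, and $\varphi$ are auxiliary symbols introduced here for the underdamped case $0 < \zeta < 1$.

```latex
% Transfer function of the reduced-order model
\Delta\omega(s) = \frac{R\,\omega_n^2}{DR + K_m}\cdot
  \frac{K_m\,(1 + F_H T_R s)\,P_{SP}(s) - (1 + T_R s)\,P_e(s)}
       {s^2 + 2\zeta\omega_n s + \omega_n^2},
\qquad
\omega_n^2 = \frac{DR + K_m}{2 H R T_R},
\qquad
\zeta = \frac{2HR + (DR + K_m F_H)\,T_R}{2\,(DR + K_m)}\,\omega_n

% Step disturbance with P_SP = 0 and P_d = -P_e = P_step * eps(t)
\Delta\omega(s) = \frac{R\,\omega_n^2}{DR + K_m}\cdot
  \frac{(1 + T_R s)\,P_{step}}{s\,(s^2 + 2\zeta\omega_n s + \omega_n^2)}

% Time-domain solution
\Delta\omega(t) = \frac{R\,P_{step}}{DR + K_m}
  \left[\,1 + \alpha\,e^{-\zeta\omega_n t}\sin(\omega_r t + \varphi)\right],
\quad
\alpha = \sqrt{\frac{1 - 2 T_R \zeta\omega_n + T_R^2\omega_n^2}{1 - \zeta^2}},
\quad
\omega_r = \omega_n\sqrt{1 - \zeta^2},
\quad
\varphi = \operatorname{atan2}\!\left(-\sqrt{1 - \zeta^2},\; T_R\,\omega_n - \zeta\right)
```

In this form the steady-state deviation is $R\,P_{step}/(DR + K_m)$, and the extremum occurs where the derivative first vanishes, at $t_z = \frac{1}{\omega_r}\operatorname{atan2}(\omega_r T_R,\ \zeta\omega_n T_R - 1)$.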
Assuming that the system rating $S_{SB}$ equals the sum of the ratings of all generating units, it is calculated as $S_{SB} = \sum_i S_{B,i}$, where $S_{B,i}$ is the rating of the $i$-th unit. Hence, (14) is used as the physical-driven method, a neural network as the data-driven method, and the bootstrap mode to integrate the two effectively. Note that the frequency response of each generator unit in a multi-unit system may not be the same. The entire system containing all units therefore needs to be reduced to an equivalent single-unit system, with all units aggregated using the equivalence method proposed in [7].
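A minimal sketch of how the three SFR features used later (maximum deviation, time of maximum deviation, steady-state deviation) could be evaluated from the closed form of the reduced-order model in [7]. The function names and the parameter values in the example are illustrative assumptions, not values from this paper, and the sketch only covers the underdamped case.

```python
import math

def sfr_constants(H, R, D, Km, FH, TR):
    """Natural frequency and damping ratio of the reduced-order SFR model."""
    wn = math.sqrt((D * R + Km) / (2.0 * H * R * TR))
    zeta = (2.0 * H * R + (D * R + Km * FH) * TR) / (2.0 * (D * R + Km)) * wn
    return wn, zeta

def sfr_step(t, P_step, H, R, D, Km, FH, TR):
    """Frequency deviation (p.u.) at time t after a step disturbance P_step (p.u.)."""
    wn, zeta = sfr_constants(H, R, D, Km, FH, TR)
    if not 0.0 < zeta < 1.0:
        raise ValueError("this closed form assumes an underdamped system")
    root = math.sqrt(1.0 - zeta ** 2)
    wr = wn * root                       # damped oscillation frequency
    gain = R * P_step / (D * R + Km)     # steady-state deviation
    # Oscillatory part written as A*sin + B*cos to avoid phase-angle ambiguity.
    osc = (TR * wn - zeta) * math.sin(wr * t) - root * math.cos(wr * t)
    return gain * (1.0 + math.exp(-zeta * wn * t) * osc / root)

def sfr_features(P_step, H, R, D, Km, FH, TR):
    """Return (f_max, t_z, f_ess): extremum, its time, and steady-state deviation."""
    wn, zeta = sfr_constants(H, R, D, Km, FH, TR)
    wr = wn * math.sqrt(1.0 - zeta ** 2)
    # First zero of the derivative of the step response.
    t_z = math.atan2(wr * TR, zeta * wn * TR - 1.0) / wr
    f_max = sfr_step(t_z, P_step, H, R, D, Km, FH, TR)
    f_ess = R * P_step / (D * R + Km)
    return f_max, t_z, f_ess

# Illustrative (assumed) per-unit parameters and a 0.1 p.u. generation loss.
params = dict(H=4.0, R=0.05, D=1.0, Km=0.95, FH=0.3, TR=8.0)
f_max, t_z, f_ess = sfr_features(-0.1, **params)
```

Multiplying the per-unit deviations by the nominal frequency (50 Hz or 60 Hz) converts them to hertz.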

C. LSTM FOR THE DATA-DRIVEN METHOD
After a disturbance in the power system, each electrical quantity keeps changing with time, so a large amount of time series data is generated. Because time series data carry more information about the system, the analysis of transient stability can be improved if a suitable model establishes a mapping between time series features and system stability [44]. Gao et al. [25] propose a transient stability assessment method with four convolutional layers based on a one-dimensional convolutional neural network that considers the temporal information of the original input features. James et al. [27] develop a transient stability assessment system based on the long short-term memory network that learns from the temporal dependencies of the input data. This shows that combining time-series data with data-driven methods can extract more time-series information about transient processes for transient stability analysis.
The recurrent neural network (RNN) is a neural network that processes time series data effectively. The traditional RNN retains past information by using the hidden layer. The output of a hidden layer is influenced only by the state of the hidden layer at the previous moment and the input at the present moment, which makes it sensitive only to short-term information. In theory, an RNN can solve the long-distance dependence problem by increasing the number of hidden layers; however, too many layers are prone to vanishing and exploding gradients [45].
Based on this analysis, scholars developed the long short-term memory (LSTM) network. The LSTM improves on the RNN to enable the prediction of long sequences: it introduces memory block units that record long-term information and adds a forget gate, an input gate, and an output gate to the hidden layer to update the state of the memory block unit, as shown in Fig. 5.
The forget gate, through one network layer, determines how much information is passed from the previous memory block unit to the present one. The input gate, through two network layers, determines which hidden layer information is retained in the memory block unit at the present moment; for updating the memory block, these two layers use the sigmoid and tanh functions as activation functions, respectively. The output gate has the same structure as the input gate: two network layers determine which memory block information is passed to the hidden layer at each time step.
The following are the parameters corresponding to the LSTM neural network.
A variety of gating controls enable LSTM to perform the function of long-term memory on the basis of RNN, and to fully process time series data.
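For reference, the gating described above is commonly written in the standard LSTM form below. This is the textbook formulation rather than necessarily the exact notation of this paper; the symbols $W_\ast$, $b_\ast$, $x_t$, $h_t$, and $c_t$ (weights, biases, input, hidden state, and memory block state) are assumptions introduced here, $\sigma$ is the sigmoid function, and $\odot$ is element-wise multiplication.

```latex
f_t = \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right)          % forget gate
i_t = \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right)          % input gate
\tilde{c}_t = \tanh\!\left(W_c\,[h_{t-1}, x_t] + b_c\right)   % candidate memory
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t               % memory block update
o_t = \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right)          % output gate
h_t = o_t \odot \tanh(c_t)                                    % hidden state
```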
To achieve a more accurate and efficient transient frequency analysis algorithm, a deep learning model capable of processing time-series data is combined with a power system transient time-series data set to correct the errors of the SFR model. On this basis, a fusion model using LSTM as the data-driven method is proposed.

D. PHYSICAL-DATA FUSION MODEL
To preserve the electrical causal connection between the input and output data of the power system frequency stability problem, the SFR model is used as the physical-driven method. With this model, the frequency change curve of the system after the disturbance can be obtained, and critical features for frequency stability evaluation, such as the maximum frequency deviation and the time at maximum frequency deviation, can be extracted. These features and the system transient timing data set are jointly used as input for the data-driven method, which completes the fusion model based on the bootstrap mode. The fusion model is shown in Fig. 6.

III. ONLINE PREDICTIVE FUSION MODEL BUILDING
This section presents the online construction method of the fusion model, including the correction of the SFR model error using the LSTM algorithm and the intelligent iterative optimization of the important LSTM hyperparameters using the ISSA algorithm. The LSTM algorithm corrects the SFR model errors, and the ISSA algorithm addresses the shortcomings of an LSTM whose parameters are chosen at random. Finally, an online prediction process is provided for the constructed swarm intelligence fusion model.

A. FUSION MODEL BUILDING SOLUTION
A power system frequency response analysis fusion model based on SFR-LSTM is constructed to integrate the prediction results of the physical-driven method with the timing characteristics of the transient data and thereby further improve the prediction accuracy, as shown in Fig. 7. The modeling steps are as follows.
1) Data collection on the operating parameters of the system and related timing data.
2) Data pre-processing. Data are processed to remove outliers and duplicates, and then normalized.
3) Fusion model training. The initial prediction of transient frequency characteristics is performed with the physical-driven method (SFR). The preliminary prediction results and the time series data are used as input features to train the fusion model using the batch prediction method.
4) Evaluation of the fusion model based on test data.
By using the above modeling process, the fusion model can be built offline, and the trained model can be used to analyze the frequency response online.
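Step 2) above can be sketched as follows. The text does not say how outliers are detected or which normalization is used, so the 3-sigma rule and min-max scaling here are assumptions, and `preprocess` is a hypothetical helper name.

```python
from statistics import mean, pstdev

def preprocess(rows, z_max=3.0):
    """Drop duplicate rows, drop outlier rows, then min-max normalize each column.

    rows is a list of equal-length numeric feature lists.
    Returns (scaled_rows, column_minima, column_maxima).
    """
    # 1) Remove exact duplicates while preserving order.
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))

    # 2) Remove rows with any feature more than z_max standard deviations
    #    from its column mean (assumed outlier rule).
    cols = list(zip(*unique))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) for c in cols]
    kept = [r for r in unique
            if all(s == 0 or abs(v - m) <= z_max * s
                   for v, m, s in zip(r, mu, sd))]

    # 3) Min-max normalization to [0, 1]; constant columns map to 0.
    cols = list(zip(*kept))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    scaled = [[(v - l) / (h - l) if h > l else 0.0
               for v, l, h in zip(r, lo, hi)] for r in kept]
    return scaled, lo, hi

# Example: one duplicate row and one gross outlier.
raw = [[i * 0.1] for i in range(30)] + [[0.5], [1000.0]]
scaled, lo, hi = preprocess(raw)
```

The returned minima and maxima would need to be stored so that the same scaling can be applied to online data.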

B. SWARM INTELLIGENCE OPTIMIZATION METHOD - SPARROW SEARCH ALGORITHM
The sparrow search algorithm [46] mimics the predation and anti-predation behavior of sparrow populations, in which individuals are divided into three categories: discoverers, joiners, and scouts. The discoverer has a higher fitness value than the rest of the population and can change the direction of the search to provide a better foraging area for the population. Joiners must follow discoverers to compete for food. The ratio of discoverers to joiners is fixed, and their identities can be converted into each other. During a search, the scout maintains the current search direction and sends out an early warning signal when predation has been detected, reminding the population to perform anti-predation. In the sparrow search algorithm, the population position and the fitness value of each sparrow are initialized in advance, and the sparrow positions are then updated according to the following rules.

1) DISCOVERER LOCATION UPDATE RULES
$x_{i,j}^{t}$ represents the position of the $i$-th sparrow in the $j$-th dimension at the $t$-th iteration, $\alpha$ and $Q$ are random numbers, $T$ is the preset maximum number of iterations, $t$ is the current iteration number, $R_2$ is the warning value, $ST$ is the safety value, and $L$ is an all-one matrix. When $R_2 < ST$, the area is safe and the discoverer can expand the search range; when $R_2 \geq ST$, the area is dangerous and the discoverer needs to lead the population back to a safe area to search.
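The update rule itself is missing from this text. In the original sparrow search algorithm [46], the discoverer rule is usually written as below. Taking $i$ in the exponent as the sparrow's index, $\alpha \in (0, 1]$ as uniformly distributed, $Q$ as normally distributed, and $L$ as a $1 \times d$ row of ones are assumptions drawn from that common formulation.

```latex
x_{i,j}^{t+1} =
\begin{cases}
  x_{i,j}^{t}\cdot\exp\!\left(\dfrac{-i}{\alpha\,T}\right), & R_2 < ST \\[2ex]
  x_{i,j}^{t} + Q\cdot L, & R_2 \geq ST
\end{cases}
```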

2) JOINER LOCATION UPDATE RULES
$x_{worst}^{t}$ represents the position with the worst fitness value in the $t$-th iteration, and $x_{best}^{t}$ represents the position with the best fitness value in the $t$-th iteration. $A$ is a matrix whose elements are 1 or −1. When $i > n/2$, the fitness value of joiner $i$ is poor and it needs to move elsewhere. When $i \leq n/2$, the fitness value of joiner $i$ is better, and it only needs to find a random position nearby.
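The joiner rule is also missing here. A common way to write it, following [46] and using the symbols defined above, is shown below. Treating $A$ as a $1 \times d$ row with $A^{+} = A^{T}(AA^{T})^{-1}$, $Q$ as normally distributed, and $n$ as the population size are assumptions; the original formulation uses the best position currently held by a discoverer where $x_{best}^{t}$ appears.

```latex
x_{i,j}^{t+1} =
\begin{cases}
  Q\cdot\exp\!\left(\dfrac{x_{worst}^{t} - x_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\[2ex]
  x_{best}^{t} + \left|x_{i,j}^{t} - x_{best}^{t}\right|\cdot A^{+}\cdot L, & i \leq n/2
\end{cases}
```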

3) SCOUT LOCATION UPDATE RULES
$f_i$ is the fitness value of the current sparrow, $f_w$ is the global worst fitness value, and $f_g$ is the best fitness value of the entire population. When $f_i \neq f_g$, the sparrow is at the edge of the population, is aware of the danger, and needs to approach the sparrows in the center; when $f_i = f_g$, the sparrow is at the center, is aware of the danger, and needs to approach the sparrows at the edge of the population.
A sparrow search algorithm whose population is unevenly distributed can suffer from poor population quality and slow convergence. An improved sparrow search algorithm (ISSA) [47] uses a backward learning strategy to eliminate this issue and a dynamic step size adjustment strategy to improve the optimization accuracy. ISSA optimizes the hyperparameters of the above fusion model within preset parameter intervals, using the mean absolute error of the model as the measure of prediction quality to be improved. The optimization process of the ISSA fusion model is shown in Fig. 8.
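The three update rules of this subsection can be sketched together in a basic sparrow search loop. This follows the common formulation of [46] and is not the ISSA of [47]: the backward learning and dynamic step size strategies are omitted, and the parameter defaults, the scalar simplification of the $A^{+} \cdot L$ term, and the name `ssa_minimize` are assumptions.

```python
import math
import random

def ssa_minimize(f, lb, ub, dim, n=20, iters=60, pd=0.2, sd=0.1, st=0.8, seed=0):
    """Minimize f over [lb, ub]^dim with a basic sparrow search loop."""
    rng = random.Random(seed)
    clip = lambda v: min(max(v, lb), ub)
    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n)]
    fit = [f(x) for x in pop]
    k = min(range(n), key=lambda j: fit[j])
    best_x, best_f = pop[k][:], fit[k]
    n_disc = max(1, int(pd * n))     # number of discoverers
    n_scout = max(1, int(sd * n))    # number of scouts
    for _ in range(iters):
        order = sorted(range(n), key=lambda j: fit[j])   # best fitness first
        pop = [pop[j] for j in order]
        worst = pop[-1][:]
        r2 = rng.random()                                # warning value R2
        for i in range(n_disc):                          # 1) discoverers
            if r2 < st:
                alpha = 1.0 - rng.random()               # uniform in (0, 1]
                shrink = math.exp(-(i + 1) / (alpha * iters))
                pop[i] = [clip(x * shrink) for x in pop[i]]
            else:
                q = rng.gauss(0.0, 1.0)
                pop[i] = [clip(x + q) for x in pop[i]]
        lead = min(pop[:n_disc], key=f)[:]               # best discoverer position
        for i in range(n_disc, n):                       # 2) joiners
            if i + 1 > n / 2:
                q = rng.gauss(0.0, 1.0)
                pop[i] = [clip(q * math.exp((w - x) / (i + 1) ** 2))
                          for x, w in zip(pop[i], worst)]
            else:
                # Scalar simplification of |x - lead| * A+ * L with random +-1 signs.
                step = sum(abs(x - p) * rng.choice((-1.0, 1.0))
                           for x, p in zip(pop[i], lead)) / dim
                pop[i] = [clip(p + step) for p in lead]
        fit = [f(x) for x in pop]
        f_g, f_w = min(fit), max(fit)
        g_x, w_x = pop[fit.index(f_g)][:], pop[fit.index(f_w)][:]
        for i in rng.sample(range(n), n_scout):          # 3) scouts
            if fit[i] != f_g:                            # edge: move toward the best
                beta = rng.gauss(0.0, 1.0)
                pop[i] = [clip(g + beta * abs(x - g)) for x, g in zip(pop[i], g_x)]
            else:                                        # center: move away
                step_k = rng.uniform(-1.0, 1.0)
                denom = (fit[i] - f_w) + 1e-50
                pop[i] = [clip(x + step_k * abs(x - w) / denom)
                          for x, w in zip(pop[i], w_x)]
        fit = [f(x) for x in pop]
        k = min(range(n), key=lambda j: fit[j])
        if fit[k] < best_f:                              # keep the best-so-far
            best_x, best_f = pop[k][:], fit[k]
    return best_x, best_f

# Toy objective standing in for the fusion model's validation error.
sphere = lambda x: sum(v * v for v in x)
best_x, best_f = ssa_minimize(sphere, -5.0, 5.0, dim=3)
```

In the setting of this paper, `f` would train the fusion model for a candidate hyperparameter vector and return its mean absolute error, which is far more expensive than the toy objective used here.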

C. ONLINE PREDICTIVE ISSA FUSION MODEL BUILDING PROCESS
Power system transient frequency analysis methods using deep learning generally require a reasonable configuration of activation functions and hyperparameters such as the learning rate. However, hyperparameters are usually chosen subjectively. ISSA can be used to evaluate the model results, which not only optimizes the model but also adapts the model structure to the real-time response data of the system. The model thereby tracks changes in the system's operation mode and topology, which improves prediction accuracy.
The online prediction process based on the swarm intelligence fusion model has three parts: offline training, online prediction, and model update. After historical or simulation data are collected as a sample data set, the fusion model is trained offline and optimized with ISSA to obtain a trained model, which is the training model M on the training platform in Fig. 9. This model is deployed to the prediction platform to predict the next phase of system operation. At the same time, the records of actual system operating conditions are combined with the historical data into a sample set, which is used to retrain the prediction model M. The updated training model M+1 is then published to the prediction platform, where it analyzes the transient frequency of the power system based on real-time data. The prediction model is thus adjusted dynamically as time progresses, as shown in Fig. 9.
The above prediction process improves the adaptability of the online prediction method. While the transient frequency is predicted in real time, transient data sets are collected regularly to update the model in response to changes in the system operation mode or topology. Because frequency changes in the system follow regular patterns, the model does not need to be updated too frequently. To keep the model consistent with the system, a training update is performed every 24 hours, and the prediction model is refreshed. This process is repeated to maintain and update the transient frequency response analysis model in real time.
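The train, predict, and update cycle described above can be sketched as a small service object. Everything here is illustrative: `OnlineFusionService`, `train_fn`, and the toy trainer are hypothetical names, and the sketch only shows the 24-hour scheduling logic, not the fusion model itself.

```python
from datetime import datetime, timedelta

class OnlineFusionService:
    """Keeps a deployed model and retrains it on a fixed period."""

    def __init__(self, train_fn, history, now, period=timedelta(hours=24)):
        self.train_fn = train_fn            # maps a sample list to a callable model
        self.samples = list(history)        # historical or simulation samples
        self.period = period
        self.version = 0                    # model M
        self.model = train_fn(self.samples) # offline training
        self.last_update = now

    def observe(self, record):
        """Store a record of actual system operation for the next update."""
        self.samples.append(record)

    def maybe_update(self, now):
        """Retrain and publish model M+1 once the update period has elapsed."""
        if now - self.last_update < self.period:
            return False
        self.model = self.train_fn(self.samples)
        self.version += 1
        self.last_update = now
        return True

    def predict(self, features):
        return self.model(features)

def toy_train(samples):
    """Stand-in trainer: the 'model' just reports its training-set size."""
    size = len(samples)
    return lambda features: size

t0 = datetime(2023, 1, 1)
service = OnlineFusionService(toy_train, history=[1, 2, 3], now=t0)
service.observe(4)
updated_early = service.maybe_update(t0 + timedelta(hours=1))
updated_daily = service.maybe_update(t0 + timedelta(hours=24))
```

Separating `observe` from `maybe_update` keeps prediction latency independent of retraining, which in practice would run on the training platform rather than inline.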

IV. CALCULATION EXAMPLE AND RESULT ANALYSIS
The New England 39-bus system is used as a testbed for testing and validating the proposed method. The experiment is conducted on a Windows 10 64-bit computing platform equipped with an Intel Core i5-6500 CPU and an NVIDIA GeForce RTX 3080 Ti GPU, using MATLAB PST v3.0 as the simulation software and a deep learning development framework based on Keras 2.8.0 and TensorFlow 2.8.0.

A. SAMPLE GENERATION AND EVALUATION INDEX
The Monte Carlo method is used to generate the sample data required for testing. To simulate as many power disturbance events as possible, it is assumed that every load node in the system has the same probability of a power disturbance event and that the disturbance size follows a uniform distribution on [0.5, 1.5]. Using this method, 10,500 sets of samples are generated for the New England 39-bus system, of which 500 are test samples.
The SFR model takes the primary state variables of the power system as inputs, including the system base $S_B$, the total system base $S_{SB}$, the mechanical power gain factor $K_m$, the disturbance power $P_d$, the inertia constant $H$, the frequency modulation coefficient of the governor $R$, the damping factor $D$, the reheat time constant $T_R$, and the fraction of total power generated by the HP turbine $F_H$. Note that the result of the SFR model is an analytical expression of the frequency response, which cannot be used directly as an input variable of the LSTM model. For a high-power disturbance, three features describe the main characteristics of the dynamic frequency response of the system: the maximum frequency deviation $f_{max}$, the time at maximum frequency deviation $t_z$, and the steady-state frequency deviation $f_{ess}$. These three features are therefore selected as input variables of the data-driven LSTM method to describe the frequency response dynamics.
Currently, power system timing data are primarily collected using phasor measurement units (PMUs) [48]. The input features of the LSTM include the bus voltage magnitude, bus voltage phase angle, and active power obtained directly from PMUs, as well as the primary features of the frequency response obtained from the SFR model. The data at a particular moment can be expressed as follows:
$x(t) = [V_1, \cdots, V_m, \theta_1, \cdots, \theta_m, P_1, \cdots, P_n]$ (29)
In this equation, $V_i$ and $\theta_i$ represent the voltage magnitude and phase angle of the selected $i$-th bus at time $t$, $P_j$ represents the active power of the selected $j$-th generator branch at time $t$, and $m$ and $n$ are the numbers of selected buses and lines, respectively. For any sample, the input features collect these snapshots over all time points together with the three SFR features. With the number of time points employed also denoted $t$, the input feature $X$ of each sample is a one-dimensional matrix with the shape $\mathbb{R}^{1 \times ((2m+n)t+3)}$.
The output features are selected as the primary features of the dynamic frequency response of the system: the maximum frequency deviation $f_{max}$ and the time at maximum frequency deviation $t_z$.
Because the input features need to reflect the transient process of the system as comprehensively as possible, the temporal data in each sample are sampled over the interval $[t_0, t_{post}]$ with $d = (t_{post} - t_0)/T$ sampling points, where $t_0$ is a steady-state moment before the disturbance, $t_{post}$ is a moment after the disturbance, and $T$ is the sampling period. Moreover, the sampling interval must be chosen so that it allows rapid prediction while capturing as much information about the power system as possible. As the system frequency is 50 Hz or 60 Hz, the 6 cycles before and the 6 cycles after the fault moment $t_f$ are chosen as the sampling interval. Six cycles last 0.12 s at 50 Hz and 0.1 s at 60 Hz, which meets the engineering requirements for prediction speed. Assume $T = 10$ ms, that is, sampling every half cycle at 50 Hz. The sampling interval for the temporal data of each sample is then $[t_f - 12T, t_f + 12T]$, which gives $d = 24$.
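The sampling window and the input shape described above can be checked with a short sketch. The flattening order (all snapshots first, then the three SFR features), the helper names, the example feature values, and the choice of $m = 39$ buses and $n = 10$ generator branches are assumptions for illustration.

```python
def sample_count(t0, t_post, period):
    """Number of sampling points d = (t_post - t0) / T, rounded to an integer."""
    return round((t_post - t0) / period)

def build_input(snapshots, sfr_features):
    """Flatten the per-time-step snapshots x(t) and append the three SFR features."""
    if len(sfr_features) != 3:
        raise ValueError("expected (f_max, t_z, f_ess)")
    width = len(snapshots[0])
    if any(len(s) != width for s in snapshots):
        raise ValueError("all snapshots must have the same length")
    flat = [value for snap in snapshots for value in snap]
    return flat + list(sfr_features)

# Window of 12 sampling periods on each side of the fault moment t_f.
T = 0.010          # sampling period in seconds (half a cycle at 50 Hz)
t_f = 1.0          # assumed fault moment in seconds
d = sample_count(t_f - 12 * T, t_f + 12 * T, T)

# Placeholder measurements: m voltage magnitudes, m angles, n active powers.
m, n = 39, 10
snapshots = [[0.0] * (2 * m + n) for _ in range(d)]
X = build_input(snapshots, (-0.40, 2.4, -0.25))
```

With these assumed sizes, each sample has $(2 \cdot 39 + 10) \cdot 24 + 3 = 2115$ entries.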
The coefficient of determination ($R^2$), mean square error (MSE), and mean absolute percentage error (MAPE), which are commonly used to measure performance on regression problems, are selected as evaluation metrics for the model [38]. The specific formulas are as follows.
where $\hat{y}_i$ is the predicted value of the model, $y_i$ is the actual value, $\bar{y}$ is the average of the actual values, and $n$ is the number of samples. Model accuracy is measured using MSE and MAPE, and generalization ability is measured using $R^2$.
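The formulas are missing from this text. The three metrics are sketched below in their standard definitions; reporting MAPE in percent is an assumption, and the sample values are made up for the check.

```python
def mse(y_true, y_pred):
    """Mean square error."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; y_true must not contain zeros."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
scores = {"MSE": mse(y, y_hat), "MAPE": mape(y, y_hat), "R2": r2(y, y_hat)}
```

For these values the sketch gives an MSE of 0.025, a MAPE of about 6.67%, and an $R^2$ of 0.98.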

B. PARAMETER OPTIMIZATION OF ISSA FUSION MODEL
As the proposed method implements online prediction by periodically updating the model structure, one update is presented here to illustrate the parameter optimization process. The training data are fed into the fusion model, and the ISSA algorithm optimizes its most critical hyperparameters, namely the number of epochs and the learning rate, together with the structural parameters, namely the number of hidden layers and the number of units in each hidden layer. The optimization process is shown in Fig. 10, where (a)-(d) show the epochs, the learning rate, the number of hidden layers, and the fitness at each ISSA iteration, respectively. Fig. 10(a) shows that the number of epochs first increases and then decreases over the ISSA iterations and finally stabilizes at 147; Fig. 10(b) shows that the learning rate increases gradually and stabilizes at 0.0042; Fig. 10(c) shows that the number of hidden layers is 2, with 68 and 15 units in the two layers, respectively; and Fig. 10(d) shows that the fitness finally stabilizes at 0.019 when the iterations are completed. The resulting optimal hyperparameter set of the ISSA fusion model is $\eta_{best} = \{147, 0.0042, [68, 15]\}$.
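One way a continuous search position could be mapped to a hyperparameter set of the form $\eta = \{\text{epochs}, \text{learning rate}, [\text{units per layer}]\}$ is sketched below. The search intervals, the logarithmic scaling of the learning rate, the limit of three hidden layers, and the name `decode_position` are all assumptions; the paper does not state its parameter intervals in this section.

```python
def decode_position(pos):
    """Map a position in [0, 1]^6 to (epochs, learning rate, hidden-layer sizes).

    pos = [epochs, learning rate, layer count, units_1, units_2, units_3].
    """
    def scale(u, lo, hi):
        u = min(max(u, 0.0), 1.0)          # keep the coordinate inside [0, 1]
        return lo + u * (hi - lo)

    epochs = round(scale(pos[0], 50, 300))           # assumed interval
    learning_rate = 10.0 ** scale(pos[1], -4, -2)    # assumed log-scale interval
    n_layers = round(scale(pos[2], 1, 3))            # assumed 1 to 3 hidden layers
    units = [round(scale(u, 8, 128)) for u in pos[3:3 + n_layers]]
    return {"epochs": epochs, "learning_rate": learning_rate, "hidden_units": units}

low = decode_position([0.0] * 6)
high = decode_position([1.0] * 6)
mid = decode_position([0.5, 0.5, 0.5, 0.5, 0.5, 0.5])
```

Unused unit coordinates are ignored when fewer layers are selected, so every position decodes to a valid configuration.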

C. ABLATION EXPERIMENT
In the ablation experiment, the physical-driven SFR model is used as the base model. The data-driven LSTM method and the swarm intelligence optimization method ISSA are then added to the SFR model in turn. The objective of this experiment is to demonstrate the effectiveness of frequency response analysis using the swarm intelligence fusion model.
In the first step, an error analysis of the SFR model is conducted. At $t = 0$ s, a sudden active power disturbance with a nominal value of 100 MW is applied to bus 5 of the New England 39-bus system. Fig. 11 compares the system frequency response obtained from the SFR model with that of the time-domain simulation model. The simulation results demonstrate that the SFR model can describe the frequency response of the system after a high-power disturbance, but its error is considerable and its calculation accuracy needs improvement. Fig. 12 illustrates the training processes of the fusion model and the swarm intelligence fusion model. The loss and MSE decrease over time, and neither model shows signs of overfitting. In addition, the accuracy of the swarm intelligence fusion model is slightly higher than that of the fusion model.
As shown in Figs. 13 and 14, the prediction results of the SFR method, the fusion model combining SFR and LSTM in a bootstrap mode, and the ISSA fusion model (our model) are compared using a randomly selected set of test results in the New England 39-bus system. Fig. 13 illustrates that the computational accuracy and generalization ability of the swarm intelligence fusion model ISSA-SFR-LSTM are significantly higher than those of the SFR model. The comparison of method metrics in Fig. 14 shows that the SFR model, which is based on the physical-driven method, has the largest prediction error. The prediction accuracy is greatly enhanced once temporal characteristics are added in the fusion model: the prediction error of the fusion model SFR-LSTM is only 18.11% of that of the SFR model. This indicates that temporality plays a significant role in the prediction of frequency response after perturbation.
The computation times for the SFR model, the fusion model SFR-LSTM, and the swarm intelligence fusion model ISSA-SFR-LSTM are 1.20 ms, 3.98 ms, and 4.03 ms, respectively. The SFR model, being an analytical model, is extremely fast to compute. The ISSA-SFR-LSTM model is slightly slower than the SFR-LSTM model, but it is still considered capable of meeting the high-speed computational requirements of online frequency response analysis.

D. COMPARATIVE EXPERIMENT
To verify the superiority of our model over other data-driven models based on machine learning, two additional models are constructed for comparison experiments, the SFR-MLP model and the SFR-ANN model, both with two hidden layers. The input and output data are the same as those used in the ISSA-SFR-LSTM model. Both models use the ISSA algorithm to optimize the hyperparameters in order to achieve the optimal prediction results. The proposed method is also compared with the fusion methods proposed in the literature, the SFR-ELM model and the SFR-SVM model [38], [39]. The results of the comparison are presented in Table 1. With our model, the MAE and MAPE for the maximum frequency deviation $f_{max}$ are 0.0014 Hz and 0.9%, respectively, and for the time at maximum frequency $t_z$ they are 0.0008 s and 0.42%, respectively. The accuracy is substantially higher than that of the fusion models SFR-ELM and SFR-SVM, which do not use time-series data. Taking MAPE as an example, our model improves on SFR-ELM by 4.93% for $f_{max}$ and by 3.98% for $t_z$. This indicates that fully mining the temporal information of the measurement data in the transient process of a power system can further improve the accuracy of power system transient stability assessment. Compared with the other fusion models SFR-MLP and SFR-ANN, which are constructed with the same inputs, the accuracy of the proposed method is also significantly improved, showing that our model is more capable of handling time series data. Table 1 also shows that the computational time consumed by our model is slightly higher than that of the other methods. In spite of this, it can still meet the requirement of high computing speed for online analysis of frequency response.

V. CONCLUSION
A swarm intelligence fusion model is proposed to predict the transient frequency of a power system. The physical-driven SFR method is integrated with the data-driven LSTM method, which is capable of processing time series data, and ISSA is used to optimize the hyperparameters of the fusion model. Based on the simulation verification on the New England 39-bus system, the following conclusions are drawn.
• In contrast to existing fusion methods that use conventional machine learning as the data-driven method, this paper considers the temporal characteristics of transient data in power systems and uses LSTM, which has high temporal processing performance, as the data-driven method. This takes full advantage of the sample data, so that the performance of the feature prediction results is not limited by the temporal characteristics of the sample data.
• When optimizing the fusion model structure, ISSA prevents incorrect adjustment of the model structure due to subjective factors. Furthermore, the prediction accuracy can be improved by analyzing and correcting model prediction errors. Meanwhile, the fusion model structure is dynamically adjusted according to the online data using the ISSA algorithm, improving the self-adaptive degree of the prediction algorithm.
In the future, the application of various fusion methods based on physical-driven and data-driven approaches will allow for the real-time assessment of frequency safety and stability of different power systems, including those with increased energy access, and will serve as a basis for subsequent scheduling and control to prevent frequency collapse events. A further extension of our research would be to examine the real-world implementation of the swarm intelligence algorithm in more detail.
LIN XU is currently pursuing the master's degree with the School of Software, Institute of Intelligent Systems, Dalian University of Technology. Her research interests include machine learning and power system automation.

LI LI is an Engineer with the China Electric Power Research Institute (CEPRI), where she focuses on optimizing power grid dispatch and analyzing power grid performance.
MEIYING WANG is currently pursuing the master's degree with the School of Software, Institute of Intelligent Systems, Dalian University of Technology. Her main research interest is machine learning.