RESEARCH ON SPEED CONTROL OF HIGH-SPEED TRAINS BASED ON HYBRID MODELING

Abstract: With the continuous increase in train speed, automatic train operation in place of manual driving has become the development direction of rail transit for realizing traffic automation. Single modeling methods for speed control in the automatic operation of high-speed trains do not combine train operation data with the physical model, which results in low modeling accuracy and degrades both speed control and train operation. To improve the accuracy of dynamic modeling of high-speed train operation and the resulting speed control, a high-speed train speed control method based on hybrid mechanism and data-driven modeling is proposed. Firstly, a mechanism model of the high-speed train was created by analyzing the train's dynamics. Secondly, an improved kernel principal component regression algorithm was used to build a data-driven model from the actual operation data of the CRH3 (China Railway High-speed 3) high-speed train from Huashan North Railway Station to Xi'an North Railway Station on the Zhengxi High-speed Railway. This model compensates the mechanism model and corrects the speed error of the actual operation process, realizing the hybrid mechanism and data-driven model. Finally, a predictive fuzzy PID control algorithm was developed based on the actual line and train characteristics, train speed control simulations were run under the hybrid model and the mechanism model, respectively, and the results were compared. The results indicate that, compared with speed control based on the mechanism model, speed control based on hybrid modeling is more accurate, with the average speed control error reduced by 69.42%. The method can effectively reduce the speed control error and improve the speed control effect and operation efficiency, which demonstrates the efficacy of the hybrid modeling and algorithm. The research results provide a new idea of multi-model fusion modeling for the dynamic modeling of high-speed train operation, further improve control objectives such as safety, comfort, and efficiency, and provide a reference for automatic and intelligent driving of high-speed trains.


Introduction
As train speeds increase, the safety of train operation becomes increasingly important, and stricter safety standards are demanded when constructing intelligent trains (Karolak, 2021). Research on operation curve control and tracking methods for high-speed trains is highly relevant to attaining safe, efficient, and comfortable train operation. The current focus of study is the modeling and control of the train speed tracking process, and the premise of control is an effective model of the train operation process (Tan et al., 2022). Because the operating process is complex, train operation control is affected by a number of variables, including vehicle performance, line conditions, environmental changes, and external interference, so creating an accurate mathematical model of the operation process is frequently challenging. This theoretical issue has attracted a growing number of researchers around the world. Related studies show that the model description methods for the high-speed train operation process can be broadly divided into mechanism modeling and data-driven modeling.
There are two types of mechanism modeling: single-mass-point modeling and multi-mass-point modeling. Lian et al. (2020) constructed the state-space equations of high-speed trains based on the single-mass-point model to describe train dynamics during operation; however, the model differed significantly from actual train operation. Zhang et al. (2022) established the train dynamics equation based on the running resistance generated during actual train operation, but the force analysis was insufficient and the interaction force between carriages was ignored. Multi-mass-point modeling corresponds more closely to the actual train operation process. Hou et al. (2019) viewed the train as a "mass point chain" comprising multiple mass points and applied fuzzy adaptive PID control to the train's speed. Jia et al. (2020) created a multi-mass-point model and employed nonlinear predictive control to follow the rolling stock's running curve. Mo et al. (2021) utilized a unit-shift multi-mass-point model, which enabled online control of trains and simplified the multi-mass-point model calculation. These models are derived from empirical formulas and have a distinct physical meaning. However, the formula parameters are typically set by experience, and some simplifying assumptions are made, so it is difficult to characterize the dynamics of the train operation process accurately using mechanism modeling alone.
Researchers have also conducted extensive work using data-driven models to characterize the train operation process. Using traction system sensor data, Jiang et al. (2020) proposed an optimal data-driven fault detection and diagnosis (FDD) method for dynamic traction systems. Fu et al. (2022) carried out comprehensive data mining and analysis of the motor system's data features and proposed a method combining improved principal component analysis with deep learning to identify incipient faults in traction systems, improving train operating safety. Wang et al. (2022) converted the dynamic model of high-speed trains into a partial-form dynamic linearization data model and proposed a model-free adaptive fault-tolerant control (PFDL-MFAFTC) algorithm to enhance the adaptive fault-tolerant control capacity of high-speed trains. Data-driven modeling overcomes the problems of mechanism modeling to a considerable extent; however, the resulting model has no physical significance, has low interpretability and generalization ability, and places stringent requirements on the data source.
In summary, most existing dynamic models of the train operating process suffer from limitations such as low model accuracy, complex modeling, and slow solution speed, so new modeling techniques are needed. In several domains, multi-model fusion modeling has become prevalent in response to the demand for high modeling accuracy (Zhang et al., 2021; Anifowose et al., 2017; Kim et al., 2021). The hybrid modeling method proposed in this paper follows the fundamental design concept of multi-model fusion: it combines train dynamics theory with data-driven machine learning and exploits the complementary advantages of the single models to establish a fusion model that retains the physical significance of the mechanism model while using the information in the operation data, thereby improving the modeling accuracy of high-speed trains. In addition, PID control is a frequently used classical control method; however, it is better suited to linear systems (Zhang et al., 2018). In this paper, fuzzy control and PID control are combined to reduce the speed control error through PID parameter correction. Together with the benefits of predictive control for complicated control processes, the predictive fuzzy PID controller is intended to increase the accuracy and comfort of train speed control. Lastly, using the actual route data and train parameters, the proposed hybrid modeling method and compensation algorithm are validated by comparing the accuracy of train operation curve tracking under the mechanism model and the hybrid model, with the same predictive fuzzy PID control algorithm applied to both.

Modeling of high-speed train dynamics mechanism
The CRH3 EMU is chosen as the object of study, and the train's fixed formation and running characteristics are investigated. Using Newton's law, the running process of the high-speed train is analyzed dynamically, and the dynamic equation of the train under the resultant force is derived.

Traction/Braking characteristics
During the operation of a high-speed train, traction or braking force is applied to maintain or change the train's motion. The applied force varies with the operating condition (traction, braking, coasting, or cruising). The formulae for traction and braking follow Hou (2015), where $w_0$ is the basic train resistance and $v$ is the train operating speed.

Additional resistance
Additional resistance is the extra resistance that arises when the train runs on special line sections (such as curves and ramps). It is expressed as the sum of the additional curve resistance, the additional ramp resistance, and the additional tunnel resistance. By analyzing the resistance during train operation and referring to Hou (2015), the total resistance of the high-speed train is given in Equation 4.

Kinetic equations
The differential equation in Equation 5 is used to describe the dynamics of the high-speed train, where the operating-condition variable indicates the working condition of the train: a value of 1 indicates traction, 0 indicates coasting, and −1 indicates braking. Because the cruising state is itself composed of these three working conditions, the operating-condition variable is restricted to the set {1, 0, −1} (Ding, 2021). The control quantity is a function of the working condition and the running speed, the train resistance is a function of the train position and the running speed, and the remaining quantities in Equation 5 are the train mass and the running time.
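The kinetic description above can be illustrated with a minimal Python sketch. It assumes the common point-mass form, mass times acceleration equals control quantity minus resistance, and integrates it with forward Euler. The function names, the 200 kN force magnitude, the quadratic resistance coefficients, and the 400 t mass are hypothetical placeholders chosen for illustration; they are not the CRH3 values used in the paper.

```python
def control_force(mode, v):
    """Hypothetical control quantity in newtons.

    mode is 1 (traction), 0 (coasting) or -1 (braking). A constant
    magnitude is assumed here; the real characteristic depends on speed v.
    """
    return mode * 200e3


def train_resistance(s, v):
    """Hypothetical resistance in newtons, quadratic in speed v (m/s).

    Position s is unused, which corresponds to a flat, straight track
    with no additional curve, ramp, or tunnel resistance.
    """
    return 5e3 + 50.0 * v + 8.0 * v * v


def simulate_speed(mode, v0, mass, dt, steps):
    """Integrate mass * dv/dt = control - resistance with forward Euler."""
    v, s = v0, 0.0
    speeds = [v]
    for _ in range(steps):
        accel = (control_force(mode, v) - train_resistance(s, v)) / mass
        v = max(v + accel * dt, 0.0)  # the train does not run backwards
        s += v * dt
        speeds.append(v)
    return speeds


traction = simulate_speed(1, 10.0, 400e3, 1.0, 10)
coasting = simulate_speed(0, 10.0, 400e3, 1.0, 10)
braking = simulate_speed(-1, 10.0, 400e3, 1.0, 10)
```

Under these assumed values, the three runs differ only in the operating-condition variable, which makes the effect of the set {1, 0, −1} on the speed trajectory easy to compare.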

Mechanism and data-driven hybrid modeling
Because the parameters of the equations in mechanism modeling are often derived empirically and certain assumptions are made during the solution process, there are discrepancies between the mechanism model and the real train operation process. To address this, the mechanism model is used as the main body, and a data-driven model based on the real-time moving window locally weighted kernel principal component regression (RTMW-LWKPCR) technique is applied to it as an error compensator. The residual between the mechanism model output and the measured value is taken as the target data, and the mechanism model is corrected by the output error compensation value to increase the overall model's accuracy. Figure 1 depicts the schematic of the mechanism and data-driven hybrid model (Li et al., 2021).
In Figure 1, $Y$ is the actual output speed of the high-speed train, $Y'$ is the hybrid model output speed, $Y_m$ is the mechanism model output speed, and $\hat{Y}$ is the compensation speed output by the data-driven model. Their relationship is given in Equation 6. During training of the data-driven model, the model output is made to approximately satisfy $Y' = Y_m + \hat{Y} = Y$. When the error $e_0$ approaches zero, training is complete, the error compensation model is obtained (García-Matos et al., 2013), and the output speed of the high-speed train hybrid model follows.
The inputs of the high-speed train hybrid model are the train speed at the current moment, $v_1$, the applied control force (tractive force or braking force), and the train resistance; the output is the train speed at the next moment, $Y' = \{v_2\}$. The output of the data-driven model is the deviation between the speed of the actual high-speed train system and the speed given by the mechanism model.
Let $\tilde{\phi}(x_i) = \phi(x_i) - \frac{1}{n}\sum_{j=1}^{n}\phi(x_j)$ denote the centered mapping of the $i$-th sample. The correlation weights of the input and output variables are calculated using the Pearson coefficients, as in Equations 9 and 10, where $\bar{x}$ is the mean value of the input variables, $\bar{y}$ is the mean value of the output variables, and $r_{x,y}$ is the Pearson correlation coefficient between the input and output variables, from which the correlation weights are obtained.
The local weighting strategy of each variable is then given by Equation 11, where $x_i$ is the $i$-th sample and $x_{i,m}$ is the $m$-th variable of the $i$-th sample. Thus, the covariance matrix can finally be expressed as Equation 12.
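As a rough illustration of the correlation-weight step, the following Python sketch computes the Pearson coefficient between each input variable and the output, then turns the absolute coefficients into weights that sum to one. The normalization by the sum of absolute coefficients is an assumption made for this sketch; the paper's own weight definition is given in its Equations 9 to 11 and may differ. The function names and sample data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0.0 or spread_y == 0.0:
        return 0.0  # a constant variable carries no correlation information
    return cov / (spread_x * spread_y)


def correlation_weights(columns, target):
    """Assumed weighting: |r_m| normalized so that the weights sum to 1."""
    strengths = [abs(pearson(col, target)) for col in columns]
    total = sum(strengths)
    if total == 0.0:
        return [1.0 / len(columns)] * len(columns)
    return [s / total for s in strengths]


# Hypothetical data: the first input tracks the output, the second does not.
inputs = [[1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 1.0, -1.0]]
output = [2.0, 4.0, 6.0, 8.0]
weights = correlation_weights(inputs, output)
```

With this choice, input variables that are more strongly correlated with the speed error receive a larger share of the weighting.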
(2) Calculate the data weighted projection
Let $\lambda$ be the eigenvalue matrix of the covariance matrix $C$ and $V$ be the corresponding eigenvector matrix, which satisfy the eigenvalue relation in Equation 13. Since the mapping function $\phi(x)$ is implicit and has no explicit expression, it is impossible to find $\lambda$ and $V$ directly. Therefore, the Gaussian kernel function (Equation 14) is chosen, and the kernel technique is introduced.
The eigenvector $V$ can be expressed as a linear combination of $\phi(x_i)$, as in Equation 15, where $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_n]^{T}$ is the vector of linear combination coefficients. Left-multiplying both sides of Equation 13 by $\phi(x)$ gives Equation 16, and substituting Equation 15 into Equation 16 gives Equation 17. Substituting the kernel function of Equation 14 into Equation 17 then shows that $\alpha$ is an eigenvector of the kernel matrix $K$. The kernel matrix and its eigenvectors can be calculated from the kernel function given earlier. After the eigenvector is normalized to unit length, the weighted projection of $\phi(x)$ is calculated. The output sample space also needs to be normalized. Thus, the output of the data sample, which is the value of the velocity error compensation, is given by Equation 23, where $\bar{y}$ is the mean value of the output variable.
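The kernel step of this derivation can be sketched in Python as follows. The sketch builds a Gaussian kernel matrix and centers it, which is the kernel-space counterpart of subtracting the mean of the mapped samples. The kernel form exp(−‖x − z‖² / (2σ²)) is the usual Gaussian kernel and is assumed here; the local weighting, the eigen-decomposition, and the regression step of the paper's algorithm are deliberately left out. The sample points and function names are hypothetical.

```python
import math


def gaussian_kernel(x, z, width):
    """Gaussian kernel exp(-||x - z||^2 / (2 * width^2)) (assumed form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * width ** 2))


def kernel_matrix(samples, width):
    """Kernel matrix K with K[i][j] = k(x_i, x_j)."""
    return [[gaussian_kernel(xi, xj, width) for xj in samples] for xi in samples]


def center_kernel(K):
    """Center K so that the mapped samples have zero mean in feature space."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - col_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


# Hypothetical two-dimensional samples; width 3 matches the kernel width
# parameter reported in the simulation section.
samples = [[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]]
K = kernel_matrix(samples, 3.0)
K_centered = center_kernel(K)
```

The eigenvectors of the centered matrix would then supply the combination coefficients used for the projection; any standard symmetric eigen-solver could be applied at that point.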

Real-time moving window
Due to the complicated and ever-changing external environment of the actual train operating process, ongoing usage of the historical model would inevitably lower the accuracy of the model's calculation outputs as a result of the constant introduction of new working conditions. To address this issue, the real-time moving window technique was proposed to respond to actual working conditions by continuously updating the error compensation model to ensure real-time model updating.
Assume that the data samples in the window at the $i$-th moment are $\{X_i, Y_i\}$. Firstly, the model is trained on the sample data in the current window, and the mean and standard deviation of the window are calculated. Secondly, whether the model is updated depends on how much the mean and standard deviation of the subsequent sample data change relative to those of the current window. When they change little, the previously trained model continues to be used for prediction and is not updated. When the change is significant, the window is shifted, and the locally weighted kernel principal component regression model is updated (Liang, 2021). The algorithm flow and working process are shown in Figure 2 and Figure 3, respectively.
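The update decision described above can be sketched in Python as follows. The sketch assumes that "significant change" means the ratio of the new mean or standard deviation to the current window's value leaves a band [low, high]. Using 0.8 and 1.2 as that band is an assumption that borrows the two scale factor values reported in the simulation section; the paper does not state that they are used in exactly this way. The function names and data are hypothetical.

```python
from statistics import mean, pstdev


def needs_update(window, incoming, low=0.8, high=1.2):
    """Return True when the incoming samples differ enough to retrain.

    Assumed rule: retrain if the ratio of the incoming mean or standard
    deviation to the current window's value falls outside [low, high].
    """
    def ratio_ok(new, old):
        if old == 0.0:
            return new == 0.0
        return low <= new / old <= high

    mean_ok = ratio_ok(mean(incoming), mean(window))
    std_ok = ratio_ok(pstdev(incoming), pstdev(window))
    return not (mean_ok and std_ok)


def slide_window(window, incoming, length):
    """Shift the window forward, keeping only the newest `length` samples."""
    return (list(window) + list(incoming))[-length:]


# Hypothetical speed-error samples.
current = [1.0, 2.0, 3.0, 4.0]
similar = [1.1, 2.1, 3.0, 4.0]
shifted = [5.0, 6.0, 7.0, 8.0]

keep_model = not needs_update(current, similar)
retrain = needs_update(current, shifted)
if retrain:
    current = slide_window(current, shifted, 4)  # the model would be refit here
```

Keeping the decision separate from the window shift makes it straightforward to swap in a different change criterion without touching the model-refitting code.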

RTMW-LWKPCR algorithm
The main steps of the RTMW-LWKPCR algorithm are shown in Table 1; they include calculating the mean and standard deviation of the samples in the current window.

Predictive Fuzzy PID control algorithm
To verify the efficacy of the hybrid modeling and compensation algorithm, the fuzzy PID algorithm is chosen to construct the controller. The deviation between the optimal input value obtained by the predictive controller and the given input value, together with the rate of change of this deviation, is used as the input of the fuzzy PID controller. The three parameters of the PID controller are then adjusted online in accordance with the fuzzy control principle, and the output is applied to the controlled object in order to determine the input value of the predictive controller. Figure 4 shows the structure, in which the signals are the target speed, the predicted speed, the feedback-corrected speed, and the actual output speed. The predictive method chosen for the simulations in this paper is the dynamic matrix control algorithm. After a control increment $\Delta u(k)$ is applied to the high-speed train at moment $k$, the $P$ predicted speed values for future moments (Hou et al., 2020) are obtained from the initial predicted value of the system and the predictive model vector of the high-speed train derived from its step response, acting on $Q$ consecutive control increments. The prediction error is the difference between the predicted speed and the train's operating speed at moment $k+1$.
Then the corrected predicted velocity is shown in Equation 26.
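The prediction and feedback-correction steps can be sketched in Python as follows. The sketch uses the textbook dynamic matrix control form for a single control increment: each predicted value is the initial prediction plus the corresponding step-response coefficient times the increment, and the correction adds a weighted prediction error. The sign convention (measured minus predicted), the correction vector, and all numbers are assumptions for illustration rather than the paper's exact equations.

```python
def predict_speeds(initial_prediction, step_response, delta_u):
    """P-step prediction after one control increment delta_u.

    predicted[i] = initial_prediction[i] + step_response[i] * delta_u
    """
    return [y0 + a * delta_u for y0, a in zip(initial_prediction, step_response)]


def feedback_correct(predicted, measured_next, correction):
    """Correct the prediction with the error observed at moment k+1.

    The error is taken as measured speed minus the one-step-ahead
    prediction (assumed sign convention); `correction` is the assumed
    vector of correction weights.
    """
    error = measured_next - predicted[0]
    corrected = [y + h * error for y, h in zip(predicted, correction)]
    return corrected, error


# Hypothetical values: P = 3, speeds in m/s.
predicted = predict_speeds([10.0, 10.0, 10.0], [0.2, 0.5, 0.8], 2.0)
corrected, error = feedback_correct(predicted, 10.6, [1.0, 1.0, 1.0])
```

In a full controller the corrected vector would be shifted by one step to form the initial prediction for the next sampling instant.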
In the fuzzy PID control algorithm, the train speed tracking error and its rate of change are each described by the fuzzy subsets {NB (negative big), NM (negative medium), NS (negative small), ZE (zero), PS (positive small), PM (positive medium), PB (positive big)}. Gaussian membership functions are chosen for the input quantities, and triangular membership functions, which have higher sensitivity, are used for the output quantities. Fuzzy reasoning adopts Mamdani inference, and defuzzification adopts the center of gravity method. The specific fuzzy rules are shown in Table 2. The parameters are corrected as follows: $K_p' = K_p + \Delta K_p$, $K_i' = K_i + \Delta K_i$, $K_d' = K_d + \Delta K_d$.
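A compact Python sketch of the gain-correction mechanism is given here for the proportional gain only. It uses Gaussian input memberships, triangular output memberships, Mamdani min-max inference, and center-of-gravity defuzzification, as described above. The normalized universe [−3, 3], the membership widths, and in particular the rule surface in `rule_output_index` are hypothetical placeholders; they do not reproduce Table 2.

```python
import math

# Centers of NB, NM, NS, ZE, PS, PM, PB on an assumed normalized universe.
CENTERS = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]


def gaussian_membership(x, center, sigma=0.5):
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def triangular_membership(x, center, half_width=1.0):
    return max(0.0, 1.0 - abs(x - center) / half_width)


def rule_output_index(i, j):
    """Hypothetical rule surface (NOT Table 2): large negative error and
    error rate map toward PB, large positive ones toward NB."""
    return 6 - (i + j) // 2


def delta_gain(error, error_rate):
    """Mamdani inference with center-of-gravity defuzzification."""
    mu_e = [gaussian_membership(error, c) for c in CENTERS]
    mu_ec = [gaussian_membership(error_rate, c) for c in CENTERS]
    strength = [0.0] * 7  # strongest firing level for each output label
    for i in range(7):
        for j in range(7):
            k = rule_output_index(i, j)
            strength[k] = max(strength[k], min(mu_e[i], mu_ec[j]))
    numerator = denominator = 0.0
    for step in range(121):  # sample the output universe from -3 to 3
        z = -3.0 + 0.05 * step
        mu = max(
            min(strength[k], triangular_membership(z, CENTERS[k]))
            for k in range(7)
        )
        numerator += z * mu
        denominator += mu
    return numerator / denominator if denominator > 0.0 else 0.0


kp = 2.0  # hypothetical base proportional gain
kp_corrected = kp + delta_gain(-3.0, -3.0)  # K_p' = K_p + delta K_p
```

The integral and derivative gains would be corrected in the same way, each with its own rule table.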

System simulation and result analysis
The actual line data (Hou, 2015) from Huashan North Station to Xi'an North Station of the Zhengxi High-speed Railway are used as the baseline data to validate the effectiveness of the hybrid modeling and algorithm proposed in this paper. The simulation tests are conducted on a Windows 10 system with MATLAB R2019b as the simulation environment. Selected train parameters and line data are listed in Tables 3 and 4, respectively. Figure 5 shows the target speed profile of the CRH3 train for a one-way run on this test line, with a one-way run time of 1680 s. During the simulation, the parameters of the RTMW-LWKPCR algorithm for the hybrid model were set as follows: the window length was 200 s/step, the kernel function width parameter was 3, the two scale factor parameters were 0.8 and 1.2, and a maximum step size (in s/step) was specified. The ±5% speed error of the actual line data is used as the data sample. The speed error compensation values of the data-driven model based on the normalized RTMW-LWKPCR algorithm are shown in Figure 6. Speed profile tracking under the same predictive fuzzy PID controller is executed separately for the mechanism model and the hybrid model, and the simulation results are depicted in Figure 7, with local enlargements in Figures 8 and 9. As shown in Figure 7, the speed curve under the hybrid model closely follows the target speed curve (solid line), so the model and algorithm track the target speed-time curve with a satisfactory effect. Figure 10 compares the speed control errors under the two models.
According to the simulation results in Figure 10, the error range of speed control based on the single mechanism model is ±0.45 m/s, with a maximum speed control error of 0.4373 m/s, whereas the error range of speed control based on the hybrid model is ±0.15 m/s, with a maximum speed control error of 0.2328 m/s.
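The relative reduction implied by the two maximum errors above can be checked with a short calculation. The Python sketch below uses only the two reported maximum errors; the function name is hypothetical.

```python
def relative_reduction(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline * 100.0


# Maximum speed control errors in m/s, as reported for the two models.
mechanism_max_error = 0.4373
hybrid_max_error = 0.2328

max_error_reduction = relative_reduction(mechanism_max_error, hybrid_max_error)
print(f"{max_error_reduction:.2f} %")  # prints "46.76 %"
```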

Conclusion
On the basis of the fundamental concept of multi-model fusion, a hybrid mechanism and data-driven model of the high-speed train is created. The simulation verification and result comparison demonstrate that, compared with the single mechanism model, the maximum speed control error based on the hybrid model is reduced by 46.76% and the average speed control error is reduced by 69.42%, which demonstrates the effectiveness of the hybrid model and compensation algorithm. In addition, the real-time moving window keeps the model updated in real time, enhancing its precision. Overall, the proposed model and compensation algorithm significantly reduce the speed control error and raise the speed control precision of high-speed trains by 13.75%. The innovative points of this study are:
(1) A hybrid modeling method based on mechanism and data-driven models is proposed, which increases modeling accuracy and speed control effect and overcomes the low accuracy of single modeling.
(2) Due to the real-time nature of high-speed train operation, the real-time moving window is designed to continually track and update the model's state in real-time, thereby enhancing the model's accuracy.
In this paper, part of the modeling of high-speed trains is examined, and interim research results are presented. However, there are still issues that require additional research. In the next phase of the research, the changes in the internal forces between carriages and in the running lines (ramps, curves, etc.) will be analyzed in order to improve the design of the model error compensation algorithm and the optimization control method, thereby advancing the train speed control objective.