Real-time drive-by bridge damage detection using deep auto-encoder

Structural health condition monitoring of bridge structures has been a concern in recent decades due to their aging and deterioration, and its core task is damage detection. Recently, the drive-by method has gained much attention as it only needs several sensors installed on the passing vehicle. In this paper, we propose an automatic damage detection method that can be applied in real time while the vehicle is passing the bridge. There are three steps in the proposed method: (1) The vehicle's framed short-time vibrations instead of full-length data are utilized for training a deep auto-encoder model; at this stage, not commonly used time-domain accelerations of the passing vehicle


Introduction
Bridges play an essential role in modern transport systems and have a significant impact on people's daily lives. In recent years, however, the aging and deterioration of bridges have become a serious concern in many countries due to the rapidly increasing number of aged bridges. Many bridges in Europe were built in the middle of the last century and have served beyond their design life [1]. It was reported that in Finland, around 7,000 out of 15,160 bridges would require renovation by 2020, and around 5% of all bridges were in poor condition [2]. In the U.S., nearly half of all bridges were rated as fair and 7.6% of them were rated as poor [3]. In Japan, a large number of bridge constructions started in the 1960s and the majority of them have stood for three to four decades [4]. Health condition assessment of bridges is therefore crucial to keep them operating safely for the rest of their service life [5].
Damage detection, as an important component of structural health monitoring (SHM), can provide safety assessment and early warning for bridges. One promising method for bridge damage detection is to extract damage indicators (DIs) from the bridge's vibration data [6,7], such as natural frequencies, mode shapes, and damping [8][9][10][11]. DIs before and after damage can be utilized as references for assessing the bridge's health conditions. Traditionally, these DIs were extracted from [...] method has become a hot topic in recent years. Initially, researchers mainly focused on determining the bridge's natural frequencies from the vehicle's vibration data. In 2005, Lin and Yang [14] successfully extracted a field bridge's fundamental natural frequency with an instrumented tractor-trailer and a heavy truck (as the exciter) at speeds less than 40 km/h. In 2008, Wang et al. [15] proposed to extract the bridge's natural frequencies using a calibrated vehicle passing over a road with unknown roughness. A particle filter approach was employed, and the bridge's first frequency could be clearly identified in the power spectrum. In 2021, Shi and Uddin [16] investigated the theoretical principles of extracting multiple natural frequencies from the vehicle's vibration data. The results showed that the vehicle's frequency parameters had a significant influence on bridge frequency extraction. To improve the frequency identification precision, a single-axle test vehicle with two degrees of freedom (DOFs) was designed by Yang et al. [17] in 2022. The authors utilized contact point (CP) responses rather than the traditional accelerations of the vehicle's axles. Temporary vehicle stops were employed to refine the results, and the first three natural frequencies were extracted with good precision.
However, the bridge's natural frequencies may not be suitable DIs since local damage sometimes does not cause a perceptible change in them [18]. For this reason, several researchers have explored the vehicle's complete time-domain or frequency-domain responses rather than just the peaks. In 2017, a time-domain signal processing approach using wavelets to detect the bridge's damage was proposed by Hester and González [19]. The vehicle was considered as a point load or a two-axle sprung model, and the results showed that the damage was noticeable when the signal-to-noise ratio (SNR) was high. When the SNR decreased to 20 dB, however, the damage could not be detected. In 2014, Cerda et al. [20] proposed to classify different damage scenarios using the vehicle's frequency responses. The results showed that changes in the support conditions, damping, and damage in bridges could be classified well. Despite these studies, the drive-by damage detection method is currently applied mostly to small-span bridges, and few studies have addressed middle- or large-span bridges. Theoretically, the passing vehicle's short-time vibration data can reveal the bridge's health conditions, which is conducive to damage detection for large-span bridges.
In recent years, artificial intelligence (AI) has been influencing different fields due to the rapid development of computer science [21]. Employing machine learning and deep learning (DL) techniques in SHM has gained much attention owing to their intelligence and automation [22,23]. Researchers have begun to investigate the application of DL in drive-by bridge damage detection. In 2019, Locke et al. [24] trained convolutional neural networks to classify different levels of bridge damage severity. The environmental temperature, vehicle speeds, and vehicle weights were also considered during the training process. The results showed that the damage severity classification could be achieved with good accuracy (> 80%) when only low-frequency (3-10 Hz) response peaks were utilized. In the same year, Malekjafarian et al. [25] trained two artificial neural networks (ANNs) to predict the bridge's health conditions. The first model utilized the vehicle's positions and speeds as input features and its acceleration as labels. In the second model, the input features were frequencies and speeds, and the vehicle's frequency responses were used as labels. In the training stage, the vehicle's vibration data when passing a healthy bridge were used. Then the errors between predicted signals and true signals were calculated as DIs. The results showed that both the bridge's damage and its severity could be identified. Liu et al. [26] proposed that all the vehicle's frequency-domain responses were informative and required analysis in the process of damage detection. The authors utilized a stacked auto-encoder (SAE) model to reduce the input dimensions. Then the low-dimensional hidden state in the bottleneck was fed to a semi-supervised model with a few labeled data. The test results showed that a mass increase of about 0.1% could be detected. In 2021, Feng et al. [27] utilized the bridge's instantaneous forced frequencies and k-Nearest Neighbors (KNN) to locate and quantify its damage. The results showed that the damage's degree and position could be identified in optimal cases, but the precision was relatively low near the supports. Also in 2021, instead of using the vehicle's frequency-domain responses directly, Corbally and Malekjafarian [28] proposed to feed the CP frequency responses to ANNs. The damage detection results showed that CP responses were more robust than the traditional accelerations of the vehicle's axles, and the damage could be distinguished under different vehicle speeds, ambient temperatures, and road roughness. However, the existing DL-based methods need labels for different damage cases beforehand. In practical engineering, damage scenarios of the bridge are scarce and often cannot be found and labeled, which makes supervised learning techniques (which require labeled data) difficult to apply in the real world.
Unsupervised learning can overcome the above problem because it does not need labeled data; instead, features are learned from the data themselves. In 2022, the K-means algorithm was utilized by Yang et al. [29] to cluster peak frequencies of the CP's principal components based on the singular spectrum. An experimental test demonstrated that the proposed algorithm could recognize the first two natural frequencies of the bridge. The auto-encoder, as an effective unsupervised learning method, has recently been investigated by researchers in SHM. In 2018, Pathirage et al. [30] utilized an auto-encoder model as a dimension reduction tool to obtain dense features, and the features in the bottleneck were utilized for training a linear regression model to predict stiffness reduction parameters. Compared to the traditional ANN model, the proposed framework improved accuracy and efficiency simultaneously. In 2021, Wang and Cha [31] combined an auto-encoder and a one-class support vector machine, and a damage detection accuracy of 91.0% was reached using the features extracted by the auto-encoder. Later, Shang et al. [32] built a deep convolutional auto-encoder model to reconstruct the cross-correlation functions of original signals. Compared to the traditional auto-encoder, the proposed approach was robust to noise and environmental impacts. The capability of the auto-encoder has thus been demonstrated in SHM. However, to the best of the authors' knowledge, it has rarely been investigated in drive-by bridge damage detection.
In this paper, an unsupervised deep auto-encoder (DAE) model is proposed to automatically detect the bridge's damage in real time, in which short-time data are employed to assess the bridge's health conditions while the vehicle is running on the target bridge. The potential of the proposed idea is explored using a laboratory U-shaped beam and a model truck equipped with two accelerometers. The DIs are then extracted by the trained DAE model, and real-time damage detection is performed. The remainder of this paper is organized as follows: Section 2 introduces the basic principles of the deep auto-encoder and the process of extracting DIs from original vibration data. Section 3 shows the laboratory setups and damage scenarios. Section 4 presents the results of the experiments and some discussion of the proposed idea. Finally, conclusions and future work are provided in Section 5.

Proposed method
The proposed method mainly includes three phases: data preprocessing, DAE training, and damage indicator extraction. In the first phase, vertical acceleration signals of the passing vehicle are collected and truncated, and the accelerations are transformed from the time domain to the frequency domain to obtain the vehicle's frequency responses. In the second phase, frequency responses of the vehicle passing the healthy bridge are utilized for training a DAE that can identify the bridge's health conditions. In the third phase, the trained DAE is utilized to reconstruct the vehicle's frequency responses, and DIs are extracted from the original and reconstructed responses. The proposed method only requires acceleration data of the vehicle passing the intact bridge, and the extracted DIs provide critical information about the bridge's health conditions. An overall schematic view of the proposed method is shown in Fig. 1. In later subsections, all phases are explained in detail.

Data preprocessing
Acceleration selection. Since the proposed method aims to identify the bridge's conditions in real time, the vehicle is expected to run at a constant speed on the bridge when the analysis starts. In addition, to keep the analysis as accurate as possible, the vehicle's accelerations are used only when both the front and rear tires are on the bridge.
Framing. After acceleration data of the vehicle passing the healthy bridge are collected, they need to be divided into frames so that damage detection can later be performed in real time. As the vehicle is running on an intact bridge, each frame of the passing vehicle's vibration data should indicate that the bridge is healthy. When framing, an overlapping area is needed between two adjacent frames to keep the signal's stationarity and detect the damage in real time. The overlapping area is selected according to the update rate of damage detection (namely, the damage detection frequency, DDF). For example, if the DDF is set as 100 Hz, the non-overlapping time between two adjacent frames is 0.01 s. The frame length $T_f$ needs to be properly selected. A long $T_f$ means more signal to be analyzed, which increases the analysis time, and the analysis time needs to be less than $1/\mathrm{DDF}$. Also, a long frame means the vehicle must run a long distance on the bridge before the analysis, which may not be suitable for relatively short bridges. Nevertheless, the frame cannot be too short because the vehicle may not be able to collect enough dynamic information for analysis. It is recommended to determine the frame length by using two thresholds: the minimum frame length and the longer frame length. The minimum frame length is determined by the estimated first natural frequency $f_{b1}$ and natural period $T_{b1}$ of the bridge, where $T_{b1} = 1/f_{b1}$. In one frame, the bridge needs to vibrate at least once following the first mode shape, so that the vehicle can collect at least one cycle of the bridge's vibration in the selected frame. Therefore, the frame length $T_f$ is expected to be greater than $T_{b1}$. Generally, in practical engineering, the bridge's first natural frequency is less than 10 Hz; thus, the minimum length of $T_f$ is 0.1 s. To secure a high accuracy of damage detection, however, a longer frame length, such as $10T_{b1}$, is recommended. In terms of the longer frame length, half of the total passing time is recommended.
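The framing step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `frame_signal`, the 1 kHz sampling rate, and the placeholder signal are assumptions, and the hop between frame starts is taken as the sampling rate divided by the DDF.

```python
import numpy as np

def frame_signal(x, fs, frame_len_s, ddf_hz):
    """Split a 1-D signal into overlapping frames.

    Adjacent frames start fs / ddf_hz samples apart, so one new frame
    (and hence one new DI value) becomes available ddf_hz times per second.
    """
    frame_len = int(round(frame_len_s * fs))   # samples per frame
    hop = int(round(fs / ddf_hz))              # non-overlapping samples
    if len(x) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return np.asarray(x)[idx]

fs = 1000.0                        # hypothetical sampling rate (Hz)
x = np.arange(int(3.0 * fs))       # 3 s placeholder signal
frames = frame_signal(x, fs, frame_len_s=1.0, ddf_hz=100.0)
```

With a 1.0 s frame and a DDF of 100 Hz, adjacent frames share 0.99 s of data, which is why the number of frames grows quickly with the passing time.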
Windowing. After the vibration data are truncated into frames, a window function is applied to each frame to alleviate spectral leakage. Common window functions include the Hann, rectangular, flat top, and Blackman windows. As the vehicle's vibration data are easily influenced by environmental noise, the Hann window is selected in this paper.

Short-time Fourier transform (STFT). Employing the parameters selected above, the STFT can be carried out to extract the frequency responses of each frame. It is worth noting that not all frequency responses of each frame need to be utilized for analysis because the bridge's first three natural frequencies are relatively low. If the vehicle passes the healthy bridge $n$ times (represented by $n$ runs) in the training stage, and the vibration data of the $i$th pass can be divided into $N_i$ frames, we can get $N = \sum_{i=1}^{n} N_i$ frames in total. Then one frame's frequency responses can be regarded as a sample to feed the auto-encoder in the next step. Suppose that $\tilde{N}$ frequency responses are selected; then there are $\tilde{N}$ features in one sample.
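The windowing and transform of a single frame might look like the sketch below. It assumes NumPy's `hanning`, `fft.rfft`, and `fft.rfftfreq`; the helper name `frame_spectrum`, the synthetic 20 Hz tone, and the FFT length are illustrative choices rather than values from the paper.

```python
import numpy as np

def frame_spectrum(frame, fs, n_fft, f_max):
    """Magnitude spectrum of one Hann-windowed frame, kept up to f_max (Hz)."""
    window = np.hanning(len(frame))
    # rfft zero-pads the windowed frame to n_fft points when n_fft > len(frame)
    spec = np.abs(np.fft.rfft(frame * window, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    keep = freqs <= f_max              # retain only the low-frequency band
    return freqs[keep], spec[keep]

fs = 1000.0                            # hypothetical sampling rate (Hz)
t = np.arange(int(fs)) / fs            # one 1.0 s frame
frame = np.sin(2 * np.pi * 20.0 * t)   # synthetic 20 Hz tone
freqs, mag = frame_spectrum(frame, fs, n_fft=4096, f_max=100.0)
peak_hz = freqs[np.argmax(mag)]
```

Each retained magnitude is one feature, so the length of `mag` plays the role of the number of selected frequency responses per sample.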

DAE
The auto-encoder is an unsupervised neural network that does not need labels for the samples. It is trained to make its outputs as similar to its inputs as possible. The traditional auto-encoder consists of an encoder and a decoder with only one hidden layer, which can be represented by Eqs. (1) and (2).

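$$\boldsymbol{h} = f(\mathbf{W}\boldsymbol{s} + \boldsymbol{b}) \tag{1}$$
$$\hat{\boldsymbol{s}} = g(\mathbf{W}^{*}\boldsymbol{h} + \boldsymbol{b}^{*}) \tag{2}$$
where $\boldsymbol{s}$ is the input vector, and $\hat{\boldsymbol{s}}$ is the output vector. $\boldsymbol{h}$ is the hidden state of the inputs. $f$ and $g$ are the activation functions for the encoder and decoder, respectively. $\mathbf{W}$ and $\mathbf{W}^{*}$ are the weight matrices for the encoder and decoder, and $\boldsymbol{b}$ and $\boldsymbol{b}^{*}$ are the bias vectors for the hidden layer and output layer, respectively. The target of the auto-encoder is to optimize $\mathbf{W}$, $\mathbf{W}^{*}$, $\boldsymbol{b}$, and $\boldsymbol{b}^{*}$ to minimize the difference between inputs and outputs.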
As the auto-encoder needs to reconstruct the inputs using the information in the hidden layer, the hidden layer contains the most crucial information of the inputs. Some unrelated or insignificant information is dropped in the hidden layer. Therefore, it can be used to reduce the dimensions of the inputs. On the other hand, because some information is lost in the hidden layer, the auto-encoder cannot reconstruct the inputs completely. The error between inputs and outputs can be utilized as a loss function to train the auto-encoder.
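A single-hidden-layer forward pass of the kind described above can be sketched in NumPy. The layer sizes, random weights, Leaky-ReLU encoder activation, and identity decoder activation are assumptions for illustration; the network is untrained, so the reconstruction error is only meant to show how it is computed.

```python
import numpy as np

def leaky_relu(z, slope=0.01):
    """Leaky-ReLU activation (slope value is an assumption)."""
    return np.where(z > 0, z, slope * z)

rng = np.random.default_rng(0)
n_features, n_hidden = 8, 3                      # hypothetical layer sizes
W = rng.normal(scale=0.1, size=(n_hidden, n_features))       # encoder weights
b = np.zeros(n_hidden)                                       # encoder bias
W_star = rng.normal(scale=0.1, size=(n_features, n_hidden))  # decoder weights
b_star = np.zeros(n_features)                                # decoder bias

s = rng.normal(size=n_features)                  # one input sample
h = leaky_relu(W @ s + b)                        # encoder: hidden state
s_hat = W_star @ h + b_star                      # decoder with identity activation
error = np.sum((s - s_hat) ** 2)                 # squared reconstruction error
```

Because the hidden state has fewer entries than the input, the decoder has to rebuild the sample from a compressed representation, which is the dimension reduction effect discussed in the text.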
The DAE is an improved auto-encoder model with more hidden layers. With these hidden layers, the auto-encoder's learning capability is boosted, and it can be used to address more complex inputs. However, as the number of hidden layers increases, the auto-encoder may overfit, meaning that it learns from noisy information and the trained model cannot generalize to other samples. The number of hidden layers is a hyperparameter that needs to be determined before training the DAE. For a DAE model, given an unlabeled dataset $\mathbf{S} = [\boldsymbol{s}_1, \boldsymbol{s}_2, \ldots, \boldsymbol{s}_N]$ and the reconstructed dataset $\hat{\mathbf{S}} = [\hat{\boldsymbol{s}}_1, \hat{\boldsymbol{s}}_2, \ldots, \hat{\boldsymbol{s}}_N]$, the loss function can be represented by Eqs. (3) and (4), where $\boldsymbol{\theta}$ represents all parameters in the DAE network, including the weight matrices $\mathbf{W}_l$, $\mathbf{W}^{*}_l$ and bias vectors $\boldsymbol{b}_l$, $\boldsymbol{b}^{*}_l$; here, the subscript $l$ denotes the parameters for the $l$th hidden layer and there are $L$ hidden layers in total. $\Omega(\boldsymbol{\theta})$ is a regularization term imposed on the weights to prevent overfitting.
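As an illustration only, a regularized reconstruction loss of this kind is commonly written as follows for $N$ samples and $L$ hidden layers. The mean-squared-error form, the squared Frobenius-norm penalty, and the symbols $J$ and $\lambda$ are assumptions and may differ from Eqs. (3) and (4).

```latex
J(\boldsymbol{\theta}) = \frac{1}{N}\sum_{i=1}^{N}
  \lVert \boldsymbol{s}_i - \hat{\boldsymbol{s}}_i \rVert^{2}
  + \lambda\,\Omega(\boldsymbol{\theta}),
\qquad
\Omega(\boldsymbol{\theta}) = \sum_{l=1}^{L}
  \left( \lVert \mathbf{W}_l \rVert_F^{2} + \lVert \mathbf{W}_l^{*} \rVert_F^{2} \right)
```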
Then, all parameters are updated in the opposite direction of the loss function's gradient with respect to the parameters to minimize the loss. Because the computer may not be able to process all inputs at the same time, mini-batches are commonly utilized in the training process. It has been shown that mini-batches can improve training efficiency and accelerate convergence [33]. The next step is determining the step length (namely, the learning rate in the DAE). It is usually difficult to decide on a proper learning rate because the loss surface is very complex and the starting point is random. The adaptive moment estimation (Adam) algorithm addresses this difficulty by adapting the step size at every iteration. In this work, it is observed that Adam performs better than other optimization algorithms.
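One Adam update can be sketched as below. The moment coefficients and epsilon are the commonly used defaults of the algorithm, the default `lr` follows the learning rate reported for training (read here as 1e-5), and the toy quadratic objective and larger demonstration step size are assumptions made only to keep the example short.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-5, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameters and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem: minimize ||theta||^2, whose gradient is 2 * theta.
theta = np.array([1.0, -2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
start_loss = np.sum(theta ** 2)
for t in range(1, 201):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.05)
end_loss = np.sum(theta ** 2)
```

In mini-batch training, `grad` would be the gradient of the loss evaluated on the current batch rather than on the full dataset.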
In this paper, the DAE is employed to extract features of the vehicle's vibration data when passing the healthy bridge. The objective of the DAE is to minimize the error when reconstructing the input. It is worth noting that, compared to previous auto-encoder studies, we employ the vehicle's frequency responses as input features to feed the DAE model. This is because accelerations usually contain high-frequency components induced by noise, which may have negative influences on damage detection results. It is easy for the DAE model to learn the features of noisy signals rather than the natural characteristics of the signal [32]. Therefore, frequency responses are utilized in this paper instead of the commonly used time-domain responses to improve the DAE model's robustness to noise.

Damage indicator
For damage detection, given a frame's original frequency responses $\boldsymbol{s}$ and its reconstructed frequency responses $\hat{\boldsymbol{s}}$ using the trained DAE, the DI can be represented by the squared error as shown in Eq. (5).

$$\mathrm{DI} = \lVert \boldsymbol{s} - \hat{\boldsymbol{s}} \rVert^{2} \tag{5}$$
There is one DI value for each frame. For one run of the vehicle (it passes the bridge once), the DIs can be calculated in real time. Because the DAE is trained only with frequency responses recorded when the vehicle passes the healthy bridge, it can reconstruct "healthy" frequency responses with high precision. When the vehicle passes a damaged bridge, the vehicle's frequency responses become abnormal for the trained DAE. Based on this principle, the trained DAE can differentiate the bridge's health conditions automatically. When the bridge is intact, DI values will be low. However, when the bridge is damaged, the trained DAE model cannot identify these frequency responses as "healthy", so DI values will be relatively high. Therefore, DI values can be utilized to determine the health conditions of the bridge.
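The per-frame DI can be computed directly from the original and reconstructed frequency responses. In this sketch the function name `damage_indicators` and the small arrays are hypothetical; each row stands for one frame.

```python
import numpy as np

def damage_indicators(S, S_hat):
    """Squared reconstruction error per frame (one DI value per row)."""
    S = np.asarray(S, dtype=float)
    S_hat = np.asarray(S_hat, dtype=float)
    return np.sum((S - S_hat) ** 2, axis=1)

# Two hypothetical frames: the first is reconstructed exactly, the second is not.
S = np.array([[1.0, 2.0, 3.0],
              [1.0, 2.0, 3.0]])
S_hat = np.array([[1.0, 2.0, 3.0],
                  [0.0, 2.0, 5.0]])
di = damage_indicators(S, S_hat)
```

A frame that the trained model reconstructs poorly yields a larger DI, which is the behavior expected for frames recorded on a damaged bridge.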

Setups
In this section, a lab-scale vehicle-bridge interaction (VBI) model is utilized to verify the proposed method. In the experiment, a UPE300 steel beam is used to simulate a two-span continuous bridge in practical engineering. The beam's cross-section parameters are shown in Fig. 2. The material of the beam is Q355 with a tested average Young's modulus of 199.0 GPa. The length of the beam is 6.0 m with three hinge supports at 0.15, 3.0, and 5.85 m, respectively. The beam's mass is weighed as 248.64 kg.
For the vehicle, a Tamiya model truck is utilized, and it can be controlled by a remote unit to run on the bridge (see Fig. 3). The length, width, and height of the truck are 570, 200, and 260 mm, respectively. It has its own independent suspension system, connection shift, rubber tires, etc., as shown in Fig. 4. The truck's mass is 4.305 kg. Since heavy vehicles are conducive to increasing the amplitudes of the bridge's vibrations and improving the accuracy of damage detection [34,35], 5.157 kg of extra mass is added to the truck's trunk. After adding the extra mass, the truck's front axle mass is 4.315 kg, and the rear one is 5.147 kg. In current research studies, the vehicle-bridge mass ratio is typically lower than 5% [36][37][38], and it is 3.8% in this paper, which is acceptable for simulating a real VBI system.
In order to keep the truck running in a straight line when it passes the bridge, two guide cables are used. This simulates a vehicle being driven straight in practical engineering. The cables have little influence on the truck's vertical vibration because they go through two pipes that are fixed on the truck (see Fig. 3a). It is worth noting that the cables are not very tight, so the truck's paths when passing the bridge differ between runs.
It has been reported that the vehicle's speed can influence drive-by damage detection [39]. In order to make the experiment more realistic, the truck runs at a constant speed when passing the beam, as discussed in Section 2.1. However, the truck's speeds are not exactly the same across different runs in the laboratory test. In the experiment, a wooden acceleration runway is set to accelerate the truck from rest to its highest speed. At the end of the beam, there is a deceleration runway to decelerate the truck. Therefore, the truck is driven at its highest speed when passing the beam. As mentioned before, only vibration data recorded when both the truck's front and rear axles are on the beam are analyzed. The relative frequency distribution of speeds in this experiment can be found in Fig. 5. It can be seen that all speeds are between 0.72 and 1.05 m/s and approximately follow a normal distribution.
In the experiment, two acceleration sensors are installed on the truck's front and rear axles to collect its vibration data. Both sensors are made by Brüel & Kjaer (type 4371). There are also four sensors attached to the bottom of the beam for analysis and comparison with the drive-by bridge damage detection results. The sampling frequency is 10 kHz. A laptop with data collection software is employed in the experiment. In addition, an I/O device and signal amplifiers are utilized. The experiment is performed with normal environmental noise in the structural laboratory at Aalto University. The laboratory deployment can be found in Fig. 6.
With regard to the dynamic characteristics of the above VBI model, the vehicle is scaled from a real truck (scale = 1:14). With the accelerometer attached to the rear axle, the measured first-order frequency is 19.531 Hz (introduced in Section 4.2). For a real car in Ref. [40], the rear axle's frequency is tested as 15.075 Hz. The scaled vehicle therefore not only scales the dimensions but also preserves a real car's dynamic characteristics well.
For the bridge model, the speed parameter can be used to evaluate the scaled laboratory model [41][42][43]. It can be obtained by Eq. (6), where $v$ is the vehicle speed, $f_{b1}$ is the fundamental frequency of the bridge, and $L$ represents its length. From Fig. 5, it can be seen that the scaled vehicle's speed is around 0.9 m/s. The laboratory beam's first three natural frequencies are tested as 30.748, 42.528, and 98.033 Hz (introduced in Section 4.2). Substituting $f_{b1}$ = 30.748 Hz, $L$ = 6 m, and $v$ = 0.9 m/s into Eq. (6), we obtain a speed parameter of 0.0027. If we keep the bridge's fundamental frequency constant, the experimental model can be utilized to simulate a vehicle with a speed of 9 m/s (32.4 km/h) passing a 60 m continuous bridge with two spans. It should be noted that the beam's first three natural frequencies lie in the 0-100 Hz range and are higher than the natural frequencies of real bridges. In practice, the bridge's fundamental frequency is generally lower than 5 Hz, and the first three natural frequencies are within 0-50 Hz [44][45][46]. However, the main objective of this paper is to investigate the bridge's dynamic information hidden in the passing vehicle's vibrations. Thus, the range of utilized frequency responses for training the DAE model can be selected wider than in applications for a real bridge, e.g., 0-100 Hz. In engineering applications, since the range including the real bridge's first three natural frequencies is lower, frequency responses within 0-50 Hz can be used for real-time monitoring of bridges.

Damage of the bridge
Theoretically, once damage occurs, the local stiffness of the bridge is reduced. As the bridge's natural frequencies are associated with the structural mass and stiffness matrices, a practical way to simulate the bridge's damage is to add mass to the bridge [47][48][49]. In this experiment, different masses are added to the bridge's two spans. The damage degree can be represented by the ratio of the added mass to the bridge's span mass. For each span, there are six damage scenarios (DS) in total: 5 kg, 10 kg, 15 kg, 20 kg, 25 kg, and 30 kg, represented by DS 1-6 as shown in Table 1, and DS 0 represents the intact bridge. At each span, one hook with a mass of 2.0 kg is employed to clamp the beam. For instance, a 5 kg mass added to one span of the beam is shown in Fig. 7.

Data preprocessing
After the truck runs several times on the healthy bridge in the experiment, the acceleration data of both the front and rear axles are recorded. Since they do not show apparent differences and the rear axle is heavier than the front one, only the rear axle's vibration data are utilized for the later analysis in this paper. One record of the truck's rear axle vibration data is shown in Fig. 8.
The next step is to select the length of the frame. In order to make the frame contain sufficient dynamic information about the bridge and improve the DAE model's robustness, the frame length is selected as 1.0 s in this paper. The DDF is set as 100 Hz; namely, the time resolution of automatic real-time damage detection is 0.01 s.
After the above preprocessing of all 506 runs of DS 0, 291,170 frames of the truck's vibration data are obtained, and each frame is 1.0 s long. For every frame, after the windowing and STFT, the vehicle's frequency responses in a range of 0-5000 Hz are obtained. However, not all frequency responses are suitable for damage detection of the bridge. On the one hand, high-frequency responses can contain much information about the ambient noise. On the other hand, selecting more frequency responses increases the number of neurons in the DAE's input layer, and the DAE model becomes very large. The computational cost then increases sharply, which cannot meet the requirements of real-time damage detection. Since the bridge's first three natural frequencies are smaller than 100 Hz [5,48], frequency responses within 0-100 Hz are selected in this paper. In order to include as many details of the frequency responses as possible, zero padding is employed when performing the FFT. The FFT resolution of each frame after padding is 0.0763 Hz, and there are 1310 data points between 0-100 Hz.
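The reported resolution and number of points can be checked with a short calculation. The padded FFT length of 2^17 points is an assumption inferred from the 10 kHz sampling frequency and the 0.0763 Hz resolution, and counting the bins without the zero-frequency bin is a further assumption that reproduces the reported count.

```python
fs = 10_000.0                 # sampling frequency from the experimental setup (Hz)
n_fft = 2 ** 17               # assumed zero-padded FFT length (131072 points)
resolution = fs / n_fft       # frequency spacing between adjacent bins (Hz)

# Number of non-zero-frequency bins at or below 100 Hz
n_bins = int(100.0 / resolution)
```

Under these assumptions, each 1.0 s frame of 10,000 samples is padded to roughly 13 times its length before the transform.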

Real-time frequency responses analysis
The above process can be applied to DS 1-6, and the vehicle's frequency responses with respect to time can be obtained. For comparison, the acceleration data of the sensors attached to the bridge (the direct method) are also plotted. The results are shown in Fig. 9.
It can be seen from Fig. 9 that when the truck's vibration data are utilized (indirect method), the responses at the truck's frequency are the highest because the sensor is attached to the truck's axle, and this frequency can be easily identified as near 20 Hz. If the bridge's vibration data are used (direct method), some peaks in the bridge's frequency spectrum can be captured. For the intact bridge (see DS 0: direct), it can be initially assumed that its first natural frequency is near 30 Hz, as this component has the highest amplitude. Since the frequency-time spectrum is complex, we employed a finite element (FE) simulation of the U-shaped beam.
The basic parameters of the FE model are listed in Table 2. It is built in Abaqus with S4R shell elements. There are 3200 elements and 3417 nodes in total, and each node has six DOFs: x-, y-, and z-translation, and x-, y-, and z-rotation. The boundary conditions are set according to Fig. 2. After modal analysis, the beam's first natural frequency is obtained as 30.790 Hz, and its first mode shape is shown in Fig. 10b. Thus, the peak around 30 Hz in DS 0: direct of Fig. 9 corresponds to the beam's fundamental frequency.
To verify our assumptions, free vibration tests are performed to determine the fundamental frequencies of the vehicle ($f_{v1}$) and the bridge ($f_{b1}$), and the results are shown in Figs. 11a and 12a. Exponential windows are applied to the free vibration signals to avoid spectral leakage. After the FFT is employed, it can be seen from Figs. 11b and 12b that the bridge's and vehicle's fundamental frequencies are $f_{b1}$ = 30.748 Hz and $f_{v1}$ = 19.531 Hz, respectively. The results of the free vibration tests agree with our initial assumptions based on Fig. 9. By analyzing the frequency-time spectra of DS 0-6 using the direct method, we find the following: (1) To some degree, the vehicle's frequency can be identified in the bridge's frequency responses, so the sensors installed on the bridge can capture the truck's frequency as well, but these signals are relatively weak. (2) With the increase of damage degree, the beam's first natural frequency decreases to 28.571, 27.551, 26.326, 25.306, 24.286, and 23.674 Hz for DS 1-6, respectively, but the vehicle's frequency remains constant. Compared to the direct method, in the frequency-time spectrum using the indirect method, the bridge's natural frequency is submerged by the vehicle's frequency responses. Even if some traces can be found around 30 Hz, they cannot be adequate references for damage detection as they are too weak and sometimes disappear. Therefore, it is difficult to detect the bridge's damage conditions using only the peaks of the truck's real-time frequency responses. The following sections introduce the damage detection method using the DAE.

Model configuration
Since the frequency responses of the passing vehicle are not on the same scale (see Fig. 9: indirect method), feature normalization is needed to improve the DAE model's capability. The process can be represented by Eq. (7), where $x_j$ is the $j$th feature; in this paper, it is the response at the $j$th frequency. $\mu$ is the mean value, and $\sigma$ is the standard deviation. It is worth noting that only training data are utilized to obtain the mean value and standard deviation for normalization.
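A minimal sketch of this normalization is given below, assuming a z-score form with the mean and standard deviation computed separately for each frequency bin. The helper names and the random placeholder data are hypothetical; the key point is that the statistics come from the training data only and are then reused for other data.

```python
import numpy as np

def fit_standardizer(X_train):
    """Per-feature mean and standard deviation from the training data only."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)   # guard against constant features
    return mu, sigma

def standardize(X, mu, sigma):
    """Apply previously fitted statistics to any dataset."""
    return (X - mu) / sigma

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(200, 4))   # placeholder samples
X_test = rng.normal(loc=5.0, scale=2.0, size=(50, 4))
mu, sigma = fit_standardizer(X_train)
Z_train = standardize(X_train, mu, sigma)
Z_test = standardize(X_test, mu, sigma)                   # reuses training statistics
```

Reusing the training statistics keeps the validation and testing data from influencing the preprocessing.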
The training, validation, and testing datasets are split as follows. There are 506 runs of DS 0 in total. The first 400 runs are used for training and validation, and runs 401-506 are employed for testing. Within the first 400 runs, a random 90% of the runs are selected for training, and the rest are used for validation. It is worth noting that the split into training, validation, and testing data should not be made over the pooled frames of all DS 0 runs. The reason is that adjacent frames (with 0.99 s of overlapping vibration data) are very similar. A frame-based split would make the DAE model fit both the training and validation data very well without generalizing to the testing data, and overfitting problems might go unnoticed. Instead, the split needs to be based on different runs. In our split, the test datasets are entirely new to the trained DAE model. The validation datasets are randomly selected, and they can be utilized to monitor the training process for overfitting problems.
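The run-based split can be sketched as follows. The function name `split_runs` and the random seed are assumptions; the run counts follow the description above.

```python
import numpy as np

def split_runs(n_runs=506, n_trainval=400, train_fraction=0.9, seed=0):
    """Return 1-based run IDs for training, validation, and testing."""
    rng = np.random.default_rng(seed)
    trainval = rng.permutation(np.arange(1, n_trainval + 1))
    n_train = int(round(train_fraction * n_trainval))
    train = np.sort(trainval[:n_train])          # random 90% of the first runs
    val = np.sort(trainval[n_train:])            # remaining 10%
    test = np.arange(n_trainval + 1, n_runs + 1) # held-out later runs
    return train, val, test

train_runs, val_runs, test_runs = split_runs()
```

All frames of a run then inherit the subset of that run, so overlapping frames never end up on both sides of a split.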
In the following training process, Adam is selected as the optimizer and the learning rate is set as $1 \times 10^{-5}$ to optimize the DAE model. A batch size of 128 is employed. To avoid overfitting problems, an early stopping strategy is adopted: the training stops when the loss reaches 0.0001 or after 1200 epochs. The trained DAE model at every epoch is saved, and the best model (determined by the validation loss) is selected among the 1200 epochs. In addition, the regularization parameter [...] [50] and scikit-learn [51] packages.
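The stopping rule and best-model selection can be sketched with plain Python. Which loss is compared against the 0.0001 target is not stated in the text, so using the training loss is an assumption, as are the function name and the made-up loss histories.

```python
def select_best_epoch(train_losses, val_losses, target_loss=1e-4, max_epochs=1200):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once the training loss reaches target_loss or max_epochs
    is hit; the best epoch is the one with the lowest validation loss so far.
    """
    best_epoch, best_val = None, float("inf")
    stop_epoch = 0
    for epoch, (tr, va) in enumerate(zip(train_losses, val_losses), start=1):
        stop_epoch = epoch
        if va < best_val:                 # keep the checkpoint with the best validation loss
            best_epoch, best_val = epoch, va
        if tr <= target_loss or epoch >= max_epochs:
            break
    return stop_epoch, best_epoch

# Hypothetical loss histories, one value per epoch
train_losses = [0.5, 0.2, 0.1, 0.05, 0.00005, 0.00001]
val_losses = [0.6, 0.3, 0.25, 0.28, 0.30, 0.31]
stop_epoch, best_epoch = select_best_epoch(train_losses, val_losses)
```

In this made-up history the validation loss is lowest at an earlier epoch than the stopping epoch, which is the situation where saving a checkpoint at every epoch pays off.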
Before training the DAE model, some hyperparameters need to be adjusted to suit bridge damage detection problems so that the features of frequency responses can be learned accordingly. The DAE model configuration includes (1) the number of hidden layers, (2) the number of neurons in the bottleneck, (3) the activation function, and (4) the regularization parameter. All employed hyperparameters are listed in Table 3.

Number of hidden layers. When different numbers of hidden layers are analyzed, the number of neurons in the bottleneck is fixed at 16. The activation function is Leaky-ReLU, and the regularization parameter is $1\times10^{-5}$. It can be seen from Table 4 that as the number of hidden layers increases, the DAE model's learning capability becomes stronger, and the validation loss becomes lower. The traditional autoencoder model (1 hidden layer) performs poorly and can hardly learn any features of the input frequency responses. Also, we can see that the testing loss is greater than the validation loss. This is because the training and validation datasets are randomly selected from runs 1-400. Between two adjacent runs, the truck's dynamic characteristics, engine capability, speeds, etc., do not vary much, so after the DAE model is trained on the training datasets, it is not hard for it to fit the validation datasets. The testing datasets, however, are completely new to the DAE model, so their loss is a little greater than the training and validation losses. Besides, we can see that when 7 hidden layers are utilized, the training loss continues to decrease, but the validation and testing losses rebound: the overfitting problem begins to emerge. In real engineering testing, the testing datasets are unknown, so we can only judge the DAE model's performance using the validation datasets. In this paper, 5 hidden layers are selected for later analysis.

Number of neurons in the bottleneck. Table 5 shows the training results when 16, 32, 64, and 128 neurons are utilized in the bottleneck. It can be seen that as the number of neurons increases, the training loss decreases sharply. The number of neurons represents how much information is retained in the DAE model's bottleneck. More neurons mean that less information is dropped in the DAE model, so the frequency responses can be reconstructed with high precision. If there are only a few neurons, the DAE model has to drop much information, including the bridge's dynamic properties. However, if the DAE model has too many neurons in the bottleneck, noise may also be kept when reconstructing the frequency responses. According to the authors' experience, the number of bottleneck neurons can be selected as roughly 1/10 of the number of neurons in the input layer. In this paper, 128 neurons in the bottleneck are selected.
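The layer layout implied by these choices can be written down with a small helper. This is only a sketch: the function name `symmetric_layer_sizes` is hypothetical, and the intermediate widths 512 and 256 are taken from the architecture 1310→512→256→128→256→512→1310 stated in this paper.

```python
def symmetric_layer_sizes(n_input, encoder_hidden, n_bottleneck):
    """Mirror the encoder widths around the bottleneck to obtain the decoder widths."""
    encoder = [n_input, *encoder_hidden, n_bottleneck]
    decoder = encoder[-2::-1]  # everything before the bottleneck, reversed
    return encoder + decoder

sizes = symmetric_layer_sizes(1310, [512, 256], 128)
n_hidden_layers = len(sizes) - 2   # exclude input and output layers
rule_of_thumb = 1310 / 10          # about 131, close to the selected 128
print(sizes, n_hidden_layers, rule_of_thumb)
```

The bottleneck width of 128 is the power of two closest to the 1/10 rule of thumb for the 1310 input features.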

Activation functions. As discussed above, the architecture 1310→512→256→128→256→512→1310 is selected. Different activation functions have different properties. In this paper, four activation functions (ReLU, Leaky-ReLU, Sigmoid, and Tanh) are tested on the datasets. Their best validation losses within 1200 epochs are 0.00071, 0.00068, 0.13782, and 0.00082, respectively. It can be seen that the Sigmoid function performs the worst, while the best losses of the other three do not differ much. The loss reaches its minimum when the Leaky-ReLU function is utilized. This matches the results of Ref. [52], where time-domain responses were utilized for training an auto-encoder model.
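For reference, the four candidate activation functions and the selection by best validation loss can be sketched as below. The negative slope of 0.01 for Leaky-ReLU is an assumed common default, since the value used in this study is not stated here; the losses are the ones reported above.

```python
import math

def relu(x):
    return max(0.0, x)

def leaky_relu(x, negative_slope=0.01):
    # negative_slope is an assumed default, not a value reported in the paper.
    return x if x >= 0.0 else negative_slope * x

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

# Best validation losses within 1200 epochs, as reported in the text.
best_val_loss = {
    "ReLU": 0.00071,
    "Leaky-ReLU": 0.00068,
    "Sigmoid": 0.13782,
    "Tanh": 0.00082,
}
best_activation = min(best_val_loss, key=best_val_loss.get)
print(best_activation)
```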
Regularization parameter. Another important hyperparameter for the DAE model is the regularization value, which is represented by the penalty term in Eq. (3). It is used to apply penalties on the weights of the DAE model so that the noisy representations in the training datasets can be reduced. As a result, the model's generalization capability is improved, and the overfitting problem can be circumvented. There are two kinds of regularization terms used in DAE models, as shown in Eqs. (8) and (9), where $\lambda > 0$ is a hyperparameter representing the penalty term's contribution, $\|\cdot\|_1$ is the $L_1$ norm, and $\|\cdot\|_2$ is the $L_2$ norm. $W$ denotes all weights that need to be penalized. Specifically, in the proposed DAE model, only weights in the encoder need to be penalized. No penalty is applied to the decoder, because we want it to reconstruct the signals as well as possible. The main difference between these two penalties is that $L_1$ can force the weights of unimportant features to zero so that the weight matrix becomes sparse, whereas $L_2$ regularization makes the weights of unimportant features as small as possible rather than exactly zero.
For the vehicle's frequency responses, many features relate to the vehicle rather than the bridge, so they have no strong connection to the bridge's health states (they can be understood as unimportant features for damage detection). Therefore, in order to avoid the singularity of the weight matrix and potential problems, the $L_2$ norm is selected and applied to the DAE model's encoder. To select the optimal $\lambda$, DAE models with different penalty terms are built, and the results are shown in Fig. 13. It can be seen that when $\lambda$ increases, the training, validation, and testing losses increase simultaneously. The regularization term plays its role in limiting the weights, but the testing loss increases the most sharply. In order to avoid overfitting without increasing the validation and testing losses too much, $\lambda = 1\times10^{-5}$ is selected in this paper. In summary, the final DAE model has the architecture 1310→512→256→128→256→512→1310 (5 hidden layers) with 128 neurons in the bottleneck, the Leaky-ReLU activation function, and an $L_2$ regularization value of $1\times10^{-5}$.
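The two penalty types can be compared on a toy set of encoder weights. This is a sketch under stated assumptions: Eqs. (8) and (9) are not reproduced in this section, so the $L_2$ penalty is assumed to take the common squared form, and the weight values and function names are made up for illustration.

```python
import numpy as np

def l1_penalty(weights, lam):
    """lam times the sum of absolute weights (drives unimportant weights to zero)."""
    return lam * sum(np.abs(w).sum() for w in weights)

def l2_penalty(weights, lam):
    """lam times the sum of squared weights (assumed squared form; shrinks weights)."""
    return lam * sum((w ** 2).sum() for w in weights)

# Toy encoder weights only; decoder weights are deliberately left unpenalized.
encoder_weights = [
    np.array([[0.5, -0.2], [0.0, 1.0]]),
    np.array([[2.0], [-1.0]]),
]
lam = 1e-5  # the value selected in this paper

print(l1_penalty(encoder_weights, lam), l2_penalty(encoder_weights, lam))
```

In a training loop, the chosen penalty would be added to the reconstruction loss before back-propagation.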

Automatic damage detection
Utilizing the hyperparameters selected above and training the DAE model for 1200 epochs, the training, validation, and testing losses are shown in Fig. 14. It is apparent that after 1200 epochs of training, the loss between the original and reconstructed frequency responses is quite low. An example of the reconstructed frequency responses of training, validation, and testing data is shown in Fig. 15. We can see that the frequency responses between 0 and 100 Hz of the vehicle passing the healthy bridge can be reconstructed with high precision. For the testing data, the reconstruction precision is close to that of the validation datasets. The results mean that the trained DAE model has learned the properties of the frequency responses of the passing vehicle when the bridge is intact.
For DS 1-6, the reconstructed frequency responses can be found in Fig. 16. It is evident that when the bridge's damage is relatively low (DS 1), the truck's frequency responses can be reconstructed with relatively good precision. However, as the damage severity increases, the reconstruction performance becomes worse, and the DI becomes higher. For DS 2, 3, and 4, we can see that the trained DAE model begins to fail to reconstruct the peaks of the truck's frequency responses, and for DS 5 and 6, some details of the frequency responses are lost in the reconstruction. Therefore, the DI calculated by Eq. (5) can be utilized to determine the bridge's health conditions automatically.
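The idea of a reconstruction-based DI can be illustrated with a short sketch. Eq. (5) is not reproduced in this section, so the sketch assumes, purely for illustration, that the DI of a frame is the mean squared error between the original and reconstructed frequency responses; the actual definition in Eq. (5) may differ.

```python
import numpy as np

def damage_indicator(original, reconstructed):
    """Per-frame mean squared reconstruction error (assumed form of the DI)."""
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    return np.mean((original - reconstructed) ** 2, axis=-1)

# Two toy frames with three frequency bins each.
frames = np.array([[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]])
recon = np.array([[0.0, 1.0, 2.0], [1.0, 2.0, 3.0]])  # second frame poorly reconstructed

di = damage_indicator(frames, recon)
print(di)
```

A frame that the model trained on the healthy bridge reconstructs well yields a DI near zero, while a poorly reconstructed frame yields a larger DI.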

Damage detection in real time
By reconstructing all frames of runs 401-506 of DS 0 (testing), we obtain the DS 0: 401-506 runs panel of Fig. 18a. It can be seen that most DIs are near zero, indicating that the trained DAE model can reconstruct frequency responses with high precision for the healthy bridge. In addition, we can see that when the truck is near the end of the bridge, DIs increase greatly, and this happens in DS 1-6 as well. Therefore, when the truck is near the support, the proposed method can easily misjudge the bridge as damaged. This matches the results of Ref. [27], in which the detection results become worse when the vehicle runs close to the supports at the ends. For this reason, the outliers in DS 0 need to be removed before making decisions on real-time damage detection. By sorting the DIs of all 54,455 frames of runs 401-506 of DS 0 in ascending order, Fig. 17 is obtained. It can be seen that the first $4\times10^4$ DIs are near zero, and the DIs increase slightly with the order number at a near-constant slope (green background area). However, due to the influence of the supports at the ends, after the first $4\times10^4$ DIs, the DI values begin to increase sharply, and the slope becomes higher (red background area). Therefore, it can be deemed that $4\times10^4/54455 \times 100\% = 73.46\%$ of the data in runs 401-506 of DS 0 are not outliers. The DI value at position $4\times10^4$ can be used as the threshold that separates healthy from damaged states. In this paper, the threshold is $7.815\times10^{-4}$.
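The threshold selection can be sketched as follows. The function names and the toy DI values are illustrative assumptions; in this paper, the cut-off position $4\times10^4$ was read from the change of slope in Fig. 17 rather than computed automatically.

```python
def threshold_from_sorted(healthy_dis, n_inliers):
    """Sort healthy-bridge DIs in ascending order and return the value at the cut-off position."""
    ordered = sorted(healthy_dis)
    return ordered[n_inliers - 1]  # 1-based position n_inliers

def is_damaged(di, threshold):
    """Flag a frame as damaged when its DI exceeds the threshold."""
    return di > threshold

# Share of frames treated as non-outliers in runs 401-506 of DS 0.
inlier_share = 40_000 / 54_455

# Toy example: ten healthy DIs, with the first seven treated as inliers.
toy_dis = [i * 1e-4 for i in (5, 1, 9, 3, 7, 2, 8, 4, 6, 10)]
toy_threshold = threshold_from_sorted(toy_dis, n_inliers=7)
print(round(inlier_share * 100, 2), toy_threshold)
```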
For all runs in DS 1-6 shown in Table 1, the real-time DIs are shown in Fig. 18a. We can see that for a given DS, the real-time DIs of different runs are similar. Also, it can be found that for the damaged scenarios (DS 1-6), the real-time DIs become greater and more disordered compared to DS 0. When the damage severity is low (see DS 1), the real-time DIs are relatively low and are similar to those of DS 0: 401-506 runs. When the damage severity increases, the DIs become higher. In the runs of DS 2, most DIs are within [0, 0.1]. For DS 3, it becomes quite hard for the trained DAE to reconstruct the frequency responses, and all DIs are within [0, 0.3]. For DS 4 and 5, the scale of the DIs goes back to [0, 0.1]. For DS 6, the DIs become greater, and the scale is [0, 0.5]. It can be found that the value of the DIs does not have a positive linear correlation with damage severity. Thus, the value of the DIs may not be suitable for determining the severity of damage. For comparison and to show the performance of the proposed method, the DIs of one run in each of DS 0-6 are plotted in Fig. 18b. As shown in the figure, for a run of DS 0 (intact bridge), the bridge is identified as healthy most of the time. When the bridge is damaged with low severity (e.g., DS 1 and 2), the proposed method sometimes wrongly determines that the bridge is intact (see DS 1: 5.43-5.50 s, 5.78-5.83 s, 5.88-5.89 s, 5.49-6.21 s, and DS 2: 3.21-3.24 s, 3.31-3.34 s). However, as the damage severity increases, the trained DAE model can consistently identify the bridge's damage while the truck is passing the bridge (see one run for DS 3, 4, 5, and 6). Therefore, in this paper, a new index named the identified damage ratio (IDR) is proposed to estimate the damage severity of the bridge, as shown in Eq. (10).
$$\mathrm{IDR} = \frac{N_d}{N} \times 100\% \tag{10}$$

where $N_d$ is the total number of moments at which the bridge is identified as damaged, and $N$ is the number of all frames for a certain DS. For example, for DS 1, there are 49 runs and 26,828 frames in total, and the bridge is identified as damaged at 19,205 of these moments. That is to say, IDR = 19205/26828 = 71.58%. It can be understood that while the truck is passing the bridge, the bridge is identified as damaged for 71.58% of the total time. By analyzing the damage indicators of DS 2-6, their IDRs are 80.29%, 82.42%, 92.87%, 97.54%, and 97.62%, respectively. It can be seen that as the damage severity increases, the IDR increases as well. In other words, if the bridge is damaged severely, it is easier for the trained DAE to determine that the bridge is damaged, with fewer mistakes. However, it must be noticed that when the bridge is damaged to a relatively high degree, the trained DAE can already detect the bridge's damaged state in real time with high accuracy (> 90%), namely a high IDR. In this regime, if the damage degree increases further, the IDR grows slowly (e.g., from DS 5 to DS 6, the IDR increases by only 0.08 percentage points) and gradually approaches 100%.
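The IDR is a simple ratio and can be sketched in a few lines. The function name and the toy DI list are assumptions for illustration; the threshold and the DS 1 frame counts are the values reported in the text.

```python
def identified_damage_ratio(dis, threshold):
    """Share of frames whose DI exceeds the threshold, as a fraction in [0, 1]."""
    flagged = sum(1 for di in dis if di > threshold)
    return flagged / len(dis)

# Toy example with the threshold reported in this paper.
toy_idr = identified_damage_ratio([0.1, 0.2, 0.0001, 0.3], threshold=7.815e-4)

# DS 1 counts reported in the text: 19,205 flagged frames out of 26,828.
idr_ds1 = 19_205 / 26_828
print(toy_idr, idr_ds1)
```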
In order to investigate the properties of the DIs, Fig. 19 plots the histograms of the DIs for DS 0-6. It can be noticed that as the DI value increases, the number of occurrences increases sharply at the beginning and decreases slowly after the peak. For a given scenario, the DIs approximately follow a lognormal distribution. The fitted lognormal probability density functions (PDFs) for DS 0-6 are shown. When the bridge is healthy (see DS 0: 401-506 runs in Fig. 19), most DIs are in the healthy area (green background), and a small part of the DIs are in the damaged area (red background). After the bridge is damaged, the proportion of DIs in the healthy area gradually shrinks and approaches zero. Also, it can be seen that when the damage severity is high (e.g., DS 6), there is a large number of DIs with high values (> 0.04) compared with DS 1-2, in which most DIs are in [0, 0.03]. By summarizing all IDRs for DS 1-6, the overall damage detection accuracy of the proposed method is 86.2%.
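A lognormal fit of this kind can be sketched with the standard library alone, using the maximum-likelihood estimates (mean and standard deviation of the logarithm of the DIs). The synthetic sample and its parameters are made up for the demonstration and are not the measured DIs; the fitting procedure actually used for Fig. 19 is not specified here.

```python
import math
import random
import statistics

def fit_lognormal(dis):
    """Maximum-likelihood lognormal parameters: mean and std of log(DI)."""
    logs = [math.log(d) for d in dis if d > 0.0]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)
    return mu, sigma

def lognormal_pdf(x, mu, sigma):
    """Lognormal probability density at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

# Synthetic DIs drawn from a known lognormal distribution (illustrative parameters).
rng = random.Random(0)
sample = [rng.lognormvariate(-6.0, 0.8) for _ in range(20000)]

mu_hat, sigma_hat = fit_lognormal(sample)
print(mu_hat, sigma_hat, lognormal_pdf(math.exp(mu_hat), mu_hat, sigma_hat))
```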

Consideration of engineering applications
This paper proposes an unsupervised method to monitor the health conditions of the bridge. Compared with other methods, it only needs an initial condition of the healthy bridge and does not require labeled data of the damaged case. The approach is based on using the DAE to detect minor changes in the time-variant frequency responses of passing vehicles before and after the bridge is damaged. Employing the proposed DIs, the bridge can be assessed in real time when the vehicle passes the bridge. The proposed approach needs many runs of the vehicle passing the ''healthy'' bridge to learn the bridge features hidden in the vehicular signals. Then, after months or years, the vibrations of the same vehicle passing the bridge are collected again for the real-time diagnosis of the bridge's health conditions.
However, in engineering applications, many factors can challenge the proposed method, for example, temperature effects, wind loads, and control of vehicle speed. One key obstacle is the adverse influence of road roughness. The experiments in this study are based on a relatively smooth road surface, and good results have been obtained. Still, in practical engineering, the road roughness can be worse than in the laboratory tests. Generally, road roughness can be classified by its power spectral density (PSD) into classes A∼H (smoothest ∼ poorest) according to ISO 8608 [53]. It has been reported that the road profile of bridges is better than that of normal roads [54] and can typically be regarded as A-class [55][56][57]. When the road surface is relatively smooth, the excitation of the vehicle comes mostly from the bridge's vibrations, so the vibrations of a single vehicle axle can contain much information about the bridge's dynamics, that is, the frequency-domain responses in this study. However, as the road roughness becomes poorer, the vehicle is excited by both the road roughness and the bridge's vibrations. It can be observed that the scale of road roughness is generally greater than the bridge's deflection [46,54,58,59], making the road roughness (a random external excitation) the main excitation source for the vehicle. Under this condition, the vehicle vibrates more strongly, but the road roughness can submerge the bridge's dynamic information. As little information about the bridge is contained in the vibrations of a single axle, it can become more challenging for the DAE to extract features related to the bridge's damage.
Therefore, eliminating the effects of poorer road roughness when employing the proposed method will be part of our future work. There are two ways to remove the influence of road roughness that deserve further verification before being applied in engineering applications. The first method is to utilize the residual responses of two connected vehicles to weaken the adverse effects of road roughness. The method typically subtracts the vibrations of the rear axle from those of the front one at the same point of the road profile (residual accelerations). It was found that the bridge's frequency responses are more prominent when the residual accelerations are employed than when the raw accelerations of one axle are used [60][61][62]. However, utilizing two connected vehicles may be hard to operate in practical engineering [45]. The second approach is to back-calculate a two-axle vehicle's CP responses and use the residual CP responses of the two axles [45,63]. Since CP responses are related only to the bridge's vibrations and the road roughness, the vehicle's information is completely removed in the frequency domain. Further, the influence of road roughness is mostly eliminated by subtracting the CP responses of the rear axle from those of the front one. However, this method relies on the precise calculation of CP responses from vehicular accelerations, as they cannot be measured directly. Besides, requiring the vehicle to be driven strictly straight and the wheels to pass over the same road roughness may be challenging to achieve in engineering applications. The above two methods for removing the influence of road roughness will be further investigated in our future studies, in which the DAE is employed to extract weak damage-sensitive features of bridges from the vehicle's vibrations, before the method is applied to practical engineering.
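The residual-response idea can be sketched as a time-shifted subtraction. This is a simplified illustration, not a method verified in this study: it assumes a constant speed, a known axle spacing, and a delay that is a whole number of samples; the function name, parameter values, and toy signals are hypothetical.

```python
def residual_response(front, rear, axle_spacing_m, speed_m_s, fs_hz):
    """Subtract the rear-axle signal from the front-axle signal at the same road position.

    The rear axle reaches a given road point axle_spacing/speed seconds after the
    front axle, so the rear signal is read with that delay.
    """
    lag = round(axle_spacing_m / speed_m_s * fs_hz)  # delay in samples
    n = min(len(front), len(rear) - lag)
    return [front[i] - rear[i + lag] for i in range(n)]

# Toy case: both axles respond only to the same road profile, so the residual vanishes.
road = [0.0, 1.0, -0.5, 2.0, 0.3, -1.2, 0.7, 0.1]
lag_samples = 2
front_signal = road
rear_signal = [0.0] * lag_samples + road[:-lag_samples]

residual = residual_response(front_signal, rear_signal,
                             axle_spacing_m=2.0, speed_m_s=10.0, fs_hz=10.0)
print(residual)
```

In this idealized case the roughness contribution cancels exactly; with real data, only the part of the response common to both axles at the same road point would be reduced.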
As an initial step, this study investigates the influence of engine effects, different driving traces, various vehicle speeds, etc., on real-time damage detection using the DAE. A relatively smooth road surface is employed at first in the laboratory tests. Our research will be extended to weaken the influence of road roughness, ongoing traffic, and other external loads before the proposed method can be employed in engineering applications.

Conclusions and future work
Utilizing the passing vehicle's vibration data, an automatic drive-by bridge damage detection method is proposed in this paper. The frequency responses of the vehicle are employed as inputs to train a DAE model that can identify the bridge's health conditions. By dividing the vibration data into frames, the damage detection process can be accomplished in real time. That is, the bridge's condition is assessed instantaneously while the truck is passing over the bridge instead of by analyzing the data after it has passed. The main concluding remarks are as follows:
(1) When passing the bridge, short-time vibration data from the vehicle can be used to evaluate the bridge's health states. In comparison with the existing methods that typically need the vehicle to pass the bridge and collect all vibration data, utilizing short-time accelerations is more efficient and can assess the bridge's health condition in real time.
(2) The proposed method can automatically identify the bridge's health conditions using the damage indicators extracted from the vehicle's vibration data. When six different damage scenarios are explored, a relatively high accuracy can be achieved (86.2% in this study), though the trained DAE model may misreport the health condition of the bridge near the supports.
(3) According to the results from the laboratory tests, the value of the DIs may not be suitable for determining the damage severity. Instead, the proposed IDR can be utilized as a reference for assessing the bridge's damage severity. With the increase in damage severity, the IDR increases remarkably at first and then gradually approaches 100%.
It is worth noting that the proposed method is a baseline-based method, which requires a large amount of the vehicle's vibration data recorded when passing the ''healthy'' bridge. Despite the findings summarized above, many external factors (e.g., seasonal temperatures, winds, the bridge deck's road roughness, and other traffic influences) can influence the automatic damage detection procedure and deserve further exploration. Our future work will first examine the influence of road roughness and investigate ways to weaken its influence on the proposed method. Then, other possible factors will be tested before field tests.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 4. Configuration of the truck and sensors.

Fig. 9. Comparison of frequency responses with respect to time using indirect and direct methods.

Fig. 14. Training, validation and testing loss for the intact bridge.

Fig. 17. Sorting frames of DS 0: 401-506 runs in ascending order. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 19. Distribution of DS 0-6's DIs. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 2
Basic parameters of the beam's FE model.

Table 4
Selection of hidden layers.

Table 5
Selection of neurons in the bottleneck.
Fig. 13. Loss with respect to the regularization parameter.