A Statistical Model for Phase Difference Spectrum of Ground-Motion and Its
Application in Generating Non-Stationary Seismic Waves

The intensity non-stationarity is one of the most important features of earthquake records. Modeling of this feature is significant to the generation of artificial earthquake waves. Based on the theory of phase difference spectrum, an intensity non-stationary envelope function with log-normal form is proposed. Through a tremendous amount of earthquake records downloaded on Kik-net, a parameter fitting procedure using the genetic algorithm is conducted to obtain the value of model parameters under different magnitudes, epicenter distances and site conditions. A numerical example is presented to describe the procedure of generating fully non-stationary ground motions via spectral representation, and the mean EPSD (evolutionary power spectral density) of simulated waves is proved to agree well with the target EPSD. The results show that the proposed model is capable of describing the intensity non-stationary features of ground motions, and it can be used in structural anti-seismic analysis and ground motion simulation.


Introduction
With the development of earthquake engineering and the innovation of novel structure forms, it is necessary to obtain a better understanding of these structures' performance under seismic loads. It's preferable to conduct the time-history analysis, especially when the inelastic behavior of a structure is innegligible. Seismic inputs for the time history analysis are the records of ground motions, which can be commonly obtained by selecting ground motions recorded during the past earthquake events based on the site type, and epicenter distance, etc. Nevertheless, due to the complexity of geographical conditions and the variety of design schemes, records of natural ground acceleration are far from enough, which makes it an alternative to use synthesized earthquake waves by numerical methods.
There are mainly two types of stochastic ground motion models: one is 'source-based' models [1,2], which considers the effect of fault rupture, propagation of seismic wave and local site of structure, and the other is 'site-based' models [3,4], which are derived from the description of the ground motion for a specific site by fitting to a recorded earthquake wave with known earthquake and site characteristics. The former one is a mixed model of deterministic and stochastic models and has to depends on seismological principal when describing the fault breaking mechanism and seismic wave travel path. Therefore, the model is favored by seismologists, but cannot be widely used in engineering due to the lack of engineering data. The latter model does not rely on physical mechanisms of ground motions but emphasizes on parameter fitting of the model based on ground motions recorded on specific sites with known earthquake and site characteristics. The seismic waves synthesized based on site-based model and the recorded earthquake waves are statistically similar. Furthermore, the correlation between model parameters and the site characteristics can be obtained using suitable statistical methods, which can be adopted to simulate ground motions on the specific sites.
It is a challenging issue for accurate modeling of non-stationary earthquake ground motions by sitebased models. Ohsaki [5] believed that the main sources of non-stationary characteristics of earthquake waves are phase spectrum and firstly pointed out that the shape of the phase difference spectrum is somehow similar to the envelope function of earthquake records. Researches have been carried out on this interesting phenomenon. Zhao et al. [6,7] studied the statistical natures of the phase difference spectrum and then established an empirical model from the perspective of seismology. It was revealed that the phase difference spectrum has an impact on the SDOF response spectrum by changing the shape of response time history. Cheng et al. [8] defined the concept of principal value for phase difference spectrum and discussed the relationship between phase difference spectrum and the distribution function of its principal value. Zhu et al. [9] analyzed the connection between the statistical characteristics of phase difference spectrum and the magnitude and epicenter distance of earthquake records. Jin et al. [10] proved that for a given earthquake record the amplitude of the phase difference spectrum is directly proportional to the value of envelope function at a certain point of time. That is to say, it's applicable to generate intensity non-stationary records by altering the phase spectrum instead of modulating the amplitude. Zhao et al. [11] pointed out the inner connection between phase difference spectrum and velocity pulse. Thrainsson et al. [12] studied the probability distribution law of the phase difference spectrum and then offered a formula to describe the patterns of its mean value and variance from a statistical perspective.
Researches about the application of the phase difference spectrum have also been carried out widely. Zhou [13] developed a method to simulate artificial seismic waves via the phase difference spectrum. Tian et al. [14] combined FFT technology with a statistical phase difference spectrum model to generate non-stationary seismic motion field. Yang et al. [15] researched the numerical characteristics of phase difference spectrum and its connection with ground motion parameters. Based on their result a seismic simulation method was also been proposed. Similar research has been conducted by Yi et al. [16] on the generation of artificial seismic waves via the distribution law of phase angle and phase difference spectrum. Li et al. [17] simulated multi-point earthquake excitations for engineering applications according to the statistical characteristics of phase difference spectrum on different site conditions. Tan et al. [18] analyzed the numerical characteristics of 67 strong motion records in the western part of China, and then regenerated the seismic waves, which proved the feasibility of the earthquake generating method based on the phase difference spectrum. Yang [19] generated fully non-stationary seismic waves on the basis of the relationship between phase difference spectrum and intensity non-stationary characteristics. Yang et al. [15,19,20] generated several artificial seismic waves according to the hypothesis about the probability distribution of the phase difference spectrum.
The intensity non-stationarity is one of the most important features of earthquake waves, it reflects the arrival time of mainshock and the variation of energy release. Hence, it's necessary to gain a better understanding of the non-stationary nature of earthquake waves and then describe it mathematically. Up to present, the intensity non-stationary characteristic of an earthquake wave is often described by envelope functions [21][22][23][24]. The accuracy of a model mainly depends on the value of parameters once the envelope function form has been determined. Significant efforts have been made to identify the parameters, which also statistically reveal the nature of earthquake waves. Huo et al. [25] studied the influence of magnitude, epicenter distance and site classification on the model parameters by a statistical regression method. Similar research has been conducted by Qu et al. [26] to build a regression model of earthquake intensity envelope function, which indicates that the thickness of soil where microseismograph locate has an influence on the value of parameters.
Due to the limitation of strong earthquake observation techniques, however, the sample size of these researches is not large enough, which makes the validation of parameters less than satisfactory. Furthermore, most of the researches focus on the parameters' ability to control envelope function shapes. It is of equal importance to interpret the physical meanings of the parameters, so as to uncover the characteristics of earthquake waves statistically.
Through the analysis of a tremendous amount of natural earthquake records, an intensity non-stationary envelope function with a log-normal distribution form is proposed. A phase difference method is used to reveal the underlying relationship between intensity non-stationarity and phase difference spectrum of earthquake waves. The phase difference method also provides a unique perspective to consider the intensity non-stationarity of earthquake waves, which improves the spectral representation method by predetermining the phase distribution. Modulating the Clough-Penzien spectrum with the proposed intensity non-stationary envelope function, a fully non-stationary model is obtained. In the last section, a numerical example is provided to simulate artificial waves by spectral representation, and the mean EPSDs of simulated waves are proved to be identical with the target EPSD.

Phase Difference Spectrum
Given the above, owing to the similarity between the shape of the phase difference spectrum and envelope function, the phase difference spectrum provides us a unique perspective to research the intensity non-stationary characteristics of ground motions.
The Discrete Fourier Transform of time series is given by: where, where f k denotes the phase angle, and H k denotes the amplitude. The phase difference is then defined as: where, is the Heaviside step function, which guarantees the value of phase difference restricted over the range -2π ≤ Δf < 0. The frequency distribution of Δf is the so-called phase difference spectrum.

Intensity Non-Stationary Model
The analysis of selected earthquake records indicates that their phase difference spectra mainly obey the logarithmic normal distribution. Hence, an intensity non-stationary envelope function (see in Fig. 1 where f(t) is a logarithmic normal distribution function multiplied by a factor I 0 . The logarithmic mean value μ and the logarithmic standard deviation σ are two parameters controlling the shape of the envelope, while the factor I 0 adjusts the peak value of f(t) to 1. E and V are the arithmetic mean and the arithmetic variance respectively. Because V controls the concentration of energy, so it will be mentioned as the shape parameter in later sections. t p is the mode of the lognormal distribution function, which represents the peak moment of the envelope.

Fully Non-Stationary Model
Kanai-Tajimi spectrum is commonly used to generate artificial seismic waves, which however only possess frequency non-stationary characteristic. But if modulate Kanai-Tajimi spectrum with the proposed envelope function, evolutionary power spectral density (EPSD) [27,28] can be obtained, which possesses time-frequency non-stationary characteristics. The spectral representation method can then be adopted to generate fully non-stationary seismic waves.
Kanai-Tajimi spectrum is inapplicable for its zero-frequency component is non-zero, hence the power spectral density suggested by Clough and Penzien is adopted here, which is then improved by Liu et al. [29] to be compatible with the Chinese national code for seismic design of buildings [30].
The Clough-Penzien model passes the Kanai-Tajimi spectrum through a soil filter with parameters ω f and ζ f ; ω g , ζ g is the characteristic frequency and damping ratio; S 0 is the intensity parameter, which can be represented by where the parameter a max is the peak acceleration corresponds to different design intensity, the recommended values of a max can be seen in Tab. 2. The parameter r is the peak factor, the recommended values of r can be seen in Tab. 1.
Tab. 1 shows the suggested model parameters for Clough and Peng model in several different soil types; Parameter a max in Eq. (6) can be obtained according to Tab. 2.
The fully non-stationary model modulates the Eq. (5) with an envelope function f(t), hence the EPSD E (ω,t) can then be described by Eq. (7).
where S a (ω) is a power spectral density function described by Eq. (5), f(t) is a time-dependent envelope function with lognormal form; I 0 is a model parameter which adjusts the peak of envelope function to 1; μ and σ are logarithmic mean value and standard deviation which control the shape of the envelope function.
These model parameters will be analyzed through a statistical approach based on an earthquake databank and then the recommended value will be provided.

Selection and Classification
According to the theory of mean shear wave velocity [30], site conditions of where the stations locate are divided into four types (see in Tab. 3). For the purpose of studying the intensity non-stationary property of

Genetic Algorithm
A genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate highquality solutions to optimization and search problems by relying on bio-inspired operators such as mutation, crossover, and selection [31].  The detailed parameter fitting procedure using the genetic algorithm is as follows: Step 1: Generate n pairs of values for parameter μ and σ randomly as the initial population. The corresponding envelope functions can thereby be expressed as: where the value of I 0 varies with μ i and σ i to ensure that the peak value of f i (t) equals exactly to 1.
Step 2: Instead of fitting the envelope obtained by connecting the extreme points, here we fitted the normalized cumulative energy function H ENVE and H TAR calculated by envelope function f i (t) and target earthquake record respectively, see in Fig. 3. The maximum error can then be calculated by: R i can reflect the fitness of the selected values of parameters. H(t) is the normalized cumulative energy function, which can be defined in Eq. (10).
where T stands for the total duration of target earthquake records. The purpose of this algorithm is to find a set of values that minimize the maximum deviation function R i .
Step 3 Select parents for reproduction based on their fitness, and then apply operators, including crossover and mutation, to generate offspring.
Step 4 Repeat Step 1 to Step 3 until all the value of R i is less than 10%, then the minimum values of μ min and σ min are selected as the final value.
The parameters used in the genetic algorithm can be seen in Tab. 5.
A total of 20850 pairs of parameters have been determined from the procedure above. Their relationship with magnitude and epicenter distance can readily be seen in Figs. 4 and 5. It's quite obvious that the parameter μ varies proportionally with magnitude M and epicenter distance R, while the relationship between σ and M, and between σ and R is no so clear.

Regression Model of Parameters
In order to better describe the relationship among the parameters μ, σ, M and R, an attenuation model [32] is introduced, as can be seen in Eq. (11).   where Y can be either μ or σ; c 1 , c 2, and c 3 are unknown constants; R 0 is the Epicenter distance; ε is the random error. A regression fitting procedure has been conducted for model parameters according to different site conditions. The principle is to minimize the value of J.
where ΔX and ΔYare the deviations in two horizontal directions; w x and w y are the weight coefficients, taken as 1.0 in this paper. The results of the regression are presented in Fig. 6. The detailed results are presented in Tab. 6.   According to different site conditions, the Kolmogorove-Smirnov test was conducted and the results showed that the parameter μ obeys Generalized Extreme Value Distribution (GEVD), of which the probability density function has the form where the k 1 , α, and β are model parameters. The analysis also indicates that the parameter σ obeys Logarithmic Normal Distribution (LND) where the σ 1 and μ 1 are model parameters, and the subscript is applied for distinction. The detailed results can be seen in Tab. 7.
For the purpose of engineering application, the recommended values of μ and σ are listed in Tabs. 8-11 according to three levels of M and R.

The Statistical Distribution of μ/σ
The above analysis considers μ and σ individually, but actually, these two parameters are not independent. In order to further establish the relationship between μ and σ, a similar statistical analysis procedure has been conducted to find out that the ratio of μ to σ obeys the Gamma Distribution (GD), whose probability density function has the form: where λ = 1/β; Γ(α) is the Gamma Function, which is defined as The model parameters α and β are obtained through a curve-fitting procedure. The results can be readily seen in Tab. 12.

The Statistical Distribution of μ with t p Fixed
When t p is fixed, the correspondence relationship between the two model parameters can be described as: Hence, once t p and σ are determined, μ can be calculated subsequently. The normalized envelope function with t p equals to 15 s, 25 s, 30 s respectively are demonstrated in Fig. 7. The analysis indicates that μ obeys Generalized Extreme Value Distribution. The fitted model parameters are presented in Tab. 13.

The Statistical Distribution of σ with V Fixed
The arithmetic variance V represents the level of energy concentration. When V is fixed, model parameters μ and σ are constrained by some relation in this form.
Hence, once the values of σ and V are obtained, μ can then be obtained. Fig. 8 displaces the normalized envelope function with V equals to 30, 90, 150 respectively. The analysis shows that the parameter σ conforms to the lognormal distribution.The fitted model parameters are presented in Tab. 14.

Numerical Examples
In this section, a set of artificial earthquake waves will be generated via a numerical method. The related parameters are demonstrated in Tab. 15.
Once the model parameters are determined, the uniformly modulated non-stationary stochastic process can be generated by the spectral representation method [33,34].
where  φ n are random phase angles uniformly distributed in the range of [0, 2π]; S a (ω) is the power spectral density function of ground motions defined in Eq. (5); E(ω,t) = f 2 (t)S a (ω) is the target evolutionary PSD function; ω u is the cut off frequency; x 0 (t) is a stationary random process with power spectral density function S a (ω); x(t) is the so-called uniformly modulated non-stationary stochastic process, one of the sample processes is demonstrated in Fig. 9.
A total of 300 ground motions have been generated by the spectral representation method, and the Spanos-Tratskas method [27] is adopted here to estimate the evolutionary PSD of simulated ground motions. The result demonstrated in Fig. 10 shows that the EPSDs of simulated ground motions are  identical to the target EPSD, then it's feasible to conduct structural analysis through these artificial motions. The detailed ground motion parameters are presented in Tab. 16, where PGA, PGV, PGD denote the peak ground acceleration, velocity, and displacement respectively; RMSA, RMSV, RMSD denotes the root mean acceleration, velocity, and displacement respectively. Fig. 11 shows the   The maximum and minimum response curve, and response curve with exceeding probability of 50% and 80% are plotted respectively.

Conclusion
Using the genetic algorithm, the parameters of the proposed non-stationary envelope function with lognormal form can be obtained based on a group of 3475 earthquake records. By statistical analysis of the obtained parameters, several important conclusions can be concluded as followers: (1) The model parameter μ increases with the magnitude and epicenter distance, which means that the peak moment of earthquake records arrives later with the increment of magnitude and epicenter distance. However, there is no obvious relationship between the model parameters and earthquake parameters, which means that the earthquake magnitude and epicenter distance have almost no effect on the shape of the envelope function.
(2) The model parameter μ and σ obeys generalized extreme value distribution and logarithmic normal distribution respectively, while The ratio μ/σ is subject to Gamma distribution. For the convenience of engineering application, the recommended value of μ and σ corresponding to particular magnitudes and epicenter distances are also presented in the paper.
(3) The Clough-Penzien spectrum was modified by the proposed intensity non-stationary envelope function to generate a time-variant evolutionary power spectral density. Using spectrum representation, 300 spectrum-compatible artificial ground motions are generated, and the numerical results showed that the mean EPSD of the generated ground motions are consistent with the target EPSD.