Intelligent Detection of Small Faults Using a Support Vector Machine

Small faults with a vertical displacement (or drop) of 2–5 m have become an important factor affecting the production efficiency and safety of coal mines. When 3D seismic data contain noise, predictions of small faults are prone to large errors. This paper proposes an intelligent small fault identification method that combines variational mode decomposition (VMD) and a support vector machine (SVM). A fault forward model is established to analyze the response characteristics of different seismic attributes in the presence of random noise. The results show that VMD effectively attenuates random noise and that the seismic attributes extracted from the denoised data correlate well with small faults. Analysis of the SVM algorithm and the fault forward model shows that it is feasible to predict small faults intelligently by using seismic attributes as the input of an SVM. The SVM-based fault prediction method proposed in this paper is more accurate than the principal component analysis method, and its prediction results provide important guidance and reference value for subsequent coal mining. Therefore, the method presented in this paper can be used as a new intelligent method for small fault identification in coal fields.


Introduction
The geometric and interfacial properties of faults in rocks are critically important to stress concentration and mining safety [1]. For low-frequency activity (less than 1000 Hz), the Biot-Gassmann theory describes how reflection amplitude relates to frequency and to petrophysical and fluid properties [2]. Since the 1990s, with the development of 3D seismic work in coal mining areas, seismic data have been used to identify faults with a drop greater than or equal to 10 m with a high coincidence rate. In areas with good seismic geological conditions, seismic data have been used to control faults with a drop greater than or equal to 5 m with a coincidence rate between 60% and 75%, providing a strong geological guarantee for high-yield, high-efficiency, and safe production in coal mines. In recent years, with the progress of seismic exploration technology, coal mines have demanded increasingly higher exploration accuracy, and many mines have included the interpretation of faults of about 3 m in the geological task [3]. Small faults in coal seams, especially those with a drop of less than 3 m, are important factors that often induce safety accidents. For example, water splashing and dripping near fault zones can easily lead to water bursting in a mine. Gas outbursts readily occur on both sides of a fault zone or a coal strata distortion zone. Small fault zones are also prone to caving at the top of the roadway [4]. Therefore, identifying small faults in coal seams is very important for preventing accidents, such as water bursts, gas outbursts, roof falls, and rock bursts, and for ensuring safe production in mines [5].
The existence of faults often makes the phase of seismic data unstable, but within the effective frequency band of seismic data, small faults become clearer as the frequency increases [6]. In recent years, researchers have conducted substantial work on techniques and methods for small fault identification, mainly using time-frequency attributes, seismic coherence attributes, instantaneous seismic attributes, curvature attributes, and texture attributes [7][8][9][10]. In the early 1980s, Morlet et al. [11] first applied the short-time Fourier spectrum to seismic interpretation. In the 1990s, Partyka et al. [12] obtained the spectrum (amplitude spectrum and phase spectrum) of the seismic trace by continuous time-frequency analysis, turning time-frequency analysis into a practical and simple interpretation tool and forming the seismic spectral decomposition technology. Partyka et al. [13] also used spectral decomposition technology to predict river courses and achieved good results. Marfurt et al. [14] used this technology for thin-layer visualization and sedimentary facies analysis. Wei et al. [2] applied spectral decomposition technology to reservoir fluid identification and achieved preliminary application results.
Furthermore, various methods, such as coherent volume technology, variance volume technology, and ant tracking technology, can be applied to small fault identification. Feng et al. [15] established a small fault identification technique based on the joint interpretation of 2D and 3D seismic data, which improved the accuracy of small fault identification and successfully interpreted 56 small faults in the Etoke area. Lu et al. [16] controlled a fracture skeleton and a fracture by, respectively, controlling the calculation time window and ensuring that the fused attribute body was in good agreement with the logging information. Zhuang et al. [17] extracted an ant attribute from the seismic data of the 82 mining area of the Qinan coal mine in Huaibei, which improved the accuracy of fault interpretation. However, these technologies assume data with a high signal-to-noise ratio (SNR), and they generally perform poorly on seismic data with a low SNR. Derivative- and gradient-based methods were subsequently developed as the main analysis methods. However, derivative or gradient methods tend to amplify the noise in low-SNR seismic data, so various edge-preserving filtering methods have been developed, such as structure-oriented constraint filtering and edge-preserving focusing filtering. Support vector machines (SVMs) have unique advantages in solving small-sample, nonlinear, and high-dimensional pattern recognition problems; have excellent generalization abilities; and are robust in solving classification and regression problems. Based on measured data, Tan et al. [18] selected fault dip, drop, dip angle, and fault properties as characteristic influence factors to establish an SVM prediction model for the horizontal length of small faults in the Zhaoguan mine, and compared the prediction results with those of traditional multiple regression models. The results showed that the SVM prediction model is more accurate when the sample size is smaller. Sun et al. [19] established an SVM binary classification fault recognition model by analyzing the seismic attributes of structural and non-structural parts, and interpreted some small faults that could not be recognized on conventional seismic profiles. He et al. [20] studied fracture classification methods based on an approximate support vector machine. The results of the latter study showed that the SVM algorithm is effective in fault identification, but the SNR of the original seismic data is seldom discussed in the above studies, and the application results in low-SNR regions need to be studied further.
At present, safe and efficient coal mining requires more accurate identification of small faults in coal seams. However, when 3D seismic data contain noise, fault identification errors occur easily. Based on the good denoising ability of variational mode decomposition (VMD) and the high accuracy of an SVM in binary classification problems, this paper proposes to combine VMD with an SVM and apply the combination to the identification of small faults. In this method, the seismic signal is first decomposed by VMD to remove random noise effectively and improve the SNR of the seismic data. Then, the seismic attributes sensitive to the fault response are extracted, the appropriate seismic attributes are selected, the SVM is used for learning and training, and the prediction of small faults is finally realized.

Basic Principles of VMD
Huang et al. [21] proposed a new signal processing method based on the concept of instantaneous frequency, namely empirical mode decomposition (EMD). In essence, this method stationarizes a signal: it decomposes the real fluctuations of different scales in the signal step by step and forms a series of data sequences with different characteristic scales. EMD is a powerful tool for analyzing non-stationary and nonlinear signals, but it has some problems, such as the lack of a strict mathematical basis, low algorithmic efficiency, and mode mixing (modal aliasing). To solve these problems, Dragomiretskiy et al. [22] proposed VMD, which is a powerful signal analysis tool similar to EMD but with a firm mathematical basis. The VMD method decomposes the signal into a finite sum of intrinsic mode functions (IMFs). The original signal is decomposed into multiple mode functions by applying a single model, and the reconstructed signal is obtained after the residual errors are eliminated based on the threshold criterion.
The VMD algorithm decomposes a signal into K IMFs, and the bandwidth of each mode is assessed in the following three steps:
1. For each mode, calculate the associated analytic signal through the Hilbert transform;
2. For each mode, multiply by an exponential term tuned to the respective estimated center frequency to shift the frequency spectrum of the mode to the baseband;
3. Estimate the bandwidth from the Gaussian smoothness of the demodulated signal, that is, the squared norm of its gradient.
In this way, a constrained variational problem is obtained, which is then converted into an unconstrained problem by using a quadratic penalty term and a Lagrange multiplier. The IMFs are obtained by iterating the resulting update relations until convergence.
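The specific construction steps are as follows (Equations (1)-(4)):
1. By the Hilbert transform, obtain the analytic signal of each modal function $u_k(t)$ to determine the one-sided spectrum of the signal:
$$\left[\delta(t) + \frac{j}{\pi t}\right] * u_k(t) \tag{1}$$
2. Mix the analytic signal of each component with an exponential tuned to the pre-estimated center frequency $\omega_k$, shifting the spectrum of each mode to the corresponding baseband:
$$\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \tag{2}$$
3. Calculate the squared norm of the gradient of the demodulated signal above to estimate the bandwidth of each modal signal. Introduce the constraint condition to construct the variational model that minimizes the sum of the bandwidths of all components:
$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \tag{3}$$
where $f(t)$ is the original signal, $K$ is the number of components, and $\{u_k\} = \{u_1, u_2, \ldots, u_K\}$ and $\{\omega_k\} = \{\omega_1, \omega_2, \ldots, \omega_K\}$ are, respectively, the $K$ frequency band components and the center frequencies of the corresponding frequency bands.
4. Introduce the quadratic penalty factor $\alpha$ and the Lagrange multiplier $\lambda(t)$ to change the constrained variational problem into an unconstrained problem (the two problems are equivalent; the proof is omitted here):
$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \tag{4}$$
The quadratic penalty factor $\alpha$ ensures the accuracy of signal reconstruction in the presence of Gaussian noise, and the Lagrange multiplier enforces the constraint strictly. The "saddle point" of the augmented Lagrangian is found with the alternating direction method of multipliers. For a given tolerance $\varepsilon > 0$, the iteration continues until the stop condition is satisfied:
$$\sum_{k=1}^{K} \frac{\left\| u_k^{n+1} - u_k^{n} \right\|_2^2}{\left\| u_k^{n} \right\|_2^2} < \varepsilon$$
5. At the end of the iteration, the $K$ IMF components are obtained.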
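The iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration written for this section, not the implementation used in the study: the function name `vmd`, the default parameter values, the mirror extension of the signal, the uniform initialization of the center frequencies, and the two-tone test signal are all assumptions. Constant factors in the bandwidth term are absorbed into `alpha`, and the dual step `tau` is set to zero, which disables the Lagrange multiplier update.

```python
import numpy as np

def vmd(signal, K=2, alpha=2000.0, tau=0.0, tol=1e-7, max_iter=500):
    """Split a real 1-D signal into K modes; returns (modes, center_freqs)."""
    f = np.asarray(signal, dtype=float)
    N = f.size
    half = N // 2
    # Mirror-extend both ends to reduce boundary effects (an assumed choice).
    f_ext = np.concatenate([f[:half][::-1], f, f[half:][::-1]])
    T = f_ext.size
    freqs = np.arange(T) / T - 0.5            # centered axis, cycles per sample
    f_hat = np.fft.fftshift(np.fft.fft(f_ext))
    f_hat[: T // 2] = 0.0                     # keep the one-sided (analytic) spectrum
    u_hat = np.zeros((K, T), dtype=complex)   # spectra of the K modes
    omega = 0.5 * np.arange(K) / K            # initial center frequencies
    lam = np.zeros(T, dtype=complex)          # Lagrange multiplier (spectrum)
    pos = slice(T // 2, T)                    # non-negative frequencies
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Mode update: a Wiener-like filter centered on omega[k].
            u_hat[k] = (f_hat - others + lam / 2) / (1 + alpha * (freqs - omega[k]) ** 2)
            # Center-frequency update: center of gravity of the mode's power spectrum.
            power = np.abs(u_hat[k, pos]) ** 2
            omega[k] = np.sum(freqs[pos] * power) / (np.sum(power) + 1e-30)
        lam = lam + tau * (f_hat - u_hat.sum(axis=0))   # dual ascent (off when tau = 0)
        # Relative change of all modes, compared with the tolerance.
        change = sum(
            np.sum(np.abs(u_hat[k] - u_prev[k]) ** 2)
            / (np.sum(np.abs(u_prev[k]) ** 2) + 1e-30)
            for k in range(K)
        )
        if change < tol:
            break
    modes = np.empty((K, N))
    for k in range(K):
        # Rebuild a Hermitian-symmetric spectrum so the mode is real-valued.
        full = np.zeros(T, dtype=complex)
        full[pos] = u_hat[k, pos]
        full[T // 2 : 0 : -1] = np.conj(u_hat[k, pos])
        full[0] = np.conj(full[-1])
        u = np.real(np.fft.ifft(np.fft.ifftshift(full)))
        modes[k] = u[half : half + N]         # drop the mirrored extension
    return modes, omega

# Hypothetical test signal: two tones plus white noise.
n = np.arange(512)
clean = np.cos(2 * np.pi * 0.03 * n) + 0.5 * np.cos(2 * np.pi * 0.25 * n)
noisy = clean + 0.3 * np.random.default_rng(0).standard_normal(n.size)
modes, omega = vmd(noisy, K=2)
denoised = modes.sum(axis=0)                  # reconstruction from the retained modes
```

In this sketch the denoised trace is simply the sum of the retained modes. Which modes to keep, and any threshold applied to the residual, would have to follow the criterion actually used for the data at hand.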

Basic Principles of an SVM
An SVM is a machine-learning method based on statistical learning theory, VC dimension theory, and the structural risk minimization principle. It shows many unique advantages in solving small-sample, nonlinear, and high-dimensional pattern recognition problems and, to a large extent, overcomes the problems of the "curse of dimensionality" and "over-learning" [23]. Given limited sample information, it seeks the best compromise between model complexity and learning ability in order to obtain the best generalization ability. The basic principle of SVM regression is as follows. Assume that the training data $(x_i, y_i)$, $i = 1, \ldots, n$, are fitted by the regression function $f(x) = w \cdot x + b$, where $w$ and $b$ are the normal vector and offset of the regression function, respectively. It is necessary to find a $w$ that is as small as possible, so its Euclidean norm is minimized. Assuming that all training data are fitted by the function without error at accuracy $\varepsilon$, this leads to the following optimization problem (Equations (5) and (6)):
$$\min \ \varphi(w) = \frac{1}{2}\|w\|^2 \tag{5}$$
$$\text{s.t.} \quad y_i - w \cdot x_i - b \le \varepsilon, \qquad w \cdot x_i + b - y_i \le \varepsilon, \qquad i = 1, \ldots, n \tag{6}$$
When the above constraints cannot be fully satisfied, the slack variables $\xi_i$ and $\xi_i^*$ are introduced, and the optimization problem is transformed into the following problem (Equation (7)):
$$\min \ \varphi(w, \xi, \xi^*) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad \text{s.t.} \quad y_i - w \cdot x_i - b \le \varepsilon + \xi_i, \quad w \cdot x_i + b - y_i \le \varepsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0 \tag{7}$$
where $C > 0$ is the penalty parameter that balances the flatness of the function against the fitting errors. The objective function $\varphi(w)$ is quadratic and the constraints are linear, so the optimization problem is a typical quadratic programming problem that can be solved by using the Lagrange multiplier method.
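To make the role of the slack variables concrete, the following sketch evaluates the soft-margin regression objective for a fixed candidate $w$ and $b$. It does not solve the quadratic program; it only computes the smallest slacks that satisfy the constraints and the resulting objective value. The function name, the four data points, and the values of `eps` and `C` are hypothetical and chosen only for illustration.

```python
import numpy as np

def svr_primal_objective(w, b, X, y, eps, C):
    """Return (objective, xi, xi_star) of the soft-margin regression problem
    for fixed w and b, using the smallest feasible slack variables."""
    residual = y - (X @ w + b)                     # y_i - f(x_i)
    xi = np.maximum(0.0, residual - eps)           # points above the eps-tube
    xi_star = np.maximum(0.0, -residual - eps)     # points below the eps-tube
    objective = 0.5 * np.dot(w, w) + C * np.sum(xi + xi_star)
    return objective, xi, xi_star

# Hypothetical one-dimensional data; only the third point leaves the tube.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.1, 0.9, 2.6, 2.95])
objective, xi, xi_star = svr_primal_objective(np.array([1.0]), 0.0, X, y, eps=0.2, C=10.0)
```

Points inside the tube of half-width `eps` contribute nothing to the objective, which is why only a subset of the training samples (the support vectors) ends up determining the fitted function.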
For linear classifiers, consider a simple example. Suppose a two-dimensional plane contains two different kinds of data, represented by circles and crosses. Since the data are linearly separable, the two types of data can be separated by a line that acts as a hyperplane: all the points on one side of the hyperplane correspond to y = −1 and all the points on the other side correspond to y = 1 (Figure 1). This hyperplane can be represented by the classification function $f(x) = w^{T}x + b$ (Equation (8)): when f(x) is equal to 0, x is on the hyperplane; if f(x) is greater than 0, y is equal to 1; and if f(x) is less than 0, y is equal to −1 (Figure 2). In other words, when a new data point x is encountered during classification, x is assigned the category −1 if f(x) is less than 0 and the category 1 if f(x) is greater than 0.
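The decision rule can be illustrated with a small linear classifier. The sketch below fits a soft-margin linear SVM by batch subgradient descent on the hinge loss, which is a simpler substitute for the quadratic programming solution described above and is used here only to keep the example self-contained. The point coordinates, the learning rate `lr`, the regularization weight `lam`, and the number of epochs are hypothetical.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Soft-margin linear SVM fitted by batch subgradient descent on the hinge loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1.0                       # samples inside or beyond the margin
        grad_w = lam * w - (y[viol][:, None] * X[viol]).sum(axis=0) / len(y)
        grad_b = -y[viol].sum() / len(y)
        w = w - lr * grad_w
        b = b - lr * grad_b
    return w, b

def classify(X, w, b):
    """Assign +1 where f(x) = w.x + b > 0 and -1 otherwise."""
    return np.where(X @ w + b > 0, 1, -1)

# Hypothetical, linearly separable data: "crosses" (y = +1) and "circles" (y = -1).
crosses = np.array([[2.0, 2.0], [3.0, 2.0], [2.0, 3.0], [3.0, 3.5]])
circles = -crosses
X = np.vstack([crosses, circles])
y = np.array([1.0] * 4 + [-1.0] * 4)
w, b = train_linear_svm(X, y)
pred = classify(X, w, b)
new_pred = classify(np.array([[2.5, 2.5], [-1.0, -3.0]]), w, b)
```

A new point is labeled purely by the sign of `X @ w + b`, which is the rule stated in the text; a kernel SVM would replace the inner product with a kernel function but keep the same sign-based decision.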

Small Fault Prediction Process Based on VMD and an SVM
Based on the good denoising ability of VMD and the high accuracy of an SVM in binary classification problems, this paper proposes an intelligent identification algorithm for small fault prediction using VMD and an SVM. The main steps of this method are as follows (Figure 3):
1. Based on the characteristics of small faults in coal seams, construct a fault model containing a coal seam and carry out the forward simulation.
2. Add random noise to the forward seismic records and then use VMD for denoising. Analyze the denoising effect of VMD and the response characteristics of different seismic attributes to the fault, and select the seismic attributes that respond well to the fault for fault identification.
3. Take the exposed fault data of the coal seam and its seismic attributes as the learning samples, use the SVM for learning and training, and apply the trained model to small fault prediction on the actual seismic data of the coal field.

Modeling
In order to study the response characteristics of seismic attributes to small faults, fault models with different drops are constructed. The model is 1000 m long and consists mainly of loess, mudstone, and a coal seam. The P-wave and S-wave velocities, density, and thickness of each layer are shown in Table 1.

Analysis of the Faults' Seismic Response Characteristics
In order to analyze the seismic response characteristics of faults with different drops in detail, five seismic attributes are extracted: instantaneous amplitude, instantaneous frequency, waveform characteristics, Q value, and frequency bandwidth (Figure 5). In Figure 5, seismic traces 20, 40, 60, and 80 correspond to the faults with drops of 1 m, 3 m, 5 m, and 10 m, respectively. In order to compare the response characteristics of each seismic attribute to the fault, the attribute values are normalized before analysis; the values of each seismic attribute change with the faults in a regular way.
The analysis shows that faults lie at local minima of the instantaneous amplitude. The local maxima of the waveform difference are located on faults, and the maximum value increases distinctly as the fault drop increases. Faults also lie at local extrema of the instantaneous frequency and instantaneous bandwidth, and at local minima of the attenuation coefficient. When the fault drop is small, there is a response, but its characteristics are not clear. When the drop is 3 m or less, the response of each seismic attribute strengthens as the fault drop increases. Of the five seismic attributes above, the waveform difference attribute characterizes faults best. However, the attributes do not characterize faults in a completely consistent way, and no single seismic attribute can identify faults completely and correctly. Using multiple seismic attributes helps to reduce the ambiguity of multiple solutions and to identify faults better.
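As an illustration of how two of these attributes can be computed, the sketch below derives the instantaneous amplitude and instantaneous frequency of a trace from its analytic signal and applies a min-max normalization. The text does not specify the normalization, so min-max scaling is an assumption, as are the function names, the 1 ms sampling interval, and the 40 Hz test trace.

```python
import numpy as np

def analytic_signal(trace):
    """Analytic signal of a real 1-D trace, computed with the FFT."""
    n = trace.size
    weights = np.zeros(n)
    weights[0] = 1.0                               # keep the DC term once
    if n % 2 == 0:
        weights[n // 2] = 1.0                      # keep the Nyquist term once
        weights[1 : n // 2] = 2.0                  # double the positive frequencies
    else:
        weights[1 : (n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(trace) * weights)

def instantaneous_attributes(trace, dt):
    """Return (instantaneous amplitude, instantaneous frequency in Hz)."""
    z = analytic_signal(trace)
    amplitude = np.abs(z)                          # envelope of the trace
    phase = np.unwrap(np.angle(z))
    frequency = np.gradient(phase, dt) / (2.0 * np.pi)
    return amplitude, frequency

def minmax_normalize(values):
    """Scale values to [0, 1]; a constant input maps to zeros."""
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo) if hi > lo else np.zeros_like(values)

# Hypothetical trace: a 40 Hz cosine of amplitude 2 sampled every 1 ms for 1 s.
dt = 0.001
t = np.arange(1000) * dt
trace = 2.0 * np.cos(2.0 * np.pi * 40.0 * t)
amplitude, frequency = instantaneous_attributes(trace, dt)
scaled = minmax_normalize(np.array([1.0, 2.0, 3.0]))
```

On field data the attributes would be evaluated along the coal seam horizon for every trace, and local minima or extrema of the normalized curves would then be compared with the known fault positions.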

Analysis of the VMD Denoising Effect
Random noise of 30 dB was added to the seismic profile of the noise-free fault model, and the seismic profile and seismic attributes after adding noise are shown in Figure 6. As the noise increases, the response of each seismic attribute to the fault is strongly affected and no longer indicates the location of the fault effectively, as shown in Figure 7. VMD denoising was then carried out on the noisy seismic profile, and the denoised seismic profile and its seismic attributes are shown in Figures 8 and 9. Comparing Figure 7 with Figure 9 shows that VMD has a good denoising effect, and the relationship between each seismic attribute and the fault is essentially consistent with the noise-free result. Therefore, VMD denoising can effectively remove random noise and improve the SNR of seismic data.
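A noise test of this kind can be set up as sketched below. The sketch assumes that "30 dB" denotes the signal-to-noise ratio of the noisy record, which the text does not state explicitly; the function names and the synthetic trace are likewise hypothetical. White Gaussian noise is rescaled so that the requested SNR is met exactly, and the same SNR measure can later be applied to a denoised trace.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise scaled so that the result has the requested SNR (dB)."""
    noise = rng.standard_normal(signal.shape)
    p_signal = np.mean(signal ** 2)
    p_noise_target = p_signal / 10.0 ** (snr_db / 10.0)
    noise = noise * np.sqrt(p_noise_target / np.mean(noise ** 2))
    return signal + noise

def snr_db(clean, estimate):
    """SNR in dB of an estimate relative to the clean reference."""
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum((estimate - clean) ** 2))

# Hypothetical clean trace: a 30 Hz cosine sampled every 1 ms.
t = np.arange(1000) * 0.001
clean = np.cos(2.0 * np.pi * 30.0 * t)
noisy = add_noise_at_snr(clean, 30.0, np.random.default_rng(1))
measured = snr_db(clean, noisy)
```

Comparing `snr_db(clean, noisy)` with `snr_db(clean, denoised)` gives a single number for the denoising effect that complements the visual comparison of the attribute curves.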

Geological Survey of the Working Area
The study area is located in the east of Jiaxiang County, Shandong Province. In general, it presents a monoclinal structure that is high in the west and low in the east; the strata strike north to northwest and dip to the east and northeast. The stratigraphic strike in the northern part of the original mine field turns to nearly east-west, and second-order folds are developed. East of the Jiaxiang branch fault, the strata in the deepening zone present a wide and gentle fold structure with well-developed second-order folds, which are mainly NW-trending synclines. The folds are incomplete due to multiple transformations and fault cutting. Affected and controlled by regional faults, the sub-structures in the area are dominated by nearly north-south and north-northeast-trending faults, most of which are north-northeast-trending, but there are also a few east-west faults locally.
The main coal seam in the study area is the No. 3 coal seam. Its thickness is 4.15–10.15 m, with an average of 8.34 m, and the distance from its floor to the limestone is 16.25–49.1 m, with an average of 28.44 m. A comprehensive evaluation of the whole area determined that it is a thick coal seam that can be mined stably. Intelligent identification of small faults in this coal seam is beneficial to safe and efficient mining in the future.

Introduction of Learning Samples for Small Fault Identification
Affected by regional NE-trending faults, the coal measures of the study samples were cut into NE-trending strip graben-and-horst structures. Two groups of faults, NNE-trending and nearly east-west-trending, are mainly developed in the area, together with NW-trending regulating faults. The intersection and cutting of these groups of faults form a net-like fault plane pattern, which is dominated by high-angle interlayer normal faults, with small interlayer faults relatively well developed. A total of 154 faults with fault spacing greater than 5 m were interpreted, of which 26 were reverse faults and the rest were normal faults.
The above analysis shows that seismic attributes respond sensitively to faults and can serve as effective samples for fault identification. Accordingly, this paper uses "1" and "0" as the labels for faults and non-faults, respectively. The instantaneous amplitude, instantaneous bandwidth, instantaneous frequency, attenuation coefficient, and waveform difference are used as the input data of the support vector machine, and the output is the fault or non-fault label. Table 2 and Table 3 show partial results of fault samples and non-fault samples, respectively. Tables 4 and 5 show the average value and the median value, respectively, of the seismic attributes of fault samples and non-fault samples. From Tables 2 and 3, we can see that, at fault locations, the instantaneous amplitude, instantaneous bandwidth, instantaneous frequency, and waveform difference values are small, while the attenuation coefficient value is large. At non-fault locations, the attributes show the opposite behavior. In Table 4, there are significant differences in the average values of instantaneous amplitude and instantaneous bandwidth, and, in Table 5, there are significant differences in the median values of instantaneous amplitude and instantaneous frequency. The average and median values of the seismic attributes of fault samples and non-fault samples show that seismic attributes can be used to distinguish faults. In summary, attribute values differ between fault and non-fault locations. An SVM is trained on the processed fault and non-fault samples, and the trained model is applied to the recognition of faults in the No. 3 coal seam of the work area.
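The comparison of average and median attribute values between the two classes can be sketched as follows. The sample values below are invented purely to show the layout of the computation and are not taken from Tables 2 to 5; the function name and the attribute ordering are also assumptions. The last line shows the usual mapping from the 1/0 labels used here to the ±1 labels of the SVM formulation.

```python
import numpy as np

ATTRIBUTES = [
    "instantaneous amplitude",
    "instantaneous bandwidth",
    "instantaneous frequency",
    "attenuation coefficient",
    "waveform difference",
]

def class_statistics(X, labels):
    """Per-attribute mean and median for fault (1) and non-fault (0) samples."""
    stats = {}
    for name, value in (("fault", 1), ("non-fault", 0)):
        rows = X[labels == value]
        stats[name] = {"mean": rows.mean(axis=0), "median": np.median(rows, axis=0)}
    return stats

# Hypothetical normalized samples (rows) for the five attributes (columns).
X = np.array([
    [0.2, 0.3, 0.4, 0.8, 0.3],
    [0.1, 0.2, 0.3, 0.9, 0.2],
    [0.3, 0.4, 0.2, 0.7, 0.1],
    [0.7, 0.8, 0.6, 0.2, 0.6],
    [0.9, 0.6, 0.7, 0.1, 0.8],
    [0.8, 0.7, 0.8, 0.3, 0.7],
])
labels = np.array([1, 1, 1, 0, 0, 0])
stats = class_statistics(X, labels)
y_svm = 2 * labels - 1          # map labels {0, 1} to {-1, +1} for the SVM
```

Attributes whose class means or medians differ strongly are the ones most likely to help the classifier, which is the reasoning behind the table comparison in the text.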

Intelligent Recognition of Small Faults
Seismic attributes, such as instantaneous amplitude, waveform difference, instantaneous frequency, frequency bandwidth, and attenuation coefficient, are sensitive to faults and can be used to identify them. Therefore, after VMD processing of the 3D seismic data in this area, seismic attributes such as amplitude, instantaneous frequency, and frequency bandwidth are extracted along coal seam 3. The results are shown in Figures 10-14.
1. Instantaneous amplitude. Amplitude attributes are the most widely used and most effective attributes for reflecting subsurface properties. The instantaneous amplitude attribute reflects the intensity of the seismic reflection. It mainly characterizes differences in wave impedance in the formation, and faults are present at its local minima, so it is well suited to fault identification. Figure 10 shows the instantaneous amplitude attribute of the study area. It shows evident anomalies, with the cool color bands in the attribute map generally corresponding to fault areas.
2. Waveform difference. The waveform difference attribute is one of the best seismic attributes for fault characterization, because the seismic wave scatters when passing through an anomalous geological body, resulting in obvious differences. Therefore, it yields clearer imaging results than other traditional attributes. Faults are present at the local maxima of the waveform difference, and the maximum value increases significantly as the fault drop increases. In Figure 11, the warm color bands with maximum values are clearly distributed in network-like patches, and the result is relatively ideal.
3. Instantaneous frequency. Instantaneous frequency is the seismic attribute obtained by sampling the midpoint, one by one, according to the frequency of the trace set; faults are revealed at its local extrema. In Figure 12, the boundary marked by an abrupt change in tone is the theoretical position of fault development, which helps to identify small faults and to reduce the ambiguity of multiple solutions.
4. Instantaneous frequency bandwidth. The bandwidth attribute is the width between the high-cut and low-cut frequencies in seismic data. It mainly reflects the characteristics of the seismic waveform and can be used to analyze the heterogeneity of the formation; faults are present at its local extrema. Therefore, this attribute is useful for identifying small faults in the study area. In Figure 13, the boundary marked by an abrupt change in tone is the theoretical location of fault development, which is roughly consistent with the interpretation results map.
5. Attenuation coefficient. The attenuation coefficient is an important parameter for describing geological anomalies. In a non-uniform subsurface whose geological bodies have different attenuation coefficients, the seismic reflection wave shows different response characteristics as its energy attenuates. Faults are present at the local minima of the attenuation coefficient. When the fault drop is small, there is a response, but its characteristics are not clear. In Figure 14, the minima of the attenuation coefficient essentially correspond to the fault locations in the interpretation map, but small structures are not clearly characterized.
To illustrate the effect of the fault prediction method of this paper, the traditional principal component analysis method was applied to the instantaneous amplitude, waveform difference, instantaneous frequency, frequency bandwidth, and attenuation coefficient of coal seam 3; the proportion of the first principal component reached 91.82%, as shown in Figure 15. Eight faults (F1-F8) were exposed during coal mining. Figure 15 contains most of the information of the various seismic attributes. Judged from this single attribute, the yellow areas may be fault zones, but only five of the exposed faults conform to this rule; F1, F3, and F6 do not show evident fault development characteristics. The fault prediction accuracy of the principal component analysis method is therefore 62.5%.
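The proportion attributed to the first principal component can be computed as sketched below. The attribute matrix here is synthetic, generated from a single hidden factor plus weak noise so that the first component dominates; it does not reproduce the 91.82% obtained on the field data. The function name, the loadings, and the noise level are assumptions.

```python
import numpy as np

def pca_explained_variance(X):
    """Return (explained-variance ratios, first-component scores) of standardized data."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize each attribute
    _, s, vt = np.linalg.svd(Xs, full_matrices=False)
    ratio = s ** 2 / np.sum(s ** 2)                # share of variance per component
    scores = Xs @ vt[0]                            # projection onto the first component
    return ratio, scores

# Synthetic stand-in for five attributes sampled at 200 locations.
rng = np.random.default_rng(2)
latent = rng.standard_normal(200)
loadings = np.array([1.0, -1.0, 0.8, -0.9, 1.1])
X = latent[:, None] * loadings + 0.1 * rng.standard_normal((200, 5))
ratio, scores = pca_explained_variance(X)
```

The first-component scores play the role of the single composite attribute discussed above; when several attributes carry independent information about faults, collapsing them to one component can discard what a classifier could otherwise use.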
The seismic attributes of coal seam 3, namely instantaneous amplitude, waveform difference, instantaneous frequency, frequency bandwidth, and attenuation coefficient, were taken as input data, and the SVM model was used for fault prediction. The fault prediction results of coal seam 3 are shown in Figure 16. In the figure, a red label with a value of 1 represents a fault, and a white label with a value of 0 indicates non-fault. The figure shows that faults are relatively well developed in coal seam 3 of this working area. With the method proposed in this paper, 7 of the faults revealed by the actual data are predicted successfully, giving an accuracy of 87.5%, which is significantly higher than the prediction accuracy of the principal component analysis method. Only F1 is not predicted correctly, which may be due to the small size of the fault and the development of surrounding faults. The comparison of Figures 15 and 16 shows that the SVM-based method for predicting small faults proposed in this paper works well in application and has a high prediction accuracy.
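The two accuracy figures follow directly from the eight exposed faults, as the short check below shows. The fault names and the missed faults are taken from the text; the function name is hypothetical.

```python
EXPOSED_FAULTS = ["F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8"]

def prediction_accuracy(exposed, missed):
    """Percentage of exposed faults that a method predicted correctly."""
    hits = sum(1 for fault in exposed if fault not in missed)
    return 100.0 * hits / len(exposed)

# Principal component analysis missed F1, F3, and F6; the SVM missed only F1.
pca_accuracy = prediction_accuracy(EXPOSED_FAULTS, {"F1", "F3", "F6"})
svm_accuracy = prediction_accuracy(EXPOSED_FAULTS, {"F1"})
```

With only eight verified faults, each additional hit or miss changes the accuracy by 12.5 percentage points, so the comparison is best read together with the maps in Figures 15 and 16.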

Conclusions
Seismic attributes, such as instantaneous amplitude, instantaneous frequency, and waveform characteristics, respond to faults to some degree, and the waveform difference attribute characterizes faults well. The average and median values of the seismic attributes of fault samples and non-fault samples show that seismic attributes can be used to distinguish faults. However, the various seismic attributes do not describe faults in a completely consistent way, and no single seismic attribute can identify faults correctly. Using multiple seismic attributes helps to reduce the ambiguity of multiple solutions and to identify faults better.
When seismic data contain noise, the seismic response characteristics of the faults are affected to some extent. VMD can effectively attenuate random noise, and seismic attributes extracted after VMD denoising are more conducive to fault prediction.
The SVM-based fault prediction method proposed in this paper is more accurate than the principal component analysis method and can be used as a new fault prediction method.
Based on the actual geological data of a mining area, this paper carries out VMD denoising and builds an SVM model for sample training and fault identification. The prediction results provide important guidance and reference value for subsequent coal mining.