Abstract

English speech signals carry a large amount of redundant information and environmental interference, which introduces considerable distortion when English speech translation signals are recognized. Many studies therefore focus on encoding and processing English speech in order to achieve high-precision recognition. The traditional wavelet denoising algorithm is effective for recognizing English speech translation signals, mainly because of the excellent local time-frequency characteristics of the wavelet transform; however, its threshold is difficult to select, and the recognition accuracy is easily affected as a result. This paper therefore improves the traditional wavelet denoising algorithm: it abandons the single-threshold decision of the traditional algorithm and combines the soft threshold with the hard threshold. This reduces the distortion that denoising introduces during English speech translation signal recognition, improves the signal-to-noise ratio, and lowers the root mean square error of the signal, so that a good noise reduction effect is achieved and recognition accuracy is improved. In the experiments, the proposed algorithm is compared with the traditional algorithm in MATLAB simulations. The simulation results are consistent with the theoretical analysis, and the proposed algorithm shows clear advantages in the recognition accuracy of English speech translation signals, which reflects its superiority and practical value.

1. Introduction

Modern information technology, multimedia technology, and artificial intelligence are combined with speech recognition technology to extract, process, and analyze all kinds of speech. Conventional speech processing and speech coding techniques mostly pursue the following goals: minimum bit rate, minimum algorithm complexity, minimum delay, and maximum speech quality [1–3]. Traditional speech processing relies mainly on the Fourier transform to analyze, process, and reconstruct the speech signal. To some extent, it can reconstruct the time-domain and frequency-domain characteristics of the speech signal and convert the signal between the two domains; however, it performs very poorly on time-varying nonstationary signals because it cannot accurately track the time-varying structure of the signal. This defect is precisely why the Fourier transform cannot be widely used for such signals [4–6]. The conventional underlying algorithms of English speech recognition fall into three groups: recognition based on phonetics, recognition based on speech template matching, and recognition based on neural networks. Of these three, the neural network algorithm is the most advanced [7–9]; it simulates the human neural system and offers adaptability, parallelism, robustness, and learning ability. However, long-term testing of this algorithm shows that its recognition time is too long, which causes excessive delay in speech recognition. In addition, training is cumbersome and consumes a very large amount of resources.

The wavelet transform denoising algorithm is essentially a multiscale signal analysis algorithm. It maps the speech signal to be processed into the wavelet domain and then processes the noisy wavelet coefficients at each scale according to how the wavelet coefficients of speech differ from those of noise [10]. In practice, the algorithm works in the function space spanned by dilations and translations of the wavelet generating function and, according to a threshold criterion, finds the best approximation of the original signal, so as to separate the original signal from the noise [11–14]. However, traditional wavelet denoising depends heavily on the choice of threshold. When the threshold is too small, part of the noise is retained after denoising, which defeats the purpose of denoising. When the threshold is too large, denoising also filters out useful information in the speech, so that the recovered speech signal is incomplete. In addition, the soft threshold and hard threshold functions used in traditional wavelet denoising each degrade the denoising effect of the overall speech recognition process and cause signal distortion. The choice of wavelet function in a traditional wavelet denoising system further affects the denoising effect of the wavelet transform [15–17]. For these reasons, selecting the wavelet threshold reasonably and optimizing the algorithm are of great significance for high-precision, low-delay speech recognition.
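The two classical threshold functions mentioned above can be sketched in a few lines. This is a minimal illustration in Python rather than the MATLAB code used in the paper; the function names and the example threshold `lam` are assumptions introduced here.

```python
def hard_threshold(w, lam):
    """Hard threshold: keep a coefficient unchanged if |w| >= lam, else zero it."""
    return w if abs(w) >= lam else 0.0


def soft_threshold(w, lam):
    """Soft threshold: zero small coefficients and shrink the rest toward zero by lam."""
    if abs(w) <= lam:
        return 0.0
    sign = 1.0 if w > 0 else -1.0
    return sign * (abs(w) - lam)


# Illustrative coefficients and threshold (assumed values, not from the paper).
coeffs = [0.2, -0.8, 1.5, -3.0]
lam = 1.0
hard = [hard_threshold(w, lam) for w in coeffs]
soft = [soft_threshold(w, lam) for w in coeffs]
```

The hard function leaves large coefficients untouched but is discontinuous at the threshold, whereas the soft function is continuous but biases every retained coefficient by `lam`; this trade-off is the usual motivation for a function that lies between the two.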

Because the traditional wavelet algorithm has difficulty selecting the threshold and its recognition accuracy is easily affected, this paper improves the wavelet denoising algorithm and combines the soft threshold with the hard threshold, taking into account the decomposition ability of the algorithm and the denoising threshold. The improved algorithm reduces the distortion that denoising introduces during English speech translation signal recognition, improves the signal-to-noise ratio, and lowers the root mean square error of the signal, so that a good noise reduction effect is achieved and recognition accuracy is improved. In the experiments, the proposed algorithm is compared with the traditional algorithm in MATLAB simulations. The simulation results are consistent with the theoretical analysis, and the proposed algorithm shows clear advantages in the recognition accuracy of English speech translation signals, which reflects its superiority and practical value.

The rest of this paper is organized as follows. Section 2 reviews the current research status of wavelet transform denoising algorithms. Section 3 presents the optimized threshold selection algorithm and the improved wavelet denoising algorithm built on it. Section 4 describes the experiments on the proposed algorithm and analyzes the results. Section 5 summarizes the paper.

2. Related Work: Research Status of English Speech Translation Recognition Algorithm Based on Wavelet Transform

As wavelet transform speech denoising has become widely used in speech recognition, many research institutions and speech recognition experts have studied it. The research concentrates on three aspects: modulus maxima, multiresolution analysis, and data compression [18]. Research institutions in Europe and America first proposed filtering the noise out of a speech signal by wavelet decomposition and reconstruction based on multiresolution theory, which was a pioneering method [19]. Later work improved this technique by using the maxima of the wavelet transform to denoise the signal; the corresponding theory is known as singularity detection theory [20]. Building on singularity detection theory, researchers proposed a nonlinear wavelet transform threshold method that denoises well and overcomes the defects of earlier algorithms in processing nonlinear speech signals [21]. This threshold method was then optimized into a wavelet denoising algorithm with stationary invariants [22], which gradually developed into a wavelet denoising algorithm with translation invariants. This algorithm is essentially an orthogonal wavelet transform, which can also compress speech data. After the transform, the speech signal is enhanced and its wavelet coefficients become comparatively large, whereas the noise is distributed evenly over the transformed coefficients and concentrates in the small-scale coefficient domain; this makes it convenient and fast to filter out the noise in subsequent processing. However, this algorithm still faces the problem of threshold selection. When the threshold is too small, part of the noise is retained after denoising, which defeats the purpose of denoising. When the threshold is too large, denoising also filters out useful information in the speech, so that the recovered speech signal is incomplete [23, 24]. How to choose the wavelet threshold rationally is therefore a critical and meaningful problem.

3. Wavelet Denoising English Speech Translation Signal Recognition Algorithm Based on Improved Threshold Algorithm

This section analyzes the principle of the improved wavelet denoising English speech recognition algorithm. Its block diagram is shown in Figure 1. As the figure shows, the recognition process consists of the following stages: the English speech translation signal is input, the speech signal is preprocessed and analyzed, and the signal is denoised in the wavelet domain. The threshold processing algorithm is inserted into the wavelet processing flow, the distortion is tested, and a decision is made based on the measured estimate; the processed result is then obtained and finally recognized. The core modules of this paper are therefore the improved threshold algorithm and the wavelet denoising technique for English speech translation signal recognition that is built on it.

3.1. Analysis of Improved Threshold Algorithm

The threshold improvement in this paper starts from the soft threshold algorithm. In the soft threshold algorithm, the splitting step of the wavelet decomposition divides the English speech dataset En into two smaller subsets. The expression for these subsets is given in formula (1); it shows that the wavelet basis chosen for the data being processed largely determines how the dataset is divided.

In the prediction step, the redundancy that remains after the splitting step is removed, which yields a more compact representation of the data. The expression is given in formula (2), which involves the wavelet coefficient and the prediction operator of the algorithm.

In the update step, the high-frequency information of the English speech translation signal is adjusted through the update operator to obtain low-frequency information that is easier to interpret. The expression is given in formula (3), which involves the scale coefficient of the whole system and the update operator.
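The split, predict, and update steps described above follow the general pattern of a lifting-style wavelet step. Since formulas (1) to (3) are not reproduced here, the sketch below assumes the simplest (Haar-type) prediction and update operators; these operators and all names are illustrative assumptions, not the operators used in the paper.

```python
def lifting_forward(x):
    """One lifting step on an even-length sequence: split, predict, update."""
    if len(x) % 2 != 0:
        raise ValueError("sequence length must be even")
    # Split: divide the data into even-indexed and odd-indexed subsets.
    even = x[0::2]
    odd = x[1::2]
    # Predict (assumed operator): predict each odd sample from its even
    # neighbour; the prediction error is the wavelet (detail) coefficient.
    detail = [o - e for o, e in zip(odd, even)]
    # Update (assumed operator): adjust the even samples with the details so
    # that the scale (approximation) coefficients keep the pairwise mean.
    approx = [e + d / 2 for e, d in zip(even, detail)]
    return approx, detail


def lifting_inverse(approx, detail):
    """Undo the update, undo the prediction, then merge the two subsets."""
    even = [s - d / 2 for s, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    x = []
    for e, o in zip(even, odd):
        x.extend([e, o])
    return x


approx, detail = lifting_forward([2.0, 4.0, 6.0, 8.0])
restored = lifting_inverse(approx, detail)
```

Because each step is undone exactly by reversing its sign, the decomposition is invertible regardless of which prediction and update operators are chosen.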

The characteristics of the English speech translation signal are shown in Figure 2. The soft threshold algorithm is improved according to these characteristics. The improved threshold function is given in formula (4). The improved threshold value it defines lies between the soft threshold and the hard threshold, and as the threshold value increases it moves closer to the reasonable threshold value. The parameter in formula (4) is subject to certain numerical restrictions.

From formula (4), the corresponding square threshold function is obtained, as shown in formula (5); its parameter is also subject to numerical restrictions.

From formula (5), the improved threshold function of this paper is obtained. The final threshold value is bounded between the soft threshold value and the hard threshold value. The parameter of the function is still numerically restricted, and the value a must lie between 0 and 1, excluding 0.
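As an illustration of a threshold function bounded between the soft and hard results, the sketch below uses a widely known soft-hard compromise form. It is not the paper's formula, which is not reproduced here; the function name, the way the parameter `a` enters, and the inclusion of a = 1 in the allowed range are assumptions.

```python
def compromise_threshold(w, lam, a):
    """Soft-hard compromise threshold (illustrative form, not the paper's formula).

    a close to 0 approaches the hard threshold; a = 1 gives the soft threshold.
    """
    if not 0.0 < a <= 1.0:
        raise ValueError("a must satisfy 0 < a <= 1")
    if abs(w) < lam:
        return 0.0
    sign = 1.0 if w > 0 else -1.0
    # Shrink by a fraction a of the threshold instead of the full threshold.
    return sign * (abs(w) - a * lam)


# A coefficient of 3.0 with threshold 1.0: hard keeps 3.0, soft gives 2.0,
# and the compromise with a = 0.5 lands between the two.
value = compromise_threshold(3.0, 1.0, 0.5)
```

For any retained coefficient, the output magnitude lies between the soft result `|w| - lam` and the hard result `|w|`, which matches the boundedness property described above.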

The steps of the improved wavelet threshold calculation are shown in Figure 3. As the figure shows, the algorithm is divided into three main steps. The most important one is the first: determining the number of wavelet decomposition layers. To do so, the noisy English speech is decomposed with the wavelet and the number of decomposition layers is calculated from the result. The coefficients are then obtained by layer-by-layer wavelet decomposition, and the threshold processing is iterated over them until the final processing result is obtained.
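The layer-by-layer procedure can be sketched as a decompose, threshold, and reconstruct loop. The sketch below assumes an orthonormal Haar wavelet and takes the number of layers and the threshold as given inputs; how the paper actually determines them is not specified here, and all names are hypothetical.

```python
import math


def haar_step(x):
    """One level of orthonormal Haar decomposition of an even-length list."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x


def hard_threshold(w, lam):
    return w if abs(w) >= lam else 0.0


def denoise(signal, levels, lam, shrink):
    """Decompose for `levels` layers, threshold every detail layer, reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append([shrink(d, lam) for d in detail])
    # Reconstruct from the coarsest layer back to the finest.
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx


noisy = [1.0, 1.2, 0.9, 1.1]
smoothed = denoise(noisy, levels=2, lam=100.0, shrink=hard_threshold)
unchanged = denoise(noisy, levels=2, lam=0.0, shrink=hard_threshold)
```

Passing the threshold function as the `shrink` argument keeps the decomposition loop independent of whether a hard, soft, or intermediate threshold is applied.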

3.2. Analysis of Improved Wavelet Transform Denoising Algorithm

The wavelet transform denoising algorithm is redesigned around the improved threshold algorithm. It consists of the following stages: the English speech translation signal is input, the speech signal is preprocessed and analyzed, and the signal is denoised in the wavelet domain, with the threshold processing algorithm inserted into the wavelet processing flow. The distortion is then tested and a decision is made based on the measured estimate, the processed result is obtained, and the final recognition is carried out. The flow chart of the improved algorithm is shown in Figure 4. As the figure shows, the new algorithm adds a speech scoring stage, which supports a more comprehensive evaluation of the recognition results.

In the English speech preprocessing stage, the short-time energy of the monitored data is computed first. The definition of the short-time energy is given in formula (7).

Formula (7) is expressed in terms of the speech signal after windowing. The expansion of this windowed signal is given in formula (8), in which the function appearing there is the unit impulse response of the low-pass filter applied to the data.

The short-time average amplitude is then computed alongside the short-time energy of the data. Its expression is given in formula (9); using it makes the signal preprocessing stage of the algorithm more generally applicable.
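The two frame-level measures can be sketched with their standard textbook definitions: the sum of squared samples and the sum of absolute samples in each frame. The rectangular window, the frame length, and the hop size below are assumptions for illustration, since formulas (7) to (9) are not reproduced here.

```python
def split_frames(x, frame_len, hop):
    """Cut the signal into frames with a rectangular window (an assumption)."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def short_time_energy(x, frame_len, hop):
    """Sum of squared samples in each frame."""
    return [sum(s * s for s in frame) for frame in split_frames(x, frame_len, hop)]


def short_time_magnitude(x, frame_len, hop):
    """Sum of absolute sample values in each frame."""
    return [sum(abs(s) for s in frame) for frame in split_frames(x, frame_len, hop)]


samples = [1.0, -2.0, 3.0, -4.0]
energy = short_time_energy(samples, frame_len=2, hop=2)
magnitude = short_time_magnitude(samples, frame_len=2, hop=2)
```

Because the magnitude measure does not square the samples, it is less dominated by a few large values than the energy measure.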

For discrete speech data, the endpoints of the data must be detected, which requires the zero-crossing rate of the signal. The zero-crossing rate is given in formula (10), in which sgn denotes the sign function, expanded in formula (11). Formula (10) also contains a window function, whose expansion is given in formula (12). A threshold value is preset in this window function.
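A zero-crossing count based on the sign function can be sketched as follows. The convention sgn(0) = +1 and the unweighted (rectangular) window are assumptions, since formulas (10) to (12) are not reproduced here.

```python
def sgn(v):
    """Sign function with the assumed convention sgn(0) = +1."""
    return 1 if v >= 0 else -1


def zero_crossings(frame):
    """Number of sign changes between consecutive samples in one frame."""
    # Each sign change contributes |sgn(x[m]) - sgn(x[m-1])| = 2, hence the halving.
    total = sum(abs(sgn(frame[m]) - sgn(frame[m - 1])) for m in range(1, len(frame)))
    return total // 2


voiced_like = zero_crossings([0.5, 0.7, 0.6, 0.4])
noise_like = zero_crossings([1.0, -1.0, 1.0, -1.0])
```

A high count per frame is typical of noise-like or unvoiced segments and a low count of voiced segments, which is why the measure is useful for endpoint detection.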

The wavelet denoising part consists of the threshold improvement algorithm and the signal analysis. The threshold improvement algorithm has been described in Section 3.1. After the wavelet processing is completed, the output signal is obtained and then scored. The flow chart of the scoring system is shown in Figure 5. The scoring rules are based mainly on the HMM model. The scoring function includes the likelihood score of the signal, the sentence score, and the posterior probability score of the data; the corresponding formulas are formula (13), formula (14), and formula (15), respectively.
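To make the idea of an HMM likelihood score concrete, the sketch below computes the log-likelihood of an observation sequence with the standard scaled forward algorithm for a discrete HMM. It is a generic illustration, not the paper's scoring formula; the model layout (lists of start, transition, and emission probabilities) and all names are assumptions.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM via the scaled forward algorithm.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability that state i emits symbol k
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        # Rescale at every step to avoid numerical underflow on long sequences.
        scale = sum(alpha)
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_like


# Toy two-state model with assumed parameters.
start = [0.5, 0.5]
trans = [[0.5, 0.5], [0.5, 0.5]]
emit = [[1.0, 0.0], [0.0, 1.0]]
score = forward_log_likelihood([0, 1], start, trans, emit)
```

A higher log-likelihood means the observed feature sequence fits the reference model better, which is the sense in which a likelihood can act as a score.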

4. Experiment and Analysis

To verify the proposed algorithm, it is tested in MATLAB. The experiments examine two aspects: the performance of the improved threshold algorithm and the performance of the wavelet transform denoising algorithm built on it.

The improved threshold algorithm is compared with the traditional hard threshold and soft threshold algorithms. The experimental environment is kept the same throughout, and the original signal to be processed is shown in Figure 6.

For this original signal, the adjustment coefficient is divided into segments. The main segment values are 0, 1, and 5, which correspond to the hard threshold, the improved threshold, and the soft threshold, respectively. Noise is added to the original signal and the result is analyzed, with the adjustment factor taken as 0.5, 2, 5, and 10 in turn. The resulting waveforms are shown in Figures 7(a)–7(c). As the figures show, the noise reduction ability of the improved threshold algorithm is significantly better than that of the traditional threshold algorithms: after processing, the waveform is smoother, and the phase distortion and signal loss are smaller.
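The paper names the signal-to-noise ratio and the root mean square error as the quantities the improved algorithm targets. A minimal sketch of how these two measures are commonly computed from a clean reference and a denoised output is given below; the function names and the sample values are illustrative assumptions, not data from the experiments.

```python
import math


def rmse(clean, denoised):
    """Root mean square error between the reference and the denoised signal."""
    squared = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    return math.sqrt(squared / len(clean))


def snr_db(clean, denoised):
    """Output signal-to-noise ratio in decibels, treating the residual as noise."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    return 10.0 * math.log10(signal_power / noise_power)


# Made-up reference and output, used only to exercise the two functions.
clean = [1.0, -1.0, 1.0, -1.0]
denoised = [0.9, -1.1, 1.1, -0.9]
error = rmse(clean, denoised)
ratio = snr_db(clean, denoised)
```

A better denoising result shows up as a higher value from `snr_db` and a lower value from `rmse` for the same reference signal.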

To verify how well the improved wavelet transform denoising algorithm processes signals under complex environmental noise, a comparative speech denoising experiment is carried out. The experimental subjects are kept the same, and shot noise and ambient noise are added to the speech before processing and analysis. The results are shown in Figures 8(a) and 8(b). As the figures show, the proposed algorithm handles both ambient noise and shot noise well, and it processes the signal quickly.

The above analysis shows that the proposed algorithm has clear advantages over the traditional algorithm, particularly in the accuracy of English speech recognition.

5. Summary

This paper reviews the current research status of English speech recognition algorithms, studies the wavelet transform algorithm in depth, discusses its disadvantages, and proposes a solution. To address the difficulty of threshold selection in the traditional wavelet transform denoising algorithm, the paper improves the wavelet denoising algorithm and combines the soft threshold with the hard threshold, taking into account the decomposition ability of the algorithm and the denoising threshold. The improved algorithm reduces the distortion that denoising introduces during English speech translation signal recognition, improves the signal-to-noise ratio, and lowers the root mean square error of the signal, so that a good noise reduction effect is achieved and recognition accuracy is improved. In the experiments, the proposed algorithm is compared with the traditional algorithm in MATLAB simulations. The simulation results are consistent with the theoretical analysis, and the proposed algorithm shows clear advantages in the recognition accuracy of English speech translation signals, which reflects its superiority and practical value. Follow-up research will focus on English speech recognition and detection in complex environments and on a comparative analysis of the wavelet transform algorithm and the partial differential equation algorithm. More experimental samples will also be introduced in subsequent experiments.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.