Research on Voice Print Recognition Algorithm based on Wavelet-GA-BP Neural Network

To address the flaws of common pattern recognition algorithms such as LPCC and MFCC in voice print recognition, this paper proposes a new algorithm that combines wavelet analysis, a BP neural network and a niche genetic algorithm. First, the algorithm extracts the time-domain and frequency-domain characteristics of the voice signal by wavelet transform. Second, it trains the neural network with the niche genetic algorithm, which solves the local-minimum problem of the conventional multi-layer neural network. Finally, it takes the wavelet transform features as the training data of the optimized neural network. A simulation experiment was carried out. The results indicate that the proposed algorithm is superior to current recognition algorithms: it offers fast recognition, a high recognition rate, a low error rate and strong robustness across different speakers, and it can automatically correct errors.


Introduction
As network technology develops rapidly, information security is becoming more and more important. Traditional personal identification based on password authentication is no longer sufficient, while biometric technology is increasingly mature and is showing its superiority. Among biometric technologies, voice print recognition has been developed in recent years; compared with other recognition technologies it is simple, precise, economical and contactless [1]. This paper puts forward a new recognition algorithm based on wavelet analysis and the BP-GA optimization algorithm. Compared with traditional algorithms such as LPCC and MFCC, it has the advantages of fast recognition, a high recognition rate, a low error rate, automatic error correction and strong robustness across different speakers.

Wavelet Transform of Voice Signal
Because a voice print algorithm operates on digital quantities, the voice signal must be pre-processed. The LPCC and MFCC algorithms employ FFT-based frequency-domain analysis, which can extract only frequency-domain features and not time-domain features. Even when a time window is added, the time-domain resolution is still inadequate and the frequency-domain features are easily lost. Wavelet transform is a time-frequency analysis method with adjustable resolution and strong anti-interference capability, and it can reflect the non-stationary and transient features of a signal [2][3]. In addition, it has a low computational cost and is easy to implement. It reflects both the characteristics of the human ear and the dynamic features of the voice signal, and so improves the final voice print recognition rate.

For a signal $f(t)$, its wavelet transform is defined as the inner product of the signal and the wavelet basis function, that is:

$$W_f(a,b) = \langle f, \psi_{a,b} \rangle = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} f(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt \tag{1}$$

where $a$ is the scale parameter, $b$ is the translation parameter and $\psi_{a,b}(t) = |a|^{-1/2}\,\psi\!\left(\frac{t-b}{a}\right)$. Similar to the short-time Fourier transform, the original signal can be reconstructed from its wavelet transform:

$$f(t) = \frac{1}{C_\psi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{1}{a^{2}}\, W_f(a,b)\, \psi_{a,b}(t)\, da\, db \tag{2}$$

where $C_\psi$ is the admissibility constant of the wavelet. Suppose the signal with noise is:

$$s(i) = f(i) + e(i) \tag{3}$$

In Eq. (3), $f(i)$ is the real voice signal, $e(i)$ is Gaussian white noise or other noise, and $s(i)$ is the observed signal, which contains both the noise and the useful low-frequency signal to be extracted. In engineering practice the voice signal is mainly a stable signal, while noise is mainly a high-frequency signal. A wavelet basis must therefore be selected and the number of decomposition layers determined. Here a 3-layer decomposition of the signal is performed, as shown in Fig.1.
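The 3-layer decomposition described above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper does not name its wavelet basis, so the Haar wavelet is used here as a stand-in, and the function names `haar_step`, `haar_decompose` and `haar_reconstruct` are hypothetical. The sketch splits a signal into one low-frequency approximation and three layers of high-frequency detail, then rebuilds the signal to show that the decomposition is invertible.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) * SQRT_HALF
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SQRT_HALF
              for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels=3):
    """Repeatedly split the approximation; details[0] is the finest layer."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_decompose, starting from the coarsest layer."""
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.append((a + d) * SQRT_HALF)
            rebuilt.append((a - d) * SQRT_HALF)
        current = rebuilt
    return current

# Toy input (assumed): a slow sine plus a fast alternating disturbance.
signal = [math.sin(0.2 * i) + (0.1 if i % 2 else -0.1) for i in range(16)]
approx, details = haar_decompose(signal, levels=3)
restored = haar_reconstruct(approx, details)
```

In this toy input the fast alternating component ends up mostly in the finest detail layer, while the slow sine is carried by the approximation. This matches the split between high-frequency noise and the useful low-frequency signal in Eq. (3).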

Basic concept of BP-GA
BP-GA is a recently proposed optimization algorithm characterized by high efficiency and strong global search capability. GA does not depend on gradient information and does not require the objective function to be continuous, or even to have an explicit expression [4]. BP-GA combines the advantages of the artificial neural network and the genetic algorithm: it overcomes the low efficiency and long convergence time of GA in structure optimization and also improves the global solving capability of GA, so it is an effective and applicable solution [5][6]. Its working process is shown in Fig.3.

Construction of BP neural network voice print recognition
A multi-layer feed-forward neural network is divided into several layers arranged in order. A neuron in layer i receives signals only from neurons in layer (i-1), and there is no feedback between neurons in any layer. When a vector x is input to a feed-forward network, a vector y is output, so the network can be regarded as a mapping from x to y [7]. The topological structure of the neural network is shown in Fig.4.

A three-layer BP neural network is employed. The input vector of the first layer can be adjusted according to practice. The first layer is a normalization layer with three nodes. The input vectors are software threat, hardware threat and management vulnerability. Experts are organized into an evaluation group to evaluate each factor. The risk evaluation results take values in (0, 1); the better the result, the lower the risk to the success of the information subject. The second layer is the input layer of the BP network and has five nodes. The third layer is the output layer and has one node. Its output characteristic function is a sigmoid (S-shaped) function, and its output value is the risk degree of information construction, a continuous value in (0, 1). The node expressions involve the weight of the output layer and the output risk degree. The mathematical expressions of the nodes are:
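As a rough illustration of the three-layer structure described above (three normalized inputs, five nodes in the second layer, one sigmoid output node), a forward pass might look like the sketch below. Several details are assumptions, not taken from the paper: min-max scaling for the normalization layer, a tanh activation in the five-node layer, the presence of bias terms, and the hypothetical names `normalize`, `sigmoid` and `forward`. All weights are placeholders.

```python
import math
import random

def normalize(raw, low, high):
    """Layer 1 (assumed min-max scaling): map each raw input into [0, 1]."""
    return [(x - lo) / (hi - lo) for x, lo, hi in zip(raw, low, high)]

def sigmoid(z):
    """S-shaped output function with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: 3 inputs -> 5 nodes (tanh, assumed) -> 1 sigmoid output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Placeholder weights; in the paper these would come from GA/BP training.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(5)]
b_hidden = [rng.uniform(-1.0, 1.0) for _ in range(5)]
w_out = [rng.uniform(-1.0, 1.0) for _ in range(5)]
b_out = rng.uniform(-1.0, 1.0)

x = normalize([2.0, 5.0, 1.0], low=[0.0, 0.0, 0.0], high=[4.0, 10.0, 2.0])
y = forward(x, w_hidden, b_hidden, w_out, b_out)
```

Because the output node is a sigmoid, `y` always lies strictly between 0 and 1, which matches the continuous output range given in the text.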

Design of genetic algorithm
A genetic algorithm contains six basic parts: coding, initial population, fitness function, genetic manipulation, parameter control and termination rules [8]. Coding is the bridge between the problem and the algorithm.
To carry out the genetic search in a large space and improve the accuracy of the algorithm, float encoding is employed. Each individual of the initial population is a 40-bit binary string whose bits are determined as follows: generate a random number in (0, 1); if it is larger than 0.5, the bit is set to 1, otherwise it is set to 0. The most widely used way of constructing the fitness function is currently based on the penalty function idea. As the problem discussed in this paper is a least-error optimization problem, the fitness function can be expressed as:

Genetic manipulation contains five parts: selection, crossover, step setting, mutation and population update.
(1) Selection: individuals are selected by the roulette wheel method.
(2) Step setting: a scaling method is used.
(3) Crossover: uniform crossover is used, with the mask generated at random.
(4) Mutation: a mutation bit is selected at random and its value is inverted.
(5) Population update: a Pareto solution preservation strategy is used, in which the parent Pareto solutions are copied directly into the child generation.

Two termination rules are adopted. Under the first rule, the algorithm is considered convergent and the program is terminated when the difference between the maximum and the minimum of the objective function is smaller than the given precision 1e-6. Under the second rule, the maximum number of generations is set to 500, and the program is terminated when the number of iterations reaches 500. The operation results are shown in Fig.5.
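The operators and the two termination rules above can be sketched as follows. This is an illustration under several assumptions, not the authors' program. The objective `error` is a hypothetical toy function, since the paper's fitness expression is not available here. The mapping from error to selection weight, `1 / (1 + error)`, is likewise assumed. The step-setting operation is omitted. Pareto preservation is reduced to copying the single best parent into the next generation, because the toy problem has one objective. The crossover rate 0.5 and mutation rate 0.042 are the values quoted in the simulation section.

```python
import random

def decode(bits):
    """Map a bit string to a real number in [0, 1]."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value / (2 ** len(bits) - 1)

def error(bits, target=0.3):
    """Hypothetical objective: squared distance of the decoded value to a target."""
    return (decode(bits) - target) ** 2

def roulette_select(population, weights, rng):
    """Roulette wheel: pick an individual with probability proportional to weight."""
    pick = rng.uniform(0.0, sum(weights))
    running = 0.0
    for individual, weight in zip(population, weights):
        running += weight
        if running >= pick:
            return individual
    return population[-1]

def uniform_crossover(a, b, rng):
    """Uniform crossover driven by a random mask."""
    mask = [rng.random() < 0.5 for _ in a]
    child1 = [x if m else y for x, y, m in zip(a, b, mask)]
    child2 = [y if m else x for x, y, m in zip(a, b, mask)]
    return child1, child2

def mutate(bits, rate, rng):
    """With probability `rate`, invert one randomly chosen bit."""
    child = list(bits)
    if rng.random() < rate:
        pos = rng.randrange(len(child))
        child[pos] = 1 - child[pos]
    return child

def run_ga(pop_size=500, n_bits=40, crossover_rate=0.5, mutation_rate=0.042,
           max_generations=500, precision=1e-6, seed=0):
    rng = random.Random(seed)
    # Initial population: a bit is 1 when a random number in (0, 1) exceeds 0.5.
    population = [[1 if rng.random() > 0.5 else 0 for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(max_generations):            # rule 2: generation limit
        errors = [error(ind) for ind in population]
        best = population[errors.index(min(errors))]
        history.append(min(errors))
        if max(errors) - min(errors) < precision:  # rule 1: objective spread
            break
        weights = [1.0 / (1.0 + e) for e in errors]
        children = [list(best)]                 # best parent kept unchanged
        while len(children) < pop_size:
            p1 = roulette_select(population, weights, rng)
            p2 = roulette_select(population, weights, rng)
            if rng.random() < crossover_rate:
                c1, c2 = uniform_crossover(p1, p2, rng)
            else:
                c1, c2 = list(p1), list(p2)
            children.append(mutate(c1, mutation_rate, rng))
            if len(children) < pop_size:
                children.append(mutate(c2, mutation_rate, rng))
        population = children
    return best, history

best, history = run_ga(pop_size=20, max_generations=30, seed=1)
```

Because the best parent is always carried over unchanged, the best error recorded in `history` can never increase from one generation to the next. The small population and generation count in the usage line are only there to keep the demonstration fast.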

Simulation analyses of an example with BP-GA algorithm
In this example, 200 samples from the speech database are taken as feature vectors. The target output is 1 for a genuine speaker and 0 for a false speaker, and the decision threshold for recognition is 0.5. A MATLAB simulation program was written with an initial population of 500, a crossover rate of 0.5, a mutation rate of 0.042 and a step scale of 0.8. The tansig function is taken as the transfer function of the hidden layer and the satlin function as the transfer function of the output layer, which has a single output to improve training speed. The optimized training results of the BP neural network are shown in Fig.6. As Fig.6 shows, the false recognition rate is about 1.6% using the wavelet-BP-GA algorithm. To test the validity of the algorithm, we compared it with the MFCC and LPCC algorithms; the detailed experimental data are shown in Table 1.
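The two transfer functions and the 0.5 decision rule mentioned above can be written out as a short sketch. The definitions follow the standard meaning of MATLAB's `tansig` (hyperbolic tangent sigmoid) and `satlin` (saturating linear, clipped to [0, 1]). The helper names `is_genuine` and `false_rate`, the toy outputs, and the choice to accept an output exactly equal to 0.5 are assumptions made for illustration.

```python
import math

def tansig(x):
    """Hyperbolic tangent sigmoid; equals 2 / (1 + exp(-2x)) - 1."""
    return math.tanh(x)

def satlin(x):
    """Saturating linear function: 0 below 0, x on [0, 1], 1 above 1."""
    return min(1.0, max(0.0, x))

def is_genuine(output, threshold=0.5):
    """Decision rule: outputs at or above the threshold count as genuine (assumed)."""
    return output >= threshold

def false_rate(outputs, targets, threshold=0.5):
    """Fraction of samples whose decision disagrees with the target (1 or 0)."""
    wrong = sum(1 for o, t in zip(outputs, targets)
                if is_genuine(o, threshold) != (t == 1))
    return wrong / len(outputs)

# Toy network outputs and targets (hypothetical, not the paper's data).
outputs = [0.9, 0.1, 0.6, 0.4]
targets = [1, 0, 0, 1]
rate = false_rate(outputs, targets)
```

With a single saturating output, the network output is already confined to [0, 1], so comparing it with 0.5 gives the accept or reject decision directly.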