Speech encryption using chaotic shift keying for secured speech communication

This paper presents a chaotic shift keying-based speech encryption and decryption method. In this method, the input speech signal is sampled and the sample values are segmented into four levels, namely L0, L1, L2, and L3. Each level of sampled values is permuted using one of four chaotic generators: the logistic map, tent map, quadratic map, and Bernoulli’s map. A chaotic shift keying mechanism assigns the logistic map to L0, the tent map to L1, the quadratic map to L2, and Bernoulli’s map to L3 for shuffling the speech samples at every level. In addition, the sampled values are permuted using the Chen map, which exhibits chaotic behavior. Various tests are applied to analyze the efficiency of the system. The results indicate that the proposed system is highly secure against attackers and possesses a powerful diffusion and confusion mechanism for speech communication in the field of telecommunication.


Introduction
Security and privacy are the two major concerns in ever-growing speech communication systems. Speech cryptography is a solution for transmitting spoken information in a masked form by encrypting the data at the transmitter's end and decrypting it at the receiver's end. With cryptography, an eavesdropper may detect that a masked message is present, but decoding it is hard. The encryption is obtained by scrambling the original spectrum, and the reverse process is used for decryption.
In general, there are two types of encryption schemes, namely symmetric encryption and asymmetric encryption. Symmetric-key encryption, otherwise known as secret-key, shared-key, or private-key encryption [1], uses the same key for encryption and decryption. Asymmetric cryptography [2,3] uses different keys for encryption and decryption. In this case, each end user on a network has a pair of keys: one for encryption and the other for decryption. These keys are labeled the public key and the private key. The asymmetric scheme bases its security on problems such as the factorization of large numbers and on other mathematical functions; an eavesdropper can in principle infer the key mathematically, but doing so is time consuming.
These two general cryptographic methods are based on algebraic notions and the theory of computational complexity. Chaotic methods, in contrast, belong to the field of nonlinear dynamics [4]. Chaos-based cryptographic functions follow deterministic dynamics and show unpredictable behavior arising from nonlinear functions and chaos properties [5][6][7]. Chaos-based cryptography combines traditional cryptographic techniques with chaotic synchronization to enhance the degree of security [8][9][10][11][12][13][14]. In this paper, a higher degree of security is achieved by multiple levels of permutation on the sampled speech using five different chaotic maps. Furthermore, the proposed system withstands various attacks better. The rest of the paper is organized as follows. Section 2 describes the five chaotic mapping techniques. In Section 3, the architecture and general principles of the proposed speech encryption are discussed in detail. Section 4 introduces the chaotic switching and modulation method. In Section 5, the security analysis and test results are presented in support of the method. Section 6 gives the concluding remarks.

Logistic mapping
The logistic map is a one-dimensional map that shows how complex chaotic behavior can arise from very simple nonlinear dynamical equations [15] (https://en.wikipedia.org/wiki/List_of_chaotic_maps). This kind of map usually takes the form of an iterated function. Mathematically, the logistic map is written as:
$$X_{n+1} = r X_n (1 - X_n)$$
where $X_n$ is a number between zero and one which represents the ratio of the existing population to the maximum possible population, and r is the control parameter that governs the behavior of the map. This nonlinear difference equation is intended to capture two effects: i. Reproduction, where the population increases at a rate proportional to the current population when the population size is small, and ii. Density-dependent mortality, where the growth rate decreases at a rate proportional to the value obtained by taking the theoretical "carrying capacity" of the environment less the current population.
For r = 4, the logistic map is a nonlinear transformation of both the bit-shift (Bernoulli) map and the μ = 2 case of the tent map. As the parameter r is varied, different behaviors are observed. In the chaotic regime, from almost all initial conditions there is no oscillation of finite period, and a minor variation in the initial population yields a dramatic change in the results over time. The logistic map is used in the proposed work for permutation and substitution of the L0 parameters in the chaotic switch.
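The sensitivity to the initial population can be illustrated with a minimal Python sketch. It assumes the standard logistic recurrence X_{n+1} = r X_n (1 − X_n); the function name `logistic_sequence`, the seed 0.3, and the offset 1e-9 are illustrative choices and are not taken from the proposed system.

```python
def logistic_sequence(x0, r=4.0, n=10):
    """Return n iterates of X_{n+1} = r * X_n * (1 - X_n), starting from x0."""
    xs = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


# Two initial populations that differ by only 1e-9 (illustrative values).
orbit_a = logistic_sequence(0.3, n=60)
orbit_b = logistic_sequence(0.3 + 1e-9, n=60)

# Largest distance between the two orbits over the 60 iterations.
max_gap = max(abs(u - v) for u, v in zip(orbit_a, orbit_b))
print(max_gap)
```

Although the two seeds are almost identical, the orbits separate to a distance comparable to the size of the unit interval within a few dozen iterations, which is the property a chaotic key generator relies on.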

Tent mapping
The tent map with parameter μ is the real-valued function $f_\mu$ defined by
$$f_\mu(X) = \mu \min\{X,\, 1 - X\}.$$
For values of the parameter μ between 0 and 2, $f_\mu$ maps the unit interval [0, 1] into itself, thus defining a discrete-time dynamical system on it or, equivalently, a recurrence relation [16] (https://en.wikipedia.org/wiki/List_of_chaotic_maps). In particular, iterating a point $X_0$ in [0, 1] gives rise to a sequence $X_n$:
$$X_{n+1} = f_\mu(X_n) = \begin{cases} \mu X_n, & X_n < 1/2 \\ \mu (1 - X_n), & X_n \ge 1/2 \end{cases}$$
where μ is a positive real constant. Choosing, for instance, the parameter μ = 2, the effect of the function $f_\mu$ may be viewed as folding the unit interval in two and then stretching the resulting interval [0, 1/2] to recover the interval [0, 1]. Iterating the procedure, any point $X_0$ of the interval assumes new subsequent positions as described above, generating a sequence $X_n$ in [0, 1]. The μ = 2 case of the tent map is a nonlinear transformation of both the bit-shift map and the r = 4 case of the logistic map.
Depending on the value of μ, the tent map demonstrates a wide range of dynamical behaviors, from predictable to chaotic. If μ is less than 1, the point X = 0 is an attractive fixed point of the system for all initial values of X, i.e., the system converges towards X = 0 from any initial value of X. If μ is 1, all values of X less than or equal to 1/2 are fixed points of the system. If μ is greater than 1, the system has two fixed points, one at 0 and the other at μ/(μ + 1). If μ is between 1 and 2, the interval [μ − μ²/2, μ/2] contains both periodic and non-periodic points, although all of the orbits are unstable. The tent map is used in the proposed work for permutation and substitution of the L1 parameters in the chaotic switch.
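The three regimes described above can be checked numerically. The following is a minimal sketch that assumes the tent map in the form f_μ(X) = μ min{X, 1 − X}; the helper names `tent` and `tent_orbit` and the test values are hypothetical.

```python
def tent(x, mu):
    """One step of the tent map f_mu(x) = mu * min(x, 1 - x)."""
    return mu * min(x, 1.0 - x)


def tent_orbit(x0, mu, n):
    """Return the n iterates that follow x0 under the tent map."""
    xs = []
    x = x0
    for _ in range(n):
        x = tent(x, mu)
        xs.append(x)
    return xs


# mu < 1: every orbit is pulled towards the fixed point X = 0.
print(tent_orbit(0.9, 0.5, 60)[-1])

# mu = 1: every X <= 1/2 is mapped to itself.
print(tent(0.3, 1.0))

# mu > 1: the second fixed point lies at mu / (mu + 1).
mu = 1.5
x_star = mu / (mu + 1.0)
print(x_star, tent(x_star, mu))
```

Note that with μ = 2 a binary floating-point orbit collapses to zero after a few dozen iterations, because each step discards one bit of the mantissa; an implementation intended for key generation would need to take this into account.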

Quadratic mapping
Despite its simple mathematical formulation, the quadratic map exhibits very complicated dynamical properties [16] (https://en.wikipedia.org/wiki/List_of_chaotic_maps), which concern the asymptotic behavior of its iterates as n → +∞. Moreover, such features may change dramatically under variation of the parameter a. This is related to the fact that, for large n, the n-th iterate of the map is a high-degree polynomial and depends in a complicated way on x and a. The quadratic map can be used as a model for the description of such dynamics with a wider scope.
Consider the equation of the quadratic map:
$$x_{n+1} = a - x_n^2$$
The behavior of the quadratic map changes around certain fixed points, which are the values for which $x_{n+1} = x_n$. If the map is iterated from a point in the proximity of one of the fixed points, the iterates are either attracted to the fixed point or repelled from it. In the case of the quadratic map, both repulsion and attraction occur. If there is attraction to the fixed point, the fixed point is stable. If there is repulsion, the fixed point is unstable. To get a clear picture of what goes on in the quadratic map, the fixed points ought to be identified and their stability analyzed.
Here, linearization may also be used.
If x is a fixed point, then $x = a - x^2$, so $0 = x^2 + x - a$ and $x = (-1 \pm \sqrt{1 + 4a})/2$. To determine the stability of a fixed point in its neighborhood, let $x_n = x + \delta_n$, where $\delta_n$ is a small distance of either sign. Then
$$x + \delta_{n+1} = a - (x + \delta_n)^2 = a - x^2 - 2x\delta_n - \delta_n^2,$$
so that, to first order, $\delta_{n+1} \approx -2x\,\delta_n$. The fixed point is therefore stable when $|2x| < 1$ and unstable when $|2x| > 1$. The quadratic map is used in the proposed work for permutation and substitution of the L2 parameters in the chaotic switch.
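The fixed-point analysis can be verified with a short sketch. It assumes the map x_{n+1} = a − x_n², consistent with the fixed-point condition x = a − x² given above, and uses the linearization that a small perturbation is multiplied by −2x at each step. The function names and the value a = 0.5 are illustrative.

```python
import math


def quadratic_fixed_points(a):
    """Fixed points of x_{n+1} = a - x_n**2, i.e. the roots of x**2 + x - a = 0."""
    disc = 1.0 + 4.0 * a
    if disc < 0:
        return []  # no real fixed point
    root = math.sqrt(disc)
    return [(-1.0 + root) / 2.0, (-1.0 - root) / 2.0]


def is_stable(x_star):
    """Linearization: a small perturbation is multiplied by -2 * x_star per step."""
    return abs(-2.0 * x_star) < 1.0


def iterate(x0, a, n):
    """Apply the quadratic map n times to x0."""
    x = x0
    for _ in range(n):
        x = a - x * x
    return x


a = 0.5  # illustrative parameter value
for x_star in quadratic_fixed_points(a):
    print(x_star, "stable" if is_stable(x_star) else "unstable")

# An orbit started near the stable fixed point settles onto it.
print(iterate(0.3, a, 200))
```

For this parameter value one fixed point attracts nearby orbits and the other repels them, which matches the statement that both attraction and repulsion occur in the quadratic map.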

Bernoulli's mapping
Bernoulli's map is the one-dimensional map $x_{n+1} = \{2x_n\}$, where $\{\cdot\}$ designates the fractional part of a number. It is convenient to represent the variable x in binary notation; the digit 0 in the first position after the binary point then corresponds to the state lying in the left half of the unit interval, and the digit 1 to the state lying in the right half. One step of the map shifts the binary sequence one position to the left and discards the leading digit. Such a transformation of the binary sequence is called the Bernoulli shift [16] (https://en.wikipedia.org/wiki/List_of_chaotic_maps).
Let the initial state be defined by a random binary sequence obtained, say, by tossing a coin and recording heads or tails: $x_0 = 0.0101101\ldots$ In the course of the iterations, the state visits the left or right half of the unit interval exactly as dictated by this random sequence. In this sense, the map behaves in a chaotic manner. The transformation can also be defined as the iterated map of a piecewise linear function.
In this map, a small perturbation of the initial condition doubles at each iteration. Bernoulli's map is used in the proposed work for permutation and substitution of the L3 parameters in the chaotic switch.
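The relation between the binary digits of the initial state and the left/right itinerary can be reproduced with exact rational arithmetic. This is a minimal sketch; exact fractions are used here only for illustration (binary floating point would lose one bit per iteration), and the names `bernoulli_step` and `itinerary` are hypothetical.

```python
from fractions import Fraction


def bernoulli_step(x):
    """One step of x_{n+1} = {2 x_n}: double the state and keep the fractional part."""
    y = 2 * x
    return y - (y.numerator // y.denominator)


def itinerary(x0, n):
    """Record 0 when the state is in [0, 1/2) and 1 when it is in [1/2, 1)."""
    bits = []
    x = x0
    for _ in range(n):
        bits.append(0 if x < Fraction(1, 2) else 1)
        x = bernoulli_step(x)
    return bits


# Initial state 0.0101101 in binary, the example sequence used in the text.
x0 = Fraction(int("0101101", 2), 2 ** 7)
print(itinerary(x0, 7))
```

The recorded itinerary repeats the binary digits of the initial state one by one, so the orbit is exactly as irregular as the digit sequence that defines its starting point.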

Chen mapping
The Chen map is a one-to-one transformation [17] (https://en.wikipedia.org/wiki/List_of_chaotic_maps). It is given by a matrix transformation in which p and q are the parameters. It is invertible because the matrix has a determinant of 1, and therefore its inverse also has integer entries. One of its features is that the signal is apparently randomized by the transformation, yet it returns to its original state after a number of steps. This map is used in the proposed work for permutation and substitution of all the parameters before the chaotic shift keying on the encryption side.

Architecture of proposed cryptosystem
First, the given speech signal is sampled, with sample values in the range between −1 and 1, and the samples are divided into four levels: L0 = −1 to −0.5, L1 = −0.5 to 0, L2 = 0 to 0.5, and L3 = 0.5 to 1. The speech samples of each level are permuted using the corresponding chaotic map, namely the logistic map, tent map, quadratic map, or Bernoulli's map. Each chaotic generator produces as many random numbers as there are speech samples in its segment (Figs. 1, 2, and 3).
The random numbers generated by the chaotic generators are sorted in ascending order, and the corresponding indexes are taken from the sorted list. Based on the indexes of the random numbers, the sampled values of the speech signal are permuted. The permuted parameters are then substituted with the random numbers generated by the corresponding chaotic generator.
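The sort-and-index permutation step can be sketched as follows. This is a minimal illustration and not the authors' Matlab implementation: the logistic keystream, its seed 0.37, and the parameter 3.99 are assumed values, all function names are hypothetical, and the substitution step is not covered because the text does not specify it in detail.

```python
def logistic_keystream(x0, r, n):
    """Generate n chaotic numbers with the logistic map (illustrative generator)."""
    keys = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        keys.append(x)
    return keys


def permutation_from_keys(keys):
    """Indexes of the keys after sorting them in ascending order."""
    return sorted(range(len(keys)), key=lambda i: keys[i])


def permute(samples, perm):
    """Place samples[perm[j]] at position j."""
    return [samples[i] for i in perm]


def unpermute(shuffled, perm):
    """Invert permute(): put each value back at its original index."""
    restored = [0.0] * len(perm)
    for position, original_index in enumerate(perm):
        restored[original_index] = shuffled[position]
    return restored


samples = [0.1, -0.4, 0.8, -0.9, 0.3]
keys = logistic_keystream(0.37, 3.99, len(samples))  # assumed seed and parameter
perm = permutation_from_keys(keys)
shuffled = permute(samples, perm)
print(shuffled)
print(unpermute(shuffled, perm))
```

Because the receiver can regenerate the same keystream from the same seed and parameter, it obtains the same index list and can undo the permutation without the index list being transmitted.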
The chaotic generator for each level of sampled speech is selected by the chaotic shift keying technique. Chaotic switching is the simplest form of modulation with chaotic attractors. The signal u(t) controls the switch, which toggles between the chaotic systems with the different parameters L0, L1, L2, and L3. The encryption scheme consists of four chaotic subsystems:
i. Subsystem with the parameters L0, active when −1 ≤ u(t) < −0.5
ii. Subsystem with the parameters L1, active when −0.5 ≤ u(t) < 0
iii. Subsystem with the parameters L2, active when 0 ≤ u(t) < 0.5
iv. Subsystem with the parameters L3, active when 0.5 ≤ u(t) ≤ 1
Transmission of the chaotic attractor A0, generated by the first chaotic circuit based on logistic mapping (with the parameters L0), corresponds to the values from −1 to −0.5. Transmission of the attractor A1, generated by the second chaotic circuit based on tent mapping (with the parameters L1), corresponds to the values from −0.5 to 0. Transmission of the attractor A2, generated by the third chaotic circuit based on quadratic mapping (with the parameters L2), corresponds to the values from 0 to 0.5. Transmission of the attractor A3, generated by the fourth chaotic circuit based on Bernoulli's mapping (with the parameters L3), corresponds to the values from 0.5 to 1.
The entire system acts as a control that switches between the attractors A0, A1, A2, and A3. The receiver also consists of four chaotic subsystems, which have to be identical to and synchronized with those on the transmitter side. The first one is designed for demodulating the values between −1 and −0.5, the second one for the values between −0.5 and 0, the third one for the values between 0 and 0.5, and the fourth one for the values between 0.5 and 1. The demodulation is carried out on the basis of decisions within a regular time interval. Effective demodulation of a particular value is possible only when the chaotic systems on the transmitter and receiver sides are exactly synchronized. After the sequence of permutation processes, all the speech samples are appended.
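The switching rule can be sketched in a few lines of Python. The thresholds follow the four level ranges defined for L0 to L3; the function names are hypothetical, and keeping each sample together with its original index is an assumption made here so that the levels can be reassembled later.

```python
GENERATORS = ("logistic", "tent", "quadratic", "bernoulli")


def level_of(u):
    """Return the level index 0..3 selected by a sample u in [-1, 1]."""
    if not -1.0 <= u <= 1.0:
        raise ValueError("sample outside the range [-1, 1]")
    if u < -0.5:
        return 0  # L0: -1 <= u < -0.5, logistic map
    if u < 0.0:
        return 1  # L1: -0.5 <= u < 0, tent map
    if u < 0.5:
        return 2  # L2: 0 <= u < 0.5, quadratic map
    return 3      # L3: 0.5 <= u <= 1, Bernoulli's map


def split_by_level(samples):
    """Group (index, value) pairs by level; indexes are kept for reassembly."""
    levels = {0: [], 1: [], 2: [], 3: []}
    for index, u in enumerate(samples):
        levels[level_of(u)].append((index, u))
    return levels


samples = [-0.9, 0.2, 0.7, -0.3, 0.5, -0.5]
for level, members in split_by_level(samples).items():
    print("L%d" % level, GENERATORS[level], members)
```

Each group would then be shuffled with the keystream of its own generator, so the choice of chaotic map changes with the amplitude of the speech sample.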

Chaotic switching and modulation
The method of chaotic switching represents the simplest form of modulation with chaotic attractors [18][19][20]. It is suitable for deciphering digital signals. The essence of chaotic modulation is the modulation of the input signal y(t) by a chaotic signal u(t) produced by the chaotic signal generator. The signal y(t) is modulated by the signal u(t) in the chaotic modulator, where multiplication occurs. The modulated signal s(t) is transmitted over the communication channel to the receiver, where the chaotic demodulator carries out the demodulation, i.e., the division of the modulated signal s(t) by the chaotic signal u(t). The equality of the receiver's and the transmitter's parameters and their synchronization are conditions for successful demodulation.

Results and discussion
The proposed system was tested in Matlab. The system was subjected to a correlation test, SNR test, PSNR test, security analysis, randomness test, sensitivity test, histogram analysis, and robustness test to establish its performance metrics [21]. Four sample speech signals are taken randomly from the TIMIT database and are sampled at 8 kHz, with lengths of 3 s to 8 s and 8000 samples per frame. The higher the chaos-to-signal ratio, the more secure the system is considered [22]; therefore, five chaotic generators are used in the proposed system, of which one is primary and the remaining four are secondary. Four samples are taken into account for the SNR and PSNR tests to measure the intelligibility of the speech and encrypted signals. The measuring tests are given below:

Correlation test
The auto-correlation function identifies the chaotic system that produces a strong encryption [23]. A useful measure for assessing the encryption quality of any cryptosystem is the correlation coefficient between similar segments in the clear signal and the cipher signal. It is calculated as:
$$r_{xk} = \frac{C(x, k)}{\sqrt{V(x)}\,\sqrt{V(k)}}$$
where C(x, k) is the covariance between the original signal x and the encrypted signal k, and V(x) and V(k) are the variances of the signals x and k. The variance V(x) is computed as:
$$V(x) = \frac{1}{N_s} \sum_{i=1}^{N_s} \left(x_i - E(x)\right)^2$$
where $N_s$ is the number of speech samples and E(x) is the mean of x. A low value of the correlation coefficient $r_{xk}$ indicates an encryption of good quality. The correlation coefficients for the three different encrypted speech samples with the chaotic maps are illustrated in Fig. 10, and those for the speech signals encrypted with the proposed method using five different chaotic random number generators are tabulated in Table 1. From these results, we infer that the proposed algorithm produces encrypted speech with low correlation between similar segments in the original speech and the encrypted speech.
In other words, the encryption method offers good encryption results. In the proposed method, a correlation coefficient of 0.998 was obtained in the decryption process; this indicates that the original speech signal is recovered almost completely at the receiver, while the low correlation of the cipher signal makes it hard for eavesdroppers to extract the speech from the channel during transmission.
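The correlation test can be sketched as below. This is a minimal illustration that assumes the usual form of the coefficient, covariance divided by the product of the two standard deviations, with population (1/N) normalization; the function names and sample values are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(x, k):
    """Population covariance C(x, k) of two equal-length signals."""
    mx, mk = mean(x), mean(k)
    return sum((a - mx) * (b - mk) for a, b in zip(x, k)) / len(x)


def correlation(x, k):
    """Correlation coefficient r_xk = C(x, k) / (sqrt(V(x)) * sqrt(V(k)))."""
    vx = covariance(x, x)  # variance V(x)
    vk = covariance(k, k)  # variance V(k)
    if vx == 0 or vk == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return covariance(x, k) / (math.sqrt(vx) * math.sqrt(vk))


original = [1.0, 2.0, 3.0, 4.0]
shuffled = [2.0, 1.0, 4.0, 3.0]  # illustrative values only
print(correlation(original, original))
print(correlation(original, shuffled))
```

A value near 1 between the original and decrypted signals indicates faithful recovery, whereas a value near 0 between the original and encrypted signals indicates that the cipher signal reveals little about the speech.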

SNR test
The signal-to-noise ratio (SNR) test is an estimator for measuring speech signal intelligibility [23]. The popular time-domain metric is the SNR, which is defined as the average of the SNR values of short segments of the output signal and is calculated from the original speech signal and the decrypted speech signal y(i). The closer the SNR is to zero, the higher is the quality of the decrypted signal. In the proposed method, an SNR value of $0.23 \times 10^{-14}$ was obtained; this shows that the original speech signal is almost completely recovered in the decryption process. The quality measures of the proposed algorithm for four speech signals are given in Table 2.

PSNR test
Peak signal-to-noise ratio (PSNR) is the ratio between the maximum possible power of the original speech signal and the power of the encrypted signal [23]. PSNR is a measure of the encryption quality relative to the original signal. A higher PSNR indicates that the encryption or reconstruction is of higher quality. The PSNR is obtained from:
$$\mathrm{PSNR} = 10 \log_{10} \frac{n \, \max_i x_i^2}{\lVert x - k \rVert^2}$$
where $\max_i x_i^2$ is the maximum absolute square value of the original speech signal x, n is the length of the encrypted signal k, and $\lVert x - k \rVert^2$ is the energy of the difference between the original and encrypted signals.
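A minimal PSNR sketch is given below. It assumes that the three quantities named above are combined as n times the peak squared sample of the original signal, divided by the energy of the difference, and expressed in decibels; this is one common reading of those definitions, and the function name `psnr` and the sample values are hypothetical.

```python
import math


def psnr(x, k):
    """PSNR in dB between an original signal x and a second signal k."""
    n = len(k)                                          # length of the second signal
    peak = max(v * v for v in x)                        # maximum squared sample of x
    energy = sum((a - b) ** 2 for a, b in zip(x, k))    # energy of the difference
    if energy == 0:
        return math.inf                                 # identical signals
    return 10.0 * math.log10(n * peak / energy)


original = [1.0, 0.0, -1.0, 0.0]
other = [0.0, 0.0, 0.0, 0.0]  # illustrative values only
print(psnr(original, other))
print(psnr(original, original))
```

Under this definition, a signal that differs strongly from the original gives a low PSNR, while a faithful reconstruction gives a very high one.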

Security analysis
The sensitivity test is performed as part of the security analysis to evaluate the protection of the proposed cryptosystem against various attacks [24]. To ensure the sensitivity of the encrypted speech, the input speech signal is permuted at multiple levels using multiple chaotic maps, which changes the positions of the sampled values because the chaotic generators produce a large volume of random numbers. To test the sensitivity of the proposed cryptosystem, the encrypted signal is decrypted by reversing the encryption process with the corresponding chaotic maps at the same four levels, and good-quality speech is recovered.

Histogram analysis
This test is applied to evaluate the immunity of the algorithm against differential attacks, and a few speech samples are chosen randomly from the TIMIT dataset. The histogram analysis is used to demonstrate the strength of the algorithm, and the results are shown in Figs. 4, 5, 6, 7, 8, 9, and 10. The histogram of the input speech sample shown in Fig. 4 is very close to the histogram of the decrypted signal shown in Fig. 9. The encrypted speech signal shown in Fig. 8 is strongly and fully masked. Figure 5 shows the histogram of the speech signal split into L0, L1, L2, and L3, and Fig. 6 shows the histogram after permutation and substitution of the parameters L0, L1, L2, and L3 using four different chaotic generators. Figure 7 shows the histogram of the reconstructed parameters L0, L1, L2, and L3. The result in Fig. 7 indicates that the split signals L0, L1, L2, and L3 have been reconstructed perfectly.

Robustness test
The LSB of each speech sample is inverted to obtain a modified speech signal. The actual and modified speech signals are encrypted using the same key, and two ciphered speech signals are generated. These ciphered speech signals are then compared by the number of sample change rate (NSCR) and the unified average changing intensity (UACI) [25]. Both measures are computed from x and x', the two ciphered speech signals whose corresponding original signals have only a one-sample difference; the values of the samples at position i of x and x' are denoted by $x_i$ and $x_i'$, respectively, and l is the length of the speech vector. The ideal values for NSCR and UACI are 100% and 33.3%, respectively. In Table 3, the minimum, maximum, and average values of NSCR and UACI, computed from the encryption of four different modified versions of each speech signal, are given. The results are considerably close to the ideal values and independent of the position of the modified sample, as compared with [25].
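The two robustness measures can be sketched as follows, assuming their usual definitions: NSCR as the percentage of positions at which the two ciphered signals differ, and UACI as the mean absolute difference normalized by the full-scale sample value. The `full_scale` argument is an assumption (for example, 65535 for 16-bit samples), since the normalizing constant is not stated here, and the function names and data are hypothetical.

```python
def nscr(x, x_prime):
    """Percentage of positions at which the two ciphered signals differ."""
    changed = sum(1 for a, b in zip(x, x_prime) if a != b)
    return 100.0 * changed / len(x)


def uaci(x, x_prime, full_scale):
    """Mean absolute difference as a percentage of the assumed full-scale value."""
    total = sum(abs(a - b) for a, b in zip(x, x_prime))
    return 100.0 * total / (full_scale * len(x))


cipher_a = [0, 10, 20, 30]  # illustrative integer samples
cipher_b = [0, 11, 20, 35]
print(nscr(cipher_a, cipher_b))
print(uaci(cipher_a, cipher_b, full_scale=100))
```

An NSCR close to 100% means that almost every ciphered sample changes when the plain signal is slightly modified, which is the behavior expected from a cipher with good diffusion.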

Discussion and conclusions
Speech encryption using multiple chaotic generators and dynamic chaotic shift keying has been presented and tested. In this method, four different speech signals with different time durations are sampled, their values are permuted using the Chen map, and the permuted values are then segmented into four levels. Each level is permuted using one of four different chaotic generators. The chaotic shift keying mechanism dynamically assigns different chaotic maps to different levels of sampled values for shuffling the speech samples at every level. The histogram of the encrypted signal shows that more sensitivity entails more security. The decrypted signal is very similar to the original speech, which shows the stability of the reconstruction of the original signal. The correlation, SNR, and PSNR tests are applied to estimate the performance of the system. The results indicate that the proposed system is well screened from attackers, has a powerful diffusion and confusion mechanism, and is well suited to real-time speech communication. The tests indicate that the proposed method has a high level of security and can recover the original signals quickly with good audio quality. A robustness test using NSCR and UACI has also been carried out to assess the capacity of the algorithm to withstand various attacks. The results indicate that the speech signal is highly masked from eavesdroppers.