Second-Order Statistical Approach for Digital Modulation Scheme Classification in Cognitive Radio Using Support Vector Machine and K-Nearest Neighbor Classifier

Cognitive radio systems require detection of different signals for communication. In this study, an approach for multiclass signal classification based on second-order statistical features is proposed. The proposed system is designed to recognize three different digital modulation schemes: PAM, 32QAM and 64QAM. The signal classification is achieved by extracting the 2nd order cumulants of the real and imaginary parts of the complex envelope. These second-order statistical features are given to multiclass Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) classifiers for classification. The modulated signals are passed through an Additive White Gaussian Noise (AWGN) channel before feature extraction. The performance evaluation of the system is carried out using 400 generated signals. Experimental results show that the proposed method produces a classification accuracy in the range 65%-89% for the SVM classifier and 65%-68% for the KNN classifier.


INTRODUCTION
A number of definitions can be found to describe Software Defined Radio, also known as Software Radio or SDR. Software Defined Radio is defined as: "Radio in which some or all of the physical layer functions are software defined" (Petrova et al., 2010). A radio is any kind of device that wirelessly transmits or receives signals in the Radio Frequency (RF) part of the electromagnetic spectrum to facilitate the transfer of information. In today's world, radios exist in a multitude of items such as cell phones, computers, car door openers, vehicles and televisions. A study of multi-class signal classification based on automatic modulation recognition through Support Vector Machines (SVM) is presented by Petrova et al. (2010).
SDR in Cognitive Radio should be configurable not only to independent standards, protocols and services but also to the extensively dynamic nature of bandwidth allocation (Rajeshree and Kulat, 2011). Cognitive radio is envisioned as the ultimate system that can sense, adapt and learn from the environment in which it operates. A new robust Automatic Modulation Classification (AMC) algorithm, which applies Higher-Order Statistics (HOS) in a generic framework for blind channel estimation and pattern recognition, is proposed by Hsiao-Chun et al. (2008).
A feature-based method for automatic classification and recognition of seven digital modulations for Software Defined Radio is presented by Roganovi et al. (2009). The classification is conducted with Artificial Neural Networks (ANN). The performance of energy-detection-based spectrum sensing for several real-world primary signals of various radio technologies is presented by Lopez-Benitez et al. (2010). A method for automatic classification using cumulants derived from fractional lower order statistics is proposed by Narendar et al. (2011). The performance of the classifier is presented in the form of the probability of correct classification under noisy and fading conditions.

JCS
A novel approach based on fuzzy logic to classify signals with respect to standards on the basis of known radio parameters is presented by Ahmad et al. (2010). Ideally, the primary user systems would be classified with respect to existing "known standards". A novel design of the Automatic Modulation Recognition (AMR) method with reduced computational complexity and fast processing speed is needed. A Discrete Likelihood-Ratio Test (DLRT)-based rapid-estimation approach to identifying the modulation schemes blindly for uninterrupted data demodulation in real time is described by Xu et al. (2010).
Sensing of digitally modulated primary radio signals is described by Popoola and Olst (2011). To achieve this objective, a digital automatic modulation classifier was developed using an artificial neural network. A new framework for Cognitive Radio (CR) spectrum sensing based on linear and polynomial classifiers is proposed by Hassan et al. (2010). A cooperative CR network is considered in that study, with CR nodes collaborating in making the decision about spectrum availability. Xu et al. (2011) present the automatic modulation classification methods based on likelihood functions, study various classification solutions derived from the likelihood ratio test and discuss the detailed characteristics associated with all major algorithms. A wavelet transform, multi-resolution-analysis-based classification of digital modulation schemes is presented by Kannan and Ravi (2012).
In this study, an approach for the classification of digital signals in cognitive radio based on second-order cumulants with SVM and KNN classifiers is presented.

MATERIALS AND METHODS
The proposed system for the classification of digital signals in cognitive radio is built on second-order statistics, with multiclass SVM and KNN classifiers for classification. The theoretical background of all the approaches is introduced here.

Second-Order Statistics
The autocorrelation function or sequence of a stationary process $x(n)$ is defined in Equation 1:

$$R_{xx}(m) = E\{x^{*}(n)\,x(n+m)\} \tag{1}$$

where $E\{\cdot\}$ denotes the ensemble expectation operator. The power spectrum is formally defined as the Fourier Transform (FT) of the autocorrelation sequence (the Wiener-Khintchine theorem), as given in Equation 2:

$$P_{xx}(f) = \sum_{m=-\infty}^{\infty} R_{xx}(m)\,e^{-j2\pi fm} \tag{2}$$

where $f$ denotes the frequency. An equivalent definition is given in Equation 3:

$$P_{xx}(f) = E\{X(f)\,X^{*}(f)\} \tag{3}$$

where $X(f)$ is the Fourier Transform of $x(n)$, given in Equation 4:

$$X(f) = \sum_{n=-\infty}^{\infty} x(n)\,e^{-j2\pi fn} \tag{4}$$

A sufficient, but not necessary, condition for the existence of the power spectrum is that the autocorrelation be absolutely summable. The power spectrum is real valued and nonnegative, that is, $P_{xx}(f) \ge 0$; if $x(n)$ is real valued, then the power spectrum is also symmetric, that is, $P_{xx}(f) = P_{xx}(-f)$.
The higher-order moments are natural generalizations of the autocorrelation, and cumulants are specific nonlinear combinations of these moments. The first-order cumulant of a stationary process is the mean, $C_{1x} = E\{x(t)\}$. The higher-order cumulants are invariant to a shift of mean; hence, it is convenient to define them under the assumption of zero mean. If the process has nonzero mean, subtract the mean and apply the following definitions to the resulting process. The second-order cumulant of a zero-mean stationary process is defined by (Roganovi et al., 2009) as given in Equation 5:

$$C_{2x}(k) = E\{x^{*}(n)\,x(n+k)\} \tag{5}$$

The second-order cumulant is thus the autocovariance sequence. Note that for complex processes, there are several ways of defining cumulants depending upon which terms are conjugated. The zero-lag cumulant has a special name: $C_{2x}(0)$ is the variance and is usually denoted by $\sigma_x^2$.
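A sample estimate of the second-order cumulant can be sketched as follows. This is a minimal illustration in plain Python, not the authors' MATLAB code: the function names `second_order_cumulant` and `envelope_features` are hypothetical, the biased estimator (division by the full sequence length) is an assumption, and so is the default choice of the zero lag, since the text does not state which lags are used as features.

```python
def second_order_cumulant(x, lag=0):
    """Sample estimate of C_2x(lag) = E{x*(n) x(n + lag)} after mean removal."""
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean = sum(x) / n
    centered = [v - mean for v in x]
    # Biased estimator (assumption): divide by n rather than n - lag.
    total = sum(centered[i].conjugate() * centered[i + lag] for i in range(n - lag))
    return total / n


def envelope_features(signal, lags=(0,)):
    """Cumulants of the real and imaginary parts of a complex envelope."""
    real = [complex(s).real for s in signal]
    imag = [complex(s).imag for s in signal]
    features = []
    for part in (real, imag):
        for lag in lags:
            features.append(second_order_cumulant(part, lag))
    return features


# Tiny hand-checkable example: the variance of [1, 2, 3, 4] is 1.25.
print(second_order_cumulant([1, 2, 3, 4]))
print(envelope_features([1 + 2j, 3 + 2j]))
```

With the default `lags=(0,)`, the feature vector has two entries: the variance of the real part and the variance of the imaginary part.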

Support Vector Machine
Support Vector Machines (SVMs) are a set of related supervised learning methods that analyze data and recognize patterns, used for classification and regression analysis. The standard SVM is a non-probabilistic binary linear classifier, i.e. it predicts, for each given input, which of two possible classes the input is a member of. A classification task usually involves training and testing data, each consisting of a number of data instances. Each instance in the training set contains one "target value" (class label) and several "attributes" (features). SVM has the extra advantage of automatic model selection, in the sense that both the optimal number and the locations of the basis functions are obtained automatically during training. The performance of SVM largely depends on the kernel (Smola et al., 1998). The mathematical derivation of the SVM classifier presented by Xiao-Juan and Dan (2010) is as follows.
SVM is essentially a linear learning machine. The input training sample set is defined in Equation 6, and the classification hyperplane is given in Equation 7. The classification margin is then 2/‖ω‖. Maximizing the margin is equivalent to minimizing ‖ω‖, so the search for the optimal hyperplane becomes the quadratic programming problem in Equation 8. After the introduction of Lagrange multipliers, the dual problem is given in Equation 9. According to the Kuhn-Tucker conditions, the optimal solution must satisfy Equation 10. For every training sample point x_i there is a corresponding Lagrange multiplier a_i. The sample points with a_i = 0 do not contribute to the classification hyperplane, while the points with a_i > 0 do; the latter are called support vectors. Hence the optimal hyperplane is given in Equation 13 and the hard classifier in Equation 14. For the nonlinear case, SVM constructs an optimal separating hyperplane in a high-dimensional space by introducing a kernel function K(x, y) = φ(x)·φ(y); the nonlinear SVM is given in Equation 15 and its dual problem in Equation 16. Thus the optimal hyperplane is determined by the solution to this optimization problem. An SVM classifier assigns the input data to one of two distinct classes. However, it can be used as a multiclass classifier by treating a K-class classification problem as K two-class problems. This is known as one vs. rest or one vs. all classification.
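For orientation, the derivation above follows the usual hard-margin SVM formulation, which can be written out as below. This is the common textbook form, not necessarily the exact notation or numbering of the cited derivation; the symbols $n$, $d$, $y_i$ and $b$ are assumptions introduced here, while $\omega$, $a_i$, $x_i$ and $K$ follow the text.

```latex
% Training set and separating hyperplane
\{(x_i, y_i)\}_{i=1}^{n}, \quad x_i \in \mathbb{R}^{d}, \quad y_i \in \{-1, +1\},
\qquad \omega \cdot x + b = 0

% Primal problem: maximize the margin 2 / \|\omega\|
\min_{\omega,\, b} \ \tfrac{1}{2}\,\|\omega\|^{2}
\quad \text{s.t.} \quad y_i\,(\omega \cdot x_i + b) \ge 1, \quad i = 1, \dots, n

% Dual problem in the Lagrange multipliers a_i
\max_{a} \ \sum_{i=1}^{n} a_i
  - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j\, y_i y_j\, (x_i \cdot x_j)
\quad \text{s.t.} \quad \sum_{i=1}^{n} a_i y_i = 0, \quad a_i \ge 0

% Kuhn-Tucker complementarity: a_i > 0 only for support vectors
a_i \left[\, y_i\,(\omega \cdot x_i + b) - 1 \,\right] = 0

% Decision function; the kernel K replaces the inner product in the nonlinear case
f(x) = \operatorname{sgn}\!\left( \sum_{i=1}^{n} a_i\, y_i\, K(x_i, x) + b \right)
```

In the linear case $K(x_i, x) = x_i \cdot x$, and only the support vectors contribute to the sum in the decision function.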
The SVM classifier uses a standard implementation: in the MATLAB environment, the LIBSVM software is used. LIBSVM is an integrated software package for support vector classification, regression and distribution estimation. It also supports multi-class classification.

Science Publications

KNN Classifier
In pattern recognition, the K-Nearest Neighbor algorithm (K-NN) is a method for classifying objects based on the closest training examples in the feature space. K-NN is a type of instance-based learning where the function is only approximated locally and all computation is deferred until classification. In K-NN, an object is classified by a majority vote of its neighbors, with the object being assigned to the class most common amongst its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of its nearest neighbor. The neighbors are taken from a set of objects for which the correct classification is known. This can be thought of as the training set for the algorithm, though no explicit training step is required.
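The majority-vote rule described above can be sketched in a few lines of plain Python. The function name `knn_classify`, the Euclidean distance metric and the tie-breaking rule are assumptions made for illustration, and the two-dimensional feature values in the example are hypothetical, not measured cumulants.

```python
import math
from collections import Counter


def knn_classify(query, training_set, k=3):
    """Majority vote among the k training points nearest to `query`.

    `training_set` is a list of (feature_vector, label) pairs.
    """
    if not 1 <= k <= len(training_set):
        raise ValueError("k must be between 1 and the training-set size")
    # Sort by Euclidean distance to the query (the metric is an assumption).
    by_distance = sorted(training_set, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    # Ties are resolved in favour of the label encountered first,
    # that is, the one with the closer neighbour.
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors, for illustration only.
training = [
    ((1.0, 0.0), "PAM"),
    ((1.1, 0.1), "PAM"),
    ((5.0, 5.0), "32QAM"),
    ((5.2, 4.9), "32QAM"),
    ((10.0, 10.0), "64QAM"),
    ((10.3, 9.8), "64QAM"),
]
print(knn_classify((5.1, 5.1), training, k=3))
```

No training step appears in the sketch: the labelled set is simply stored and searched at query time, which is what makes K-NN an instance-based method.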

Proposed System
The proposed system for the classification of digital signals in cognitive radio consists of two phases: the training phase and the classification phase. Both phases are explained in detail in the following subsections. Three different types of digital modulation schemes are considered for the classification (PAM, 32QAM and 64QAM).

Training Phase
In the proposed method, the 2nd order cumulants of the real and imaginary parts of the complex envelope are used as features for the classification of digital signals. The training phase is shown in Fig. 1. The generated signal is first modulated by using the PAM, 32QAM and 64QAM modulation schemes. These modulated signals are passed through an AWGN channel with a predefined SNR level. The second-order statistical features are extracted from the received signals and stored in the database for classification. The SVM and KNN classifiers are trained by using the database generated in the training phase. The algorithm is as follows.

Algorithm I: Training Phase
[Input] Generated signals
[Output] The feature vectors of all modulated signals with noise, stored as a Database (DB)
1) Modulate the signal by using the PAM modulation scheme.
2) Pass the modulated signal through an AWGN channel with a predefined SNR level.
3) Calculate the 2nd order cumulants by Equation (5).
4) Repeat Step 3 for the real and imaginary parts of the complex envelope.
5) Insert this feature vector and the known class into the database.
6) Repeat the above steps for the 32QAM and 64QAM modulation schemes.
Figures 2-4 show the generated signal, modulated signal and 1 dB noisy signal for PAM, 64QAM and 32QAM respectively.
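The steps of Algorithm I can be sketched as follows. This is a rough illustration in plain Python rather than the MATLAB setup used in the study, and it rests on several assumptions that the text does not settle: a PAM order of 4, a cross-shaped 32QAM constellation, unnormalized constellations, random symbol sequences standing in for the generated signals, complex noise with power split equally between the in-phase and quadrature components, and the zero-lag cumulant (variance) of each part as the feature. All function names are hypothetical.

```python
import math
import random


def pam_constellation(order=4):
    """Real-valued M-PAM levels: -(M-1), ..., -1, +1, ..., +(M-1)."""
    return [complex(2 * m - order + 1, 0) for m in range(order)]


def qam_constellation(order):
    """Square 64-QAM, or cross-shaped 32-QAM (6x6 grid without its corners)."""
    if order == 64:
        levels = [-7, -5, -3, -1, 1, 3, 5, 7]
        return [complex(i, q) for i in levels for q in levels]
    if order == 32:
        levels = [-5, -3, -1, 1, 3, 5]
        return [complex(i, q) for i in levels for q in levels
                if not (abs(i) == 5 and abs(q) == 5)]
    raise ValueError("only 32-QAM and 64-QAM are sketched here")


def awgn(signal, snr_db, rng):
    """Add complex white Gaussian noise at the requested SNR (in dB)."""
    power = sum(abs(s) ** 2 for s in signal) / len(signal)
    noise_power = power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power / 2)  # split equally between I and Q
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in signal]


def variance(values):
    """Zero-lag second-order cumulant of a real sequence."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def build_database(snr_db, signals_per_class=400, length=1024, seed=0):
    """Algorithm I: modulate, add noise, extract features, store with label."""
    rng = random.Random(seed)
    schemes = {
        "PAM": pam_constellation(4),  # PAM order is an assumption
        "32QAM": qam_constellation(32),
        "64QAM": qam_constellation(64),
    }
    database = []
    for label, points in schemes.items():
        for _ in range(signals_per_class):
            symbols = [rng.choice(points) for _ in range(length)]
            received = awgn(symbols, snr_db, rng)
            features = (variance([r.real for r in received]),
                        variance([r.imag for r in received]))
            database.append((features, label))
    return database


# Small run for illustration; the study uses 400 signals of 1024 samples.
database = build_database(snr_db=10, signals_per_class=5, length=256)
print(len(database), database[0])
```

Each database entry pairs a two-element feature vector with its known class, which is the form a classifier needs for training.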

Classification Phase
In the classification phase, the unknown signal is classified as one of the three modulation types. The second-order statistical features are extracted from the unknown signal and this feature vector is compared with the features in the database by using the SVM and KNN classifiers. The algorithm is as follows.

Algorithm II: Classification Phase
1) Calculate the 2nd order cumulants of the unknown signal by Equation (5).
2) Repeat Step 1 for the real and imaginary parts of the unknown signal.
3) Test with the trained SVM and KNN classifiers and find the class of the unknown signal.

Performance Metrics
The performance of the proposed method for the classification of digital modulation schemes is measured by the confusion matrix, the classification accuracy and the Positive Predictive Value (PPV). These performance metrics are defined below.

Confusion Matrix
A confusion matrix represents information about the actual and classified cases produced by a classification system. The performance of such a system is commonly evaluated by counting the correctly and incorrectly classified patterns. The typical construction of the confusion matrix for the two-class problem is presented in Table 1.

Classification Accuracy
Classification accuracy is the most familiar method to evaluate the performance of classifiers. Classification accuracy is computed from the number of correctly classified digital signals in order to evaluate the efficiency and robustness of the algorithm. The classification accuracy is defined in Equation 17:

$$\text{Classification Accuracy} = \frac{\text{Total number of correctly classified signals}}{\text{Total number of signals}} \tag{17}$$

Positive Predictive Value (PPV)
The PPV or precision rate is the proportion of digital signals with positive test results which are correctly classified. It is a critical measure of the performance of the proposed method, as it reflects the probability that a positive test reflects the underlying condition being tested for. The PPV is defined in Equation 18:

$$\text{PPV} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \tag{18}$$
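The three metrics can be computed from lists of actual and predicted labels as sketched below. The function names are hypothetical, the row and column convention (rows are actual classes, columns are predicted classes) is an assumption, and the six labels in the example are made up for illustration; they are not results from the study.

```python
def confusion_matrix(actual, predicted, labels):
    """matrix[i][j] counts signals of class labels[i] classified as labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Correctly classified signals (the diagonal) over all signals."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def ppv(matrix, k):
    """TP / (TP + FP) for class k: diagonal entry over its column sum."""
    column_total = sum(row[k] for row in matrix)
    return matrix[k][k] / column_total if column_total else float("nan")


# Hypothetical labels, for illustration only.
labels = ["PAM", "32QAM", "64QAM"]
actual = ["PAM", "PAM", "32QAM", "32QAM", "64QAM", "64QAM"]
predicted = ["PAM", "64QAM", "32QAM", "32QAM", "64QAM", "64QAM"]
cm = confusion_matrix(actual, predicted, labels)
print(cm, accuracy(cm), [ppv(cm, k) for k in range(len(labels))])
```

In the multiclass case, the false positives of a class are all signals of other classes that were assigned to it, so the PPV of class k is its diagonal entry divided by the sum of column k.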

RESULTS AND DISCUSSION
A set of 400 signals with a segment size of 1024 samples is generated. These 400 signals are modulated by using the PAM, 32QAM and 64QAM modulation schemes and passed through an AWGN channel of predefined SNR level. The 400 signals per modulation scheme are separated into two sets: 300 signals per modulation scheme are randomly selected as the training set and the remaining 100 signals per modulation scheme form the testing set. The SNR levels used in the proposed system are 0, 1, 5 and 10 dB. For each modulation scheme, there are 1600 modulated signals corrupted by AWGN for each segment size, that is, 400 signals at each of the four SNR levels.
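The random 300/100 partition per modulation scheme can be sketched as follows. The function name `split_train_test` and the fixed seed are assumptions made for illustration; the text does not describe how the random selection was implemented.

```python
import random


def split_train_test(signals, n_train=300, seed=0):
    """Randomly pick n_train items for training; the rest form the test set."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    indices = list(range(len(signals)))
    rng.shuffle(indices)
    train = [signals[i] for i in indices[:n_train]]
    test = [signals[i] for i in indices[n_train:]]
    return train, test


# Integers stand in for the 400 signals of one modulation scheme.
signals = list(range(400))
train, test = split_train_test(signals)
print(len(train), len(test))
```

Applying the split separately to each modulation scheme keeps the classes balanced in both sets.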
Tables 2-5 show the confusion matrices obtained from the SVM classifier for 1024 samples at SNR levels of 0, 1, 5 and 10 dB respectively, together with the Positive Predictive Value (PPV). Tables 6-9 show the corresponding confusion matrices and PPV obtained from the KNN classifier at the same SNR levels. From the results it is concluded that the classification accuracy increases as the SNR increases and that the SVM outperforms the KNN classifier. Figure 5 shows the overall classification accuracy of the proposed system. In Table 1, among the 400 signals generated per modulation scheme for 0 dB, the true positive values for PAM, 64QAM and 32QAM are 116, 400 and 267 respectively. The overall classification rates for 1024 samples at 0, 1, 5 and 10 dB are 65.25%, 73.33%, 88.83% and 88.91% respectively. Figures 5-8 show the classification rate of the proposed system using the different classifiers at different noise levels for PAM, 64QAM and 32QAM respectively.
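As a quick arithmetic check on these figures, and assuming that the three true positive counts form the diagonal of the 0 dB confusion matrix over 3 × 400 signals, applying the definition of classification accuracy reproduces the first reported overall rate:

```latex
\text{Classification Accuracy}_{0\,\text{dB}}
  = \frac{116 + 400 + 267}{3 \times 400}
  = \frac{783}{1200}
  = 0.6525
  = 65.25\%
```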

CONCLUSION
In this study, an approach for multiclass signal classification based on second-order statistical features is presented. The 2nd order cumulants of the real and imaginary parts of the complex envelope are used as features for multi-signal classification. The proposed system is tested on three different modulation schemes: PAM, 32QAM and 64QAM. Two different classifiers, SVM and KNN, are used to classify the digital signals. The results show that the SVM classifier outperforms the KNN classifier for digital signal classification, and that the classification accuracy of the PAM scheme at 0 dB is much lower than that of the other schemes. The confusion matrix is used to evaluate the performance of the proposed system, and the experimental results indicate that the proposed system provides satisfactory performance for multi-signal classification.

Table 2. SVM classification accuracy for 1024 samples at 0 dB

Table 3. SVM classification accuracy for 1024 samples at 1 dB

Table 8. KNN classification accuracy for 1024 samples at 5 dB

Table 9. KNN classification accuracy for 1024 samples at 10 dB