Deep Learning Approaches for Impulse Noise Mitigation and Classification in NOMA-based Systems

Emerging networks such as smart grids, smart homes and the Internet of Things have enabled user accessibility across the globe and employ the non-orthogonal multiple access (NOMA) scheme to accommodate a huge number of connected devices. These devices, which include smart meters, sensors and actuators, suffer from impulse noise (IN) while operating with power systems. Furthermore, the NOMA scheme relies on power domain multiple access (PDMA), which is found to be susceptible to IN. Given this IN interference and its degrading effect on communication applications, novel mechanisms are desired to mitigate and classify the IN induced in the received signal. In this research work, novel IN mitigation and classification techniques are presented using deep learning methods for NOMA-based communication systems. IN mitigation is performed by first identifying the IN occurrences using a deep neural network (DNN), which learns the statistical traits of noisy samples, and then removing the harmful effect of IN from the detected occurrences. Using the proposed DNN, lower bit error rates (BER) were achieved when compared with the existing IN detection methods. The proposed method was further validated for high and low IN, and for weak and strong IN occurrence probabilities. Moreover, another deep learning network is proposed in this research work to distinguish between high IN and low IN in noise-contaminated NOMA symbols, which can help improve the performance of IN detection models. Both of the deep learning methods proposed in this study show strong potential to address the IN problem faced by the NOMA scheme.


I. INTRODUCTION
Non-orthogonal multiple access (NOMA) is an emerging technology for 5G which can fulfill the increasing demand for bandwidth in next generation networks. NOMA exhibits high spectral efficiency because the full bandwidth is allocated to each user at the same time and frequency, while users are differentiated by a power allocation strategy. In this scheme, the nearer user (lower-order user) receives less power from the Base Station (BS) than the farther user (higher-order user). At the transmitter side, the BS superimposes the signals of all users at the same time and frequency, whereas at the receiver side, the nearer user treats the signals of other users as interference and performs successive interference cancellation (SIC) to extract its own signal. Following this strategy, the farthest user decodes its signal without SIC and treats the signals of other users as noise, because its own signal contributes the largest part of the received signal [1].
Research conducted in the recent past shows increased usage of NOMA with background additive white Gaussian noise (AWGN) [2]. The authors further investigate the deteriorating effect of impulse noise in wireless and powerline communications in smart grids, health care monitoring applications, intra-vehicle communications, and industrial and mining environments. The sources of such noise were found to be automotive ignition, power lines, industrial equipment, household appliances, electronic devices, medical equipment, etc. In the literature, impulse noise (IN) is defined as noise that occurs for short instances randomly in time but carries a huge amount of power [3]. Although a few research studies [4]-[6] have discussed the strong impact of IN on NOMA-based communication systems, most researchers have used AWGN as the noise benchmark. As NOMA uses power domain multiple access, it is crucial to detect the occurrence of IN, since it can change the power level of the signal, which in turn degrades NOMA performance. The classical IN detection approaches perform blanking and clipping by setting an optimum threshold, which tends to vary in response to channel conditions and therefore causes model mismatch. Any change in the detection threshold of impulses in traditional methods can cause ambiguity in the receiver threshold of signal detection, thus deteriorating the performance of NOMA. The unique traits of the NOMA scheme, namely non-orthogonal resource allocation and subsequent interference cancellation, have additionally increased the user's sensitivity to such IN interference [5]. These issues demand more research to evaluate the impact of such noise on NOMA-based systems and strategies for suppressing it. Moreover, investigation of efficient IN classification methods can significantly help develop better IN mitigation strategies.
In this research paper, two deep learning strategies are proposed to overcome the above challenges in NOMA-based communication systems: one for IN detection and the other for high/low IN classification. The proposed IN mitigation approach first detects and then suppresses IN. For this purpose, the data samples with induced IN are identified using a deep neural network (DNN) and then such instances are decoded to remove the noise. The developed DNN scheme works independently of the traditional noise mitigation methods such as blanking and clipping. This ensures greater success in IN detection as compared to the traditional methods. The performance of the proposed DNN-based method is measured in terms of bit error rate (BER) and compared in a NOMA-based system with the conventional approaches presented in the literature. Furthermore, the performance of the DNN for IN mitigation is also measured for high and low IN, and for weak and strong impulse occurrence probabilities, considering a NOMA user pair scenario.
Although the occurrence of impulses is random in nature, the probability of high impulse occurrence is generally low, but occasionally it becomes high.

A. EFFECT OF IMPULSE NOISE IN NOMA-BASED SYSTEMS
Extensive literature on NOMA, mostly in the context of next generation networks such as smart grids and the Internet of Things (IoT), has presented several challenges in NOMA system implementation. The effect of IN on NOMA uplink [4] and downlink [6] systems has also been analyzed. The work in [4] presented the outage performance of NOMA uplink systems in the presence of IN. The vulnerability of the NOMA system to IN was validated by thorough Monte-Carlo simulations along with analytical results. In [6], the authors presented the impact of IN on the sum rate capacity of NOMA downlink systems. A practical IN scenario was used to quantify the actual loss due to IN. The authors further examined the performance of NOMA in the presence of composite noise (impulse with AWGN) over a Rayleigh fading channel. A pairwise loss of bits (LoB) expression was derived and further applied to develop a union bound on the bit error rate (BER). The work quantified the variation in channel conditions experienced by NOMA users in the presence of composite noise. In [5], the authors investigated the performance degradation of NOMA-based IoT networks due to IN and proposed a mitigation technique. A deep learning based multistage nonlinear solution was proposed to estimate the IN parameter for received orthogonal frequency division multiplexing (OFDM) symbols originating from the power domain multiple-NOMA (PDM-NOMA) scheme.

B. IN MITIGATION TECHNIQUES: MEMORYLESS NONLINEAR MITIGATION APPROACHES
The threshold-based IN scheme is described as a memoryless nonlinear mitigation approach and includes methods like blanking [8], clipping [9], and blanking/clipping [10]. In this mitigation approach, the short duration and the high amplitude of IN are detected using a threshold, whose selection remains a challenging task. A threshold optimization technique based on the Neyman-Pearson criterion is presented by the authors in [14]. In [11], the researchers provided an analytical equation for IN mitigation using blanking and clipping. A comparison of several analog domain processing techniques for IN alleviation is presented in [15], which showed that the threshold value is the key parameter for improving the performance of threshold-based nonlinear solutions. The model becomes mismatched when the threshold changes in response to channel conditions. As a consequence, the performance of all the conventional threshold-based methods suffers in harsh impulsive environments.

C. IN MITIGATION TECHNIQUES: MACHINE LEARNING BASED MITIGATION APPROACHES
Deep learning is a popular machine learning method which has been applied [13], [16], [17] to IN mitigation in applications pertaining to signals and images, and to power/bandwidth allocation in communication networks. Due to its inherent ability to estimate noise patterns and uncertainty in samples, a deep learning network can be used for IN detection, but it requires appropriate processing strategies and network structures.
In an Orthogonal Frequency-division Multiple Access (OFDMA) system, the performance of conventional mitigation approaches can be degraded by the high Peak to Average Power Ratio (PAPR) of OFDM signals. Therefore, there is a trade-off between false positives and true detections in the conventional threshold-based approaches. To address these drawbacks, the authors in [7] proposed a deep learning based IN mitigation technique for an OFDM based communication model. The presented IN elimination strategy was implemented in two parts, where IN detection was followed by IN elimination. First, a DNN was developed to identify the IN-affected signal samples. Then, in the second stage, the identified IN-affected samples were either clipped or blanked. In [18], a DNN processed the run-time sample value using the output of a median deviations filter as an input feature for IN detection. In [19], the Rank-Ordered Absolute Differences (ROAD) statistic was used to determine whether the run-time sample is affected by IN. A comparison of the proposed DNN with the state-of-the-art methods has been made in Table 1.
Although DNNs have not been fully utilized for IN mitigation in NOMA-based systems in the past, state-of-the-art work has been presented in the literature on resource allocation, optimization, and receiver design for NOMA-based systems. In [20], the authors presented end-to-end optimization of NOMA with the help of a unified deep multi-task learning framework termed DeepNOMA, which treats the non-orthogonal overlapped transmissions as multiple distinctive but correlated learning tasks. DeepNOMA was composed of DeepMUD and DeepMAS modules, which correspond to multiuser detection and signature mapping, respectively. To ensure fairness among tasks and to avoid getting stuck in local optima, the authors proposed a novel multi-task balancing loss function. In [21], for NOMA joint signal detection, the authors developed a novel deep learning aided end-to-end receiver which simultaneously realizes estimation, equalization, and demodulation. The authors of [22] proposed deep learning-aided multi-user detection (DeepMuD) to enable massive machine type communication through multiuser detection in uplink NOMA. The proposed DeepMuD model greatly improves the error performance of uplink NOMA and performs better than the conventional detectors. In [23], the authors presented a deep neural network based on a uniform signal processing architecture. Owing to the universal function approximation property of DNNs, the authors achieved end-to-end optimization of NOMA transceivers. They optimized the transceivers to match cross-layer behaviors and used the DNN to extract user access behaviors from the time-series signals. The integration of neural computation and NOMA to achieve low-cost data transmission with high efficiency was further analyzed.

D. RESEARCH CONTRIBUTION
The article [5] presented a theoretical framework on performance degradation and mitigation of IN in the context of NOMA-based IoT networks. In contrast, in our research work, a model is developed to analyze the effect of IN in NOMA-based systems, and the mitigation approaches are supported by both mathematical and simulation justifications, which is a superior contribution.
The model presented in [5] determines only the blanking/clipping threshold using a deep neural network (DNN) for NOMA user pair and then uses the computed threshold (a classic mitigation technique) for IN mitigation. In comparison, our proposed DNN approach (a novel approach) has identified the IN contaminated bits in the NOMA user pair and removed the harmful effect of IN. Additionally, the performance of proposed DNN approach is evaluated in terms of low/high impulse and strong/weak likelihood of IN. An analysis of DNN performance for a NOMA user pair is also shown in which one performs SIC (user 1) and the other one does not (user 2). Moreover, classification of IN as low and high IN using a DNN is also performed. The presented classification method aims to improve the performance of proposed mitigation approach.

III. SYSTEM MODEL
The system model for IN detection and classification in NOMA symbols using a DNN is shown in Fig. 1. The communication channel is contaminated by Rayleigh fading and IN. In the pre-processing step, conversion of NOMA noisy/noiseless symbols into sample bits is carried out which is followed by DNN training. The trained DNN then performs either the detection or the classification of IN on test NOMA symbols.

A. IN MODELS
The proposed mitigation system employs the following IN models :

1) Bernoulli-Gaussian Model
According to the Bernoulli-Gaussian model, the noise level can be expressed as:
$$n = n_G + b\, n_I \quad (1)$$
where $n_G$ and $n_I$ are zero-mean Gaussian noise terms with variances $\sigma_G^2$ and $\sigma_I^2$, respectively [24], [25]. Here $b$ is a Bernoulli random sequence that marks the arrival of IN with probability $p$, independent of $n_G$ and $n_I$. The noise is an independent and identically distributed (i.i.d.) random variable with probability density function (pdf) given by:
$$f(v) = (1 - p)\, G(v, 0, \sigma_G^2) + p\, G(v, 0, \sigma_G^2 + \sigma_I^2)$$
where $G(v, \mu_x, \sigma_x^2)$ is a Gaussian pdf with mean $\mu_x$ and variance $\sigma_x^2$. The average noise power $\xi_0$ is given by:
$$\xi_0 = \sigma_G^2 + p\, \sigma_I^2$$
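The Bernoulli-Gaussian model above can be sampled directly, which is a convenient way to check the average noise power numerically. The following is a minimal sketch, not the authors' simulation code: the function name `bernoulli_gaussian_noise`, the real-valued noise, and the parameter values ($p = 0.1$, $\sigma_G = 1$, $\sigma_I = 10$) are all assumptions chosen for illustration.

```python
import numpy as np

def bernoulli_gaussian_noise(num_samples, p, sigma_g, sigma_i, rng=None):
    """Draw i.i.d. real-valued samples n = n_G + b * n_I (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    n_g = rng.normal(0.0, sigma_g, num_samples)  # background Gaussian noise
    n_i = rng.normal(0.0, sigma_i, num_samples)  # impulsive Gaussian component
    b = rng.random(num_samples) < p              # Bernoulli(p) impulse indicator
    return n_g + b * n_i, b

# Assumed illustrative values, not taken from the paper.
noise, impulse_mask = bernoulli_gaussian_noise(
    200_000, p=0.1, sigma_g=1.0, sigma_i=10.0, rng=np.random.default_rng(0)
)
avg_power = np.mean(noise ** 2)               # empirical average noise power
expected_power = 1.0 ** 2 + 0.1 * 10.0 ** 2   # sigma_G^2 + p * sigma_I^2
```

With enough samples, `avg_power` should approach `expected_power`, and the fraction of `True` entries in `impulse_mask` should approach $p$.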

2) Laplacian-Gaussian Model
In comparison to the Bernoulli-Gaussian noise model, the pdf of Laplacian noise with zero mean and variance $2c^2$ has a heavier tail and decays more slowly toward zero [26], [27].
where r is rate of arrival of IN, defined as Bernoulli random variable with pdf of Laplacian-Gaussian model.

B. IN ANALYSIS IN NOMA-BASED SYSTEMS
For non-orthogonal downlink transmission, the BS transmits signals to multiple user equipments (UEs) at the same time and frequency. Each user utilizes a fraction of the total power $P$; therefore multiple users can use the same frequency bandwidth, and they are distinguished by the power level that is allocated by the BS. The decoder detects the signal of a user by successive interference cancellation (SIC) while the other users are treated as noise [28]. For two user model, received symbol for user 1 (lower-order user) in the presence of IN can be written as: And received symbol for user 2 (higher-order user) in the presence of IN can be written as: where $s_1$ and $s_2$ are the transmitted symbols, $\alpha_1$ and $\alpha_2$ are the power allocation coefficients and $h_1$ and $h_2$ are the channel coefficients for user 1 and user 2 respectively.
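To make the two-user power-domain scheme concrete, the sketch below superimposes two BPSK streams at the BS and recovers them at each user, with SIC at user 1. It is an illustration under stated assumptions rather than the paper's model: the superposition form $x = \sqrt{\alpha_1 P}\, s_1 + \sqrt{\alpha_2 P}\, s_2$, the values $\alpha_1 = 0.2$, $\alpha_2 = 0.8$, $P = 1$, the complex Rayleigh channel, perfect channel knowledge, and all function names are hypothetical. Noise is omitted so that the detection logic can be checked exactly; impulsive noise would simply be added to the received samples.

```python
import numpy as np

rng = np.random.default_rng(1)
num_bits = 1000
P = 1.0                     # total transmit power (assumed)
alpha1, alpha2 = 0.2, 0.8   # assumed split: the near user 1 gets less power

bits1 = rng.integers(0, 2, num_bits)
bits2 = rng.integers(0, 2, num_bits)
s1 = 2.0 * bits1 - 1.0      # BPSK symbols for user 1
s2 = 2.0 * bits2 - 1.0      # BPSK symbols for user 2

# Superposition coding at the BS (assumed form).
x = np.sqrt(alpha1 * P) * s1 + np.sqrt(alpha2 * P) * s2

def rayleigh_channel(size, rng):
    """Complex Gaussian coefficients whose magnitude is Rayleigh distributed."""
    return (rng.normal(size=size) + 1j * rng.normal(size=size)) / np.sqrt(2.0)

def detect_direct(y, h):
    """Decode the dominant (user 2) symbol, treating the rest as noise."""
    return (np.real(np.conj(h) * y) > 0).astype(int)

def detect_with_sic(y, h):
    """User 1: decode user 2's symbol, cancel it, then decode own symbol."""
    s2_hat = 2.0 * detect_direct(y, h) - 1.0
    residual = y - h * np.sqrt(alpha2 * P) * s2_hat
    return (np.real(np.conj(h) * residual) > 0).astype(int)

h1 = rayleigh_channel(num_bits, rng)
h2 = rayleigh_channel(num_bits, rng)
y1 = h1 * x                 # noise-free received samples at user 1
y2 = h2 * x                 # noise-free received samples at user 2

bits1_hat = detect_with_sic(y1, h1)
bits2_hat = detect_direct(y2, h2)
```

In the noise-free case both users recover their bits exactly, because the user 2 component dominates the sign of the superimposed symbol whenever $\alpha_2 > \alpha_1$.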
For the $i$-th user, the received symbol in the presence of IN can be written as: where $s_i$ and $s_j$ are the transmitted symbols for the $i$-th and $j$-th users respectively and $h_i$ is the channel coefficient for the $i$-th user.
In the NOMA scheme, the signals of users closer to the BS are treated as interference and those of users farther from the BS are treated as noise. The second term in (10), $\chi = \sum_{j=1, j \neq i}^{i-1} s_j \sqrt{\alpha_j P}\, h_i$, represents the inter-cell interference symbols for UE$_i$. Also, $\chi_0 = \sum_{j=1, j \neq i}^{i-1} \alpha_j P |h_i|^2$ is the average power of the inter-cell interference, which is considered as noise. Here $\alpha$ is the power allocation coefficient, with $\alpha_1 + \alpha_2 + \alpha_3 + \cdots + \alpha_i = 1$, and $h$ is the fading coefficient of the wireless channel. Now, the above equation becomes: The channel is represented as a Rayleigh fading channel. The pdf of Rayleigh fading channel with random variable $v = h$ is given by: and its $m$-th moment is: We can define the average received signal-to-noise ratio (SNR) as: where $E_b$ is the bit energy and $E[h^2]$ is the 2nd moment. Let $\gamma$ denote the instantaneous output SNR, defined as follows: Since $|h|$ is Rayleigh distributed, $|h|^2$ is chi-square distributed with two degrees of freedom. Consequently, $\gamma$ is also chi-square distributed. Hence, the pdf of $\gamma$ can be expressed as: The BER is given by: Substituting the values of $P_e(e/u)$ and $f_\gamma(u)$:
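For orientation, the averaging step described above can be written out with textbook results. The expressions below are a sketch under assumptions that are not taken from the paper: coherent BPSK, noise that is Gaussian with spectral density $N_0$ for a given noise state, and a Rayleigh scale parameter $\sigma_h$ (a symbol introduced here). They are not a reconstruction of the paper's own numbered equations.

```latex
% Assumptions (not from the source): coherent BPSK, conditionally Gaussian
% noise with density N_0, Rayleigh scale parameter \sigma_h.
\begin{align}
f_{|h|}(v) &= \frac{v}{\sigma_h^2}\, e^{-v^2/(2\sigma_h^2)}, \quad v \ge 0,
 & E\!\left[|h|^m\right] &= (2\sigma_h^2)^{m/2}\, \Gamma\!\left(1 + \tfrac{m}{2}\right) \\
\bar{\gamma} &= E\!\left[|h|^2\right] \frac{E_b}{N_0},
 & \gamma &= |h|^2\, \frac{E_b}{N_0} \\
f_\gamma(u) &= \frac{1}{\bar{\gamma}}\, e^{-u/\bar{\gamma}}, \quad u \ge 0,
 & P_e(e/u) &= Q\!\left(\sqrt{2u}\right) \\
P_b &= \int_0^\infty P_e(e/u)\, f_\gamma(u)\, du
 & &= \frac{1}{2}\left(1 - \sqrt{\frac{\bar{\gamma}}{1 + \bar{\gamma}}}\right)
\end{align}
```

The exponential form of $f_\gamma(u)$ is the chi-square distribution with two degrees of freedom mentioned in the text, and the closed form for $P_b$ is the standard single-user Rayleigh result; interference and impulsive noise terms would change the effective $\bar{\gamma}$.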

IV. EXISTING IN SUPPRESSION STRATEGIES

A. BLANKING
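In the blanking method, IN is cancelled at the receiver by replacing the received sample with zero (null) whenever its magnitude exceeds the blanking threshold. Only the amplitude is changed; the phase and other parameters are unaffected [12]:
$$y_k = \begin{cases} r_k, & |r_k| \le T_b \\ 0, & |r_k| > T_b \end{cases}$$
where $r_k$ is the received sample, $y_k$ is the output sample and $T_b$ is known as the blanking threshold.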
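The blanking rule described above amounts to a single masked assignment. The following is a minimal sketch; the function name `blank` and the sample values are hypothetical.

```python
import numpy as np

def blank(r, t_b):
    """Zero every sample whose magnitude exceeds the blanking threshold t_b."""
    r = np.asarray(r)
    return np.where(np.abs(r) > t_b, 0, r)

# Assumed example: samples at -3.0 and 4.0 exceed the threshold of 2.0.
blanked = blank([0.5, -3.0, 1.0, 4.0], t_b=2.0)
```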

B. CLIPPING
The clipping method cancels the IN added by the channel by limiting the magnitude of the received sample to the level $T_c$ whenever it exceeds $T_c$. In clipping, only the amplitude is changed; the phase and other parameters are unaffected [12]:
$$y_k = \begin{cases} r_k, & |r_k| \le T_c \\ T_c\, e^{j \arg(r_k)}, & |r_k| > T_c \end{cases}$$
where $r_k$ is the received sample, $y_k$ is the output sample and $T_c$ is known as the clipping threshold.
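Clipping can be sketched as a magnitude rescaling that leaves the phase untouched. The function name `clip_magnitude` and the sample values are hypothetical.

```python
import numpy as np

def clip_magnitude(r, t_c):
    """Limit each sample's magnitude to t_c while preserving its phase."""
    r = np.asarray(r, dtype=complex)
    mag = np.abs(r)
    safe_mag = np.where(mag > 0, mag, 1.0)           # avoid division by zero
    scale = np.where(mag > t_c, t_c / safe_mag, 1.0)
    return r * scale

# Assumed example: -3.0 and 4j exceed the threshold of 2.0.
clipped = clip_magnitude([0.5, -3.0, 4j, 0.0], t_c=2.0)
```

For real-valued samples this reduces to clamping the value to the interval $[-T_c, T_c]$.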

C. BLANKING/CLIPPING
The blanking/clipping hybrid method clips the received sample to the level $T_1$ if its magnitude lies between the thresholds $T_1$ and $T_2$. The received sample is replaced by zero (null) if its magnitude is above $T_2$, and remains unchanged if it is below $T_1$. In this method, only the amplitude is changed; the phase and other parameters are unaffected [12]:
$$y_k = \begin{cases} r_k, & |r_k| \le T_1 \\ T_1\, e^{j \arg(r_k)}, & T_1 < |r_k| \le T_2 \\ 0, & |r_k| > T_2 \end{cases}$$
where $r_k$ is the received sample, $y_k$ is the output sample and the blanking threshold ($T_2$) is greater than the clipping threshold ($T_1$).
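The hybrid rule combines the two previous operations in three magnitude regions. A minimal sketch follows; the function name `blank_clip` and the sample values are hypothetical.

```python
import numpy as np

def blank_clip(r, t1, t2):
    """Keep samples up to t1, clip those in (t1, t2], blank those above t2."""
    if not t1 < t2:
        raise ValueError("clipping threshold t1 must be below blanking threshold t2")
    r = np.asarray(r, dtype=complex)
    mag = np.abs(r)
    safe_mag = np.where(mag > 0, mag, 1.0)   # avoid division by zero
    clipped = r * (t1 / safe_mag)            # magnitude t1, same phase
    out = np.where(mag > t1, clipped, r)     # clip the middle region
    return np.where(mag > t2, 0.0, out)      # blank the top region

# Assumed example with t1 = 2.0 and t2 = 5.0.
mitigated = blank_clip([0.5, -3.0, 6.0, 4j], t1=2.0, t2=5.0)
```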

V. PROPOSED DNN FOR IN DETECTION
To detect IN, a deep learning strategy is devised in the form of a deep neural network (DNN) which searches out the instances of IN-corrupted samples. Deep learning is a subset of machine learning in which multi-layer neural networks learn from data and improve their own performance through training. In this section, the layout of the proposed DNN, its input features, its output and the signal detection approach are presented.

A. DNN LAYOUT
The proposed DNN is designed with three fully connected hidden layers $U^{[1]}$, $U^{[2]}$ and $U^{[3]}$, which consist of $n_1$, $n_2$ and $n_3$ neurons respectively. These layers map the input to the appropriate outputs. The numbers of layers and neurons are chosen by repeated experiments such that the minimum training loss is achieved. The input to the DNN is a set of three features denoted by $z = [z_1, z_2, z_3]^T$, which are discussed in the next subsection, and the output is represented by $o/p$. An activation function $af$ at each layer connects the preceding layer to the following layer using a parameter matrix and a bias vector. The parameter matrix $W$ and bias vector $q$ connect the hidden layers with each other and to the output layer, with dimensions set by the number of neurons $n_r$ chosen for the $r$-th layer. The output consists of one layer, which generates a binary sequence of '0' and '1'. The relationship between the input, the hidden layers and the output layer $o/p$ is given by:
$$U^{[1]} = af^{[1]}(W^{[1]} z + q^{[1]}) \quad (22)$$
$$U^{[2]} = af^{[2]}(W^{[2]} U^{[1]} + q^{[2]}) \quad (23)$$
$$U^{[3]} = af^{[3]}(W^{[3]} U^{[2]} + q^{[3]}) \quad (24)$$
$$o/p = af^{[4]}(W^{[4]} U^{[3]} + q^{[4]}) \quad (25)$$
Here the hidden layers use the Rectified Linear Unit $ReLU(z)$ as the activation function, which gives 0 when the input $z$ is less than 0 and $z$ otherwise. The activation function applied at the output is the Sigmoid function, denoted by $Sgmd(z)$, whose value is rounded off to give either 0 or 1 as the output. The training performance of the DNN is measured using a cost function which calculates the average error associated with the predicted output for a training dataset. The error cost function is directly proportional to the difference between the true output and the predicted output and is given as [29].
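The layer relationships above correspond to a short forward pass. The sketch below is illustrative only: the layer widths 20, 20 and 10 follow the simulation parameters reported later in the paper, while the random weight initialisation, the batch layout (features in rows) and the helper names `init_params` and `forward` are assumptions. The weights are untrained, so the decisions carry no meaning beyond showing the data flow.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(layer_sizes, rng):
    """Random (W, q) pair per layer; the initialisation scheme is an assumption."""
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), (n_out, n_in)), np.zeros((n_out, 1)))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def forward(params, z):
    """z has shape (3, batch); returns sigmoid outputs of shape (1, batch)."""
    u = z
    for w, q in params[:-1]:
        u = relu(w @ u + q)        # hidden layers U[1], U[2], U[3]
    w, q = params[-1]
    return sigmoid(w @ u + q)      # output layer before rounding

rng = np.random.default_rng(0)
params = init_params([3, 20, 20, 10, 1], rng)  # 3 features, 20/20/10 hidden, 1 output
z = rng.normal(size=(3, 5))                    # five hypothetical feature vectors
prob = forward(params, z)
decision = (prob >= 0.5).astype(int)           # rounded to 0 or 1
```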
Here $k$ indexes the training samples, $\lambda$ is the regularization parameter and $n_r$ is the number of neurons in the $r$-th layer. The DNN uses the backpropagation algorithm and is trained with the Adam optimizer [30] to optimize the network performance. During training, the DNN computes the values of the parameter matrices $W$ and the bias vectors $q$ that reduce the error function $Error(W, q)$.

B. I/P FEATURES USED IN DNN FOR IN DETECTION
Correct input (I/P) feature extraction and a clear relationship between the features and the output can help reduce the over-fitting that is often faced while training a DNN. The input features should be chosen so that redundancy is removed from the learning patterns of the DNN, enabling better classification of a sample as corrupted or uncorrupted. The noise patterns in the data samples can be identified if the test sample is analyzed along with its neighboring samples. For this purpose, the input layer comprises the input sample value and the following two features:
1) ROAD Statistic Value: The ROAD statistic is widely used to detect noise in 2-D images. It gives a high value for noisy samples and a low value otherwise [19]. In this research work, the ROAD score is calculated using the following steps for 1-D sample data stored in a vector with dimensions of $1 \times 2n$:
(a) The deviation magnitude between the current sample $s_i$ and the remaining samples in the neighborhood (both right and left neighbors) is denoted by $Absd(i)$ and is calculated as:
$$Absd(i) = \left| s_i - [s_{i-n}, \ldots, s_{i-1}, s_{i+1}, \ldots, s_{i+n}] \right| \quad (29)$$
(b) Sort the $Absd(i)$ values in increasing order:
$$arr(i) = \mathrm{sort}(Absd(i)) \quad (30)$$
(c) The ROAD feature is computed by adding up the first $n$ values of $arr(i)$:
$$ROAD(i) = \sum_{k=1}^{n} arr_k(i) \quad (31)$$
2) Difference Median Output: The difference median output, denoted by $err_i$, is the deviation of the current sample $s_i$ from the output of a moving median filter (32), where the median is taken over any $2n + 1$ samples at a time.
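The two features can be computed with a sliding window over the sample vector. The sketch below follows steps (a) to (c) for the ROAD score and takes the difference median output as the absolute deviation from the window median. The absolute value, the edge padding at the boundaries, and the function names `road_1d` and `median_difference` are assumptions made for illustration.

```python
import numpy as np

def road_1d(samples, n):
    """ROAD score per sample using n left and n right neighbours."""
    s = np.asarray(samples, dtype=float)
    padded = np.pad(s, n, mode="edge")        # assumed boundary handling
    road = np.empty_like(s)
    for i in range(s.size):
        window = padded[i:i + 2 * n + 1]
        neighbours = np.delete(window, n)     # drop the centre sample s_i
        absd = np.abs(s[i] - neighbours)      # step (a)
        arr = np.sort(absd)                   # step (b)
        road[i] = arr[:n].sum()               # step (c)
    return road

def median_difference(samples, n):
    """Absolute deviation of each sample from the median of its 2n+1 window."""
    s = np.asarray(samples, dtype=float)
    padded = np.pad(s, n, mode="edge")        # assumed boundary handling
    windows = np.lib.stride_tricks.sliding_window_view(padded, 2 * n + 1)
    return np.abs(s - np.median(windows, axis=1))

# Hypothetical sequence with one impulse at index 3.
samples = np.array([1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0])
road = road_1d(samples, n=2)
err = median_difference(samples, n=2)
features = np.vstack([samples, road, err])    # z = [sample, ROAD, err] per column
```

Both features are large only at the impulse position, which is the behaviour the DNN relies on to separate corrupted from uncorrupted samples.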

C. IN SIGNAL DETECTION
A DNN is an approach that has been widely used to represent complex non-linear systems with the help of proper training. To learn about IN instances in the incoming sample values, the chosen DNN inputs include the incoming sample value, the difference median output feature and the ROAD statistic feature discussed above. The values of both input features are found to be high for an IN sample and low for a noiseless sample. Based on the learned statistical traits of noisy samples, the DNN output is a binary sequence of 0's and 1's, where 1 represents a detected noisy sample and 0 represents an uncorrupted sample.

VI. PROPOSED DNN FOR IN CLASSIFICATION
Because the amplitude of the impulse changes randomly, it is difficult to classify a noisy sample as a high or low impulse. To tackle this problem, a DNN is proposed to classify the IN into high and low impulse categories. Similar to the IN detection DNN in Section V, the classification DNN comprises three hidden layers with $n_1$, $n_2$ and $n_3$ neurons, one input layer and one output layer. The ReLU and Sigmoid functions were chosen as activation functions in the hidden layers and the output layer respectively. The DNN used the backpropagation algorithm for training and the Adam optimizer for optimizing the network performance. The input consists of the input sample and two computed features helpful in IN classification. The output consists of one layer which generates a binary sequence of '0' and '1'. A '0' at location $k$ of the output indicates that the received sample $k$ was corrupted with low IN, whereas a '1' indicates that the received sample contains high IN. The relationships among the input, hidden and output layers have already been explained in Section V.

A. I/P FEATURES USED IN DNN FOR IN CLASSIFICATION
Although the arrival of IN and its amplitude are random, in theory it follows a binomial distribution. According to this distribution, high amplitude samples occur less often than low amplitude samples in NOMA symbols. The high and low noise patterns in the corrupted samples can be identified if the current sample is analyzed along with its neighboring samples. For this purpose, the following two features, which are helpful in IN classification, were chosen as input features along with the input sample value: 1) Difference median output: The difference median output, denoted by $err_i$, is defined in (32) in Section V.

A. SIMULATION PARAMETERS
The key simulation parameters common to the training of both DNNs were set to the following values: learning rate $\eta = 0.01$ (a hyper-parameter for fine-tuning the DNN); regularization parameter $\lambda = 0.1$; number of sample bits $n = 5$; and $n_1 = 20$, $n_2 = 20$ and $n_3 = 10$ neurons in the first, second and third hidden layers respectively. These simulation parameters are also reported in Table 2 for a better understanding.

B. DNN PERFORMANCE FOR IN DETECTION
The proposed DNN is trained using $10^6$ NOMA-based BPSK symbols defined in (8) and (9), which are contaminated by IN represented by the Bernoulli-Gaussian noise model in (1). In Fig. 2, the BER performance of the proposed DNN-based IN mitigation approach is evaluated for an average probability of impulse occurrence. Fig. 2 compares the BER performance of DNN-based mitigation with blanking, clipping and blanking/clipping for IN detection. In general, BER performance is observed to improve with an increase in the SNR. It is evident that the DNN outperforms the conventional methods in terms of BER. The results depicted in Fig. 2 show that blanking and clipping are very susceptible to noise at low SNR, where finding a suitable threshold to distinguish between desired and contaminated signals is a tough task. In contrast, the proposed DNN has successfully detected IN even at low signal levels. This can be seen at 5 dB SNR, where the proposed DNN method has detected approximately 0.1 Mbits more true symbols out of 1 Mbits compared to the conventional methods. At high SNRs, the DNN identified 700 more true symbols compared to the conventional methods. This is because the conventional methods perform better detection at high SNR, and therefore the performance difference between the proposed DNN and the conventional methods decreases.
It is noteworthy that the level of performance degradation depends upon the user order. Higher-order users perform fewer SIC operations than the others and benefit from higher received power. However, they experience interference from the lower-order users. In principle, a power coefficient $\alpha$ determines the transmit signal power of the lower-order user (user 1) as $\alpha_1$ and that of the higher-order user (user 2) as $\alpha_2 = 1 - \alpha_1$. Higher-order users are therefore quite sensitive to any additional noise/interference. In addition, it is observed that low IN power and a low IN occurrence rate, which do not affect the performance of time division multiple access (TDMA) users, have a degrading impact on NOMA users. Specifically, the performance of the higher-order NOMA users is affected, and this degradation is expected to increase as the number of UEs increases. Considering this scenario, we now show that the proposed DNN-based approach can efficiently eliminate the IN and retrieve the theoretical performance of the NOMA user pair, specifically for the higher-order user, which is more sensitive to noise. Fig. 3 illustrates the performance of the NOMA user pair using the conventional method (blanking/clipping) and the DNN-based IN mitigation technique. The comparison reveals that at 5 dB SNR, the performances of user 1 and user 2 are approximately the same, hence a comparison is insignificant. At 30 dB, the DNN identified 700 more true symbols for user 1 and 2000 more true symbols for user 2 compared to the conventional method. Due to more contaminated symbols in the user 2 signal, the BER performance of user 2 is poorer than that of user 1. Fig. 3 also shows that the DNN performs more consistently for user 1 than for user 2 in IN mitigation. User 1 utilizes SIC to eliminate inter-user interference and therefore has to counter IN only. In contrast, user 2 is inherently more affected by both inter-user interference and IN, and therefore its BER performance varies more across SNR values.
The DNN response for low and high impulse scenarios is evaluated by testing the DNN on a NOMA user pair in Fig. 4. A relatively high IN (50 dB) scenario as well as a lower IN (20 dB) scenario have been considered. Overall, high or low IN has less effect on user 1 and user 2 for SNRs below 10 dB. As the SNR increases, the performance of the DNN is observed to improve. This is because at high SNR the ratio of signal to IN is improved, so the signal level becomes traceable for successful IN detection. In Fig. 5, the DNN responses for different values of the parameter $p$ are evaluated by testing the DNN on a NOMA user pair. The test scenario includes a high value of the parameter ($p = 0.5$) for a strong likelihood of impulse occurrence and a low value ($p = 0.2$) for a weak likelihood of impulse occurrence. It is observed that the effect of the impulse probability parameter is very low, because the DNN training is influenced more by whether the impulse is high or low than by the occurrence probability of the impulse. The DNN is trained on 1 Mbits of data samples so that the effect of the probability on the result is minimal.

C. DNN BASED CLASSIFICATION OF IN
The training set consists of $10^6$ NOMA symbols with a range of $E_b/N_0$ values that enables performance evaluation under various noise interference scenarios. The performance of the DNN is evaluated for IN classification as high impulse and low impulse noise for 100 highly noise-contaminated NOMA symbols at 5 dB SNR. A numerical value of 2 is considered as the normalized threshold level for IN classification. Since IN arrival is random, the proposed DNN has classified the IN based on the probability of arrival and the median amplitude deviation from the neighboring samples as input features.
In Fig. 6, high impulse detection using the DNN is shown. High impulses are represented by red stems, and low impulses which are removed using the DNN are represented by blue circles on the x-axis. The x-axis represents 100 random IN samples and the y-axis represents the normalized amplitude of the impulses. In Fig. 7, low impulse detection is demonstrated, where low impulse noise is represented by blue stems and high impulses which are removed using the DNN are represented by red circles on the x-axis. It is evident from Fig. 6 that all identified high impulses are higher than the normalized threshold level of 2. In Fig. 7, all identified low impulses are lower than the normalized threshold level of 2, which shows the robust performance of the DNN. The classification of impulses gives information about their nature, which depends upon the type of source generating the impulse noise. It is clearly visible from Fig. 6 and Fig. 7 that most of the impulses are either low or high amplitude impulses and only two impulses are in the very high amplitude category. This type of information is valuable for setting the parameters of the DNN for IN detection.
The DNN has identified high IN in the incoming noisy NOMA symbols with an accuracy of 99% and low impulses with an accuracy of 87%. However, the DNN classified a few noiseless NOMA symbols as low IN instances. This is due to random IN peaks whose amplitude is similar to that of noiseless NOMA symbols and which therefore cannot be accurately determined. Overall, the proposed DNN classified a sufficient number of IN instances accurately. To the best of the authors' knowledge, IN classification in NOMA symbols using a deep learning approach has been carried out for the first time in this research work and is one of its significant contributions.

VIII. CONCLUSION
In this research work, we have presented deep learning approaches for IN mitigation and classification for NOMA symbols. The proposed method for IN mitigation has shown superior performance when compared with several existing mitigation techniques. Another deep learning approach developed for IN classification has been attempted for the first time to classify IN as high impulse or low impulse noise with fairly good accuracy which can be used to improve the performance of IN detection models. The achieved results using deep learning methods show that deep learning has strong potential to solve several complex communication problems and can be explored further.