INTRUSION DETECTION SYSTEM USING DEEP LEARNING METHODOLOGIES

AYUSH CHOUBEY, AVN KRISHNA

Abstract: Intrusion Detection Systems (IDS) are the backbone of efforts to secure organizations and individuals against malicious internet traffic. Deep learning is a field of computer science that enables us to build effective Artificial Intelligence (AI) models applicable to a variety of domains. In this paper, we discuss the CSE-CIC-IDS2018 dataset for internet intrusion detection and provide a detailed study and analysis of various deep learning approaches that could be used to build a secure intrusion detection system. We test the accuracy of these algorithms and their effectiveness in detecting malicious traffic through multiclass classification of the traffic into 14 classes, covering benign traffic and malicious traffic. The outcome is a deep-learning-based model framework for an intelligent IDS that could potentially be used for real-time traffic security.

A difficulty common to most problems of the kind discussed above is the lack of abundant data. Data is useful when it is available in sufficient quantity, is in the right format and contains relevant information, since AI algorithms need large quantities of data to learn the patterns behind the problem at hand. The dataset discussed in this paper and used for training is the CSE-CIC-IDS2018 dataset. This is an intrusion detection dataset that contains real-time internet traffic data categorized into 14 classes: one for benign traffic and 13 for various kinds of exploits. There are 80 columns that represent fields such as the port number of the data packet. The Literature Review section of the paper discusses the breakdown of the dataset further.
This paper is an in-depth analysis of deep learning algorithms such as Convolutional Neural Networks (CNNs). The Literature Review section discusses these algorithms in detail. The Methodology section deals with how the models are applied to the dataset at hand. The paper also discusses a possible framework for a deep-learning-based Intrusion Detection System. It is aimed at beginners who are new to the field of artificial intelligence and want to learn more about its applications, as well as at professionals working on deep learning and cyber security applications who want deeper insight into what to use and how to apply the concepts.

LITERATURE REVIEW
The studies in this paper can be categorized as follows:
1. In-depth workings of deep learning algorithms.
2. Applications of AI in cyber security and their future relevance.

The CSE-CIC-IDS2018 dataset, the Intrusion Detection Systems dataset released in 2018 by the Communications Security Establishment (CSE) and the Canadian Institute for Cybersecurity (CIC), contains network traffic data organized into 10 different files, with more than 60 crore (600 million) data points and 80 columns.
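Before any model is trained on such files, it is usually worth checking how the traffic classes are distributed and mapping the class names to integer indices. The following is a minimal sketch of that step using only the Python standard library. The column names, label strings and rows in `SAMPLE_CSV` are illustrative assumptions rather than records taken from the dataset, and the helper names `class_distribution` and `encode_labels` are hypothetical.

```python
import csv
import io
from collections import Counter

# Tiny in-memory stand-in for one of the dataset's CSV files.
# Column names, label strings and values are assumptions for illustration.
SAMPLE_CSV = """Dst Port,Flow Duration,Label
443,112641,Benign
80,5007793,DoS attacks-Hulk
22,1822,SSH-Bruteforce
443,98012,Benign
"""


def class_distribution(csv_text, label_column="Label"):
    """Count how many rows belong to each traffic class."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)


def encode_labels(class_names):
    """Map each class name to an integer index (sorted for reproducibility)."""
    return {name: index for index, name in enumerate(sorted(class_names))}


counts = class_distribution(SAMPLE_CSV)
label_to_index = encode_labels(counts)
print(counts)
print(label_to_index)
```

With the real files, the same two helpers could be applied file by file and the counters summed, which keeps memory use low for a dataset of this size.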

STUDY OF DEEP LEARNING APPROACHES
Deep Learning is a subfield of Artificial Intelligence. The backbone of deep learning and the key differentiator between deep learning and traditional machine learning is the artificial neuron.
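As an illustration of what a single artificial neuron computes, the sketch below forms a weighted sum of the inputs, adds a bias and passes the result through an activation function. The choice of the sigmoid activation and the example numbers are assumptions made for this sketch; the paper does not prescribe them.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Compute activation(sum_i(inputs[i] * weights[i]) + bias)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


# Illustrative values: the weighted sum is (up to rounding) zero,
# so the sigmoid output is approximately 0.5.
output = neuron_output([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], bias=-0.1)
print(output)
```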
Deep learning algorithms are built by connecting these artificial neurons into a network called an artificial neural network (ANN), or simply a neural network. Some of the networks are discussed below.

Each layer of such a network is described by the number of neurons present in the layer, the transformation it applies with the help of a weight matrix and a bias matrix, and the transformation function between each layer and the one before it.

A convolutional neural network (CNN) is similar to such a network except that it has special layers known as convolutional layers. The job of a convolutional layer is to pass its input through an 'n x n' matrix (a filter) defined for that particular layer, transforming the input for the layers that follow. The filters are used to detect features in the input provided. In the early layers, the filters may detect something as simple as an 'edge', but as the network grows deeper, the filters are able to detect whole objects. The process of passing every successive 'n x n' patch of pixels through the filters is called convolution. The pooling layers reduce the images in size by performing operations such as max-pooling and average-pooling.
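The convolution and max-pooling operations described above can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in the paper: it assumes a stride of 1 with no padding, a single square filter and a 2 x 2 pooling window, and, as deep learning libraries commonly do, it applies the filter without flipping it.

```python
def convolve2d(image, kernel):
    """Slide an n x n kernel over the image (stride 1, no padding)."""
    n = len(kernel)
    out_rows = len(image) - n + 1
    out_cols = len(image[0]) - n + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Element-wise product of the kernel and the n x n patch at (r, c).
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(n)
                for j in range(n)
            )
            row.append(total)
        output.append(row)
    return output


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# A 5 x 5 image with a vertical edge between the second and third columns.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# A 2 x 2 filter that responds to a dark-to-bright step from left to right.
vertical_edge_filter = [[-1, 1], [-1, 1]]

feature_map = convolve2d(image, vertical_edge_filter)
pooled = max_pool2d(feature_map)
print(feature_map)
print(pooled)
```

In this toy input the filter responds only where the edge lies, and pooling halves each dimension of the feature map while keeping that response.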

RELATED WORK
Ferrag et al. [1] present a similar study using the CSE-CIC-IDS2018 dataset and a Bot-IoT dataset, applying generative deep learning approaches to test how the algorithms hold up on these datasets. They present seven algorithms and test their efficiency, accuracy and false alarm rate. These algorithms are pitted against each other to determine which ones prove to be more stable. The CNN model proposed there achieves an accuracy of 97%, compared with 96% for the RNN-LSTM model and 82-97% for the DNN model.
The study conducted by Berman et al. [2] provides an insightful look at the workings of various deep learning neural networks such as generative adversarial networks, deep autoencoders and restricted Boltzmann machines. Their paper also discusses various kinds of attacks such as spam, malware-based attacks, bot-based attacks and SQL injection. Their work provides a basis for other researchers to advance the cyber security field with deep learning at its core.
Xin et al. [3] provide a study of the challenges faced in applying deep learning and machine learning to the field of cyber security. They work with the NSL-KDD and DARPA datasets.
Roopak et al. [6] propose various deep learning models for cybersecurity threat detection in IoT-based networks. They work with the CICIDS2017 dataset and predict DDoS attacks with an accuracy of 97.16%. The model they proposed was based on CNN + LSTM networks. They also present a brief comparative study against machine learning algorithms.
Alom et al. [7] presented an approach to building an IDS using the KDD-99 dataset. Using autoencoders and restricted Boltzmann machines, they reached 91.86% and 92.12% accuracy, which is significantly higher than that of unsupervised extreme learning machine algorithms. Pairing K-Means clustering with deep learning approaches proved to be a success in their work.
In their paper [8], Geluvaraj et al. state that as cybersecurity and machine learning advance, newer kinds of attacks emerge that may prove to be fatal. These attacks will be difficult to detect using traditional machine learning models. Cybersecurity will be aided by smart AI in building better security infrastructure.
Jian-hua [9] discusses the intersection of the two fields of cybersecurity and artificial intelligence. The work studies ongoing research on how cyber threats could be detected using AI-aided models and on the attacks that could be conducted on those AI models themselves, and provides insights into some defense mechanisms. Building on this, the work proposes a way to build encrypted neural networks that are secure themselves and could be used for cybersecurity threat detection.

DNN-Threat Identification.
The model achieved an accuracy score of 94.83% on the training dataset and 94.58% on the testing dataset.

CNN-Threat Identification.
The model achieved an accuracy score of 94.2% on the training dataset and 95% on the testing dataset.
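For reference, accuracy scores such as those reported above are obtained by comparing predicted classes with true classes. The sketch below shows one common way to compute accuracy together with precision, recall and F1-score for a single class in a multiclass setting. The label lists are made-up examples, not outputs of the models in this paper, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, positive_class):
    """One-vs-rest precision, recall and F1-score for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_class and p == positive_class)
    fp = sum(1 for t, p in pairs if t != positive_class and p == positive_class)
    fn = sum(1 for t, p in pairs if t == positive_class and p != positive_class)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up labels for illustration only.
y_true = ["Benign", "Benign", "DoS", "DoS", "Bot", "Benign"]
y_pred = ["Benign", "DoS", "DoS", "DoS", "Bot", "Benign"]

overall_accuracy = accuracy(y_true, y_pred)
dos_precision, dos_recall, dos_f1 = precision_recall_f1(y_true, y_pred, "DoS")
print(overall_accuracy, dos_precision, dos_recall, dos_f1)
```

On a dataset dominated by benign traffic, the per-class figures tend to be more informative than accuracy alone, because a model can score high accuracy while missing rare attack classes.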

CONCLUSION
In this paper, we conducted a study at the intersection of cyber security and artificial intelligence, covering areas of research that include Intrusion Detection Systems and deep learning. We learnt about the CSE-CIC-IDS2018 dataset and the types, methodologies and technologies behind the attacks. We learnt about some deep learning models and how neural networks are used to construct them. Using this knowledge, we proposed two different deep learning architectures and models to make a two-step IDS with intrusion detection and threat classification. We then performed a comparative study of the results achieved by these two architectures and four models using standard benchmarks such as accuracy, precision, recall and F1-score. This helped us arrive at a generalized framework for an IDS that could be built using deep learning methodologies.
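The control flow of a two-step IDS of the kind summarized above can be sketched as follows: a first model decides whether a flow is malicious, and only flagged flows are passed to a second model that names the threat class. The detector and classifier here are hypothetical hand-written rules standing in for trained networks, and the field names and thresholds are assumptions for illustration only.

```python
def two_step_ids(flow, detector, classifier):
    """Step 1: intrusion detection. Step 2: threat classification."""
    if not detector(flow):
        return "Benign"
    return classifier(flow)


# Hypothetical stand-ins for the trained detection and classification models.
def toy_detector(flow):
    """Flag a flow as malicious when its packet rate is unusually high."""
    return flow["packets_per_second"] > 1000


def toy_classifier(flow):
    """Name the threat class of a flow already flagged as malicious."""
    return "DoS" if flow["dst_port"] == 80 else "Other attack"


normal_flow = {"packets_per_second": 12, "dst_port": 443}
flood_flow = {"packets_per_second": 50000, "dst_port": 80}

print(two_step_ids(normal_flow, toy_detector, toy_classifier))
print(two_step_ids(flood_flow, toy_detector, toy_classifier))
```

Separating the two steps means the classifier only ever sees traffic that the detector has flagged, so each model can be trained and evaluated on its own task.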

CONFLICT OF INTERESTS
The author(s) declare that there is no conflict of interests.