Denial of Service (DoS) Attacks using PART Rule and Decision Table Rule

Introduction
Intrusion detection is an effective way of dealing with network security problems [1]. Network security has become a serious concern owing to the growth and expansion of Information Technology [2]. Advances in network technologies have also opened the way for intruders and hackers to gain unauthorised access to network systems. An effective and timely Intrusion Detection System, which helps to enhance the security of a network, is therefore needed to respond when attacks are noticed [3]. Intrusion detection is a security approach used to protect computer networks from unauthorised access [1].
An intrusion can be defined as any attempt that violates the basic elements of information security: confidentiality, integrity and availability [4]. Data mining needs to be applied in Intrusion Detection Systems because of the huge volume of existing intrusion data and of newly emerging network data [5]. Effective and efficient intrusion detection systems are needed because conventional intrusion detection approaches can no longer keep up with newly emerging data.
With the enormous amount of data available today, much of it containing duplicated records, choosing which data to use for optimal analysis becomes challenging. Data deduplication helps to remove this bottleneck by leaving a single copy of each record in a data set, which reduces the amount of data to be moved across the network [6].

Research Motivation
In the work of ref. [4], hypothesis testing was applied to the KDD dataset. The significant attributes (features) of the dataset were extracted, and the records of the thirteen significant attributes were used in the research. The training set was run on an existing Decision Tree algorithm, which produced a set of rules. The mean of each rule was determined and later used to form hypotheses. The accuracy of the system was tested using several detection metrics. There is, however, a need to validate the accuracy of the existing result by applying data deduplication together with other mining algorithms to the intrusion dataset, to help offer more accurate classification.

Research Objective
The objectives of the research work are to develop a data deduplication program, to classify the intrusion dataset using PART and Decision Table rules, and to carry out performance evaluation on the KDD dataset.

Methodology
A review of a few existing works was carried out. The NSL-KDD dataset, which is an improvement upon the KDD '99 data, was used. The records of Denial of Service (DoS) attacks and normal traffic, based on the thirteen significant attributes, were extracted; this subset contains eighteen thousand, one hundred and thirteen (18113) records. The dataset was run on a data deduplication program developed using C#.
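
The deduplication step can be illustrated with a short sketch. The authors' program was written in C# and is not reproduced in the text, so the Python below is only an assumed equivalent: it keeps the first copy of each record and treats two records as duplicates only when all attribute values match. The function names `deduplicate` and `reduction_percent` and the sample records are hypothetical.

```python
from typing import Hashable, Iterable, List, Sequence, Tuple

Record = Tuple[Hashable, ...]


def deduplicate(records: Iterable[Sequence[Hashable]]) -> List[Record]:
    """Keep the first occurrence of each record, preserving input order.

    Assumption: two records are duplicates only when every attribute value
    (including the class label) is identical.
    """
    seen = set()
    unique: List[Record] = []
    for rec in records:
        key = tuple(rec)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def reduction_percent(before: int, after: int) -> float:
    """Percentage of records removed by deduplication."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Hypothetical records: (protocol, service, flag, label)
    sample = [
        ("tcp", "http", "SF", "normal"),
        ("tcp", "http", "SF", "normal"),
        ("icmp", "ecr_i", "SF", "smurf"),
        ("tcp", "http", "SF", "normal"),
    ]
    unique = deduplicate(sample)
    print(len(sample), "->", len(unique))
    print(round(reduction_percent(len(sample), len(unique)), 1))
```

Applied to the Normal traffic counts reported for Table 1 (9711 records before deduplication, 7761 after), `reduction_percent` gives about 20.1.
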
Decision Table and PART rules, as implemented in the WEKA data mining environment, were used to classify the Denial of Service (DoS) attacks and normal traffic. The performance of the system was tested on the test data using classification rate, detection rate and false alarm rate, after which a comparative analysis was carried out against the work of Oladunjoye [7].
Table 1 shows the result obtained when the dataset was run on the data deduplication program. The 9711 records of Normal traffic were reduced to 7761, a 20.1% reduction. The 737 records of Apache2 were reduced to 440, a 40.3% reduction. The 359 records of Back were reduced to 65, an 82% reduction. The 7 records of Land were reduced to 3, a 57.1% reduction. The 293 records of Mail bomb were reduced to 4, a 98.6% reduction. The 4557 records of Neptune were reduced to 295, a 93.5% reduction. The 41 records of Ping of Death (PoD) were reduced to 14, a 65.8% reduction. The 685 records of Processtable were reduced to 367, a 46.4% reduction. The 665 records of Smurf were reduced to 10, a 98.5% reduction. The 12 records of Teardrop were reduced to 2, an 83.3% reduction. The 2 records of Udpstorm were reduced to 1, a 50% reduction, while the 994 records of Warezmaster were reduced to 180, an 80.9% reduction.

Abstract
Network security has become a major and critical issue as a result of the vast growth in the field of Information Technology. This paper adopted the result of an existing attribute selection (feature extraction) on the KDD '99 dataset. The dataset was run on data deduplication software developed in the C# programming language, and the final mining analysis was carried out in the Waikato Environment for Knowledge Analysis (WEKA) using PART and Decision Table rules.

Performance of rules generated using decision table rules
The performance on the test data of the rules generated using Decision Table rules (Table 2, Figures 1 and 2) shows that, out of 2323 records of Normal traffic, 2303 were correctly classified while 20 were wrongly classified. Out of 140 records of Apache2, 117 were correctly classified while 23 were wrongly classified. All records of Back, Neptune, PoD and Processtable were correctly classified. A record of Mail bomb was wrongly classified. Out of 2 records of Smurf, 1 was correctly classified while the other was wrongly classified. Out of 49 records of Warezmaster, 45 were correctly classified while 4 were wrongly classified. Teardrop and Udpstorm have no records in the test data.

Performance of rules generated using PART rules
The performance on the test data of the rules generated using PART rules (Table 3, Figures 3 and 4) shows that all records of Apache2, Back, Mail bomb, PoD, Processtable and Smurf were correctly classified. The 2 records of Land were wrongly classified. Out of 93 records of Neptune, 92 were correctly classified and 1 was wrongly classified. Out of 2323 records of Normal traffic, 2313 were correctly classified while 10 were wrongly classified. Out of 49 records of Warezmaster, 45 were correctly classified while 4 were wrongly classified. Teardrop and Udpstorm have no records in the test data.

Confusion matrix obtained from denial of service (DoS) and normal traffic using decision table rules
Table 4 shows the confusion matrix obtained from the Decision Table rules classification when the DoS attacks and Normal traffic test data were used. Out of 140 records of Apache2, 117 were correctly classified, while 21 and 2 were incorrectly classified as Neptune and Normal respectively. All records of Back, Neptune, Ping of Death (PoD) and Processtable were correctly classified. The 2 records of Land were incorrectly classified as Neptune. A record of Mail bomb was incorrectly classified as Normal. Out of 2323 records of Normal [...]
Key: NO = Normal, WM = Warezmaster, US = Udpstorm, TD = Teardrop, SM = Smurf, PR = Processtable, PD = PoD, NE = Neptune, MB = Mailbomb, LA = Land, BA = Back, AP = Apache2.

Confusion matrix obtained from denial of service (DoS) and normal traffic using PART rules
Table 5 shows the confusion matrix obtained from the PART rules classification when the DoS attacks and Normal traffic test data were used. All records of Apache2, Back, Mail bomb, Ping of Death (PoD), Processtable and Smurf were correctly classified. The 2 records of Land were incorrectly classified as Neptune. Out of 93 records of Neptune, 92 were correctly classified while 1 was incorrectly classified as Apache2. Out of 2323 records of Normal, 2313 were correctly classified, while 1, 1, 1, 5 and 2 were incorrectly classified as Apache2, Back, Mail bomb, Neptune and Warezmaster respectively. Out of 49 records of Warezmaster, 45 were correctly classified while 4 were incorrectly classified as Normal.

Performance evaluation with existing system
Table 6 shows the number of records that are correctly classified, incorrectly classified and not classified for each Denial of Service attack and for normal traffic.
[...] used and 99.1% for JRIP rules. It can be deduced that PART rules are competitively better for this type of classification than the other two methods.
Figure 6 shows the percentage of normal connections that are not correctly classified in the training and testing sets. The results show that the False Alarm Rate (FAR) is 0.86 when decision tree rules are applied, 0.43 when PART rules are used and 0.55 when JRIP rules are used. This indicates that the percentage of misclassified records is lowest when PART rules are used. PART rules are therefore preferable in terms of False Alarm Rate for this type of classification.
Figure 7 shows the percentage of attack connections that are correctly classified. The results indicate that 92.6% of attacks are correctly classified when decision tree rules are used, 98.3% when PART rules are used and 97.2% when JRIP rules are used. PART rules perform better than the other two methods in terms of sensitivity.
Figure 8 shows that the specificity is 99.1% when decision tree rules are used, 99.6% when PART rules are used and 99.4% when JRIP rules are used.
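
The metrics compared above can be computed directly from a confusion matrix. The text does not state its formulas, so the sketch below assumes the usual definitions, spelled out in the docstring; the function name `evaluate` and the dictionary layout are hypothetical. The example data contains only those rows of the PART confusion matrix (Table 5) whose counts are fully given in the text, so it is a partial matrix and not the complete table.

```python
from typing import Dict

# matrix[actual][predicted] = number of test records
ConfusionMatrix = Dict[str, Dict[str, int]]


def evaluate(matrix: ConfusionMatrix, normal: str = "normal") -> Dict[str, float]:
    """Return four metrics as percentages.

    Assumed definitions (not stated in the text):
      classification rate = records assigned their true class / all records
      false alarm rate    = normal records labelled as an attack / normal records
      specificity         = 100 - false alarm rate
      sensitivity         = attack records labelled as an attack / attack records
    """
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(row.get(actual, 0) for actual, row in matrix.items())

    normal_total = sum(matrix[normal].values())
    false_alarms = normal_total - matrix[normal].get(normal, 0)

    attack_total = total - normal_total
    missed_attacks = sum(
        row.get(normal, 0) for actual, row in matrix.items() if actual != normal
    )

    far = 100.0 * false_alarms / normal_total
    return {
        "classification_rate": 100.0 * correct / total,
        "false_alarm_rate": far,
        "specificity": 100.0 - far,
        "sensitivity": 100.0 * (attack_total - missed_attacks) / attack_total,
    }


# Rows of the PART confusion matrix that the text specifies in full.
# Back, Mail bomb, PoD and Processtable are omitted because their record
# counts are not given, so only the false alarm rate and specificity
# (which depend on the Normal row alone) are comparable with the paper.
PART_PARTIAL: ConfusionMatrix = {
    "normal": {"normal": 2313, "apache2": 1, "back": 1, "mailbomb": 1,
               "neptune": 5, "warezmaster": 2},
    "apache2": {"apache2": 140},
    "land": {"neptune": 2},
    "neptune": {"neptune": 92, "apache2": 1},
    "smurf": {"smurf": 2},
    "warezmaster": {"warezmaster": 45, "normal": 4},
}

if __name__ == "__main__":
    for name, value in evaluate(PART_PARTIAL).items():
        print(f"{name}: {value:.2f}%")
```

With 10 of the 2323 Normal records misclassified, the false alarm rate comes out at about 0.43%, consistent with the value reported for PART rules. The classification rate and sensitivity printed by this example are illustrative only, since four attack classes are missing from the partial matrix.
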

Conclusion
The results show that PART rules performed better than the other methods in terms of classification rate, false alarm rate, sensitivity and specificity.