An empirical study of reflection attacks using NetFlow data

Reflection attacks are one of the most intimidating threats organizations face. A reflection attack is a special type of distributed denial-of-service attack that amplifies the volume of malicious traffic by using reflectors and hides the identity of the attacker. Reflection attacks are known to be one of the most common causes of service disruption in large networks. Large networks perform extensive logging of NetFlow data, and parsing this data is an advocated basis for identifying network attacks. We conducted a comprehensive analysis of NetFlow data containing 1.7 billion NetFlow records and identified reflection attacks on the network time protocol (NTP) and NetBIOS servers. We set up three regression models: Ridge, Elastic Net and LASSO. To the best of our knowledge, no prior work has studied different regression models to understand patterns of reflection attacks in a large network. In this paper, we (a) propose an approach for identifying correlations of reflection attacks, and (b) evaluate the three regression models on real NetFlow data. Our results show that (a) reflection attacks on the NTP servers are not correlated, (b) reflection attacks on the NetBIOS servers are not correlated, (c) the traffic generated by those reflection attacks did not overwhelm the NTP and NetBIOS servers, and (d) the dwell times of reflection attacks on the NTP and NetBIOS servers are too small for predicting reflection attacks on these servers. Our work on reflection attack identification highlights recommendations that could facilitate better handling of reflection attacks in large networks.


Introduction
The capacity of large networks has grown significantly in order to sustain the level of performance required by machine learning, scientific and engineering applications. In this context, the security of the network has become critical to meeting the expectations of its users. Analyzing network attacks requires awareness of the sequence of events encountered by the network component. While recent works have focused on analyzing attacks on specific network components [1,2], answering how an attack on a network occurs requires an integrated approach towards correlation-based log mining [3,4]. Correlation analysis has been widely used to detect intrusions in large networks [5][6][7][8][9]; its strength lies in aggregating several alerts with low false positives and low false negatives, resulting in substantial improvements in detection accuracy. A recent study empirically evaluated the Pearson and Spearman-Rank correlation algorithms using NetFlow data obtained from an enterprise network [10], and observed that reflection attacks on the Secure Shell (SSH) and Domain Name Service (DNS) servers exist in the NetFlow data and that those attacks are not correlated.
Several recent large-scale Distributed Denial-of-Service (DDoS) attack studies have provided valuable insights into DDoS attacks [11][12][13][14]. These studies have shown that DDoS attacks are regularly executed against many network protocols. In a DDoS attack, a large volume of network packets is generated to flood a target host without using an intermediary. In contrast to DDoS attacks, which do not mask the sender's source IP address, a reflection attack is a special type of DDoS attack that uses any TCP or UDP-based service as a reflector and masks the sender's source IP address [15]. A spoofed network packet, where the source IP address is replaced by the IP address of another device, is typically used to direct the response to the victim. Thus, an attacker can magnify the amount of malicious traffic and obscure the sources of the attack traffic to cause significant disruption to the operation of a large network. As such, it is important to identify correlations of reflection attacks, as well as the dwell time between these attacks. We define the dwell time as the time elapsed between the start time of one reflection attack and the start time of the next reflection attack. When network attack prediction schemes are supported by knowledge of the dwell times of an attack, network administrators can use network attack mitigation schemes to respond to an impending attack [16]. When the dwell times of an attack are small, a network attack mitigation scheme which scatters the attack traffic can be used to absorb the attack.
Several recent works have developed Pearson correlation-based methods that identified DDoS attacks [17], identified reflection attacks [10], detected activities of groups of bots [18], and detected network intrusions [19]. S. Chawla et al. [17] proposed a framework that used Pearson correlation to identify DDoS attacks and flash events. D.P. Hostiadi and T. Ahmad [18] proposed a new model that detected correlations of activities of groups of bots. Their model consists of four phases: (a) data preprocessing, (b) data segmentation, (c) feature extraction and (d) bot group detection. They implemented (a) the Mean Absolute Error metric, which measures the similarity of activities between two groups of bots, and (b) the Pearson correlation algorithm, which finds relationships between the activities of two bot groups. A. Heryanto et al. [19] implemented a feature selection workflow that uses Pearson correlation to identify important network metrics for detecting intrusions. E. Chuah et al. [10] applied Pearson correlation and Spearman-Rank correlation to identify the dates of reflection attacks. Although these works showed that the Pearson correlation algorithm can identify relationships between malicious activities, it has some limitations that we address in this paper. First, Pearson correlation only identifies relationships between two samples. Second, several correlated samples can be produced, and all the correlated samples must be manually analyzed before an attack can be identified, which is undesirable because it is a time-consuming process that incurs a significant delay in identifying correlations of a network attack. Therefore, we use the power of regression models, which belong to a type of supervised learning that learns a relationship between a dependent variable and multiple independent variables. We train the Ridge, Elastic Net and LASSO regression models on NetFlow data to obtain the regression coefficients for all independent variables, and determine the applicability of these regression models in identifying correlations of reflection attacks.
In this paper, we conduct an empirical analysis of reflection attacks in a large enterprise network, carefully compare the Ridge, LASSO and Elastic Net regression models and present several new findings. The correlations of reflection attacks on the NTP server and on the NetBIOS server are new findings that have not been reported in an earlier paper [10]. We validate our approach on 1.7 billion NetFlow records obtained from a large enterprise network operated by Los Alamos National Laboratory, and apply statistical validation methods to ensure that the results are accurate. The main contributions of this paper are as follows:
• We identify reflection attacks in a large enterprise network and provide estimates of NetFlow records which are not correlated with the reflection attack.
• We analyze the NetFlow records which are associated with a reflection attack to drill down into their specific activity. Based upon the insights gained from our correlation analysis, we discuss how these findings can be used to improve the network's security against reflection attacks.
• We extract the NetFlow records associated with the reflection attack and obtain their dwell times.
Our initial assumption was that reflection attacks are correlated in the NetFlow data. We compared the Ridge, Elastic Net and LASSO regression models and were surprised to find that the regression coefficients learned by all three regression models are close or equal to 0. Furthermore, the dwell times of reflection attacks ranged from 0 to 198 seconds, multiple source and destination devices were associated with reflection attacks on the NTP and NetBIOS servers, and only a small percentage of network traffic was generated by the reflection attacks.
The remainder of this paper is organized as follows: First, we review the related works in Section 2. Then, we describe the network model and NetFlow data in Section 3. We present the motivation and describe the details of our approach in Section 4. Our evaluation on the NetFlow data obtained from an enterprise network is presented in Section 5. We discuss the results and limitations of our approach in Sections 6 and 7 respectively, and we conclude with a summary and future work in Section 8.

Related work
We divide the related works into two categories: (a) machine learning-based intrusion detection systems that detected DDoS attacks, and (b) correlation analysis-based intrusion detection systems that detected DDoS attacks.

Machine Learning-based Intrusion Detection Systems
We focused on very recent works that developed Intrusion Detection Systems (IDS) which integrate machine learning techniques to detect DDoS attacks. In [20], the authors proposed a natural language processing (NLP)-based approach called DDoS2Vec that learns the characteristics of DDoS attacks. They evaluated their approach on one year's worth of flow samples obtained from an Internet Exchange Provider and compared the performance of DDoS2Vec with Word2Vec, Dos2Vec and Latent Semantic Analysis. In [21], the authors proposed a novel approach that stacks multiple deep neural networks (DNN) to detect DDoS attacks. They evaluated their approach on a benchmark Cybersecurity dataset and compared the performance of their method with existing machine learning models. In [22], the authors implemented an approach that compared multiple Support Vector Machine (SVM) kernels trained with uncorrelated features to detect reflection amplification DDoS attacks on the Simple Network Management Protocol (SNMP) and DNS servers. In [23], the authors proposed a feature reduction method that integrated Information Gain (IG) and correlation-based feature selection techniques to detect reflection and standard DDoS attacks. They evaluated their method on two public Cybersecurity datasets and compared the performance of their approach with state-of-the-art feature selection methods. In [24], the authors developed a decision tree-based IDS that uses the J48 classifier to detect reflection amplification DDoS attacks, and evaluated their method on the CICDDoS2019 Cybersecurity dataset. In [25], the authors proposed a novel technique that combines clustering and classification machine learning algorithms. Their technique consists of three phases. In the first phase, the DBSCAN clustering algorithm is used to separate DDoS traffic from normal traffic. In the second phase, the Euclidean distance metric is used to calculate the features in each cluster. In the third phase, a classification model is
built and a label that indicates whether a cluster contains DDoS traffic or normal traffic is assigned to each cluster. They evaluated their method on two public Cybersecurity datasets and compared the performance of their classifier with the Decision Tree, Random Forest, Naive Bayes and Support Vector Machine classifiers. In [26], the authors compared the Random Forest (RF), Naive Bayes (NB), Logistic Regression (LR), K-Nearest Neighbour (KNN) and Multilayer Perceptron (MLP) algorithms to filter normal traffic from DDoS traffic, and evaluated these algorithms on two public DDoS datasets. In [27], the authors proposed a deep learning model to detect DDoS attacks. They evaluated their model on the CICDDoS2019 dataset, which includes traces of DDoS attacks on several network protocols, and showed that a three-layer deep neural network model achieved the highest detection accuracy. Further, in [28], the authors proposed an approach called MOP-IDS. It is based on a Multi-Objective Optimization (MOO) process composed of: (a) clustering alerts generated by multiple IDS to decrease the set of alerts, (b) filtering alerts to create a set of potential false alarms, (c) grouping similar alerts produced by the different IDS and (d) classifying an alert as a false positive or false negative. The performance of MOP-IDS was evaluated using accuracy, true positive rate and true negative rate metrics on three public Cybersecurity datasets that contain denial-of-service traces. In [29], the authors proposed a novel approach called CANN. It is based on K-Means clustering and sums two distances: (a) the distance between a data sample and its cluster center and (b) the distance between a data sample and its nearest neighbour. A new one-dimensional distance-based feature is created and used by the K-Nearest Neighbour classifier to classify each data sample as normal or abnormal. The performance of CANN was evaluated using accuracy, detection rate and true positive rate metrics on a public
Cybersecurity dataset that contains traces of DDoS attacks. In [30], the authors proposed a two-stage machine learning architecture that uses (a) the K-Means clustering algorithm to detect an attack, and (b) Decision Trees (DT), Random Forest, Adaptive Boosting (AB) and Naive Bayes algorithms to classify several types of attacks. They evaluated their architecture on a public Cybersecurity dataset that includes traces of denial-of-service attacks, and showed that the decision tree and random forest algorithms achieved the highest classification accuracy. In [31], the authors compared the performance of Modest Adaboost (MA), Real Adaboost (RA) and Gentle Adaboost (GA) on five public Cybersecurity and DDoS datasets. They showed that (a) the error rate of Modest Adaboost is higher than the error rates of Gentle and Real Adaboost and (b) Gentle and Real Adaboost have the same error rate performance. In [32], the authors proposed a method that uses L2 regularization and dropout techniques to improve the performance of Convolutional Neural Networks (CNN) for IDS. They showed that their method achieved the highest precision, recall and F1-scores compared to several popular machine learning techniques, and evaluated their method on a Cybersecurity dataset that contains traces of DDoS attacks. In [33], the authors proposed a modified System Call Graph (SCG) that uses a Deep Neural Network (DNN) to integrate information from different detection techniques. They evaluated their approach on three Cybersecurity datasets that include traces of DDoS attacks, and showed that their model achieved high detection rates and low false positives.

Correlation Analysis-based Intrusion Detection Systems
We focused on very recent works that developed Intrusion Detection Systems which integrate correlation techniques to detect DDoS attacks. In [34], the authors introduced a new feature selection method called CorrCorr. It uses the Multivariate Correlation (MC) and Addition-Based Correlation (ABC) methods to generate feature correlations and normal network traffic profiles from which anomalies that deviate from the normal profile are detected. They evaluated their method on two public Cybersecurity datasets that include traces of DDoS attacks. In [35], the authors present a tool that efficiently correlates cross-host attacks across multiple hosts. Their tool uses tagged provenance graphs that model the techniques and operational procedures used by an attacker. They define a novel Graph Similarity-based Alert Correlation (GSAC) technique that determines the entities that are associated with alerts generated on different hosts, and evaluated their tool on two public Cybersecurity datasets that contain attack traces on multiple hosts. In [36], the authors proposed a distributed denial-of-service attack detection method that combines the Enhanced Random Forest (ERF) ensemble learning method and an Optimized Genetic Algorithm (OGA). In [37], the authors demonstrated an approach that utilizes Multivariate Correlation analysis to identify DDoS attacks in real time. In [38], the authors presented a correlation-based approach that transformed clusters of alerts into graph structures and computed signatures of repeated network patterns to characterize clusters of alerts. They evaluated their approach on real-world attack scenarios that include DDoS attacks. In [39], the authors proposed an efficient framework for correlating alerts in early warning systems. Their framework combines statistical and stream mining techniques to extract sequences of alerts that are part of multistep attack scenarios, and was evaluated on two DDoS attack scenarios. In [40], the authors proposed a hybrid model
that integrated Multi-Feature Correlation (MFC) and a deep neural network. They evaluated their model on the UNSW-NB15, AWID, CICIDS 2017 and CICIDS 2018 Cybersecurity datasets, which include traces of DDoS attacks.

Summary
To the best of our knowledge, no prior work has compared multiple regression models to identify correlations of reflection attacks on the NTP servers and on the NetBIOS servers. A summary of the main attributes of the reviewed works is given in Table 1. Differently from the works in Table 1, we (a) develop an approach that evaluates the ability of the LASSO, Ridge and Elastic Net regression models to identify correlations of reflection attacks on the NTP servers and on the NetBIOS servers, (b) identify the devices and network traffic associated with the NTP and NetBIOS server reflection attacks, and (c) identify the dwell times between reflection attacks on the NTP and NetBIOS servers.

Network Model and Data
In this section, we present the network model to which our approach is applicable in Section 3.1. Then, we describe the NetFlow data in Section 3.2.

Network Model
Our approach is based on a generic client-server network model as depicted in Fig. 1 [41]. The network consists of client devices, servers and routers. Client devices are separate computers that access a service made available by a server. The server is another computer that provides the service, which the client accesses by way of the network. Traffic between these networks is managed by the router, which forwards packets to their destination Internet Protocol (IP) addresses. The workflow for an Intrusion Detection System (IDS) consists of two phases, as depicted in Fig. 2 [42]. In phase 1, network packet data between a client and a server, two clients or two servers is collected by the Router, and then the data is sent to the Data store, which aggregates the data. Once the data is aggregated, in phase 2 the Analysis console retrieves the data and analyzes it to identify an attack.


NetFlow Data
The NetFlow data is collected on most networks [43]. An example of a NetFlow record is given as follows:

6652800, 4666, Comp107130, Comp584379, 6, Port04167, 443, 130, 82, 71556, 55117

This NetFlow record contains 11 fields. The first field contains the start time (6652800). The second field contains the duration of the communications between the source and destination devices (4666). The third and fourth fields contain the source (Comp107130) and destination (Comp584379) devices, respectively. The fifth field contains the network protocol number (6, i.e., TCP). The sixth and seventh fields contain the source (Port04167) and destination (443) ports, respectively. The eighth and ninth fields contain the number of packets (130) and bytes (82) sent by the source device, respectively. The tenth and eleventh fields contain the number of packets (71556) and bytes (55117) sent by the destination device, respectively.
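The field layout above can be read into a structured form with a few lines of code. The following is a minimal sketch; the field names are our own labels for illustration, not names defined by the dataset, and port fields are kept as strings because some hold anonymized names (e.g., Port04167) rather than numbers.

```python
# Hypothetical field labels for the 11 comma-separated NetFlow fields
# described in the text; not names from the dataset's documentation.
FIELDS = [
    "start_time", "duration", "src_device", "dst_device", "protocol",
    "src_port", "dst_port", "src_packets", "src_bytes",
    "dst_packets", "dst_bytes",
]

def parse_netflow(line: str) -> dict:
    """Split one comma-separated NetFlow record into named fields."""
    values = [v.strip() for v in line.split(",")]
    record = dict(zip(FIELDS, values))
    # Convert the purely numeric fields; device and port names stay strings.
    for key in ("start_time", "duration", "protocol",
                "src_packets", "src_bytes", "dst_packets", "dst_bytes"):
        record[key] = int(record[key])
    return record

record = parse_netflow(
    "6652800, 4666, Comp107130, Comp584379, 6, Port04167, 443, 130, 82, 71556, 55117"
)
```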

Identifying Correlations of Reflection Attacks
To execute a reflection attack, an attacker uses a tactic called IP (Internet Protocol) spoofing, which replaces the real sender's source IP address with the IP address of another device, as depicted in Fig. 3. This causes the target device to respond to the request and send the answer to the victim host's IP address. For example, a firewall may be configured to allow port 137 (i.e., NetBIOS) traffic so that computers on a local area network can communicate with network hardware and transmit data across the network. An attacker can take advantage of such a rule in the firewall and use some NetBIOS servers as intermediaries to execute a reflection attack on other NetBIOS servers.

Fig. 3 IP address spoofing process.
Thus, our objective is to determine if reflection attacks are "correlated" or "not correlated". By "correlated", we mean the NetFlow records which are assigned the largest positive regression coefficients by the regression model. By "not correlated", we mean the NetFlow records which are assigned regression coefficients close to 0 by the regression model. In this paper, we aim to identify correlations of reflection attacks in the NetFlow data. The research problem that we address is given as follows: Given (a) the NetFlow data, (b) a network protocol number, and (c) a range of dates:
• Identify the NetFlow records which are assigned the largest positive regression coefficients or the smallest regression coefficients by the regression model.
• Identify the devices which are associated with the reflection attack and obtain the amount of traffic which is generated by the attack.
• Identify the time elapsed between the start times of two adjacent NetFlow records which are associated with the reflection attack.
As such, the workflow we propose consists of three phases, as depicted in Fig. 4. The first phase is Data preprocessing. It extracts the features in the NetFlow data and organizes them into data structures. After the data structures are generated, the second phase, Regression models training, applies different regression algorithms to learn the regression coefficients of multiple features given a target feature. This phase corresponds to "identifying" correlations of reflection attacks from the NetFlow data. Then, the third phase, Regression models validation, applies statistical validation techniques to determine whether the regression model's estimated values are close to the observed values in the data. Next, we present the details of each of the three phases in the workflow.

Data Preprocessing
In the data preprocessing phase, the goal is to present the NetFlow data in a structured format so that the data can be easily processed by data analysis algorithms [44]. To attain this, we need to address three issues: (a) the NetFlow data contains vectors of network traffic, (b) the NetFlow records are unlabelled, and (c) the magnitude, range and unit of the feature values are different. By unlabelled, we mean that there are no NetFlow records labelled as "malicious" or "benign" in the NetFlow data.

Data Formatting
The NetFlow data is captured in a way such that the network traffic is represented by four vectors corresponding to all the NetFlow records for one day. The four vectors are: (a) the number of packets sent by the source device, (b) the number of bytes sent by the source device, (c) the number of packets sent by the destination device, and (d) the number of bytes sent by the destination device. To address this issue, we construct a feature-count data matrix. In this data matrix, the columns represent the NetFlow records and the rows represent the samples of a vector in the NetFlow records. To construct the feature-count data matrix, we implemented a function in the Data preprocessor module. The process is given as follows:
• Obtain the number of NetFlow records and initialize a feature-count data matrix, where the numbers of columns and rows of the data matrix are equal to the number of NetFlow records.
• Fill the diagonals in the data matrix with the vector value contained in the number of packets sent by the source device, number of packets sent by the destination device, number of bytes sent by the source device or number of bytes sent by the destination device corresponding to the respective NetFlow record.
• Fill all the remaining cells in the data matrix with zero.
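The three steps above can be sketched as follows. This is a minimal illustration, assuming one chosen traffic vector (here, packets sent by the source device) with made-up values; the result is an n × n matrix whose diagonal holds the vector and whose remaining cells are zero.

```python
import numpy as np

def feature_count_matrix(vector):
    """Build an n x n matrix with one traffic vector on the diagonal
    (n = number of NetFlow records) and zeros everywhere else."""
    n = len(vector)
    matrix = np.zeros((n, n))
    np.fill_diagonal(matrix, vector)
    return matrix

# Illustrative values: packets sent by the source device, one per record.
src_packets = [130, 25, 4012]
M = feature_count_matrix(src_packets)
```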

Data Scaling
In data scaling, the values in the data are transformed so that they fit within a specific scale. The values in the NetFlow data vary in magnitude, range and unit. The number of packets sent by a source device is typically much smaller than the number of bytes sent by that device. Furthermore, the ranges of values for the number of packets sent and the number of bytes sent are different. Thus, in order for a regression model to interpret the features on the same scale, we need to perform data scaling.
There are two standard methods for scaling data values [45]: (a) normalization, and (b) standardization. Data normalization scales the data values into a range of [0, n]. In contrast, data standardization scales the data values to have a mean of 0 and a standard deviation of 1. Data normalization is useful when the data is needed within bounded intervals; however, it makes outliers difficult to identify. In contrast, data standardization preserves useful information about outliers, which makes the regression model less sensitive to them [45]. Thus, we scale the values in the feature-count data matrix so that the values are centered around the mean with a unit standard deviation.
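Standardization as described above can be sketched in a few lines. This is a minimal illustration on made-up values, applied column-wise as a regression model would expect; the guard for constant columns is our own addition.

```python
import numpy as np

def standardize(matrix):
    """Scale each column to mean 0 and standard deviation 1 (z-score)."""
    mean = matrix.mean(axis=0)
    std = matrix.std(axis=0)
    std[std == 0] = 1.0  # guard: leave constant columns unscaled
    return (matrix - mean) / std

# Illustrative packet and byte counts differing in magnitude and range.
X = np.array([[130.0, 82.0],
              [25.0, 7100.0],
              [4012.0, 551.0]])
X_scaled = standardize(X)
```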

Regression Models Training
After the feature-count data matrix is generated, we need to identify (a) which NetFlow records are correlated during the reflection attack, and (b) which NetFlow records are not correlated during the reflection attack. To attain this, we use the Ridge, LASSO and Elastic Net regression models to obtain the regression coefficients for multiple NetFlow records given a target NetFlow record.
Multiple linear regression (MLR) and polynomial regression (PLR) are standard regression algorithms that are widely used to model complex relationships with many variables [46]. Multiple linear regression models the relationship between the dependent variable and two or more independent variables using a straight line. In contrast, polynomial regression models the relationship between the dependent variable and two or more independent variables as an nth-degree polynomial. However, MLR and PLR models are susceptible to overfitting on the training data, which causes the model to perform poorly on new data. Differently from MLR and PLR models, which do not use regularization, the LASSO, Ridge and Elastic Net regression models use regularization to constrain the regression coefficients and improve the model's accuracy. Regularization is achieved by penalizing variables that have a large coefficient value. The functions for the LASSO, Ridge and Elastic Net regression models are given in Table 2. The LASSO regression model includes a penalty term called the L1-norm [47]. It sets the regression coefficients of some of the independent variables to zero. The Ridge regression model includes a penalty term called the L2-norm. Differently from the L1-norm, the L2-norm shrinks the regression coefficients of all the independent variables towards zero [48]. The Elastic Net regression model includes both L1-norm and L2-norm penalty terms [49].
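The behavioral difference between the three penalties can be seen on synthetic data. The sketch below uses scikit-learn's implementations on made-up data standing in for the standardized feature-count matrix; the penalty strengths (alpha, scikit-learn's name for λ) are illustrative, not values from our experiments. Only the first feature drives the target, so LASSO's L1 penalty zeroes out the irrelevant coefficients while Ridge merely shrinks them.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))                     # 5 candidate features
y = 3.0 * X[:, 0] + 0.1 * rng.standard_normal(100)    # only feature 0 matters

models = {
    "ridge": Ridge(alpha=1.0),                        # L2: shrinks all coefficients
    "lasso": Lasso(alpha=0.1),                        # L1: sets some coefficients to zero
    "enet":  ElasticNet(alpha=0.1, l1_ratio=0.5),     # mix of L1 and L2
}
coefs = {name: m.fit(X, y).coef_ for name, m in models.items()}
```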

Handling Bias in Regularized Regression Models
To perform regression analysis, we need to address two issues: (a) handle bias in the regularized regression model, and (b) select the penalty parameter. In statistics, bias is anything that leads to a systematic difference between the observed values in the data and the estimates produced by a regression model [46]. The LASSO, Ridge and Elastic Net regression models add a penalty term to the cost function. The penalty term penalizes a regression model with large regression coefficients, which reduces the model's variance. For example, if the number of packets sent by NTP server A ranges from 80,000 to 120,000 per minute and the number of packets sent by NTP server B ranges from 800 to 1,200 per minute, then a change of 1 packet represents a much larger relative change for NTP server B than for NTP server A, so NTP server B will tend to receive a much larger regression coefficient. The regularized regression model will then penalize NTP server B's coefficient more heavily, and a biased model can be produced. To resolve this issue, we standardize all the values in the feature-count data matrix. Then, we input the standardized feature-count data matrix into the LASSO, Ridge and Elastic Net regression models, train each model and obtain the fitted regression models.

Selecting the penalty parameter
The penalty parameter (λ) is a value that controls the amount of shrinkage of the regression coefficients in the LASSO, Ridge and Elastic Net regression models [46]. When λ = 0, no regression coefficients are removed. As λ increases, more regression coefficients are removed. When λ = ∞, all the regression coefficients are removed. To select the best value for λ, we use a general approach called k-fold cross-validation [44]. It extracts a portion of the data and sets it aside to be used as a test set. The remaining portions of the data are used as the training set. The regression model is trained on the training dataset. Then, the test dataset is used to test the regression model. 10-fold cross-validation is typically used to obtain the best λ value [44]. We implemented a function in the Regression models trainer module to perform 10-fold cross-validation and select the penalty parameter. The process is given in Algorithm 1:
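The selection procedure can be sketched with scikit-learn's built-in cross-validated LASSO, which tries each candidate λ (called alpha in scikit-learn) on 10 train/test splits and keeps the value with the lowest average validation error. This is a stand-in illustration on synthetic data, not a reproduction of Algorithm 1; the candidate λ values are made up.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.standard_normal(200)

# 10-fold cross-validation over a grid of candidate penalty values;
# LassoCV retrains on each fold and averages the validation errors.
cv_model = LassoCV(alphas=[0.001, 0.01, 0.1, 1.0], cv=10).fit(X, y)
best_lambda = cv_model.alpha_
```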

Regression Models Validation
Once the regression model is trained, we need to assess the model's accuracy. There are two standard metrics for measuring how close the regression model's estimated values are to the observed values in the data [45]: (a) the coefficient of determination (R²), and (b) the Root Mean Squared Error (RMSE). The coefficient of determination is the proportion of variation in the dependent variable that is predictable from the independent variables. Differently from R², RMSE is the square root of the average squared difference between the regression model's estimated values and the observed values. An RMSE value ranges between 0 and infinity. If the RMSE value is close to 0, the regression model replicated the observed values accurately. However, a large RMSE value is difficult to interpret. In contrast to the RMSE value, the R² value ranges between 0 and 1. If R² = 0, the regression model's estimated values are different from the observed values. If R² = 1, the regression model's estimated values match the observed values. Thus, we use the R² statistic to obtain the accuracy of the Ridge, LASSO and Elastic Net regression models.
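Both metrics follow directly from their definitions. The sketch below computes them on illustrative observed/estimated values; R² is one minus the ratio of residual to total sum of squares, and RMSE is the square root of the mean squared difference.

```python
import numpy as np

def r_squared(observed, estimated):
    """R^2 = 1 - SS_res / SS_tot: proportion of variance explained."""
    observed, estimated = np.asarray(observed), np.asarray(estimated)
    ss_res = np.sum((observed - estimated) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(observed, estimated):
    """Square root of the mean squared estimation error."""
    observed, estimated = np.asarray(observed), np.asarray(estimated)
    return float(np.sqrt(np.mean((observed - estimated) ** 2)))

obs = [10.0, 20.0, 30.0, 40.0]   # illustrative observed values
est = [12.0, 18.0, 31.0, 39.0]   # illustrative model estimates
```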

Accounting for Inflation in R²
The R² statistic is at least weakly increasing when more independent variables are added to the regression model. If redundant independent variables are included in the regression model, the R² value remains the same or increases. Consequently, the R² statistic alone cannot determine if the independent variables are useful. To resolve this issue, we obtain the adjusted R² value [50]. It determines whether adding more independent variables actually increases the regression model's fit. We implemented a function in the Regression models validator module to calculate the adjusted R². The formula for calculating the adjusted R² is [45]:

adjusted R² = 1 − (1 − R²)(n − 1) / (n − p − 1)

where n is the number of NetFlow records associated with the reflection attack and p is the total number of independent variables.
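The standard adjusted R² formula translates directly into code. The values for n and p below are illustrative; the adjusted value is never larger than the plain R², and the penalty grows as p approaches n.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1),
    where n is the number of observations and p the number of
    independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Illustrative: a low R^2 with n = 100 records and p = 1 variable.
adj = adjusted_r_squared(r2=0.03, n=100, p=1)
```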

Evaluation on an Enterprise Network
We conduct our study of reflection attacks on an enterprise network operated by Los Alamos National Laboratory. The network hosts 60,000 devices and provides storage and user account services. NetFlow data is collected in the network [51]. One day's worth of NetFlow data contains 220,000,000 NetFlow records on average. All the NetFlow records are unlabeled. It was reported that the NetFlow data contains compromised devices [52], but the times and number of compromised devices are not known. Thus, we randomly selected eight days' worth of NetFlow data for analysis.

Phase 1: Identify Correlations of Reflection Attacks
To ascertain whether reflection attacks are correlated, we first obtain the NetFlow records associated with a reflection attack. We implemented a function in our workflow to scan the NetFlow data and extract NetFlow records containing the same source and destination port numbers. We applied the function to the eight days of NetFlow data and identified reflection attacks on several network protocols, though we focused on the reflection attacks on the NTP and NetBIOS servers; DDoS attacks on the NTP and NetBIOS servers have been widely reported [12,13]. For each day, we assigned the first NetFlow record as the dependent variable and the remaining NetFlow records as independent variables. Then, we trained the Ridge, LASSO and Elastic Net regression models on each of four attributes in the NetFlow data separately and obtained the fitted regression models. The four attributes are: (a) number of packets sent by the source device, (b) number of bytes sent by the source device, (c) number of packets sent by the destination device, and (d) number of bytes sent by the destination device.
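The extraction step can be sketched as follows; the column names (`src_port`, `dst_port`, `src_packets`) and the sample rows are hypothetical, since the actual NetFlow schema is not reproduced here:

```python
import pandas as pd

# Hypothetical NetFlow records; the real schema may differ.
flows = pd.DataFrame({
    "src_port":    [123, 123, 137, 80, 123],
    "dst_port":    [123, 123, 137, 443, 123],
    "src_packets": [10, 12, 4, 7, 11],
})

# Keep records whose source and destination ports are equal -- the
# signature used here to flag candidate reflection-attack traffic.
candidates = flows[flows["src_port"] == flows["dst_port"]]

# Per the paper's setup: the first record's attribute is the dependent
# variable; the remaining records are the independent variables.
y = candidates["src_packets"].iloc[0]
X = candidates["src_packets"].iloc[1:].to_numpy()
```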

Reflection Attack on the NTP Server
First, we obtain the R² and adjusted R² values for the Elastic Net, Ridge and LASSO regression models trained on the number of packets sent by the source device attribute. The adjusted R² shows if adding more NetFlow records to the LASSO, Ridge and Elastic Net regression models increases the R² value. To obtain the adjusted R² value, we set p = 1 and n = the number of NetFlow records associated with the NTP server reflection attack. The R² and adjusted R² values are given in Table 3. From Table 3, we observed that (a) the R² and adjusted R² values for the Ridge regression model ranged from 0.01 to 0.03 on days 1, 3, 4, 5, 7 and 8, and were 0.05 on day 2 and 0.07 on day 6, (b) the R² and adjusted R² values for the Elastic Net regression model ranged from 0.01 to 0.03 on days 1, 4, 5, 6, 7 and 8, and were 0.04 on day 3 and 0.07 on day 2, and (c) the R² and adjusted R² values for the LASSO regression model ranged from 0.01 to 0.02 on days 1, 2, 3, 4, 6 and 7, and were 0.03 on day 8 and 0.06 on day 5. We also obtained the R² and adjusted R² values for the Ridge, Elastic Net and LASSO regression models trained on the number of bytes sent by the source device, number of packets sent by the destination device, and number of bytes sent by the destination device attributes. Their R² and adjusted R² values ranged from 0.01 to 0.07 over the eight days.
On all eight days, the R² and adjusted R² values for the Ridge, Elastic Net and LASSO regression models are close to 0, and the accuracy of all three regression models is the same. Furthermore, the ranges of R² and adjusted R² values for all three regression models trained on the four attributes separately are the same. Moreover, the R² and adjusted R² values for the Ridge, LASSO and Elastic Net regression models are the same, indicating that adding more independent variables to the three regression models did not increase the regression models' fit to the observed data. Thus, the number of packets sent by the source device attribute can be used as the primary attribute. Next, we obtain the residuals in the Elastic Net, Ridge and LASSO regression models trained on the number of packets sent by the source device attribute. A residual is the difference between the regression model's estimated value and the observed value in the data. Residual analysis belongs to a class of techniques for evaluating the goodness-of-fit of a fitted regression model. If a regression model is a good fit to the observed data, all its residual values will be close to or equal to 0. If a regression model is not a good fit to the observed data, some of its residual values will not be close to 0. To obtain the proportion of residuals, we implemented a function in the Regression models validator module. The process for obtaining the proportion of residuals is as follows: (a) obtain the residual value for each sample in the regression model, (b) obtain the percentage of all unique residual values, and (c) obtain the cumulative distribution of the percentages of unique residual values. The proportion of residuals in the Elastic Net regression model for day 1 is shown in Fig. 5. From Fig. 5, we observed that (a) the residuals range from 0 to 80, and (b) a proportion of the residuals are greater than 0.
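The three-step residual procedure can be sketched as follows (taking the magnitude of each residual is our assumption, consistent with the reported 0-to-80 residual range; the input values are hypothetical):

```python
import numpy as np

def residual_proportions(y_true, y_pred):
    """Cumulative distribution over the percentages of unique residuals."""
    residuals = np.abs(y_true - y_pred)           # step (a): one residual per sample
    values, counts = np.unique(residuals, return_counts=True)
    percentages = 100.0 * counts / counts.sum()   # step (b): % of each unique value
    return values, np.cumsum(percentages)         # step (c): cumulative distribution

# Hypothetical observed and estimated values for illustration.
vals, cdf = residual_proportions(np.array([1.0, 2.0, 4.0, 4.0]),
                                 np.array([1.0, 1.0, 2.0, 2.0]))
```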
When the residuals are greater than 0, the values estimated by the Elastic Net regression model differ from the observed values in the data. We obtained the proportion of residuals in the Elastic Net regression model for days 2 to 8. On all seven days, the residuals range from 0 to 80 and a proportion of those residuals are greater than 0. Next, we obtained the residuals from the Ridge and LASSO regression models for days 1 to 8. On all eight days, their residuals ranged from 0 to 80 and a proportion of those residuals are greater than 0. Next, we determine which one of the three regression models best fits the data. To achieve this, we apply a standard technique called the general linear F-statistic. The F-statistic is used to compare statistical models that have been fitted to a dataset, in order to identify the statistical model that best describes the population from which the data is sampled [45]. First, we define the null and alternate hypotheses. The null hypothesis is that the sum-of-squares error (SSE) of one regression model is close to the SSE of a different regression model. The alternate hypothesis is that the SSE of one regression model differs significantly from the SSE of a different regression model. The formula for computing the general linear F-statistic is [45]

F* = ((SSE(M1) - SSE(M2)) / (df_M1 - df_M2)) / (SSE(M2) / df_M2),

where M1 and M2 are two different regression models, and df_M1 and df_M2 are the degrees of freedom associated with regression models M1 and M2, respectively. When F* ≥ 3.95, we reject the null hypothesis in favour of the alternate hypothesis. We implemented a function in the Regression models validator module to obtain the F-statistic. We applied the general linear F-statistic to the Elastic Net, Ridge and LASSO regression models and obtained the F* value.
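A minimal sketch of the general linear F-statistic, assuming M1 is the reduced model and M2 the full model (the example SSE and degree-of-freedom values are hypothetical):

```python
def general_f_statistic(sse_m1, df_m1, sse_m2, df_m2):
    """General linear F-statistic comparing a reduced model M1 against a full model M2."""
    return ((sse_m1 - sse_m2) / (df_m1 - df_m2)) / (sse_m2 / df_m2)

# If the two models have nearly identical SSE, F* is near 0 and we
# fail to reject the null hypothesis (threshold 3.95 used here).
f_star = general_f_statistic(100.0, 10, 100.0, 8)
```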
A summary of F-tests on the Elastic Net, Ridge and LASSO regression models is given in Table 4. From Table 4, we observed that from days 1 to 8 the F* value is 0. Since F* < 3.95, we fail to reject the null hypothesis. On all eight days, a proportion of residuals in all three regression models are greater than 0, indicating that the values estimated by all three regression models differ from the observed values in the data. The residual values in all three regression models ranged from 0 to 80. Furthermore, the F* value for all three regression models is 0. When (a) the F* value is 0, (b) the range of residuals in all three regression models is the same, and (c) a proportion of residuals in all three regression models are greater than 0, the Elastic Net regression model can be used as the main model.
Next, we obtain the regression coefficients for all NetFlow records in the Elastic Net regression model. A summary of regression coefficients is given in Table 5. From Table 5, we observed that all regression coefficients obtained for days 1 to 8 are close to 0 or equal to 0. We obtained the regression coefficients of all NetFlow records from the Ridge and LASSO regression models for days 1 to 8. On all eight days, the regression coefficients of all NetFlow records in the Ridge and LASSO regression models are close to 0 or equal to 0. On days 1 to 8, the regression coefficients of all NetFlow records associated with the NTP server reflection attack are close to 0 or equal to 0, indicating that reflection attacks on the NTP servers are not correlated.

Reflection Attack on the NetBIOS Server
As was done with the NTP servers, we obtain the R² and adjusted R² values for the Elastic Net, Ridge and LASSO regression models trained on the number of packets sent by the source device attribute. We set p = 1 and n = the number of NetFlow records associated with the NetBIOS server reflection attack. The R² and adjusted R² values are given in Table 6. From Table 6, we observed that (a) the R² and adjusted R² values for the Ridge regression model ranged from 0.01 to 0.02 on days 1 to 8, (b) the R² and adjusted R² values for the Elastic Net regression model ranged from 0.01 to 0.02 on days 1 to 8, and (c) the R² and adjusted R² values for the LASSO regression model ranged from 0.01 to 0.02 on days 1 to 5, 7 and 8, and were 0.04 on day 6. We also obtained the R² and adjusted R² values from the Ridge, Elastic Net and LASSO regression models trained on the number of bytes sent by the source device, number of packets sent by the destination device, and number of bytes sent by the destination device attributes. Their R² and adjusted R² values ranged from 0.01 to 0.05 over the eight days.
On all eight days, the R² and adjusted R² values from the Ridge, Elastic Net and LASSO regression models are close to 0, and the accuracy of all three regression models is the same. Furthermore, the ranges of R² and adjusted R² values from all three regression models trained on the four attributes separately are the same. Moreover, the R² and adjusted R² values for the Ridge, LASSO and Elastic Net regression models are the same, indicating that adding more independent variables to the three regression models did not increase the regression models' fit to the observed data. Thus, the number of packets sent by the source device attribute can be used as the primary attribute.
Next, we obtain the residuals in the Elastic Net, Ridge and LASSO regression models trained on the number of packets sent by the source device attribute. The proportion of residuals in the Elastic Net regression model for day 1 is shown in Fig. 6. From Fig. 6, we observed that (a) the residuals range from 0 to 80, and (b) a proportion of the residuals are greater than 0. When the residuals are greater than 0, the values estimated by the Elastic Net regression model differ from the observed values in the data. We obtained the proportion of residuals in the Elastic Net regression model for days 2 to 8. On all seven days, the residuals range from 0 to 80 and a proportion of those residuals are greater than 0. Next, we obtained the residuals in the Ridge and LASSO regression models for days 1 to 8. On all eight days, the residuals in the Ridge and LASSO regression models ranged from 0 to 80 and a proportion of those residuals are greater than 0. Next, we determine which one of the three regression models best fits the data. We applied the general linear F-statistic to the Elastic Net, Ridge and LASSO regression models and obtained the F* value. A summary of F-tests on the Elastic Net, Ridge and LASSO regression models is given in Table 7. From Table 7, we observed that from days 1 to 8 the F* value is 0. Since F* < 3.95, we fail to reject the null hypothesis. On all eight days, a proportion of the residuals from the Elastic Net, Ridge and LASSO regression models are greater than 0, indicating that the values estimated by all three regression models differ from the observed values in the data. The residuals in all three regression models ranged from 0 to 80. Furthermore, the F* value for the Elastic Net, Ridge and LASSO regression models is 0.
When (a) the F* value is 0, (b) the range of residuals in all three regression models is the same, and (c) a proportion of residuals in all three regression models are greater than 0, the Elastic Net regression model can be used as the main model.
Next, we obtain the regression coefficients for all NetFlow records in the Elastic Net regression model. A summary of the regression coefficients is given in Table 8. From Table 8, we observed that all the regression coefficients obtained for days 1 to 8 are close to 0 or equal to 0. We obtained the regression coefficients of all NetFlow records in the Ridge and LASSO regression models for days 1 to 8. On all eight days, the regression coefficients of all NetFlow records in the Ridge and LASSO regression models are close to 0 or equal to 0. On days 1 to 8, the regression coefficients of all NetFlow records associated with the NetBIOS server reflection attack are close to 0 or equal to 0, indicating that reflection attacks on the NetBIOS servers are not correlated.

Phase 2: Identify the Devices and Traffic Generated by Reflection Attacks on NTP and NetBIOS Servers

The first phase of our analysis is characterized by the identification of correlations of reflection attacks on the NTP servers and correlations of reflection attacks on the NetBIOS servers. We observed that (a) reflection attacks on the NTP servers are not correlated, and (b) reflection attacks on the NetBIOS servers are not correlated. Our next objective is to identify the devices and the amount of traffic generated by the NTP and NetBIOS server reflection attacks. To realize this, we obtain the source and destination devices associated with those reflection attacks, and count the number of unique source and destination devices associated with the NTP server reflection attacks. A summary of source and destination devices is given in Table 9. From Table 9, we observed that from day 1 to day 8, multiple source and destination devices are associated with reflection attacks on NTP servers.
Next, we obtain (a) the number of packets and bytes sent by the source and destination devices associated with the NTP server reflection attacks, and (b) the total number of packets and bytes transmitted in the network. The total number of packets, the number of malicious packets and the percentage of malicious packets are shown in Fig. 7. From Fig. 7(a), we observed that the percentage of malicious packets sent by these source devices ranges from 0.22% to 20.44% over the eight days. From Fig. 7(b), we observed that the percentage of malicious packets sent by these destination devices ranges from 2.56% to 52.01% over the eight days. The total number of bytes, the number of bytes contained in the malicious packets and the percentage of bytes in those malicious packets are shown in Fig. 8. From Fig. 8(a), we observed that the percentage of bytes in the malicious packets sent by these source devices is 0% on all eight days. From Fig. 8(b), we observed that the percentage of bytes in the malicious packets sent by these destination devices is 0% on all eight days. This result shows that the malicious packets associated with the reflection attacks on these NTP servers contained 0-byte payloads. While the percentage of malicious packets sent by these source devices ranged from 0.22% to 20.44% and the percentage of malicious packets sent by these destination devices ranged from 2.56% to 52.01% over the eight days, all the malicious packets contained 0-byte payloads, indicating that the reflection attacks did not overwhelm these NTP servers.
As was done with the NTP servers, we count the number of unique source and destination devices associated with the NetBIOS server reflection attacks. A summary of source and destination devices is given in Table 10. From Table 10, we observed that from day 1 to day 8, multiple source and destination devices are associated with reflection attacks on NetBIOS servers. Next, we obtain (a) the number of packets and bytes sent by the source and destination devices associated with the reflection attacks on these NetBIOS servers, and (b) the total number of packets and bytes transmitted in the network. The total number of packets, the number of malicious packets and the percentage of malicious packets are shown in Fig. 9. From Fig. 9(a), we observed that the percentage of malicious packets sent by these source devices ranges from 0.64% to 14.63% over the eight days. From Fig. 9(b), we observed that the percentage of malicious packets sent by these destination devices ranges from 6.34% to 45.65% over the eight days. The total number of bytes, the number of bytes contained in the malicious packets and the percentage of bytes in those malicious packets are shown in Fig. 10. From Fig. 10(a), we observed that the percentage of bytes in those malicious packets sent by these source devices is 0% on all eight days. From Fig. 10(b), we observed that the percentage of bytes in those malicious packets sent by these destination devices is 0% on all eight days. This result shows that the malicious packets associated with the reflection attacks on these NetBIOS servers contained 0-byte payloads. While the percentage of malicious packets sent by these source devices ranged from 0.64% to 14.63% and the percentage of malicious packets sent by these destination devices ranged from 6.34% to 45.65% over the eight days, those malicious packets contained 0-byte payloads, indicating that the reflection attacks did not overwhelm these NetBIOS servers.

Phase 3: Identify the Dwell Times of Reflection Attacks on NTP and NetBIOS Servers
The second phase of our analysis is characterized by the identification of the devices associated with the NTP and NetBIOS server reflection attacks and the network traffic generated by those attacks. We observed that (a) multiple source and destination devices are associated with those reflection attacks, and (b) a small percentage of network traffic is generated by the NTP and NetBIOS server reflection attacks. Our next objective is to identify the dwell times of the NTP and NetBIOS server reflection attacks. To achieve this, we obtain the time elapsed between the start times of adjacent NetFlow records associated with the reflection attack. The dwell times for reflection attacks on the NTP server on days 1 to 8 are shown in Fig. 11. From Fig. 11(a) to Fig. 11(h), we observed that the dwell time ranged from 0 seconds to 68 seconds over the eight days.
The dwell times of the NTP server reflection attacks ranged from 0 seconds to 68 seconds over the eight days, indicating that the time elapsed between reflection attacks on these NTP servers is small.
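The dwell-time computation, i.e. the elapsed seconds between the start times of adjacent NetFlow records, can be sketched with pandas (the timestamps are hypothetical):

```python
import pandas as pd

# Hypothetical start times of NetFlow records associated with one attack.
starts = pd.Series(pd.to_datetime([
    "2017-01-01 00:00:00",
    "2017-01-01 00:00:05",
    "2017-01-01 00:01:13",
]))

# Dwell time: elapsed seconds between start times of adjacent records.
dwell = starts.sort_values().diff().dropna().dt.total_seconds()
```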
As was done with the NTP servers, we obtain the dwell times for reflection attacks on the NetBIOS server. The dwell times on days 1 to 8 are shown in Fig. 12. From Fig. 12(a) to Fig. 12(h), we observed that the dwell time ranged from 0 seconds to 198 seconds over the eight days.
The dwell times of the NetBIOS server reflection attacks ranged from 0 seconds to 198 seconds over the eight days, indicating that the time elapsed between reflection attacks on these NetBIOS servers is small.

Discussion
From these results, we have shown that the LASSO, Ridge and Elastic Net regression models are unsuitable as a means for identifying correlations of reflection attacks. Our analysis of the NetFlow data from a large enterprise network helps to bring awareness to the extent to which reflection attacks are correlated. The fact that reflection attacks on these NTP servers are not correlated and reflection attacks on these NetBIOS servers are not correlated is not obvious. For example, the regression coefficients in the Elastic Net, LASSO and Ridge regression models from day 1 to day 8 are close to 0 or equal to 0. We summarize our findings and recommendations in Table 11.
We observed that the network traffic generated by these reflection attacks did not overwhelm the NTP and NetBIOS servers on any of the eight days. While network administrators are less concerned with a reflection attack that does not lead to a loss of network service, it is better to equip the network with reflection attack detectors to reduce network service downtime. These recommendations are suitable for various networks as well, since complex networks including but not limited to peer-to-peer networks and Internet-of-Things networks can also benefit from NetFlow data analysis. Finally, the traffic observed in the NetFlow data matched the expected traffic in the enterprise network operated by Los Alamos National Laboratory, which assured us that the NetFlow data is of high quality.
With respect to the dates of the NetFlow data, we may have missed reflection attacks that are not correlated. To address this issue, we randomly selected eight days' worth of NetFlow data for our analysis. On the types of data analyzed in our study, we did not consider Windows server logs [54], hardware performance data [55] or behaviour logs [56], as they are beyond the scope of this paper, nor did we perform deep packet inspection, because it would require substantial resources out of the reach of this paper. Having said that, we showed that reflection attacks on the NTP and NetBIOS servers exist in the NetFlow data and that those reflection attacks are not correlated. External validity is concerned with the extent to which the results presented in an empirical study can be generalized to other settings [53]. Some data centers do not collect detailed security incident reports, and some may not release their security logs due to privacy concerns [57]. Our conclusions are based on the NetFlow data of a large enterprise network, and the results we presented in our study may not generalize to other network models. Consequently, this makes our statistical analysis difficult to confirm. Having said that, network monitoring tools are currently being deployed on these networks [58]; therefore, validating our analysis has become attainable.

Conclusion and Future Work
An approach based on correlating NetFlow records in the NetFlow data is proposed to identify correlations of reflection attacks. We showed that reflection attacks on the NTP and NetBIOS servers exist in the NetFlow data and evaluated the Ridge, Elastic Net and LASSO regression models. We applied k-fold cross validation and the coefficient of determination to ensure accurate results. From our study, we learned that (a) reflection attacks on the NTP servers are not correlated, (b) reflection attacks on the NetBIOS servers are not correlated, (c) the dwell times between reflection attacks on the NTP and NetBIOS servers are short, and (d) a small percentage of network traffic is generated by reflection attacks on the NTP and NetBIOS servers.
In our future work, we plan to apply our approach on NetFlow data from more networks, and identify correlations of reflection attacks other than reflection attacks on the NTP and NetBIOS servers.

Fig. 4
Fig. 4 Our approach consists of three modules: (a) Data preprocessor, (b) Regression models trainer, and (c) Regression models validator.

Fig. 5
Fig. 5 Proportion of residuals in the Elastic Net regression model for day 1.

Fig. 6
Fig. 6 Proportion of residuals in the Elastic Net regression model for day 1.

Fig. 7
Fig. 7 Total number of packets, number of malicious packets and percentage of malicious packets transmitted in the network.

Fig. 8
Fig. 8 Total number of bytes, number of bytes contained in the malicious packets and percentage of bytes in those malicious packets transmitted in the network.

Fig. 9
Fig. 9 Total number of packets, number of malicious packets and percentage of malicious packets transmitted in the network.

Fig. 10
Fig. 10 Total number of bytes, number of bytes contained in the malicious packets and percentage of bytes in those malicious packets transmitted in the network.

Fig. 11
Fig. 11 Dwell times of NTP server reflection attacks.

Fig. 12
Fig. 12 Dwell times of NetBIOS server reflection attacks.

Table 1
Summary of the main attributes of the reviewed works

Table 2
Summary of Regularized Regression Models

Algorithm 1 Select the penalty parameter
Require: feature-count data matrix M, regression model P
for k ← 1 to number_lambdas do
    λ ← lambda_values[k]
    Assign λ to the penalty parameter in regression model P
    Train regression model P using M_Train
    Record the MSE of P in penalty_mse[k]
end for
Obtain the λ associated with the smallest MSE from the array penalty_mse
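Algorithm 1 amounts to a grid search over candidate penalty values: train the model once per λ, record the MSE, and keep the λ with the smallest MSE. A sketch of this idea using scikit-learn's LASSO with cross-validated MSE (the data matrix, coefficients and λ grid are synthetic; the paper's actual training split may differ):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score

# Synthetic feature-count matrix M and response y for illustration.
rng = np.random.default_rng(0)
M = rng.normal(size=(100, 5))
y = M @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + rng.normal(scale=0.1, size=100)

lambda_values = np.logspace(-3, 1, 20)   # candidate penalty parameters
penalty_mse = []
for lam in lambda_values:                # one training pass per lambda
    model = Lasso(alpha=lam)
    mse = -cross_val_score(model, M, y, cv=5,
                           scoring="neg_mean_squared_error").mean()
    penalty_mse.append(mse)

# The lambda with the smallest cross-validated MSE wins.
best_lambda = lambda_values[int(np.argmin(penalty_mse))]
```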

Table 3
Accuracy of Ridge, Elastic Net (ENet) and LASSO regression models trained on the "number of packets sent by the source device" attribute

Table 4
F-test for the Elastic Net, Ridge and LASSO Regression Models

Table 5
Regression Coefficients in the Elastic Net Regression Model.

Table 6
Accuracy of Ridge, Elastic Net (ENet) and LASSO regression models trained on the "number of packets sent by the source device" attribute

Table 7
F-test for the Elastic Net, Ridge and LASSO Regression Models

Table 8
Regression Coefficients in the Elastic Net Regression Model.

Table 9
Devices Associated with NTP Server Reflection Attacks

Table 10
Devices Associated with NetBIOS Server Reflection Attacks

Table 11
Findings and recommendations.
Finding: Reflection attacks on the NTP and NetBIOS servers are not correlated in the NetFlow data. Recommendation: Small percentages of network packets can be ignored unless a large percentage of network packets associated with a reflection attack is observed in the NetFlow data.
Finding: Spoofed requests triggered reflection attacks on the NTP and NetBIOS servers. Recommendation: Network administrators can implement ingress filtering on their networks, which allows detection of IP packet spoofing.
Finding: The dwell times between reflection attacks on the NTP and NetBIOS servers are short. Recommendation: Network attack mitigation schemes can implement Anycast to scatter the attack traffic and absorb the attack.
Finding: The LASSO, Ridge and Elastic Net models did not identify correlations of reflection attacks. Recommendation: Conducting an empirical study of various deep-learning techniques could improve detection of a reflection attack.