Virtual Network Security Control Mechanism based on Multi-level Alerting and Linkage Defense

An intrusion detection system can detect network attacks dynamically by analyzing network traffic or data packets. However, for complex multi-step attacks with high concealment and persistence, traditional intrusion detection systems generally suffer from highly redundant alert data and poor data readability, which seriously hinders security administrators from quickly identifying attack behaviors and intentions. To address these problems, this paper proposes a multi-level alert aggregation method based on software-defined security, together with a linkage defense based on the Markov chain model. The method defines the reporting message format between the data plane security components and the control plane security controller, and automatically extends the time window by using the temporal proximity of alarms. Multi-level alert aggregation is then performed by attack attribute matching, so that similar alarms are effectively aggregated. Further, the control plane uses the Markov chain model to generate alert association graphs and obtain the transition probabilities between attack types, which are sent down to the security components for linkage defense. Experiments show that this mechanism can effectively achieve multi-level alarm aggregation and linked association of alarm information.


Introduction
As network attacks become stealthier and more persistent, detecting them is becoming increasingly challenging. An intrusion detection system (IDS) can detect network attacks dynamically by analyzing data such as network traffic or data packets [1][2]. However, for complex multi-step attacks with high concealment and persistence, traditional IDSs generally suffer from highly redundant alert data and poor data readability, which seriously hinders security administrators from quickly identifying the attack behavior and attack intent. With the emergence and application of Software Defined Networking technology, it is necessary to effectively aggregate the repeated alarms caused by the same or similar attack behaviors between data plane security components and control plane security controllers in order to achieve linkage defense [3][4].
Scholars at home and abroad have done a great deal of fruitful research in the field of network alarm aggregation. One approach uses IP address attributes: all alarms with interrelated source or destination IPs are grouped into one cluster, and alarm aggregation is performed within each cluster. However, in complete attack scenarios such as DDoS, all IPs are interrelated, so all alarms fall into the same cluster and the clustering becomes ineffective [5]. For the temporal attribute, most related methods aggregate alarms within a sliding time window of a fixed size. If the time window is too large, alerts belonging to multiple attacks are aggregated together; if it is too small, the alarms triggered by a single attack cannot all be aggregated [6]. To address this problem, the traditional sliding window approach has been improved by first storing the current alert information in a longer time window and then using stream processing to aggregate alerts within the sliding time window [7]. However, the sizes of the current window and the sliding window still need to be determined subjectively. To handle alarm aggregation when time intervals fluctuate widely, the aggregation method based on a fixed time threshold has also been improved. Each time a new alarm is generated, the relative mean square error of the time intervals of the alarm sequence is calculated and used as the coefficient of variation of the alarm aggregation time window, which allows the time threshold to be updated dynamically [8]. However, the dynamic time window calculated by this method is monotonically increasing. The time window converges only when the mean square error of the alarm interval tends to zero, a condition that is difficult to satisfy in practice.
To address the shortcomings of existing methods, this paper proposes a multi-level aggregation and association method for alarms based on self-expanding time windows. For the time window setting, the time windows of alarms with similar time intervals are dynamically stretched. For IP address attribute processing, alerts with the same port, IP address, and subnet are aggregated together. Further, alert correlation is performed by calculating the first-order state transition probability among alerts, which provides a basis for attack scenario identification.

Self-expanding time window setting method
Since different intrusion detection systems generate alerts in different formats and many of the attributes contribute little to alert aggregation, it is important to standardize the alert data types before performing alert aggregation. In this paper, we use the Intrusion Detection Message Exchange Format (IDMEF), the most typical existing alert message format, to describe the network alerts and perform alert aggregation. The alarm data structure is Structure={Time, SrcIP, DstIP, SrcPort, DstPort, AttackType}. Time is the start time of the attack represented by the alert, SrcIP and DstIP are the source and destination IPs of the attack, SrcPort and DstPort are the source and destination ports of the attack, and AttackType is the attack type.
Algorithm 1. Self-expanding time window-based division algorithm.
Input: The alert set Alert, the number of alarms N in Alert, and the preset time interval threshold I.
Output: The set of alarms divided by the self-expanding time window, stored in the 3-dimensional array setB_j, each page of which holds the alarms within one time window.
else then
p = 1;
k++;
setB_j(p, :, k) = Alert(i, :);
end else
end for
return setB_j;
End

Among the attributes of an alert, the Time attribute is crucial for alert aggregation. For repeated alarms generated by the same attack step, or for the large number of similar alarms generated by vulnerability scanning and flooding attacks, the Time interval between two adjacent alarms is small, while the overall duration of the attack is uncertain. Therefore, a division method based on a self-expanding time window is proposed, which automatically expands the time window by using the temporal proximity of alerts. The specific steps of the time window setting are as follows.
Step 1: Data preprocessing. The Time attribute (alarm start time) is converted to seconds.
Step 2: Arrange the alert sequence in ascending order of the Time attribute, and set the start time of the first alert as the beginning of the first time window.
Step 3: If the difference between the Time of the next alert and the Time of the previous one is less than the predefined interval threshold I, the next alert is included in the current time window. Otherwise, the Time of the previous alert is set as the end of the current time window, and the Time of the next alert is set as the starting point of the next time window. The complete procedure is given in Algorithm 1.
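The three steps above can be illustrated with a minimal Python sketch. It is not a transcription of Algorithm 1: it returns a list of windows rather than a 3-dimensional array, and the `Alert` type, the function name `split_by_self_expanding_window` and the sample values are assumptions introduced only for illustration.

```python
from typing import List, NamedTuple


class Alert(NamedTuple):
    """Normalized alert (hypothetical type mirroring the six attributes)."""
    time: float        # attack start time, already converted to seconds (Step 1)
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    attack_type: str


def split_by_self_expanding_window(alerts: List[Alert],
                                   threshold: float) -> List[List[Alert]]:
    """Divide alerts into windows that keep growing while the gap between
    consecutive alerts stays below the interval threshold I."""
    windows: List[List[Alert]] = []
    # Step 2: process alerts in ascending order of Time.
    for alert in sorted(alerts, key=lambda a: a.time):
        # Step 3: a gap smaller than I extends the current window ...
        if windows and alert.time - windows[-1][-1].time < threshold:
            windows[-1].append(alert)
        else:
            # ... otherwise the previous window ends and a new one starts here.
            windows.append([alert])
    return windows


# Illustrative alerts only; these values are not taken from the experiments.
demo = [Alert(t, "10.0.0.1", "10.0.1.5", 1024, 80, "scan")
        for t in (0, 10, 25, 200, 210)]
for window in split_by_self_expanding_window(demo, threshold=30):
    print([a.time for a in window])
```

With a threshold of 30 s the first three alerts fall into one window and the last two into another, because only the gap between 25 s and 200 s reaches the threshold.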

Multi-level aggregation method for network alarms
In the alert aggregation process, the traditional attribute similarity-based method computes the similarity of AttackType, SrcIP, DstIP, SrcPort, and DstPort and fuses them into an overall similarity. This method is limited for three reasons:
1) Including AttackType in the similarity calculation aggregates alarm records generated by unrelated attacks and reduces the accuracy of aggregation.
2) IP addresses are usually matched bit by bit, which only reflects to some extent the probability that two IPs belong to the same subnet, and cannot determine definitively whether they do.
3) Including IP and port in the similarity calculation ignores the meaning of the port itself, so the result hardly reflects the actual attack situation.
To solve these problems, a multi-level aggregation method for alarms is proposed. AttackType is used to cluster alarms, so that alarms of different AttackType are never aggregated with each other. For IP addresses and ports, the subnet mask is used to divide the subnets explicitly, and then subnet-level, host-level and service-level aggregation are performed. These 3 levels of aggregation correspond to the alarms generated by 3 different stages of attacks, so the aggregation results have strong practical significance. The specific aggregation process is as follows:
Step 1: Use AttackType to divide the first layer. The alerts with the same AttackType_i are grouped into one cluster setA_i.
Step 2: Alert classification based on the self-expanding time window. Within the cluster setA_i delineated in Step 1, the sets of alarms in the same time window are obtained using Algorithm 1, and the set of alarms in the j-th time window is denoted setB_j.
Step 3: Subnet-level aggregation. Within the alarm set setB_j delineated in Step 2, the alarms whose SrcIP belongs to the same subnet are first divided into one set using the subnet mask, the set of alarms in the k-th subnet is denoted setC_k, and the alarms in this set are then aggregated. The subnet of SrcIP is used because it reflects attacks carried out by attackers using the same router as a springboard.
Step 4: Host-level aggregation. Within the alarm set setC_k delineated in Step 3, the alarms with equal SrcIP attributes are divided into one set, the set with SrcIP = IP_p is denoted setD_p, and the alarms in this set are then aggregated. This aggregation reflects attacks performed by attackers using the same host as a puppet.
Step 5: Service-level aggregation. Within the alarm set setD_p delineated in Step 4, the alarms with equal SrcPort attributes are divided into one set, the set with SrcPort = Port_q is denoted setE_q, and the alarms in this set are aggregated. This aggregation reflects an attacker using the same service privilege as a springboard to execute an attack.
The overall process of network alarm aggregation based on the idea of multi-level division is shown in Figure 1.
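The five-step division can be sketched in Python as a nested grouping. This is a sketch under assumptions rather than the authors' implementation: the text does not specify how the alarms inside a set are merged, so the code only builds the sets setC_k, setD_p and setE_q and treats each set as one aggregated alert. The names `Alert`, `group_by`, `time_windows`, `subnet_of` and `multilevel_groups`, and all sample values, are hypothetical.

```python
import ipaddress
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, NamedTuple


class Alert(NamedTuple):
    time: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    attack_type: str


def group_by(alerts: List[Alert],
             key: Callable[[Alert], Hashable]) -> List[List[Alert]]:
    """Partition alerts into lists that share the same key value."""
    groups: Dict[Hashable, List[Alert]] = defaultdict(list)
    for alert in alerts:
        groups[key(alert)].append(alert)
    return list(groups.values())


def time_windows(alerts: List[Alert], threshold: float) -> List[List[Alert]]:
    """Self-expanding time windows: extend while consecutive gaps are < I."""
    windows: List[List[Alert]] = []
    for alert in sorted(alerts, key=lambda a: a.time):
        if windows and alert.time - windows[-1][-1].time < threshold:
            windows[-1].append(alert)
        else:
            windows.append([alert])
    return windows


def subnet_of(ip: str, mask: str) -> str:
    """Network address of `ip` under the dotted-decimal subnet mask."""
    return str(ipaddress.ip_network(f"{ip}/{mask}", strict=False))


def multilevel_groups(alerts: List[Alert], threshold: float,
                      mask: str = "255.255.255.0") -> Dict[str, List[List[Alert]]]:
    """Return the alert sets produced at the subnet, host and service levels."""
    levels: Dict[str, List[List[Alert]]] = {"subnet": [], "host": [], "service": []}
    for set_a in group_by(alerts, lambda a: a.attack_type):               # Step 1
        for set_b in time_windows(set_a, threshold):                      # Step 2
            for set_c in group_by(set_b, lambda a: subnet_of(a.src_ip, mask)):  # Step 3
                levels["subnet"].append(set_c)
                for set_d in group_by(set_c, lambda a: a.src_ip):         # Step 4
                    levels["host"].append(set_d)
                    for set_e in group_by(set_d, lambda a: a.src_port):   # Step 5
                        levels["service"].append(set_e)
    return levels


# Illustrative alerts only; these values are not taken from the experiments.
demo = [
    Alert(0, "10.0.0.1", "10.0.9.9", 1000, 80, "scan"),
    Alert(5, "10.0.0.1", "10.0.9.9", 1000, 80, "scan"),
    Alert(8, "10.0.0.1", "10.0.9.9", 2000, 80, "scan"),
    Alert(12, "10.0.0.2", "10.0.9.9", 1000, 80, "scan"),
    Alert(15, "10.0.1.9", "10.0.9.9", 1000, 80, "scan"),
    Alert(500, "10.0.0.1", "10.0.9.9", 1000, 80, "scan"),
]
counts = {level: len(sets) for level, sets in multilevel_groups(demo, 30).items()}
print(counts)
```

Because each level only splits the sets of the level above it, the number of sets can never decrease from subnet level to host level to service level, which is why the coarser levels give a stronger reduction of the alert volume.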

Network alert correlation diagram generation method
An attack scenario executed by an attacker can be viewed as a sequence of logical steps, where the current logical step puts the attacker in a specific authority state and the next possible attack step depends only on the current authority state. Thus, at the level of logical steps, a multi-step attack satisfies the memoryless (no-aftereffect) property of the first-order Markov model, and the first-order Markov property can be used to model the multi-step alarm correlation process.

Figure 1. Overall flow chart of multi-level aggregation for network alerts (subnet-level, host-level and service-level aggregation).

Alert Correlation Graph: The generated alarm correlation graph is an edge-weighted directed graph G(V, E), where V is the set of alert type nodes, each node representing an attack type denoted type_i; E is the set of directed edges e_ij, where e_ij is the edge from node type_i to node type_j; and the weight p_ij of e_ij denotes the transition probability from type_i to type_j. The value of p_ij is determined by the probability of type_i and type_j occurring in sequence. In the alert correlation graph, the weights of all edges starting from the same node must sum to 1. The alert correlation graph quantitatively reflects the attacker's attack path and intention. The main indicators of alarm correlation quality are the false correlation rate FR and the missed correlation rate MR, which are computed from N_FCOR, N_MCOR and N_TCOR, the number of false correlations, the number of missed correlations and the total number of correlations, respectively. In addition, the correlation accuracy index Accuracy evaluates FR and MR together: the larger the value of Accuracy, the better the alarm correlation.
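As an illustration of how an edge-weighted graph with first-order transition probabilities p_ij could be built, the following Python sketch estimates each p_ij as the relative frequency with which type_j directly follows type_i in a time-ordered sequence of aggregated alert types. The counting estimator, the function name `transition_graph` and the attack-type labels are assumptions made for this example; the paper does not state how p_ij is estimated.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def transition_graph(sequence: List[str]) -> Dict[str, Dict[str, float]]:
    """Estimate first-order transition probabilities between alert types.

    The result maps each source type to {successor type: p_ij}. For every
    node that has at least one outgoing edge, the weights sum to 1.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    # Count how often each type is immediately followed by each other type.
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    graph: Dict[str, Dict[str, float]] = {}
    for node, successors in counts.items():
        total = sum(successors.values())
        graph[node] = {nxt: n / total for nxt, n in successors.items()}
    return graph


# Hypothetical sequence of aggregated alert types, ordered by time.
demo_sequence = ["sweep", "probe", "exploit", "sweep", "probe", "install", "ddos"]
graph = transition_graph(demo_sequence)
for src, edges in graph.items():
    for dst, p in edges.items():
        print(f"{src} -> {dst}: {p:.2f}")
```

In this example "probe" is followed once by "exploit" and once by "install", so both edges receive weight 0.5, while the final type in the sequence has no outgoing edge.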

Experiment and analysis
Before alarm aggregation, an intrusion detection system must be used to generate the alarm information. This experiment uses data from the LLDoS1.0 attack scenario in DARPA 2000, a common dataset for intrusion detection, as it is used by the U.S. Department of Defense to simulate attacks. The self-expanding time window method is used to divide the alarms of each attack type in Table 1, and the results are shown in Table 2, where n_set is the number of divided time windows and n_alert_max is the maximum number of alerts contained in a time window. To optimize the time interval threshold I, the alerts are aggregated at three levels for different values of I, and the change in the aggregation effect is observed. The aggregation effect is measured by the aggregation rate Rate, calculated as:

$$\mathrm{Rate} = 1 - \frac{N_{aggregation}}{N_{origin}} \qquad (3)$$

where N_aggregation is the number of alerts generated by aggregation, and N_origin is the original number of alerts. With the subnet mask set to 255.255.255.0, the final alert aggregation rate obtained after three-level aggregation is shown in Figure 2.

Figure 2. Final aggregation results of attack types.

It can be seen from Figure 2 that as the time interval threshold I increases, the final aggregation rate continues to rise. This is because a larger I lengthens the self-expanding time windows, so each window in the final division contains more alerts and the aggregation rate increases. However, a larger time window also causes some irrelevant alerts to be aggregated, which reduces the accuracy of aggregation. Therefore, the value of I is chosen at the inflection point where the slope of the aggregation-rate curve drops markedly. The I values of 90, 90, 30, 60, and 30 s were chosen for the aggregation process of the attack types AttackType_1 to AttackType_5. After the time interval threshold I is optimized, the aggregation results of the alerts of each attack type are shown in Table 3. Besides simplifying the alerts, the aggregation results obtained in this experiment can also reflect, to a certain extent, the possibility of IP scanning and port scanning in the network. The higher the aggregation rate of host-level aggregation, the higher the possibility of IP scanning in the network; the higher the aggregation rate of port-level aggregation, the higher the possibility of port scanning in the network.

To verify the accuracy of the alert aggregation and correlation method proposed in this paper, the aggregation results obtained by the method in this paper and by the method in [8] are each used to perform alarm correlation. The results are shown in Figure 3 and Figure 3 is 95.65%, which is higher than the algorithm in [8], 97.94%.
This is because the proposed alert aggregation method has a finer granularity and a more reasonable time window, so it can effectively aggregate similar alarms and reduce the proportion of false alerts.
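The threshold selection used in this section can be sketched in Python as follows. The sketch assumes that the aggregation rate is Rate = 1 - N_aggregation / N_origin and that the "inflection point" is the candidate threshold at which the slope of the rate curve drops the most; both the helper names (`aggregation_rate`, `pick_threshold`) and the numbers in the example are hypothetical and are not results from the experiments.

```python
from typing import List, Sequence


def aggregation_rate(n_aggregation: int, n_origin: int) -> float:
    """Aggregation rate, assumed here to be 1 - N_aggregation / N_origin."""
    if n_origin <= 0:
        raise ValueError("n_origin must be positive")
    return 1.0 - n_aggregation / n_origin


def pick_threshold(thresholds: Sequence[float], rates: Sequence[float]) -> float:
    """Return the candidate threshold where the slope of the rate curve
    falls the most between the segment before it and the segment after it."""
    if len(thresholds) != len(rates) or len(rates) < 3:
        raise ValueError("need at least three (threshold, rate) points")
    # Slope of each line segment connecting neighbouring points.
    slopes: List[float] = [
        (rates[i + 1] - rates[i]) / (thresholds[i + 1] - thresholds[i])
        for i in range(len(rates) - 1)
    ]
    # Drop in slope at each interior point.
    drops = [slopes[i] - slopes[i + 1] for i in range(len(slopes) - 1)]
    knee = max(range(len(drops)), key=lambda i: drops[i])
    return thresholds[knee + 1]


# Hypothetical sweep: 100 original alerts and the number of alerts that
# remain after three-level aggregation for each candidate threshold I.
candidates = [10, 30, 60, 90, 120]
remaining = [60, 30, 24, 20, 18]
rates = [aggregation_rate(n, 100) for n in remaining]
print(pick_threshold(candidates, rates))
```

In this made-up sweep the rate climbs steeply up to I = 30 s and then flattens, so 30 s is returned. In practice the choice should still be checked against the curve, since a larger window also merges unrelated alerts.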

Conclusion
In this paper, we propose a multi-level alert aggregation method based on software-defined security and a linkage defense based on the Markov chain model. A self-expanding strategy is adopted for setting the time window, and multi-level alarm aggregation is performed using information such as IP and port. In addition, multi-step alarm correlation analysis is performed on the aggregated alarms to achieve linkage defense. The experiments show that this method achieves a good alarm aggregation effect and high correlation accuracy.