An Intrusion Detection Scheme Based on Repeated Game in Smart Home

Smart Home brings a new people-oriented home life experience. However, the edge devices in this system face severe threats to data security and equipment safety. To address these problems, this paper proposes an intrusion detection scheme based on repeated game. Firstly, we use the K-Nearest Neighbors (KNN) algorithm to classify edge devices and equip the cluster heads with the intrusion detection system. Secondly, we use the regret minimization algorithm to determine the mixed strategy Nash equilibrium of the one-order game and then apply a severe punishment strategy to discipline malicious attackers. Thirdly, the intrusion detection system detects malicious attackers by observing the reduction of its payoff. Finally, the experimental results show that the proposed scheme reduces the loss of the attacked intrusion detection system and thereby defends against the attacker.


Introduction
The Internet of Things (IoT) is entering people's lives and making production and daily life more intelligent and convenient. Smart Home is a typical application of the IoT [1]. Smart Home combines integrated wiring technology with network communication technology and is an effective management system [2]. However, Smart Home is facing severe security threats to both data and devices [3]. The edge devices are too widely scattered for security technologies to be applied easily in a Smart Home. Besides, some equipment runs outdated versions whose weaknesses and vulnerabilities cannot be patched remotely, making Smart Home devices vulnerable to attacks. For instance, equipment such as cameras and smart thermostats collects information about people's daily lives that can be traced directly or indirectly back to the person. Once the data of Smart Home devices is stolen, users' private information will be disclosed. Therefore, it is urgent to design an effective security protection scheme to ensure user data security in the Smart Home.
Intrusion detection technology is a method for resisting attacker invasion. It monitors, analyzes, and deals with a variety of intrusions while affecting network performance as little as possible, improving the ability of networks to deal with external threats. According to the technology used, intrusion detection technology can be divided into three categories: anomaly detection, misuse intrusion detection, and hybrid intrusion detection. Anomaly detection technology can detect new intrusions, but it is difficult to establish the attacker's behavior model [4]. Misuse detection technology has high detection accuracy, but it is difficult to collect and update intrusion information [5]. Hybrid intrusion detection technology combines misuse detection and anomaly detection, inherits the advantages of both, improves the detection rate, and decreases the false positive rate [6]. To sum up, the existing intrusion detection technologies mainly have the following shortcomings: the volume of data is too large to process easily, and the data dimension is too high to reduce easily. Motivated by these shortcomings, this paper models the interactions between attackers and intrusion detection systems as a repeated game and proposes an intrusion detection scheme based on repeated game to protect the security of Smart Home. The main contributions are as follows:
(1) To reduce the cost of equipping the intrusion detection system, this paper uses the K-Nearest Neighbors (KNN) algorithm to classify edge devices and equips the cluster heads with the intrusion detection system to protect the Smart Home system.
(2) To defend against attackers, we model the interactions between attackers and intrusion detection systems as a repeated game, use the regret minimization algorithm to determine the mixed strategy Nash equilibrium of this game, and set a severe punishment mechanism to force the attacker to take good actions.
(3) In the simulation experiment, we compare the proposed scheme with Winner, ALL-S, ALL-P, and ALL-R on three factors: the intrusion detection rate, the attacker's payoff, and the intrusion detection system's payoff. The experimental results show that the proposed scheme can resist attackers.
The remainder of this paper is organized as follows: Section 2 describes the representative achievements of intrusion detection technology. Section 3 proposes an intrusion detection scheme based on repeated game in Smart Home. Section 4 shows the performance of the proposed scheme. Finally, Section 5 summarizes possible extensions and future research directions.

Related Work
Intrusion detection technology [7] can be divided into three types: anomaly detection, misuse detection, and hybrid intrusion detection.
This section mainly summarizes two of these techniques, anomaly detection and misuse detection. Anomaly intrusion detection [8] treats intrusion activity as a subset of anomalous activity and is divided into feature selection-based anomaly detection, Bayesian inference-based anomaly detection, and pattern prediction-based anomaly detection. Feature selection-based anomaly detection predicts or classifies intrusions by selecting a subset of metrics that can detect them [9,10]. However, the metric set cannot encompass all intrusion types, and a preidentified metric set may miss intrusions that occur only in a particular environment.
Bayesian inference-based anomaly detection judges whether the system has suffered an intrusion event by measuring a set of variables [11,12]. However, this method requires a correlation analysis of each variable to determine the relationship between that variable and the intrusion event. Pattern prediction-based anomaly detection considers the sequence of intrusion events and their correlation [13,14], but in this method any unrecognized behavior pattern is judged to be an abnormal event.
Misuse intrusion detection [15,16] detects intrusion events by matching defined intrusion patterns against observed behavior. It can be divided into contingent probability-based misuse intrusion detection, state transition analysis-based misuse intrusion detection, and keyboard monitoring-based misuse intrusion detection. Contingent probability-based misuse intrusion detection maps the intrusion to an event sequence and then infers the occurrence of an intrusion by observing the events [17,18]. However, in this method the prior probability is hard to specify, and the assumption of event independence is hard to satisfy. State transition analysis-based misuse intrusion detection regards an attack as a series of state transitions of the monitored system [19,20]. However, the attack model can only describe a sequence of events and is not suitable for describing complicated events. Keyboard monitoring-based misuse intrusion detection assumes that an intrusion corresponds to a specific keystroke sequence pattern, monitors the user's keystroke pattern, and matches it against the intrusion pattern to detect intrusion [21,22]. But without operating system support, this approach lacks a reliable way to capture users' keystrokes, and users can easily defeat the technique by using alias commands.
To solve the above problems, we no longer detect the intrusion based on the characteristics of the attacker but consider intrusion detection system's payoff; that is, the intrusion detection system detects the attacker invasion by observing its payoff decrease.

Intrusion Detection Scheme Based on Repeated Game
This section describes how the intrusion detection system detects the attacker's malicious actions and how it educates malicious attackers to adopt a good strategy. The notation definitions are shown in Table 1.

One-Order Game.
In Smart Home, because there are many edge devices with limited service capacity [23,24], it is impossible to run the intrusion detection system on every edge device, so we need a strategy for allocating the intrusion detection system among the edge devices. We first use the clustering algorithm to divide edge devices into multiple clusters and then configure an intrusion detection system for each cluster-head node in Smart Home [25,26]. Each cluster has a cluster-head node and several member nodes. The former is mainly responsible for forwarding information and executing the intrusion detection program within the cluster, and the latter are responsible for collecting information and passing it to the cluster-head node [27,28]. Suppose that there are $N$ edge devices, which are divided into $k$ clusters $C_1, C_2, \ldots, C_k$ by the KNN algorithm. We assume that an attacker can attack one cluster head at a time and model the interactions between the intrusion detection systems and attackers as a one-order game described by the triple $(P, S, U)$. Here $P = (a, d)$ is the set of players in the one-order game, that is, the attacker and the intrusion detection system; $S = (A_a, D_d)$ is the strategy space; and $U$ is the players' payoff.
The attacker has four strategies, $A_a = (a_1, a_2, a_3, a_4)$: $a_1$ means that the attacker does not attack any cluster head; $a_2$ means that the attacker attacks the cluster-head node $C_i$; $a_3$ means that the attacker attacks the cluster head $C_i$ after $T$ times; and $a_4$ means that the attacker attacks the cluster-head node $C_j$. The intrusion detection system also has four strategies, $D_d = (d_1, d_2, d_3, d_4)$: $d_1$ means that the intrusion detection system does not protect any cluster head; $d_2$ means that it protects the cluster head $C_i$; $d_3$ means that it protects the cluster head $C_i$ after $T$ times; and $d_4$ means that it protects the cluster head $C_j$.
Therefore, the strategy profiles of the attacker and the intrusion detection system can be collected in a $4 \times 4$ matrix $M = [M_{ij}]$ with $M_{ij} = (a_i, d_j)$, where the rows represent the attacker's strategies and the columns represent the intrusion detection system's strategies. Suppose that $U_a$ and $U_d$ are the payoffs of the attacker and the intrusion detection system, respectively, so that $U = (U_a, U_d)$, where $a$ refers to the attacker and $d$ refers to the intrusion detection system. For example, the strategy profile $M_{22} = (a_2, d_2)$ means that the attacker attacks the cluster head $C_i$ while the intrusion detection system protects it. In this case, the attacker gains the payoff 0 at the cost of $c_i$, so $U_a = -c_i$, and the intrusion detection system gains the payoff $p_i^d$ at the cost of $r_i$, so $U_d = p_i^d - r_i$. Similarly, we can obtain the payoff matrices $X$ and $Y$ of the attacker and the intrusion detection system, where $c_i$ is the cost of attacking the cluster head $C_i$, $c_i'$ is the cost of attacking the cluster head $C_i$ after $T$ times, $r_i$ is the cost of persistently protecting the cluster head $C_i$, $r_i'$ is the cost of protecting the cluster head $C_i$ after $T$ times, $p_i^a$ is the payoff of attacking the cluster head $C_i$, and $p_i^d$ is the payoff of the intrusion detection system against attacks. It can be seen from the payoff matrices that there is no pure strategy Nash equilibrium in this game and that the intrusion detection system can notice malicious attackers from the decrease of its payoff. Besides, the intrusion detection system always tries to determine which cluster head the attacker will attack and then protect it to maximize its payoff.
Therefore, we use the regret minimization algorithm, which selects future actions according to the degree of regret, to determine the players' mixed strategy Nash equilibrium. The probability of playing strategy $d_1$ in round $T$ is defined in terms of $D_d$, the intrusion detection system's strategy set, and $\mathrm{Regret}_d^T(d_1)$, the regret value of playing strategy $d_1$.
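For orientation, the regret-matching rule in its standard textbook form is written out below. This is a sketch of the usual formulation, not necessarily the exact expression used in this paper: the symbol $P_d^T$, the positive-part notation $\mathrm{Regret}_d^{T,+}$, and the round index convention are assumptions introduced here for illustration.

```latex
\[
P_d^{T}(d_1) =
\frac{\mathrm{Regret}_d^{T,+}(d_1)}
     {\sum_{d' \in D_d} \mathrm{Regret}_d^{T,+}(d')},
\qquad
\mathrm{Regret}_d^{T,+}(d') = \max\bigl(\mathrm{Regret}_d^{T}(d'),\, 0\bigr)
\]
```

Under this rule, strategies with larger positive regret are played more often, and when no strategy has positive regret the usual convention is to play every strategy in $D_d$ with equal probability.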

Repeated Game.
During the interaction between the attacker and the intrusion detection system, the intrusion detection system can detect the attacker's invasion by observing changes in its own payoff. However, in the one-order game the attacker does not take into account the effect of his current strategy on his future payoff; that is, he considers only the payoff of a single interaction, so it is difficult to deter him. In the repeated game, if the intrusion detection system punishes the attacker, the attacker has to consider the cost of that penalty, and if the punishment cost of attacking exceeds the payoff of attacking, the attacker will be forced to adopt a nonattack strategy. Thus, the intrusion detection system does not need to supervise continuously and can still maintain the normal order of the entire network.
In the repeated game, let $a_{et}$ denote the strategy adopted by player $e$ in the $t$-th round, so that the strategies of player $e$ over the first $T$ rounds are $a_{e1}, a_{e2}, \ldots, a_{eT}$. The total payoff of player $e$ is the discounted sum of its payoffs over these rounds, where $\delta \in (0, 1)$ is the discount factor. The larger $\delta$ is, the more player $e$ values the long-term payoff; the smaller $\delta$ is, the more player $e$ values the current payoff. Since the intrusion detection system cannot be certain of detecting the attacker on the first attack, we assume that its detection rate is less than 1, $q \in (0, 1)$. The probability that an attacker is first discovered by the intrusion detection system on the $k$-th attack is $(1 - q)^{k-1} q$, and the attacker's total payoff is defined on this basis.
In previous research on network security protection, once an attacker is captured by the intrusion detection system, the network deletes this node. However, deletion affects the whole network and does nothing to restrain the attacker's behavior. Therefore, this paper designs a severe punishment mechanism to educate captured attackers into regular players. When the attacker is found to be uncooperative at time slot $k$, its payoff is defined over the following $T$ penalty cycles, that is, $k + 1, k + 2, \ldots, k + T$. If the node is detected during a second attack, it is punished for a period of $2T$, and its total payoff is defined over that longer penalty cycle. We regard the loss of the attacker in the penalty cycle, $\Delta U_a^T$, as an additional reward to the intrusion detection system, so the intrusion detection system's payoff includes $\Delta U_a^T$ as an extra term. By comparing the attacker's payoffs over the two penalty cycles, it can be seen that the attacker's payoff decreases as the number of betrayals increases.
Besides, if the number of defections by an attacker exceeds the threshold of the intrusion detection system, the attacker will be eliminated; and the cluster-head node will no longer interact with the attacker.
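The quantities discussed in this subsection can be illustrated with a minimal Python sketch. Only the detection probability $(1 - q)^{k-1} q$, the discount factor $\delta$, and the penalty lengths $T$ and $2T$ come from the text; the function names, the choice to leave the first round undiscounted, and the linear growth of the penalty beyond the second offence are assumptions made for illustration.

```python
def detection_probability(q, k):
    """Probability that the attacker is first detected on the k-th attack,
    given a per-attack detection rate q in (0, 1)."""
    return (1 - q) ** (k - 1) * q


def discounted_payoff(stage_payoffs, delta):
    """Discounted sum of per-round payoffs with discount factor delta.
    Assumption: the first round is weighted by delta**0."""
    return sum(delta ** t * u for t, u in enumerate(stage_payoffs))


def penalty_length(base_cycles, offence_number):
    """Penalty length in cycles: T for the first detected attack and 2T for
    the second, as in the text. Growth beyond the second offence is an
    assumption of this sketch."""
    return base_cycles * offence_number


if __name__ == "__main__":
    # Chance of first being caught on the third attack when q = 0.5.
    print(detection_probability(0.5, 3))
    # A patient player (large delta) values the same payoff stream more.
    print(discounted_payoff([1, 1, 1], 0.9), discounted_payoff([1, 1, 1], 0.1))
    # Penalty lengths for the first and second detected attacks with T = 5.
    print(penalty_length(5, 1), penalty_length(5, 2))
```

Because the detection probabilities form a geometric distribution, they sum to 1 over all $k$, which means a persistent attacker is eventually caught with certainty under this model.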

Simulation Experiment
This paper uses the Anaconda integrated development tool to verify the intrusion detection scheme based on repeated game. Firstly, we simulate the classification process of the KNN algorithm and add four new nodes to demonstrate its effectiveness. Secondly, we compare the payoffs of attackers and intrusion detection systems in penalty cycles and regular interaction cycles to verify the effectiveness of the penalty mechanism. Thirdly, we determine the optimal strategy for each round of interaction between the attacker and the intrusion detection system by using the regret minimization algorithm. Finally, we compare the proposed scheme with four interaction strategies, Winner (take the strategy of the winner), ALL-S (always play Scissor), ALL-P (always play Paper), and ALL-R (always play Rock), to show that the proposed scheme can improve the player's payoff. The experimental parameters are shown in Table 2. Figure 1 depicts the classification results of the KNN algorithm. Figure 1(a) shows the original distribution of edge device nodes. Figure 1(b) shows the classification results of the KNN algorithm, with each symbol representing a class of edge devices. Figure 2 analyzes the classification of the newly added nodes, which are marked in blue. For example, in Figure 2(a), the blue node (the newly added node) is assigned to the first class.
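The classification of a newly added node can be sketched as a plain majority vote among its nearest labelled neighbours. The node coordinates, the class labels `C1` and `C2`, the value `k=3`, and the function name below are hypothetical and are not taken from the experiment; they only illustrate the KNN step.

```python
import math
from collections import Counter


def knn_classify(new_node, labelled_nodes, k=3):
    """Assign new_node to the majority class among its k nearest neighbours.

    labelled_nodes is a list of ((x, y), label) pairs; distances are Euclidean.
    """
    nearest = sorted(
        labelled_nodes,
        key=lambda item: math.dist(item[0], new_node),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical edge-device positions grouped into two classes.
NODES = [
    ((1.0, 1.0), "C1"), ((1.5, 1.2), "C1"), ((0.8, 1.4), "C1"),
    ((5.0, 5.0), "C2"), ((5.5, 4.8), "C2"), ((4.7, 5.3), "C2"),
]

if __name__ == "__main__":
    print(knn_classify((1.2, 1.0), NODES))  # near the first group
    print(knn_classify((5.1, 5.0), NODES))  # near the second group
```

A new device is thus attached to whichever existing class dominates its neighbourhood, so the class assignments of existing nodes do not need to be recomputed when a node joins.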

The Comparison of the Attacker's Payoff and Intrusion Detection System's Payoff.
Figure 3 compares the attacker's payoffs in regular interaction cycles and penalty cycles. As Figure 3(a) shows, the attacker's payoff does not change during regular interaction cycles, because the intrusion detection system does not play a defensive strategy. Figure 3(b) shows that the attacker's payoff gradually decreases as the number of interactions increases. By the 4th interaction, the attacker's payoff approaches zero. Besides, the longer the penalty cycle is, the faster the attacker's payoff goes to zero and the larger the attacker's losses are. This is a consequence of the punishment mechanism in this paper. Therefore, a rational attacker must interact normally with the intrusion detection system to maximize its payoff. Figure 4 compares the intrusion detection system's payoffs in the regular interaction cycle and the penalty cycle. It can be seen from Figure 4(a) that the intrusion detection system's payoff is −3 during the regular interaction cycle. This is because the attacked intrusion detection system does not play any defensive strategy. Figure 4(b) shows that the loss of the intrusion detection system decreases as the number of penalty cycles increases, and the payoff of the intrusion detection system is the lowest when the penalty period is 5. To sum up, the proposed scheme can reduce the loss of intrusion detection systems when attackers launch attacks.

Application of Regret Minimization Algorithm in Rock-Paper-Scissors Game.
Table 3 defines the payoff matrix of two players in the rock-paper-scissors game. In this table, the rows represent the strategy of player A, the columns represent the strategy of player B, the first element in a tuple such as (0, 0) represents the payoff of player A, and the second element represents the payoff of player B. Table 4 analyzes how player A determines its optimal strategy based on the regret minimization algorithm. For example, in the first round, player A and player B choose Rock and Paper, respectively, so player A's regret values for Rock, Scissor, and Paper are 0, 2, and 1, respectively; thus the probabilities of player A playing Rock, Scissor, and Paper in the next round are 0, 2/3, and 1/3, respectively. Similarly, we can obtain the optimal strategy of player A in each round.

The Payoff Comparison between Player A and Player B.
Table 5 compares the payoffs of player A and player B when player A adopts five strategies: regret minimization strategy (Regret), ALL-R, ALL-P, ALL-S, and Winner, while player B adopts the regret minimization strategy. As can be seen from Table 5, only when player A adopts ALL-P does player B, playing Regret, obtain a lower payoff than player A, and even then the difference in payoff between the two players is small. Under the other strategies, player B obtains the highest payoff by playing Regret. This is because player B maximizes the probability of the strategy with the maximum regret value. The payoff change curves of players A and B are shown in Figure 5. In this figure, the sharp increases and decreases in the payoffs of player A and player B are due to the adjustment of both players' strategies.
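The first-round regret computation for rock-paper-scissors described above can be checked with a short Python sketch. It assumes the conventional payoffs of +1 for a win, −1 for a loss, and 0 for a tie, which is consistent with the (0, 0) tuple mentioned for Table 3 but is not reproduced from that table; the function names are illustrative.

```python
from fractions import Fraction

ACTIONS = ("Rock", "Scissor", "Paper")
# BEATS[x] is the action that x defeats.
BEATS = {"Rock": "Scissor", "Scissor": "Paper", "Paper": "Rock"}


def payoff(mine, theirs):
    """Assumed payoffs: +1 for a win, -1 for a loss, 0 for a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def one_round_regret(played, opponent):
    """Positive regret of each action after a single round, and the
    regret-matching probabilities for the next round."""
    realized = payoff(played, opponent)
    regret = {a: max(payoff(a, opponent) - realized, 0) for a in ACTIONS}
    total = sum(regret.values())
    if total == 0:
        # No regret at all: fall back to the uniform strategy.
        probs = {a: Fraction(1, len(ACTIONS)) for a in ACTIONS}
    else:
        probs = {a: Fraction(regret[a], total) for a in ACTIONS}
    return regret, probs


if __name__ == "__main__":
    regret, probs = one_round_regret("Rock", "Paper")
    print(regret)  # {'Rock': 0, 'Scissor': 2, 'Paper': 1}
    print({a: str(p) for a, p in probs.items()})
    # {'Rock': '0', 'Scissor': '2/3', 'Paper': '1/3'}
```

Player A lost with Rock against Paper, so Scissor (which would have won) carries the most regret and receives the highest probability in the next round.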

Conclusion
Designing an efficient and safe protection scheme is the key to promoting the application of the Smart Home system. This paper proposes a security protection scheme based on repeated game. In this scheme, the intrusion detection system detects malicious attackers by observing changes in its payoff and severely punishes attackers who adopt a malicious strategy, educating them to take good actions. The experimental results show that the proposed scheme can effectively defend against attackers.
In future research, we will continue to explore new methods to determine the player's optimal strategy in the finite model.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.