A Model Checking-Based Security Analysis Framework for IoT Systems

IoT systems are revolutionizing our lives by providing ubiquitous computing, inter-connectivity, and automated control. However, the increasing system complexity poses huge challenges for security, as IoT devices are distributed, highly heterogeneous, and can directly interact with the physical environment. In IoT systems, bugs in device firmware, defects in network protocols, and design flaws in automation rules can lead to system breach or failure. The challenge escalates further because possible attacks may be chained together in a long sequence across multiple layers, rendering existing vulnerability analysis frameworks inapplicable. In this paper, we present ForeSee, a model checking-based framework to comprehensively evaluate IoT system security. It builds a multi-layer IoT hypothesis graph by simultaneously modeling all of the essential components in IoT systems, including the physical environment, devices, communication protocols, and applications. The model checker can then analyze the generated hypothesis graph to validate system security properties or generate attack paths if there are any violations. An optimization algorithm is further introduced to reduce the computational complexity of our analysis. Our framework verifies hypothesis graphs with millions of nodes in less than 100 seconds. The illustrative case studies show that our framework can detect more potential threats than existing approaches.


Introduction
Nowadays, Internet of Things (IoT) systems are deployed in a wide range of applications: smart home, industrial manufacturing, healthcare, transportation, and many other sectors [1]. There are already 20 billion IoT devices connected to the internet, and the number is expected to rise to 75 billion by the year decouple and model an IoT system as a multi-layer graph, including physical environment layer, device layer, communication layer and application layer. The multi-layer graph models both intra-layer and inter-layer interaction between different components. Furthermore, ForeSee decomposes real-world IoT attacks into individual exploits and integrates them into the multi-layer graph, generating attack traces to show how they interact with system components.
The benefits of our approach are threefold. First, by considering all of the core components simultaneously, we can discover more vulnerabilities than existing frameworks do. For example, suppose an incompetent user is surfing the internet at home, and the indoor camera is running a vulnerable network service. If the user clicks the phishing site created by the attacker, the camera will be exploited. The attacker can use the compromised camera to further spoof an "intruder detected" event to trigger the alarm or other unwanted device behavior. Existing works fail to model user state or behavior and thus will not be able to detect such an attack path.
Second, we identify and model various device interactions. For instance, if technique to detect various vulnerabilities and attacks. To alleviate the size explosion problem of the hypothesis graph, we design a state compression algorithm to intelligently generate independent sub-graphs without compromising the vulnerability detection ability.
In summary, we make the following contributions:
• We formally represent IoT systems as multi-layer graphs to characterize data flow and the interaction of different components.
• We design a risk assessment framework for IoT to capture potential attack paths across multiple layers.
• We propose an optimization algorithm to reduce the state explosion problem by constructing the hypothesis graph based only on the components relevant to the correctness property specified.
• We investigate the effectiveness of our model using a case study which is based on real-world IoT attacks.
• We evaluate the time and space complexity of our framework using the SPIN model checker [18], and the results show that it takes only seconds and around 100 MB of memory to verify hypothesis graphs with millions of nodes when there is a violation of the specified correctness property.

IoT systems connect the physical world to cyberspace. Today's typical IoT systems have complex infrastructure including router(s), gateways (sometimes called hubs, base stations, etc.), end devices, and a cloud backend. Usually there is also a companion mobile app for remote control. The end devices can be categorized as sensors and actuators, which perceive and modify physical states of the system. However, numerous vulnerabilities have been found on IoT devices [19,16,20,17,21,22] and mobile apps [23,24]. Another important feature of
The trigger is some IoT event represented as a device state change; for example, the thermostat senses an increase in environment temperature, or the door lock is unlocked. The action represents some device behavior, such as turning on the light or sounding the alarm. Even though the logic of IoT apps seems straightforward, researchers have identified dozens of malicious IoT apps which may cause system breach or other physical damage. In addition, there are some human users interacting with the IoT system.
For example, the user's presence in a smart home will be sensed by a motion sensor, and the user can take actions such as turning on the TV or opening the window. However, existing works [25,26] checker for the given correctness property. Moreover, granularity is also important when modeling the system. Some IoT features, such as room temperature, are continuous values, but to make the system state finite, we need to discretize them or even make them boolean. There is a tradeoff between the number of potential vulnerabilities we can detect and the time and memory usage.
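The granularity tradeoff can be illustrated with a minimal sketch. The thresholds, cut points, and function names below are hypothetical and chosen only for illustration; they are not taken from the framework itself.

```python
from bisect import bisect_left

def to_boolean(temperature_f, threshold_f=80.0):
    """Coarsest abstraction: a single atomic proposition 'temperature is high'.

    The 80.0 default is an assumed threshold, not a value from the framework.
    """
    return temperature_f > threshold_f

def to_level(temperature_f, cut_points=(60.0, 80.0)):
    """Finer abstraction: index of the interval the reading falls into.

    With cut points (60, 80): level 0 is <= 60, level 1 is (60, 80],
    level 2 is > 80. The cut points are assumptions.
    """
    return bisect_left(cut_points, temperature_f)

def state_count(num_boolean_props):
    """Upper bound on distinct states described by n boolean propositions."""
    return 2 ** num_boolean_props

reading = 72.5
print(to_boolean(reading), to_level(reading), state_count(10))
```

Each additional boolean proposition at most doubles the number of states, which is why a finer discretization can expose more behaviors but costs more time and memory during verification.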

Due to its popularity and high efficiency, we choose the SPIN model checker [18] to verify our hypothesis graph. SPIN accepts PROMELA [32] as its system description language and linear temporal logic (LTL) formulas as the correctness properties to be verified.

In this paper, we consider IoT system vulnerabilities (integrity violations) caused by flawed or malicious apps, users' behaviors, attacks, or their interactions via common channels such as physical environment features or shared devices. Due to the distributed and heterogeneous nature of IoT systems, such violations are difficult to predict. To analyze the attacks' impact on system security, we first need to integrate them into the system transition graph. While some real-world attacks on IoT systems happen at only one layer, many others involve multiple steps at different layers. We follow [33] and name every single step an atomic attack.
Furthermore, we consider both passive attacks and active attacks that happen at all four layers of the IoT system. It is assumed that the attacker is aware of the commercial IoT system architecture. Besides, the attacker knows the communication between the gateway and the cloud; the attacker also knows the protocols used for inter-device communication, as they are industry standards.
The attacker's arsenal consists of all the vulnerabilities listed in Common Vulnerabilities and Exposures (CVE) [34] for the devices installed and the protocols used.
We assume that the remote cloud is trustworthy and do not attempt to model attacks on the cloud. the multi-layer IoT system transition graph. Thereafter, we decompose real-world IoT attacks into atomic attacks [33]. From the atomic attacks and the multi-layer system transition graph, we build the hypothesis graph and perform vulnerability detection with respect to the specified correctness properties. Finally, if there is a violation of the specified property, an error trace is returned to help us identify the cause. In Section 6, we present a state compression algorithm that selects applications and user states relevant to the given correctness Before constructing the IoT hypothesis graph, we should determine the input of the framework shown as gray boxes in Figure 1. For a given IoT system, App and Dev are already known. Then, we can determine Com and Env based and protocol of the system. Once we know the vulnerabilities, we can establish the set of potential attacks on the IoT system. Correctness properties are system-specific. Soteria [35] proposed dozens of properties specific to smart home applications and five general properties such as no conflicting control commands or repeated commands in one code branch, etc. However, to the best of our knowledge, currently there is no work that automatically generates correctness properties or comprehensively deals with user states and behaviors.

Multi-Layer Graph Construction
The heterogeneous and dynamic nature of IoT systems brings huge chal- with the physical environment, and can potentially interact with an infinite number of user states and behavior. Second, devices may be added to or removed from the IoT system frequently. Moreover, the framework should consider both system security and user safety, as they are an essential part of most IoT systems. To deal with the above challenges, we propose a novel formal framework that abstracts a complicated IoT system into a clear, layered structure. Our approach effectively decouples the processing logic of one layer from another so that the vulnerabilities within one layer would not be mixed with others.
Moreover, the multi-layer graph also enables us to detect violations that involve multiple layers, such as inconsistency between the physical and device layers. Figure 2 gives an overview of the IoT hierarchy, which consists of four layers: physical environment layer, device layer, communication layer, and application layer.
We abstract the internal behavior of each layer as a directed, unweighted state transition graph L = (V, E). In the graph, the node v ∈ V represents a and their values constitute distinct system states. Each atomic proposition is a boolean variable and describes the smallest unit of the system state that has the characteristic properties of an IoT element. By representing a system state at one layer using a collection of atomic propositions, we make our multi-layer M is the set of cross-layer edges which indicate the relationships between the adjacent layers and is formally defined as: where M (i,j) is the set of edges from layer i to layer j.
In the rest of this section, we give detailed definitions of system transition for each layer, along with the node mappings (i.e., cross-layer edges).
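One way to picture the structure described above is the following minimal sketch. It assumes that a node is stored as the set of atomic propositions that hold in that state and that cross-layer edges connect adjacent layers only; the class names and the proposition `dev.thermostat.reads_high` are hypothetical.

```python
from dataclasses import dataclass, field

# Layer order, bottom to top, as listed in the text.
LAYERS = ("environment", "device", "communication", "application")

@dataclass
class LayerGraph:
    """One layer L = (V, E); a node is a frozenset of true atomic propositions."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # directed (src, dst) pairs

    def add_edge(self, src, dst):
        self.nodes.update((src, dst))
        self.edges.add((src, dst))

@dataclass
class MultiLayerGraph:
    layers: dict = field(default_factory=lambda: {n: LayerGraph() for n in LAYERS})
    cross: dict = field(default_factory=dict)  # (i, j) -> set of (node_i, node_j)

    def add_cross_edge(self, layer_i, node_i, layer_j, node_j):
        """Record an edge in M(i, j); only adjacent layers may be linked."""
        if abs(LAYERS.index(layer_i) - LAYERS.index(layer_j)) != 1:
            raise ValueError("cross-layer edges connect adjacent layers only")
        self.layers[layer_i].nodes.add(node_i)
        self.layers[layer_j].nodes.add(node_j)
        self.cross.setdefault((layer_i, layer_j), set()).add((node_i, node_j))

g = MultiLayerGraph()
cool = frozenset()                           # no proposition holds
hot = frozenset({"env.temperature.high"})
g.layers["environment"].add_edge(cool, hot)  # intra-layer transition
sensed = frozenset({"dev.thermostat.reads_high"})
g.add_cross_edge("environment", hot, "device", sensed)
```

Keeping intra-layer edges and cross-layer edges in separate containers mirrors the decoupling argued for above: each layer can be inspected on its own, while the `cross` mapping records how a state change propagates upward or downward.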
Physical environment layer describes facts about physical surroundings and the user states in the IoT system, such as room temperature, humidity, the the environment. The edges between nodes denote the system state transition, which may be caused by environmental change or the user's state change.
Suppose v i and v j are two nodes at physical environment layer. One atomic proposition AP l at this layer describes the environmental temperature. If the temperature is larger than threshold θ, to open the window when room temperature is higher than 80°F.
The edge from communication to application layer signifies that the event packets sent by the sensor are faithfully delivered to the decision maker, triggering the update of a variable value in the decision maker. Similarly, an edge from application layer to communication layer indicates that the decision maker's state is updated due to the application rules, and it also generates command packets to be sent to the actuator(s).

Only verifying that a system does not satisfy the property is not sufficient; we should also trace back to identify the root causes of attacks. In our framework, the interconnection among the layers is explicitly captured by their node mappings, which helps trace the influences from one layer to another and finally identify the propagation path of a vulnerability.

IoT App Description Analysis
Since IoT applications decide the functionality of an IoT system and they are dependent on the users' configuration, we need to design an approach to automatically extract the app logic based on app description or the app source code. Our method extracts apps' semantic information from their descriptions using NLP techniques. A typical IoT app description is in trigger-action format where the trigger is some IoT event and the action means some device behavior.
First of all, we use an NLP parser to construct the parse tree and split the sentence into the conditional clause and the main clause by doing a breadth-first search (BFS) on the parse tree to find the tree node with label SBAR, which is the root of the subtree for the conditional clause. Then the conditional clause is obtained by concatenating the leaf nodes of this subtree. The main clause is constructed by removing the conditional clause from the original description.
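The clause-splitting step can be sketched as follows. This is a minimal illustration on a hand-built constituency tree rather than real parser output; the `Node` class, the helper names, and the example sentence are assumptions made for the sketch.

```python
from collections import deque

class Node:
    """A constituency-tree node; leaf nodes carry a word."""
    def __init__(self, label, children=None, word=None):
        self.label = label
        self.children = children or []
        self.word = word

def leaf(tag, word):
    return Node(tag, word=word)

def leaves(node, skip=None):
    """Words under `node` in sentence order, omitting the subtree `skip`."""
    if node is skip:
        return []
    if node.word is not None:
        return [node.word]
    return [w for child in node.children for w in leaves(child, skip)]

def find_sbar(root):
    """Breadth-first search for the first node labeled SBAR."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node.label == "SBAR":
            return node
        queue.extend(node.children)
    return None

def split_clauses(root):
    """Return (conditional clause, main clause) as strings."""
    sbar = find_sbar(root)
    if sbar is None:
        return "", " ".join(leaves(root))
    return " ".join(leaves(sbar)), " ".join(leaves(root, skip=sbar))

# Hand-built tree for the hypothetical description
# "turn on the light when motion is detected".
tree = Node("S", [Node("VP", [
    leaf("VB", "turn"),
    Node("PRT", [leaf("RP", "on")]),
    Node("NP", [leaf("DT", "the"), leaf("NN", "light")]),
    Node("SBAR", [
        Node("WHADVP", [leaf("WRB", "when")]),
        Node("S", [
            Node("NP", [leaf("NN", "motion")]),
            Node("VP", [leaf("VBZ", "is"), Node("VP", [leaf("VBN", "detected")])]),
        ]),
    ]),
])])

conditional, main = split_clauses(tree)
print(conditional, "|", main)
```

In this sketch the subordinating word ("when") stays in the conditional clause; a later filtering step could drop it if only the content words are needed.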
After that, we extract the noun and verb phrases from each clause using a regular expression chunker and match the noun and verb phrases with device names and device actions, respectively. The matching is based on Word2Vec embedding [36]. Because the embedding is only for individual words, we split every phrase into words and choose the highest word pair similarity as the match result.
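The "highest word pair similarity" rule can be sketched as below. The three-dimensional vectors are toy values invented for illustration and stand in for real Word2Vec embeddings; the function names and the candidate device names are likewise hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def phrase_similarity(phrase, candidate, emb):
    """Maximum similarity over all word pairs drawn from the two phrases."""
    scores = [
        cosine(emb[a], emb[b])
        for a in phrase.split()
        for b in candidate.split()
        if a in emb and b in emb
    ]
    return max(scores, default=0.0)

def best_match(phrase, candidates, emb):
    """Candidate whose best word pair is most similar to the phrase."""
    return max(candidates, key=lambda c: phrase_similarity(phrase, c, emb))

# Toy embedding table (assumed values, not trained vectors).
EMB = {
    "light": (1.0, 0.0, 0.1),
    "lamp": (0.9, 0.1, 0.2),
    "door": (0.0, 1.0, 0.1),
    "lock": (0.1, 0.9, 0.2),
    "the": (0.3, 0.3, 0.3),
    "smart": (0.2, 0.2, 0.9),
}

match = best_match("the light", ["smart lamp", "door lock"], EMB)
print(match)
```

Because only the single best word pair counts, a function word such as "the" can contribute a moderately high score to every candidate; filtering stop words before matching would be one way to reduce that effect.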
As an example, the parse tree of an IoT app description is shown in Figure 3. The conditional and main clauses after splitting are "motion detected" and from the node which contains env.temperature.low to the one which contains env.temperature.high) and thus can be pre-defined. The app logic will also be used by the dynamic selection algorithm (explained in Section 6.4).

Hypothesis Graph
Due to the interactive nature of IoT components, attacks may trigger un-
Figure 4: The illustration of Mirai cross-layer attack.
the last one is at the device layer. Assuming the victim device is d t , Figure 4 illustrates the cross-layer attack. Node v i1 ∼v i5 are device-layer nodes represent- remote-to-user attack. To formally define an atomic attack, we need to identify the system and attacker states before and after the attack. Then the attack behavior is represented as the added edge between these two nodes.

Constructing Hypothesis Graph
As is shown in Section 6.1, for some attack, we need to introduce new atomic propositions, we can safely add edges to represent attack behaviors. The resulting graph is an IoT hypothesis graph whose nodes depict system states, including the attacker's state at a certain layer, and whose edges represent state transitions due to environmental change, user's behavior, information flow, or attacker's behavior.
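The idea of representing an atomic attack as an added edge can be sketched as follows. The sketch assumes that a state is a frozenset of atomic propositions and that an attack is described by a precondition plus the propositions it adds and removes; all proposition names, the dictionary layout, and the function name are hypothetical.

```python
def add_attack_edges(states, edges, attack):
    """Return copies of (states, edges) extended with the attack's transitions.

    For every existing state that satisfies the precondition, one labeled
    edge is added to the state that results from the attack.
    """
    new_states, new_edges = set(states), set(edges)
    for s in states:
        if attack["pre"] <= s:  # precondition holds in state s
            t = frozenset((s - attack["remove"]) | attack["add"])
            if t != s:
                new_states.add(t)
                new_edges.add((s, t, attack["name"]))
    return new_states, new_edges

# System transition graph before attacks are considered.
idle = frozenset({"dev.camera.vulnerable_service"})
phished = idle | {"user.clicked_phishing"}
states = {idle, phished}
edges = {(idle, phished, "user_behavior")}

# Hypothetical atomic attack: exploit the camera once the user is phished.
exploit = {
    "name": "exploit_camera",
    "pre": frozenset({"dev.camera.vulnerable_service", "user.clicked_phishing"}),
    "add": frozenset({"attacker.controls_camera"}),
    "remove": frozenset(),
}

h_states, h_edges = add_attack_edges(states, edges, exploit)
compromised = phished | {"attacker.controls_camera"}
print(len(h_states), (phished, compromised, "exploit_camera") in h_edges)
```

This is a single pass over the existing states; a fuller construction would also have to extend the ordinary system transitions to the newly created states.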

Vulnerability Detection
Our framework is built on top of model checking, which takes a system graph and correctness properties as input, and outputs a counterexample if the system does not satisfy a certain correctness property.
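The input and output of this step can be illustrated with a minimal sketch that checks a simple "bad state is never reached" property by breadth-first search and returns the path to the violating state as the counterexample. This is a simplified stand-in and not the algorithm used by SPIN; it does not handle properties that require detecting cycles. The state names and the `find_counterexample` function are hypothetical.

```python
from collections import deque

def find_counterexample(initial, successors, is_bad):
    """Return a path from `initial` to a bad state, or None if none is reachable."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if is_bad(state):
            path = []
            while state is not None:  # walk parents back to the initial state
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

# Toy hypothesis graph given as an adjacency mapping (assumed example).
GRAPH = {
    "idle": ["phished", "light_on"],
    "light_on": ["idle"],
    "phished": ["camera_compromised"],
    "camera_compromised": ["false_alarm"],
    "false_alarm": [],
}

trace = find_counterexample(
    "idle",
    lambda s: GRAPH.get(s, []),
    lambda s: s == "false_alarm",
)
print(trace)
```

Because the search is breadth-first, the returned trace is a shortest path to a violating state, which tends to make the error trace easier to read.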
Specifying correctness property. Correctness properties for a system can be classified as safety properties (that something "bad" will never happen) and liveness properties (that something "good" will eventually happen), which

State Space Compression
A major challenge of model checking is the state explosion problem. Though introducing users and attacks can make the system model more realistic, the number of nodes in the model becomes 2^k times larger (where k is the number of newly introduced atomic propositions representing user and attacker states), thus worsening the state explosion problem. Therefore, we propose a dynamic selection algorithm that selects relevant applications and user states, given the correctness property. Because the algorithm is executed before constructing the hypothesis graph, it can be used regardless of the model checking algorithm chosen.
Our algorithm is based on the observation that every correctness property or IoT application involves environment features and/or actuator configurations.
Moreover, each user state and associated behavior can also be seen as a virtual application. Hence, formally any given application i can be represented as in is the set of input environment features (including user states), A
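One plausible reading of this observation is a backward dependency closure: start from the features named in the correctness property and repeatedly pull in every application (or virtual application for a user state) whose outputs touch a feature already known to be relevant. The sketch below follows that reading only; it is an assumption, not the exact algorithm of the framework, and the app names, feature names, and `select_relevant` function are hypothetical.

```python
def select_relevant(apps, property_features):
    """Select apps that can influence the property, directly or transitively.

    `apps` maps an app name to a pair (input features, output features).
    """
    relevant = set(property_features)
    selected = set()
    changed = True
    while changed:
        changed = False
        for name, (inputs, outputs) in apps.items():
            if name not in selected and outputs & relevant:
                selected.add(name)
                relevant |= inputs  # its inputs now matter as well
                changed = True
    return selected

APPS = {
    "alarm_app": ({"env.smoke"}, {"dev.alarm"}),
    "user_cooking": ({"user.home"}, {"env.smoke"}),  # user state as a virtual app
    "heater_app": ({"env.temperature"}, {"dev.heater"}),
    "light_app": ({"env.motion"}, {"dev.light"}),
}

chosen = select_relevant(APPS, {"dev.alarm"})
print(sorted(chosen))
```

With the toy data, only two of the four apps are kept for a property about the alarm, so the propositions belonging to the heater and light apps never enter the hypothesis graph.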

Case Study
We illustrate our framework's wide applicability by designing hypothesis graphs for two IoT systems: smart home and smart healthcare.

Smart Home
In this subsection, we present a proof-of-concept attack inspired by [7,27,10]   The model checking algorithm first determines that the node with a thick red border is a state which violates the specified correctness property. Then it traces back and marks all the preceding nodes until it reaches the initial node.
Since from the initial system state we can finally reach the violating state, the hypothesis graph does not satisfy the specified correctness property, and the error trace is the red path in Figure 6.

Smart Healthcare
Our framework can be easily applied to other IoT scenarios, such as smart healthcare, or smart factories. Here we show how to create IoT hypothesis graphs for an automatic blood glucose control system, where the insulin pump

Implementation
We implement ForeSee based on the Spin model checker [37]. Spin takes that, we perform verification by running the compiled program. If the system does not satisfy the given correctness property, the verifier will return an execution trace that caused the violation.

State Compression Ratio
To evaluate the effectiveness of our dynamic selection algorithm, we created 6 smart home scenarios whose IoT apps are chosen from the Samsung SmartApps We can see from the figure that the compression ratio ranges from 1.5% to 0.057%, and for the larger sets of apps our algorithm tends to achieve a better ratio. violation of the property. The time and space complexity of hypothesis graphs that pass the verifier are shown in Figure 9(a) and 9(b), while those of the hypothesis graphs that fail to pass are shown in Figure 9(c) and 9(d). The x-axis variable "# states" denotes the number of unique states of the hypothesis graph traversed by the Spin model checker. This is used as a measure of the size of the hypothesis graph. The y-axis variable "Memory (MB)" in Figure 9 denotes the sum of memory used to store all these states, the hash table, the depth-first search stack, and other overhead. Due to the server's memory limit, the maximum number of states we can run is around 1.4 × 10^6. Since Spin will immediately return after detecting a violation (i.e., an acceptance cycle), the verification process takes much less time and space if there is a violation. The scalability of our framework when there exists a violation is shown in Figure 9. From the figures, we can see the time and space cost scale linearly with the graph size, and it takes much less time and memory when there is a violation of the property.

Related work
The research works on IoT security and privacy can be categorized by the components they focus on. communications by rating and sorting the correlated subgraphs extracted from the directed graph generated from the traffic data. Li et al. [43] investigated the side-channel information leakage of video surveillance cameras through streaming traffic data analysis.
Application Layer. Ding et al. [27] presented a framework that discovers potential physical interactions across applications using natural language processing (NLP) techniques and evaluated the risk score of each inter-app interaction chain. Mohsin et al. [44] proposed a formal framework for IoT security analysis based on satisfiability modulo theories (SMT). [35,45] took advantage of model checking to analyze application-level vulnerabilities in IoT systems.

Discussion
Our hypothesis graph uses model checking to detect IoT vulnerabilities. Researchers have also proposed other graph-based approaches for network security analysis [47,48,49]. In this section, we compare these methods and discuss benefits and limitations of our framework.
Cycles in the graph. Attack graphs created by [49,47] utilize a key assumption of monotonicity, meaning once the attacker has gained some privilege, he will always have that privilege in the following attacks. However, this is not true for IoT systems because there could be some negative feedback loops due to automatic control. For example, when the attacker compromises the heater to increase room temperature, an IoT app will sense this physical event and automatically turn on the air conditioner to lower the temperature. Since attack graphs based on monotonicity assumption cannot deal with cycles, they cannot model and detect such attacks. In comparison, we do not make this assumption and our hypothesis graphs can uncover violations of correctness properties involving cycles. Instead of using the original state vectors, this approximation technique [50] uses the hash value of state vectors to index the state array, reducing memory usage to 1 percent while achieving the approximation ratio close to 100 percent [18].

Conclusion
In this paper, we design and prototype ForeSee, a cross-layer vulnerability analysis framework for IoT systems. We propose a formal approach to construct the IoT hypothesis graph which includes all of the core components of IoT systems, including user states and behavior that are largely ignored in existing works, and the potential attacks. We also present an approach to extract the IoT application logic from app descriptions using NLP techniques, and show the experimental results. Besides, we design a state compression algorithm to reduce the size of the generated hypothesis graph. The framework can detect vulnerabilities and threats caused by any interaction between IoT core components. Our evaluation shows that the prototype scales well and works for hypothesis graphs with millions of nodes. The compression algorithm is able to reduce the number of states by three orders of magnitude.