Discovering Vulnerabilities in COTS IoT Devices through Blackbox Fuzzing Web Management Interface

A novel approach for discovering vulnerability in commercial o-the-shelf (COTS) IoT devices is proposed in this paper, which will revolutionize the area. Unlike previous work, the web management interface in IoT was used to detect vulnerabilities by leveraging fuzzing technology. To validate and evaluate this scheme, a tool named WMIFuzzer was designed and implemented. ere were also two challenges: (1) due to the diversity of web interface implementations, there were no existing seed messages for fuzzing this interface and it was inecient while taking random messages to launch the fuzzing and (2) because of the highly structured seed message, fuzzing with byte-level mutation could conduce to be rejected by the device at an early stage. To address these challenges, a brute-force UI automation was designed to drive the web interface to generate initial seed messages automatically, as well as a weightedmessage parse tree (WMPT) was proposed to guide the mutation to generate mostly structure-valid messages. e extensive experimental results show that WMIFuzzer could achieve expected result while 10 vulnerabilities including 6 zero-days in 7 COTS IoT devices were discovered.


Introduction
With the rapid progress of Internet of things (IoT) technologies, more and more devices have been deployed in our daily lives, such as smart home routers and IP cameras. According to a recent report [1], the number of IoT devices will reach 20+ billion in 2020. ese network-enabled devices bring new ability to customers and make their lives easier, while they also attract attacks to compromise them [2][3][4]. Some studies show that many IoT devices are protection-less, and the security vulnerabilities in them are usually easy to be exploited [5][6][7][8][9]. For example, a security researcher in F-Secure found 18 zero-days in a Foscam IP camera, including insecure default credentials, command injection, stack-based bu er over ow, and so on [10]. Security vulnerabilities in IoT devices can cause a big impact because of the huge quantity of these devices. An example is the Mirai that launched a massive DDoS, which made several leading online services inaccessible including Twitter, Pay-Pal, and Net ix [11]. erefore, it is crucial to discover the vulnerabilities in IoT devices before deploying them in practice.
A straightforward approach for discovering vulnerabilities is analyzing implementation aws in the rmware of targeted IoT device, as the rmware provides the low-level control of the device hardware. is is called rmware-based approach that usually contains three steps. Firstly, rmware images are collected from public channels, such as online support service [12]. Secondly, these images are processed by unpacking tools, such as Binwalk, BAT, and FRAK [13][14][15].
irdly, static methods [16][17][18] or dynamic methods [6,15,19,20] are deployed to detect aws in these unpacked les. However, rmware-based approaches su er from known drawbacks. e rst one is the availability of rmware images since many vendors do not release them publicly. e second one is the di culty of unpacking images, as most vendors prefer to pack them with a compression algorithm to reduce the consumption of device storage. Existing tools can only unpack some images because they work upon known algorithms and file formats. ose images packed with a private file format or encrypted with a private key cannot be unpacked by these tools. For example, a prior work [19] collected 23,035 images, and it only succeeded in unpacking 8,617 images of them via existing tools. e third one is the difficulty of binary analysis due to the diversity of underlying architectures. Different IoT devices usually use different chipsets that have customized features (e.g., instruction sets, memory layouts, and so on). It is still challenging to analyze the unpacked binary files without the knowledge of underlying architecture. Due to the lightweight and resource constrain, many IoT devices usually provide network interfaces to allow users to interact with them instead of a screen and a keyboard. Web management interface that is designed for device administration is a popular instance of these network interfaces.
is interface packs user inputs into a message and sends it to the device. After receiving this message, the device parses the message and does more further procedure according to the message content (e.g., executing a targeted program). If there is an implementation flaw in the message parsing or the further procedure, a vulnerability may be exploited. So, an IoT device that has the web interface can be treated as a blackbox, and feeding this box with malformed messages could trigger potential vulnerabilities of it.
Motivated by this, this paper leverages mutation-based fuzzing technology to perform blackbox testing automatically. Since the web management interface is accessed via network, it is fully independent of the device firmware images and can be used to test COTS IoT devices that do not release their firmware images publicly. Additionally, this blackbox testing does not require the knowledge of underlying architecture about the targeted device. Two challenges detailed in Section 3 are needed to be addressed, including the generation of initial seed messages and the mutation of highly structured message.
To validate and evaluate this blackbox fuzzing, a tool named WMIFuzzer was designed and implemented for discovering vulnerabilities in COTS IoT devices automatically. WMIFuzzer was tested on 7 devices, and it discovered 10 vulnerabilities including 7 zero-days. National Internet Emergency Center in China (CNCERT/CC) is a coordination organization [21] that is responsible for analyzing and reducing cyber threats, vulnerabilities, disseminating cyber threat warning information, and coordinating incident response activities. All vulnerabilities found by WMIFuzzer have been reported to the CNCERT/CC, and all of them have been fixed now. WMIFuzzer was also compared to the stateof-the-art fuzzers, AFL and Sully [22,23], and the result showed that it is better than both of them in efficiency and effectiveness.
In short, this paper makes the following major contributions: (i) To the best of our knowledge, WMIFuzzer is the first blackbox fuzzer designed for fuzzing the web management interface in COTS IoT devices automatically (ii) Two challenges about fuzzing this interface were addressed including the generation of initial seed messages and the mutation of highly structured message (iii) We implemented the blackbox fuzzing in WMI-Fuzzer and evaluated it on 7 COTS IoT devices: 10 vulnerabilities including 6 zero-days were found e remainder of this article is structured as follows. In Section 2, the background knowledge of IoT vulnerability discovery is introduced. en, the design of WMIFuzzer is presented in Section 3. In Section 4, the implementation and evaluation are presented. e limitations are discussed in Section 5. e survey of related work in Section 6 is followed by the conclusion in Section 7.

Background
In this section, the background knowledge about discovering vulnerabilities via fuzzing web management interface is introduced.

Typical IoT Network Architecture.
In a typical IoT environment (like the smart home shown in Figure 1), several devices are deployed for different purposes. ere are two types of IoTnodes: a gateway node and multiple sensor nodes. e sensor nodes work by (1) collecting external information and pushing them to the remote user and (2) receiving commands from the remote user and executing them. Some sensor nodes are coupled with Internet capability, such as the camera connected to the homeGate via WiFi. While some cannot access the Internet directly, such as the wristband connected to a mobile phone via Bluetooth and the light connected to a hub via Zigbee. e mobile phone and the Zigbee hub can make a connection with remote user via Internet, and both of them contain a proxy module that works as a bridge between the Bluetooth (Zigbee) and the Internet. e gateway node, which is usually a wireless home router, provides the access point of Internet. e camera, mobile phone, and Zigbee hub are connected to this node. It plays an important role since the insider and outsider of the whole home network are separated by this gateway node.
Unlike the traditional information system, there is usually a lot of sensitive data in the IoT environment (e.g., sleeping patterns, health information, human activity, and child privacy). Since these data are shared through the Internet that is an open network, security and privacy are very critical. Due to the resource constrain, some lightweight authentication protocols are proposed [24,25] for IoT devices. Another recent work proposed a privacy-preserving scheme, called PrivHome, which supports authentication, secure data storage, and query for sensitive data [26]. Since PrivHome is lightweight, it is practical for smart devices which have limited resources.
Most IoT nodes do not provide a screen and a keyboard, and they prefer to provide a network interface to facilitate the user's operations on device management. For example, the customer uses the browser to login to his camera management interface and update video parameters. If the implementation of this network interface contains security vulnerabilities, an attacker can apply them to launch a crime. A prior study shows that security is an important factor for IoT users [27] because people want the devices to be under their control rather than being fully automated. However, vulnerabilities can make people lose control of these devices.
In this paper, we focus on discovering vulnerabilities in IoT devices that have the capability of Internet since an attacker can do more serious harm by compromising them. For example, vulnerabilities in IP cameras had been used to launch the well-known DDoS. Moreover, even if the camera in Figure 1 does not contain any vulnerability, an attacker can also utilize vulnerabilities in the gateway node to redirect the network flow of the camera into an evil Internet node to extract privacy via machine learning [28,29].

Web Management Interface in IoT
Devices. As stated previously, an IoT device usually provides a network interface for users to manage itself. Although there are no standards about how to implement this interface, many vendors prefer to utilize web technology because of its flexibility and simplification [6,30]. However, it is well known that making secure web applications is not a trivial task [31]. e overview of the web management interface is illustrated in Figure 2. ere are 3 parts: FrontEnd, Webserver, and PageHandler.
e FrontEnd is composed of HTML codes, JavaScript codes, CSS codes, and other static resources. e FrontEnd is run in a browser, and it has 2 goals. e first one is to be presented as a graphical user interface (GUI) to receive user inputs and clicking, and the second one is packing user inputs into a request message and sending it to the Webserver. e Webserver and PageHandler are both run in the IoT device; the former focuses on message encoding/decoding and the latter focuses on business logic. In detail, the Webserver firstly decodes the message to URL and other parameters, then it locates the corresponding PageHandler and passes parameters to the handler for further procedure, and finally it encodes the result from the PageHandler into a response message and sends it back to the FrontEnd. e implementations of PageHandler are diverse, such as PHP, Perl, Lua, and CGI-bin [32][33][34]. If there are flaws in the implementation of the Webserver or the PageHandler, sending malformed messages to this interface could trigger them.
Additionally, an IoTdevice is usually designed for a specific purpose, such as the IP camera that focuses on the transmission of captured video data via TCP/IP network. Vendors usually prefer to pay more attention on the improvement of data transmission and video quality, while they just pay little resource on the web management interface including the security. A prior work [5] found a lot of embedded devices are misconfigured in the web management interface by static analysis. erefore, the web management interface is a good surface to discover vulnerabilities in IoT devices.

Fuzzing Technology.
Fuzzing is an automated random testing technology that was first developed by Takanen et al. [35] to understand the reliability of UNIX tools. e core of fuzzing is to fill targeted tools with random testing data in the goal of triggering flaws. Fuzzing has become widely popular in software testing, and many serious vulnerabilities have been found by this technology [23].
Based on how to generate the random testing data, there are two major categories [36]: generation-based fuzzing generates random data by following a data model written manually, while mutation-based fuzzing generates random data by mutating an initial data seeds. Generation-based fuzzing can generate complicated data fields, such as a field that represents the checksum of a set of bytes. SPIKE, Sulley, and Peach are the state-of-the-art fuzzers built on generation-based fuzzing, and all of them are popular in protocol testing [22,37,38]. However, it is challenging to deploy generation-based fuzzing when the data model is not available, since writing data model is laborious, time-consuming, and error-prone. While mutation-based fuzzing Security and Communication Networks 3 performs a small mutation on a set of well-formed seeds and then feeds the targeted program with the mutated seeds to monitor unexpected behaviors. For testing COTS devices that do not have documents about its web management interface, it is unpractical to write the data model for every targeted device. In contrast, collecting a set of seeds for every device to start mutation-based fuzzing is much easier. Based on the complexity, there are three major categories of mutation-based fuzzing [39]: blackbox fuzzing just feeds the target with mutated data and monitors the program status in the goal of detecting abnormal behaviors; greybox fuzzing employs some feedback information (e.g., code coverage) to improve efficiency; and whitebox fuzzing usually performs heavy program analysis (e.g., symbolic execution) to retrieve more internal status to guide the fuzzing. Whitebox fuzzing has a well-known drawback of scalability about targeted program [40], so it is challenging for testing real-world programs. Greybox fuzzing usually needs to make modification on the original program (source code or binary code) [41], and it cannot work for those programs that cannot be modified. For the web management interface of COTS IoT devices, the Webserver and Page-Handler are embedded inside the device.
is constraint makes whitebox fuzzing and greybox fuzzing to not work. In contrast, blackbox fuzzing is independent of program size and it does not require program modification. So, it is practical to discover vulnerabilities in the web management interface by blackbox fuzzing.

WMIFuzzer
In this section, we firstly introduce the challenges in WMIFuzzer via an example in Section 3.1, secondly we present the system architecture in Section 3.2, thirdly we present the detailed design about seed generation in Section 3.3 followed by the WMPT for representation of the highly structured message in Section 3.4, and finally we present our fuzzing upon the WMPT in Section 3.5.

Challenges in WMIFuzzer
3.1.1. An Example. To better understand the problem, we use an enterprise gateway as the example. is device has a web management interface for administration, and its firmware is available publicly. However, the firmware may be encrypted with a private key or be compressed with a private format since it cannot be unpacked via existing tools. So, it is unable to discover vulnerabilities in this device via firmware analysis.
As stated previously, the FrontEnd of web management interface is run as a set of GUIs in a browser. Figure 3 illustrates one GUI designed for configuring the TCP/UDP session limitations and the corresponding message generated by this GUI. As can be seen from it, this message is highly structured. It contains two parts: header and contents, and both of them are following the HTTP protocol specification [42]. In the header part, conn_limit.cgi represents the targeted PageHandler. If the Webserver makes an incorrect assumption about the length of PageHandler, a memory overflow flaw may be triggered by mutating it with a long string. In the contents part, COMN_LIMIT_NUM � 1000 represents the parameter about TCP session limitation. If PageHandler assumes that it is a number, an unknown bug may be triggered by mutating it with COMN_LIMIT_NUM � ABCD. To discover vulnerabilities in this device automatically, one possible approach is mutating every message generated by all GUIs and sending them to the targeted device. However, two challenges are required to be solved in order to launch this mutation-based fuzzing.

Challenge 1: Generation of Initial Message Seeds.
Mutation-based fuzzing works upon some well-formed seeds [23], or it will generate a lot of invalid test cases to make the fuzzing inefficient. For fuzzing traditional file format processing software (e.g., BMPViewers), a BMP file can be used to test different BMPViewers since all BMPViewers follow the same format specification. In contrast, an existing web management interface message cannot be used for testing device devices as there is no unique specification for the implementation this interface. For example, the field named CONN_LIMIT_NUM in Figure 3 is specific for this example device, and it will be invalid for another device. So, an automatic approach is needed for generating the initial valid message seeds when testing different COTS IoT devices.

Challenge 2: Mutation of Highly Structured Message.
Traditional mutation-based fuzzing mainly focuses on unstructured data, most fields of which are at fixed location and have a fixed length. For example, the field about the BMP file length is located in offset 2 with 4 bytes, and its value can be from 0 to max integer number. No matter how the byte-level random mutation changes this field, all BMPViewers still treat this 4 bytes field as an integer number and parse it. In contrast, most fields in the highly structured web management interface messages are not at fixed location and their length is also not fixed. Byte-level mutation on these messages usually corrupts their structure to cause mutated messages being rejected by the device at an early stage. For example, the POST field in Figure 3 is the HTTP method, and mutating its byte S to the blank ASCII char will cause the remaining ST to be a new field named URL followed by the old URL. e Webserver will discard this mutated message to stop more deep procedure since it contains two URL fields. erefore, an approach is required for handling the mutation of highly structured messages.

Solutions.
Fortunately, this paper obtained the following approaches to address the above challenges: (i) Generating Message Seeds via UI Automation. UI automation [43] is an automatic method to emulate user operation such as inputting data to UI elements and clicking buttons to submit the inputting. Observed on this, UI automation can be applied to input data and click a button to drive the FrontEnd to generate a web management interface message. Since the network of a browser can be intercepted, the generated message can be captured as a seed. By trying all GUIs via UI automation, sufficient messages can be generated as the fuzzing seeds. (ii) Mutating a Message Seed Based on Its WMPT. Abstract grammar tree (AST) [44] is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code. e syntax is "abstract" in the sense that it does not represent every detail appearing in the real syntax, but rather just the structural, contentrelated details. Inspired by the AST and HTTP protocol, dissecting a message to a tree and modifying the content of a node can keep the structural information and perform mutation simultaneously. In this way, the weighted message parse tree (WMPT) is proposed to perform the mutation to generate mostly structure-valid messages. is will bring the blackbox fuzzing the ability to detect deep vulnerabilities in IoTdevices, such as the PageHandler.

System Architecture of WMIFuzzer.
e architecture of our WMIFuzzer is illustrated in Figure 4. At a high level, there are 3 parts: seed generation, WMPT parsing, and fuzzing upon WMPT. eir goals are presented as follows: (i) Seed Generation. is is the first part that utilizes UI automation to generate seed messages as much as it can. In detail, it firstly tries to explore every GUI via a browser and then triggers GUI events to drive the browser generating original messages. A HTTP proxy is deployed to intercept these messages sent by the browser and save them as the initial seeds. All of these steps are fully automatic. (ii) WMPT Parsing. is part is converting an original message to its WMPT detailed in Section 3.4. e goal of the WMPT is to store the message structure into a tree and store the content of message fields into the tree nodes. In this way, mutation on the nodes of a WMPT will mostly generate structurevalid messages. (iii) Fuzzing upon WMPT. e last part is performing fuzzing schedule on the WMPT generated previously. It performs random mutation to generate structure-valid messages and sends them to the device under testing. To infer whether a vulnerability is triggered, several monitoring strategies are deployed to monitor the device running status.

Seed Generation via UI Automation.
As stated previously, WMIFuzzer is built on mutation-based fuzzing that needs a set of valid seeds to start the fuzzing. is section describes how to generate valid messages as the initial seeds by UI automation. (i) I is the set of GUI elements used to receive user inputs in which α ∈ I is called input control (IC) (ii) S is the set of GUI elements used to receive user clicks in which β ∈ S is called submission control (SC) e procedure of UI automation is enumerating every c j ∈ S, inputting data to I j , and clicking c j . Figure 5 is one example web management interface GUI that has two SCs named Save and Check again. is GUI has two mappings: (1) the Save is mapped to fours ICs named Enable, Disable, 02, and 00 and (2) the Check again is mapped to none of these ICs. With these mappings, UI automation firstly inputs four values to these four ICs and then clicks the Save or directly clicks the Check again without inputting.
In practice, the mapping M in Definition 1 is generated by software testing engineers by hand. However, an automatic approach for building the M is required since WMIFuzzer is designed to fuzz different IoT devices automatically. A previous work [45] that utilizes UI automation to detect IoT account vulnerabilities builds this mapping by the observation that many IoT control apps share the same style for the account management GUI, and most of these apps use the default ICs and SCs. In contrast, the GUI of web management interface in IoT devices is diverse, as different types of devices are usually designed for different purposes. Even for the same type of devices, most vendors prefer a new GUI style to make themselves different from other vendors. So, WMIFuzzer cannot depend on the GUI style, and it has to analyze the codes of FrontEnd to identify I, S, and M.
In the development domain of FrontEnd, the browser renders the GUI based on three parts: HTML codes, Java-Script codes, and CSS codes. HTML is the standard markup language for creating web pages, and HTML elements are the building blocks of a web page.
e JavaScript codes process the business logic (e.g., submitting user inputs), while the CSS codes describe the style of E. is means that every κ ∈ E is declared as an HTML element in the HTML code. According to the HTML specification [46], different types of HTML elements have different purposes: input and textarea elements are defined as ICs and input-with-submit and button elements are defined as SCs. erefore, WMI-Fuzzer should identify I and S from all elements in the HTML code. Unfortunately, it usually fails because of the flexibility of HTML elements. In Pseudocode 1 we present the core HTML codes of the GUI presented in Figure 5.
From the above codes, we can get the following information: (i) Check again can be identified as a SC since it is a button element in code line 3. (ii) Enable and Disable cannot be identified as ICs since they are img elements in code lines 5-6, but the FrontEnd utilizes two img elements with booleanstyle value attribute to work as a checkbox-style IC. So, these two img elements are not ICs, but they have the same functionality of ICs. (iii) 02 and 00 cannot be identified as ICs because they have the read-only attribute in code lines 9-12, but their parent element has a click event that can dynamically modify the read-only attribute. (iv) Lines 15-16 contain two useless ICs that have hidden attributes to make themselves invisible in the GUI. (v) Save also cannot be identified as a SC since it is a div element in code line 17, but this element has been binded with a click event handler in code line 19. e event handler collects data from two read-only elements and saves data to a hidden element hosted in a form in code lines 20-21. At last, the event handler triggers the form submitting a request in code line 22. So, this div element has the same functionality of a SC.
In short, every HTML element can be an IC or a SC because it can use value attribute and event handler to receive user inputs and button clicks. is is the flexibility of  Security and Communication Networks web technology, and it makes identifying the ICs and SCs difficult. Program analysis (e.g., symbolic execution and taint tracking) can be utilized to explore whether a HTML element can collect data or submit data to help identify the ICs, SCs, and their mappings, but their known limitations make them inefficient for the web interface.
As it is a challenge to identify ICs, SCs, and their mappings, a brute-force UI automation that makes a tradeoff between the accuracy and the practice is proposed.
is means (1) every element is an SC, (2) every element is an IC, (3) and every SC is mapped to all ICs. So, M is very simple, and its size is a constant that matches the element count of E. It seems that this brute-force UI automation works well at most times, since the event handler of most SC ignores those ICs that are not really mapped to this SC. In this way, WMIFuzzer can use UI automation to drive the FrontEnd to generate messages automatically by the following steps: (i) Initialization. In this step, the UI automation firstly starts a browser, which is run under control, to open a welcome page that guides the user to input the IP address of the targeted IoTdevice. en, the browser opens the login page of the web management interface to guide the user inputting his credentials to login to the interface. is is the only step that requires a little of manual efforts because this interface usually denies accesses without successful login. On the contrary, this is also the standard step for managing an IoT device via web management interface. e whole step does not require any knowledge about software testing and vulnerability discovering.  [47] can be utilized to intercept the network-related APIs in the browser, but API hooking is not a robust approach, so another approach named browser network proxy that is supported by most browsers natively is utilized in WMIFuzzer. By installing a proxy and configuring the browser with this proxy, all messages will be transparently sent to the proxy instead of the real device.
In a word, WMIFuzzer overcomes the challenge of seed generation by utilizing the brute-force UI automation.

Weighted Message Parse
Tree. Traditional mutationbased fuzzing works by randomly selecting a set of bytes in a seed, modifying them and feeding the target with this mutated seed. However, it cannot work well for the web management interface messages that are highly structured, as pure byte-level random mutation mostly generates structure-invalid messages to make the fuzzing inefficient. Inspired by the abstract syntax tree [44,48], WMIFuzzer applies the WMPT to represent the seed message and perform random mutation on the WMPT.
A WMPT is an ordered rooted tree with typed nodes where each node has a weight scope, and this tree has two types of nodes: leaf nodes and internal nodes. A leaf node contains an atomic part of message content while an internal node concatenates its child nodes. e weight of a node means how much time should be assigned to the mutation of this node. Figure 6(a) is an example message about web management interface and Figure 6(b) is the corresponding WMPT. All green nodes are the leaf nodes that store the atomic content of a seed message, and others are internal nodes that store the structure of their child nodes. Obviously, the mutation of a leaf node will not change the structure of the WMPTat most times. Moreover, inserting or deleting some internal nodes can generate mutated messages that have a big difference from the original seeds. For a mutated tree, collecting the content of its nodes and contacting them will output a message.
Based on the mutation of leaf nodes and internal nodes in WMPT, WMIFuzzer can generate both structure-valid and structure-invalid messages on demand. So, it is capable to detect shallow and deep vulnerabilities. ere are three operations on the WMPT: deserialization is converting a seed message to be a WPMT and mutation is modifying the WMPT while serialization is converting the mutated WMPT to a new message.

Deserialization Operation.
As the web management interface messages are built on HTTP protocol, WMIFuzzer follows the protocol specification to dissect a seed to its WMPT. Algorithm 1 details the procedure of seed deserialization. For every seed message α ∈ M, WMIFuzzer needs to perform deserialization for converting it to a WMPT t ∈ T. According to the HTTP protocol specification [46], the web interface message is an ACSII string composed of 3 parts: requestLine, headers, and optContent. e requestLine is the first line of the message, the headers are the following lines that end with a special empty line, and the optContent is the whole remaining data determined by some special line in the headers. So, Algorithm 1 firstly splits a message into these three parts (nodes 2, 3, and 4 in Figure 6) since this is the mandatory constraint about HTTP message. e requestLine contains three parts: a request method that is a predefined ASCII string (e.g., GET, POST, and so on), a targeted URL that is composed of pagePath and optional parameters, and a predefined protocol version (nodes 5, 6, and 7 in Figure 6). e two parts of a target URL are concatenated via the symbol "?" (nodes 17 and 18 in Figure 6). e pagePath is separated by the symbol "/" (nodes 21 and 22 in Figure 6), while the parameters are encoded as k-v blocks concatenated via "&" (nodes 23 and 24 in Figure 6) and k-v is concatenated via the symbol "�" (nodes 27 and 28 in Figure 6). In this way, Algorithm 1 generates the requestNode and inserts it to the WMPT t. Since the targeted URL is the most customized part, a double weight scope was assigned on its child nodes to perform more mutation.
e hea de rs are a set of header lines, that is, a k-v style string, and the separator is a colon. ere is no limitation of post /path1/path2?a = 1&b = 2 HTTP/1.  To parse each line, we also follow the k-v style to decode the line. Because the data format of the value content in a k-v header line is not known, heuristic strategies are applied to parse the content into basic elements by scanning special chars including blank, comma, semicolon, and "&". In this way, Algorithm 1 generates the headersNode and inserts it to the WMPT t. e optContent is an optional part, and there is no limitation of its content. e content length is declared in a well-known header named Content-Length in headers, and it is strongly recommended to declare the content type in another well-known header named Content-Type. In the web development domain, there are 3 popular types: application/ x-www-form-urlencoded, multipart/form-data, and application/json. For these three types, WMIFuzzer deserializes the content following their specifications, while for other types, the content is deserialized as a bytes stream. With this, Algorithm 1 generates the contentNode and inserts it to the WMPT t. Finally, a message is dissected to its WMPT.

Mutation Operation.
As stated previously, the mutation is performed on the WMPT nodes to generate mostly structure-valid messages. For a leaf node, WMIFuzzer mutates its content with random data directly. Since the HTTP message is built on text string that ends with a NUL character (NUL has all bits zero), NUL is imported into a mutated message before the tail will make the message to be truncated. For example, node 7 in Figure 6 is the protocol version, and the hea de rs and optContent will be truncated by the Webserver in IoT devices when a char of this node content is mutated to NUL. So, NUL is filtered when performing mutation. In detail, when mutation will change a byte to NUL, WMIFuzzer dynamically generates a random number, that is, 1100 and checks whether it is bigger than a threshold (now is 97). If the result is TRUE, WMIFuzzer accepts this mutation, or it just ignores this mutation. For an internal node, WMIFuzzer deletes the node or deletes a child node randomly to the mutation on a leaf node, this mutation will bring a big difference.

Serialization Operation.
Serialization is converting a WPMT to its corresponding message, and it is very straightforward: traversing the WPMT depth-first and contacting the content of every leaf node. e contacting symbol is determined by the parent node of every node. Especially, the serialization will keep the parent node symbol when a leaf node is mutated to an empty value. For example, node 30 in Figure 6 is a parameter value in URL, and the serialization of its parent (node 24) will be "b=": In contrast, serialization will not keep the parent node symbol when a leaf node is mutated to be deleted. For the previous example, the serialization of node 24 will be b.

Fuzzing upon WMPTs.
Based on the generated WMPTs, the Mutator in Figure 4 can be scheduled to generate structure-valid mutated messages. en, these mutated messages are sent to IoT devices in the goal of triggering vulnerabilities detected by the Monitor.

Fuzzing Scheduling.
ere are multiple nodes in a WMPT, and it is unknown that which node with mutated content can trigger a vulnerability. So, scheduling as shown in Algorithm 2 is proposed to perform the fuzzing. ere are two phases of fuzzing: determined phase and random phase. e determined phase is performing limited mutation on a single leaf node to trigger vulnerabilities as fast as it can. Every seed message α ∈ M will be processed once in this phase. For a leaf node β ∈ α, sequentia_mutating tries following modifications including Input: the set of seed messages, M; Output: the set of generated WMPT, T; (1) initial T � ∅ (2) for each α ∈ M do (3) initialize an empty WMPT t � ∅ (4) split α to 3 parts {requestLine, headers, optContent} (5) generate requestNode from requestLine and insert it to t (6) initialize an empty headersNode � ∅ (7) for echo β ∈ headers do (8) generate header from β and insert it to headersNode (9) end for (10) insert headersNode to t (11) if optContent is not empty then (12) generate contentNode from optContent and insert it to t (13) end if (14) end for (15)  For example, the serialization of the URL parameters will be a � and b � 2 when node 28 in Figure 6 is mutated to empty content. As the content owned by the key a is empty now, an implementation flaw may be triggered if the Webserver or the Page-Handler makes an incorrect assumption. (iii) Replacing the numerical-style content with typical integer number in the goal to trigger vulnerabilities about integer overflow. Because the HTTP message is text based and there is no standard approach to identify whether the node content is a string or a number, WMIFuzzer just applies regular express to match whether the content looks like an integer number. If the matching result is FALSE, it will be treated as a string, or the content will be replaced with classic boundary integer numbers: 2 i , 2 i − 1, and 2 i + 1, where 0 ≤ i ≤ 32. (iv) Replacing the string-style content with special strings in the goal to trigger various vulnerabilities. FuzzDB [49] is an open-source project that collects a lot of popular strings used in multiple security tools, and WMIFuzzer also uses its string database to replace string-style content.
(v) Changing the content type in the goal to trigger vulnerabilities about assumptions on the data type. For the node that matches an integer number, it will be replaced with an ASCII string, or the node content is treated as a string and will replaced with an integer number. For example, node 28 in Figure 6 is an integer string, and the Webserver will convert it to an integer number. However, the conversion may fail when the integer string is mutated to a nonnumerical string, and a crash may occur when the Webserver does not check the result of conversion.
In contrast, the random phase is an endless loop that performs random mutation on all seeds repeatedly. When fuzzing a WMPT, only a subset of its nodes will be mutated at a time.
is is because the generated message will be useless and rejected at an early stage at most times if most nodes are mutated. e randomIndexList in Algorithm 2 is firstly used to generate a list of index values that identify which nodes should be mutated in this iteration. And the random mutating is used to perform mutation as shown in following: (i) For a leaf node, both the modifications in the determined phase and random byte-level are fully supported. In contrast to the determined phase, the major difference is that every mutated message in this phase may contain modification on multiple nodes. (ii) For an internal node, deleting itself and deleting a child node are both supported. As there are multiple mutations on different nodes via indexList, deleting a node from the tree will make the indexList invalid and the remaining mutations may trigger a crash. Inspired by the copy-on-write [50] that is popular in modern operation system (e.g., Windows and Linux), WMIFuzzer does not really remove the node from the tree, and it just marks the node 'deleted' to notify the serialization of WMPT ignoring this node. e messages generated in these two phases will be sent to the IoT devices under testing. If an interesting event in the following Remote monitoring is detected, an alert will be generated with the current mutated message to help further manual inspection.

Remote monitoring.
Monitoring running status is very important because it is the only way to infer whether a vulnerability is triggered in a COTS device. However, it is challenging to monitor the running status since these devices usually do not support local monitoring to make a lot of information unavailable (e.g., running processes, threads, CPU overhead, and memory statistics). So, the main surface for device monitoring is the network communication. If a vulnerability causes the device to crash, the device network will be unavailable and this can be detected in the outside of the device. But not every vulnerability will cause the system crash, such as the Command Injection and Interface Leak. If Command Injection executes a network-related system command in the device, the network status of device will also Input: the set of generated WMPT, M; (1) for each α ∈ M do (2) for each node β ∈ α do (3) if β is a leaf node then (4) tm � sequentia_mutating(β) (5) result � sending_monitoring(tm) (6) if interesting(result) then (7) alert(tm) (8) end if (9) end if (10) end for (11) end for (12) loop (13) for each α ∈ M do (14) indexList � randomIndexList(0, len(α)) (15) for each λ ∈ indexList do (16) tm � random_mutating(λ) (17) end for (18) result � sending_monitoring(tm) (19) if interesting(result) then (20) alert(tm) (21) end if (22) end for (23) end loop ALGORITHM 2: WMIFuzzer fuzz scheduling algorithm. be infected and this can be monitored outside. So, WMI-Fuzzer uses two commands, reboot and ping to infect the device network; the former can make the device to go offline and the latter can send ICMP message. e root cause of Interface Leak is incorrect authentication, and WMIFuzzer uses a simple strategy to monitor this type of vulnerabilities. If a mutated message contains modification in the authentication data or in the URL but the response status is successful, a leak vulnerability is triggered. Given the device has two phases to consume the mutated messages, Webserver and PageHandler, we analyze the potential network status when a flaw is triggered in each phase: (i) Webserver Phase. If a message just crashes the Webserver, there is no response message after the mutated message is sent to the device, and the Webserver may become unavailable if it is running in a single-process model while the device low-layer network is available. Moreover, if the crash also infects the system at the same time, the device may reboot due to the following reasons: 1) no response , 2) Webserver unavailable, and 3) low-layer network unavailable. Otherwise, the Webserver will reply a response message, and both the low-layer network and Webserver are reachable. (ii) PageHandler Phase. If a message crashes the Page-Handler, the fuzzer will not receive the response message while the Webserver is available. Moreover, if the crash also infects the system, the network status is the same as the crash in the Webserver and they cannot be distinguished from each other.
In a word, WMIFuzzer monitors the status of the lowlayer network, Webserver, and PageHandler to detect multiple types of vulnerability including crash, Interface Leak, and Command Injection.

Implementation and Evaluation
We present the prototype implementation of WMIFuzzer in Section 4.1 and the evaluation in Section 4.2 and then briefly study some vulnerabilities found by WMIFuzzer in Section 4.3.

Implementation of WMIFuzzer.
WMIFuzzer was implemented with around 3,000 Python lines of code and 2,000 C lines of code in total. Also, several open-source projects (e.g., Chrome, Selenium, mitmproxy, and fuzzdb [49,51,52]) are integrated into this fuzzer to avoid reinventing the wheel.
In the seed generation phase, the brute-force UI automation was built based on Chrome and its Selenium driver. Python code was written to use the Selenium driver to control the Chrome behavior, such as opening a URL, inputting data, and clicking a button. e mitmproxy project, a HTTP proxy written in Python code, was extended to intercept the web management interface messages.
In the WMPT parsing phase, C code was written to implement the deserialization, mutation, and serialization. HTTP protocol specification was used to parse the URL, headers, and content. To reduce redundant parsing of the content type of a node, the data type is calculated once when converting a message to its WMPT and saved into the node. Inspired by the copy-on-write strategy, the mark-deleted flag was applied on the WMPT nodes to reduce unnecessary memory operations. e C code was compiled to a dynamic library that can be loaded by Python code.
In the Fuzzing phase, Python code was written to schedule the fuzzing. After a mutated message has been sent to the device, the reachability of Webserver, PageHander, and the low-layer network is checked to detect crashes and command injection in the targeted device. In addition, the response message was analyzed to detect interface leak.

Evaluation of WMIFuzzer.
In order to evaluate the effectiveness and efficiency of WMIFuzzer, we tested it on 7 IoT devices and compared it with two state-of-the-art fuzzers: American Fuzzy Lop (AFL) [23] and Sulley [22].

Testing Devices.
We bought 7 popular IoT devices, and all of them contain the web management interface. After manually checking their firmware images, 4 of them have been released publicly while others were not. For the public firmware images, popular tools were deployed to unpack them: Binwalk [14] and BAT [15]. And just one of these images can be unpacked, but it fails on being deployed in an emulator presented in a previous work [19]. e detailed specifications of these devices are described in Table 1, where FirmwareAva means firmware is released publicly, Firm-wareDec means firmware can be unpacked, and Firmwar-eRun means firmware can be run in an emulator environment.

Testing Environment.
WMIFuzzer and the other two fuzzers are run in a separate virtual machine that hosts an Ubuntu 16.04 with Intel Core i7 quad-core 3.6 GHz CPU with 4G RAM. e original branch of AFL cannot work on the web management interface because it works by testing targeted program locally. So, its code was patched to create two versions: AFL 1 and AFL 2 . AFL 1 replaced its code about target program execution with sending mutated data and waiting for a response. AFL 2 extended the AFL 1 with remote monitoring strategies used in WMIFuzzer. However, the fuzzing schedule of AFL was never changed. Sulley is another state-of-the-art fuzzer that is popular for protocol fuzzing, but it needs a data model to start fuzzing. As different seeds have different fields, every seed needs a data model in Sulley. Since it is laborious, time-consuming, and error-prone to write data models by manual analysis of every seed, a data model generation that treats every message as a set of tokens separated by CR-LR or blank was used in this experimentation. In detail, parsing every seed to tokens once can generate its data model that can be integrated into the Sulley 1 . Sulley 2 extended Sulley 1 with remote monitoring strategies used in WMIFuzzer. Anyway, the seed messages generated by WMIFUzzer are used as the initial seeds for AFL and Sulley.

Research Questions.
Using the previous experiment setup, we aim to answer the following research questions: (i) RQ1. Can WMIFuzzer discover vulnerabilities in COTS IoT devices based on its seed generation and fuzzing of WMPT? (ii) RQ2. Can WMIFuzzer outperform the state-of-theart fuzzers by performing mutation-based fuzzing on the same initial seeds? Table 2 lists the unique vulnerabilities found by WMIFuzzer. For each device under testing, WMIFuzzer firstly applies the UI automation in Section 3.3 to generate seed messages in 1 hour automatically. en, it converts the seed messages to WMPTs and starts mutation-based fuzzing in 23 hours. At last, it discovered 10 vulnerabilities: 3 command injections, 4 interface leaks, and 3 crashes. All of them have been reported to the CNCERT/CC [21] that cooperates with many vendors to fix vulnerabilities in their products, and 6 CNVD-IDs are assigned as they are zero-days. e two crashes (Vul-ID-003 and Vul-ID-009) have been reported by other researchers independently before this paper did, so they are not been assigned with a CNVD-ID. e crash (Vul-ID-006) has been reviewed by the vendor, and they inferred that it is not security related, so it was not assigned with a CNVD-ID. Vul-ID-004 and Vul-ID-005 are different vulnerabilities in a same device, but they have been combined to be assigned with one CNVD-ID.

Effectiveness of Vulnerability Detection (RQ1).
In addition, most of these vulnerabilities are of high impacts because they can be exploited remotely. An attacker can utilize them to compromise the targeted device, such as installing malware into a router for further cybercrimes and rebooting a camera for bypassing surveillance. At the time of this writing, patches have been released to fix these vulnerabilities.
ese results indicate that WMIFuzzer can discover vulnerabilities in the web management interface of COTS IoT devices automatically based on its seed generation and mutation-based fuzzing.

Efficiency of Vulnerability Detection (RQ2).
We measure the efficiency in terms of vulnerabilities discovered over quantity and time. Table 3 lists the results by comparing WMIFuzzer with AFL and Sulley; it shows that WMIFUzzer can detect more vulnerabilities than the other fuzzers. In detail, WMIFUzzer detected 10 vulnerabilities including Command Injection, Interface Leak, and Crash while the other two fuzzers just detected a subset. Moreover, AFL and Sulley take more time than WMIFuzzer for detecting the same vulnerability.
We performed a further manual analysis and found the following: (1) Command Injection and Interface Leak vulnerabilities cannot be detected by both AFL 1 and Sulley 1 because these two fuzzers can both receive response messages from the devices where no crashes are triggered. However, AFL 2 and Sulley 2 can detect them because of the integration with the remote monitoring strategies used in WMIFuzzer.
(2) Sulley 2 can detect more vulnerabilities than AFL 2 , and Sulley 2 also takes less time on the same vulnerability. After manual checks, we get the result that byte-level mutation in AFL 2 generated a lot of structure-invalid messages rejected by the device. In contrast, the data model in Sulley 2 can guide the mutation to generate structure-valid messages. Actually, the data model in Sulley 2 can be considered as a simple version of WMPT, where mutation on internal nodes is not permitted, every leaf node has the same weight scope, and every generated message just contains one mutated leaf node.
ese results indicate that WMIFuzzer is competitive compared to the state-of-the-art fuzzers: AFL and Sulley, in terms of quantity and time about vulnerability detection in the web management interface of COTS IoT devices.

Feiyuxing Enterprise
Gateway. VE602W+ is a wireless gateway designed for small office and company, and it contains web management interface. e firmware of this device has been released publicly, but it is encrypted or packed with a private format, and it cannot be unpacked by existing tools. erefore, WMIFuzzer can be used to detect vulnerabilities in this device by blackbox fuzzing. After 30 minutes, WMIFuzzer reported a potential interface leak vulnerability as a message with mutated URL can get a successful response. is mutated URL is "http://192.168.0. 1/.htpasswd," and the response content is "admin: $1$$FQzEy2IhIMAL60u9OrHLp1." After manually analyzing the content, it can be decrypted by John the Ripper password cracker [53]. e result is "admin:147258369," which is just the credential of the web management interface. So, anyone who can access the router work can get the password of this device no matter how the administrator changed his/her password. After another 4 hours, WMIFuzzer reported a potential Command Injection vulnerability as another message triggered the device sending a ICMP request to the fuzzer. We reviewed the mutated message content in Pseudocode 2 and found the difference that "PING_ HOSTIP � 11" has been mutated to "PING_HOSTIP � 11 | ping 192.168.0.11." Further inspection showed that the module named System Diagnosis is responsible for this message. is module executes ping command inside the device, and the parameter of ping is referenced to the user-supplied PING_HOSTIP. It seems that the developer must have forgotten to sanitize this content. So, evil content can execute additional commands. en, the nmap [54] is deployed to scan this device, and two special open ports are found: 23, that is, the telnet service, and 10089, that is, the SSH service. At last, the upper vulnerabilities were used to crack the device system credentials, and we succeeded in logging into the SSH service with the cracked credentials. Moreover, by manually checking the web interface, we found that there is no entry for closing these two services. is means that an attacker can utilize these vulnerabilities to control the gateway device independent of the web interface authentication.

Phicomm Smart
Router. K2 is a smart home router that can be managed via a browser or its official app. e firmware of this device is released on the vendor website, Binwalk can unpack it, and its web management interface is implemented by Lua [33] scripts. We could not find a tool that is designed to detect vulnerabilities in Lua scripts, and we failed running the firmware in the emulator [19] because of some hardware features unavailable in the emulator. en, WMIFuzzer was deployed to perform the blackbox fuzzing, and a crash was reported after 4 hours. Manual analyzing the crash showed that the root cause is the URL about time reboot module. is module is designed for the administrator to configure the router rebooting at a special time automatically, and the main part of the message is listed in Pseudocode 3.
Compared to the original seed message, the additional content is the "-reboot." By reviewing all unpacked Lua scripts, this crash was hosted in the script file that contains the core codes in Pseudocode 4.
is script firstly gets the request parameters named timeRebootEnablestatus and timeRebootrange, and then combines them to construct the parameters of luci.sys.call without any sanitization. So, malicious parameter in this message can execute extra commands via luci.sys.call.
By this Command Injection vulnerability, we detected the underlying architecture and found that it contains Wget that can download a file from a remote server. en, Wget was applied to download the corresponding Dropbear, a popular SSH server for embedded devices, and install it into the device. Finally, we got the full control of this router independent of the web management interface authentication.

NEO IP Camera.
In the manual, customers are suggested to manage this device via its official control app while the web management interface is also enabled. e vendor does not release the firmware publicly, and we also found  nothing from online search engines, such as Google and Baidu. So, WMIFuzzer was deployed to fuzz its web management interface automatically. After 18 minutes, a crash was reported and the corresponding mutated message was captured in Pseudocode 5. e content of authentication header has been appended with a number of ASCII char A. In our whole experimentation, both AFL and Sulley can trigger this crash, but they take more time than WMIFuzzer. We manually analyzed the mutated messages in these two fuzzers and found that the crash will not be triggered if the mutation changed the token Basic. As AFL cannot treat Basic and the YWRtaW46YWRtaW4� as two atomic fields, its mutation rarely generates a message that keeps the first field unchanged while the second field is modified to a long string. In contrast, both Sulley and WMIFuzzer can identify these two atomic fields, so they can detect the vulnerability quickly.

Discussion and Limitations
Although WMIFuzzer can discover vulnerabilities in COTS IoT devices efficiently, there are still some avenues for future improvements.

Interface Protection in Device.
To detect vulnerabilities as fast as it can, WMIFuzzer sends a lot of messages to the device at a high speed. But if there is protection about message speed in the device firmware, most messages will be blocked to make WMIFuzzer inefficient. Message context is another protection, such as the CSRF token [55], which is designed to prevent CSRF attack. If this web interface contains CSRF protection, every message sent to the device must contain an unequal token extracted from the previous response. is means that a message will be invalid if it had been sent to the device once. is will make WMIFuzzer to not work since mutated messages from a seed have the same invalid CSRF token.

Limitations of Message Seed Generation.
WMIFuzzer generates the initial message seeds by crawling the web pages of an IoT device. HTTP and HTTPS are very popular in transferring the web page content of IoT devices, and WMIFuzzer works well for these devices now. However, if a security scheme is deployed for this transmission (e.g., PrivHome [26]), WMIFuzzer will not work directly because of failure on decrypting captured web messages. One possible approach is implementing an internal interceptor of this new scheme and using it to capture the messages before encrypting them. A HTTP client containing this scheme is also needed for encrypting mutated messages and sending them to targeted devices. e WMPT and strategy of mutation are independent of this scheme, and they can remain unchanged.

Limitations of Remote Monitoring.
As COTS IoT devices usually do not have the support for local monitoring, WMIFuzzer has to infer whether a vulnerability is triggered by monitoring the device network. But vulnerabilities in the targeted device may not alter the network status. For example, a memory overflow vulnerability may not cause a system crash if the device does not integrate MMU [56] to its memory management policy. Moreover, WMIFuzzer detects Command Injection dependent on ping and reboot commands since they can modify the device network status. However, if they are not available in the targeted COTS device, the device network cannot be changed by command injection vulnerabilities. More channels are required to monitor the running status of the targeted device, and one possible approach is monitoring signals from the hardware interface.

Related Work
As firmware images are been packed to reduce storage usage, existing tools (e.g., BinWalk, BAT, and FRAK [13,14,15]) are presented to unpack them. Costin et al. [5] presented the first public, large-scale analysis of firmware by analyzing the unpacked files statically. By collecting thousands of firmware images and analyzing them, the authors showed that account passwords and self-signed SSL certificates together with their corresponding private RSA keys, outdated software, building images as root and web servers configuration make many firmware images vulnerable. In another study, Costin et al. [6] used RIPS to scan PHP files in the device webroot to detect vulnerabilities, but the results showed that only 8% of embedded firmware images contain PHP code. Yan et al. [18] utilized a novel model of authentication bypass flaws to analyze binary files in firmware images, and finally they discovered several vulnerabilities in three firmware. It seems easy for static approaches to test IoT devices, as they do not need to run the firmware and they are independent of the real hardware. However, there are well-known limitations for static analysis techniques. e first one is false negatives (FNs), since static analysis mainly works upon some known patterns. For those vulnerabilities that have a new pattern, GET static approaches usually fail. e second one is false positives (FPs), since every detected vulnerability by static approaches is just a potential vulnerability. A lot of manual efforts are required for further analysis to confirm them. So, some researches look forward to dynamic techniques.
Costin et al. [6] presented an automated framework to discover vulnerabilities in web interfaces of embedded devices; it works by integrating Qemu to run the web service and testing the web service via existing web penetration tools. Although it used some heuristic techniques to run chroot and init to launch the web service, it may fail because of the side effects of forced emulation, diversity of web server environment, and limitations of Qemu. Chen et al. [19] presented an automated dynamic analysis system for Linuxbased embedded firmware; it works on building a full system emulator to run the firmware with a stocked kernel and a private library of nonvolatile memories (NVRAM). It overcomes the challenges of userland emulation, such as chroot and init. However, the boot and configuration information in different devices are diverse, and it is very difficult to emulate fully. So, it still requires a lot of manual efforts to model its operation of NVRAM when testing a new device. Davidson et al. [57] presented a platform FIE for discovering implementation flaws in MSP430 family of microcontrollers, and FIE works by utilizing KLEE [58] symbolic execution engine to verify security properties in open-source firmware programs. FIE can perform a complete analysis of firmware images because symbolic execution can explore all possible execution paths in a program. However, it is limited to analyzing small firmware that is open-source and written in C Zaddach et al. [20] presented another platform, Avatar, supporting dynamic analysis of firmware; it is a hybrid approach that contains both the device and an emulator integrated with the S2E [59] engine. Avatar utilizes S2E to perform selective symbolic execution in the emulator and forwards I/O operations from the emulator to the real device. Anyway, both static analysis and dynamic analysis based on firmware are dependent on the availability of device firmware images. However, firmware images of many COTS IoT devices are not available publicly, or they have been encrypted with a private key, or they have been compressed with an unknown file format.
Some other research studies applied online scanning techniques to detect embedded devices' vulnerabilities in the Internet scale. Cui et al. [60] and Cui and Stolfo [16] utilized Nmap [54] to scan a wide range of open embedded devices in the Internet and attempted to log into them using wellknown default credentials. e results showed that a large number of devices ranging from enterprise equipment to office equipment are vulnerable. In another work [5], ZMap [61] was used to search vulnerable devices in the Internet based on vulnerable SSL certificates, and around 30K online affected devices were identified. Heninger et al. [62] studied vulnerable keys in embedded devices at Internet scale, and they found that 0.75% of TLS certificates share keys due to insufficient entropy during key generation. SHODAN [63] is a powerful engine to search vulnerable devices in the global network, and it works by matching open service fingerprints in online devices. However, online scanning mainly focuses on discovering known vulnerabilities, as it works on a number of known signatures. So, it usually cannot discover zero-days. On the contrary, it is also ethically questionable, even illegal, to scan online devices without authorization.
Since most network-enabled devices will communicate with an external entity, some works are presented to fuzz these communication protocols for vulnerability discovery. RPFuzzer [64] is a blackbox fuzzing framework to detect vulnerabilities in Cisco routers, and it used a predefined data model to generate seeds for mutation-based fuzzing. e main challenge is that it requires a security expert to write the data model, so it cannot be leveraged to test other devices automatically. Chen et al. [65] presented IOTFUZZER that performs a protocol-guarded fuzzing on COTS devices; its key idea is that many IoT devices can be controlled through their official mobile apps. So, they firstly adopted a taintbased approach to track the atomic data that are used to construct the network message; then, they mutated these atomic data dynamically to reuse the original code of message building. Our work is similar to IOTFUZZER; both of them perform mutation on atomic elements and do not require a predefined data model. However, not all IoT devices have an official control app, and IOTFUZZER can just detect memory corruption. In addition, the taint-based approach is much heavier than our blackbox approach. AFL and Sulley are two state-of-the-art fuzzers that are widely used to discover vulnerabilities in many software, but additional patches are needed to support fuzzing COTS IoT devices.

Conclusion
We have proposed the first blackbox fuzzer targeting the web management interface in COTS IoT devices; it utilizes the mutation-based fuzzing technology to discover vulnerabilities automatically. To improve the efficiency of fuzzing, a set of techniques including the seed generation, the WMPT, and the remote device monitoring are designed. e experimentation on 7 COTS devices successfully identified 10 vulnerabilities including 6 zero-days. All vulnerabilities have been reported to CNCERT/CC to help the vendor to fix them, and all of them have been fixed now.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request. e security vulnerabilities found in this paper can be accessed in the CNVD (http://www.cnvd.org.cn).

Conflicts of Interest
e authors declare that they have no conflicts of interest.