A Topical Review on Machine Learning, Software Defined Networking, Internet of Things Applications: Research Limitations and Challenges

Abstract: In recent years, Internet of Things communication technologies, infrastructure, and physical resource management have developed rapidly. These developments and research trends address challenges such as heterogeneous communication, quality of service requirements, unpredictable network conditions, and a massive influx of data. One major contribution is software-defined networking, which applies rule-based management and high-level policies to control the network and add intelligence to it, giving integral control of the network without requiring knowledge of low-level configurations. Machine learning techniques coupled with software-defined networking can make networking decisions more intelligent and robust. Internet of Things applications have recently adopted virtualization of resources and network control with software-defined networking policies to make traffic more controlled and maintainable. However, the requirements of software-defined networking and the Internet of Things must be aligned to make these adaptations possible. This paper discusses possible ways to build software-defined networking enabled Internet of Things applications and the Internet of Things challenges solved by leveraging software-defined networking. We provide a topical survey of the application and impact of software-defined networking on Internet of Things networks. We also study the impact of machine learning techniques applied to software-defined networking and its application perspective. The study is carried out from the different perspectives of software-based Internet of Things networks, including wide-area networks, edge networks, and access networks. Machine learning techniques are presented from the perspective of network resource management, security, traffic classification, quality of experience, and quality of service prediction. Finally, we discuss challenges and issues in adopting machine learning and software-defined networking for Internet of Things applications.


Introduction
Internet of Things (IoT) is the connectivity of an enormous number of physical devices to the internet to collect, share, and analyze massive amounts of data. Kevin Ashton introduced the concept of IoT 17 years ago [1], and the concept has become fundamental to the second digital revolution [2]. A survey conducted by Cisco forecasts that 50 billion things will be interconnected through the internet [3]. Interconnecting such a huge number of devices leads to management and scalability issues. Traditional approaches to IoT device management are slowly becoming obsolete as new technologies and trends evolve. IoT orchestration is a recent advancement for handling management and scalability issues. Orchestration is considered a more flexible and scalable approach to managing the enormous number of devices connected through IoT [4,5].
Due to IoT's tremendous economic potential, technology companies and research institutions invest in IoT research and development to propose sustainable IoT solutions. They have developed various commercial and open source IoT projects over the past decade. These IoT platforms lack interoperability and use different data formats, which gives rise to the vital challenge of heterogeneity. Thus, the need for more efficient network management techniques to handle the enormous amount of data produced by this large number of connected devices increases day by day. Centralized processing and storage of data is currently not feasible; hence, edge computing (EC) plays a vital role in data analysis to mitigate these problems in IoT. A new concept called IoT Big Data refers to the semantics and type of the massive data generated by large-scale IoT connected devices. The realization of IoT, security, and privacy remain challenging issues in current research.
In the literature, several surveys have tackled different IoT challenges and aspects, including IoT applications, challenges, and opportunities [6], IoT frameworks, IoT security [7][8][9], IoT standardization [10], the application of software-defined networking (SDN) in IoT [11], and IoT and cloud integration [12]. However, previous survey studies of IoT did not review all the challenges in detail. For example, quality of service (QoS) and security issues are discussed in the context of internet applications, but these challenges are even more crucial in IoT applications. The existing solutions address the IoT challenges only partially. In this study, we discuss hybrid SDN and machine learning approaches as feasible solutions that can deal with some of IoT's main challenges and overcome issues in IoT applications. SDN can address the challenges of security, cost of hardware, centralization, and management of resources in the IoT environment. Machine learning helps analyze the big heterogeneous data produced by IoT platforms. The fundamental concept of the SDN paradigm is to separate the control and data planes, enabling the network controller to manage the network and engineer traffic dynamically [13]. The SDN controller's role includes managing network resources and programming the network dynamically. Furthermore, the controller monitors and collects network configuration data, network state information, and packet flow information in real time.
Machine learning models are trained on historical network data to perform optimization of the network, data analysis, and automated network services provisioning intelligently [14]. In the literature, recent contributions to machine learning provide promising directions to apply machine learning to networking. Machine learning improves the performance, efficiency, and security of SDN solutions. The machine learning-based SDN can improve the performance, security, and efficiency of the IoT network. This review study is divided as follows: Section 2 presents the methodology of this topical research study. Section 3 presents existing studies to understand the background of IoT, and Section 4 presents SDN background. Section 5 explains machine learning techniques currently applied to SDN. In Section 6, we discuss IoT and SDN adaptation efforts and research contributions. Section 7 presents limitations and challenges for the adaptations of IoT leveraging SDN and future research direction. Finally, Section 8 concludes the study.

Methodology
This section presents the methodology of this topical review on IoT, SDN, machine learning approaches, and IoT leveraging SDN techniques. The research questions, objectives, and study identification criteria are discussed. This research study investigates existing studies in the literature on machine learning, SDN, and secure IoT systems. Firstly, we present background sections on SDN, machine learning, and IoT to establish the theme of the study. Secondly, we investigate different machine learning techniques applied to SDN-based networking solutions. Lastly, we investigate researchers' interest in and contributions to IoT leveraging SDN solutions. Table 1 presents the research questions identified to reach the goals of this study.

| S.No | Research Question | Description |
|---|---|---|
| 1 | What are the major challenges addressed in the literature regarding IoT networks-based applications? | To investigate different IoT concepts, challenges, and solutions used to address these challenges. |
| 2 | What techniques were used in the literature to protect IoT systems? | To understand and investigate techniques and solutions proposed by researchers to protect and enhance IoT systems. |

To answer the above questions, we searched for the keywords IoT, SDN IoT, machine learning in IoT, and SDN and IoT. For example, the keyword "machine learning in SDN" was used to narrow down the search for articles and review papers. SDN and IoT were explicitly added to all our keywords because they were the main topics of this topical review. The search for all sub-topics was done using Jeju National University of South Korea, Google Scholar, ResearchGate, ScienceDirect, and IEEE. The main searches were conducted using the ScienceDirect portal. However, some research articles were found through Google and other search engines. Table 2 presents the search keyword strings used to retrieve relevant research papers related to IoT leveraging SDN, machine learning in SDN, IoT challenges, and their solutions using SDN and machine learning.

| Key | Criteria |
|---|---|
| Search keyword | (SDN OR "software defined networking") AND (IoT OR "Internet of Things") AND (machine learning OR applications of SDN OR "IoT challenges" OR IoT solutions using SDN) |
| Limiters | Article date between 2015 and 2020. |
| Expanders | Without the word "optimization". |
| Search keyword | Search keyword occurs anywhere in the article. |

This topical review focuses on the most recent research studies and future trends in hybrid applications of IoT, SDN, and machine learning. Without this limitation on scope, the number of studies to be investigated would have been too large. Limiting and filtering the studies also distinguishes our research study from previous survey and review papers. Therefore, to conduct effective topical research, the most relevant and scholarly literature from the last 2 years was collected as a first priority, and research studies older than 2 years were used for background knowledge of the topics mentioned in the paper. Some research papers and survey papers were excluded based on publisher and publication venue parameters such as impact factor and journal citation reports. The research papers selected for this study mostly focus on IoT, machine learning techniques to analyze SDN network traffic, and techniques for IoT leveraging SDN. Figure 1 represents our approach towards investigating the applications of intelligent SDN in the field of IoT and its wide deployment.

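[Figure 1: approach for investigating the applications of intelligent SDN in IoT. Recoverable step label: "Classify papers to IoT domain".]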

Background of SDN
SDN has recently received much attention as a way to address some enduring networking challenges. The concept of SDN is based on generalizing network hardware and decoupling the network control software from the forwarding devices [15]. SDN is a new intelligent architecture for network programmability. Its main idea is to separate the control plane from the network devices, so that the data plane is controlled from a central, external software entity called the SDN controller. The Open Networking Foundation (ONF) [16] is a non-profit consortium dedicated to the development, standardization, and commercialization of SDN for the transport and IP network layers. Figure 2 presents the SDN layered architecture, which consists of four planes [17]: the data plane, management plane, control plane, and application plane. The data plane holds the forwarding tables that network devices use to forward incoming packets. The management plane provides intelligent provisioning and orchestration systems for managing the entire network. The control plane, also termed the brain of the network, controls different types of data planes using southbound interfaces (SBIs) and protocols. The application plane is the layer in which the different northbound applications reside; it can help SDN address future challenges such as 5G, as this plane brings innovation, openness, and flexibility for network vendors. The SDN architecture provides dynamicity, network flexibility, and management capabilities. Research studies suggest that SDN is a reliable and promising approach for separating strategic network computation from data forwarding. Several northbound interfaces (NBIs) between the control plane and applications have been introduced to provide high-level abstractions, in the form of various network-level services, to the applications that reside above the control plane. OpenFlow [18] is a significant addition to the SBIs that enables the network to be managed efficiently. Nevertheless, SDN is not limited to OpenFlow, as other less conventional protocols exist, such as ForCES [19], NETCONF [20], OVSDB [21], Pollex [22], LISP [23], and OpenState [24]. OpenFlow moves all the intelligence to a centralized entity called the SDN controller, which enables the separation of the control plane from the forwarding plane. OpenFlow supports three SDN functions: status reporting for each device connection, slicing, and flow-based forwarding. An OpenFlow-based interconnection device matches packets against a flow table inside the forwarding plane. The FlowVisor controller is responsible for handling the flows, decisions, and publishing of the policies.

OpenFlow
OpenFlow provides interface-based communication between the infrastructure and control layers and is standardized by the Open Networking Foundation (ONF) [25,26]. Moreover, it provides a way to control switches without requiring vendors to disclose their source code. In summary, OpenFlow provides a way to directly access and manipulate the forwarding plane of switches and routers [27]. Furthermore, it grants access to the flow table and instructs the switches on how to manage and direct network traffic. It allows network managers to alter the network flow in a short time span [28]. According to studies, there are two main categories of OpenFlow-based switches: OpenFlow-only and OpenFlow-hybrid. OpenFlow-only switches support only OpenFlow operations, while OpenFlow-hybrid switches support both OpenFlow operations and normal Ethernet switching [29].
OpenFlow controllers manage OpenFlow switches over a secure channel using the OpenFlow protocol. A switch contains one or more flow tables for packet lookup and forwarding. A flow table consists of flow entries; each entry has header fields, counters, and actions. The header fields are matched against packet information such as VLAN ID, source and destination ports, and IP addresses. The counters keep information such as the number of packets and their sizes. The actions specify how matching packets are processed, such as being forwarded to a port, sent to the controller, or dropped [30][31][32]. The OpenFlow channel provides the interface that connects the switches and controllers. Through this interface, the controller manages and configures the switches, receives events from them, and sends packets out through them. Various messages are sent over this channel. Asynchronous messages update the controller about network events and switch state changes. Controller-to-switch messages are used to manage and inspect the state of a switch. Lastly, symmetric messages are initiated by either the switch or the controller and are sent unsolicited [33][34][35]. The OpenFlow controller is responsible for managing, allocating, and updating instructions and policies on the networking devices. It controls the switch flow table and can decide how to handle packets that have no valid flow entry. An OpenFlow switch can establish communication with one or more controllers. A multiple-controller architecture can increase network reliability, because the switch can continue to operate if one controller or controller connection fails.
When OpenFlow operation starts, the switch connects to all of its configured controllers and tries to maintain these connections simultaneously, whereas the messages related to a given command are sent only over the connection of the controller that issued it [36][37][38][39].
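
The flow table structure described above (header fields to match, counters, and actions) can be illustrated with a minimal Python sketch. This is a simplified model rather than an implementation of the OpenFlow protocol: the class names `FlowEntry` and `FlowTable`, the dictionary-based packets, and the string-encoded actions are all hypothetical, and sending unmatched packets to the controller is an assumed table-miss policy.

```python
from dataclasses import dataclass, field


@dataclass
class FlowEntry:
    match: dict            # header fields to match, e.g. {"ip_dst": "10.0.0.2"}
    actions: list          # e.g. ["output:2"], ["drop"], ["controller"]
    priority: int = 0
    packet_count: int = 0  # counter: number of matched packets
    byte_count: int = 0    # counter: total size of matched packets

    def matches(self, packet: dict) -> bool:
        # A field that is absent from `match` acts as a wildcard.
        return all(packet.get(k) == v for k, v in self.match.items())


@dataclass
class FlowTable:
    entries: list = field(default_factory=list)

    def add(self, entry: FlowEntry) -> None:
        # Keep entries ordered so that higher-priority entries are checked first.
        self.entries.append(entry)
        self.entries.sort(key=lambda e: e.priority, reverse=True)

    def process(self, packet: dict) -> list:
        for entry in self.entries:
            if entry.matches(packet):
                entry.packet_count += 1
                entry.byte_count += packet.get("length", 0)
                return entry.actions
        # Assumed table-miss policy: hand the packet to the controller.
        return ["controller"]


table = FlowTable()
table.add(FlowEntry({"ip_dst": "10.0.0.2", "tcp_dst": 80}, ["output:2"], priority=10))
table.add(FlowEntry({"vlan_id": 99}, ["drop"], priority=5))

web = {"ip_src": "10.0.0.1", "ip_dst": "10.0.0.2", "tcp_dst": 80, "length": 1500}
print(table.process(web))
print(table.process({"vlan_id": 99, "length": 64}))
print(table.process({"ip_dst": "10.0.0.9", "length": 64}))
```

In this sketch the third packet matches no entry, so the switch defers to the controller, which could then install a new flow entry for that destination.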

Data Plane
The data plane resides at the bottom layer of the SDN architecture and consists of forwarding devices such as routers and switches, whether virtual or physical. Virtual switches are software-based switches that run on a common operating system (OS); examples are Open vSwitch [40], Pantou [41], and Indigo [42]. Physical switches are hardware-based switches, implemented either on open network hardware such as NetFPGA [43] or on merchant switches from networking hardware vendors. ServerSwitch [44] and SwitchBlade [45] are two examples of NetFPGA-based physical switches. Hardware vendors design their merchant switches with support for SDN protocols. Virtual switches are more flexible and have complete feature support for SDN protocols, whereas physical switches have a higher flow forwarding rate. Both are used to forward, drop, and modify packets according to the control plane logic.

Control Plane
In SDN, the control plane acts as the brain that performs a set of actions; for example, it applies flow rules to received Ethernet frames to decide their destination ports. The SDN controller programs network resources and controls communication between applications and forwarding devices. It translates application plane requirements into policies and distributes these custom policies to the forwarding devices. Control plane functionalities include network topology storage, configuration of devices, notification of state information, and shortest path routing. Some of the SDN controllers proposed in the literature are NOX [46], Floodlight [47], POX [48], OpenDayLight [49], Ryu [50], and Beacon [51]. The SDN controller has three kinds of interfaces for communication: southbound, northbound, and eastbound/westbound. Control data plane interfaces (CDPIs), also called SBIs, are the interfaces between the data and control planes; forwarding devices use them to exchange control policies and network state information with the control plane of an SDN controller. SBIs allow programmatic control of all device capabilities, such as event notifications, advertisements, statistics reports, and forwarding operations. NBIs are used by applications to obtain an abstract view of the network in order to facilitate automation and to analyze specific network behaviors and network requirements. Eastbound interfaces (EBIs) and westbound interfaces (WBIs) are used in multi-controller SDN solutions. In multi-controller SDN networks, the exchange of information between controllers is important for providing a global network view to the applications. Examples of distributed control architectures are HyperFlow [52] and Onix [53]. The EBIs and WBIs of these controllers are private, so different controllers cannot communicate with each other. SDN communication-interfaces [54], distributed-control plane (CIDC) [55], and east-west bridge [56] are some proposals for communication between different SDN controllers.
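
Two of the control plane functions listed above, topology storage and shortest path routing, can be sketched together in a few lines of Python. This is an illustrative example under stated assumptions, not the routing module of any of the cited controllers: the topology dictionary, the switch names, the link costs, and the helper `rules_for_path` are hypothetical, and Dijkstra's algorithm is used as one common choice of shortest path algorithm.

```python
import heapq


def shortest_path(topology, src, dst):
    """Dijkstra over {switch: {neighbor: link_cost}}; returns the list of switches on the path."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == dst:
            break
        for nbr, cost in topology.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dst not in dist:
        return []  # destination unreachable
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def rules_for_path(path, dst_ip):
    """One forwarding rule per hop: (switch, match fields, next hop)."""
    return [(sw, {"ip_dst": dst_ip}, nxt) for sw, nxt in zip(path, path[1:])]


# Hypothetical global topology view held by the controller.
topology = {
    "s1": {"s2": 1, "s3": 4},
    "s2": {"s1": 1, "s3": 1, "s4": 5},
    "s3": {"s1": 4, "s2": 1, "s4": 1},
    "s4": {"s2": 5, "s3": 1},
}
path = shortest_path(topology, "s1", "s4")
rules = rules_for_path(path, "10.0.0.4")
print(path)
for rule in rules:
    print(rule)
```

A real controller would push each rule to the corresponding switch through its southbound interface; here the rules are only printed.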

Application Plane
The top layer of the SDN architecture is the application plane, which consists of business applications. Business applications provide management and optimization of business services. These applications implement control logic based on the network state information received through the controller NBIs in order to modify the network behavior. SDN solutions for traffic engineering are discussed in [57]. A security-focused survey is presented in [58]. Yan et al. [59] analyzed distributed denial of service (DDoS) attacks on SDN networks in a cloud computing environment. A survey of fault management issues in SDN and their solutions is presented in [60]. SDN is deployed in real-time scenarios including transport networks, wireless networks [61,62], optical networks [63], wide area networks (WAN) [64], IoT, EC [65], and cloud computing [66].

P4 (Programming Protocol-Independent Packet Processors)
P4 is a language for programming packet-processing hardware without knowledge of the underlying hardware architecture. P4 is used to modify the packet-forwarding mechanisms of SDN switches [67]. Initially, P4 was used to write software programs and to program hardware switches. Target hardware resources include network interface cards, networking appliances, FPGAs, and ASICs. P4 is used to define custom headers and to parse headers from packets dynamically [68]. P4 provides custom match-action tables and other constructs such as counters and registers. This makes the P4 language entirely protocol-independent. If a particular protocol is used in the network, the P4 program can easily be rewritten to handle new header fields. Other P4 features include reconfigurability, which allows existing devices to be reused with new P4 applications rather than purchasing new networking devices. Furthermore, P4 is independent of the specifications and characteristics of any particular target device. Nonetheless, a P4 program depends on the architecture of the device: a P4 application written for a given architecture is deployable on all target devices that share that architectural design [69]. P4 programs are written specifically for the data plane; however, the target device may host both the control and data planes. P4 is also used in the literature to partially define the interfaces between the control and data planes, but it cannot describe the control plane's functionality.
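
The two ideas highlighted above, parsing a custom header and applying a match-action table with counters, can be emulated in plain Python to make them concrete. This is not P4 code and does not use any P4 toolchain; it is a sketch of the concepts only. The "sensor" header, its two 16-bit fields, the table contents, and the default action are hypothetical. EtherType 0x88B5 is used because it is reserved by IEEE for local experimental use.

```python
import struct

ETH_FMT = "!6s6sH"           # destination MAC, source MAC, EtherType
SENSOR_ETHERTYPE = 0x88B5    # IEEE local experimental EtherType, chosen for this sketch
SENSOR_FMT = "!HH"           # hypothetical custom header: sensor_id, reading (16 bits each)


def parse(packet: bytes) -> dict:
    """Parser state machine: ethernet -> sensor header (if present) -> accept."""
    headers = {}
    dst, src, ethertype = struct.unpack_from(ETH_FMT, packet, 0)
    headers["ethernet"] = {"dst": dst, "src": src, "ethertype": ethertype}
    offset = struct.calcsize(ETH_FMT)
    if ethertype == SENSOR_ETHERTYPE:
        sensor_id, reading = struct.unpack_from(SENSOR_FMT, packet, offset)
        headers["sensor"] = {"sensor_id": sensor_id, "reading": reading}
    return headers


# Match-action table keyed on the custom field; in a real deployment the
# control plane would install these entries at run time.
sensor_table = {7: ("forward", 3), 9: ("drop", None)}
counters = {}


def ingress(headers: dict):
    """Apply the match-action table and update a per-key counter."""
    if "sensor" not in headers:
        return ("forward", 1)  # assumed default action for non-sensor traffic
    key = headers["sensor"]["sensor_id"]
    counters[key] = counters.get(key, 0) + 1
    return sensor_table.get(key, ("drop", None))


frame = (struct.pack(ETH_FMT, b"\x02" * 6, b"\x04" * 6, SENSOR_ETHERTYPE)
         + struct.pack(SENSOR_FMT, 7, 215))
print(parse(frame)["sensor"])
print(ingress(parse(frame)))
```

Supporting a new protocol in this model only requires a new header format and a new parser branch, which mirrors the protocol independence discussed above.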

Background of IoT
IoT is the connectivity of things to the internet for collecting and sharing data between devices. Real-world IoT applications are the basis of the second digital revolution. IoT platforms recognize connected things by unique addresses using protocol suites based on the transmission control protocol (TCP) and non-TCP protocols. IoT connects sensing and actuating devices, such as sensors and actuators, to the network. Device virtualization is the process of virtualizing physical resources such as sensors and actuators into virtual objects [70][71][72][73]. IoT connects physical devices and virtual objects through communication protocols such as Bluetooth low energy (BLE), WiFi, ZigBee, Z-Wave, and Long Range Wide Area Network (LoRaWAN), to name a few. These IoT devices have dynamic configuration and remotely accessible interfaces [74]. Recent developments and contributions in the IoT field have introduced new concepts and terms such as machine-to-machine (M2M), industrial IoT (IIoT), Internet of everything (IoE), Internet of anything (IoA), social IoT (SIoT), and web of things (WoT).
BLE is an improved Bluetooth variant and an extensively used wireless technology for effective communication over a range of about 10 m. The newest version, Bluetooth 5.2, adds an innovative IP support profile. The literature shows that BLE is fully developed and optimized for IoT devices [75]. WiFi is another broadly used protocol for communication between IoT devices, and most electronic device manufacturers prefer it because of its existing infrastructure. The range for device communication using WiFi is around 50 m, which is much higher than that of BLE [76]. ZigBee is another short-range wireless communication protocol, with a 250 kbps data transmission rate. ZigBee is suitable for effective communication between IoT devices due to its low power consumption, security, durability, and high scalability [77]. Like ZigBee, Z-Wave is a low-power radio-frequency communication protocol designed for automation systems such as lamp controllers and sensors. The range of Z-Wave is about 30-100 m, and its interference with other protocols such as Bluetooth, ZigBee, and WiFi is negligible. Its data transmission rate ranges from approximately 40 kbps to 100 kbps [78]. The LoRaWAN protocol is used for long-range, battery-operated IoT devices. It communicates over long ranges with very low power consumption and detects the noise level of signals based on a threshold range. LoRaWAN is mainly used in smart home, smart hospital, and smart city applications, where an enormous number of devices are interconnected for secure communication using little power and memory [79][80][81].
Cisco introduced the concept of IoE [82]. IoE is the connection of things, humans, devices, and global network data [83]. Beyond the capabilities of IoE, IoA considers the connectivity of imagined things [84]. M2M is the study of communication between machines as well as between machines and humans [85]. M2M automates communication between machines without human intervention [86]. Due to IoT's tremendous economic potential, technology companies are investing in the development of real IoT solutions. These companies have developed various commercial and open source IoT projects over the past few years. Table 3 presents some popular commercial and open source IoT projects. Heterogeneity issues have arisen due to the lack of interoperability between these IoT platforms and the different data formats they use. IoT applications have been deployed in different domains, of which the industrial domain is considered the most significant. The application of IoT in industry needs careful deliberation and effort [87]. IoT combined with the fourth industrial revolution is a step toward Industry 4.0, which integrates industrial practices using smart technologies. Security and privacy are the two main challenges for IIoT applications [88]. To investigate and address these challenges, AT&T, Cisco, GE, and IBM founded a consortium for the IIoT [89]. A recent comparative analysis of IoT and WoT was presented in a white paper from Auto-ID Labs. The comparison shows that IoT provides unique identification of things, whereas WoT cannot be used to resolve structural concerns [90]. WoT is a web-based framework that connects physical and virtual things through an internet connection. WoT provides accessibility and data analysis, whereas IoT is used for connectivity of devices, automated configuration, and device management.

| IoT Software Platforms/Projects | Company | Open Source |
|---|---|---|
| AWS IoT platform [91] | Amazon | No |
| Smart World Sensor [92] | Libelium | No |
| vCore [93] | Hewlett Packard Enterprise | No |
| Watson IoT, IBM IoT Foundation Device Cloud [94] | IBM | No |
| ThingWorx - MDM IoT Platform [95] | ThingWorx | No |
| M2M/IoT [96] | InterDigital | Yes |
| Particle.IO | Particle | No |
| ThingsBoard [97] | ThingsBoard | Yes |
| Cisco IoT System [98] | Cisco | Yes |
| Google Cloud IoT [99] | Google | No |
| Intel IoT Platform [100] | Intel | No |
| Microsoft .NET Gadgeteer [101], Azure IoT Suite [102] | Microsoft | Yes |
| Edge Gateway [103] | Dell | Yes |
| OpenMTC [104] | Fraunhofer FOKUS | Yes |

SIoT enables devices to form their own social network [105]. The concepts of SIoT and WoT are considered closely related. Social media networks now generate a huge amount of data, and analyzing this immense data effectively requires the development of new data science applications [106,107]. This concept of socialization can be applied in the context of IoT [108] to empower all the networks mentioned above and introduce a global network called the future internet (FI). FI will be made possible using the principles of collaboration, connectivity, cognition, content, context, and cloud [109]. Things with constrained resources are connected to IoT and generate enormous amounts of data, and cloud-based applications will be used to process the data. This shift of data from constrained things to a cloud-based collaborative environment will result in a cognitive world [110].
IoT gateway layer functionality includes protocol translation [111], security, service chaining, QoS management, data mining, handover management, mobility, packet forwarding, and routing [112]. An IoT gateway based on oneM2M is proposed in [113]. The oneM2M gateway performs three functions: device management, data analysis, and resource discovery. The study in [114] proposes a fog and cloud-based computing architecture for service management at the gateway layer. In EC, the edge node has limited computing resources; therefore, migrating computation to the cloud is important for solving large computing tasks [115]. Cloud of Things (CoT) faces the challenge of data trimming, and a smart gateway based on a functional architecture was proposed to handle it [116,117]. Some of the tasks performed by this gateway are data collection, data preprocessing, data filtering, and reconstruction of data into a more valuable format. This smart gateway uploads necessary data to the cloud, tracks the activities of IoT objects and sensors, tracks the energy consumption of constrained IoT nodes, manages the privacy and security of the data, and monitors and manages IoT services [118].
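
The data trimming tasks attributed to the smart gateway above (collection, preprocessing, filtering, and reconstruction into a more valuable format before upload) can be sketched as a small pipeline. This is an illustrative example only, not the architecture proposed in the cited works: the function names, the record layout, and the valid value range of -40 to 85 are assumptions made for the sketch.

```python
from statistics import mean


def preprocess(readings):
    """Drop malformed samples (missing sensor id or non-numeric value)."""
    return [r for r in readings
            if r.get("sensor") and isinstance(r.get("value"), (int, float))]


def filter_outliers(readings, low=-40.0, high=85.0):
    """Keep only values inside an assumed valid sensor range."""
    return [r for r in readings if low <= r["value"] <= high]


def reconstruct(readings):
    """Aggregate raw samples into one summary record per sensor."""
    by_sensor = {}
    for r in readings:
        by_sensor.setdefault(r["sensor"], []).append(r["value"])
    return [
        {"sensor": s, "count": len(v), "mean": round(mean(v), 2), "max": max(v)}
        for s, v in sorted(by_sensor.items())
    ]


def gateway_pipeline(raw):
    """Collected data -> preprocessing -> filtering -> reconstruction."""
    return reconstruct(filter_outliers(preprocess(raw)))


raw = [
    {"sensor": "t1", "value": 21.5},
    {"sensor": "t1", "value": 22.5},
    {"sensor": "t1", "value": 999.0},  # outside the assumed range, filtered out
    {"sensor": "t2", "value": None},   # malformed, dropped in preprocessing
    {"sensor": "t2", "value": 19.0},
]
summary = gateway_pipeline(raw)
print(summary)
```

Only the two summary records would be uploaded to the cloud instead of the five raw samples, which is the kind of reduction that data trimming aims for.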
In [119], the authors proposed a mobile phone-based IoT gateway for transmitting data over a wide area network (WAN). An IoT gateway capable of translating data requests and replies is proposed in [120]. The data is collected from IoT sensors and transmitted to applications installed on the mobile phone. A two-way approach is used for accessing the data: polling, and registration requests to the gateway for notifications. With the revolution of SDN and NFV technologies, software-defined service chaining is proposed in [121]. In [122], a service chaining mechanism based on software-defined edges is proposed. These edges are deployed as software engines in a data center using virtual machines. Mininet-based simulations are performed for the evaluation and performance analysis of IoT architectures. An SDN controller (POX) is used to configure the router and switch edge nodes. Paradrop, an edge-computing platform used as an IoT gateway, is presented in [123]. Paradrop's characteristics include management using OpenFlow, dynamicity, security functionalities, and API support. A comparison of the device-centric approach with the data-centric approach is discussed in [124]. The data-centric approach relies on the collection and provision of data without considering device identity. Examples of data-centric cloud-based IoT architectures are IoT6, FI-WARE, and IoT-A. A service-centric approach relies on the provisioning of services. In a user-centric approach, the identity of devices is based on their relevance to the owner's identity. We now discuss IoT scalability from an architectural perspective. Many IoT architectures that connect a huge number of objects to the internet have been proposed in the literature. To address interoperability issues, a standard architecture is needed for proposing sustainable IoT solutions. Heterogeneity, diversity, and lack of interoperability among these architectures make wide IoT deployment inefficient [125]. SDN provides a common control layer on top of these architectures to handle the heterogeneity. SDN handles privacy, communication resilience, security, and big data management [126]. Moreover, SDN performs efficiently in traffic engineering and QoS guarantees. Table 4 lists IoT architectures that provide complete IoT deployment and IoT architectures that leverage SDN.

| IoT Architecture | Description |
|---|---|
| FIWARE [127] | IoT applications development based on APIs |
| iCore [128] | An IoT project providing user level management and abstracting the heterogeneity |
| IEEE Project P2413 [129] | Enable compatibility between different architectures |
| IoT-A [130] | |
| | A service layer abstraction to overcome the vertical heterogeneity |
| COMPOSE [133] | Collaborative Open Market to Place Objects at your Service |
| IoTDM [126] | Data broker for M2M |
| SAM [134] | Proprietary DIY IoT platform with offline access and cloud support |
| Dweet.IO [135] | Open source middleware based IoT platform |
| Particle.IO [136] | Proprietary middleware based IoT platform |
| Glue.thing [137] | Proprietary DIY based IoT platform |
| Node-red [138] | Open source partial DIY supported IoT platform |

Machine Learning Algorithms for SDN
The SDN controller has a global view of the network, providing centralized network control and management. Machine learning algorithms can be deployed as a separate module or alongside the other northbound applications of the SDN controller to bring intelligence to SDN. SDN controllers that use machine learning perform network data analysis, optimization, and automation of the network. In this section, we survey machine learning-based research studies applied to SDN. We categorize the studies as routing optimization, QoS prediction, quality of experience (QoE) prediction, network traffic classification, resource management, and security management. Table 5 presents the categories of machine learning algorithms applicable in the SDN domain.

Traffic Classification
Traffic classification plays an indispensable role in network management because it enables network operators to control and allocate different services. Currently, the most widely used traffic classification approaches include port-based approaches, deep packet inspection (DPI), and machine learning [139]. Port-based approaches are unreliable for applications that use dynamic ports, whereas DPI, which inspects packet payloads, has the advantage of high classification accuracy. The limitations of DPI are that it can recognize only applications for which patterns are available and that checking every traffic flow has a high computational cost. DPI is also unable to identify encrypted traffic flows. Machine learning-based approaches, by contrast, can analyze encrypted traffic at a lower computational cost than traditional DPI. Traffic classification also involves collecting massive traffic flow data and extracting knowledge from it using machine learning approaches, which are applied at the SDN controller to analyze the collected data. Machine learning-based traffic classification includes elephant flow-aware (EF), QoS-aware, and application-aware traffic classification.
EF traffic classification separates traffic flows into elephant flows and mice flows. Elephant flows are bandwidth-hungry, persistent flows, whereas mice flows are short-lived and delay-intolerant. About 80% of the flows in data centers are mice flows, and the rest are elephant flows [140]. In such environments, identifying elephant flows is essential for controlling the traffic efficiently. Traffic flow scheduling issues in hybrid data centers are discussed in [141]. Machine learning approaches are used to implement EF traffic classification at the edge, and SDN controller-based optimization algorithms then use the classification results for efficient traffic flow management. Xiao et al. propose a two-stage algorithm with an efficient learning cost for identifying elephant flows in SDN [142]. First, suspicious elephant flows are separated from mice flows using a head packet measurement technique. Second, a decision tree classification model decides whether each suspicious flow is an elephant flow.
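The two-stage idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of [142]: the `Flow` fields, the `head_threshold` value, and the training samples are hypothetical, and a one-level decision tree (a decision stump) stands in for the full decision tree model.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    head_bytes: int        # bytes seen in the first few packets (stage-1 feature)
    mean_pkt_size: float   # stage-2 feature
    duration_s: float      # stage-2 feature

def stage1_suspicious(flow, head_threshold=3000):
    """Stage 1: cheap head-packet check; flows below the threshold are treated as mice."""
    return flow.head_bytes >= head_threshold

def train_stump(samples, labels):
    """Fit a one-level decision tree: the (feature, threshold) with best training accuracy."""
    best = None
    for f in range(len(samples[0])):
        for t in sorted({s[f] for s in samples}):
            correct = sum((s[f] >= t) == y for s, y in zip(samples, labels))
            if best is None or correct > best[0]:
                best = (correct, f, t)
    return best[1], best[2]

def classify(flow, stump):
    """Stage 2 runs only for flows that stage 1 marked as suspicious."""
    if not stage1_suspicious(flow):
        return "mice"
    feature, threshold = stump
    x = (flow.mean_pkt_size, flow.duration_s)
    return "elephant" if x[feature] >= threshold else "mice"

# Hypothetical labeled suspicious flows: (mean_pkt_size, duration_s) -> is_elephant
train_x = [(1400.0, 30.0), (1300.0, 45.0), (600.0, 2.0), (500.0, 1.0)]
train_y = [True, True, False, False]
stump = train_stump(train_x, train_y)
```

Running the cheap threshold first means the learned model is evaluated only on the small fraction of flows that could plausibly be elephants, which is where the saving in learning and inference cost comes from.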
In the literature, application-aware traffic classification is proposed to identify applications based on their traffic flows. In [139], an OpenFlow-based SDN system is proposed for collecting data in enterprise networks, and several machine learning classifiers are trained to classify applications based on traffic flow. A hybrid approach that combines a multi-classifier with a DPI-based classifier is proposed in [143] to identify and classify applications. Rossi et al. proposed a behavioral engine for application-aware classification of UDP traffic [144]. An SVM-based model classifies UDP traffic according to its NetFlow records with more than 90 percent classification accuracy. TrafficVision, an SDN-enabled edge network, is proposed for mobile application-aware traffic classification [145]. The main component of TrafficVision is the TrafficVision Engine (TV Engine). The TV Engine collects and stores data, extracts flow statistics, and gathers ground-truth training data from access devices and end devices. A decision tree classifier is used to classify different applications, whereas a KNN classifier is used to classify flow types such as video content, audio files, and video chat.
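As a rough illustration of flow-type classification with k-nearest neighbours, the sketch below labels a flow by majority vote among its closest training flows. The two features and all sample values are hypothetical and are not taken from TrafficVision [145]; real features would normally be normalized before distances are computed.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label). Returns the majority label of the k nearest."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical flow statistics: (mean packet size in bytes, mean inter-arrival time in ms)
training_flows = [
    ((1200.0, 5.0), "video"), ((1100.0, 6.0), "video"), ((1250.0, 4.0), "video"),
    ((200.0, 20.0), "audio"), ((180.0, 22.0), "audio"), ((220.0, 19.0), "audio"),
]

flow_type = knn_predict(training_flows, (1150.0, 5.5))
```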
QoS-aware traffic classification assigns traffic flows to QoS classes. QoS classes are assigned to applications based on QoS requirements such as jitter, delay, and loss rate. Classifying traffic flows by QoS class is considered the most efficient approach to traffic flow classification. An approach that combines a semi-supervised learning algorithm with DPI is proposed for QoS-aware traffic classification [146]. DPI is used to label the well-known applications. A Laplacian SVM or another semi-supervised learning model is then trained on the DPI-labeled data to classify both known and unknown applications into QoS classes.

Routing Optimization
Mathematical optimization, or simply optimization, is the selection of the best choice from a set of available choices with respect to some measure, model, or criterion. Optimization has been used in different fields, including optimal route recommendation [147], optimal policy-making [148], and energy optimization [149], to name a few. Routing is one of the primary functions of a network; the SDN controller manages the routing of traffic flows by modifying the flow tables of network devices such as routers and switches. The SDN controller may direct a network device to route a traffic flow through specific paths, or it may decide to discard a specific type of traffic. Inefficient decisions about traffic flows can degrade the performance of the SDN network in terms of network overload and transmission delay. Two types of algorithms are used for routing optimization: shortest path first algorithms and heuristic algorithms. Shortest path first algorithms are simple, best-effort routing protocols, but they do not improve network resource utilization [150]. Heuristic algorithms, on the other hand, make the best use of network resources but at a high computational cost [151]. The SDN controller devises policies for each routing flow. Machine learning is widely used to overcome these routing optimization issues by providing fast, near-optimal routing solutions. Routing optimization problems are decision-making tasks; hence, machine learning approaches such as Reinforcement Learning (RL) perform well on them. In RL-based routing optimization, the controller is the agent and the network is the environment. The state space consists of network and traffic states. Actions represent routing solutions, and the agent's rewards are based on optimization metrics such as network delay and throughput. A distributed intelligent routing protocol for SDN networks using RL is proposed in [152].
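The RL formulation above (controller as agent, network as environment, routing choices as actions, delay-based reward) can be sketched with tabular Q-learning. This is a minimal sketch under stated assumptions, not the protocol of [152]: the four-switch topology and its delays are hypothetical, the reward is the negative link delay, and exhaustive sweeps over all state-action pairs replace randomized exploration so that the example is deterministic.

```python
# Link delays in ms for a hypothetical 4-switch topology.
delay = {
    "A": {"B": 1.0, "C": 5.0},
    "B": {"C": 1.0, "D": 6.0},
    "C": {"D": 1.0},
    "D": {},
}

def learn_routes(delay, dest, alpha=0.5, gamma=1.0, sweeps=50):
    """Tabular Q-learning: state = current switch, action = next hop, reward = -delay."""
    q = {s: {a: 0.0 for a in nbrs} for s, nbrs in delay.items()}
    for _ in range(sweeps):
        for s, nbrs in delay.items():
            for a, d in nbrs.items():
                # Value of the best continuation from the next hop (0 at the destination).
                future = 0.0 if a == dest or not q[a] else max(q[a].values())
                q[s][a] += alpha * (-d + gamma * future - q[s][a])
    return q

def greedy_path(q, src, dest):
    """Follow the highest-valued next hop from src until dest is reached."""
    path = [src]
    while path[-1] != dest:
        path.append(max(q[path[-1]], key=q[path[-1]].get))
    return path

q_table = learn_routes(delay, "D")
best_path = greedy_path(q_table, "A", "D")
```

In this toy topology the learned policy prefers the three-hop route through B and C (total delay 3 ms) over the shorter two-hop alternatives, which is the kind of metric-driven choice a hop-count shortest path would miss.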
In [153], a routing optimization algorithm for SDN-based inter-data-center overlay networks is proposed. A time-efficient, QoS-aware adaptive routing technique that forwards packets using RL algorithms is proposed in [154]. The proposed approach selects the routing path with the maximum QoS-aware reward according to the user applications and traffic types. The literature also presents studies that use supervised learning for routing optimization. In supervised learning-based routing optimization, the network and traffic states form the input of the training dataset, and the routing solution of a heuristic algorithm forms the output. A model trained in this way can produce near-optimal, heuristic-like routing solutions. In [151], a dynamic routing framework based on supervised learning, called NeuRoute, is proposed. In NeuRoute, a long short-term memory (LSTM) component estimates future traffic. The estimated network traffic and the network state are the inputs to a neural network model, and the results of the heuristic algorithm are the outputs. The neural network is trained on these input and output data to predict heuristic-like results.

Traffic Prediction
The prediction process involves predictive modeling, typically with regression models, to estimate the likelihood of an outcome. Predictive modeling is usually practiced in the domain of machine learning and artificial intelligence [155,156]. Predictive analytics, in turn, applies modeling and machine learning techniques to current data to predict the outcome of future events. Predictive modeling is used in the literature for potential real-life applications [157,158,159,160]. Traffic prediction is an important research topic in the field of routing optimization. It is used to predict patterns in network traffic volume by analyzing historical traffic data [161]. The SDN controller uses the prediction results to make efficient traffic routing decisions in advance and to distribute dynamic routing policies to devices in the data plane. These routing policies serve as guidelines for routing traffic flows in the near future. Traffic prediction enables the SDN controller to avoid traffic congestion, improve QoS, and provision the network proactively. For dynamic optical routing, a meta-heuristic algorithm for dynamic optimal routing is proposed in [162]. The algorithm consists of three stages: offline scheduling, offline planning, and online routing. In the offline scheduling stage, a neural network predicts the network traffic load for optimal resource allocation. Online routing decisions select the routing path with the minimum cost.
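To make the idea of predicting traffic volume from historical data concrete, the sketch below fits a first-order autoregressive model by least squares and forecasts the next samples. This is a deliberately simple stand-in for the neural network predictors used in the surveyed work; the function names and the sample series are assumptions made for illustration.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = a * y[t-1] + b."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def forecast(series, steps=1):
    """Roll the fitted model forward from the last observed sample."""
    a, b = fit_ar1(series)
    value, out = series[-1], []
    for _ in range(steps):
        value = a * value + b
        out.append(value)
    return out

history_mbps = [100.0, 110.0, 120.0, 130.0, 140.0]   # hypothetical link load samples
next_load = forecast(history_mbps, steps=2)
```

A controller could compare such forecasts against link capacity and install alternative routes before congestion occurs.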
A load balancing strategy is proposed for optimizing path load in [163]. The SDN controller predicts the load of each path with a neural network model that uses four features: packet loss rate, transmission latency, transmission hops, and bandwidth utilization ratio. The path with the least load is selected for new traffic flows. In [164], an LSTM-based framework, NeuTM, is proposed to predict the network traffic matrix. Real traffic data from the GEANT backbone network [165] are used to train the LSTM model. Results from the simulation environment show that the LSTM prediction performance is good for routing optimization.

Researchers use QoS parameters such as throughput, delay, loss rate, and jitter to assess network performance. User perception and satisfaction are also essential to service providers and network operators, and QoE is used to assess network performance through user-oriented metrics. Service providers use prediction algorithms to predict QoS and QoE so that they can provide network services with high customer satisfaction. Machine learning algorithms are applied to the statistics and information collected by the SDN controller for QoS and QoE prediction [166]. QoS management can be improved by predicting QoS parameters according to key performance indicators (KPIs). The prediction of QoS parameters is treated as a regression task because the QoS metric values are continuous.
Supervised machine learning algorithms such as random forest, support vector regression, and ANN-based regressors are used to predict QoS parameters. QoE is typically expressed through a subjective metric such as the mean opinion score (MOS) [167]. MOS classifies QoE values into five classes: excellent, good, fair, poor, and bad. QoE values are usually obtained from a feedback form about the service, in which customers rate the quality of the service with up to 5 stars or on a scale of 1 to 10. This subjective method is time-consuming. Because QoE depends on QoS parameters, machine learning algorithms can be used to learn the relationship between QoS parameters and QoE values. In [168], a QoE prediction experiment is performed for a video streaming case study. The MOS value is estimated from network parameters such as delay, bandwidth, jitter, and RTT, and the SDN controller can adjust video parameters to improve the user QoE. In [167], QoE values are predicted from video quality parameters using four machine learning algorithms: decision tree, K-NN, ANN, and random forest. The Pearson correlation coefficient and the root mean square error (RMSE) are used for the performance analysis.
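The two evaluation metrics mentioned above, and the mapping from a numeric MOS to its five classes, can be computed as in the following sketch. The sample scores are hypothetical; the 5-to-1 class mapping follows the conventional MOS scale.

```python
import math

MOS_LABELS = {5: "excellent", 4: "good", 3: "fair", 2: "poor", 1: "bad"}

def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mos_label(score):
    """Round a predicted MOS to the nearest class on the 1..5 scale."""
    return MOS_LABELS[min(5, max(1, round(score)))]

actual_mos = [4.0, 3.0, 5.0, 2.0]        # hypothetical user ratings
predicted_mos = [3.5, 3.5, 4.5, 2.5]     # hypothetical model outputs
error = rmse(actual_mos, predicted_mos)
correlation = pearson(actual_mos, predicted_mos)
```

A low RMSE indicates that predictions are close to the ratings in absolute terms, while a high Pearson coefficient indicates that the model ranks sessions in the same order as users do; the two are complementary.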

Resource Management
Network operators and service providers use resource management techniques to improve network performance. SDN maximizes resource utilization through network-based resource management. Resource management at the data plane level covers computing, networking, and caching resources. Networking resources include bandwidth, spectrum, and power, which are used to fulfill user QoE and QoS requirements. Caching techniques store the most frequently requested data at the device end to remove data redundancy and reduce data transmission delay. Recent technology trends such as face recognition and augmented reality require heavy computation at the device end to enhance QoS and QoE. Because their computation and battery capacity are scarce, device resources cannot handle these computational tasks. One solution is to offload such tasks by deploying computing resources near the end users using EC [169].
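The caching behaviour described above, keeping the most frequently requested data close to the device, can be sketched with a small frequency-based cache. The class name, the `fetch` callback, and the eviction rule are assumptions chosen for illustration rather than a scheme from the surveyed papers.

```python
from collections import Counter

class FrequencyCache:
    """Keeps the `capacity` most frequently requested items (least-frequently-used eviction)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.hits = Counter()   # request counts for every key seen so far
        self.store = {}         # cached key -> content

    def get(self, key, fetch):
        """Return (content, served_from_cache). `fetch` retrieves content over the network."""
        self.hits[key] += 1
        if key in self.store:
            return self.store[key], True
        content = fetch(key)
        if len(self.store) < self.capacity:
            self.store[key] = content
        else:
            coldest = min(self.store, key=self.hits.__getitem__)
            # Replace the coldest cached item only if the new key is requested more often.
            if self.hits[key] > self.hits[coldest]:
                del self.store[coldest]
                self.store[key] = content
        return content, False
```

For example, with `cache = FrequencyCache(1)`, a second request for the same key is served locally, and a single request for a different key does not displace the more popular item.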
SDN networks are deployed in single-tenant and multi-tenant environments for efficient management of data plane resources. In a single-tenancy SDN network, a logically centralized controller manages all data plane resources. In a multi-tenancy SDN network, multiple tenants share the data plane resources, and the SDN controller of each tenant manages that tenant's isolated resources. In [170], a framework for a software-defined virtualized vehicular ad hoc network (VANET) is proposed that enhances network performance by allocating data plane resources dynamically. The resource allocation problem is formulated as a multi-objective optimization problem. Deep Reinforcement Learning (DRL) algorithms are used to solve the problem and obtain resource allocation policies.
Solutions to the multi-objective resource allocation optimization problem are proposed for the smart city case study [170,171]. Resource allocation challenges for mobile network operators in multi-tenancy SDN networks are addressed in C-RAN [172]. The resource allocation problem is formulated as a non-cooperative game-theoretic problem, and each player selects an optimal set of mobile network operators using a learning algorithm based on regret matching. The computational offloading issue in mobile edge computing (MEC) is also addressed as a non-cooperative game-theoretic problem [173]. The MEC servers are the players, and each player can take one of two actions, active or inactive. The optimization goal of each player is to minimize energy consumption. Each player learns its optimal actions using an RL-based model.
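A basic regret-matching loop for a non-cooperative selection game can be sketched as below. This is not the formulation of [172]: the game is a hypothetical congestion game in which each player picks one operator and receives that operator's capacity divided by the number of players sharing it, and all parameter values are assumptions.

```python
import random

def regret_matching(capacities, n_players, rounds=2000, seed=0):
    """Return each player's empirical action frequencies after repeated play."""
    rng = random.Random(seed)
    n_actions = len(capacities)
    regrets = [[0.0] * n_actions for _ in range(n_players)]
    counts = [[0] * n_actions for _ in range(n_players)]

    def strategy(r):
        # Play each action with probability proportional to its positive cumulative regret.
        positive = [max(v, 0.0) for v in r]
        total = sum(positive)
        if total == 0.0:
            return [1.0 / n_actions] * n_actions
        return [v / total for v in positive]

    for _ in range(rounds):
        choices = [rng.choices(range(n_actions), weights=strategy(r))[0] for r in regrets]
        load = [choices.count(a) for a in range(n_actions)]
        for p, played in enumerate(choices):
            counts[p][played] += 1
            got = capacities[played] / load[played]
            for a in range(n_actions):
                # Payoff player p would have received by choosing `a` instead.
                alt_load = load[a] + (0 if a == played else 1)
                regrets[p][a] += capacities[a] / alt_load - got
    return [[c / rounds for c in row] for row in counts]

frequencies = regret_matching([10.0, 10.0], n_players=4)
```

Each player needs only its own payoff history, which is why regret-based learning suits settings where tenants or operators do not share internal state.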
The latest advancements in network virtualization enable multi-tenancy SDN networks to share data plane resources by installing a network hypervisor between the control plane and the data plane. Each tenant manages its isolated network resources through a network hypervisor such as FlowVisor [174] or OpenVirteX [175]. Hypervisors use machine learning algorithms for efficient resource management. In [176], a tool for monitoring hypervisor CPU consumption, named hvbench, is proposed. Hvbench is a benchmarking tool used to measure the control message rate. Three regression models are trained to learn the correlation between the control message rate and CPU consumption. These trained models are used to detect overload of network hypervisors in real time. Control plane resource management in SDN is done through SDN controller placement, which has a significant impact on SDN network performance. The SDN controller processes traffic flows from switches installed at different locations. If the distance between the network devices and the SDN controller is long, traffic flow processing is delayed, and resource management must take this delay into account. In the literature, heuristic algorithms are proposed to solve the controller placement problem, but these algorithms have a high computational cost. Supervised learning algorithms are therefore used to obtain an optimal controller placement [177,178]. The input of these supervised learning models is traffic distribution data, and the output is a controller placement solution produced by heuristic algorithms. This hybrid approach, based on supervised learning and heuristic algorithms, leads to an optimal controller placement solution at a low computational cost.
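The relationship between control message rate and hypervisor CPU consumption can be illustrated with a single least-squares line. This is a minimal sketch and not the three regression models of [176]; the measurements, the `cpu_limit` threshold, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

# Hypothetical benchmark: control messages per second -> hypervisor CPU usage (%)
rates = [1000.0, 2000.0, 3000.0, 4000.0]
cpu = [15.0, 25.0, 35.0, 45.0]
slope, intercept = fit_line(rates, cpu)

def is_overloaded(rate, cpu_limit=80.0):
    """Flag a message rate whose predicted CPU usage exceeds the limit."""
    return slope * rate + intercept > cpu_limit
```

Once fitted offline, the model lets the hypervisor estimate CPU load from the message rate alone, which is cheap to observe at run time.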

IoT Leveraging SDN
IoT plays an important role in the development of intelligent systems such as smart healthcare [179], smart transportation systems [180], and smart energy systems [181] using large-scale distributed systems. These distributed systems connect billions of RFID nodes and sensors, and hence there is a need to design a secure, efficient, intelligent, cost-effective, and scalable IoT architecture [182]. A sensing network is made up of sensing nodes capable of collecting and monitoring information from the environment, such as temperature, pressure, humidity, and motion. These sensing nodes usually have scarce resources, such as limited computing power, battery, and bandwidth, which raises issues of heterogeneity and network configuration. SDN is an effective solution for network management and network configuration, and it is integrated with the sensing network to improve its sustainability and efficiency [183]. In [119], a model of a software-defined wireless sensor network (SDWSN) is proposed to address the heterogeneity and network configuration of resource-constrained nodes.
Large-scale distributed systems generate massive data, and machine learning techniques are applied for predictive analytics on such data. Predictive machine learning models are trained using a centralized approach; however, transferring such a massive amount of data to a centralized SDN controller requires large network bandwidth [74]. To reduce bandwidth consumption, EC-based solutions are used to preprocess the data and improve the system response time. The preprocessed data are then transferred to the SDN controller to speed up the machine learning training process. To improve the response time of IoT services, the trained models are deployed on edge servers [184]. Hybrid machine learning and SDN techniques are applied to route optimization, data analysis, event detection, node clustering, localization [66], and intrusion and fault detection [185] in IoT applications. Figure 3 presents a conceptual IoT architecture based on an intelligent SDN architecture with machine learning modules.
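The edge preprocessing step described above can be sketched as a windowed summary that replaces raw samples with compact records before they are sent to the controller. The window size, the record fields, and the sample readings are assumptions made for illustration.

```python
from statistics import mean

def summarize_window(readings):
    """Collapse one window of raw sensor readings into a compact feature record."""
    return {"count": len(readings), "mean": mean(readings),
            "min": min(readings), "max": max(readings)}

def preprocess(stream, window=4):
    """Yield one summary per full window instead of forwarding every raw sample."""
    for start in range(0, len(stream) - window + 1, window):
        yield summarize_window(stream[start:start + window])

raw_temperatures = [21.0, 21.5, 22.0, 21.5, 30.0, 30.5, 31.0, 30.5]   # hypothetical samples
summaries = list(preprocess(raw_temperatures))
```

Here eight raw samples become two records, so the uplink carries a fraction of the original volume while the controller still receives features that a model can be trained on.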

Figure 3. Conceptual IoT architecture based on intelligent SDN architecture with machine learning modules (labels include Data Processing & Storage Centre and Accounting & Bill).

The benefits of SDN in IoT architecture are discussed in detail by Omnes et al. [186]. SDN allows dynamic configuration to define the policies and rules for different data planes. They also discussed IoT architecture requirements such as a common service layer, big data management, QoS, and network access management. In [187], an IoT architecture based on RESTful application programming interfaces (APIs) is proposed, consisting of several modules such as northbound and southbound APIs. The control plane consists of processor and database modules. The SBI module deals with protocols such as HTTP, CoAP, and LoRaWAN. In the literature, Freescale SDN platforms such as VortiQa open network switch director and VortiQa open network are discussed [188]. An SDN-based home management platform, Majord'Home, is proposed in [189]. Due to the extensive use of the internet, device management has become a difficult task. In Majord'Home, a connected object is represented by a CO, and a coCO represents a community of connected objects. A virtual object is represented by a VO, and an avatar is used for the management of VOs.
Boussard et al. [190] extend the proposed home automation platform by generalizing the concept to any smart environment. CO is generalized to a home device, VO is the abstract view of a device, and coVO is a community of virtual objects. The proposed SDN architecture is composed of four layers: a management layer spanning vertically and three horizontal layers. The data layer performs data generation without routing and forwarding functions. The control layer is composed of a network controller, a CO controller, and a coVO controller. The management layer consists of a network manager, application, and VO manager. In [191], SDN and a distributed data service are applied to the IoT architecture. SDN provides data agility, flexibility, and mobility handling, whereas the distributed data service is used to manage big data. Such an approach is needed in IoT because its applications and services depend on the collection and analysis of sensing data. The architecture addresses three IoT domains: the M2M, network, and application domains. In [192], an approach based on a software-defined infrastructure (SDI) manager is proposed. The SDI manager consists of cloud computing controllers such as OpenStack and network controllers such as FlowVisor. The cloud computing controller performs user-level management of computing resources. The network controller performs network-level management, such as network resource management and the collection of network topology information. The network controller also interacts with Open vSwitch instances to configure the forwarding tables of the data plane.
Heterogeneity is addressed using a network operating system (NOS), which allows different applications to be deployed on a set of devices from different networks. In [193], an operating system based on the ONOS SDN controller has been developed for IoT systems to support SDN-WISE. SDN-WISE is a protocol that extends the capabilities of SDN to wireless sensor networks. In [194], the issue of heterogeneity in the IoT environment is addressed using a multi-SDN environment. The proposed approach consists of an IoT controller that communicates with the IoT devices, handles communication requests, and estimates forwarding rules. The approach has some limitations, such as scalability, the heterogeneity of identity schemes and routing protocols, and the integration of IoT agents. In [195], issues related to the management, security, and scalability of IoT are addressed using a layered architecture. The physical layer consists of IoT devices, and the middleware or control layer consists of software-defined blocks. These include blocks for security, storage, the IoT controller, and the software-defined controller. Data collectors process the data received from an IoT gateway, where a software-defined security block authenticates the data and flags them as positive or negative. The IoT controller block computes forwarding rules and passes them to the software-defined controller block, which downloads them into the network switches. WoT architectures have limitations in security, data management, and the integration of things with web technologies; an SDN-based WoT architecture is presented in [196]. An SDN layer on top of the WoT resources provides security and device management. The SDN-based WoT architecture comprises three layers: the access layer, the control layer, and the application layer. The access layer provides WoT gateways to the things, the control layer provides control functions and manages the resources database, and the application layer provides application-level management.
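The flow of data through the software-defined blocks of the layered architecture described above (security block, IoT controller block, software-defined controller block) can be sketched as follows. All names, the allow-list check, and the rule format are hypothetical; [195] does not prescribe this code.

```python
from dataclasses import dataclass, field

@dataclass
class Switch:
    flow_table: dict = field(default_factory=dict)   # match (device id) -> action

TRUSTED_DEVICES = {"sensor-1", "sensor-2"}           # hypothetical allow-list

def security_block(record):
    """Authenticate the record and flag it as positive (True) or negative (False)."""
    return record["device"] in TRUSTED_DEVICES

def iot_controller(record, positive):
    """Compute a forwarding rule: forward authenticated traffic, drop the rest."""
    action = f"forward:{record['dst']}" if positive else "drop"
    return record["device"], action

def sd_controller(switches, rule):
    """Install the computed rule on every switch."""
    match, action = rule
    for sw in switches:
        sw.flow_table[match] = action

def handle(record, switches):
    """Run one record through the security, IoT controller, and SD controller blocks."""
    rule = iot_controller(record, security_block(record))
    sd_controller(switches, rule)
    return rule
```

Keeping the three blocks as separate functions mirrors the separation of concerns in the layered design: the security decision, the rule computation, and the rule installation can each be replaced independently.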
The need for edge gateways based on universal customer premises equipment (uCPE) is increasing to facilitate various SDN/NFV functions. These functions include sensing data aggregation, policy management, data storage, protocol translation, and cloud-specific functions. To ease the flow of data between the SBI and the NBI, the MQTT protocol is deployed. Bluetooth and ZigBee modules were adapted to set up the uCPE [197,198]. Softwarized WiFi networks require dedicated software to realize wireless functionalities, which is not fully flexible and causes management complexity. The authors of [199] proposed Po-Fi, a highly flexible architecture for SDWSN based on Protocol-Oblivious Forwarding (POF) for WiFi-based innovations. Po-Fi abstracts the WiFi access point as a programmable forwarding pipeline based on SDN consensus. The data traffic collected at the Macro Base Station (MBS) increases due to the wide usage of mobile phones. The MBS cannot satisfy the requirements of all users, so some users are offloaded to nearby small cells to receive the expected services [200]. To balance the tradeoff between admittance load and economic incentive and achieve optimal offloading, an SDN-assisted Stackelberg game is proposed. The Stackelberg game model selects precisely which users should have their service aggregated with an access point at the MBS to improve QoE. Every player of the game maximizes its payoff utility in a real-time scenario. The authors obtained maximum throughput per user, so that users experience the best data service without any loss of QoE. The Stackelberg game model performs better than other game theory models in achieving optimal data offloading [201].
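The leader-follower structure of a Stackelberg game, and the tradeoff between admitted load and economic incentive, can be illustrated with a toy model. The utility functions and constants below are assumptions chosen so that the equilibrium is easy to verify by hand; they are not the model of [201]. The leader announces an incentive price, the follower answers with the load it is willing to admit, and the leader picks the price that maximizes its own utility given that answer.

```python
import math

A = 12.0   # leader's benefit weight for offloaded traffic (hypothetical)
C = 1.0    # follower's cost growth for admitting load (hypothetical)

def follower_best_load(price):
    """Follower maximizes price*x - C*x**2/2, so its best response is x = price / C."""
    return price / C

def leader_utility(price):
    """Leader gains A*ln(1 + x) from offloading x and pays price*x as incentive."""
    x = follower_best_load(price)
    return A * math.log(1.0 + x) - price * x

def solve(grid_steps=1000, max_price=10.0):
    """Grid search over the leader's price, anticipating the follower's best response."""
    prices = [max_price * i / grid_steps for i in range(grid_steps + 1)]
    best = max(prices, key=leader_utility)
    return best, follower_best_load(best)

price, load = solve()
```

With these constants the first-order condition A/(C + p) = 2p/C gives p = 2, so the grid search settles on a price of 2 and an admitted load of 2: a higher incentive would attract more load but cost the leader more than the extra offloading is worth.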
Research studies claim that the core network will utilize SDN soon, as some data centers and service provider networks are already benefiting from SDN. This will simplify network function deployment and enable effective feedback for observing network conditions. In one paper, an SDN mechanism is applied to incorporate wireless IoT edge networks. Long Range (LoRa) was selected as the IoT communication protocol, and the study proposes an integration architecture for LoRa-SDN [202]. In large-scale IoT networks such as LoRaWAN-based IoT, SDN with network slicing must manage network slices flexibly and provide optimized parameter configuration. Dawaliby et al. proposed an SDN-based network slicing architecture for LoRaWAN. The network slices are isolated and deployed virtually over LoRa's physical gateways. The study also improves large-scale network configuration using TOPG, a network slicing-based optimization that improves the parameter configuration of LoRa based on the QoS thresholds of each slice. Simulation results using NS3 showed that the proposed optimization approach improves the network performance of LoRa slices. Network performance was evaluated in terms of reliability and QoS thresholds in dense IoT network deployments [203].

Real World Applications
IoT security is a hot research topic; in the literature, machine learning is applied to SDN to improve IoT security. SDN provides an easy approach to handling both simple denial-of-service and distributed denial-of-service (DDoS) attacks [204,205]. Flow-based dynamic security schemes can be implemented at the network edge [206]. A deep learning network-based approach is proposed in [207] for anomaly detection at the edge server. In [208], a support vector machine is applied to analyze sensing data and detect abnormal activities. In [209], deep learning is applied to edge networks for the detection of IoT attacks. SDN provides significant advantages in the domain of network security, especially in large-scale IoT networks. In summary, EC, SDN, and machine learning can enable intelligent and sustainable IoT solutions. Real-world applications of IoT and machine learning leveraging SDN are given in Table 6.

IoT monitoring framework [214]: Software-defined 5G mobile networks based framework; NFV; data filtering using MQTT, custom monitoring system, handle load balance through NFV, and improve network intelligence; heterogeneous traffic management, improve reliability and usability but management functionalities.
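A flow-based security check at the network edge, as discussed in this section, could look like the following minimal sketch. It uses a simple per-source threshold rather than the learning-based detectors of [207,208,209]; the threshold value, the record format, and the rule format are hypothetical.

```python
from collections import Counter

def detect_flood_sources(flow_records, max_new_flows=100):
    """flow_records: iterable of (src_ip, dst_ip) pairs seen in one monitoring window."""
    per_source = Counter(src for src, _ in flow_records)
    return {src for src, n in per_source.items() if n > max_new_flows}

def build_drop_rules(sources):
    """Hypothetical rule format: one drop entry per offending source address."""
    return [{"match": {"ipv4_src": src}, "action": "drop"} for src in sorted(sources)]

# Hypothetical window: one source opens 150 flows, another opens 3.
window = [("10.0.0.9", "10.0.0.1")] * 150 + [("10.0.0.2", "10.0.0.1")] * 3
offenders = detect_flood_sources(window)
rules = build_drop_rules(offenders)
```

In an SDN setting the controller would push such rules to edge switches so that flood traffic is discarded before it reaches the core network or the controller itself.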

Limitations and Future Research Directions
This section lists the limitations of current IoT platforms, such as the lack of interoperability, compatibility, realizability, and security. Proposing absolute SDN-based solutions is not realistic; however, machine learning applied to SDN controllers will contribute to the realization of IoT. Figure 4 presents the limitations of current IoT. Although SDN with machine learning algorithms addresses some IoT network issues, significant research issues still need to be addressed through SDN. A high-quality training dataset is required to achieve high accuracy of machine learning models in SDN [221]. There are no standard open-source datasets of IoT network data. Comprehensive IoT network datasets would encourage researchers to analyze IoT network data using machine learning models. Handling IoT network flow data is a challenge when developing heterogeneous IoT applications [222]. An SDN-based IoT controller decreases complexity and provides flexible IoT management. However, attackers can overload the IoT controller with a massive flow of requests. Machine learning-based approaches such as the generative adversarial network (GAN) [206] are a practical way to mitigate this vulnerability by predicting new attacks. A GAN contains two neural networks: one generates new data, while the other evaluates the authenticity of the generated data against a real dataset. The GAN generates possible new attacks based on existing attack data, and machine learning models can then be trained on the new attack data to detect both known and unknown attacks. The scalability of IoT devices raises new issues regarding QoS and security, which can be addressed through machine learning-based SDN solutions. Cisco IOx is one such solution for flexible device management in a real-time IoT environment. To ensure QoS and security in the IoT domain, Microsoft proposed an architecture that performs data analytics at the network edge. Many intelligent applications, such as IBM Watson, have been proposed to collect data and develop cognitive systems.
However, the realization of IoT is still challenging from both a practical and a theoretical perspective [223]. Compatibility is lacking because thousands of studies in the IoT domain have proposed their own IoT platforms and architectures, which raises the need for new network schemes. IoT realization involves many parties: network service providers, data service providers, device manufacturers, and application developers. IoT device manufacturers develop monitoring tools to maintain and manage their devices. Some manufacturers provide cloud-based solutions for managing IoT devices and services. SDN coupled with cloud computing can hide the complexity of data service and device management; for example, Amazon provides Amazon Web Services (AWS) [224] for the management of devices and IoT services.

Figure 4. Limitations of current IoT.
Based on the presented limitations, we believe that several imperative directions must be considered in future IoT research. One of the critical challenges that needs to be addressed is the interoperability of proposed IoT platforms with existing platforms. Well-known research institutes and journal publishers should define the main IoT challenges as special issues to reduce the number of partial solutions proposed in the literature. These research institutes and research groups should also work towards the standardization of IoT. IoT simulation software should be developed to test the different IoT architectures, algorithms, and protocols. Real implementations of IoT solutions and performance analysis tools should be developed to test the platforms proposed in the IoT domain. Machine learning applied to SDN coupled with NFV will help overcome the management complexity of large-scale IoT networks. SDN enables the programmability of network functions to overcome IoT heterogeneity challenges. EC coupled with a machine learning-based SDN controller will enable real-time management of big data.

Conclusions
Innovations and developments in communication technologies such as SDN address heterogeneous communication, quality of service requirements, unpredictable network conditions, and a massive influx of data. SDN aims to deploy rule-based management to control and add intelligence to the network using high-level policies, giving integral control of the network without requiring knowledge of low-level configuration issues. This paper discussed IoT solutions that leverage SDN to address security challenges, hardware cost, centralization, and resource management in the IoT environment. Machine learning-based SDN solutions perform optimization of the IoT network, data analysis, and automated and intelligent service provisioning. They also improve the performance, efficiency, and security of IoT solutions. We also addressed potential real-world applications and their relative demerits. The limitations presented in this study provide promising directions for IoT research. The interoperability of IoT platforms, cooperation functions between SDN and NFV, VNF distribution, and tunnel management problems still need to be addressed. In vehicular networks, software-defined architectures are used to maintain functional stability and reduce control overhead, but poor resource allocation should be addressed. Research institutes and research groups should work towards the standardization of IoT. IoT-SDN simulation software should be developed to test the different software-defined IoT architectures, algorithms, and protocols. IoT solutions leveraging machine learning-based SDN remain limited as sustainable solutions for healthcare, smart homes, and smart cities. Lastly, the problems of fault detection and poor management should be addressed in smart grid environments.
Author Contributions: I. conceived the idea for this paper, designed the methodology, and wrote the paper. Z.G. assisted in the experimental design and paper write-up. J.G. and M.F. assisted in review and editing. A.M.A., A.A. and J.G. supervised and proofread the study, a topical review on machine learning, software-defined networking, and Internet of Things applications: research limitations and challenges. All authors have read and agreed to the published version of the manuscript.