Toward the Protection of IoT Networks: Introducing the LATAM-DDoS-IoT Dataset

Anomaly detection is a well-known topic in cybersecurity. Its application to the Internet of Things can lead to suitable protection techniques against problems such as denial of service attacks. However, Intrusion Detection Systems based on Artificial Intelligence, as a defense mechanism, need robust data sources to achieve strong generalization levels from the knowledge domain of interest. Therefore, in this research we present the LATAM-DDoS-IoT dataset, which results from a collaboration among Aligo, Universidad de Antioquia, and Tecnologico de Monterrey. The LATAM-DDoS-IoT dataset includes attack traffic to physical Internet of Things devices and normal traffic from real external users consuming actual services from Aligo’s production network. We also compare this new dataset with the Bot-IoT dataset, as the latter is a collection of data used in recent approaches to create detection systems in the Internet of Things domain. Furthermore, we build a smart anomaly-based Intrusion Detection System from our new dataset, training Decision Tree and Multi-layer Perceptron models, for later deployment and evaluation on a Software Defined Networking architecture with physical and virtual components. Before deployment, we obtained an average accuracy of 99.967% and 98.872% with our new dataset’s balanced denial of service and distributed denial of service versions. After deployment, we show that our Intrusion Detection System does not misclassify legitimate traffic and detects more than 90% of the attacks.


I. INTRODUCTION
The Internet of Things (IoT) requires protection techniques against cyberattacks. The massive and widespread use of these physical objects connected to the Internet [1] represents a large attack surface: hijacked devices can be turned into botnets used to conduct attacks such as phishing and Distributed Denial of Service (DDoS) [2]. The required level of protection can be obtained by identification and mitigation systems based on Machine Learning (ML) and Deep Learning (DL), which have demonstrated adequate time performance and high accuracy detection rates [3].

(The associate editor coordinating the review of this manuscript and approving it for publication was Luca Bedogni.)
Nevertheless, Artificial Intelligence (AI) models need robust data sources to achieve high generalization levels. Due to their low computing power, constrained memory, and heterogeneous device types, IoT devices can be considered a phenomenon separate from standard computers and servers, requiring datasets that reflect this nature. Moreover, there is currently a notable absence of publicly available IoT datasets for security research and practice [4]. In addition, current datasets do not simultaneously allow the modeling of real user traffic and the representation of DDoS attacks against physical IoT devices.

VOLUME 10, 2022
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

Therefore, in this paper we present a new dataset based on physical IoT devices and real users consuming actual services from a production network, to enhance the current state of the art for IoT security, particularly concerning denial of service attacks. We call this new data collection LATAM-DDoS-IoT; it can be downloaded from [5]. We provide the ground-truth pcap files and the generated network flows, their features, and the labeled categories and subcategories to facilitate the implementation of supervised learning methods. In addition, based on this dataset, we developed a novel Intrusion Detection System (IDS) [6], which we deployed and tested on a Software Defined Network (SDN) with physical and virtual components, obtaining an accuracy above 94% and a recall value of 100%. We chose this environment type due to its administration, transparency, and automation characteristics [7].
Lastly, we performed two additional experiments to compare our new dataset with the Bot-IoT dataset [8], a state-of-the-art dataset used to create Intrusion Detection Systems in the IoT domain. After performing transfer learning, we observed apparent differences due to the synthetic normal traffic of the Bot-IoT.
We can summarize the main contributions of this work as follows:
• A new publicly available dataset for security researchers and practitioners in the IoT field, based on physical IoT devices and real external users consuming actual services from a production network. This dataset is called LATAM-DDoS-IoT.
• A new IDS based on anomaly detection AI models trained using the LATAM-DDoS-IoT dataset to identify DoS and DDoS attacks, tested on an SDN environment.

The remaining sections of this paper are structured as follows: we present the related work in section II. Section III describes our methodology for creating the LATAM-DDoS-IoT dataset, the proposed AI-based IDS, and the process for building and implementing the SDN environment for testing. The results and discussion are provided in section IV. Finally, in section V, we present the conclusions and future work.

II. RELATED WORK
This section describes the most commonly used datasets for network forensics and designing protection systems [4], [9]. Additionally, we present information about the limitations of these datasets and a benchmark between them and the LATAM-DDoS-IoT. In the end, we introduce some Intrusion Detection Systems for IoT networks to define what experimentation tasks are relevant to conduct with our new dataset.
• DARPA: although this dataset has been widely used [11], [12], since 2008 [13] it has been considered outdated for modern networks.
• KDD99: it was built based on the DARPA dataset and was published in 1999 [14]. It contains attacks such as DoS, portsweep, and User to Root (U2R). Inherent problems exist in the KDD99 dataset, such as the issue of redundant records, where it has been found that about 78% and 75% of the records are duplicated in the training and testing sets, respectively [15]. This redundant information can cause bias problems in the AI methods used for learning.
• NSL-KDD: it is a newer version of KDD99 published in 2009 [16], from selected records of the complete KDD99 dataset to overcome its inherent problems. However, as stated by the NSL-KDD authors, this new version may not be a perfect representative of existing real networks [15].
• UNSW-NB15: it was created in 2015 [17] by the University of New South Wales, containing nine types of attacks, such as DoS, Backdoors, and Worms. The total number of normal traffic records is 2,218,761, and the number of attack records is only 321,283. Although this dataset is recent and reflects a wide variety of abnormal traffic, the number of records for each attack type is minimal. An IDS trained on this data could be biased towards normal traffic.
• CIC-IDS2017: it was created in 2017 [9] by the Canadian Institute for Cybersecurity. The data were captured over five days, containing normal and attack traffic, including SQL Injection, DoS using GoldenEye [18], and DDoS using the Low Orbit Ion Cannon (LOIC) tool [19] for sending UDP, TCP, and HTTP requests to the victim server. According to the dataset authors, the attack diversity of the CIC-IDS2017 is based on a report released by McAfee in 2016; thus, they argue that the dataset contains the most common and recent attacks up to that moment. Nevertheless, the CIC-IDS2017 dataset did not include IoT devices in its testbed.
• CSE-CIC-IDS2018: it was created in 2018 [20] by the Communications Security Establishment of Canada, and the Canadian Institute for Cybersecurity, as an update to the CIC-IDS2017 dataset. The CSE-CIC-IDS2018 includes prevalent attacks such as DoS using Golden-Eye and DDoS using LOIC. However, one of the main differences from the original CIC-IDS2017 is that the testbed includes more Windows machines (hundreds of them in a LAN on Amazon Web Services) divided into five subnets to emulate the departments of a company. Nonetheless, this dataset does not reflect IoT behavior since it still ignores these devices in the network topology.
• Bot-IoT: it was published in 2019 [8] by the University of New South Wales and emulates a smart home with five simulated IoT devices: a fridge, a garage door, a weather monitoring system, motion-activated lights, and a thermostat. It includes attacks such as DoS and DDoS based on HTTP using GoldenEye and based on UDP and TCP using Hping3 [21]. The normal traffic was generated using Ostinato [22] and collecting data produced by periodic normal connections between the virtual machines (e.g., when transferring files between each other). This dataset is focused on IoT; nevertheless, it does not include physical IoT devices in its testbed. The amount of normal flows is only about 9,000, compared to the attack traffic of millions of flows, presenting severe class balancing issues [23].
• TON_IOT: it was published in 2020 [24] by the University of New South Wales and contains simulated and physical IoT devices. The simulated devices using Node-RED [25] include a smart fridge and a thermostat, and the physical devices include two smartphones and a smart TV. The attacks launched against the IoT devices include DDoS and ransomware, and the normal traffic is generated from the publishing and subscribing methods between the mentioned devices to local and public Message Queue Telemetry Transport (MQTT) gateways [26]. Although this dataset includes a variety of simulated and physical IoT devices and different attacks, the normal traffic is still a product of the local testbed activities, so it is not exposed to real users for consuming actual services.
• CIC IoT: it was released in 2022 [27] by the Canadian Institute for Cybersecurity and profiles the behavior of physical IoT devices in different scenarios, including cameras, a lamp, and a coffee maker, emulating a smart home. The devices were put under DoS attack using LOIC based on the HTTP, UDP, and TCP protocols, and also under brute force attacks exploiting the cameras' Real Time Streaming Protocol (RTSP). This dataset includes physical IoT devices; however, it does not include DDoS attacks (only DoS), and it is oriented by design toward behavioral analysis of different IoT devices (e.g., during idle and powered-on states).

Based on this review, we argue that DoS and DDoS attacks are still prevalent problems faced by networks. These attacks are mainly based on the UDP, TCP, and HTTP protocols, with tools such as GoldenEye, Hping3, and LOIC commonly used to launch them. However, this review also shows a need for publicly available IoT datasets for network traffic analysis that include real users, such as the one we present in this paper. Table 1 briefly compares our proposed dataset against the most commonly used IoT datasets in the literature for creating defense systems. To the best of our knowledge, the LATAM-DDoS-IoT dataset is the only one that considers physical IoT devices while containing DDoS attacks, as well as normal traffic from real external users consuming actual services from a production network. Including DDoS attacks captures the actual distributed behavior against physical IoT devices; furthermore, the normal category allows the modeling of real user traffic.
Regarding the creation of Intrusion Detection Systems for IoT networks, we can mention some examples, such as [3], [28], [29], and [30], which we consider relevant since they show that a correct feature selection, class balancing, time-performance evaluation, and benchmark of different ML and DL models are essential aspects when deciding if an anomaly-based IDS is suitable for deployment in production environments. This exhaustive level of experimentation is achieved in this paper using the LATAM-DDoS-IoT dataset.
From now on, the names LATAM dataset (for short) and LATAM-DDoS-IoT dataset will be used interchangeably.

III. METHODOLOGY
This section explains the steps we followed to create the LATAM dataset, the data processing, and the hyperparameters for training the AI models of our anomaly-based IDS. Additionally, we present information related to our SDN architecture for the IDS deployment and evaluation.

A. LATAM-DDoS-IoT DATASET
The LATAM dataset was designed and created during a collaboration among Aligo, Universidad de Antioquia, and Tecnologico de Monterrey. Thanks to Aligo's support, we built and implemented a testbed for DoS and DDoS attacks using physical and virtual components (see Fig. 1).
Four physical IoT devices and one simulated IoT device were the victims of the DDoS attacks. The physical IoT devices were two Google Home Mini, one smart power strip, and one smart light bulb, connected via an access point. The simulated device using Node-RED was a thermostat running on a container in a virtual machine with Fedora Linux.
The attackers were two Kali virtual machines, running Hping3 for launching DoS and DDoS attacks based on UDP and TCP protocols and GoldenEye for the HTTP protocol. Moreover, these attackers used tcpdump [31] to capture the network traffic and Nmap [32] for port scanning to identify open services running on the victims.
The normal traffic was captured from the Gatherer node, connected to a span port, where production activities from Aligo's customers were collected. The time window to capture such normal traffic was 50 minutes; in contrast, each attack against a victim had a 10-minute time window. Each time window was run after the previous one finished to avoid bias in the timestamps. The collected pcap files were processed offline using Argus [33] to obtain data at the flow level, structured as the second feature set proposed in [3] for real-time IDS implementations. Although information such as destination port numbers is a suitable indicator of the targeted services, this second feature set resulted from a Pearson correlation analysis that discarded those attributes. Table 2 shows the total number of flows collected for both versions of the LATAM dataset (i.e., the LATAM-DoS-IoT and LATAM-DDoS-IoT datasets). The normal traffic is the same for both versions, with 799,187 flows collected during 50 minutes from the span port. The generated columns are shown in Table 3.
We labeled the data with a category column (where 0 means normal and 1 means attack traffic) and a subcategory column (with values from 0 to 3 for Normal, UDP, TCP, and HTTP classes, respectively). This labeling allows the implementation of supervised learning methods, with binary and multiclass classification tasks for detecting and identifying denial of service attacks.
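This labeling scheme can be sketched as follows. The column and class names below (`traffic_class`, the `label_flows` helper) are our own hypothetical illustration of the category/subcategory encoding described above, not the authors' actual pipeline:

```python
import pandas as pd

# Hypothetical sketch of the labeling described in the text:
# category: 0 = normal, 1 = attack; subcategory: 0 = Normal, 1 = UDP, 2 = TCP, 3 = HTTP.
SUBCATEGORY = {"Normal": 0, "UDP": 1, "TCP": 2, "HTTP": 3}

def label_flows(df: pd.DataFrame) -> pd.DataFrame:
    """Add 'category' and 'subcategory' columns from a 'traffic_class' column."""
    df = df.copy()
    df["subcategory"] = df["traffic_class"].map(SUBCATEGORY)
    df["category"] = (df["subcategory"] > 0).astype(int)  # any non-Normal class is an attack
    return df

flows = pd.DataFrame({"traffic_class": ["Normal", "UDP", "TCP", "HTTP"]})
labeled = label_flows(flows)
print(labeled[["category", "subcategory"]].values.tolist())
# [[0, 0], [1, 1], [1, 2], [1, 3]]
```

With these two columns, the same flow table supports both the binary (category) and multiclass (subcategory) tasks mentioned above.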
The total number of samples for the LATAM-DoS-IoT dataset is 30,662,911 flows with 20 columns, and for the LATAM-DDoS-IoT dataset is 49,666,991 flows with the same number of columns. Due to the amount of normal traffic generated by actual customers consuming real services from a company, we consider that the LATAM dataset in its two versions (DoS and DDoS) can be used, for instance, to model one-class classifiers [34] for zero-day attacks [35] detection.
Our attacks based on the TCP and UDP protocols had a flooding behavior [36]. We successfully made the two Google Home Mini devices fail: the Google service kept processing the voice requests (with the status lights on) but could not respond. We also successfully stressed the smart power strip, which did not withstand the DDoS attack for more than eight minutes. The smart light bulb's light remained on, but it could not change color. Hping3 appears to be a powerful denial of service tool, since GoldenEye (used against the simulated thermostat) did not interrupt its victim. This can be explained by GoldenEye being a low-rate denial of service tool [37], which requires more time to overwhelm the victim. Table 4 shows the commands used to launch the attacks. These commands were launched from one Kali virtual machine for the LATAM-DoS-IoT dataset and from two Kali virtual machines (simultaneously) for the LATAM-DDoS-IoT dataset.
From Table 4, we can see three distinct command patterns when launching the attacks.

From the two versions of the LATAM dataset, we decided to apply the same random selection of consecutive flow sections for each attack type, in the same proportion as the normal samples, to keep a balanced ratio, as proposed in [3]. Fig. 2 shows the data distribution of the DoS and DDoS versions of the LATAM dataset. In the end, the total dataset size for training the AI models was 2,407,102 samples for the DoS version and 2,431,453 for the DDoS version. The difference between both distributions is the number of GoldenEye records, which is lower than that of the other classes due to the low-rate behavior of this tool, characterized by small average traffic derived from periodic pulses instead of continuous flows [36].

B. DATA PROCESSING AND AI MODELS HYPERPARAMETERS TUNING FOR TRAINING
Features of both versions of the LATAM dataset were normalized by applying the StandardScaler method from the scikit-learn ML library [38], [39] (see Eq. (1)). For training our models, we did not use either the timestamps or the Argus sequence number from Table 3 (i.e., we only used 15 of the total number of created features).
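The standardization in Eq. (1), z = (x − μ)/σ per feature, can be sketched with scikit-learn's StandardScaler on toy data (the feature values below are illustrative, not LATAM records):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix standing in for the 15 LATAM training features.
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

# StandardScaler implements z = (x - mean) / std column-wise, as in Eq. (1).
scaler = StandardScaler().fit(X)
Z = scaler.transform(X)

print(Z.mean(axis=0))  # each column has (near-)zero mean
print(Z.std(axis=0))   # each column has unit standard deviation
```

In practice the scaler is fit on the training split only and then reused to transform the validation and test splits, so no test statistics leak into training.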
We decided to train Decision Tree and Multi-layer Perceptron (MLP) models, following the best ML and DL methods from [3]. The hyperparameter tuning for these models resulted from an iterative approach evaluating different configuration settings. For instance, to select the best depth for our Decision Tree, we chose the one with the best accuracy from a finite range of values; for the MLP, we searched for the best learning rate in the range between 0 and 1 using random decimal values. We show the final hyperparameter values across the different experiments of this work for the MLP in Table 5 and for the Decision Tree in Table 6.
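The depth search described above can be sketched as a simple loop over candidate depths, keeping the one with the best validation accuracy. The synthetic data and the depth range 2-20 are our assumptions for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the LATAM flows (15 features, binary labels).
X, y = make_classification(n_samples=2000, n_features=15, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Iterative tuning: evaluate each depth in a finite range and keep the best.
best_depth, best_acc = None, 0.0
for depth in range(2, 21):
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    acc = clf.score(X_val, y_val)  # validation accuracy
    if acc > best_acc:
        best_depth, best_acc = depth, acc

print(best_depth, round(best_acc, 3))
```

The MLP learning-rate search follows the same pattern, sampling random decimal values in (0, 1) instead of iterating over integer depths.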
We split the dataset into 80% for training, 10% for validation, and 10% for testing. The Decision Tree model was built using scikit-learn, and the MLP model using PyTorch [40].
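The 80/10/10 split can be obtained with two calls to `train_test_split`: carve off 20% first, then halve that remainder into validation and test partitions. The toy arrays below are illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for the labeled LATAM flows.
X = np.arange(1000).reshape(-1, 1)
y = np.arange(1000) % 2

# First split: 80% train, 20% held out.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.2, random_state=0)
# Second split: halve the held-out 20% into 10% validation and 10% test.
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 800 100 100
```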

C. SDN TESTBED FOR THE IDS DEPLOYMENT
To deploy our smart IDS, we simulated an SDN infrastructure using Mininet [41], with ONOS [42] as the network controller. Even though ONOS is more complex to implement, we decided to use it because it is one of the most robust controllers and is typically used in production environments. This testbed is an architecture with physical and virtual components: ONOS is installed and running as a Linux service on an Ubuntu 18.04.6 LTS machine, while the remaining devices are simulated using Mininet. The network diagram is depicted in Fig. 3.
The testbed includes one physical Ubuntu computer where the ONOS controller runs directly as a Linux service. The architecture also comprises two switches, each communicating with four nodes that have IP addresses in the subnet 10.0.0.x. We used iPerf [43] to generate normal client traffic and to implement two servers listening on TCP port 5001. The attacker nodes generate high-rate denial of service attacks [44] based on TCP using Hping3. Next, we describe the commands used during traffic and attack generation:
1) iperf -s: this command runs iPerf in server mode.
We used this command to implement the server nodes. The Mininet hosts are denoted h1, h2, and so on, up to h8. Next, we explain the flow of communications in the SDN testbed using this notation. Legitimate traffic is generated as follows: h2 towards h8, and h5 and h6 towards h1. Attack traffic follows these paths: h3 and h4 towards h1, and h7 towards h8. Legitimate traffic is sent simultaneously from the source nodes for 60 seconds. During the last 10 seconds, the attack traffic is launched, overlapping with the legitimate traffic. When performing a ping reachability test in the network, we confirmed that this traffic volume was high enough to drop 98% of the communications.
We have a Java prototype installed and running in ONOS. Currently, this app works only with TCP traffic, but it can be extended to capture UDP traffic; this is why we only launched TCP traffic from the attackers. The Java app is based on [45] and was developed to capture network traffic, parse the packets into flows, and extract the features. It uses a Java implementation of Flowtbag [46], originally written in Go. We extended the code to allow connectivity to our cloud-hosted REST API written in Flask [47], where our flow classification logic resides. Hence, we were able to structure the JSON request with the variables we needed. Algorithm 1 describes the IDS classification process inside our API.
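A minimal sketch of such a Flask classification endpoint is shown below. The route name, JSON field names (`model`, `features`, `pkts_per_sec`), and the threshold stub standing in for the trained Decision Tree are all our assumptions for illustration, not the authors' actual API:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def classify(features: dict) -> int:
    # Stub in place of the trained model: flag a flow as an attack (1)
    # when a hypothetical packet-rate feature exceeds a threshold.
    return 1 if features.get("pkts_per_sec", 0) > 1000 else 0

@app.route("/classify", methods=["POST"])
def classify_flow():
    payload = request.get_json()
    label = classify(payload.get("features", {}))
    return jsonify({
        "model": payload.get("model", "decision_tree"),  # model requested by the caller
        "category": label,                               # 0 = normal, 1 = attack
    })
```

The ONOS Java app would POST one JSON object per extracted flow to this endpoint and act on the returned `category`.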
The results obtained from our SDN testbed and their discussion can be read in section IV.

IV. EXPERIMENTAL RESULTS AND DISCUSSION
The classification metrics used for evaluation are accuracy, precision, recall, and F1 score. See Eq. (2)-(5) for these metrics' definitions. For the time performance evaluation, we measured the average number of flows per second each anomaly detection model could analyze, according to [48].
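The four metrics in Eqs. (2)-(5) are available directly in scikit-learn; the toy label vectors below (1 = attack, 0 = normal) are illustrative:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Toy ground truth and predictions: 3 true positives, 1 false negative,
# 1 false positive, 3 true negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

print(accuracy_score(y_true, y_pred))   # (TP + TN) / total = 6/8 = 0.75
print(precision_score(y_true, y_pred))  # TP / (TP + FP) = 3/4 = 0.75
print(recall_score(y_true, y_pred))     # TP / (TP + FN) = 3/4 = 0.75
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall
```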

A. LATAM DATASET
The classification results for both versions of the LATAM dataset are shown in Table 7, whilst the time performance results are in Table 8. These results indicate that Decision Tree is marginally better than MLP, and that both anomaly detection models outperform the maximum peak of 1,681 flows per second discussed in [48]. The LATAM-DoS-IoT classification results are slightly better than those of the LATAM-DDoS-IoT dataset. In both LATAM dataset versions, binary classification is slightly better than multiclass classification. The latter can be explained by the low number of records in the HTTP category, which affects the predictions in multiclass classification; in binary classification, this category is merged with the other attack classes (UDP and TCP).
To extend the proposed smart IDS from [3] with real traffic from real customers and physical IoT devices, we decided to conduct two experiments with the previously used balanced version of the Bot-IoT dataset. The first experiment applied transfer learning, and the second concatenated both datasets. We provide more details in the following subsection.

B. COMBINING THE LATAM AND BOT-IoT DATASETS 1) TRANSFER LEARNING
Transfer learning is an AI technique focused on transferring knowledge from a source domain to a target domain when both of them share similarities [49]. We want to transfer knowledge from the LATAM dataset to the balanced version of the Bot-IoT dataset [3] because we believe more data (millions vs. thousands of records, respectively) can lead to better generalization of anomaly detection in network flows.
Instead of applying random initializations, we used transfer learning to initialize the weights and biases of our new MLP models. To accomplish this, we took our neural networks trained with the LATAM dataset and froze all their fully connected layers except for the last one, then trained this new and final linear layer with the balanced version of the Bot-IoT dataset, which substantially shortened training time since it accelerated the whole learning process. See Table 9 for the transfer learning results with the adjusted MLP models.
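The freezing step can be sketched in PyTorch as follows. The layer sizes are illustrative (15 input features matching the training feature count, hidden sizes of our choosing), not the authors' exact architecture:

```python
import torch.nn as nn

# Illustrative MLP standing in for a network pre-trained on the LATAM dataset.
mlp = nn.Sequential(
    nn.Linear(15, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 2),  # final layer: to be retrained on Bot-IoT
)

# Freeze every layer except the last: their parameters stop receiving gradients.
for layer in list(mlp.children())[:-1]:
    for p in layer.parameters():
        p.requires_grad = False

trainable = [name for name, p in mlp.named_parameters() if p.requires_grad]
print(trainable)  # only the final Linear layer remains trainable
```

The optimizer is then given only the parameters with `requires_grad=True`, so each training step updates just the final layer.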
The recall values from both tables for binary classification indicate the models detected above 95% of the attacks. For multiclass classification, precision was the highest metric, with the majority of flows identified as attacks correctly classified. We argue that normal flows heavily affected our accuracy, since the testbed designed for the Bot-IoT dataset incorporates synthetic normal traffic generated using Ostinato, which differs from our normal traffic collected from actual customers consuming real services. This aspect leads to negative transfer [50] between both domains, which could be slightly improved with fine-tuning to find a better network setting, for instance, by freezing different layers (not all of them).

2) DATASETS CONCATENATION
For the next set of experiments, we concatenated the balanced version of the Bot-IoT dataset with each balanced version of the LATAM dataset, which allowed the models to learn from both domains from scratch, albeit taking considerably longer to train. This approach led to the best classification results. After concatenating the Bot-IoT and LATAM-DoS-IoT datasets, the resulting dataset size was 2,443,442 samples; for the Bot-IoT and LATAM-DDoS-IoT datasets, it was 2,467,793 samples. From now on, we will refer to the first combination as LATAM-Bot-DoS-IoT and to the second as LATAM-Bot-DDoS-IoT.
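The concatenation itself is a straightforward row-wise stack of the two flow tables; the column names and values below are hypothetical placeholders, and the shuffle step is our assumption about good practice before splitting:

```python
import pandas as pd

# Toy flow tables standing in for the balanced Bot-IoT and LATAM datasets
# (same feature columns and label encoding in both).
bot_iot = pd.DataFrame({"pkts": [10, 20], "category": [0, 1]})
latam = pd.DataFrame({"pkts": [30, 40, 50], "category": [1, 0, 1]})

# Stack the records so models can learn from both domains from scratch,
# then shuffle before the train/validation/test split.
combined = pd.concat([bot_iot, latam], ignore_index=True)
combined = combined.sample(frac=1, random_state=0).reset_index(drop=True)

print(len(combined))  # 5 = 2 + 3
```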
The hyperparameter settings applied to the new datasets are shown in Tables 5 and 6. We employed the same tuning approach we have followed so far. Table 7 shows the classification results for both the LATAM-Bot-DoS-IoT and LATAM-Bot-DDoS-IoT datasets, and Table 8 shows the time performance.
These time performance results indicate the feasibility of implementing our anomaly detection models in production networks. The classification experiments indicate that learning from scratch improves the results, with our models successfully learning information from both domains: the Bot-IoT dataset and the LATAM dataset. The best classification accuracy obtained from transfer learning was 87.342%, which is below even the worst MLP accuracy of 98.654% obtained with dataset concatenation. Since the Decision Tree outperforms the MLP in time performance and in both classification tasks, we chose it for a final experiment deploying our smart IDS in the SDN topology described in section III.

C. IDS DEPLOYMENT RESULTS IN THE SDN TESTBED
We deployed our Java prototype for TCP traffic over the SDN topology, using the Decision Tree for binary classification from the LATAM-Bot-DDoS-IoT dataset. We obtained an accuracy of 94.608%, a precision of 100%, a recall of 91.406%, and an F1-score equal to 95.51%. Fig. 4 shows the normalized confusion matrix.
Our results show that 100% of the flows the smart IDS identified as attacks were correctly classified, and 91.406% of the attack flows were detected. The F1 score is 95.51%, indicating a good balance between precision and recall. Since the results do not show misclassifying of legitimate traffic and show detection of more than 90% of the attacks, we are confident that our smart IDS is a suitable defense against denial of service attacks in SDN environments.
Compared to the results obtained before deployment, the accuracy degrades in the SDN scenario, mainly because of the use of Mininet for victim simulation, which differs from the physical IoT victims in the LATAM dataset (i.e., from the vast majority of the traffic in the LATAM-Bot-DDoS-IoT dataset used).
After launching the DDoS attacks, we show a ping reachability test from Mininet in Fig. 5, where we can see that 98% of the communication was dropped and the attack successfully disrupted the network. This test was executed immediately after launching all the traffic on the SDN testbed, and the only ping that remained working was between the attacking host h7 and its victim, server h8. This can be explained because ten seconds were not enough for a single attacker to completely overwhelm its victim. In this experiment, an average of 860,016 packets were transmitted by each attacker node in just ten seconds to the victim nodes h1 and h8 (as presented in section III). Fig. 6 shows a fragment of our smart IDS logs while classifying traffic in ONOS: we log the classification results, the model specified in the input JSON request to detect the traffic, and the tuple that identifies the flow (i.e., the source IP, source port, destination IP, destination port, and protocol fields). The white square in the image shows that our smart IDS can detect simultaneous attacks from different source IPs (i.e., DDoS attacks), which in this case correspond to attackers h4 and h7.

D. COMPARISON WITH PREVIOUS WORKS
Unlike state-of-the-art IoT datasets [8], [24], [27], we addressed the modeling of real user traffic and the representation of DDoS attacks against physical IoT victims simultaneously.
In addition, we performed experiments with our IoT dataset in an SDN architecture with physical and virtual components, running ONOS as a Linux service instead of Docker [51], in contrast to the fully virtualized simulation reported in [37].
Moreover, we present time performance as a metric to measure the quality of our Intrusion Detection System, different from other works [28], [30] that are limited to only classification metrics (e.g., the accuracy, precision, recall, and F1-score values).
Although the IDS deployed in [37] reports a higher accuracy of 95.01%, its precision was 95.46%, which is lower than our precision of 100%. This indicates misclassification of legitimate traffic, a severe problem for normal users, in contrast to our IDS, where all flows identified as attacks were correctly classified. Furthermore, the work in [37] uses the CIC DoS dataset [52], which targets web servers and does not consider an IoT environment.
In the next section, we conclude this paper and present future work that can be carried out to explore new directions and extend our current efforts.

V. CONCLUSION AND FUTURE WORK
In this work, we presented a new dataset that includes real normal traffic from actual clients consuming production services from a company, as well as real attack traffic to physical IoT devices. We call this new dataset LATAM-DDoS-IoT and also share a DoS version of it.
To leverage the categories and subcategories present in the LATAM dataset, we conducted binary and multiclass classifications with its balanced DoS and DDoS versions, obtaining an average accuracy of 99.967% and 98.872%, respectively. Then, we combined it with the balanced version of the Bot-IoT dataset from [3] by applying transfer learning, showing how the datasets differ from each other. Additionally, we concatenated both datasets in another experiment to get a higher level of generalization from both domains, achieving strong results such as 99.99% accuracy using Decision Tree binary classification for DoS.
Furthermore, we built an SDN architecture using ONOS and Mininet to deploy and evaluate our IDS, and coded a Java app to communicate with our cloud-hosted Flask REST API. 100% of the flows identified as attacks by our smart IDS were correctly classified, and it detected above 90% of the attack flows. Our defense system does not misclassify legitimate traffic and presents an average time performance above 30,000 flows per second, which is fast enough for physical deployment.
This work can be extended by creating and deploying an Intrusion Prevention System as a mitigation management strategy, which integrates and communicates with our Intrusion Detection System. The resulting architecture will not only allow us to detect attacks (as we have done) but also stop the identified attackers, diminishing network damage.
Another direction for further improvement is to test our IDS in an entirely physical SDN architecture. To achieve this, the next component to replace would be Mininet, substituting its virtualized hosts with real physical devices. Moreover, since our Java functional prototype was designed to capture only TCP traffic, a natural improvement would be to consider UDP traffic as well.
In addition, it will be interesting to see what kind of experiments derive from other colleagues' use of the LATAM-DDoS-IoT dataset, such as one-class classifiers, since the characteristics of real traffic from actual clients and also attack traffic directed to physical IoT devices make our dataset convenient for real production environments.