Advancing XSS Detection in IoT over 5G: A Cutting-Edge Artificial Neural Network Approach

Abstract: The rapid expansion of the Internet of Things (IoT) and the advancement of 5G technology require strong cybersecurity measures within IoT frameworks. Traditional security methods are insufficient due to the wide variety and large number of IoT devices and their limited computational capabilities. With 5G enabling faster data transmission, security risks have increased, making effective protective measures essential. Cross-Site Scripting (XSS) attacks present a significant threat to IoT security. In response, we have developed a new approach using Artificial Neural Networks (ANNs) to identify and prevent XSS breaches in IoT systems over 5G networks. We significantly improved our model’s predictive performance by using filter and wrapper feature selection methods. We validated our approach using two datasets, NF-ToN-IoT-v2 and Edge-IIoTset, ensuring its strength and adaptability across different IoT environments. For the NF-ToN-IoT-v2 dataset with filter feature selection, our Bilayered Neural Network (2 × 10) achieved the highest accuracy of 99.84%. For the Edge-IIoTset dataset with filter feature selection, the Trilayered Neural Network (3 × 10) achieved the best accuracy of 99.79%. We used ANOVA tests to address the sensitivity of neural network performance to initial conditions, confirming statistically significant improvements in detection accuracy. The ANOVA results validated the enhancements across different feature selection methods, demonstrating the consistency and reliability of our approach. Our method demonstrates outstanding accuracy and robustness, highlighting its potential as a reliable solution for enhancing IoT security in the era of 5G networks.


Introduction
The rapid evolution of the Internet of Things (IoT) and the advent of 5G technology have profoundly transformed various sectors, including e-learning, e-health, and intelligent industrial manufacturing. While these advancements have integrated smart devices into our daily lives, they have also introduced numerous security challenges. The interconnectivity of devices in IoT ecosystems creates vulnerabilities that cyberattackers can exploit, threatening the integrity and security of these systems [1]. This vulnerability is particularly dangerous in IoT devices operating over 5G networks, which enhance connectivity and capacity while expanding the spectrum of frequencies available for mobile networks [2].
As 5G networks mature, wireless communication technology continues to advance. However, new technology brings new security issues. Developing robust security guidelines and approaches is critical, particularly as more individuals and systems rely on 5G networks; this entails keeping data safe, safeguarding essential services and systems, and protecting user privacy. Addressing these security problems directly is a prerequisite for a secure and robust 5G network. The increased connectivity and bandwidth of 5G networks allow more mobile devices to connect simultaneously [3].
The faster speeds and shorter delays of 5G technology raise additional concerns about user privacy and data security. The increased amount of data generated and shared creates risks of unauthorized access to private or sensitive information [4]. Network elements such as base stations and edge servers are potential failure points that criminals might exploit to gain unauthorized access to user data. The interconnected nature of devices and systems in 5G networks heightens these risks [5]. Furthermore, the vast amount of data disseminated by IoT devices using 5G raises concerns about data security, including data resale, user profiling, and intelligence collection [6,7].
One of the security breaches that may threaten IoT devices is the Cross-Site Scripting (XSS) attack, which can be particularly harmful to web-based networks connected to IoT devices [8]. These attacks allow an attacker to exploit XSS vulnerabilities in web applications, altering the behavior of IoT devices or gaining access to the system to acquire sensitive information. Severe consequences of XSS attacks on IoT include unauthorized use of devices, data breaches, privacy breaches, and potential physical harm or safety threats [9]. The high-speed distribution of data over 5G networks amplifies the threat posed by XSS attacks, making it critical to develop a comprehensive and proactive security approach [10].
XSS is a type of security vulnerability commonly found in web applications that allows attackers to inject malicious scripts into web pages viewed by other users [11]. These scripts can execute in the context of the user's browser, potentially leading to unauthorized actions, data theft, session hijacking, and more [12]. XSS attacks come in three main types: Stored XSS, Reflected XSS, and DOM-based XSS. Stored XSS involves permanently storing malicious scripts on a target server, such as in a database, comment field, or forum post, from which they are served to users' browsers upon request [13]. Reflected XSS occurs when malicious scripts are reflected off a web server in immediate responses, such as error messages or search results, executing as soon as the user receives them [14]. DOM-based XSS happens when client-side scripts in the web application modify the DOM environment in unsafe ways, enabling the execution of an attacker's script [15].
Integrating IoT devices with 5G networks creates new vectors for XSS attacks due to increased connectivity, faster data transmission speeds, and device interdependencies [16]. Attackers can inject malicious scripts into IoT devices that host web interfaces, such as smart home devices with web dashboards. These malicious scripts can be injected into device logs, configurations, or user interfaces. In a 5G network, where IoT devices frequently communicate, a script injected into one device could propagate to other interconnected devices [17]. The high-speed data transmission capabilities of 5G networks mean that malicious scripts can execute more quickly and on a larger scale, leading to widespread disruption if one device is compromised.
Web-based management interfaces for IoT devices are another common target for XSS attacks. Attackers can inject scripts into fields that administrators or users interact with, such as login pages, settings, or logs, causing the malicious script to execute when the field is accessed [18]. Attackers also use social engineering and phishing to trick users into clicking on links that contain malicious scripts, exploiting vulnerabilities in the web interfaces of IoT devices [19]. Additionally, as 5G enhances mobile device capabilities, mobile apps that control IoT devices can be targeted by injecting scripts into web views within these apps, compromising the managed IoT devices [20].
Mitigating XSS attacks in IoT over 5G involves several strategies. Input validation and sanitization ensure that all input received by IoT devices and web interfaces is free of malicious scripts [21]. Implementing a Content Security Policy (CSP) restricts the sources from which scripts can be executed. Regular security updates for IoT device firmware and software patch known vulnerabilities. Secure coding practices during development can prevent the introduction of XSS vulnerabilities. Additionally, educating users about the dangers of phishing and social engineering attacks is crucial in reducing the risk of XSS attacks. By employing these measures, the security of IoT devices operating over 5G networks can be significantly improved, mitigating the potential impact of XSS attacks [22].
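To make two of these mitigations concrete, the following minimal Python sketch escapes untrusted text before it is embedded in a device dashboard page and defines a restrictive CSP header. The function name, the header value, and the sample payload are illustrative assumptions, not part of the proposed detection approach.

```python
import html

def render_log_entry(entry: str) -> str:
    """Escape untrusted text before embedding it in an HTML list item."""
    return f"<li>{html.escape(entry, quote=True)}</li>"

# Hypothetical CSP header for a device web dashboard: scripts may only be
# loaded from the dashboard's own origin, and plugins are disallowed.
CSP_HEADER = (
    "Content-Security-Policy",
    "default-src 'self'; script-src 'self'; object-src 'none'",
)

# A stored-XSS style payload that an attacker might write into a device log.
payload = '<script>alert("xss")</script>'
safe = render_log_entry(payload)
print(safe)
```

Escaping at output time means the payload is displayed as text rather than executed; the CSP header acts as a second layer if an escaping step is missed somewhere.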

Contribution
The contributions of this research to IoT security over 5G networks can be summarized as follows:
1. Novel ANN Application for XSS Detection: We introduced a novel ANN approach to detect XSS attacks in IoT systems over 5G, significantly improving detection accuracy and efficiency compared with traditional methods.
2. Comprehensive Dataset Utilization: We employed the NF-ToN-IoT-v2 and Edge-IIoTset datasets to validate the model's effectiveness and reliability across diverse IoT environments, ensuring generalizability and robustness.
3. Enhanced Feature Selection: We utilized both filter (mutual information (MI)) and wrapper (recursive feature elimination (RFE)) feature selection methods to optimize predictive performance, reducing computational complexity while maintaining high accuracy.
4. Statistical Validation via ANOVA Test: We applied an ANOVA test to confirm significant improvements in detection accuracy, addressing performance variability due to initial conditions and ensuring robustness and consistency.
5. High Detection Accuracy: We achieved remarkable detection accuracies with the Bilayered Neural Network (BLNN) and Trilayered Neural Network (TLNN) models, reaching up to 99.84% using BLNN on the NF-ToN-IoT-v2 dataset and 99.79% using TLNN on the Edge-IIoTset dataset, demonstrating the potential for real-time intrusion detection in IoT systems over 5G networks.

The rest of the paper is structured as follows: Section 2 provides related works; Section 3 presents the proposed methodology to detect XSS attacks; Section 4 presents the results and performance evaluation of the ANN detection approach; Section 5 presents the results of the ANOVA test; Section 6 discusses the effectiveness and efficiency of the proposed approach over the related works; and finally, Section 7 concludes the paper and suggests future research directions.

Related Work
The widespread use of IoT systems and the deployment of 5G networks have significantly changed how we interact with our environments. However, this new level of connectivity has also opened up many security risks, with XSS attacks posing a significant threat to the security of IoT systems. This section reviews the literature on using Machine Learning (ML) and Deep Learning (DL) to secure 5G IoT networks, focusing on studies that address XSS attacks on IoT systems.
Duan et al. [23] addressed the intrusion detection problem in IoT systems, which is particularly relevant to smart cities. The researchers proposed a novel approach based on dynamic line graph neural networks and semi-supervised learning. They tested their model on six datasets, including NF-ToN-IoT-V2, and achieved a highest detection accuracy of 95.70% for XSS attacks.
Gaber et al. [24] proposed an Intrusion Detection System (IDS) for detecting injection attacks in IoT. They investigated two feature selection approaches, constant removal and recursive feature elimination, with three machine learning classifiers: Support Vector Machines (SVMs), Random Forest, and Decision Tree. Using the AWID dataset (version AWID-CLS-R, created by Constantinos Kolias et al., University of the Aegean, Samos, Greece), the Decision Tree classifier outperformed the others with a 99% injection attack detection rate using only eight selected features. This research highlights the importance of injection attack detection for the security of smart cities, where numerous threats are anticipated as these cities develop.
Awad et al. [25] emphasized the rapid increase in cyberattacks on IoT networks and devices, highlighting the significance of ML in Network Intrusion Detection Systems (NIDSs). They noted that the prediction time in anomaly-based NIDSs is directly proportional to the number of features used by the ML model. Their proposed model achieved a detection accuracy of 98% for XSS attacks using just 13 features, demonstrating the effectiveness of the feature importance model.
Yigit et al. [26] conducted a study on a digital twin-empowered smart attack detection system for 6G Edge of Things (EoT) networks using the ToN-IoT datasets. They employed an online learning algorithm with AutoFS and AutoCM for dynamic and adaptive feature selection and classification. Their system achieved a sensitivity of 98.04% for XSS attack detection, which the authors attribute to its feature selection and machine learning techniques.
Sarhan et al. [27] explored XSS attack detection within IoT environments using the NF-ToN-IoT-V2 dataset. They utilized a machine-learning-based model to identify XSS injection attacks prevalent in such networks. Their model achieved an accuracy of 96.83% in detecting XSS attacks.
Awad et al. [28] conducted a study focused on enhancing IIoT security using ML and DL techniques for intrusion detection. The primary objective was to detect and mitigate 14 distinct types of cyberattacks targeting IIoT and IoT protocols. Their methodology involved using the Edge-IIoTset dataset. They implemented various ML algorithms, including k-nearest neighbors (K-NN), Decision Trees (DTs), and neural networks (NNs). The experiments were conducted using the KNIME platform, with preprocessing steps that included data cleaning, missing-value handling, and normalization to improve classification performance. Their results revealed that the K-NN algorithm achieved an accuracy of 54.37%, while the DT algorithm achieved 85.48% accuracy in detecting XSS attacks. Their study is relevant as it focuses on the effectiveness of ML and DL in securing IoT environments, aligning with our goal of using ANN for XSS attack detection over 5G networks. However, its lower accuracy for specific attacks like XSS indicates a gap in comprehensive threat detection.
Ahmed and Askar [29] developed EdgeGuard, a framework utilizing machine learning for proactive intrusion detection on edge networks. The main aim was to identify and counteract various cyber threats targeting edge and IoT environments. Their approach used convolutional neural networks (CNNs) with residual connections to identify complex patterns in network traffic data. The experiments, conducted using the Edge-IIoTset dataset, demonstrated that their method achieved 77% accuracy in detecting XSS attacks. This research is significant as it applies ML techniques to the security of IoT and edge environments, aligning with our objective of using ANN for detecting XSS attacks over 5G networks.
Ferrag et al. [30] introduced SecurityBERT, a model designed to be both lightweight and privacy-preserving, utilizing the BERT architecture to detect cyber threats in IoT devices. The research focused on enhancing threat detection accuracy while keeping computational demands low, thus making the model suitable for environments with limited resources. The methodology incorporated Privacy-Preserving Fixed-Length Encoding (PPFLE) and the Byte-Level Byte-Pair Encoder (BBPE) Tokenizer to process network traffic data. Testing on the Edge-IIoTset dataset showed that SecurityBERT achieved an overall accuracy of 98.2% in detecting fourteen types of attacks, and it achieved 76.22% accuracy specifically for XSS attack detection.
The literature review highlights the significance of Artificial Intelligence (AI) approaches in securing 5G IoT networks from XSS attacks. Most studies noted the increased vulnerability caused by the adoption of IoT systems and the societal transformation facilitated by 5G networks. They emphasized the need for feature selection approaches to enhance the detection rate of intrusion detection systems. These findings motivate further work on AI-based methods for securing IoT deployments, including those in smart cities.

Proposed Methodology
The rapid development of 5G technology and IoT has significantly transformed various aspects of modern life. These advancements have provided mobile networks with wider bandwidths, faster connections, and improved performance. However, they have also introduced a new range of security threats. Among these, XSS attacks are particularly impactful on data confidentiality, exploiting vulnerabilities in network components and web applications and jeopardizing user privacy and data security. The primary objective of this research is to develop a robust deep learning method for detecting XSS attacks on 5G-enabled IoT devices. This approach is crucial for preventing security breaches and ensuring the overall security of IoT systems.
Figure 1 illustrates the proposed methodology for identifying XSS attacks using the NF-ToN-IoT-v2 and Edge-IIoTset datasets. The methodology begins with selecting subsets from the NF-ToN-IoT-v2 and Edge-IIoTset datasets that focus on the "XSS" and "benign" categories. The preprocessing steps include data cleaning, normalization, and label encoding. The Synthetic Minority Oversampling Technique (SMOTE) is applied to address the issue of imbalanced data. Next, both filter and wrapper feature selection methods are used to identify the most valuable features for XSS detection by ANN architectures. The dataset is divided into subsets: 70% for training and 30% for testing. Various ANN classifiers, including the Narrow Neural Network, Bilayered Neural Network, and Trilayered Neural Network, are then employed to detect XSS attacks. The performance of these models is evaluated using metrics such as accuracy, precision, recall, and F1-score. Additionally, the ANOVA statistical test is applied to validate the results. By achieving high accuracy in identifying XSS attacks, this approach demonstrates its effectiveness in enhancing IoT system security and mitigating potential risks.
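The SMOTE step in this pipeline can be sketched in a few lines of plain Python. This is an illustrative version of the general technique (interpolating between a minority sample and one of its nearest minority neighbours), not the implementation used in the study; the function name, the neighbour count `k`, and the toy two-feature data are assumptions.

```python
import random
from math import dist

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by linear interpolation
    between a random minority sample and one of its k nearest neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of `base`, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Toy, already-normalized data: six benign flows and three XSS flows.
benign = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25], [0.3, 0.2], [0.25, 0.1], [0.2, 0.3]]
xss = [[0.8, 0.9], [0.9, 0.8], [0.85, 0.95]]
xss_balanced = xss + smote(xss, n_new=len(benign) - len(xss), k=2)
print(len(benign), len(xss_balanced))
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region already occupied by the original XSS records.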

Dataset
In this work, we adopted two datasets to evaluate the performance of our model, as follows:
• Edge-IIoTset Dataset: The Edge-IIoTset dataset is a comprehensive cybersecurity dataset for IoT and Industrial Internet of Things (IIoT) environments, designed to support the development and evaluation of intrusion detection systems. This dataset contains network traffic data collected from various IoT devices under normal and attack conditions. The dataset comprises 157,800 records featuring diverse types of cyberattacks and benign instances [31], as shown in Table 3. Table 4 provides a comprehensive overview of the features included in the Edge-IIoTset dataset, which is utilized for detecting various cyberattacks in IoT and IIoT environments. Each feature is described with its corresponding data type, either categorical or numerical. Key features include network-related attributes such as source and destination IP addresses, port numbers, protocol identifiers, and traffic metrics like byte and packet counts. Additional features, such as TCP flags, flow durations, and retransmission statistics, capture specific behaviors and network traffic characteristics. The dataset also includes metadata about HTTP requests, DNS queries, and other protocol-specific details critical for identifying anomalies indicative of security threats. Combining these diverse features enables a thorough analysis and accurate classification of network traffic, facilitating the detection of potential cyber vulnerabilities and enhancing the overall security of IoT systems.

Data Preprocessing
Preprocessing is essential in transforming unprocessed input into usable data, involving extensive cleaning to remove errors and superfluous information. Our method modifies the raw NF-ToN-IoT-v2 dataset, which includes 44 attributes and 53,464 entries, by correcting anomalies with substitute values and addressing missing entries using the maximum values of attributes. To facilitate the training of ANNs, we convert categorical labels into numeric codes, classifying network traffic as "Benign (0)" or "Attack (1)", and use the SMOTE technique to balance the dataset. We also apply feature selection techniques to decrease the size of the dataset and reduce the computational demands, enhancing the effectiveness and quality of the analysis. These data preparation steps are crucial for ensuring that the ANN model is trained efficiently and effectively using MATLAB software (version R2023a, MathWorks, Natick, MA, USA).

• Data Cleaning: The first step involved identifying and eliminating all cases with missing (NaN) or infinite (Inf) values. Such values may arise from measurement errors or data corruption. Once these cases were removed, the dataset became more uniform and accurate for further analysis and model training [32].
• Data Normalization: The data were normalized to improve training and the performance of the ANN model. Normalization is critical when the features of a dataset differ significantly in numerical scale. Scaling all features to the range between 0 and 1 supports classification accuracy and also helps reduce training time and potential error, since features with large magnitudes no longer dominate the learning process [33].
• Label Encoding: Label encoding is a fundamental preprocessing step for ANNs, particularly useful when working with categorical data. This technique transforms categorical variables into a numerical format, making them compatible with ANN algorithms that require numerical input. In this study, we employed label encoding to convert categorical features into numeric labels, facilitating the training and evaluation of our predictive model [34]. The adopted label encoding process is summarized in Algorithm 1.
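The three preprocessing steps above can be illustrated with a small, self-contained Python sketch. It is not the MATLAB code or Algorithm 1 from the study; the helper names, the two-feature toy records, and the label mapping are assumptions chosen to mirror the "Benign (0)" / "Attack (1)" encoding described earlier.

```python
import math

def clean(records):
    """Keep only records whose features are all finite (no NaN or Inf)."""
    return [(x, y) for x, y in records if all(math.isfinite(v) for v in x)]

def min_max_normalize(features):
    """Rescale every column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*features))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in features
    ]

def encode_labels(labels, mapping=None):
    """Map categorical class names to integer codes."""
    mapping = mapping or {"Benign": 0, "Attack": 1}
    return [mapping[label] for label in labels]

# Toy records: (feature vector, class label). Values are made up.
records = [
    ([10.0, 200.0], "Benign"),
    ([float("nan"), 50.0], "Attack"),   # dropped: missing value
    ([30.0, float("inf")], "Benign"),   # dropped: infinite value
    ([20.0, 100.0], "Attack"),
    ([30.0, 400.0], "Benign"),
]
cleaned = clean(records)
X = min_max_normalize([x for x, _ in cleaned])
y = encode_labels([label for _, label in cleaned])
print(X, y)
```

Cleaning is done before normalization so that NaN or Inf values cannot distort the column minima and maxima.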

Filter Feature Selection Method
We utilized mutual information (MI) to evaluate and select features. This method operates independently of any ANN model, focusing instead on the intrinsic properties of the data. The fundamental concept behind filter methods is to score the relevance of features based on statistical measures, specifically MI in this case, which quantifies the amount of information one variable provides about another. The MI between a feature and the target variable is a non-negative value that measures their dependency [35]. It is calculated as in Equation (1):
$$I(X; Y) = \sum_{x \in X} \sum_{y \in Y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)} \tag{1}$$
where X and Y are the feature and target variable, respectively; p(x, y) is the joint probability distribution function of X and Y; and p(x) and p(y) are the marginal probability distribution functions of X and Y. A higher MI score indicates a greater relevance of the feature to the target variable, suggesting that the feature shares more information with the target.
The process began with the encoded dataset S, from which the features X and the target variable y were isolated. Following this, for each feature f in X, we computed the mutual information score between f and y using the mutual_info_classif function. We then selected the ten features with the highest MI scores to balance retaining highly informative features against minimizing model complexity. This choice was based on empirical evidence suggesting that selecting features with the highest MI scores enhances the model's predictive accuracy while reducing the risk of overfitting. Algorithm 2 illustrates the details of this method. Figure 2 presents the MI scores for all features in the NF-ToN-IoT-V2 dataset, and Figure 3 presents the selected features with the highest MI values. Figure 4 presents the MI scores for all features in the Edge-IIoTset dataset, and Figure 5 presents the selected features with the highest MI values.
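As an illustration of Equation (1) and the top-k selection step, the sketch below computes a plug-in MI estimate for discrete features in plain Python. It is not Algorithm 2: the study reports using mutual_info_classif, which estimates MI differently for continuous features, so the scores here are only comparable on discrete toy data. The feature names and values are hypothetical.

```python
from collections import Counter
from math import log

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats for two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )

def top_k_features(columns, target, k):
    """Score every feature against the target and keep the k best."""
    scores = {name: mutual_information(vals, target) for name, vals in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Toy data: "a" copies the target, "b" is independent of it, "c" is noisy.
target = [0, 0, 1, 1, 0, 0, 1, 1]
columns = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "b": [0, 1, 0, 1, 0, 1, 0, 1],
    "c": [0, 0, 1, 1, 0, 1, 1, 0],
}
selected, scores = top_k_features(columns, target, k=2)
print(selected, scores)
```

A feature identical to a balanced binary target scores log 2 (about 0.693 nats), while an independent feature scores 0, which matches the interpretation of MI given above.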

Wrapper Feature Selection Method
We employed the recursive feature elimination (RFE) method for feature selection. Unlike filter methods, RFE operates with a predictive model, iteratively refining the feature subset to enhance model performance [36]. The core principle of wrapper methods is to evaluate feature subsets based on the performance of a chosen model, optimizing for the most predictive combination of features. RFE begins by training an estimator on the entire set of features and computing the importance of each feature. It then ranks the features by importance and removes the least important one, refitting the model on the remaining features in each iteration until the desired number of features is reached [37].
In this study, we utilized a Random Forest classifier as the estimator within the RFE algorithm, leveraging its robustness and capability to handle complex datasets. The process started with the encoded dataset S, isolating the features X and the target variable y. We then applied the RFE method with the Random Forest classifier to iteratively rank and select the ten most important features. Equation (2) expresses the selection objective:
$$X' = \underset{X_s \subseteq X,\; |X_s| = k}{\arg\min}\; L\big(f(X_s),\, y\big) \tag{2}$$
where X′ represents the selected subset of features, X_s ranges over candidate subsets of X, k is the number of desired features, L is the loss function, and f is the predictive model. By minimizing the loss function, RFE identifies the feature subset that contributes most significantly to the model's predictive accuracy. This approach ensures that the selected features retain high predictive power while keeping the model interpretable. Algorithm 3 details the RFE process. Figures 6 and 7 illustrate the feature importances and the selected features in the NF-ToN-IoT-V2 dataset, respectively, while Figures 8 and 9 present the corresponding results for the Edge-IIoTset dataset.
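The elimination loop can be sketched generically in Python. This is a skeleton of the RFE procedure, not Algorithm 3 or the estimator used in the study: the importance function is passed in as a callable, and the `centroid_gap` scorer below is a deliberately simple stand-in (the absolute gap between class means) so that the example runs without external libraries. Feature names and values are hypothetical.

```python
def rfe(features, target, n_keep, fit_importances):
    """Recursive feature elimination: fit, rank, drop the weakest feature,
    and repeat until only n_keep features remain."""
    remaining = list(features)
    eliminated = []
    while len(remaining) > n_keep:
        # Refit on the surviving features only, as RFE requires.
        importances = fit_importances({f: features[f] for f in remaining}, target)
        weakest = min(remaining, key=lambda f: importances[f])
        remaining.remove(weakest)
        eliminated.append(weakest)
    return remaining, eliminated

def centroid_gap(columns, target):
    """Stand-in importance score: absolute gap between the class means."""
    scores = {}
    for name, values in columns.items():
        pos = [v for v, t in zip(values, target) if t == 1]
        neg = [v for v, t in zip(values, target) if t == 0]
        scores[name] = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return scores

target = [0, 0, 0, 1, 1, 1]
features = {
    "f1": [0.1, 0.2, 0.1, 0.9, 0.8, 0.9],  # strongly separates the classes
    "f2": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],  # same mean in both classes
    "f3": [0.2, 0.3, 0.2, 0.6, 0.5, 0.7],  # moderately informative
}
selected, eliminated = rfe(features, target, n_keep=2, fit_importances=centroid_gap)
print(selected, eliminated)
```

With a real estimator such as a Random Forest, the importances generally change after each removal because the model is refit on fewer features; the stand-in scorer here treats each feature independently, so only the loop structure carries over.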

Classification Methods
To address the challenge of XSS attack detection, we adopted neural network architectures ranging from the simple Narrow Neural Network to the more complex Trilayered Neural Network. The experiments were carried out in MATLAB, allowing an in-depth comparative analysis of architectures that vary in the number of layers and neurons in order to model the dynamics of our datasets accurately.

Narrow Neural Network
The Narrow Neural Network (NNN), with its single hidden layer, offers model efficiency and fast training, making it particularly suitable for less complex predictive tasks. The algorithm initializes with random weights and biases and then proceeds through cycles of forward propagation, in which the data pass through linear and non-linear operations to produce output predictions. Backpropagation follows, adjusting the weights and biases to minimize the error. The forward computation is expressed in Equation (3), and Algorithm 4 illustrates the procedure of this model [38]:
$$y = \sigma\big(W_2\, \sigma(W_1 x + b_1) + b_2\big) \tag{3}$$
where σ denotes the activation function; W_1 and W_2 are the weight matrices; and b_1 and b_2 are the bias vectors that map the inputs x to the network's output y.

Bilayered Neural Network
Increasing in complexity, the Bilayered Neural Network (BLNN) integrates an additional hidden layer, enabling the model to capture more nuanced patterns. This architecture's ability to abstract complex relationships makes it apt for tasks with evolving data patterns, such as image and speech recognition. The BLNN extends the operational framework of the NNN, incorporating an extra layer into both the forward and backward propagation phases [39], thereby enhancing the model's depth and capability, as shown in Equation (4); Algorithm 5 presents the details of this model:
$$y = \sigma\Big(W_3\, \sigma\big(W_2\, \sigma(W_1 x + b_1) + b_2\big) + b_3\Big) \tag{4}$$
Here, W_3 and b_3 extend the model to accommodate another layer of computation, enriching the network's capacity to process and learn from the input data. Writing h_1 and h_2 for the hidden-layer outputs, the forward steps are:
Compute first hidden layer: $h_1 = \sigma(W_1 x + b_1)$
Compute second hidden layer: $h_2 = \sigma(W_2 h_1 + b_2)$

Trilayered Neural Network
The Trilayered Neural Network (TLNN), with its three hidden layers, is the most complex architecture in our study. This architecture's deep structure is adept at modeling high-level abstractions, making it suitable for intricate tasks, including natural language processing and advanced time series forecasting. The TLNN algorithm performs forward and backward propagation across three hidden layers, refining the network's parameters for optimal performance [40]. The mathematical representation is given in Equation (5); Algorithm 6 illustrates the procedure of this model:
$$y = \sigma\bigg(W_4\, \sigma\Big(W_3\, \sigma\big(W_2\, \sigma(W_1 x + b_1) + b_2\big) + b_3\Big) + b_4\bigg) \tag{5}$$
where W_4 and b_4 are added to accommodate the third hidden layer. Writing h_1, h_2, and h_3 for the hidden-layer outputs, the forward steps are:
Compute first hidden layer: $h_1 = \sigma(W_1 x + b_1)$
Compute second hidden layer: $h_2 = \sigma(W_2 h_1 + b_2)$
Compute third hidden layer: $h_3 = \sigma(W_3 h_2 + b_3)$
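The forward computations in Equations (3) to (5) differ only in the number of stacked layers, which the following plain-Python sketch makes explicit. It is an illustration under stated assumptions rather than the MATLAB models used in the study: the sigmoid activation, the uniform weight initialization, the width of 10 for the narrow network, and the toy input vector are all assumed; only the forward pass is shown, not backpropagation.

```python
import math
import random

def sigmoid(z):
    """Logistic activation, used here as an assumed choice for sigma."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(weights, bias, inputs):
    """One fully connected layer: sigma(W x + b)."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]

def init_network(n_inputs, hidden_sizes, seed=0):
    """Random weights and zero biases for each layer, ending in one output unit."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden_sizes, 1]
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

def forward(network, x):
    """Apply every layer in turn and return the scalar output y."""
    for weights, bias in network:
        x = dense(weights, bias, x)
    return x[0]

# Ten normalized features, standing in for the ten selected features.
x = [0.2, 0.7, 0.1, 0.9, 0.4, 0.0, 1.0, 0.3, 0.5, 0.6]
nnn = init_network(10, [10])           # one hidden layer, Equation (3)
blnn = init_network(10, [10, 10])      # two hidden layers (2 x 10), Equation (4)
tlnn = init_network(10, [10, 10, 10])  # three hidden layers (3 x 10), Equation (5)
outputs = {name: forward(net, x) for name, net in
           [("NNN", nnn), ("BLNN", blnn), ("TLNN", tlnn)]}
print(outputs)
```

Each network holds one more (W, b) pair than it has hidden layers, matching the parameter counts in the equations: (W_1, W_2) for the NNN up to (W_1, ..., W_4) for the TLNN.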

Evaluation Metrics
The original dataset was divided into two parts: 80% for training and the remainder for testing. Performance evaluation measures were selected based on the four entries of the confusion matrix: true positive (TP), false positive (FP), false negative (FN), and true negative (TN). FP corresponds to false alarms, and FN corresponds to misses. The evaluation process uses the accuracy, precision, recall, and F1-score measures.

• Accuracy: Accuracy measures the overall correctness of model predictions by comparing correct predictions with the total predictions. While useful, it may not be the best indicator for imbalanced datasets, where it can misleadingly appear high if the model predominantly predicts the majority class accurately but fails with the minority class [41].
• Precision: Precision is a metric that evaluates the accuracy of a model's positive predictions. It is calculated by dividing the number of true positives by the total predicted positives, which include both true and false positives. This measure is crucial in contexts where avoiding false positives is important. It helps assess how well a model identifies only relevant instances as positive [42].
• Recall (Sensitivity or True Positive Rate): Recall is the proportion of actual positive instances that are predicted as positive. In other words, it shows how many of the positive instances the model correctly identified without missing any. It is calculated as the ratio of true positives to the sum of true positives and false negatives [43].
• F1-score: The F1-score is the harmonic mean of precision and recall. It is a balanced measure that considers both together and is mainly used to find an optimal balance between precision and recall. It is calculated as follows [44]:
$$F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
These metrics are crucial for evaluating the performance of a classification model comprehensively. Accuracy provides a general indication of the model's effectiveness. Meanwhile, precision, recall, and the F1-score offer detailed insights into the model's ability to manage false positives and false negatives and the balance between precision and recall. The choice of which metric(s) to use depends on the specific problem being addressed and the desired outcomes of the model evaluation.
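The four metrics follow directly from the confusion-matrix counts, as the short Python sketch below shows. The function name and the counts in the usage example are made up for illustration and are not results from this study; zero denominators are not handled.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts: 90 attacks caught, 10 false alarms, 5 misses, 95 true negatives.
m = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(m)
```

In this example precision (0.90) is lower than recall (about 0.947) because false alarms outnumber misses, and the F1-score falls between the two.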

ANOVA Test for Performance Variability Analysis
To analyze the variability in neural network performance due to different initial conditions, we employed the Analysis of Variance (ANOVA) test.ANOVA is a statistical method used to determine whether there are any statistically significant differences between the means of two or more independent groups [45].In this study, ANOVA was applied to compare neural network performance metrics (accuracy) across multiple trials with varying initial conditions.
The ANOVA test partitions the total variability in the data into components attributable to different sources of variation.In a one-way ANOVA, the primary components are the variability between groups and the variability within groups.The between-groups variability measures the variation due to differences between the groups, whereas the within-groups variability measures the variation within each group.
The total sum of squares (SS total ) is the sum of the between-groups sum of squares (SS between ) and the within-groups sum of squares (SS within ).The sum of squares for between groups (SS between ) is calculated as follows: where k is the number of groups, n i is the number of observations in group i, Xi is the mean of group i, and X is the overall mean of all observations.The sum of squares for within groups (SS within ) is calculated as follows: where X ij is the j-th observation in group i.
The mean squares for between groups (MS between ) and within groups (MS within ) are obtained by dividing the corresponding sum of squares by their degrees of freedom (df).The mean square for between groups (MS between ) is calculated as follows: and the mean square for within groups (MS within ) is calculated as follows: where N is the total number of observations across all groups, and k is the number of groups.
The F-statistic is then calculated as the ratio of the between-groups mean square to the within-groups mean square, as follows:
$$F = \frac{MS_{between}}{MS_{within}}$$
Under the null hypothesis of equal group means, this F-statistic follows an F-distribution with $k - 1$ and $N - k$ degrees of freedom. The p-value associated with the F-statistic is used to determine whether the observed differences between group means are statistically significant.
We applied this ANOVA test to analyze the variability in neural network performance, ensuring that the evaluation accounts for differences due to initial conditions. Algorithm 7 outlines the procedure; this approach provides a statistically robust comparison of performance metrics. The results of the ANOVA test are discussed in Section 4.
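The sums of squares, mean squares, and F-statistic described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the implementation used in the study: the function name `one_way_anova` and the accuracy values are hypothetical. The p-value step is omitted because it needs the F-distribution, which is not in the standard library; `scipy.stats.f_oneway` returns both the statistic and the p-value.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SS_between, SS_within, F) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Between-groups variability: group means around the overall mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-groups variability: observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ss_between, ss_within, ms_between / ms_within


# Hypothetical accuracies (%) for three models, three runs each.
accuracies = [
    [99.1, 99.3, 99.2],
    [99.2, 99.4, 99.3],
    [99.0, 99.2, 99.1],
]
ss_b, ss_w, f_stat = one_way_anova(accuracies)
print(f"SS_between={ss_b:.4f}, SS_within={ss_w:.4f}, F={f_stat:.4f}")
```

Each group here stands for one architecture and each observation for one training run with a different random seed, mirroring the grouping described in the text.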

Results
The performance metrics presented in Table 5 provide a comprehensive evaluation of various neural network architectures during the training and testing stages across four datasets: NF-ToN-IoT-V2 Filtered Dataset, NF-ToN-IoT-V2 Wrapper Dataset, Edge-IIoTset Filtered Dataset, and Edge-IIoTset Wrapper Dataset.
For the NF-ToN-IoT-V2 Filtered Dataset, the Bilayered Neural Network architectures (2 × 10 and 2 × 25) exhibit superior performance, achieving remarkably high metrics across both training and testing stages. Specifically, the Bilayered Neural Network 2 × 10 achieves a training accuracy of 99.93% and a testing accuracy of 99.84%, with precision values of 99.90% (train) and 99.78% (test), recall values of 99.92% (train) and 99.82% (test), and F1-scores of 99.91% (train) and 99.80% (test). The Bilayered Neural Network 2 × 25 performs comparably, with a training accuracy of 99.91% and a testing accuracy of 99.88%, precision values of 99.88% (train) and 99.84% (test), recall values of 99.89% (train) and 99.86% (test), and F1-scores of 99.89% (train) and 99.85% (test). These results highlight their robust capability in generalizing from training data to accurately detect XSS attacks with minimal false positives and false negatives.
In comparison, the Medium Neural Network 1 × 25 demonstrates strong performance, particularly in the testing stage, with an accuracy of 98.99%, a precision value of 98.68%, a recall value of 98.88%, and an F1-score of 98.78%. This surpasses the Wide Neural Network 1 × 100, which records a slightly lower testing accuracy of 98.79%, a precision value of 98.43%, a recall value of 98.66%, and an F1-score of 98.54%. This indicates the effectiveness of the Medium Neural Network in balancing model complexity and generalization ability, making it a preferable choice over the Wide Neural Network for this dataset. Conversely, the Narrow Neural Network 1 × 10 shows lower performance, with a training accuracy of 98.77%, a testing accuracy of 98.74%, precision values of 98.40% (train) and 98.36% (test), recall values of 98.64% (train) and 98.60% (test), and F1-scores of 98.52% (train) and 98.48% (test). These results highlight the challenges simpler architectures face in accurately detecting XSS attacks, underlining the importance of a modest increase in model complexity for better performance.
For the NF-ToN-IoT-V2 Wrapper Dataset, the Trilayered Neural Network configurations (3 × 10 and 3 × 25) show consistently robust performance. The 3 × 10 network achieves a training accuracy of 99.76% and a testing accuracy of 99.76%, precision values of 99.68% (train) and 99.68% (test), recall values of 99.73% (train) and 99.73% (test), and F1-scores of 99.71% (train) and 99.71% (test). The 3 × 25 network follows closely with a training accuracy of 99.72% and a testing accuracy of 99.71%, precision values of 99.63% (train) and 99.61% (test), recall values of 99.68% (train) and 99.67% (test), and F1-scores of 99.66% (train) and 99.64% (test). These configurations outperform the Bilayered Neural Network 2 × 25, which, although also performing exceptionally well, records slightly lower metrics, particularly in precision (99.84% train, 99.78% test) and recall (99.86% train, 99.81% test).
In the case of the Edge-IIoTset dataset, the Trilayered Neural Network 3 × 10 emerges as the top performer with a test stage accuracy of 99.79%, showcasing its outstanding generalization capabilities. Specifically, it records a training accuracy of 99.85%, precision values of 99.80% (train) and 99.72% (test), recall values of 99.83% (train) and 99.76% (test), and F1-scores of 99.81% (train) and 99.74% (test). The Bilayered Neural Network 2 × 25 also achieves high performance, particularly in the filtered dataset variant, recording a training accuracy of 99.32% and a test stage accuracy of 99.00%, precision values of 99.11% (train) and 98.84% (test), recall values of 99.24% (train) and 99.01% (test), and F1-scores of 99.17% (train) and 98.92% (test). These results indicate that while both configurations are highly effective, the Trilayered Neural Network 3 × 10 holds a slight edge in terms of overall performance.
Across all datasets, the Bilayered Neural Network architectures consistently achieve the highest accuracy and robustness with relatively lower complexity compared with the trilayered networks. This is particularly evident in the NF-ToN-IoT-V2 Filtered Dataset, NF-ToN-IoT-V2 Wrapper Dataset, and Edge-IIoTset Wrapper Dataset, where the Bilayered Neural Network 2 × 10 and 2 × 25 configurations demonstrate superior performance metrics with high accuracy, precision, recall, and F1-scores, underscoring their effectiveness and efficiency. These findings suggest that the bilayered architectures provide a balanced and highly effective solution for accurately detecting XSS attacks in IoT systems over 5G networks, achieving high performance with less complexity.
The confusion matrices presented in Figure 10 indicate that a larger number of neurons in the single hidden layer has improved the model's ability to identify XSS attacks. However, false positives have increased slightly, which may be acceptable if reducing false negatives is a priority. The Wide Neural Network 1 × 100 (e, f) records 26,163 true positives and 10,817 true negatives during training, with 267 false positives and 178 false negatives. In the test stage, it shows 11,210 true positives and 4635 true negatives, with 116 false positives and 78 false negatives. The Medium Neural Network 1 × 25 performs best among all single-layer models, highlighting its strong ability to generalize from training to testing data and effectively balance the trade-off between sensitivity and specificity.
In the case of multi-layer networks, the Bilayered Neural Network achieved the highest performance among all models, starting with the 2 × 10 configuration (g, h).
Across all stages and configurations, it is evident that increasing network width and depth does not necessarily improve detection performance, as observed in the experimental results for the NF-ToN-IoT-V2 Filtered Dataset, where bilayered models with a simple 2 × 10 configuration obtained better results than their more complex counterparts. Therefore, model complexity must be taken into consideration when adopting attack detection models.

Results of the ANOVA Test for Wrapper and Filtered NF-ToN-IoT-V2 and Edge-IIoTset Datasets
To investigate the performance of the DL models on the filtered and wrapper NF-ToN-IoT-V2 and Edge-IIoTset datasets, we applied ANOVA tests to determine whether statistically significant accuracy differences exist among the seven models. The tests account for the sensitivity of neural networks to their initial weights (deep networks also have many hyperparameters). Each model was trained ten times with random initial weights to put this variability into perspective. The results of these ANOVA tests are presented in Table 6. The ANOVA test results indicate that, for the NF-ToN-IoT-V2 and Edge-IIoTset datasets, the differences in accuracy between the seven neural network models are not statistically significant. For the filtered NF-ToN-IoT-V2 dataset, the between-groups sum of squares (samples) is 0.0750, while the within-groups sum of squares (error) is 5.7026. The total sum of squares is 5.7776, with between-groups and within-groups degrees of freedom of 6 and 63, respectively. The mean square between groups is 0.0125, and the mean square within groups is 0.0905, resulting in an F-statistic of 0.1381 and a p-value of 0.9907. Likewise, the wrapper NF-ToN-IoT-V2 dataset has a p-value of 0.9804, with the sum of squares between groups at 0.0930 and within groups at 5.3221. For the Edge-IIoTset dataset, the filtered dataset displays a p-value of 0.9750 with a between-groups sum of squares of 0.1077 and a within-groups sum of squares of 5.5953, while the wrapper dataset has a p-value of 0.9603, with a between-groups sum of squares of 0.1457 and a within-groups sum of squares of 6.2878. These high p-values indicate that any observed differences in model accuracy are likely the result of random chance rather than actual differences in model performance.
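As a quick arithmetic check, the values reported for the filtered NF-ToN-IoT-V2 dataset can be reproduced from the sums of squares and degrees of freedom given above (seven models with ten runs each, so N = 70 and k = 7):

```latex
\begin{align*}
MS_{between} &= \frac{SS_{between}}{k-1} = \frac{0.0750}{6} = 0.0125,\\
MS_{within}  &= \frac{SS_{within}}{N-k} = \frac{5.7026}{63} \approx 0.0905,\\
F &= \frac{MS_{between}}{MS_{within}} \approx \frac{0.0125}{0.0905} \approx 0.1381.
\end{align*}
```

An F-statistic well below 1 means that the spread between model means is smaller than the run-to-run spread within each model, which is consistent with the large p-value.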
The box plots in Figure 14 visually support the ANOVA results, showing similar distributions of accuracy values across the models with overlapping interquartile ranges (IQRs) and medians. In the filtered NF-ToN-IoT-V2 dataset (Figure 14a), the models exhibit varying ranges of accuracy values, with some models, such as Model 1 and Model 3, showing higher variability and wider ranges, while Model 6 demonstrates more consistent performance with a narrower range. This pattern is similarly observed in the wrapper NF-ToN-IoT-V2 dataset (Figure 14b). For the Edge-IIoTset dataset, both the filtered (Figure 14c) and wrapper (Figure 14d) plots indicate that while some models exhibit wider ranges of accuracy values, the overall distributions are quite similar across the models. However, these differences in variance do not translate into significant differences in overall performance, as the ANOVA results indicate. The consistency across models and datasets suggests that the training process is stable, providing reasonable assurance of accuracy regardless of which model is chosen. The results suggest that all seven models may be used interchangeably without a significant change in performance; this demonstrates the consistency of the neural network training process and underscores the robustness of the models across the datasets and feature selection techniques.

Comparison with Related Works
In this section, we compare our proposed XSS attack detection method with existing studies in the field, focusing on the datasets, feature selection methods, and performance outcomes. This comparison highlights the advancements and improvements offered by our approach, as shown in Table 7.
Our work advances the current state of the art by achieving high detection accuracy. For the NF-ToN-IoT-V2 dataset, our Bilayered Neural Network achieved 99.84% accuracy, outperforming that of Duan et al. (2022) [23], who attained 95.70% using DLGNN, and that of Awad et al. (2022) [25], who reported 98% with a Random Forest model. Similarly, Yigit et al. (2023) [26] achieved 98.04% using an LSTM-AE model, and Sarhan et al. (2022) [27] reported 96.83% using Extra Trees. Our approach's use of neural network architectures combined with filter and wrapper feature selection techniques contributes to this superior performance. These results highlight the effectiveness of our method in utilizing comprehensive feature selection and suitable neural network structures to achieve higher accuracy in detecting XSS attacks.
Reference | Dataset | Feature selection | Performance
[24] | AWID | Recursive feature elimination, constant removal | 99% using Decision Trees, Random Forest, SVM
[25] | NF-ToN-IoT-V2 | Feature importance model | 98% using RF
[26] | NF-ToN-IoT-V2 | AutoFS and AutoCM | 98.04% using LSTM-AE
[27] | NF-ToN-IoT-V2 | - | 96.83% using Extra Tree
[28] | Edge-IIoTset | - | 85.48% using Decision Trees
[29] | Edge-IIoTset | - | 77% using CNN
[30] | Edge
The work in [30] achieved 76.22% using SecurityBERT. These comparisons underscore the robustness and efficacy of our method, particularly in handling complex datasets. The significant increase in detection accuracy, as shown in Figure 15, demonstrates the advantage of our approach in leveraging deep learning techniques to handle the intricate nature of IoT data. Integrating sophisticated neural network models and comprehensive feature selection methodologies sets a new benchmark for XSS attack detection in IoT systems, highlighting the potential for enhanced security in 5G networks. Our results confirm our approach's effectiveness and indicate its potential applicability in broader IoT security contexts, paving the way for future research and development in this critical area.

Conclusions
Our research presents a solution for detecting and mitigating Cross-Site Scripting (XSS) attacks in IoT systems over 5G networks using Artificial Neural Networks (ANNs). We developed an ANN-based approach that outperforms the related methods we compared against, achieving high detection accuracies validated by extensive testing on the NF-ToN-IoT-v2 and Edge-IIoTset datasets. We optimized our models' performance by employing mutual information (MI) and recursive feature elimination (RFE) for feature selection, reducing computational demands while maintaining high accuracy. Our Bilayered Neural Network (BLNN) and Trilayered Neural Network (TLNN) models reached accuracies of 99.84% and 99.79%, respectively, exceeding the accuracies reported by existing methods. The robustness and reliability of our approach were further examined through ANOVA tests, which showed no statistically significant differences in accuracy among the evaluated architectures across repeated random initializations, indicating a stable training process. This research sets a new benchmark for XSS attack detection in IoT environments, showcasing the effectiveness of suitable neural network architectures and comprehensive feature selection techniques. Our findings emphasize the critical role of advanced artificial intelligence models in enhancing IoT security, paving the way for future innovations in safeguarding IoT systems in the 5G era.
Funding: This research is funded by Northern Border University, Arar, Saudi Arabia, through the project number "NBU-FFR-2024-1661-04".

Figure 1. Proposed approach to detecting XSS attacks.

Algorithm 1: Label encoding of categorical features.
Input: Dataset D with categorical features C.
Output: Dataset D′ with categorical features transformed into numeric labels.
Initialize an instance of LabelEncoder: label_encoder ← new LabelEncoder(). // Step 1: Initialize Label Encoder
Define a list of categorical columns: categorical_columns ← list of categorical features in D. // Step 2: Define Categorical Columns
for each column c in categorical_columns
    Fit the LabelEncoder to the column c and transform the values:
    transformed_values ← label_encoder.fit_transform(D[c]). // Step 3: Fit and Transform
    Replace the original column values in D with the transformed numeric labels:
    D[c] ← transformed_values. // Replace Values
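The label-encoding step can be illustrated with a dependency-free sketch. It mimics the behavior of scikit-learn's `LabelEncoder`, which assigns integer codes following the sorted order of the unique values. The helper `label_encode`, the column names, and the toy records are hypothetical and serve only as an example.

```python
def label_encode(values):
    """Map each distinct value to an integer code, in sorted order of the values."""
    classes = sorted(set(values))
    mapping = {value: code for code, value in enumerate(classes)}
    return [mapping[v] for v in values], classes


# Hypothetical dataset stored as a dict of columns.
dataset = {
    "proto": ["tcp", "udp", "tcp", "icmp"],
    "bytes": [120, 64, 300, 28],
}
categorical_columns = ["proto"]

for column in categorical_columns:
    # Fit and transform in one pass, then overwrite the original column.
    dataset[column], classes = label_encode(dataset[column])
    print(column, "classes:", classes)

print(dataset["proto"])
```

Numeric columns are left untouched; only the columns listed in `categorical_columns` are replaced by integer codes.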

Algorithm 2: Filter feature selection based on mutual information (MI).
Input: Dataset S with features X and target class labels y, and optionally a threshold t for feature selection.
Output: Ranked list of features S* based on their mutual information scores.
S* ← ∅ // Initialize the ranked list of features
for each feature f in X
    Compute mutual information score between feature f and y using mutual_info_classif and store the score. // Score calculation for f
Rank X based on computed mutual information scores to form S*. // Feature ranking
if threshold t is specified
    S* ← features from S* with scores ≥ t. // Threshold filtering
else
    Select the top k features from S* for final inclusion. // Top k selection
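The ranking and selection logic of the filter method can be sketched as follows. Note the substitution: scikit-learn's `mutual_info_classif` uses a nearest-neighbor estimator for continuous features, whereas this sketch uses a simple plug-in estimate that is only appropriate for discrete features. The function names, feature names, and toy data are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def select_features(features, y, threshold=None, k=None):
    """Rank features by MI with y, then keep those above a threshold or the top k."""
    scores = {name: mutual_information(col, y) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    if threshold is not None:
        return [name for name in ranked if scores[name] >= threshold]
    return ranked[:k]


# Hypothetical discrete features and binary labels.
y = [0, 0, 1, 1]
features = {
    "f_a": [0, 0, 1, 1],  # identical to the label: highest MI
    "f_b": [0, 1, 0, 1],  # independent of the label: MI of zero
    "f_c": [0, 0, 0, 1],  # partially informative
}
print(select_features(features, y, threshold=0.1))
print(select_features(features, y, k=1))
```

With a threshold, the number of retained features depends on the score distribution; with top-k selection, it is fixed in advance.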

Figure 2. MI scores for all features in the NF-ToN-IoT-V2 dataset.

Figure 3. Selected features with the highest MI values in the NF-ToN-IoT-V2 dataset.

Figure 4. MI scores for all features in the Edge-IIoTset dataset.

Figure 5. Selected features with the highest MI values in the Edge-IIoTset dataset.

Algorithm 3: Wrapper feature selection based on recursive feature elimination (RFE).
Input: Dataset S with features X and target class labels y, the number of features to select k, and optionally the step size s for feature elimination.
Output: Ranked list of features S* based on their importance.
S* ← X // Initialize the set of features
while |S*| > k do
    Train the model f using features in S*. // Model training
    Compute feature importances β = [β_1, β_2, ..., β_m] using the trained model. // Importance computation
    Rank features based on the importance scores β. // Feature ranking
    Identify the least important feature f_j with the lowest β_j. // Identify least important feature
    if s is specified
        Remove the s least important features from S*. // Remove s features
    else
        Remove the single least important feature f_j from S*. // Remove one feature
    Evaluate model performance L(y_i, f(X′_i)) with the remaining features. // Performance evaluation
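The elimination loop can be illustrated with a small, self-contained sketch. The base estimator here is an assumption: a tiny logistic regression trained by batch gradient descent, with the absolute weight of each feature used as its importance. The estimator used in the study is not stated in this section, so the functions `train_logistic` and `rfe`, the hyperparameters, and the toy data are all hypothetical. The performance-evaluation step is omitted for brevity.

```python
from math import exp


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit a logistic regression by batch gradient descent and return its weights."""
    n, m = len(X), len(X[0])
    w, b = [0.0] * m, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * m, 0.0
        for row, target in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + exp(-z)) - target
            for j in range(m):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def rfe(X, y, k, step=1):
    """Repeatedly retrain and drop the least important features until k remain."""
    selected = list(range(len(X[0])))
    while len(selected) > k:
        X_sub = [[row[j] for j in selected] for row in X]
        importances = [abs(wj) for wj in train_logistic(X_sub, y)]
        n_remove = min(step, len(selected) - k)
        ranked = sorted(range(len(selected)), key=lambda i: importances[i])
        drop = set(ranked[:n_remove])
        selected = [f for i, f in enumerate(selected) if i not in drop]
    return selected


# Hypothetical data: column 0 equals the label, columns 1 and 2 carry no signal.
y = [0, 0, 1, 1, 0, 0, 1, 1]
x1 = [1, -1, 1, -1, 1, -1, 1, -1]
x2 = [1, -1, -1, 1, 1, -1, -1, 1]
X = [[float(label), float(a), float(b)] for label, a, b in zip(y, x1, x2)]

print(rfe(X, y, k=1))
```

Setting `step` above 1 removes several features per iteration, which trades ranking granularity for fewer retraining rounds.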

Figure 6. Feature importance scores for all features in the NF-ToN-IoT-V2 dataset.

Figure 7. Selected features with the highest feature importance values in the NF-ToN-IoT-V2 dataset.

Figure 8. Feature importance scores for all features in the Edge-IIoTset dataset.

Figure 9. Selected features with the highest feature importance values in the Edge-IIoTset dataset.

Algorithm 5: Training procedure for Bilayered Neural Network (BLNN).
Input: Input features X, Target labels Y, Learning rate η, Number of epochs
Output: Trained BLNN model with optimized weights and biases
Initialize weights W_1, W_2, W_3 and biases b_1, b_2, b_3 randomly
for each epoch
    foreach sample (x, y) in X, Y do
        // Forward propagation

Algorithm 6: Training procedure for Trilayered Neural Network (TLNN).
Input: Input features X, Target labels Y, Learning rate η, Number of epochs
Output: Trained TLNN model with optimized weights and biases
Initialize weights W_1, W_2, W_3, W_4 and biases b_1, b_2, b_3, b_4 randomly
for each epoch
    foreach sample (x, y) in X, Y do
        // Forward propagation
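The random initialization and forward-propagation steps named in Algorithms 5 and 6 can be sketched generically for both architectures. Several choices here are assumptions, not details from the source: ReLU activations in the hidden layers, a sigmoid output unit, uniform initialization in [-0.5, 0.5], and the input size of 5. The functions `init_layers` and `forward` are hypothetical, and the backward pass and weight updates are not shown.

```python
import random
from math import exp


def init_layers(n_inputs, hidden_sizes, seed=0):
    """Create randomly initialized (weights, biases) pairs for each layer."""
    rng = random.Random(seed)
    sizes = [n_inputs] + list(hidden_sizes) + [1]  # single output unit
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Forward propagation: ReLU in hidden layers, sigmoid at the output."""
    activation = x
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        if index == len(layers) - 1:
            activation = [1.0 / (1.0 + exp(-v)) for v in z]
        else:
            activation = [max(0.0, v) for v in z]
    return activation[0]


sample = [0.2, -0.1, 0.4, 0.0, 0.3]            # hypothetical feature vector
blnn = init_layers(5, hidden_sizes=[10, 10])   # 2 x 10: weights W_1..W_3
tlnn = init_layers(5, hidden_sizes=[10, 10, 10])  # 3 x 10: weights W_1..W_4
print(forward(blnn, sample), forward(tlnn, sample))
```

Changing the seed in `init_layers` gives a different starting point, which is the source of run-to-run variability that the ANOVA analysis examines.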

Algorithm 7: ANOVA test for neural network performance.
Input: Number of trials per group (T), Number of groups (G)
Output: F-statistic, p-value
Initialize arrays to store performance metrics for each group
for each group g from 1 to G
    for each trial t from 1 to T
        Set random seed
        Train neural network
        Record performance metric (accuracy) for group g
Compute overall mean of all observations (X̄)
Initialize SS_between = 0, SS_within = 0
for each group i from 1 to G
    Compute group mean (X̄_i)
    SS_between += n_i (X̄_i − X̄)^2
    for each observation j in group i
        SS_within += (X_ij − X̄_i)^2
Compute MS_between = SS_between / (G − 1)
Compute MS_within = SS_within / (N − G)
Compute F = MS_between / MS_within
Determine p-value from F-distribution
return F-statistic and p-value

The confusion matrices in Figure 10 provide a comprehensive view of the performance of various neural network configurations during both the training and testing stages. Starting with the Narrow Neural Network 1 × 10 (a, b), the model performs well on the majority class (Benign), with 26,152 true positives and 10,813 true negatives during training, but there are 276 false positives and 184 false negatives. In the test stage, it shows 11,204 true positives and 4633 true negatives, with 121 false positives and 81 false negatives. Moving on to the Medium Neural Network 1 × 25 (c, d), there is a decrease in false negatives compared with the narrow network, with 26,197 true positives and 10,831 true negatives during training and 238 false positives and 159 false negatives. In the test stage, it achieves 11,233 true positives and 4644 true negatives, with 97 false positives and 65 false negatives.
The Bilayered Neural Network 2 × 10 configuration (g, h) demonstrates exceptional accuracy during both the training and test stages, with 26,459 true positives and 10,940 true negatives during training and 16 false positives and 10 false negatives. In the test stage, it achieves 11,329 true positives and 4684 true negatives, with 16 false positives and 10 false negatives, indicating effective detection of XSS attacks in IoT environments. Moving to the more complex 2 × 25 configuration (i, j), the Bilayered Neural Network shows a slight reduction in training performance, with 26,454 true positives and 10,937 true negatives and very low errors of 20 false positives and 14 false negatives. In the test stage, it maintains this strong performance with 11,334 true positives and 4686 true negatives and minimal errors of 11 false positives and 8 false negatives, indicating excellent generalization and robust detection capability. The Trilayered Neural Networks, while still highly effective, show slightly higher error rates compared with their bilayered counterparts. The 3 × 25 configuration (m, n) records 26,239 true positives and 10,849 true negatives during training, with 202 false positives and 135 false negatives, and in the test stage, it achieves 11,233 true positives and 4644 true negatives, with 97 false positives and 65 false negatives. The 3 × 10 configuration (k, l) shows 26,247 true positives and 10,852 true negatives during training, with 196 false positives and 130 false negatives, and in the test stage, it records 11,246 true positives and 4649 true negatives, with 86 false positives and 58 false negatives.

Table 1. Overview of the NF-ToN-IoT-v2 dataset categories, detailing the diversity and scale of cybersecurity threats and benign instances recorded.

Table 3. Overview of the Edge-IIoTset dataset categories, detailing the diversity and scale of cybersecurity threats and benign instances recorded.
Password | 9989 | Uses brute force attacks or a sniffer to obtain the passwords of users.
Port_Scanning | 10,071 | Techniques for obtaining information about hosts and networks, also referred to as probing.
Ransomware | 10,925 | The victim's data are encrypted by malicious people, who only unlock them after demanding a ransom.

Table 5. Performance metrics of different neural network architectures during training and testing stages.

Table 6. ANOVA test results for filtered and wrapper datasets for NF-ToN-IoT-V2 and Edge-IIoTset datasets.

Table 7. Evaluating our XSS injection cyberattack detection approach with related works.