Intrusion Detection Based on Privacy-Preserving Federated Learning for the Industrial IoT

Federated learning (FL) has attracted significant interest given its prominent advantages and applicability in many scenarios. However, it has been demonstrated that sharing updated gradients/weights during the training process can lead to privacy concerns. In the context of the Internet of Things (IoT), this can be exacerbated due to intrusion detection systems (IDSs), which are intended to detect security attacks by analyzing the devices’ network traffic. Our work provides a comprehensive evaluation of differential privacy techniques, which are applied during the training of an FL-enabled IDS for industrial IoT. Unlike previous approaches, we deal with nonindependent and identically distributed data over the recent ToN_IoT dataset, and compare the accuracy obtained considering different privacy requirements and aggregation functions, namely FedAvg and the recently proposed Fed+. According to our evaluation, the use of Fed+ in our setting provides similar results even when noise is included in the federated training process.


I. INTRODUCTION
As the Internet of Things (IoT) expands, there is a significant increase in the number and impact of security vulnerabilities and threats associated with IoT devices and systems.
To cope with such concerns, intrusion detection systems (IDSs) represent a well-known approach for the early detection of IoT attacks and cyber-threats [1]. In recent years, IDS mechanisms have usually been based on artificial intelligence (AI) techniques, so that the system is trained on devices' network traffic to accurately detect any anomalous behavior that could represent a certain type of attack [2]. Indeed, AI-based IDSs are trained on monitored network traffic and behavioral data from heterogeneous IoT devices deployed in remote, possibly untrusted, and distributed domains and systems, to increase the overall accuracy of attack detection. However, this approach raises privacy issues, as different domains might need to share their private data [3].
As an alternative to typical centralized learning approaches, federated learning (FL) was proposed in 2016 [4] as a collaborative learning approach, in which an AI algorithm is trained locally across multiple decentralized edge devices, called clients or parties, and the information is continuously updated onto a global model through several training rounds. Instead of sharing their data, parties share their models with an aggregator, which computes a global model. Nonetheless, FL suffers from privacy issues, as the global model's updates provided by parties could be used to launch several attacks to infer the private information of the training data [5]. To mitigate such privacy concerns, differential privacy (DP) [3] can be employed to obfuscate either the training data or model updates, giving statistical privacy guarantees over the data against an adversary. DP is usually considered in the scope of FL settings due to the stringent communication requirements of other privacy-preserving approaches, such as secure multiparty computation (SMC) [6].
While the use of DP techniques has been considered in FL [7], [8], existing works do not analyze the impact of such techniques in the scope of IDS approaches, and do not address the impact of different aggregation methods considering nonindependent and identically distributed (non-i.i.d.) data distributions, which are common in real-world scenarios. In this direction, our work provides a comprehensive evaluation of DP approaches through several additive noise techniques based on Gaussian and Laplacian distributions, which are applied during the training of an FL-enabled IDS for industrial IoT (IIoT). Unlike previous approaches, our evaluation is based on an instance selection process to deal with non-i.i.d. data distributions over the recent ToN_IoT dataset [9], which contains recent attacks related to such scenarios. Furthermore, unlike the current state of the art, our evaluation compares the accuracy obtained by using different DP techniques and a recently proposed aggregation function called Fed+ [10], which provides significantly better accuracy results compared to the traditional FedAvg function [4]. To the best of our knowledge, this is the first effort to analyze the impact of non-i.i.d. data and aggregation functions for the implementation of a privacy-preserving FL-enabled IDS in IoT/IIoT scenarios.
Based on this, the main contributions and novelties of this article are as follows.
1) A thorough evaluation of the feasibility and performance of applying DP-enabled FL to detect attacks in IoT scenarios, adapting and partitioning the ToN_IoT dataset considering non-i.i.d. data distributions.
2) An empirical analysis of using different FL aggregation methods with DP techniques, and of their impact on the effectiveness of intrusion detection in IoT.
3) The first complete quantitative and computational performance analysis of diverse DP perturbation mechanisms applied to FL for intrusion detection in IoT, using different privacy-factor values and FL settings.

The rest of this article is organized as follows. Section II provides an overview of FL, highlighting the main privacy issues and potential mitigation approaches. Section III describes the DP-enabled FL architecture, including the proposed training algorithm based on several DP techniques. Section IV describes our methodology for the proposed privacy-preserving FL-enabled IDS, including the aspects of the used dataset, classification model, and aggregation functions. Section V describes the evaluation results. Section VI analyzes the current state of the art. Finally, Section VII concludes this article.

II. PRIVACY-PRESERVING FL
The training process in FL is based on a set of rounds, in which the coordinator selects a subset of clients depending on the problem context and sends them the parameters of a global model. Then, each client updates those parameters using its own collected data and sends them back to the coordinator, which applies a certain aggregation method to fuse the parameters received from the clients. While FedAvg [4] represents the most common approach, in which a simple average is applied over such parameters, other methods have recently been proposed to increase the accuracy of the trained model, particularly in settings with non-i.i.d. data distributions. Indeed, as described in Section V, our work evaluates the recent Fed+ algorithm [10], which clearly improves on FedAvg for our particular scenario.
The main advantage of FL is that parties do not need to share their data for training a certain model. However, recent works highlight the need to apply privacy-preserving mechanisms to address potential attacks derived from the sharing of parameters/weights throughout the training rounds [5]. In particular, a malicious aggregator could modify the received parameters to fool the model being trained. Even an honest-but-curious aggregator might perform a reconstruction attack to infer training data from these parameters using several techniques, such as generative adversarial networks (GANs). Indeed, GANs can also be used to launch membership inference attacks, in which an attacker could infer whether local data of a certain party were used for the training process [6]. These attacks can also be carried out by entities external to the training process, as well as by compromised FL clients. In the context of IoT/IIoT, the impact of these attacks can be significant due to the potential sensitivity of the network data required for the implementation of an IDS in such scenarios.
To deal with the aforementioned privacy issues, different techniques have been postulated, including SMC and DP [5]. SMC uses different cryptographic protocols to jointly calculate a function over a set of input values that are kept private by the parties. In an FL environment, the parameters/weights produced by FL clients are thus kept private when they are fused by the aggregator. However, recent works [8] highlight the high computational and communication requirements associated with the use of SMC, which can make these techniques unfeasible for IoT environments. In turn, DP is based on injecting random noise into a dataset so that, by looking at the output of a certain function over the dataset, it is not possible to discern whether a certain sample was included in it.
According to [3], the formal definition of DP is based on two concepts.
Definition 1 (Mechanism): A mechanism κ is a random function that takes a dataset D and outputs a random variable κ(D).
For example, if the input is an IoT attacks dataset, then the output can be the flow duration plus noise from the standard normal distribution. In our case, the inputs will be the weights of a model.
Definition 2 (Distance): The distance of two datasets D and D′ denotes the minimum number of sample changes that are required to change D into D′.
For example, if D and D′ differ in at most one individual, then d(D, D′) = 1. We also call such a pair of datasets neighbors.
Definition 3 (Differential Privacy): A mechanism κ satisfies (ε, δ)-DP if and only if, for all neighbor datasets D and D′ and ∀S ⊆ Range(κ), as long as the following probabilities are well defined, there holds

Pr(κ(D) ∈ S) ≤ e^ε · Pr(κ(D′) ∈ S) + δ.

Here, δ represents the probability that a κ output varies by more than a factor of e^ε when applied to a dataset and any one of its neighbors. This definition captures the intuition that a computation on private data will not reveal sensitive information about individuals in a dataset if removing or replacing an individual in the dataset has a negligible effect on the output distribution.
A lower value of δ implies a greater confidence, and a smaller value of ε tightens the privacy protection. This can be seen because the lower δ and ε are, the closer Pr(κ(D) ∈ S) and Pr(κ(D′) ∈ S) are, and therefore, the stronger the protection. As described in [3], when δ = 0, (ε, 0)-DP simplifies to ε-DP. If δ > 0, there is still a small chance that some information is leaked; in the case of δ = 0, the guarantee is not probabilistic. Therefore, ε-DP provides stronger privacy guarantees than (ε, δ)-DP. For this reason, we have chosen δ = 0 for our approach.
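The ε-DP case above (δ = 0) can be illustrated with the classical Laplace mechanism, a standard result not specific to this article: adding Laplace noise with scale Δf/ε to a query of L1-sensitivity Δf satisfies (ε, 0)-DP. The sketch below uses hypothetical names and NumPy's random generator.

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Classical Laplace mechanism: adding Laplace(sensitivity/epsilon)
    noise to a query with L1-sensitivity `sensitivity` satisfies
    (epsilon, 0)-DP, i.e., pure epsilon-DP."""
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon  # smaller epsilon -> larger noise
    return value + rng.laplace(loc=0.0, scale=scale)
```

A smaller ε yields a larger noise scale, so the output distributions on neighboring datasets become harder to tell apart, which is exactly the guarantee Definition 3 formalizes.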
As described in Section V, our work evaluates different DP techniques based on Gaussian and Laplacian distributions, which are further detailed in Section III.

III. PROPOSED PRIVACY-PRESERVING FL-ENABLED IDS FOR IOT/IIOT
Based on the aforementioned aspects of privacy-preserving approaches for FL, in the following we describe the proposed architecture and algorithm enabling the FL training process. Furthermore, we provide a formal description of the specific DP techniques being considered in our work. For the sake of clarity, Table I provides the meaning of the main variables and terms, which are used in such description.

A. DP-Enabled FL
The overall architecture of our DP-enabled FL approach for intrusion detection in IoT is shown in Fig. 1. A client represents the end device where the local training is performed. Each client is in charge of training the global model sent by the aggregator with its local data and generating a model update. Furthermore, clients are endowed with the logic needed to apply the corresponding DP algorithm. Moreover, the aggregator is the central service that receives the model updates coming from the clients and generates an aggregated model, which is sent back to the clients for each training round.
In particular, considering the use of DP in the federated training, the process in each training round is as follows.
1) The aggregator selects a set of clients to participate in the training process by considering a certain client selection approach. In the context of IoT, operational conditions of the device (e.g., battery consumption) could be considered.
2) In the case of the initial training round, the aggregator creates a new general model W_G, whose weights (w_G^i) are sent to the selected clients. Otherwise, the aggregator creates an aggregated model by using a certain aggregation function based on the updates provided by the clients. For this work, we compare the use of the FedAvg [4] and Fed+ [10] aggregation methods, which are further explained in Section IV. The resulting aggregated model is sent to the clients.
3) Each client takes the shared model and trains it with its local data, generating a particular model update with the weights W_k^i that result from the local training. It should be noted that the number of local epochs executed by a client is determined by E.
4) The clients anonymize the calculated weights by leveraging a DP mechanism. The particular DP perturbation mechanism applied in this step can be one of the approaches defined in the following section. Then, the resulting anonymized weights W_i are sent back to the aggregator.
5) The aggregator combines all the received model updates using an aggregation algorithm, such as FedAvg, generating a new global model.

These steps are repeated until a certain number of rounds R is reached, or until another condition is met, such as achieving a certain target accuracy.
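A minimal sketch of one such training round, seen from the aggregator, might look as follows. This is not the article's implementation: the `clients` callables, the fixed Laplace perturbation, and the plain-average aggregation are all illustrative assumptions.

```python
import numpy as np

def federated_round(global_weights, clients, epsilon, rng):
    """One DP-enabled round (steps 1-5 above), sketched with a plain
    FedAvg-style average; `clients` is a hypothetical list of callables
    that perform local training and return updated weight vectors."""
    updates = []
    for client in clients:
        local_w = client(global_weights)  # step 3: local training
        # step 4: perturb the weights before sharing them
        noisy_w = local_w + rng.laplace(0.0, 1.0 / epsilon, size=local_w.shape)
        updates.append(noisy_w)
    # step 5: aggregate the (perturbed) updates into a new global model
    return np.mean(updates, axis=0)
```

With a very relaxed ε the aggregated model is close to the noise-free average, which mirrors the accuracy results discussed in Section V.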
Based on the previous description, Algorithm 1 provides a detailed description of the required steps for the DP-enabled federated training process, showing the integration and relationship between the FL aggregation method and DP mechanism applied in each round. It should be noted that our work is focused on evaluating several DP techniques under different privacy requirements to come up with a potential tradeoff between privacy and accuracy. These techniques are further described in the following.

B. Perturbation DP Mechanisms
As stated in Definition 1, an output perturbation mechanism takes an input D and returns a random variable κ(D). Such a random variable is computed by adding, to the transformation of the input data by means of a function f : X → R^d, some random noise rn that follows a certain distribution; therefore, we can express the (ε, δ)-DP mechanism as

κ(D) = f(D) + rn.

The perturbation methods analyzed in this article are summarized as follows.
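The generic output-perturbation form κ(D) = f(D) + rn can be sketched with the classical Gaussian mechanism, where σ ≥ sqrt(2 ln(1.25/δ)) · Δf / ε yields (ε, δ)-DP for ε < 1 (a textbook bound, not one of the specific mechanisms [13], [14] evaluated here).

```python
import math
import numpy as np

def gaussian_mechanism(f_of_d, sensitivity, epsilon, delta, rng):
    """Output perturbation kappa(D) = f(D) + rn with Gaussian noise.
    sigma >= sqrt(2*ln(1.25/delta)) * sensitivity / epsilon gives
    (epsilon, delta)-DP for epsilon < 1 (classical calibration)."""
    sigma = math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon
    return f_of_d + rng.normal(0.0, sigma, size=np.shape(f_of_d))
```

In our setting, f(D) would be the vector of locally trained weights, so the mechanism is applied coordinate-wise to a model update before it leaves the client.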

5) Bounded domain Laplace mechanism [13]: Given b > 0 and a domain D = [l, u] ⊂ R with l ≤ u, the bounded Laplace mechanism W_q : Ω → D, for each q ∈ D, is given by the probability density

W_q(x) = (1 / C_q) · (1 / 2b) · exp(−|x − q| / b), for x ∈ D

where C_q is a normalization constant. It should be noted that C_q is a function of b, and the privacy guarantee is established in [13] through a condition on b expressed via a quantity ΔC(b) derived from C_q.
6) Bounded Laplace noise mechanism [14]: This algorithm adds independent noise to each of the k coordinates, so that the output in coordinate i is x^(i) + η_i, where each η_i is drawn independently from a distribution μ_DE,R supported on a bounded range. For ε ∈ (0, 1) and k ∈ N, the parameters of μ_DE,R are chosen in [14] so that the mechanism satisfies (ε, δ)-DP.
As can be seen from these definitions, the level of perturbation highly depends on the chosen parameter ε. We therefore wanted to analyze how the similarity between the perturbed weights and the original ones is affected by ε, and we used the Pearson correlation to do so. This coefficient is a measure of the linear association between two variables, ranging from −1 to +1. A value of +1 indicates that the data objects are perfectly positively correlated, a value of 0 means that they are uncorrelated, and a value of −1 indicates a perfect negative correlation.
Essentially, the Pearson correlation coefficient (PCC) is the ratio between the covariance of two variables and the product of their standard deviations. In mathematical form, the coefficient can be described as

r = Σ_i (x_i − x̄)(y_i − ȳ) / sqrt( Σ_i (x_i − x̄)² · Σ_i (y_i − ȳ)² )

where r is the correlation coefficient, x_i represents the values of the x-variable in a sample, x̄ is the mean of the values of the x-variable, y_i represents the values of the y-variable in a sample, and ȳ is the mean of the values of the y-variable. In our approach, the PCC, defined in (8), is calculated over the model updates in every training round, before and after applying the DP mechanism, as defined in (1)-(7). This metric therefore indicates the similarity between the original weights and the modified ones for every client.
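The per-round similarity check in (8) can be sketched by flattening the weight arrays before and after perturbation and computing Pearson's r, for example with NumPy (the function name is illustrative):

```python
import numpy as np

def weight_pcc(original, perturbed):
    """Pearson correlation between flattened weight arrays, used here
    to quantify how much a DP mechanism obfuscates a model update
    (1 = identical up to affine scaling, 0 = uncorrelated)."""
    x = np.ravel(original).astype(float)
    y = np.ravel(perturbed).astype(float)
    return np.corrcoef(x, y)[0, 1]
```

Averaging this value over clients and rounds gives per-mechanism scores of the kind reported in Table IV.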

IV. METHODOLOGY
Before describing our evaluation results, this section provides an overview of different aspects of our approach, including the dataset, machine learning classification technique, and aggregation functions being considered.

A. Dataset Description
The dataset used in this article is based on the CIC-ToN_IoT dataset, which was generated by running the CICFlowMeter tool over the original PCAP files of the ToN_IoT dataset [9] to extract 83 features. As a first step, we remove the nonnumeric features (e.g., flow ID). Then, we separate the samples of the whole dataset based on the destination IP addresses, i.e., the victims' addresses, and remove the samples that do not correspond to the top ten IP addresses, sorted by number of samples.
Algorithm 1: DP-Enabled Federated Training.

[party k] LocalUpdate(k, W):
  for each local epoch from 1 to E do
    for each batch do
      Update parameters w = w + Δw, b = b + Δb
    end for
  end for
  Apply the (ε, δ)-DP mechanism to the weights w to get κ(w)
  return κ(w) to the server

[server]:
  initialize W_G^0
  for each round r from 0 to R do
    S_r = choose p parties out of P
    for each party i ∈ S_r do
      w_i^r = LocalUpdate(i, W_G^r)
    end for
    Calculate new weights using Fed+:
      W_i^{r+1} = W_i^r − γ_i [∇f_i(W_i^r) + α_i ∇B(W_i^r, C(w))]
  end for
Afterward, as the resulting datasets are highly imbalanced, we use the Shannon entropy to measure the imbalance of each one of them. The main reason for doing so is that the use of imbalanced and non-i.i.d. data has a significant impact on FL scenarios [15]. In particular, given a dataset of length n and k classes of size c_i, the balance between the classes is given by

Balance = (−Σ_{i=1}^{k} (c_i/n) log(c_i/n)) / log k

which is equal to 0 if all classes are empty except one, and equal to 1 if all c_i = n/k. Furthermore, it should be noted that we consider that one FL client is represented by a single victim IP address. In this context, given that each instance represents a network flow, n is the number of network flows, k is the number of attack classes, and c_i is the number of instances of class i. Table II describes the data distribution of the different parties, including the distribution of classes and the entropy values for each party. Based on such values, we select the parties with an entropy value higher than 0.2, so parties 0, 2, 4, and 5 are selected as the FL clients for our scenario.
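The normalized entropy above can be computed directly from a list of class labels; the sketch below is a straightforward reading of the formula, with the function name chosen for illustration.

```python
import math
from collections import Counter

def class_balance(labels):
    """Normalized Shannon entropy of the class distribution:
    0 when a single class holds every instance, 1 when all k
    classes have exactly n/k instances."""
    counts = Counter(labels)
    n = len(labels)
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class carries no entropy
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(k)  # divide by log k to normalize into [0, 1]
```

Applying such a function per candidate party and keeping those above the 0.2 threshold reproduces the selection step described above.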
Then, after this initial step, because the classes of each local dataset are still not well balanced, we use a simple instance selection mechanism based on undersampling, which consists in randomly removing samples from the predominant classes until the entropy level of each local dataset exceeds 0.6. Table III summarizes the resulting data distribution of the parties that are selected for the DP-enabled federated training process.

B. Multiclass Classification
In our approach, we use supervised learning with a multiclass approach to classify the dataset instances as benign or as a specific attack, namely, DoS, DDoS, Backdoor, Injection, MITM, Scanning, Password, and XSS. Specifically, we apply multinomial logistic regression [16], also called softmax regression, due to its training efficiency; the implementation is provided by the sklearn library. Its model coefficients can also be interpreted as indicators of feature importance. As with most classifiers, the input variables need to be independent for the correct use of the algorithm. Given the input x, the objective is to know the probability p(y = c|x) of the label y for each potential class c. The softmax function takes a vector z of k arbitrary values and maps them to a probability distribution as follows:

softmax(z_i) = e^{z_i} / Σ_{j=1}^{k} e^{z_j}, for i = 1, ..., k.
In our case, the input to the softmax is, for each of the k classes c, the dot product between a weight vector w_c and the input vector x plus a bias:

z_c = w_c · x + b_c.
The loss function for multinomial logistic regression generalizes the loss function for binary logistic regression and is known as the cross-entropy loss or log loss. While other supervised techniques could be employed, it should be noted that our work is focused on analyzing the impact of DP techniques considering different privacy requirements and aggregation functions in an FL setting.
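The classification step above can be sketched in a few lines; the max-shift inside the softmax is a standard numerical-stability trick, and the parameter layout (one weight row and one bias per class) is an assumption for illustration.

```python
import numpy as np

def softmax(z):
    """Map k arbitrary scores to a probability distribution; shifting
    by the max keeps the exponentials numerically stable."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def predict_proba(W, b, x):
    """Class probabilities p(y = c | x) = softmax(W x + b), with one
    weight vector and one bias per class (hypothetical layout)."""
    return softmax(W @ x + b)
```

The predicted class is simply the argmax of these probabilities, while training minimizes the cross-entropy loss over them.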

C. Aggregation Functions
As described in Section II, the local updates generated by each client in FL are combined through an aggregation function in each training round. The most basic aggregation function is FedAvg [4], which generates the global model from the average of the weights produced by the FL clients. In particular, let W_G = (w_G^i) be the weights of the general model and W_k = (w_k^i) the weights of party k; then

w_G^i = Σ_k (D_k / D) · w_k^i

where D and D_k are the total data size and the data size of each party, respectively. However, the performance of FedAvg may degrade in scenarios with non-i.i.d. and highly skewed data, as in this case. In this work, we also consider a recent approach called Fed+ [10], which unifies several functions to cope with scenarios composed of heterogeneous data distributions. For this purpose, Fed+ relaxes the requirement of forcing all parties to converge on a single model. In particular, the main objective in FedAvg is

min_W Σ_k (D_k / D) · f_k(W)

where f_k is the local loss function of party k. In the case of Fed+, the main objective is

min Σ_k [ f_k(W_k) + α_k · B(W_k, C(W)) ]

where W are the global model weights, W_k are the weights of the party k model, B(·, ·) is a distance function, α_k > 0 are penalty constants, and C is an aggregate function that computes a central point of W. Then, to calculate the weights in each round, the parties generate their respective W_k and send them to the aggregator. Afterward, the aggregator calculates the value of C(W) and sends it to the parties. Finally, the parties calculate the new model weights

W_k^{r+1} = W_k^r − γ_k [ ∇f_k(W_k^r) + α_k ∇B(W_k^r, C(W)) ]

where γ_k are the learning rates, and W_k^r represents the weights of party k at round r.
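The two aggregation styles can be contrasted with a small sketch. The FedAvg function below follows the data-size-weighted average directly; the Fed+ step is a simplified reading that assumes a squared-distance B (so that ∇B(w, c) = w − c) and absorbs the loss gradient elsewhere, which is an illustrative assumption rather than the full algorithm of [10].

```python
import numpy as np

def fedavg(party_weights, party_sizes):
    """FedAvg: data-size-weighted average of party weight vectors."""
    sizes = np.asarray(party_sizes, dtype=float)
    stacked = np.stack(party_weights)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

def fedplus_pull(w_k, central, gamma, alpha):
    """One Fed+-style correction: pull party k's weights toward the
    central point C(W) instead of replacing them (sketch assuming
    squared-distance B, so grad B(w, c) = w - c)."""
    return w_k - gamma * alpha * (w_k - central)
```

Note the key difference: FedAvg overwrites every party with the same model, whereas the Fed+-style step only nudges each party toward the central point, which is why parties may keep personalized models under heterogeneous data.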

V. EVALUATION RESULTS
This section describes the evaluation results achieved by applying each of the previously described DP mechanisms to the FL training scenario with a logistic regression classifier over the ToN_IoT dataset. The main performance evaluation parameters are ε, the perturbation DP mechanism, and the aggregation algorithm to be applied. The evolution of the accuracy for each mechanism and for different ε values throughout the rounds is shown in Figs. 2 and 3. In the first one, FedAvg is configured as the aggregation algorithm used in every round, while Fed+ is used in the second one. As a reminder, a smaller ε provides a better privacy-preserving scenario. The evaluation also compares the achieved accuracy when using FL without applying DP techniques. For the Gaussian analytic mechanism, ε can take values higher than 1. Fig. 2 shows that the accuracy levels for all ε values are higher than 0.8, and that we achieve results similar to the configuration without DP, meaning that our framework barely impacts the accuracy while providing more privacy than classical FL approaches. Moreover, the Gaussian mechanism, as shown in Fig. 2, reaches close accuracy values for every ε value, even compared with the non-DP configuration. In general, an increase is noticeable in the first ten rounds in all cases, and then the accuracy stabilizes.
Furthermore, Fig. 2 shows the same information for the Laplace bounded domain and Laplace bounded noise mechanisms. In this case, as with the previous mechanism, the accuracy values for different ε values are very close, even if the accuracy level drops at round 30 for Laplace bounded noise and at round 20 for Laplace bounded domain, for every ε and for the non-DP configuration. In the case of the uniform mechanism, it should be noted that the accuracy value clearly decreases throughout the rounds.
Moreover, Fig. 3 shows the results obtained by using Fed+ as the aggregation algorithm. In this case, the evolution of the accuracy value is clearly ascending until round ten for every ε in all mechanisms. Compared with FedAvg, the final accuracy levels at the last round are slightly better, except for the uniform mechanism, which reaches a clearly higher accuracy level than with FedAvg, since Fed+ behaves better with non-i.i.d. data, as in our case. According to Fig. 3, in most cases, the accuracy using our DP strategy outperforms the accuracy of the scenario without any perturbation. This could be explained by the fact that running stochastic gradient descent with noisy gradient estimates can help the performance of the model [17], as long as the noise stays below some threshold. This phenomenon has also been observed in other state-of-the-art studies, which apply DP for FL in a more limited way than we do, yet obtain better accuracy when obfuscating data in certain cases [18].
Theoretically, the lower the ε value, and thus the higher the privacy factor, the lower the accuracy that the final model should reach, as the weights are more obfuscated. However, as can be seen in Figs. 2 and 3, for almost all perturbation mechanisms and both aggregation algorithms, there is no big difference between the accuracy reached by a mechanism using the most restrictive ε value (the lowest) and the most relaxed one (the highest). Nevertheless, using FedAvg and some mechanisms, such as Laplace truncated or Gaussian, it is clear that the more restrictive the ε, the lower the accuracy throughout the rounds.
Moreover, it should also be noted that the privacy enhancement achieved by each mechanism is given by the distance to a PCC value of 1, which would represent the classical FL scenario where no perturbation mechanism is applied to the data and, therefore, there is no obfuscation at all. Table IV gives the average PCC values for each perturbation mechanism and different ε values. It should be noted that, for all mechanisms, the lower the ε value, the lower the PCC. This indicates that lower ε values indeed yield a more obfuscated set of weights and, therefore, a higher privacy factor. As can be seen, uniform is the mechanism that provides the most obfuscated set of weights and, consequently, the lowest PCC. Therefore, this mechanism provides the highest privacy factor compared with the other DP techniques analyzed in this article. Furthermore, the uniform mechanism also provides a similar accuracy level compared with the other DP techniques when using Fed+, as shown in Fig. 3. Consequently, it provides the best results considering the values of PCC and accuracy for the proposed scenario.
Finally, Fig. 4 shows a box plot of the execution times for every mechanism. This graphic is the result of ten different executions per mechanism for the same epsilon (ε = 1), except for the uniform mechanism, since it only accepts ε = 0. For each execution, the time spent in the perturbation process is measured, and then the maximum, minimum, and average times are shown in each box of the graphic.

VI. RELATED WORK
As discussed previously, despite the advantages provided by FL regarding privacy, recent works have demonstrated that different attacks (e.g., inference) are still possible during the federated training process through access to the gradients/weights that are uploaded by the FL clients to the coordinator [5]. Therefore, as mentioned in Section II, recent works have proposed the application of different privacy-preserving techniques, such as SMC and DP, to FL scenarios. In particular, a partial evaluation of DP techniques was carried out in [7], which implements the PrivacyFL simulator. Furthermore, Truex et al. [19] integrated SMC and DP to tackle inference attacks while maintaining an acceptable level of accuracy. The resulting approach is applied to several ML models, namely, decision trees, convolutional neural networks (CNNs), and support vector machines. However, these works do not compare the impact of different DP techniques considering their application to IDS approaches for IoT scenarios.
In the context of IoT, Briggs et al. [8] described the use of privacy-preserving techniques for FL, which also provides a set of challenges for resource-constrained scenarios. Hu et al. [20] addressed the application of DP for FL in IoT, which uses activity recognition data from smartphones and personalized models on each device. Also based on the use of DP, Lu et al. [21] made use of blockchain so that the computation required for the consensus mechanism is also used for the federated training process. However, they did not provide evaluation results related to the DP technique being employed. Furthermore, Zhao et al. [22] assessed the impact on the accuracy of applying Gaussian noise as a DP technique in an FL environment. However, the evaluation is based on the well-known MNIST dataset, and they did not compare these results with other DP techniques. In addition, Hu et al. [23] considered an IoT scenario with resource constraints, where a relaxed version of DP is applied and evaluated considering different datasets.
While previous approaches demonstrate the interest in applying privacy-preserving techniques to FL, these aspects are not typically considered in the context of anomaly/intrusion detection. For example, Li et al. [24] proposed a fog architecture for DDoS attack detection and mitigation using FL. However, only DoS attacks are considered, and privacy techniques are not integrated. Furthermore, Liu et al. [25] used CNNs for anomaly detection in IIoT based on a gradient compression mechanism to reduce the communication overhead in FL. In addition, the use of FL was also considered by Khoa et al. [26] to implement an intrusion detection approach in Industry 4.0 scenarios. They compared their solution using different ML techniques and datasets, but privacy aspects were not considered. Furthermore, Attota et al. [27] used an ensemble approach and an optimization algorithm for feature extraction to come up with an IDS approach for IoT. For validation purposes, they used a dataset containing traffic of a single IoT protocol. Moreover, a federated version of CNN was recently used by Man et al. [28] for intrusion detection in IoT, with the aim of reducing communication overhead. Other works, such as [29], are based on old industrial datasets that do not consider recent attacks from IIoT scenarios. However, as in the previous cases, privacy-preserving techniques are not integrated. More closely related to our proposal, Chathoth et al. [30] recently proposed two DP-based continuous learning methods that consider heterogeneous privacy requirements for different FL clients in an IDS system. However, the approach is based on a non-IoT-specific dataset (CSE-CIC-IDS2018), and different DP techniques are not compared. In contrast to this approach, our work evaluates different DP techniques applied to the recent ToN_IoT dataset, which contains different types of attacks from IIoT environments, including network and sensor data manipulation attacks.
To the best of our knowledge, this is the first effort that provides a comprehensive evaluation of the application of DP techniques to FL considering different aggregation techniques, in order to foster the development of a privacy-preserving FL-enabled IDS for IIoT.

VII. CONCLUSION
The development of ML-enabled IDS approaches has been based on the processing of devices' network traffic to detect potential attacks and threats. While FL was conceived to avoid parties having to share their data, it still suffers from privacy issues associated with the communication of gradients/weights in each training round. To address this issue, this work provided an exhaustive evaluation of the use of DP techniques based on additive noise mechanisms, which are applied during the federated training process over the ToN_IoT dataset to come up with a privacy-preserving IDS for IIoT scenarios. We compared different noise addition techniques based on Gaussian and Laplacian distributions, and assessed the accuracy obtained using Fed+ as an alternative aggregation function to FedAvg, recently proposed to deal with the non-i.i.d. data distributions that are prevalent in the real world. According to our evaluation results, the use of such DP techniques maintains an acceptable level of accuracy, which is even close to a non-DP scenario in the case of low privacy requirements (i.e., with a high ε value). In the case of Fed+, the impact of DP techniques on the accuracy is not perceptible. To the best of our knowledge, this work represents the first effort to provide a comprehensive evaluation of an FL-enabled IDS in IIoT considering different aggregation functions. As future work, we will analyze the development of a personalized FL approach where each device has different privacy requirements, as well as the use of gradient compression techniques for IIoT scenarios with network constraints.