Fragility, Robustness and Antifragility in Deep Learning

We propose a systematic analysis of deep neural networks (DNNs) based on a signal processing technique for network parameter removal, in the form of synaptic filters, that identifies the fragility, robustness and antifragility characteristics of DNN parameters. Our proposed analysis investigates whether the DNN performance is impacted negatively, invariantly, or positively on both clean and adversarially perturbed test datasets when the DNN undergoes synaptic filtering. We define three \textit{filtering scores} for quantifying the fragility, robustness and antifragility characteristics of DNN parameters, based on the performances on (i) the clean dataset, (ii) the adversarial dataset, and (iii) the difference in performances between the clean and adversarial datasets. We validate the proposed systematic analysis on the ResNet-18, ResNet-50, SqueezeNet-v1.1 and ShuffleNet V2 x1.0 network architectures for the MNIST, CIFAR10 and Tiny ImageNet datasets. For a given network architecture, the filtering scores identify network parameters whose characteristics are invariant across different datasets over the learning epochs. Vice versa, for a given dataset, the filtering scores identify the parameters whose characteristics are invariant across different network architectures. We show that our synaptic filtering method improves the test accuracy of ResNet and ShuffleNet models on adversarial datasets when only the robust and antifragile parameters are selectively retrained at any given epoch, thus demonstrating applications of the proposed strategy for improving model robustness.


Introduction
Deep neural networks (DNNs) are extensively used in various tasks and domains, achieving noteworthy performances in both research and real-world applications [1,2]. It is the critical weaknesses of DNNs, however, that warrant investigation if we are to better understand how they learn abstract relationships between inputs and outputs [3,4]. We propose to investigate the effects of a systematic analysis on DNNs by using a signal processing technique for network parameter filtering (the terms DNN and network are used interchangeably), in contrast to random filtering methods [5,6,7].
Our work analyzes the performance of a DNN under (a) internal stress (i.e., the synaptic filtering of DNN parameters) and (b) external stress (i.e., perturbations of inputs to the DNN). We define internal and external stress within the context of DNNs as a novel concept, taking inspiration from the effects of stress on biological systems [8]. By analyzing the performance of a network under input perturbations (external stress) formed using an adversarial attack [9,10], we bring the weaknesses of the DNN to the foreground. We simultaneously apply synaptic filtering (internal stress) to the network parameters in order to identify the specific parameters most susceptible to the input perturbations, thus characterizing them as fragile. Similarly, we identify parameters of the DNN that are invariant to both internal and external stress with respect to the network performance, thus characterizing them as robust to the applied stress. Following this reasoning, we introduce a novel notion of antifragility [11] in deep learning as the circumstance in which applied perturbations (internal and external) on a network result in an improvement of the network performance.
When considering external stress, such as variations to the network input, we focus our analysis specifically on varying magnitudes of adversarial attack perturbations [9,10], due to their ability to exploit the learned representations of a network to decrease network performance [12]. In our study, we focus on the fast gradient sign method (FGSM) attack for its simple single-step perturbation calculation that increases the network loss [13]. Our synaptic filtering methodology (see Fig. 1) offers a comparative study of state-of-the-art DNNs using clean and adversarially perturbed datasets, and the study is therefore relevant for any variation of perturbation introduced to the input space. We apply our methodology to expose the fragility, robustness and antifragility of network parameters over the learning epochs, which subsequently enables us to examine the landscape (performance variations over epochs) of the network learning process.
In order to better understand how an adversarial attack is effective in bringing a network to failure [14], we adopt a novel methodology that considers network susceptibility to adversarial perturbations in conjunction with the network architecture and the learning process (see Fig. 1). The proposed synaptic filters can be considered the lenses under which we characterize the parameters of a network architecture.
The main contributions of this work, therefore, are as follows: • We offer a novel methodology based on signal processing techniques that applies internal stress (parameter removal) and external stress (adversarial attacks) to DNNs in order to characterize the network parameters as either fragile, robust, or antifragile.
• We offer parametric filtering scores that use a defined baseline network performance to quantify the influence of specific parameters on the network performance.
• We apply internal stress on networks in the form of synaptic filters and use the filtered network performances to show that networks trained on different datasets contain parameter characterizations that are invariant to different datasets throughout the network training process.

Figure 1: Passing a DNN through parameter filters is equivalent to internal stress, and applying an adversarial attack of various magnitudes on clean data is equivalent to external stress on a DNN. In this methodology, the DNN performances (labeled 1, 2, 3, and 4) are individually compared against a defined baseline DNN performance (solid green line in the illustration shown on the lower left) in order to characterize DNN parameters as fragile (red shaded area), robust (green shaded area), or antifragile (blue shaded area).
• We apply external stress to networks, in the form of an adversarial attack, to identify the specific parameters targeted by the adversary through a comparison of the synaptic filtering performances of the clean and adversarial test datasets.
• We show that our synaptic filtering method boosts the test accuracy of ResNet and ShuffleNet models on adversarial datasets when only the robust and antifragile parameters are retrained at any given epoch, thus providing a useful strategy for improving network robustness.
The following Sec. 2 gives insights into the background and related work. Section 3 offers definitions of the terms and concepts introduced in the proposed methodology. Section 4 reports the proposed methodologies. Section 5 shows the experimental results acquired using the proposed methodologies, and Sec. 6 concludes the work.

Background and related work
We propose evaluating the resilience of DNNs using a physiologically inspired approach, drawing on the resilience of humans to stress on their physiology [8,21]. We therefore analyze the performance of DNNs under internal and external stress. Within the context of deep learning, we consider internal stress to be perturbations to the network parameters (i.e., synaptic filtering) [22,23], and we take external stress to be variations to the learning environment of the network (i.e., input perturbations) [24,25,26].
There exist various avenues of research concerning the analysis of DNNs under input perturbations [13,27] and synaptic filtering [23,6]. The works of Szegedy et al. [9] and Goodfellow et al. [13] drew attention to the vulnerability of DNNs to a particular method of crafting input perturbations in the form of adversarial attacks. The rapid development of new adversarial attacks [28], and equally abundant adversarial defense techniques [29,30], calls for methods of analyzing the resilience of DNNs to carefully crafted input perturbations designed to bring networks to failure.
The scrutiny of DNN resilience to these perturbations can be expanded to incorporate perturbations of the network architecture itself. The study by Han et al. [31] details how network parameters can be filtered out to reduce network size without significantly affecting network performance. However, there may be conditions under which filtering parameters leads to improvements in the network performance.
Therefore, we use a notion of antifragility to describe an increase in network performance whilst the network is subjected to internal and/or external stress, in the form of synaptic filters [7,23,6] and adversarial attacks [14,28,30]. Our notion of antifragility in DNNs is in line with the antifragility notion described by Taleb and Douady [11], referring to a phenomenon whereby a system subjected to stress improves in performance. We describe the related works on internal and external stress as follows:

Internal Stress (Parameter Filtering) Network architecture affects how and what DNNs learn [32,33,34,35]. Relatedly, the work of Ilyas et al. [36] highlights the presence of robust and non-robust features within networks. In a similar context, we highlight the presence of fragile, robust and antifragile [11] parameters of different network architectures on both clean and adversarial test datasets. For the characterization of the network parameters, we propose a synaptic filtering methodology (see Fig. 1).
Identifying fragile, robust and antifragile parameters informs us about the compressibility of a network, based on the variation and degradation in the network performance [37]. A central principle of network compression techniques is to reduce network size whilst retaining network performance [31]. One method of achieving network compression is through network pruning techniques [38,23,6]. Our parameter filtering differs from pruning techniques in its objective: pruning aims to reduce DNN size, whereas we aim to analyze the characteristics of DNN parameters by systematically filtering them. Our work also differs from the systematic tuning of DNN hyperparameters, such as the number of layers and the number of neurons in a layer, to analyze DNN performance [39]; instead, we systematically perturb the internal architecture of the DNN. Siraj et al. [40] proposed a robust sparse regularisation method for network compactness while simultaneously optimizing network robustness to adversarial attacks. Similarly, we use our synaptic filtering methodology (a network parameter removal technique) to study the performances of a DNN on clean and adversarial datasets, which enables us to identify parameters that cause a decrease in network performance on the adversarial dataset [41] compared to the clean dataset, and are thus characterized as fragile in our work.
We characterize as robust the parameters that are invariant to synaptic filtering on both clean and adversarial datasets, whereas the parameters that, when filtered, increase the network performance on the adversarial dataset compared to the clean dataset are characterized as antifragile.
External Stress (Adversarial Attacks) There are numerous methods for computing adversarial attacks on DNNs in the literature [25,26]. The primary objective of adversarial attacks is to deceive a network into misclassifying otherwise correctly classified inputs [9,13]. The analysis of adversarial attacks on DNNs is significant due to the existence of adversarial examples in real-world applications [42,43]. Similarly, in our work, we analyze the adversarial attack in order to characterize network parameters as those that affect network performance negatively (fragile), invariantly (robust), or positively (antifragile). Adversarial examples are by design created to decrease network performance; however, when simultaneously carrying out synaptic filtering [41], it is possible to observe an increase in network performance even under an adversarial attack, thus motivating the notion of antifragility.

Definitions
In this Section, we define fragility, robustness, and antifragility within the scope of DNNs. For defining fragility, robustness, and antifragility, we also need to define the internal stress, external stress and baseline network performance of DNNs. Here, the stress on a DNN is a systematic perturbation, either internal (synaptic filtering) or external (adversarial attack). The purpose of applying the stress on a DNN is to test the operating conditions of the DNN for both learned and optimized states, when evaluated on unseen datasets. The concepts of network fragility, robustness, antifragility and stress are shown in Fig. 2. Consider a neural network architecture as a set of functions f(x, ·) that consists of a configuration of components, such as convolutions, batch normalization, pooling layers, activation functions, etc. [38]. We define a parameterized neural network as f(x, W), for specific parameters W and input x. For an l-layer network with K output classes, the prediction is given by ŷ = arg max_{1≤k≤K} f_k(x, W). The network parameters W are assumed to be optimized, either partially or fully, using back-propagation and a loss function L : R × R → R, given by L(ŷ, y), to calculate the network error.

Stress on DNNs
To formulate internal stress on the network, we consider two filtering domains: local (the parameters of any specific layer) and global (the parameters of the whole network). We apply synaptic filtering to the parameters of the trainable convolutional and fully connected layers of the network; the non-trainable parameters, however, remain unaffected by the synaptic filtering procedure. The l-th layer network parameters (local parameters) are given as W^(l), while the global network parameters are W.
For convenience, we denote the network parameters evaluated by the synaptic filtering methods as θ, where θ = W^(l) for the local parameter analysis [31] and θ = W for the global parameter analysis, as mentioned in [44].
Definition 1 (Synaptic filtering). Synaptic filtering takes a network f(x, θ) with parameters θ as input and produces a filtered network f(x, θ̂_α) with filtered parameters θ̂_α as:

θ̂_α = B_α ⊙ θ_α, (1)

where α = {α_0, α_1, ..., α_A} is the set of normalised synaptic filtering thresholds across the complete parameter range of the evaluated network/layer, with lower bound α_0 = 0, upper bound α_A = 1 and step size Δ_α given by α_1 = α_0 + Δ_α. For synaptic filtering of a network, ŷ = f(x, θ) denotes the network predictions of the unperturbed network and ŷ_α = f(x, θ̂_α) the network predictions of the perturbed network. In Eq. (1), B_α is a binary mask for a threshold α that filters parameters, θ_α is the set of parameters to be filtered (which may differ from θ), and ⊙ is the element-wise product operator. To further constrain the internal stress analysis, we require that the network parameters to be filtered are not a zero vector prior to the synaptic filtering, i.e., θ must be a trained network: θ ≠ 0. If this constraint is not met, the prediction of the network f(x, θ) will result in output values of zero for all inputs.
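As an illustrative sketch of Eq. (1): the function and variable names below are ours, and the construction of the binary mask B_α by magnitude ranking is an assumption (the definition above specifies only that B_α is a binary mask applied by an element-wise product).

```python
def synaptic_filter(theta, alpha):
    """Filter out the fraction `alpha` of smallest-magnitude parameters.

    theta: flat list of trained parameter values (theta != 0 is assumed).
    alpha: normalised filtering threshold in [0, 1].
    Returns (filtered_theta, mask), where mask plays the role of B_alpha
    and filtered_theta = B_alpha (element-wise product) theta.
    """
    n = len(theta)
    k = int(alpha * n)  # number of parameters removed at this threshold
    # rank parameters by magnitude; the k smallest are filtered out
    order = sorted(range(n), key=lambda i: abs(theta[i]))
    mask = [1.0] * n
    for i in order[:k]:
        mask[i] = 0.0
    filtered = [m * t for m, t in zip(mask, theta)]
    return filtered, mask
```

For example, with theta = [0.5, -0.1, 0.9, 0.05] and alpha = 0.5, the two smallest-magnitude weights are zeroed while the rest pass through unchanged.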
Definition 2 (Internal stress: synaptic filtering of the DNN parameters). The internal stress on a DNN is the application of the synaptic filtering method with various magnitudes of α, ranging from a minimum filtering threshold α_0 to a maximum filtering threshold α_A, in order to obtain a set of |α| filtered networks:

S_α = { f(x, θ̂_{α_i}) : α_i ∈ α }. (2)

By evaluating a network under internal stress, we examine how the filtering of the learned parameters influences the network performance, thus identifying the specific filtering thresholds required to bring the network to failure.
Considering external stress as variations to the input x, we introduce x_ϵ = x + δ_ϵ as the perturbed example of x with an adversarial perturbation δ_ϵ ∈ R^d, where ϵ = {ϵ_0, ϵ_1, ..., ϵ_E} is the set of perturbation magnitudes with minimum magnitude ϵ_0, maximum magnitude ϵ_E and step size Δ_ϵ given by ϵ_1 = ϵ_0 + Δ_ϵ. Using a single adversarial attack formulation method δ, we define ŷ = f(x, θ) as the network predictions on the clean dataset and ŷ_ϵ = f(x_ϵ, θ) as the network predictions on the adversarial dataset [Fig. 1 (left)]. When dealing with external stress only, θ is taken as the complete set of network parameters W.
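A minimal sketch of the FGSM-style perturbation x_ϵ = x + ϵ · sign(∂L/∂x) used as external stress [13]: the gradient computation is abstracted away as a given `grad_x`, and all names are ours.

```python
def sign(v):
    """Sign function applied element-wise to the input gradient."""
    return 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)

def fgsm_perturb(x, grad_x, eps):
    """x_eps = x + delta_eps, with delta_eps = eps * sign(dL/dx).

    x: flat list of input features; grad_x: loss gradient w.r.t. x;
    eps: perturbation magnitude.
    """
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad_x)]

def stress_set(x, grad_x, eps_values):
    """Set of perturbed inputs over all attack magnitudes eps."""
    return [fgsm_perturb(x, grad_x, e) for e in eps_values]
```

Sweeping `eps_values` from ϵ_0 to ϵ_E yields the family of perturbed inputs evaluated in this work.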
The performance of f(x_ϵ, θ) can inform us of the ability of the network to remain stable under external stress (input perturbations). This is achieved through a comparison of the network performance on a clean dataset and on an adversarially perturbed dataset. There are numerous variations of δ that can be used to form external stress on the network, from targeting specific features of x to drawing distortions from a different distribution [25,26]. However, in this work we focus on one perturbation method δ, the FGSM attack, as our objective is only to compare DNN performance on clean and perturbed inputs (Fig. 1).
Definition 3 (External stress: adversarial attack on the DNN). The external stress on a DNN is the application of an adversarial attack with various perturbation magnitudes ϵ, ranging from a minimum perturbation magnitude ϵ_0 to a maximum perturbation magnitude ϵ_E, in order to obtain a set of |ϵ| perturbed inputs to the network:

S_ϵ = { x_{ϵ_i} = x + δ_{ϵ_i} : ϵ_i ∈ ϵ }. (3)

With external stress on a network, we examine how variations in the input environment influence the network performance, thus identifying the specific magnitudes of the attack required to bring the network to failure.
An important consideration when analyzing networks using the internal and external stress of Definitions 2 and 3 is that a resultant perturbed network (from S_α or S_ϵ) may offer performance equal to that of the unperturbed network, i.e., for all inputs x in the test set, we may observe p(f(x, θ̂_α)) = p(f(x, θ)) for an internal stress threshold α, and p(f(x_ϵ, θ)) = p(f(x, θ)) for an external stress magnitude ϵ, where p(·) is a function that measures the network accuracy over all inputs x. This indicates that, even under stress, a DNN may perform equivalently to an unperturbed network. Therefore, in order to evaluate the performance of a network under stress, we must define a baseline network performance against which we can measure the performance of perturbed and unperturbed networks.
A baseline network performance can vary for different types of stress (internal or external), as there may arise instances where the baseline network performance, defined over f(x, θ̂_α) for internal stress and over f(x_ϵ, θ) for external stress, is not necessarily the same as the performance of the initially trained (unperturbed) network f(x, θ). The baseline network performance for a combination of internal and external stress is defined over f(x_ϵ, θ̂_α), where the baseline is a function of both ϵ and α.
To give context on why the baseline network performance may not be the same as the performance of an unperturbed network, consider applying internal stress to a DNN, which yields a set of filtered networks S_α. If we define the upper bound of the stress magnitude to be equal to the total number of network parameters, i.e., α_A = |θ|, then we obtain a network with all parameter values zero: θ̂_{α_A} = 0.
Noticeably, the performance of a maximally perturbed network f(x, θ̂_{α_A}) cannot equal the performance of the unperturbed network f(x, θ), i.e., f(x, θ̂_{α_A}) ≠ f(x, θ). Thus we require the baseline network performance to be a function of the magnitude of stress applied to the DNN. A detailed description of the baseline network performance is given later in Sec. 4.1.2.

Fragility, Robustness and Antifragility
Here we define the three characterizations of network parameters: fragility, robustness and antifragility.
In order to define the different characterizations of network parameters, we must establish the stress to which we can evaluate network parameter fragility, robustness and antifragility.The stress in question may be internal (S α ) or external (S ϵ ), or a combination of the two.
For simplicity, we consider only internal network stress in the definitions provided below. However, the change of variables from S_α to S_ϵ, from f(x, θ̂_α) to f(x_ϵ, θ), and from Δ_α to Δ_ϵ gives the definitions of fragility, robustness and antifragility for external stress.
Definition 4 (Fragility). The parameters of a network are fragile if the performance of the network decreases below a threshold −ε, compared to the baseline network performance, for all magnitudes of the applied stress. Formally, fragility to internal stress can be defined as:

Σ_{i=0}^{A} [ p(ŷ_{α_i} = y) − φ_{α_i} ] Δ_α < −ε, (4)

where Δ_α is the change in the synaptic filtering threshold α, A is equal to |α|, and ε ≥ 0 asserts a variable fragility measure, as shown in Fig. 2(b) (red shaded region). When the threshold ε = 0, we have a strict fragility condition. Equation (4) computes the discrete area difference between the stressed network performance and the baseline network performance φ over all stress magnitudes α.
Definition 5 (Robustness). The parameters of a network are robust if the performance of the network is invariant within a threshold ±ε, compared to the baseline network performance, for all magnitudes of the applied stress. Formally, robustness to internal stress can be defined as:

−ε ≤ Σ_{i=0}^{A} [ p(ŷ_{α_i} = y) − φ_{α_i} ] Δ_α ≤ ε, (5)

where Δ_α is the change in the synaptic filtering threshold α, and ε ≥ 0 asserts a variable robustness measure, as shown in Fig. 2(b) (green shaded region). When the threshold ε = 0, we have a strict robustness condition. Equation (5) computes the discrete area difference between the stressed network performance and the baseline network performance φ over all stress magnitudes α.
Definition 6 (Antifragility). The parameters of a network are antifragile if the performance of the network increases above a threshold ε, compared to the baseline network performance, for all magnitudes of the applied stress. Formally, antifragility to internal stress can be defined as:

Σ_{i=0}^{A} [ p(ŷ_{α_i} = y) − φ_{α_i} ] Δ_α > ε, (6)

where Δ_α is the change in the synaptic filtering threshold α, and ε ≥ 0 asserts a variable antifragility measure, as shown in Fig. 2(b) (blue shaded region). When the threshold ε = 0, we have a strict antifragility condition. Equation (6) computes the discrete area difference between the stressed network performance and the baseline network performance φ over all stress magnitudes α.
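Under this reading of Eqs. (4)-(6), the three characterizations reduce to the sign of a single discrete-area score. The sketch below uses our own names and assumes the performances and baseline are given as per-threshold vectors:

```python
def filtering_score(perf, baseline, delta_alpha):
    """Discrete area difference between the stressed network performance
    and the baseline performance over all filtering thresholds alpha."""
    return sum((p - b) * delta_alpha for p, b in zip(perf, baseline))

def characterize(perf, baseline, delta_alpha, eps=0.0):
    """Fragile if the score falls below -eps, antifragile if it rises
    above +eps, robust if it stays within the +/-eps band (Defs. 4-6)."""
    s = filtering_score(perf, baseline, delta_alpha)
    if s < -eps:
        return "fragile"
    if s > eps:
        return "antifragile"
    return "robust"
```

With eps = 0 this reproduces the strict conditions; a positive eps widens the robustness band, as in Fig. 2(b).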

Methodology of DNN parameters characterization
In this Section, we present the methodology of DNN parameter characterization that is shown in Fig. 1.
Concisely, Fig. 1 shows that this methodology has two major aspects: (a) the application of internal and external stress on a DNN, in terms of synaptic filtering and an adversarial attack, and (b) a process to characterize parameters as fragile, robust or antifragile. This section first explains how we apply internal and external stress on DNNs in Sec. 4.1 and then introduces the parameter scores that characterize the parameters in Sec. 4.2. Finally, we discuss the experimental setting in Sec. 4.3.

Framework of internal and external stress on DNNs
We systematically apply internal and external stress on DNNs. The process is shown in Fig. 3: a three-step framework (adversarial attack on DNNs, synaptic filtering of DNNs, combined network performance) that leads to the parameter score calculation for DNN parameter characterization.

Attack on DNNs
In evaluating networks under internal stress, we compare the network performances under the synaptic filtering procedure for clean and adversarial (external stress) datasets (discussed in Sec. 4.1.2). In this study, we work primarily with the FGSM attack [13] for the adversarial perturbation formulation; other attack formulation methods would not affect the synaptic filtering described in this section. The synaptic filtering technique is designed to be applied to a network under any variation of the inputs; therefore, the attack formulation method can be changed without affecting the synaptic filtering technique.
In order to experiment with an adversarial dataset, we must define some constraints on the attack [Fig. 3 (left block)], such that the synaptic filtering responses are comparable between different network architectures and datasets. The constraints imposed upon the adversarial attack magnitude ϵ are as follows: Definition 7 (Minimum attack bound ϵ_0: constraint 1). We limit the adversarial attack to follow p(ŷ_{ϵ_0} = y | x + δ_{ϵ_0}) < p(ŷ = y | x) for all inputs x in the test dataset. This constraint allows us to select a suitable minimum attack magnitude ϵ_0, such that otherwise correctly classified inputs are misclassified due to the adversarial attack.
Definition 8 (Maximum attack bound ϵ_E: constraint 2). We limit the adversarial attack to a suitable maximum attack magnitude ϵ_E, such that the network test accuracy remains above a random guess over the K classes, i.e., we have the constraint p(ŷ_{ϵ_E} = y | x + δ_{ϵ_E}) · K > 1, for all inputs x in the test dataset.
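The two attack bounds of Definitions 7 and 8 can be checked mechanically. The sketch below is ours: it assumes the accuracies at ϵ_0 and ϵ_E have already been measured, with K the number of classes.

```python
def valid_attack_range(clean_acc, acc_at_eps0, acc_at_epsE, num_classes):
    """Constraint 1: the minimum attack magnitude must already reduce
    accuracy below the clean accuracy. Constraint 2: the maximum
    magnitude must leave accuracy above a random guess (acc * K > 1)."""
    return acc_at_eps0 < clean_acc and acc_at_epsE * num_classes > 1
```

For a 10-class problem, an attack sweep whose endpoint accuracy is 0.20 still satisfies constraint 2 (0.20 · 10 > 1), whereas an endpoint accuracy of 0.05 does not.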
Definition 9 (Relative attack ϵ: constraint 3). To compare the performances of different network architectures and datasets under the synaptic filtering procedure, we must consider values of ϵ for different networks/datasets that reduce the network performances equally. Considering two different networks f_1 and f_2, we use a single attack δ, for which ϵ_1 and ϵ_2 are the relative attack magnitudes for f_1 and f_2. Suitable values of ϵ_1 and ϵ_2 should be chosen such that

p(ŷ_{ϵ_1} = y | f_1(x_{ϵ_1}, θ_1)) = p(ŷ_{ϵ_2} = y | f_2(x_{ϵ_2}, θ_2)),

thus ensuring that the adversarial perturbations affect the network performances equally.
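One way to realize constraint 3 in practice (our sketch; the paper does not prescribe a particular search procedure) is to sweep ϵ for the second network until its adversarial accuracy is closest to that of the first:

```python
def relative_attack_eps(target_acc, acc_fn, eps_grid):
    """Pick the eps whose resulting accuracy is closest to target_acc.

    target_acc: accuracy of network f1 under its attack magnitude eps_1.
    acc_fn: maps an attack magnitude to the measured accuracy of f2.
    eps_grid: candidate magnitudes between eps_0 and eps_E.
    """
    return min(eps_grid, key=lambda e: abs(acc_fn(e) - target_acc))
```

With a monotonically decaying toy accuracy curve acc(e) = 1 - e, matching a target accuracy of 0.7 selects ϵ = 0.3 from the grid.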

Synaptic filtering of DNNs
We investigate a set h = {h_1, h_2, h_3} of three different synaptic filters [Fig. 3 (middle block)]: h_1, the ideal high-pass filter; h_2, the ideal low-pass filter; and h_3, the pulse wave filter. The operation of filtering for the three different filters is detailed in Eq. (1).
These three filters h_1, h_2 and h_3, with distinct properties, when applied to a DNN with threshold α_i, offer three sets of distinct perturbed networks f(x, θ̂_{1,α_i}), f(x, θ̂_{2,α_i}) and f(x, θ̂_{3,α_i}). Therefore, we require three baseline network performances, corresponding to the properties of the respective synaptic filters, against which the three sets of perturbed networks are compared.
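The sketch below illustrates how the three binary masks might be built. The exact filter definitions [Eqs. (7)-(9)] are not reproduced in this excerpt, so the following is an assumption throughout: we take the filters to act on parameters ranked by magnitude, with h_1 (high-pass) passing only the largest magnitudes, h_2 (low-pass) passing only the smallest, and h_3 (pulse wave) passing alternating rank bands.

```python
def filter_masks(theta, alpha, period=4):
    """Illustrative binary masks for the three synaptic filters h1-h3.

    theta: flat list of parameter values; alpha: filtering threshold;
    period: band width of the pulse wave filter (our assumption).
    """
    n = len(theta)
    rank = sorted(range(n), key=lambda i: abs(theta[i]))  # ascending magnitude
    k = int(alpha * n)
    h1, h2, h3 = [0.0] * n, [0.0] * n, [0.0] * n
    for r, i in enumerate(rank):
        if r >= n - k:              # h1: pass the k largest magnitudes
            h1[i] = 1.0
        if r < k:                   # h2: pass the k smallest magnitudes
            h2[i] = 1.0
        if (r // period) % 2 == 0:  # h3: periodic pass/stop bands
            h3[i] = 1.0
    return h1, h2, h3
```

Each mask plays the role of B_α in Eq. (1) for its respective filter, yielding the three perturbed networks compared in this section.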
Baseline Network Performances We denote by ϕ_1, ϕ_2 and ϕ_3 the numbers of parameters filtered out by the synaptic filters h_1, h_2 and h_3 at filtering threshold α_i. If the synaptic filtering procedure is applied only to a local layer l (e.g., only to a convolutional or linear layer), then ϕ^(l) is the maximum number of parameters in local layer l; for the whole network, ϕ denotes the maximum number of parameters in the network. Let ϕ^(l)_{1,α_i} be the number of parameters filtered out by the filter h_1 in layer l at threshold α_i; then the baseline network performance φ^(l)_{1,α_i} at threshold α_i is given as:

φ^(l)_{1,α_i} = (1 − ϕ^(l)_{1,α_i} / ϕ^(l)) p(ŷ = y | f(x, θ)). (10)

Similarly, the baseline network performances for filters h_2 and h_3 are φ^(l)_{2,α_i} and φ^(l)_{3,α_i}. In Eq. (10), the fraction ϕ^(l)_{1,α_i} / ϕ^(l) is the ratio between the number of parameters removed and the total number of parameters in the layer, defining the compactness of the filtered layer.
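Under this reading of Eq. (10), the baseline at one threshold is the clean accuracy scaled by the fraction of parameters retained (a minimal sketch; names are ours):

```python
def baseline_performance(clean_acc, n_filtered, n_total):
    """Baseline performance at one threshold: clean accuracy scaled by
    the fraction of parameters retained, so the expected performance
    falls in proportion to the number of parameters filtered out."""
    return (1.0 - n_filtered / n_total) * clean_acc
```

Filtering none of the parameters leaves the baseline at the clean accuracy; filtering all of them drives it to zero, matching the maximally perturbed network discussed in Sec. 3.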
We consider the baseline network performance, for all values of α, as a function that reduces the network performance proportionally to the internal stress (synaptic filtering) applied to the network (see Fig. 4(a)); we use it to determine the parameter characteristics under synaptic filtering. Using this definition of the baseline network performance, we expect the network performance to decrease proportionally to the number of parameters filtered by the synaptic filtering procedure (see Fig. 4(b)). The underlying assumption of the baseline network performance is that the parameters being filtered out have an overall influence on the network performance. Hence, the baseline network performance represents the expected behaviour of the network, given as the classification accuracy on the test set, whilst the network is subjected to the synaptic filtering procedure for all synaptic filtering threshold values α.
Network Compactness Our synaptic filtering method is a systematic ablation of DNN parameters to analyze variations in network performance caused by parameter filtering. We show that a proportion of the network parameters can be filtered out of a DNN whilst retaining (and occasionally improving) the network performance on both clean and adversarially perturbed test sets [40]. The baseline network performance describes a network whose parameters, when filtered, proportionally influence the network performance. From Eq. (10), the proposed baseline network performance is linked to the compactness of the network/layer; its characteristics are inversely proportional to the compactness ratio of the network/layer weights. For a specific non-random synaptic filtering method, the compactness characteristics of a network are constant under different variations to the input (e.g., adversarial attacks). Thus, we can compare the scaled network performances of a network on both clean and adversarial datasets against the baseline network performance.

Figure 4: The network performance β^(l)_1 (blue dotted line) is scaled w.r.t. the unperturbed network accuracy f(x, θ) (red dot), resulting in β̄^(l)_1 (blue solid line). The blue shaded region r_x, enclosed by the baseline network performance φ^(l)_1 (green dotted line), is the area [Eqs. (14) and (15)] of synaptic robustness.
Network vs. adversary For a network, we define the network performances for all synaptic filtering thresholds α as an |α|-length vector of the network prediction accuracies p(ŷ_α = y | f(x, θ̂_α)) on the test set x. The network performances under the synaptic filters h_1 [Eq. (7)], h_2 [Eq. (8)] and h_3 [Eq. (9)] are given as β_1, β_2 and β_3, respectively. We construct a clean network performance matrix β on inputs x by stacking β_1, β_2 and β_3 as β = [β_1; β_2; β_3]. We further apply the synaptic filtering to the network under an adversarial attack δ with perturbation magnitudes ϵ, resulting in the adversarial network performance matrix β_ϵ.
Targeted parameters The matrices β and β_ϵ are the network performances on the clean and adversarial datasets under the synaptic filtering method; these are the two DNN states to be compared. Thus, through a comparison of β and β_ϵ (see Fig. 5), we expose the specific parameters (targeted parameters) that affect the synaptic filtering performances on the adversarial dataset negatively, invariantly or positively, compared to the clean dataset.

Figure 5: Every pixel in the clean and adversarial images represents the network test accuracy; for the targeted-parameters image, it is the difference between the former two, over all evaluated epochs and α.

Combined network performance of synaptic filters
Different synaptic filtering methods expose different characterizations of the network parameters. Thus, we combine the network performances of the different synaptic filters using a function g(·) to form a combined network performance β̄, as shown in the synaptic filtering framework in Fig. 3 (right). In order to combine the performances, let us consider β as the network performance to be combined; the procedure is the same for the adversarial network performances β_ϵ. We take β̄ as the performance of the perturbed network (synaptic filtering performance) relative to the unperturbed network performance f(x, θ), and take the mean of the network performances of the three different filters:

β̄ = (1/3) (β̄_1 + β̄_2 + β̄_3). (13)

Fig. 6 shows an example of the network accuracy results of the synaptic filtering procedure applied to layer 'conv1' of ResNet-18 trained on CIFAR10. The top row shows the effects of the three different filters on the network accuracy; the middle row shows each filter's epoch-wise effect on the network accuracy. The third row shows the effect of the combined response [as per Eq. (13)] of the filters.
Similarly, the combined adversarial network performance β̄_ϵ is computed by replacing β with β_ϵ in Eq. (13), where β̄_ϵ is the performance of the perturbed network (synaptic filtering performance) relative to the unperturbed network performance f(x_ϵ, θ) on the adversarially perturbed dataset x_ϵ. Although the combined network performances β̄ and β̄_ϵ offer more descriptive information for examining the network parameters, a single synaptic filter is also able to expose the targeted network parameters. As calculating β̄ and β̄_ϵ is computationally expensive for local analysis (the cost increases exponentially with the number of local layers in a DNN), we suggest computing β̄ and β̄_ϵ over all network parameters at once (i.e., global analysis).
Fig. 7 shows the combined synaptic filtering responses for ResNet-18 trained on the MNIST, CIFAR10 and Tiny ImageNet datasets (one row per dataset; the local responses are in the left three columns and the global response is in the rightmost column), measured every 10 epochs up to 100 epochs.

Parameter scoring for DNN parameter characterization
To expose the network parameters targeted by the adversary, let us consider the network performance β_1^(l) for synaptic filter h_1 on layer l. We scale β_1^(l) relative to the baseline network performance φ_1^(l) [Eq. (10)]; the procedure is captured in Fig. 4. Similarly, we compute β_2^(l) and β_3^(l) for synaptic filters h_2 and h_3. The combined performance of the three different synaptic filters is β̄^(l) [Eq. (13)].

Parameter score for clean data
We take the baseline network performance φ_1^(l) [Eq. (10)] for synaptic filter h_1 as the reference point against which we evaluate the filtered network responses. We take φ_1^(l) to describe a network/layer that contains neither an excess nor a deficiency of parameters that influence the network performance (i.e., the removal of any parameter affects the network performance). On average, the network performance is expected to degrade in proportion to the ablation of network parameters under synaptic filtering. The parameter score under synaptic filtering for a network using a clean dataset is r_x [shown in Fig. 8(a) (top)], given as the discrete area between the filtered and baseline performance curves:

r_x = Σ_{i=0}^{A} (β_1^(l)(α_i) − φ_1^(l)(α_i)) Δ_α,

where Δ_α is the width of the α threshold window. A parameter score equal to 0 signifies that the network/layer responds, on average, proportionally to synaptic filtering, i.e., proportionally to variations in architecture, and is thus considered robust. Where the score r_x is less than 0, the network/layer contains parameters that are fragile to the network performance. Conversely, where r_x is greater than 0, the network/layer contains antifragile parameters, whose removal results in a network performance that is better than the baseline network performance.
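The score can be sketched as a discrete area between the filtered and baseline performance curves; the curves below are synthetic, and the discrete-area form is our reading of the score definition, with the sign convention following the fragile/antifragile interpretation above:

```python
import numpy as np

A = 25
delta_alpha = 1.0 / A                      # filtering step size (0.04)
alphas = np.linspace(0.0, 1.0, A + 1)

# Baseline phi_1: illustrative linear decay as parameters are removed.
phi = 1.0 - alphas

# Filtered performance under h1 (hypothetical): degrades faster than
# the baseline -> fragile; degrades slower -> antifragile.
beta_fragile = np.clip(1.0 - 1.5 * alphas, 0.0, 1.0)
beta_antifragile = np.clip(1.0 - 0.5 * alphas, 0.0, 1.0)

def parameter_score(beta, phi, delta_alpha):
    """Discrete area between filtered and baseline performance curves."""
    return float(np.sum((beta - phi) * delta_alpha))

r_fragile = parameter_score(beta_fragile, phi, delta_alpha)
r_antifragile = parameter_score(beta_antifragile, phi, delta_alpha)
print(r_fragile < 0, r_antifragile > 0)  # True True
```

The adversarial score r_xϵ uses the same computation with the adversarial performance curve in place of the clean one.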

Parameter score for adversarial data
The parameter score under synaptic filtering for a network using an adversarial dataset is r_xϵ, calculated against the same baseline network performance φ_1^(l). The baseline network performance is compared with the adversarial dataset performance β_{1,ϵ}^(l) to give the parameter characterization score:

r_xϵ = Σ_{i=0}^{A} (β_{1,ϵ}^(l)(α_i) − φ_1^(l)(α_i)) Δ_α,

where Δ_α is the width of the α threshold window. A parameter score equal to 0 signifies that the network/layer responds, over all magnitudes of internal stress, proportionally to synaptic filtering, i.e., proportionally to variations in architecture, and is thus considered robust. Where the scores r_x and r_xϵ are less than 0, the network/layer contains fragile parameters w.r.t. the network performance.
Conversely, where r_x and r_xϵ are greater than 0, the scores indicate that the network/layer contains antifragile parameters.

Difference of parameter scores
To compute the effects of the adversarial attack on the parameter characterisation using our proposed synaptic filtering method, we take the baseline network performance to be the synaptic filtering performance on the clean dataset, β_1^(l). The difference between the adversarial dataset performance β_{1,ϵ}^(l) and the clean dataset performance β_1^(l) (the baseline network performance) captures the effects of the adversary on the synaptic filtering performance of the network. We take the area of this residual as the effect of the adversary on the network. The value of r_ϵ is computed by taking the discrete area difference, as shown in Fig. 8(b) (bottom):

r_ϵ = Σ_{i=0}^{A} (β_1^(l)(α_i) − β_{1,ϵ}^(l)(α_i)) Δ_α.

If the network performs equally on the clean and adversarial datasets for all filtering thresholds α, then r_ϵ = 0. Where r_ϵ < 0, the network performance on the adversarial dataset is greater than that on the clean dataset; the evaluated network/layer then contains parameters that increase the network performance on the adversarial dataset relative to the clean dataset. Conversely, r_ϵ > 0 signifies that the network performance on the clean dataset is greater than that on the adversarial dataset; the evaluated network/layer then contains parameters that decrease the network performance on the adversarial dataset relative to the clean dataset. Hence, the magnitude of r_ϵ gives a scalar measure of the difference between the clean and adversarial responses to the filtering.
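A sketch of the r_ϵ computation as a discrete area difference between the clean and adversarial filtering responses (both curves below are synthetic):

```python
import numpy as np

A = 25
delta_alpha = 1.0 / A
alphas = np.linspace(0.0, 1.0, A + 1)

# Hypothetical synaptic filtering accuracies on clean and adversarial data.
beta_clean = np.clip(1.0 - alphas, 0.0, 1.0)
beta_adv = np.clip(0.8 - alphas, 0.0, 1.0)

# Discrete area of the residual between clean and adversarial responses:
# r_eps > 0 means clean accuracy exceeds adversarial accuracy overall.
r_eps = float(np.sum((beta_clean - beta_adv) * delta_alpha))
print(r_eps > 0)  # True
```

Swapping the two curves flips the sign, matching the r_ϵ < 0 case described above.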

Experimental set-up
Our experimental setup comprises standard training of state-of-the-art DNNs on popular benchmark datasets.
Training of DNNs on clean datasets For training, the parameters of each DNN were initialised using the Kaiming Uniform [45] method. We use a cross-entropy loss function and the Adam optimizer [46] configured with γ = 0.001, β_1 = 0.9, β_2 = 0.999, λ = 0 and ε = 1 × 10^−8. We train the networks on clean datasets only and apply the adversarial attack only to the test datasets for analysis with the synaptic filtering methodology. Networks are saved every 10 epochs during training, up to a maximum of 100 epochs. The saved networks are subsequently evaluated using the proposed synaptic filtering methodology; the results presented in Sec. 5 correspond to these saved networks.
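In PyTorch terms (the paper's framework is not stated, so this mapping is an assumption), the listed hyperparameters correspond to torch.optim.Adam as γ = lr, (β_1, β_2) = betas, λ = weight_decay and ε = eps:

```python
import torch
from torchvision.models import resnet18

model = resnet18(num_classes=10)  # example network

# gamma = lr, (beta1, beta2) = betas, lambda = weight_decay, eps as stated.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=0.001,
    betas=(0.9, 0.999),
    eps=1e-08,
    weight_decay=0,
)
criterion = torch.nn.CrossEntropyLoss()
```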
Adversarial attack on datasets For the adversarial attack, we use the single-step FGSM attack [13] and analyze the difference in network performance on the test set under the proposed synaptic filtering methods (Sec. 4). The effectiveness of an adversarial attack can be attributed in part to the complexity of the dataset it is applied to.
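FGSM perturbs the input along the sign of the input gradient of the loss, x_ϵ = x + ϵ · sign(∇_x L). A self-contained sketch on a toy logistic model (the weights and inputs are hypothetical; the paper applies the attack to the DNN loss):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(x, w, y):
    """Binary cross-entropy of a logistic model p = sigmoid(w @ x)."""
    p = sigmoid(w @ x)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def fgsm(x, w, y, eps):
    """Single-step FGSM: move x along the sign of the input gradient."""
    grad_x = (sigmoid(w @ x) - y) * w   # analytic d(loss)/dx
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

w = np.array([1.0, -2.0, 0.5])
x = np.array([0.5, 0.5, 0.5])
y = 1.0

x_adv = fgsm(x, w, y, eps=0.1)
print(loss(x_adv, w, y) > loss(x, w, y))  # True
```

The single gradient step is what makes FGSM cheap relative to iterative attacks; ϵ bounds the per-pixel perturbation.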

Collection of results
We normalize the r x and r xϵ parameter score values from Sec. 4.2.1 to be between -0.5 (indicating fragility) and 0.5 (indicating antifragility) with the mid-point being 0 (indicating robustness).We carry out the same normalization procedure independently for all r ϵ values from Sec. 4.2.3 to be between -0.5 and 0.5.
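The normalization itself is not spelled out in the text; one sign-preserving choice that keeps a score of 0 at the robust midpoint while mapping the extremes to ±0.5 is sketched below (the score values are illustrative):

```python
import numpy as np

def normalize_scores(r):
    """Map scores to [-0.5, 0.5], preserving 0 as the robust midpoint."""
    r = np.asarray(r, dtype=float)
    out = np.zeros_like(r)
    neg, pos = r < 0, r > 0
    if neg.any():
        out[neg] = 0.5 * r[neg] / np.abs(r[neg]).max()   # -> [-0.5, 0)
    if pos.any():
        out[pos] = 0.5 * r[pos] / r[pos].max()           # -> (0, 0.5]
    return out

r_x = np.array([-1.8, -0.2, 0.0, 0.4, 2.2])
s = normalize_scores(r_x)
print(s)
```

A plain min-max rescaling would not keep 0 fixed at the midpoint, which is why the two signs are scaled independently here.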
For each network and dataset, the synaptic filtering responses are averaged over three different randomly initialised (as per [47]) and trained networks. In order to satisfy constraint 3 from Sec. 4, we use a line search algorithm to find the optimal ϵ value for each model and dataset. When carrying out the synaptic filtering procedure, we select A = 25 for all experiments; therefore, the filtering step size is Δ_α = 0.04 over the normalised range of parameters in the evaluated network/layer. Where computational resources permit, we recommend using larger values of A in order to compute the parameter scores more accurately.

Results and Analysis
The results of the global (full network parameters) and local (network layer parameters) analyses shown in Figs. 9 and 10 describe the fragility, robustness and antifragility characteristics of parameters (cf. Sec. 4.2.1). Furthermore, the results show the adversarially targeted (r_ϵ) parameters (cf. Sec. 4.2.3). We identify parameter characteristics that are invariant between clean and adversarial datasets across different datasets and networks.

Effects of batch normalization
We investigated the phenomenon of the network retaining classification performance despite all features at layer l being removed (see column α_A in Fig. 5). When we investigate the output of layers deeper than l, we discover that residual features continue to propagate through the network despite the filtering out of network weights at the l-th layer. This is attributed to the batch normalization (BN) layers that follow convolutional layers and are tasked with minimising covariate shift in the network [48]. When implementing a network architecture, we utilise the standard models in accordance with the literature; the functionality of the batch normalization layers is also predefined and remains unaltered in our analysis. Consider the condition where a convolutional layer l has been filtered maximally by a synaptic filter; the subsequent batch normalization computation is given as:

ŷ^(l) = γ^(l) (y^(l−1) − E[y^(l−1)]) / √(Var[y^(l−1)] + ϵ) + β^(l),    (17)
where ŷ^(l) is the output of the batch normalization process at the output of convolutional layer l; y^(l−1) is the output of the previous convolutional layer l−1, given by f(θ^(l)_{1,2,3}, ŷ^(l−1)). The variables γ^(l) and β^(l) are learnable parameter vectors and ϵ is a value added to the denominator for numerical stability (set to 1 × 10^−5). Network implementations compute the expectation and variance in Eq. (17) as running statistics during training; the statistics calculated during training are then used during inference. Consequently, when the input to the BN layer following convolutional layer l is a zero vector (the case where layer l has been filtered maximally through synaptic filtering), the BN layer retains features of the training batches, even when evaluating test sets. This is shown in the results in Fig. 11, where the filtering of all parameters from certain layers results in only a slight decrease in network performance. The ability of the network to retain sufficient performance, despite the filtering out of certain layer parameters, is due to the features propagated during the forward pass by the batch normalization layer following the filtered layer.
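The retained response can be reproduced with inference-mode batch normalization alone: with running statistics and a learned shift β, a zero input (a maximally filtered convolutional layer) still yields a nonzero output. A NumPy sketch with hypothetical statistics:

```python
import numpy as np

def batchnorm_inference(x, gamma, beta, running_mean, running_var, eps=1e-5):
    """Inference-mode batch norm: uses running statistics from training."""
    return gamma * (x - running_mean) / np.sqrt(running_var + eps) + beta

# Learned parameters and running statistics (hypothetical values).
gamma = np.array([1.2, 0.8, 1.0])
beta = np.array([0.1, -0.3, 0.05])
running_mean = np.array([0.5, -0.2, 0.0])
running_var = np.array([1.0, 0.25, 4.0])

# Conv layer l filtered maximally: its output is a zero vector.
zero_input = np.zeros(3)
out = batchnorm_inference(zero_input, gamma, beta, running_mean, running_var)
print(np.any(out != 0))  # True: training-batch features still propagate
```

The nonzero output −γ·E[y]/√(Var[y]+ϵ) + β is exactly the training-batch residue described above.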
Selective backpropagation on robust and antifragile parameters Upon identifying robust, fragile and antifragile parameters using the difference in parameter scores (see Sec. 4.2.3), we consider fragile parameters to be those that, when perturbed, result in greater degradation of the synaptic filtering performance on the clean dataset than on the adversarial dataset. Robust parameters are invariant to both clean and adversarial datasets, and antifragile parameters yield increased network performance on the clean dataset compared to the adversarial dataset. Thus, we consider fragile parameters to be important to the network performance on the adversarial dataset. We propose selectively retraining only the robust and antifragile parameters using backpropagation. To carry out this operation during network training, we take a layerwise approach that considers the parameter characterization scores of individual network layers; we omit the fragile layers (those with negative parameter characterization scores) from training by zeroing out their update gradients.
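A layerwise sketch of the gradient masking (layer names, scores and the plain SGD step are illustrative; the paper trains with Adam): gradients of layers with negative characterization scores are zeroed before the update, freezing the fragile layers:

```python
import numpy as np

# Per-layer parameters, gradients, and characterization scores (hypothetical).
params = {"conv1": np.array([0.5, -0.3]), "conv2": np.array([0.1, 0.2])}
grads = {"conv1": np.array([0.05, 0.02]), "conv2": np.array([-0.01, 0.03])}
scores = {"conv1": -0.4, "conv2": 0.2}   # negative => fragile layer

def selective_update(params, grads, scores, lr=0.1):
    """Zero out gradients of fragile layers, then apply a plain SGD step."""
    for name in params:
        g = grads[name] if scores[name] >= 0 else np.zeros_like(grads[name])
        params[name] = params[name] - lr * g
    return params

before = {k: v.copy() for k, v in params.items()}
params = selective_update(params, grads, scores)
print(np.array_equal(params["conv1"], before["conv1"]))  # True: frozen
```

In a deep learning framework the same effect is obtained by zeroing the per-layer gradient tensors between the backward pass and the optimizer step.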
The results of our selective backpropagation method are shown in Fig. 12, where the mean (solid lines) and standard deviation (coloured shaded regions) of the network performances are shown for networks tested from epoch 10 to epoch 100, measured every 10 epochs. We test each network up to a maximum perturbation magnitude (external stress magnitude) of ϵ_E, selected using Definitions 7, 8 and 9. As the results show, our proposed method (teal) outperforms networks trained using regular backpropagation (orange) with respect to robustness to adversarial attacks.
Our proposed method improves network robustness more on the CIFAR10 (Fig. 12(b)) and Tiny ImageNet (Fig. 12(c)) datasets than on the MNIST (Fig. 12(a)) dataset. The greater effectiveness of selective backpropagation on CIFAR10 and Tiny ImageNet compared to MNIST can be attributed to the complexity of the datasets [49], where MNIST has a lower complexity relative to CIFAR10 and Tiny ImageNet.

Conclusions
Using our proposed synaptic filtering technique as a test bed, we can examine deep neural networks on both clean and adversarial inputs and characterize the network parameters as fragile, robust or antifragile. When subjected to synaptic filtering and an adversarial attack, fragile parameters are those that cause a decrease in DNN performance, parameters characterized as robust keep the DNN performance within a defined tolerance threshold (e.g., a ±2% change in DNN performance), and parameters characterized as antifragile cause an increase in DNN performance.
Such an identification method can be applied to distill a trained network in order to make it usable in resource-constrained applications, such as wearable devices. We offer parameter scores to evaluate the effects of specific parameters on the network performance and to expose parameters targeted by an adversary. We find that there are global and local filtering responses with features that are invariant to different datasets over the learning process of a network. For a given dataset, the filtering scores identify the parameters that are invariant in characteristics across different network architectures. We show that retraining only the robust and antifragile parameters at a given epoch improves DNN robustness to adversarial attacks on all evaluated datasets and network architectures.

Fig. 1 :
Fig. 1: Our methodology of parameter filtering and evaluating DNN performances on clean and adversarial datasets. Passing a DNN through parameter filters is equivalent to internal stress, and applying an adversarial attack with various magnitudes on clean data is equivalent to external stress on a DNN. In this methodology, the DNN performances (labeled 1, 2, 3 and 4) are individually compared against a defined baseline DNN performance (solid green line in the illustration shown on the lower left) in order to characterize DNN parameters as fragile (red shaded area), robust (green shaded area), or antifragile (blue shaded area).
An overview is given in Fig. 2, where Fig. 2(a) shows the application of stress on a DNN and Fig. 2(b) shows the interpretation of DNN performance for parameter characterization. For detailed definitions of the above-mentioned concepts, we consider the following notations.

Fig. 2 :
Fig. 2: (a) Showing an overview of the proposed system evaluation method.(b) shows the characteristics of fragility, robustness and antifragility through analysing the performance of a system F whilst under stress.

Fig. 3 :
Fig. 3: Synaptic filtering framework. Left block (1) shows the input x at time t_0; the network f(x, W) with parameters W; the adversarial attack δ [this study computes δ using f(x, W)]; the perturbation magnitude ϵ; and the resultant adversarial example x_ϵ at time t_1. The perturbation magnitude ϵ is bounded by (ŷ_ϵ ≈ ŷ) and (ŷ_ϵ K > 1) for K classes; ŷ and ŷ_ϵ are the clean and adversarial accuracies. Middle block (2) outlines the set of synaptic filters h, containing the h_1, h_2 and h_3 filters at each point α_i applied to layer W^(l), resulting in the network performance to the filters. There are A sets of h for each α_i ∈ [α_0, α_A]. Right block (3) shows β^(l) = f(x, h(θ, α)) as the system performances for all values of α, where θ is W^(l) for a local analysis at layer l. The function g(•) combines β_1^(l), β_2^(l) and β_3^(l).

Fig. 4 :
Fig. 4: (a) Baseline network performance (green dotted line) φ_1^(l) [Eq. (10)] for ResNet-18 trained for 100 epochs on CIFAR10. ϕ_1^(l) is the function that contains the number of parameters filtered (teal solid line) for filtering thresholds in α for filter h_1 on layer l. The maximum number of parameters in layer l is denoted by ϕ^(l) (yellow dot). (b) Comparison of the scaled synaptic filtering performance with the baseline network performance, and the synaptic robustness computation. The network performance to the synaptic filter is β_1^(l).

Fig. 5 :
Fig. 5: Learning landscape of layers and the regime change of test accuracy. Targeted parameters of ResNet-18 trained on MNIST using filter h_1, showing the combined responses for layer 'layer3.0.conv2', measured every 10 epochs up to 100 epochs. The difference between the clean (left) and adversarial (middle) responses yields the targeted parameters (right). Every pixel in the clean and adversarial images represents the network test accuracy; for the targeted image, it is the difference between the former two, over all evaluated epochs and α.

Fig. 6 :
Fig. 6: Example of network accuracy results of the synaptic filtering procedure applied to layer 'conv1' of ResNet-18 trained on CIFAR10, shown to illustrate the combined system response. The bottom-left plot is a combination of the three top-row plots.

Fig. 7 :
Fig. 7: Combined synaptic filtering responses for ResNet-18 trained on the CIFAR10, MNIST and Tiny ImageNet datasets, for every 10 epochs up to 100 epochs. (1) Local layer-wise system response to the filtering methods for all α values. (2) Global network response using the full network for all α values. Pixel intensities in the shown images represent the average network accuracy using the different synaptic filters on the clean test dataset, for each α_i in α.

Fig. 11 :
Fig.11: Synaptic filtering network performances of ResNet-18 trained on CIFAR10 for layers 'layer2.0.conv1', 'layer3.0.conv1' and 'layer4.0.downsample.0'.The results show that, even after filtering all parameters in a layer, the network performs relatively well (shown by the purple dotted lines at Φ (l) , the maximum number of parameters filtered).This is due to the following batch normalization layer features propagating through the network during the forward pass, thus highlighting the effects of batch normalisation layers on network performance.
The MNIST and CIFAR10 datasets contain 60,000 and 50,000 training examples respectively, each with 10,000 test examples. The Tiny ImageNet dataset contains 80,000 training and 20,000 test examples drawn from the original training set.