Overcoming Adversarial Perturbations in Data-driven Prognostics Through Semantic Structural Context-driven Deep Learning

Deep learning has shown impressive performance across a variety of domains, including data-driven prognostics. However, research has shown that deep neural networks are susceptible to adversarial perturbations, which are small but specially designed modifications to normal data inputs that can adversely affect the quality of the machine learning predictor. We study the impact of such adversarial perturbations in data-driven prognostics where sensor readings are utilized for system health status prediction including status classification and remaining useful life regression. We find that we can introduce obvious errors in prognostics by adding imperceptible noise to a normal input and that the hybrid model with randomization and structural contexts is more robust to adversarial perturbations than the conventional deep neural network. Our work shows limitations of current deep learning techniques in pure data-driven prognostics, and indicates a potential technical path forward. To the best of our knowledge, this work is the first to investigate the implications of using randomization and semantic structural contexts against current adversarial attacks for deep learning-based prognostics.


INTRODUCTION
Machine learning has gained much attention as a way to solve a variety of challenges in cyber-physical systems (CPS). Data-driven methods, particularly deep learning, have made great strides in health management and prognostics, such as anomaly detection and remaining useful life estimation. For a relatively complex system with a number of sensors, the service provider can utilize data flows from multiple smart data sources to perform data-driven prognostics using a deep learning model. However, recent research in prognostics has shown that statistical learning methods, e.g., deep neural networks, are susceptible to small adversarial perturbations, which are small but specially designed modifications to normal data inputs that can adversely affect the quality of the machine-learned predictor (Echauz et al., 2019). For instance, owing to the deployment of external sensors in many application scenarios, an attacker could intercept and maliciously modify sensor readings to conduct these kinds of adversarial attacks, which could subsequently lead to critical damage caused by inaccurate system health status evaluation.

Xingyu Zhou et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
We study the impact of such adversarial perturbations in data-driven prognostics as shown in Figure 1. Data-driven prognostics methods are suitable for scenarios where complete analytical models of the physics are difficult or impossible to formulate. In this context, recent research has shown the success of combining physics-based semantic information (Chao, Kulkarni, Goebel, & Fink, 2020) with pure data-driven prognostics. For this paper, two typical prognostics problem settings comprising classification and regression are considered (Saxena, Celaya, et al., 2008). In the classification setting, the model predicts the remaining useful life of the system in a certain range and outputs a categorical result indicating the health status category to which the system belongs. In the more general regression setting, the model outputs a numerical prediction of the remaining useful life.
We demonstrate our ideas using the widely-used C-MAPSS turbine engine dataset (Saxena & Goebel, 2008). We explore the vulnerability of prediction models and potential ways to defend against adversarial attacks. To the best of our knowledge, this work is the first to investigate the implications of using semantic structural contexts against current adversarial attacks for deep learning-based prognostics (Y. Li et al., 2020). We find that we can introduce obvious errors in prognostics by adding imperceptible noise to a normal input and that the model involving randomization and structural contexts is more robust to adversarial perturbations. Our work highlights the limitations of current deep learning techniques in pure data-driven prognostics, and presents a potential technical path forward.
We make the following contributions in this paper:
• We present a framework that can formalize security and resilience testing in data-driven prognostics settings.
• We show the vulnerability of deep learning prognostics models under various settings.
• We investigate the possibility of inducing randomization elements and semantic structural context to mitigate adversarial impacts.
• We conduct a robustness case study using the engine degradation dataset on typical applications of health status classification and remaining useful life prediction.
The rest of the paper is organized as follows. Section 2 provides the motivation for evaluating and mitigating security risks in data-driven prognostics. Section 3 illustrates the theoretical background for our adversarial attack and defense settings. Section 4 presents a case study to demonstrate the capabilities of our framework on the C-MAPSS turbine engine dataset (Ramasso & Saxena, 2014). Finally, Section 5 concludes the paper and alludes to future research directions.

MOTIVATION
It is essential for service providers to maintain a capability to monitor and predict the health status of their systems. Two primary technical paths for system modeling have been widely used: model-based and data-driven. Classical model-based methods assume that the model is accurate enough to depict system behaviors, e.g., Bond Graphs (Broenink, 1990) or analytical battery physical models (Zhang & Lee, 2011). These methods need to know formally how different components of a system interact and how the system output can be computed analytically. For many real-world cases, however, due either to partial knowledge or to the system's large scale, it becomes impossible to reveal the pattern in a detailed analytical way. Consequently, as the complexity of the system and the volume of generated data increase, data-driven health management and prognostics are preferred (Baraldi, Cadini, Mangili, & Zio, 2013).
Recent advances in machine learning, particularly deep neural networks, have lowered the threshold for building a prediction model for health management and prognostics. However, deep neural networks are susceptible to adversarial attacks: small, purposely designed data perturbations that misguide the original neural network prediction system. Recently, the impact of adversarial examples in deep learning (Szegedy et al., 2013) has given rise to many concerns (Goodfellow, Shlens, & Szegedy, 2014). Prior research (Vorobeychik & Kantarcioglu, 2018) has shown how these adversarial examples can pose threats to current machine learning systems. Most prior work in the field of adversarial machine learning over the past decade has focused on classification tasks (Biggio & Roli, 2018). As regression tasks start playing an increasingly important role in CPS scenarios, the topic of adversarial regression is attracting more research attention.
In the health management and prognostics field, deep learning techniques have been shown to be successful in a number of tasks such as power disturbance classification (Valtierra-Rodriguez, de Jesus Romero-Troncoso, Osornio-Rios, & Garcia-Perez, 2013) and remaining useful life prediction (X. Li, Ding, & Sun, 2018). Consequently, even though a health prediction model can be built easily using state-of-the-art deep neural networks and tool-flows, a more cautious view still needs to be taken due to the potential risks of adversarial attacks on these kinds of learning-based components (Echauz et al., 2019).
In a nutshell, although accurate status classification or prognostics is critical for efficient data-driven health management, the vulnerability in this broad practical scenario has not been carefully investigated to date. To address these issues in the prognostics and health management domain, we propose an approach to expose model vulnerabilities in domain-specific settings and explore potential ways to mitigate the underlying adversarial impacts.

METHODOLOGY
In this section, we delve into the details of the underlying methods used in our framework. These methods are introduced step by step, following the evaluation execution path of attack and defense on neural network predictors.

Adversarial Attack
As discussed above, attacks on neural network-based prognostics are essentially small adversarial data modifications.There are two important features for a successful adversarial attack.
First, the attack perturbation should not be 'obvious' enough to be detected by the system it is attacking. Second, the attack should lead to an 'obvious' performance deviation in the compromised prediction system.
Overall, given a model f(·), one well-recognized way to define an adversarial attack (Carlini et al., 2019) is as the worst-case target loss L under a perturbation budget defined by an ε-bounded distance magnitude D(x, x′) ≤ ε between the original data point x and the perturbed data point x′.
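With ε denoting the perturbation budget, this worst-case definition can be written compactly as:

```latex
x' \;=\; \arg\max_{\tilde{x}\,:\,D(x,\tilde{x}) \le \epsilon} \; L\big(f(\tilde{x}),\, y\big)
```

That is, the attacker seeks the admissible perturbed input that drives the target loss as high as possible.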
It is worth pointing out that attack methods often regard the requirement of being stealthy as self-evident under the fixed maximum magnitude constraint ε. This generalizes to data that lies in, or can be pre-processed into, a fixed range using techniques like MinMax Normalization (Patro & Sahu, 2015).

Gradient-based Attack
Based on the two features above, we can regard an adversarial attack against a prediction model as an optimization problem.
The goal of this optimization problem is to find an adversarial perturbation of the original data input that, on one hand, maximizes the target loss function and, on the other hand, stays under realistic constraints. Here we consider a white-box setting where the attacker has full knowledge of the prediction model. In this way, the attack procedure becomes a gradient-based optimization computation using the loss function. We select two typical adversarial attack methods: the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD).
FGSM (Fast Gradient Sign Method) (Goodfellow et al., 2014) is one of the most well-known and popular adversarial attack methods. It formulates the optimization problem incorporating these two constraints in a single equation:

x′ = x + ε · sign(∇_x L(f(x), y))

Here, x represents inputs to the model and is assumed to be in the range of [0, 1], y refers to the targets associated with x (for tasks with targets), and L(f(x), y) is the goal loss function for deviating the neural network predictor f(·). The magnitude constraint added to the original sample is represented by ε. This method is simple and intuitive: the attacker adds fixed-magnitude perturbations to maximize the loss function.
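As a concrete illustration, the one-step update above can be sketched in numpy for a toy linear predictor f(x) = w·x with a squared-error loss; the weight vector, input, and target below are made-up values, not taken from the paper's models:

```python
import numpy as np

def fgsm(x, y, w, eps):
    """One-step FGSM: perturb x by eps in the direction that increases
    the squared-error loss of a toy linear predictor f(x) = w.x.
    The gradient of L = (w.x - y)^2 with respect to x is 2*(w.x - y)*w."""
    grad = 2.0 * (np.dot(w, x) - y) * w
    x_adv = x + eps * np.sign(grad)
    # keep the perturbed sample in the normalized [0, 1] input range
    return np.clip(x_adv, 0.0, 1.0)

# illustrative example: a 4-feature input and a hand-picked weight vector
w = np.array([0.5, -0.3, 0.2, 0.1])
x = np.array([0.4, 0.6, 0.5, 0.5])
y = 0.2
x_adv = fgsm(x, y, w, eps=0.05)
print(np.max(np.abs(x_adv - x)))  # perturbation magnitude is at most eps
```

The sign of the gradient, rather than its magnitude, decides each feature's perturbation direction, which is what keeps every modification exactly at the ε budget.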
PGD (Projected Gradient Descent Method) (Madry, Makelov, Schmidt, Tsipras, & Vladu, 2017) is recognized as a universal attack procedure and is by far one of the strongest attack methods. Instead of starting from the exact data point, it starts from a random perturbation in the l_p norm ball around the input sample. It then utilizes the FGSM approach but with a much smaller gradient step towards the greatest loss and projects the perturbation back into the allowed l_p ball range. In contrast to one-step FGSM, PGD iteratively adds small perturbations using updated gradients. With the same maximum perturbation magnitude, PGD proves to be much stronger than FGSM.
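A minimal numpy sketch of this iterative procedure, again using a toy linear predictor f(x) = w·x with a squared-error loss (all concrete values are illustrative assumptions, not the paper's models):

```python
import numpy as np

def pgd(x, y, w, eps, alpha=0.01, steps=50, seed=None):
    """PGD with a random start in the l_inf ball of radius eps around x,
    attacking a toy linear predictor f(x) = w.x under squared-error loss."""
    rng = np.random.default_rng(seed)
    x_adv = x + rng.uniform(-eps, eps, size=x.shape)    # random start
    for _ in range(steps):
        grad = 2.0 * (np.dot(w, x_adv) - y) * w          # ascend the loss
        x_adv = x_adv + alpha * np.sign(grad)            # small FGSM-like step
        x_adv = np.clip(x_adv, x - eps, x + eps)         # project into eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                 # stay in data range
    return x_adv

w = np.array([0.5, -0.3, 0.2, 0.1])
x = np.array([0.4, 0.6, 0.5, 0.5])
x_adv = pgd(x, y=0.8, w=w, eps=0.05)
# x_adv stays within the eps-ball around x while increasing the loss
```

The projection step is what distinguishes PGD from simply repeating FGSM: each small step may leave the allowed region, and clipping back to the ball keeps the total modification within the ε budget.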

Adversarial Goal Setting
In a real-world CPS scenario, the settings become more complex with a data input space that is potentially larger than a fixed range.As a result, we reformulate an adversarial attack as an optimization problem which attempts to find the best synthetic perturbations that maximize the prediction loss while keeping the modification magnitude at a small enough level so as to go undetected.
In CPS settings not only do input data formats show more complex patterns, the adversarial attack goals can also be more flexible than, say, computer vision classification tasks.
For an adversarial regression scenario like remaining useful life prediction, we propose three adversarial attack settings using corresponding loss functions: Mean Squared Error (to maximize absolute prediction deviation), Maximization (to maximize the prediction value), and Minimization (to minimize the prediction value). The latter two are not commonly discussed but would be extremely meaningful for health management and prognostics application scenarios, where the costs of underestimating and overestimating can be high and unbalanced (Ramasso & Saxena, 2014).

Randomization-based Defense
Although there is no universal defense technique against existing adversarial attacks (Akhtar & Mian, 2018), previous robustness improvement works that introduced randomization elements have shown a high success rate in detecting adversarial examples (Athalye, Carlini, & Wagner, 2018). As redundancy is a significant design principle for achieving higher system reliability, it is intuitive to combine this principle with a probabilistic implementation. Therefore, we choose two typical randomization-based methods and further propose a potential way of utilizing semantic structural contexts to help mitigate adversarial impacts in data-driven prognostics.

Gaussian Augmented Adversarial Training
One mainstream defense technique that has attracted much attention is improving model robustness during the training phase. Research has shown that Gaussian data augmentation during training (Zantedeschi, Nicolae, & Rawat, 2017) helps improve the robustness of neural networks to adversarial attacks. The intuition behind this is straightforward: in the high-dimensional space, data points within an l_p ball range should share the same label. Based on this assumption, the original training dataset can be extended by adding norm-bounded Gaussian noise while keeping the same labels. Training on this extended dataset generates models that are more resilient against small perturbations.
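The augmentation step can be sketched as follows; the dataset shapes, noise level, and number of noisy copies are illustrative assumptions rather than the paper's exact configuration:

```python
import numpy as np

def gaussian_augment(X, y, sigma=0.05, copies=4, bound=0.10, seed=0):
    """Extend (X, y) with noisy copies: each copy adds Gaussian noise
    clipped to an l_inf bound, while the labels stay unchanged."""
    rng = np.random.default_rng(seed)
    X_aug, y_aug = [X], [y]
    for _ in range(copies):
        noise = np.clip(rng.normal(0.0, sigma, size=X.shape), -bound, bound)
        X_aug.append(np.clip(X + noise, 0.0, 1.0))  # stay in normalized range
        y_aug.append(y)                             # same labels as clean data
    return np.concatenate(X_aug), np.concatenate(y_aug)

X = np.random.default_rng(1).uniform(0, 1, size=(100, 24))
y = np.arange(100)
X_aug, y_aug = gaussian_augment(X, y)
print(X_aug.shape, y_aug.shape)  # (500, 24) (500,)
```

The model is then trained on the extended (X_aug, y_aug) exactly as it would be on the clean data.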

Input Dropout Inference
In the inference phase, research (Xie, Wang, Zhang, Ren, & Yuille, 2017) has shown that random cropping or random padding with resizing of adversarial examples reduces their effectiveness. Furthermore, randomly deactivating input neurons in the inference phase (S. Wang et al., 2018) has also been discussed. For the model deployment strategy, a recent work (Liu, Cheng, Zhang, & Hsieh, 2018) adds small random noises to one input for several predictions and takes the ensemble of prediction results with the most probable class. However, these methods have only been tried on computer vision and classification tasks and cannot be applied to other scenarios directly.

Generalized Random Ensemble
Based on the above discussions, we design a generalized model application strategy against adversarial examples using randomization elements applicable for both classification and regression problems.
The system holds a predictor and a self-representative auto-encoder detector for reconstruction. Given a data input, the system randomly sets input feature values to zero at a dropout rate level P_R. We discuss two kinds of dropout here: (1) Normal Dropout with a constant rate on all features and (2) SemanticDropout, which uses system structural contexts to assign features different dropout rates, as shown in Algorithm 1. The intuition behind SemanticDropout is to decouple correlated features from the same subcomponent, which are most likely to be adversarially modified in the same direction. The data with deactivated values is then sent to the self-representative auto-encoder to recover the matrix, and the recovered data is used for prediction. This {Drop => Reconstruct => Predict} procedure is repeated for nIter iterations, generating nIter sets of predictions.
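A minimal sketch of SemanticDropout and the {Drop => Reconstruct => Predict} loop; the feature grouping, per-group dropout rates, and the stand-in `reconstruct`/`predict` callables are hypothetical placeholders for the auto-encoder and the health predictor described above:

```python
import numpy as np

def semantic_dropout(x, groups, rates, rng):
    """Zero out input features with per-group dropout rates.
    groups[i] names the subcomponent of feature i; rates[g] is the
    dropout probability for features in group g (hypothetical mapping)."""
    p = np.array([rates[g] for g in groups])
    mask = rng.uniform(size=x.shape[-1]) >= p   # keep each feature w.p. 1 - p
    return x * mask

def randomized_predict(x, groups, rates, reconstruct, predict,
                       n_iter=10, seed=0):
    """The {Drop => Reconstruct => Predict} loop: repeat n_iter times
    and collect one prediction per round."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_iter):
        dropped = semantic_dropout(x, groups, rates, rng)
        preds.append(predict(reconstruct(dropped)))
    return np.array(preds)

# illustrative run: two made-up groups, identity "auto-encoder",
# and a sum as the stand-in predictor
groups = ['fan'] * 12 + ['core'] * 12
rates = {'fan': 0.3, 'core': 0.6}
x = np.ones(24)
preds = randomized_predict(x, groups, rates,
                           reconstruct=lambda z: z,
                           predict=lambda z: z.sum())
```

Each of the nIter rounds produces one prediction; the aggregation of these rounds is described next.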
To deal with the classification problem, the output matrix contains likelihood values for the classes. For each class, we sort its likelihood values across rounds and compute a score as the sum of the highest nThres values. The overall classification result is the class with the highest score.
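The per-class aggregation can be sketched as follows (the likelihood values below are made up for illustration):

```python
import numpy as np

def ensemble_classify(likelihoods, n_thres):
    """Aggregate nIter rounds of class likelihoods (shape nIter x nClasses):
    for each class, sum its n_thres highest per-round likelihoods,
    then pick the class with the largest sum."""
    top = np.sort(likelihoods, axis=0)[-n_thres:]   # highest rounds per class
    return int(np.argmax(top.sum(axis=0)))

likelihoods = np.array([[0.9, 0.1],
                        [0.2, 0.8],
                        [0.7, 0.3],
                        [0.6, 0.4]])
print(ensemble_classify(likelihoods, n_thres=2))  # class 0: 0.9+0.7 > 0.8+0.4
```

Summing only the top nThres rounds per class discounts rounds where the random dropout happened to destroy the informative features.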
To deal with the regression problem, the system first uses some training data with the trained model to generate adversarial examples using a deviation-maximization loss function (like mean squared error). These adversarial examples derived from the training data are used to decide whether the model is more likely to be adversarially maximized, adversarially minimized, or equally likely to be either. Given that the output matrix contains the numerical prediction values from the step above, we sort these nIter prediction values. If the model is adversarially maximized, the final prediction is the mean of the smallest nThres predictions. In contrast, if the model is adversarially minimized, the final prediction is the mean of the largest nThres predictions. Otherwise, the final prediction is the mean of the median nThres predictions.
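The regression-side aggregation rule can be sketched as follows, with `bias` standing for the estimated attack direction; the argument names and example values are our own illustration:

```python
import numpy as np

def ensemble_regress(preds, bias, n_thres):
    """Aggregate nIter numerical predictions given the estimated attack bias:
    'max'  -> model tends to be adversarially maximized: average the
              n_thres smallest predictions;
    'min'  -> adversarially minimized: average the n_thres largest;
    otherwise average the median n_thres predictions."""
    s = np.sort(preds)
    if bias == 'max':
        chosen = s[:n_thres]
    elif bias == 'min':
        chosen = s[-n_thres:]
    else:
        mid = (len(s) - n_thres) // 2
        chosen = s[mid:mid + n_thres]
    return float(chosen.mean())

preds = np.array([1.0, 9.0, 5.0, 3.0, 7.0])
print(ensemble_regress(preds, 'max', 3))   # mean of the 3 smallest: 3.0
```

Counteracting the estimated bias direction in this way prevents the adversarially shifted rounds from dominating the final RUL estimate.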

EVALUATION
For demonstration purposes, we conduct a case study using the widely-used C-MAPSS jet engine degradation dataset (Saxena & Goebel, 2008). It is worth pointing out that the proposed attack and defense settings, as well as the implementation methods, can be generalized to other data-driven health management and prognostics settings without any constraints.

Dataset Description
The C-MAPSS dataset (Saxena & Goebel, 2008) was initially generated for the PHM08 Data Challenge. It is a dataset for data-driven remaining useful life (RUL) prediction for jet turbofan engines. The standard version has four sub-datasets, each consisting of running data of the engine under a certain mode. The engine starts degrading from a time point and breaks down (RUL = 0) at the end of the running cycle. Apart from the time label, there are 24 features for each data point, as shown in Table 2. The first three are the operational settings (the dataset does not state exactly what they represent), which have a substantial effect on engine performance. The remaining ones represent the 21 sensor values. Some researchers also use trends to make an initial selection of features (T. Wang, Yu, Siegel, & Lee, 2008) to guarantee that the selected features have relatively clear trends throughout the degradation process (Ellefsen, Bjørlykhaug, Æsøy, Ushakov, & Zhang, 2019). As we are trying to demonstrate the potential risk of data-driven predictors, we only choose the first sub-dataset FD001 for our experiments. There are data preparation steps that can be applied to this dataset, including duplicate removal and normalization such as MinMax (Patro & Sahu, 2015) or Batch Normalization (Ioffe & Szegedy, 2015). We implement MinMax normalization to transform feature values into the fixed range [0, 1].

System Models
Since we focus on sequence analysis, we choose the long short-term memory (LSTM) network (Hochreiter & Schmidhuber, 1997). A recurrent LSTM network enables us to input sequence data into a network and make predictions for individual time steps of the sequence. Given their good support for time series, LSTMs have been widely used for data-driven prognostics (Zheng, Ristovski, Farahat, & Gupta, 2017) for more than a decade (Heimes, 2008). Our work makes use of recurrent neural networks in two ways using sensor readings of different components. One is to predict the health status or remaining useful life. The other is an LSTM auto-encoder that helps reconstruct data matrices containing deactivated zero values after randomization executions. The input dimension is 50 × 24, indicating 24 features across 50 time steps. For the health predictor, we include two LSTM layers with 100 units. The status classifier has a Sigmoid (binary classification) or Softmax (multi-class classification) activation layer to generate likelihoods for classification outputs. The regression predictor adds a single neuron after the LSTM layers to compute the numerical output. For the auto-encoder, we include two LSTM layers with 50 units followed by a fully-connected layer with 24 units to generate an output with the same shape as the input.
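The 50 × 24 inputs can be produced with a simple sliding-window sketch over each engine's run-to-failure series; labeling every window with the RUL at its last time step is a common convention and an assumption on our part, not necessarily the paper's exact preprocessing:

```python
import numpy as np

def make_windows(series, rul, window=50):
    """Slice a (T x 24) run-to-failure series into overlapping windows of
    `window` time steps; each window is labeled with the RUL at its last step."""
    X, y = [], []
    for end in range(window, len(series) + 1):
        X.append(series[end - window:end])
        y.append(rul[end - 1])
    return np.array(X), np.array(y)

# illustrative engine: 120 cycles, 24 features, RUL counting down to 0
T = 120
series = np.zeros((T, 24))
rul = np.arange(T - 1, -1, -1)
X, y = make_windows(series, rul)
print(X.shape, y[0], y[-1])  # (71, 50, 24) 70 0
```

Each resulting (50, 24) window is one sample for the LSTM predictor or auto-encoder.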

Prediction and Attack Setting
For health management and prognostics applications, the input and output formats vary across scenarios. For this engine dataset, research can be conducted on both remaining useful life regression and health status classification (Umberto Griffo, 2019). To show the generalized existence of adversarial impacts across different data-driven settings, we consider three tasks with five adversarial settings, as shown in Table 1.

For the binary classification task, a predictor judges whether the engine is already in the failure state. We consider engines with no more than 15 cycles of remaining life as being in the failure state. This predictor outputs a binary result of true or false. As a result, the adversarial attacks make use of the known model knowledge and use the Binary Crossentropy loss function to lead the predictor to misclassify the modified sample into the other class.
For the multiclass classification task, a predictor judges whether the engine is in the steady phase, the degrading phase, or the critical phase. We consider engines with no more than 50 cycles of remaining life as critical, with 50 to 105 cycles as degrading, and with more than 105 cycles as steady. This predictor outputs a class indicating the phase. As a result, the adversarial attacks make use of the known model knowledge and use the Categorical Crossentropy loss function to lead the predictor to misclassify the modified sample into a wrong class.
For the regression task, a predictor outputs a positive numerical value of the estimated remaining useful life for the given time step of data input. This is the most fundamental task for this dataset. Here the attacker has three options for the attack goal setting, using different loss functions: Mean Squared Error to maximize absolute prediction deviation, Maximization to maximize the prediction value, and Minimization to minimize the prediction value.
The attack is a manipulation of sensor data under reasonable constraints with full knowledge of the prediction and detection model. Among the 24 features provided for data points, the first three are control settings and the remaining 21 are sensor readings. We assume the attacker can modify the values of sensor readings while the control settings are kept untouched. For the attack strength, we conduct experiments on clean data along with different attack perturbation levels of advEps = 0.01-0.10, equivalent to 1-10% of the data value range after MinMax Normalization. Under these constraints, we generate adversarial examples using the strong iterative PGD attack (50 steps) to maximize the prediction deviation.

Evaluating Attack and Defense
For comparison purposes, we conduct experiments on models trained from only clean natural data and also from Gaussian Augmented data (with a norm bound of 0.10). Moreover, for each model, we include three settings: no defense, Normal Dropout, and SemanticDropout. In this section, we first introduce how we incorporate semantic structural contexts and then show more detailed experimental results.

Structural Information Embedding
Without any background knowledge of aircraft engine dynamics, we can get this high-level information from the layout description given by the original C-MAPSS dataset (Saxena, Goebel, Simon, & Eklund, 2008). Among the 24 column variables, the first three are control inputs and the rest are sensor values. Five features, including the three control inputs, are regarded as having a global impact. Using the semantic context knowledge of the high-level component structure, we can relate features to where they act in the engine and how different features are connected, as shown in Figure 2. We use this mapping to build the feature adjacency matrix for the proposed SemanticDropout method discussed above. The definitions of the 24 column features as well as the proposed semantic grouping are shown in Table 2. Features in the same group are marked as adjacent to each other, and the global impact features are assumed to be adjacent to all other features.
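The adjacency construction can be sketched as follows; the group names and the choice of global features are placeholders for illustration, not the exact Table 2 grouping:

```python
import numpy as np

def feature_adjacency(groups, global_idx):
    """Build a feature adjacency matrix: features in the same semantic group
    are adjacent, and globally-acting features are adjacent to everything."""
    n = len(groups)
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i != j and (groups[i] == groups[j]
                           or i in global_idx or j in global_idx):
                A[i, j] = 1
    return A

# hypothetical 8-feature layout: 3 global control inputs, then two groups
groups = ['ctrl', 'ctrl', 'ctrl', 'fan', 'fan', 'core', 'core', 'core']
A = feature_adjacency(groups, global_idx={0, 1, 2})
```

The resulting symmetric matrix is what SemanticDropout can consult to assign correlated (adjacent) features coordinated dropout rates.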

Experimental Results
We conduct experiments on various settings. To evaluate the worst case, without loss of generality, the number of detection iterations is set to 10 and the detection threshold is set to half of that. For all prediction result figures, the detection dropout rate is set at 40%, and we show results under increasing adversarial attack strengths (advEps ranging from 0.01 (1%) to 0.10 (10%)).
Figure 3 shows binary classification accuracy under adversarial attack, and Figure 4 shows multiclass classification accuracy under adversarial attack. In both classification cases, the SemanticDropout method shows the highest adversarial robustness. However, there are some obvious differences in our experiments. In the multiclass case, the Gaussian Augmented model always shows better robustness, whereas in the binary case the Gaussian Augmented model does not hold a stable performance. This shows the potential impact of the noise strength used in the adversarial training phase. The other point is that the robustness of the multiclass model is higher; in other words, the binary classification model itself is more vulnerable. From our empirical experience, this likely relates to the balance of dataset splitting and output encoding (Buckman, Roy, Raffel, & Goodfellow, 2018).
Figure 5 shows adversarial regression errors under the prediction deviation maximization attack setting. Figure 6 shows adversarial regression errors under the prediction minimization attack setting. Figure 7 shows adversarial regression errors under the prediction maximization attack setting. For most of these model and attack settings, the SemanticDropout method shows the highest adversarial robustness.
Figures 9-11 show prediction results under a medium-level (advEps = 0.05) adversarial PGD attack on the Gaussian Augmented model. Figure 9 shows prediction results when the attacker uses the mean squared error between clean and adversarial predictions as the optimization loss function and aims to make prediction results on adversarial data deviate from those on clean data as much as possible. Figure 10 shows prediction results when the attacker uses the negative of the predictor's output as the optimization loss function and aims to make prediction results on adversarial data as small as possible. Figure 11 shows prediction results when the attacker uses the predictor's output as the optimization loss function and aims to make prediction results on adversarial data as large as possible.
Figure 8 shows an example adversarial perturbation for a sample with a ground truth RUL value of around 60. We can see that features receive different magnitudes of modification along the time steps. From this example, we can also see why it is difficult to isolate the origin of the vulnerability: even a single feature might show different sensitivity levels at different time steps. So far, we cannot provide a general description of how a given sensor reading is sensitive to adversarial attacks.
To make visualization easier, we sort the test data. We present experiment results under more flexible settings in Table 3. The table shows experimental results under three levels of detection dropout rate: 30%, 40%, and 50% (three columns from left to right in each setting) with two cycles of reconstruction using dropout and the auto-encoder. The error metrics we choose to show here are the most commonly used mean absolute error (MAE) for regression and accuracy for classification.
For different adversarial attack settings, the best-performing settings are marked in bold. We can see that low detection dropout rates usually perform better on clean or less-perturbed data but are less robust against strong adversarial attacks.
In summary, we show the vulnerability and potential risks of deep learning-based predictors in data-driven prognostics applications. We also show the efficiency as well as the potential of the proposed randomization-based defense technique involving semantic structural contexts. One significant advantage of our randomization-based framework is that it makes use of existing pre-trained models in a resilient way, which means it can work seamlessly together with other defense techniques.

CONCLUSION
We study the robustness of data-driven models in health management and prognostics scenarios, exploring the vulnerabilities of state-of-the-art deep neural network-based data-driven health management and prognostics to adversarial attacks.
We show the significance of introducing randomization elements to improve model robustness. Furthermore, we investigate the possibility of deriving randomization elements from semantic structural contexts to mitigate adversarial impacts. Our discussions and experimental implementations on the engine degradation dataset cover the most typical settings in PHM applications. These general settings can help formalize security testing in more scenarios.
Our future work focuses on three aspects. First, our current investigation covers only a single operation and degradation mode, and we incorporate semantic structural contexts only at the high level of whether features belong to the same component. We plan to incorporate more hierarchical operational settings into the semantic feature analysis for more tasks (Pasareanu, Gopinath, & Yu, 2018). Second, we emphasize the significance of using a supervised self-representative model for data reconstruction with an auto-encoder, but other methods are also worth exploring, such as matrix completion (Yang, Zhang, Katabi, & Xu, 2019) or generative adversarial networks (Samangouei, Kabkab, & Chellappa, 2018). Third, even though our defense framework has shown promising performance in various settings, the optimal choice of defense settings remains empirical. Therefore, we need to dive deeper into the origins of the vulnerabilities in the system and conduct the attack and defense evaluation in a more systematic way. Further research is necessary to measure the efficiency of this framework and to speed up computations for more real-time applications.

Figure 1 .
Figure 1.Overall Workflow for Robustness Evaluation of Deep Learning-based Prognostics on Conventional & Hybrid Models (with Randomization and Structural Contexts)

Figure 2 .
Figure 2. Groups of Features Along Layout Components (one color per group)

Table 1 .
Attack Goals for a Perturbed Sample x′ under the Constraint D(x, x′) < ε. The labels for the classification tasks are transformed from the original numerical remaining useful life values.

Table 2 .
Sensor Feature Grouping according to Their Semantic Structural Contexts