A neural network framework for similarity-based prognostics

Method details
The difficulty of accurately estimating remaining useful life (RUL) has led to a growing number of algorithms and methods in the prognostic literature. The significance of these methodologies lies in their potential to estimate how degradation conditions evolve over time. Prognostics provides sufficient lead time for condition-based maintenance operations and notification of necessary actions. Prognostic applications have been proposed for many domains, including composite materials [2], aircraft actuators [3], turbofan engines [4][5][6][7], and electronic components [8].
In this article, the motivation is to propose a hybrid framework for data processing and RUL estimation that uses various complex system indicators and multi-dimensional sensor measurements. The main research focus is the use of data pre-processing methods to increase prognostic performance. Fig. 1 shows the main stages of the protocol, which introduces a conceptual framework to overcome the challenges of multi-regime degradation behavior.
The science of prognosis is predicated on system ageing and monotonic damage accumulation, and it is possible to correlate sensor behavior with signs of ageing to estimate the remaining useful life of systems [9]. However, the multi-dimensional characteristics of raw data do not provide useful information for measuring monotonic damage accumulation. Further data processing is needed to provide useful information for remaining useful life estimations.
The first step of multi-dimensional data processing is to identify the operational regimes, which can be found from the number of clusters in the operational settings:

$$f(x_p) = \underset{k = 1, \ldots, R}{\arg}\ \operatorname{find}(x = k) \tag{1}$$

where $\underset{k = 1, \ldots, R}{\arg}\ \operatorname{find}(x = k)$ defines the index of the occurrences of regime $k$ in the data $x$. Fig. 2 provides an illustrative example of the multi-regime sensors after regime assignment. In this sample, a multiple-regime normalization method adjusts the raw condition monitoring measurements to a common scale (for example, the z-score). To compute a z-score for each regime, the mean ($\mu$) and standard deviation ($\sigma$) of the regime population of each sensor to which a data point belongs must be known; the z-score of a reading $x$ is then $(x - \mu)/\sigma$.
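The regime-wise normalization described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the regime labels have already been obtained (for example, from clustering the operational settings), uses the population standard deviation, and the function name `regime_zscore` is hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def regime_zscore(values, regimes):
    """Standardize one sensor's readings separately within each regime.

    values  : raw readings of a single sensor
    regimes : regime label of each reading (assumed to be known already)
    """
    # Group the readings by regime label (the index sets of Eq. (1)).
    by_regime = defaultdict(list)
    for v, k in zip(values, regimes):
        by_regime[k].append(v)

    # Mean and (population) standard deviation of each regime population.
    # Using pstdev rather than the sample deviation is an assumption.
    stats = {k: (mean(vs), pstdev(vs)) for k, vs in by_regime.items()}

    # z-score every reading with the statistics of its own regime.
    out = []
    for v, k in zip(values, regimes):
        mu, sigma = stats[k]
        out.append((v - mu) / sigma if sigma > 0 else 0.0)
    return out


raw = [10.0, 12.0, 100.0, 104.0, 14.0, 108.0]
labels = [1, 1, 2, 2, 1, 2]
normalized = regime_zscore(raw, labels)
```

After this step the two regimes, which differ by an order of magnitude in the raw readings, share a common scale and can be reassembled into one series.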
This multiple regime normalization process is applied to all regimes separately; the standardized sensors are reassembled to produce a single-regime condition monitoring dataset. To produce a single health output from these normalized sensors, the multiple readings are aggregated by taking the mean of all sensor measurements at each time step.
That is, $s = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $s$ is the single health output, $x_i$ is the normalized reading of sensor $i$ at the given time step, and $n$ is the number of sensors. Even though these aggregated outputs are smoother than the normalized sensors, they are still noisy, and if they are included in the training stage the network might learn from the noise. Therefore, a two-term power series fitting model is applied to describe the relationship between the health output and a Health Indicator (HI) to be used in the neural network training (see Fig. 3, Adjusted Cycle and HI). Here, an approximation to a power-law distribution $s^{b}$ includes two fitting terms, $a$ and $c$. The fitted HIs calculated by this model exhibit only increasing values, with minimal increases in wear level in the early stages.
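As a rough illustration of the aggregation and fitting steps, the sketch below averages the normalized sensors and fits a two-term power model of the form a·x^b + c. Several points are assumptions rather than details from the article: the independent variable is taken to be the cycle index, the exponent is found by a simple grid search (with a and c from ordinary least squares), and the names `aggregate` and `fit_power2` are hypothetical.

```python
def aggregate(sensors):
    """Mean of all normalized sensors at each time step (one row per sensor)."""
    n = len(sensors)
    return [sum(col) / n for col in zip(*sensors)]


def fit_power2(x, y, b_grid):
    """Least-squares fit of y ~ a * x**b + c.

    For a fixed exponent b the model is linear in (a, c), so a and c follow
    from ordinary least squares; the b with the smallest squared error wins.
    """
    best = None
    m = len(x)
    y_bar = sum(y) / m
    for b in b_grid:
        u = [xi ** b for xi in x]
        u_bar = sum(u) / m
        suu = sum((ui - u_bar) ** 2 for ui in u)
        if suu == 0:
            continue  # degenerate exponent, nothing to fit
        a = sum((ui - u_bar) * (yi - y_bar) for ui, yi in zip(u, y)) / suu
        c = y_bar - a * u_bar
        sse = sum((a * ui + c - yi) ** 2 for ui, yi in zip(u, y))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    if best is None:
        raise ValueError("no usable exponent in b_grid")
    return best[1], best[2], best[3]


cycles = list(range(1, 21))
clean = [0.002 * t ** 2 + 0.1 for t in cycles]
# Two synthetic sensors scattered symmetrically around the same trend.
sensors = [[v + 0.05 for v in clean], [v - 0.05 for v in clean]]

health = aggregate(sensors)
a, b, c = fit_power2(cycles, health, [0.5 + 0.1 * i for i in range(36)])
hi = [a * t ** b + c for t in cycles]  # smooth, monotonically increasing HI
```

The fitted curve replaces the noisy aggregate as the training target, so the network is not asked to reproduce measurement noise.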
Artificial Neural Network (ANN) training is intended to accommodate the uncertainty of measurements. The parameters from the multiple regimes are handled by a network mapping process between the raw sensor values and the assigned output HI at the model identification stage. This allows the current and upcoming data to be normalized individually. The function fitting neural network is a two-layer feedforward network with a sigmoid transfer function ($f_h$) in the hidden layer and a linear transfer function ($f_o$) in the output layer. This adaptive filtering model is critical for handling standardization issues and for keeping the damage information consistent. In the general equation representing this network model, the state variables ($x$) are multiplied by fixed real-valued weights ($w$) and a bias ($b$) is added. The neuron's activation is obtained by summing these weighted inputs, and the nonlinear activation function ($f$) is applied to this sum [10,11].
Fig. 4 shows such feed-forward network training functions, which take a set of input vectors (raw multi-dimensional condition monitoring data) and a set of target vectors (HI). Neurons, connected by coefficients (weights), constitute the neural network structure and evaluate the input state variables. When a network function is trained with the input and output of a single trajectory, an alternative network trained with other input and output values might produce different estimations because of its different weight and bias coefficients. To that end, as seen in Fig. 4, multiple neural networks are trained on inputs and outputs from different trajectories to filter the raw data, and each trained function is stored in a network library containing "nl" trained functions. This library receives all novel trajectories and produces multiple estimates for each input trajectory.
Fig. 5 shows a set of network library estimations that result in similar exponential growth patterns. Each line is the estimated output for the same raw training input from one trained network function in the library. The results from different functions are consistent, and similar HI estimations starting and ending at parallel wear levels are assigned adequately.
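The structure of the network library can be illustrated with a forward pass through several two-layer networks. This is only a sketch: training is omitted, random weights stand in for trained ones, and the dictionary layout and the names `forward` and `random_net` are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, net):
    """Two-layer feedforward pass: sigmoid hidden layer, linear output layer."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(w_row, x)) + b)
        for w_row, b in zip(net["W_h"], net["b_h"])
    ]
    return sum(w * h for w, h in zip(net["w_o"], hidden)) + net["b_o"]


def random_net(n_inputs, n_hidden, rng):
    """Placeholder for a trained network: random weights stand in for training."""
    return {
        "W_h": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                for _ in range(n_hidden)],
        "b_h": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w_o": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_o": rng.uniform(-1, 1),
    }


rng = random.Random(0)
library = [random_net(3, 5, rng) for _ in range(4)]  # nl = 4 networks

# One trajectory: three time steps of three raw sensor readings each.
trajectory = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.4], [0.5, 0.3, 0.2]]

# Every network in the library yields its own HI estimate per time step.
estimates = [[forward(x, net) for x in trajectory] for net in library]
```

Because each network carries different weights and biases, the library returns one HI curve per network for the same input trajectory.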
As there are multiple network estimations for a single input ("x"), a final HI equation adapted from the moving average method is derived, where the final HI ($s$) is the mean of the ($p$) moving average of the ($n$) estimations and ($l$) is the length of the trajectories. The window size of the moving average is a numeric duration that includes the value at the current location plus its previous and upcoming neighbors.
After both stages of training are completed, the estimated health indicators are stored to calculate the pairwise distance relations for the remaining useful life predictions. A similarity-based remaining useful life estimation model is used to identify the best-matching training HI units for each test HI unit. The similarity-based estimation algorithm used in this article predicts the future behavior of the systems only when there is sufficient training data to map out the damage space. The HI derived from the network estimation must provide a realistic representation of system performance.
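The combination of library estimates into a final HI might look like the sketch below. It assumes a centered window of odd size p that shrinks at the ends of the series; the edge handling and the names `moving_average` and `final_hi` are assumptions, not details from the article.

```python
def moving_average(series, p):
    """Centered moving average; the window shrinks at the ends of the series."""
    half = p // 2
    out = []
    for t in range(len(series)):
        # Current location plus previous and upcoming neighbors.
        window = series[max(0, t - half): t + half + 1]
        out.append(sum(window) / len(window))
    return out


def final_hi(estimates, p):
    """Mean of the smoothed outputs of all networks in the library."""
    smoothed = [moving_average(e, p) for e in estimates]
    n = len(smoothed)
    return [sum(col) / n for col in zip(*smoothed)]


# Two network estimates of the same trajectory (length l = 5).
estimates = [
    [0.0, 0.2, 0.1, 0.4, 0.6],
    [0.2, 0.0, 0.3, 0.4, 0.8],
]
hi = final_hi(estimates, p=3)
```

Both operations are linear, so averaging the networks first and smoothing afterwards gives the same result.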
RUL estimation is applied as a tool to match the test cases with the full operational training periods. The pairwise distance between the two cases serves as a similarity measure, which can be expressed as the difference between two vectors. Fig. 6 shows the moving pairwise distance calculation between a test trajectory and two full sets of training observations. The similarity for each training baseline is first calculated at time step "1"; the test curve is then moved step by step to the end of the training curve to find the best-matching location, which may lie in the later part of the curve. The moving pairwise distance $d_{te,tr}$ is thereby calculated throughout the training baseline, where $n_{tr}$ and $n_{te}$ correspond to the lengths of the training and test trajectories, respectively. Once the test cases have been moved, the location of the most similar segment is identified by the minimum pairwise distance value:

$$L_{te,tr} = \arg \operatorname{find}\Big( d_{te,tr}(j) = \min\big( d_{te,tr}(1), d_{te,tr}(2), \ldots, d_{te,tr}(n_{tr} - n_{te}) \big) \Big), \quad j = 1, 2, \ldots, (n_{tr} - n_{te}) \tag{8}$$

These locations are then used to calculate the RUL for each baseline (training trajectory):

$$RUL_{te,tr} = n_{tr} - (n_{te} + L_{te,tr}) \tag{9}$$

The final RUL is found by averaging over the multiple baselines with minimum distance values, so that the multiple estimates reduce the risk of biased multi-step-ahead estimations.
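The matching procedure of Eqs. (8) and (9) can be sketched as below. Several choices are assumptions: the Euclidean norm is used as the pairwise distance, offsets are counted from zero (so a test curve aligned with the start of a baseline has offset 0), and `top_k`, the number of best-matching baselines averaged, is a hypothetical parameter.

```python
import math


def moving_distance(test, train):
    """Distance between the test HI and every same-length window of a baseline."""
    n_te, n_tr = len(test), len(train)
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(test, train[j:j + n_te])))
        for j in range(n_tr - n_te + 1)
    ]


def rul_for_baseline(test, train):
    """Best-matching offset (Eq. 8) and the RUL it implies (Eq. 9)."""
    d = moving_distance(test, train)
    offset = min(range(len(d)), key=d.__getitem__)
    rul = len(train) - (len(test) + offset)
    return rul, d[offset]


def estimate_rul(test, baselines, top_k):
    """Average the RULs of the top_k baselines with the smallest distance."""
    results = [rul_for_baseline(test, tr) for tr in baselines
               if len(tr) >= len(test)]
    results.sort(key=lambda r: r[1])
    best = results[:top_k]
    return sum(rul for rul, _ in best) / len(best)


train1 = [0.01 * t ** 2 for t in range(20)]
train2 = [0.02 * t ** 2 for t in range(15)]
test = train1[5:10]  # a test unit cut from the middle of train1

rul_one, dist_one = rul_for_baseline(test, train1)
rul_avg = estimate_rul(test, [train1, train2], top_k=2)
```

Here the test curve matches `train1` exactly at offset 5, giving a RUL of 10 cycles for that baseline; averaging with the faster-degrading `train2` pulls the final estimate down.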

Method validation
The algorithm is tested on the PHM08 data challenge dataset provided by the Prognostics Center of Excellence at NASA Ames, where it achieved the current overall leading score in the literature [1,12]. While the methods are designed through the application of different protocols for multi-step-ahead predictions, error-based prognostic metrics are used dynamically to measure the performance of the model. The prognostic metrics, and their impact on the technical prognostic requirements, are particularly important because they complement each other. Therefore, the presented methods are assessed in terms of both the modelling and the performance evaluation of prognostics using technical metrics.
Before the results of the presented algorithm are sent for validation, the model is tested on reconstructed secondary test datasets, which were randomly selected from full-size training trajectories [13]. The lengths of the units in the reconstructed database are equal to those of the final test units from the original file, to ensure that behavior similar to that of the test trajectories is obtained.
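One way to build such a secondary test set is sketched below. It is an assumption about the procedure, not the authors' code: each reconstructed unit is a randomly chosen full training trajectory truncated to the length of a final test unit, and the name `reconstruct_test_set` is hypothetical.

```python
import random


def reconstruct_test_set(train_units, test_lengths, seed=0):
    """Truncate randomly chosen full training trajectories to test-like lengths.

    Returns (truncated_trajectory, true_rul) pairs, so an estimated RUL can
    be compared with a known ground truth.
    """
    rng = random.Random(seed)
    cases = []
    for length in test_lengths:
        # Only trajectories strictly longer than the target length can be cut.
        candidates = [u for u in train_units if len(u) > length]
        if not candidates:
            continue
        unit = rng.choice(candidates)
        cases.append((unit[:length], len(unit) - length))
    return cases


train_units = [list(range(30)), list(range(50))]
cases = reconstruct_test_set(train_units, test_lengths=[10, 40])
```

Because the cut-off point of each unit is known, the true RUL is available for every reconstructed case and can be used to score the model before the final submission.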