A Bayesian Approach for Maintenance Action Recommendation

This paper presents a Bayesian approach for maintenance action recommendation, tested on the PHM 2013 Data Challenge dataset. The Challenge focused on maintenance action recommendation based on historical cases, and the algorithms were evaluated on their ability to recommend confirmed problem types. The proposed approach is based on a Bayesian inference methodology and deals with recommending an already known problem type for each case, so the recommender can be viewed as a classifier over the confirmed problem types. For each problem type class, the a priori probabilities of the events that characterize the problem type are estimated from the training data. When test cases are presented, the recommender calculates the a posteriori probabilities for each of the confirmed problem types and suggests the problem type that corresponds to the maximum a posteriori (MAP) probability.


INTRODUCTION
In recent years, Prognostics and Health Management (PHM) has attracted the attention of both academia and industry. It is gradually being recognized as a key pillar in implementing Condition-Based Maintenance (CBM) strategies, which aim to improve the efficiency of Asset Lifecycle Management practices.
Significant research results have emerged with the development of sophisticated algorithms and advanced solutions for system detection, diagnosis and prognosis. PHM paradigms typically involve a training stage, where algorithms are used to discover and learn patterns and trends from historical data, followed by a testing stage, where the trained models are applied to new machine condition data for health assessment and/or prediction.
Whereas a multitude of such methodologies have emerged and been applied to various problems and domains, a lack of common ground regarding performance evaluation is hampering efforts to establish a better understanding of the efficiency and potential impact of PHM methodologies. This lack relates both to the availability of common benchmarking datasets and to the adoption of specific evaluation metrics. The former has been addressed in the last few years with the production and distribution of publicly available benchmarking datasets, often the result of focused research efforts to establish prototype test beds for obtaining such data (Nectoux et al., 2012). The latter has given rise to a debate as to what would constitute adequate performance evaluation metrics for different aspects of the PHM problem domain (Eker et al., 2012), (Zhou et al., 2013).
A diverse range of PHM methodologies based on machine learning techniques has been employed for health assessment and prediction from multiple sensorial data. Among these techniques, Bayesian approaches provide an elegant and theoretically sound framework for estimating the probability that an event belongs to a learned class by modeling the interdependencies among the attributes of the historical sensor data.
In the last decade, Bayesian approaches have been applied to various tasks of CBM and PHM. Elnahrawy and Nath (2004) employed Bayesian classifiers to learn contextual information and then make inferences in sensor networks. Saha and Goebel (2008) also used Bayesian techniques for statistical modeling of operational conditions to provide estimates of remaining useful life (RUL) in the form of a probability density function. More recently, Karandikar, Abbas and Schmitz (2013) applied Bayesian inference to estimate the RUL of selected tools. The implementation of Naïve Bayes classifiers for early detection of developing catastrophic failures in machining operations, and its integration into a machine tool control system, was also recently proposed (Mehta et al., 2013).
In the present work, we adopt Bayesian inference to implement a recommender that was submitted to the 2013 PHM Data Challenge, employing MAP probability classification of test cases into the known problem types.
The rest of the paper is organized as follows. Section 2 describes the problem formulation, the training and test datasets, and introduces the employed notation. Section 3 explains the approach of the Bayesian recommender. In Section 4 we discuss the experimental results and the evaluation of the suggested algorithm and, finally, we present the conclusions in Section 5.

PROBLEM FORMULATION AND DATASETS
In this section we present the problem formulation for the task of the PHM Data Challenge 2013, the available datasets and the performance evaluation method.
The task of the Data Challenge 2013 was to implement a recommender which takes the records of a case as input, decides whether the case is a nuisance or a problem, and in the latter case also recommends the problem type from a known set of problems that emerged from historical data.
A case consists of a collection of event codes, each of which corresponds to a number of parameters. This data was generated automatically by the equipment monitoring system. Every time a specific condition is met onboard, the control system generates a specific event code and takes a snapshot of all the parameters that are measured onboard. Cases, on the other hand, have been created either manually by an engineer or automatically by a control system. A record of a case can therefore be defined as a single event code along with the respective measurements of the parameters. There were 30 parameters of onboard measurements, recorded each time an event code was generated.
The Data Challenge provided a dataset of cases with their event codes and respective parameters for training, and another one for testing/evaluation. The training set included the classification of the cases into nuisance or problem. For the cases classified as problems, the corresponding problem label/identifier was also provided. It must be noted that, due to proprietary concerns, a detailed description of the data and the domain was not provided. The training data involved in total 1,316,653 records corresponding to 10,459 cases, of which 10,295 were characterized as nuisance and 164 as problem, resulting in 13 distinct problem identifiers/codes. The testing dataset involved in total 1,893,882 records of event codes and measured parameters corresponding to 9,358 distinct cases. The ground truth of the testing dataset involved 174 problem cases, with the remaining 9,184 being nuisance cases.
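The record structure described above can be sketched as follows; the tuple layout, field names and values are assumptions for illustration, not the Challenge's actual file format.

```python
from collections import defaultdict

# Hypothetical record layout: each record pairs a case identifier with one
# event code and a snapshot of the 30 onboard parameter measurements.
def group_records_by_case(records):
    """records: iterable of (case_id, event_code, params) tuples."""
    cases = defaultdict(list)
    for case_id, event_code, params in records:
        cases[case_id].append((event_code, params))
    return cases

# Made-up example data: two cases, three records.
records = [
    ("C1", "E100", [0.1] * 30),
    ("C1", "E205", [0.2] * 30),
    ("C2", "E100", [0.3] * 30),
]
cases = group_records_by_case(records)
```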
A recommender should identify the problem cases among the 9,358 testing cases and, for each one of them, provide the respective problem identifier.
The evaluation of the proposed algorithms was carried out on a set of cases that involved all 174 ground truth problem cases and a random selection of 174 from the total of 9,184 nuisance cases. The evaluation was based on a score calculated by subtracting from the number of outputs the number of wrongly identified problems as well as the number of nuisance cases. In particular, the ranking of the submitted recommenders was carried out using the following formula:

Score = #Outputs − #Wrong problem outputs − #Nuisance outputs    (1)

Note that the maximum number of outputs that a recommender could produce is 348, i.e. the sum of the 174 ground truth problem cases and the 174 nuisance cases. Therefore, a recommender should on the one hand identify the correct problem types of the test cases and on the other hand discard the nuisance cases.
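A minimal sketch of the scoring rule, assuming Eq. (1) follows the prose description above (outputs minus wrongly identified problems minus nuisance cases that received a problem label):

```python
def challenge_score(n_outputs, n_wrong_problems, n_nuisance_outputs):
    """Eq. (1): number of outputs, minus wrongly identified problems,
    minus nuisance cases that were given a problem label."""
    return n_outputs - n_wrong_problems - n_nuisance_outputs

# A recommender that labels all 348 evaluation cases, identifies every
# problem type correctly, but cannot reject the 174 nuisance cases:
score = challenge_score(348, 0, 174)
```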
Let us now introduce some notation that will assist us in the mathematical definition of the problem. We denote the cases by C_i, where the index i indicates the unique identifier of each case. For each case C_i, the corresponding events are denoted by E_j^(C_i), where j ranges from 1 up to the number of events of the case.

THE BAYESIAN RECOMMENDER
Our approach is based on Bayesian classification: we define distinct problem classes, one for each problem type, calculate the posterior probability of a test case for each problem class, and subsequently recommend the problem class that corresponds to the maximum a posteriori (MAP) probability.
Bayesian inference provides a rigorous mathematical framework for updating our belief for the occurrence of an unknown random variable when new information on dependent random events or variables becomes available.
In particular, the Bayes rule relates the prior probability of a random event with the likelihood obtained from experimental evidence in order to determine the posterior probability about the random event after observing the experimental results.We refer the reader to Koller & Friedman (2009) for an extensive study of the theoretical foundations of Bayesian techniques and inference on Probabilistic Graphical Models.
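As a toy illustration of the Bayes rule update described above (all numbers are invented for illustration):

```python
# Two problem classes and one observed event code.
prior = {"P1": 0.7, "P2": 0.3}          # Pr[class]
likelihood = {"P1": 0.2, "P2": 0.8}     # Pr[event | class]

evidence = sum(likelihood[c] * prior[c] for c in prior)  # Pr[event]
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}
# Observing an event that is much more likely under P2 shifts the
# belief from the prior favourite P1 toward P2.
```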
It must be noted that we treat only the problem of assigning a test case to a problem type, without dealing with nuisance cases; to put it another way, our recommender never assigns the nuisance label P_0 to any of the test cases.
Each case can be represented by the set of distinct event codes that it includes, and it is also assigned to a problem type/class. We therefore calculate from the training dataset the number of occurrences of each event code given the respective problem type. Let us define the matrix N_EP, where N is the number of distinct event codes and each column j corresponds to one of the K problem types. Normalizing each element by the sum of its column, one obtains an estimate for the conditional probability of observing a case that includes an event code E_i given that this case has been classified in the problem type P_j, i.e.

p_ij = Pr[E_i | P_j] = n_ij / Σ_{i=1}^{N} n_ij,    i = 1, ..., N,  j = 1, ..., K    (5)

In order to avoid zero values for the p_ij, we replace zeros with a small positive number ε, with a typical value of 10^-10. In addition, one may estimate from the training data the prior probability of each problem type by using the following expression

Pr[P_j] = (number of training cases of type P_j) / (total number of problem cases),    j = 1, ..., K    (6)

Now, given a test case C_k, the Bayes rule gives the posterior probability of each problem type as

Pr[P_j | C_k] ∝ Pr[C_k | P_j] Pr[P_j]    (9)

Since each case consists of a number of event codes, assuming independence among the event codes we may approximate the first term on the right hand side of Eq. (9) by

Pr[C_k | P_j] ≈ Π_{m=1}^{M_k} Pr[E_m | P_j]    (10)

where M_k is the number of event codes of the test case C_k.

In order to avoid numerical overflow, we replace the products of probabilities of Eqs. (9) and (10) with the sums of the respective logarithms, as follows

log Pr[C_k | P_j] ≈ Σ_{m=1}^{M_k} log Pr[E_m | P_j]    (11)

P* = argmax_j { log Pr[P_j] + Σ_{m=1}^{M_k} log Pr[E_m | P_j] }    (12)

Therefore, by using Eqs. (5) and (6) in (11) and (12), we find the problem type that gives the MAP probability for the test case C_k under consideration.
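The training and MAP classification steps above can be sketched as follows. This is a minimal illustration with per-case event lists and made-up event codes, not the authors' implementation:

```python
import math
from collections import Counter, defaultdict

EPS = 1e-10  # stands in for zero conditional probabilities, as in the paper

def train(cases):
    """cases: list of (event_code_list, problem_type) pairs for problem
    cases only (the nuisance class is ignored).
    Returns the priors Pr[P_j] and the conditionals p_ij = Pr[E_i | P_j]."""
    class_counts = Counter(ptype for _, ptype in cases)
    priors = {p: n / len(cases) for p, n in class_counts.items()}
    event_counts = defaultdict(Counter)
    for events, ptype in cases:
        event_counts[ptype].update(events)
    cond = {}
    for ptype, counts in event_counts.items():
        total = sum(counts.values())
        cond[ptype] = {e: n / total for e, n in counts.items()}
    return priors, cond

def map_classify(events, priors, cond):
    """Pick the problem type maximizing
    log Pr[P_j] + sum_m log Pr[E_m | P_j] (the MAP rule)."""
    def log_post(ptype):
        return math.log(priors[ptype]) + sum(
            math.log(cond[ptype].get(e, EPS)) for e in events)
    return max(priors, key=log_post)

# Toy training set: event codes and problem types are made up.
priors, cond = train([(["E1", "E2"], "P1"),
                      (["E1", "E2"], "P1"),
                      (["E3"], "P2")])
```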

EXPERIMENTAL RESULTS AND EVALUATION
The approach presented in this paper focuses on the task of identifying the correct problem type of a case, without considering whether the case is a nuisance. In this section we present the experimental results and discuss the performance of the suggested approach. Initially we tested the approach on the training dataset, since the ground truth for the test dataset was not available. We relaxed the requirement for the algorithm to suggest only one problem type and examined the sorted list of problem types in descending order of their MAP probabilities. In the remainder, we refer to the top-k choice when the correct problem type is found within the first k problem types of the recommender's sorted list. The overall performance of the recommender for the top-1 choice of the problem type was 86.59%, for the top-3 choice 95.12%, and for the top-5 choice 96.95%. The detailed results of the recommender for the training dataset are shown in Table 2. The suggested algorithm was evaluated on the test set of the PHM Data Challenge 2013 and received an overall score of 60 (http://www.phmsociety.org/events/conference/phm/13/challenge).
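The top-k evaluation used here can be sketched as follows (toy data; the ranking format is an assumption):

```python
def top_k_hit(ranked_types, true_type, k):
    """ranked_types: problem types sorted by descending MAP probability."""
    return true_type in ranked_types[:k]

def top_k_accuracy(ranked_lists, true_types, k):
    """Fraction of cases whose true type appears in the first k choices."""
    hits = sum(top_k_hit(r, t, k) for r, t in zip(ranked_lists, true_types))
    return hits / len(true_types)

# Toy example: the correct type "P1" is ranked second,
# so it counts as a top-3 hit but not a top-1 hit.
ranked = [["P2", "P1", "P4"]]
acc1 = top_k_accuracy(ranked, ["P1"], 1)  # 0.0
acc3 = top_k_accuracy(ranked, ["P1"], 3)  # 1.0
```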
Note that our algorithm does not take into consideration the task of minimizing the number of nuisance cases (Type II errors); as a result, in our score the term #Nuisance outputs in Eq. (1) is equal to 174, i.e. the total number of nuisance cases of the test set. The detailed results of the suggested algorithm are presented in Table 3; the percentages of the aggregate score are reported in the bottom line of Table 3.

In addition, if the event codes for a problem type are not balanced between the training and test cases, then in the likelihood term of Eq. (12) there will be more event codes with conditional probabilities close to zero, which makes it very unlikely for the specific problem type to be the MAP estimate of the recommender. Another factor that needs further experimentation is the underlying assumption that the event codes are independent, which in general may not be the case. To circumvent this problem, one would need to consider statistics for detecting correlations of event codes for each problem type category. This is of course an even harder problem, considering the small amount of available training data.
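One simple co-occurrence statistic of the kind hinted at above can be sketched as follows (event codes are made up; a full treatment would still need significance testing on the small training set):

```python
from collections import Counter
from itertools import combinations

def pair_counts(case_event_sets):
    """case_event_sets: event-code sets of the cases of one problem type.
    Counts how often each unordered pair of event codes co-occurs
    within a case; frequent pairs hint at correlated event codes."""
    pairs = Counter()
    for events in case_event_sets:
        pairs.update(combinations(sorted(events), 2))
    return pairs

# Made-up cases of one problem type: ("E1", "E2") co-occurs in both.
counts = pair_counts([{"E1", "E2"}, {"E1", "E2", "E3"}])
```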
Further work is needed in the direction of defining adequate performance metrics, depending on the nature of the PHM problem at hand (Zhou et al., 2013). Metrics may take into account, in different ways, performance issues such as false positives and false negatives in detection tasks, precision and recall in diagnosis tasks, as well as accuracy and confidence in prognostics. Adjusting the adopted metrics to the priorities of the problem at hand, for example by making it important to avoid false positives or false negatives, may also lead to the adoption of a vector of multi-objective criteria, rather than a single scalar performance criterion.

CONCLUSION
A Bayesian method for maintenance action recommendation that was submitted to the PHM Data Challenge 2013 was presented. The method is characterized by its simplicity and the low computational resources it requires. It incorporates prior information about the event codes of the targeted problem types, and this prior information can be further improved after each test case is classified into a problem type. However, the method requires a balanced set of event codes for each problem type between training and test cases. Future work will try to circumvent this drawback, as well as to address the task of detecting nuisance cases. A hybrid recommender that fuses the ability to detect nuisance cases with that of identifying the most likely problem type could lead to improved performance.
As mentioned above, the event codes can take values from a finite set that captures all possible conditions met onboard, with N denoting the cardinality of the event code set; the dimension of the parameter space equals the number of onboard measurements. It must also be noted that some of the parameter measurements might be missing, due to possible failure of the equipment or other unknown factors. We denote the set of possible problem types that one may encounter by P_0, P_1, ..., P_K, with the problem type P_0 being the label/identifier used for the nuisance cases. We define the training dataset of cases-event codes-parameters by D_tr, which consists of tuples of the form (C_i^tr, E_j, m), where m is the vector of parameter measurements and the tr superscript in C_i^tr denotes that the case belongs to the training dataset. The training dataset of cases-problem types is denoted analogously. In a similar manner we define the testing dataset by D_te. The aim of a recommender is to produce the set U_te, which includes pairs of testing cases and problem types of the form (C_i^te, P_{C_i^te}), where C_i^te is the i-th test case and P_{C_i^te} the recommended problem type; a test case C_i^te may be assigned any problem type label from the set of problem types. There are two types of errors of a recommender: (a) Type I errors concern cases that, although correctly identified as corresponding to a problem, have been assigned a wrong problem type; (b) Type II errors concern cases which do not correspond to any problem but have been assigned a problem type, i.e. nuisance cases.

Note that we ignore the nuisance class; in practice, we use as N not the total number of event codes that one may observe in the training set, but restrict N to the number of event codes of only the training cases that have been classified as problems. The elements n_ij of the matrix N_EP represent the count of how many times in the training dataset one encounters a record of a case that is classified to problem type j and at the same time involves the event code i. Summing over each column of the matrix gives the total number of records of the training cases that have been assigned to the corresponding problem type.
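The counting described here can be sketched as follows, assuming the training records are available as (event code, problem type) pairs drawn only from cases labelled as problems:

```python
from collections import defaultdict

def build_count_matrix(problem_records):
    """problem_records: (event_code, problem_type) pairs from training
    cases labelled as problems. Returns n[e][p], the element n_ij:
    how often event code e appears in records of problem type p."""
    n = defaultdict(lambda: defaultdict(int))
    for event, ptype in problem_records:
        n[event][ptype] += 1
    return n

def column_totals(n):
    """Column sums: total number of records per problem type."""
    totals = defaultdict(int)
    for event in n:
        for ptype, count in n[event].items():
            totals[ptype] += count
    return totals

# Made-up records for illustration.
n = build_count_matrix([("E1", "P1"), ("E1", "P1"), ("E2", "P2")])
```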

Table 1. Distribution of cases in problem type categories for the training and test datasets.

Table 2. Number of training cases correctly classified for each problem type.

Table 3. Number of correctly classified test cases in the top-k choice for each problem type. The bottom line reports the aggregate score percentages: 38.22 (top-1), 52.87 (top-2), 59.24 (top-3), 72.61 (top-4), 80.25 (top-5).

For some problem types the accuracy starts at 86% and goes up to 100% for the top-5 choice. This is mainly due to insufficient training data for calculating the conditional probabilities of the event codes of the Bayesian model in a reliable way. For example, in the case of problem type P9965 there are only two training cases, whereas for problem type P7695 there are 17 instances in the training dataset.