An Improved Evidential-IOWA Sensor Data Fusion Approach in Fault Diagnosis

As an important tool of information fusion, Dempster–Shafer evidence theory is widely applied in handling uncertain information in fault diagnosis. However, an incorrect result may be obtained if the combined evidence is highly conflicting, which may lead to failure in locating the fault. To deal with this problem, an improved evidential-Induced Ordered Weighted Averaging (IOWA) sensor data fusion approach is proposed in the frame of Dempster–Shafer evidence theory. In the new method, the IOWA operator is used to determine the weights of the different sensor data sources, and both the distance of evidence and the belief entropy are taken into consideration when determining the parameter of the IOWA operator. First, the α value of IOWA is obtained from the global distance of evidence and the global belief entropy. A weight vector is then derived from the maximum entropy method model. Finally, according to the IOWA operator, the evidence is modified before applying Dempster’s combination rule. The proposed method performs better in conflict management and fault diagnosis because the information volume of each piece of evidence is taken into consideration. A numerical example and a case study in fault diagnosis are presented to show the rationality and efficiency of the proposed method.


Introduction
The structure of modern engineering systems is becoming more and more complex [1,2], and maintaining the safety of these systems is a critical problem. Various types of faults may occur because of long periods of continuous operation and changing environmental factors, which may pose great threats to human life [3][4][5][6]. Therefore, fault diagnosis plays an important role in real applications [7][8][9][10]. In practical applications, multi-sensor systems are widely used in fault diagnosis to make a comprehensive judgment [11][12][13]. For example, fault detection and isolation have been successfully used on the well-known Airbus aircraft [14,15], where they play a key role in ensuring the safety of the aircraft [16,17]. However, the information obtained from a multi-sensor system may be heterogeneous and imprecise [18]. Therefore, it is essential that the uncertain information is pre-processed before data fusion and decision-making [19,20].
In the frame of Dempster-Shafer evidence theory, one way to deal with conflicting data fusion is to modify the conventional combination rule. Yager modifies Dempster's combination rule by redistributing the conflicting evidence [53]. However, this method may destroy the good properties of Dempster's combination rule, such as commutativity and associativity. In addition, it is unreasonable to blame the combination rule if the incorrect results are caused by sensor failure. Another typical approach is to modify the evidence before applying Dempster's combination rule. Murphy's method averages the evidence, which does not consider the differences among the pieces of evidence [54]. In Deng et al.'s method [55], the distance of evidence is used to obtain the weights, which remedies the disadvantage of Murphy's method to a certain extent.
In this paper, an improved evidential-Induced Ordered Weighted Averaging (IOWA) sensor data fusion method is proposed for multi-sensor data fusion in fault diagnosis. Firstly, according to the global distance of evidence $d_g$ and the global belief entropy $E_d^g$, the α value of the maximum entropy method (MEM) is established. Namely, the α value is jointly determined by $d_g$ and $E_d^g$. Secondly, a weight vector $W = (w_1, w_2, \cdots, w_n)^T$ is generated based on the MEM model. After that, the evidence is modified by the new IOWA-based weight factor. Finally, the obtained evidence is combined $(n - 1)$ times with Dempster's combination rule. A numerical example and a case study on fault diagnosis verify the validity and reasonability of the proposed method.
The rest of this paper is organized as follows. The preliminaries are introduced in Section 2. In Section 3, a new evidential-IOWA sensor data fusion method is proposed. The application of the new method is presented in Section 4. Conclusions are given in Section 5.

Dempster-Shafer Evidence Theory
Dempster-Shafer evidence theory, which was introduced by Dempster and then developed by Shafer, is usually applied to manage conflicting evidence [56,57].
Let Θ be the frame of discernment, defined as $\Theta = \{\theta_1, \theta_2, \cdots, \theta_n\}$. A basic probability assignment (BPA) $m: 2^{\Theta} \to [0, 1]$ is defined as follows [25,26]:
$$m(\emptyset) = 0, \qquad \sum_{A \subseteq \Theta} m(A) = 1. \tag{1}$$
When $m(A) > 0$, $A$ is called a focal element. Suppose $m_1$ and $m_2$ are two BPAs on the frame of discernment Θ. Dempster's combination rule is defined as follows [25]:
$$m(A) = \begin{cases} \dfrac{1}{1-k} \displaystyle\sum_{B \cap C = A} m_1(B)\, m_2(C), & A \neq \emptyset, \\ 0, & A = \emptyset, \end{cases} \tag{2}$$
where $k = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)$ is regarded as a measure of conflict between $m_1$ and $m_2$. The larger the value of $k$, the larger the degree of conflict.
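As a minimal sketch of Dempster's combination rule, the following Python snippet represents a BPA as a dictionary from `frozenset` focal elements to masses. The function name `dempster_combine` and the two example BPAs are illustrative assumptions and are not taken from the paper.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two BPAs (dict: frozenset -> mass) with Dempster's rule.

    Returns the fused BPA and the conflict coefficient k.
    """
    combined = {}
    conflict = 0.0  # k: mass that falls on empty intersections
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("total conflict (k = 1): the rule is undefined")
    # Normalise by 1 - k so that the fused masses sum to one
    fused = {a: v / (1.0 - conflict) for a, v in combined.items()}
    return fused, conflict


# Hypothetical BPAs on the frame {A, B, C}
m1 = {frozenset("A"): 0.6, frozenset("B"): 0.3, frozenset("ABC"): 0.1}
m2 = {frozenset("A"): 0.5, frozenset("C"): 0.2, frozenset("ABC"): 0.3}
fused, k = dempster_combine(m1, m2)
```

For these assumed inputs the conflicting mass is k = 0.33, and the remaining mass 0.67 is renormalised over the non-empty intersections.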

Jousselme Distance
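The Jousselme distance measures the difference (or the lack of similarity) between any two BPAs. It is introduced as follows.
Let $m_1$ and $m_2$ be two BPAs on the frame of discernment Θ. The distance between $m_1$ and $m_2$ is [58]:
$$d(m_1, m_2) = \sqrt{\frac{1}{2} (m_1 - m_2)^T D\, (m_1 - m_2)}, \tag{3}$$
where $m_1$ and $m_2$ are treated as vectors indexed by the subsets of Θ, and $D$ is a $2^{|\Theta|} \times 2^{|\Theta|}$ matrix whose elements are
$$D(A, B) = \frac{|A \cap B|}{|A \cup B|}, \quad A, B \subseteq \Theta. \tag{4}$$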
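The Jousselme distance can be sketched in a few lines of Python. This sketch assumes BPAs stored as dictionaries keyed by non-empty `frozenset` focal elements, and it only iterates over subsets that carry mass in at least one BPA, since all other entries of the difference vector are zero. The function name is an illustrative choice.

```python
import math


def jousselme_distance(m1, m2):
    """Jousselme distance between two BPAs (dict: frozenset -> mass)."""
    focal = set(m1) | set(m2)
    diff = {a: m1.get(a, 0.0) - m2.get(a, 0.0) for a in focal}
    total = 0.0
    for a in focal:
        for b in focal:
            # D(A, B) = |A ∩ B| / |A ∪ B| (Jaccard index of the two subsets)
            total += diff[a] * diff[b] * len(a & b) / len(a | b)
    # Guard against tiny negative values caused by rounding
    return math.sqrt(max(total, 0.0) / 2.0)


# Hypothetical examples
d_disjoint = jousselme_distance({frozenset("A"): 1.0}, {frozenset("B"): 1.0})
d_nested = jousselme_distance({frozenset("A"): 1.0}, {frozenset("AB"): 1.0})
```

Two BPAs committed to disjoint singletons are at distance 1, while a singleton and a superset containing it are closer because the matrix $D$ credits their overlap.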

Belief Entropy
Deng entropy is a generalization of Shannon entropy [59], which is defined as follows [60]:
$$E_d = -\sum_i m(B_i) \log_2 \frac{m(B_i)}{2^{|B_i|} - 1}, \tag{5}$$
where $B_i$ is a proposition in the BPA $m$, and $|B_i|$ is the cardinality of $B_i$. Deng entropy degenerates to Shannon entropy when belief is assigned only to single elements. Namely,
$$E_d = -\sum_i m(B_i) \log_2 \frac{m(B_i)}{2^{|B_i|} - 1} = -\sum_i m(B_i) \log_2 m(B_i), \tag{6}$$
and, for the frame of discernment $X = \{a, b, c\}$, the mass function $m_1(A) = \dfrac{2^{|A|} - 1}{\sum_{B \subseteq X} (2^{|B|} - 1)}$, $A, B \subseteq X$, has the maximum Deng entropy, and its uncertainty can also be calculated by $\log_2 \sum_{B \subseteq X} (2^{|B|} - 1)$.
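A short Python sketch of Deng entropy and of its maximum on a frame of a given size is shown here. The function names and the example BPAs are assumptions made for illustration. The maximum is computed by grouping the non-empty subsets by cardinality.

```python
import math


def deng_entropy(m):
    """Deng entropy of a BPA given as dict: frozenset -> mass."""
    return -sum(v * math.log2(v / (2 ** len(a) - 1))
                for a, v in m.items() if v > 0)


def max_deng_entropy(frame_size):
    """Maximum Deng entropy: log2 of the sum of (2^|B| - 1) over subsets B."""
    total = sum(math.comb(frame_size, k) * (2 ** k - 1)
                for k in range(1, frame_size + 1))
    return math.log2(total)


# Belief on singletons only: Deng entropy equals Shannon entropy (1 bit here)
e_bayes = deng_entropy({frozenset("a"): 0.5, frozenset("b"): 0.5})
# All belief on a two-element set: log2(2^2 - 1) = log2(3)
e_set = deng_entropy({frozenset("ab"): 1.0})
# Frame with three elements: log2(3*1 + 3*3 + 1*7) = log2(19)
e_max = max_deng_entropy(3)
```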

IOWA Operator
The Induced Ordered Weighted Averaging (IOWA) operator [61], which is introduced by Yager and Filev, is a more general type of the Ordered Weighted Averaging (OWA) operator. An important feature of this operator is that the ordering of the arguments is induced by another variable called the order inducing variable.
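Assume there are $n$ two-tuple OWA pairs $\langle u_i, a_i \rangle$, $i = 1, \cdots, n$, with an associated weight vector $W = (w_1, w_2, \cdots, w_n)^T$ of dimension $n$ having the following properties:
$$w_j \in [0, 1], \qquad \sum_{j=1}^{n} w_j = 1. \tag{7}$$
Then, the IOWA operator is defined as follows [61]:
$$\mathrm{IOWA}(\langle u_1, a_1 \rangle, \cdots, \langle u_n, a_n \rangle) = \sum_{j=1}^{n} w_j b_j, \tag{8}$$
where $b_j$ is the $a_i$ of the OWA pair having the $j$th largest $u_i$. Here $u_i$ is referred to as the order inducing variable and $a_i$ as the argument variable. The orness associated with the weight vector $W = (w_1, w_2, \cdots, w_n)^T$ is defined as follows:
$$\mathrm{orness}(W) = \frac{1}{n-1} \sum_{j=1}^{n} (n - j)\, w_j, \tag{9}$$
where $0 \le \mathrm{orness}(W) \le 1$.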
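The IOWA operator and the orness measure can be illustrated with scalar arguments as follows. This is a sketch with hypothetical pairs and weights. The only non-obvious step is that the arguments are reordered by decreasing inducing variable before the weights are applied.

```python
def iowa(pairs, weights):
    """IOWA aggregation of (u_i, a_i) pairs with weight vector W.

    The arguments a_i are sorted by decreasing inducing variable u_i,
    and the j-th weight multiplies the argument with the j-th largest u_i.
    """
    ordered = sorted(pairs, key=lambda p: p[0], reverse=True)
    return sum(w * a for w, (_, a) in zip(weights, ordered))


def orness(weights):
    """Degree of orness of a weight vector of length n >= 2."""
    n = len(weights)
    return sum((n - j) * w for j, w in enumerate(weights, start=1)) / (n - 1)


# Hypothetical OWA pairs (u_i, a_i) and weight vector
pairs = [(0.2, 10.0), (0.9, 30.0), (0.5, 20.0)]
weights = [0.5, 0.3, 0.2]
aggregated = iowa(pairs, weights)   # 0.5*30 + 0.3*20 + 0.2*10
alpha = orness(weights)             # (2*0.5 + 1*0.3 + 0*0.2) / 2
```

The two boundary cases mentioned in the text are easy to check: the uniform vector has orness 0.5 and $W = (1, 0, \cdots, 0)^T$ has orness 1.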

Maximum Entropy Method
To apply the IOWA operator in fault diagnosis, a crucial issue is to determine its weights. In the MEM model, which was presented by O'Hagan, the weight problem is formulated as a constrained nonlinear optimization model. The weights are obtained from the following optimization model [62]:
$$\begin{aligned} \text{Maximize} \quad & -\sum_{i=1}^{n} w_i \ln w_i \\ \text{subject to} \quad & \frac{1}{n-1} \sum_{i=1}^{n} (n - i)\, w_i = \alpha, \quad 0 \le \alpha \le 1, \\ & \sum_{i=1}^{n} w_i = 1, \quad w_i \in [0, 1], \; i = 1, \cdots, n. \end{aligned} \tag{10}$$
Suppose $n = 5$ and the weights satisfy different degrees of orness, $\alpha = 0, 0.1, \ldots, 1$. The weight vectors determined by the MEM model are shown in Figure 1. From Figure 1, we can conclude that the closer the value of α is to 0.5, the closer the weight vector is to the average value $W = (1/n, 1/n, \cdots, 1/n)^T$, and the closer the value of α is to 1, the closer the weight vector is to $W = (1, 0, \cdots, 0)^T$. Namely, the smaller the credibility gap among BPAs, the more average the weight distribution.
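One way to solve the MEM model numerically is sketched here. It relies on the Lagrangian stationarity condition of the optimization problem, which gives weights of the form $w_j \propto \exp(\beta (n - j)/(n - 1))$, and finds the parameter β by bisection so that the orness equals α. The bisection scheme, the function names and the tolerance-free iteration count are assumptions of this sketch, and the paper does not state how the authors solved the model.

```python
import math


def orness(weights):
    n = len(weights)
    return sum((n - j) * w for j, w in enumerate(weights, start=1)) / (n - 1)


def mem_weights(n, alpha):
    """Maximum-entropy OWA weights with a prescribed orness alpha (n >= 2)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if alpha == 1.0:
        return [1.0] + [0.0] * (n - 1)
    if alpha == 0.0:
        return [0.0] * (n - 1) + [1.0]

    def weights_for(beta):
        # w_j proportional to exp(beta * (n - j) / (n - 1)); shift for stability
        exps = [beta * (n - j) / (n - 1) for j in range(1, n + 1)]
        top = max(exps)
        raw = [math.exp(e - top) for e in exps]
        total = sum(raw)
        return [r / total for r in raw]

    # Orness increases monotonically with beta, so bracket and bisect
    lo, hi = -1.0, 1.0
    while orness(weights_for(lo)) > alpha:
        lo *= 2.0
    while orness(weights_for(hi)) < alpha:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if orness(weights_for(mid)) < alpha:
            lo = mid
        else:
            hi = mid
    return weights_for((lo + hi) / 2.0)


w_uniform = mem_weights(5, 0.5)   # approximately (0.2, 0.2, 0.2, 0.2, 0.2)
w_skewed = mem_weights(5, 0.8)    # decreasing weights with orness 0.8
```

With n = 5, the sketch reproduces the qualitative behaviour described for Figure 1: the weights are uniform at α = 0.5 and concentrate on the first position as α approaches 1.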

The Evidential IOWA-Based Fault Diagnosis Method
As shown in Figure 2, the first step of the fault diagnosis technique is typically to collect information from actuators. Secondly, all hypotheses are modelled (by BPAs in the frame of Dempster-Shafer evidence theory). Thirdly, the evidence is modified according to the IOWA operator. Finally, data fusion is applied for fault diagnosis and decision-making. Here, obtaining an appropriate weight to modify the evidence is very important for locating the possible fault accurately. In the proposed method, the MEM model, based on the distance of evidence and the belief entropy, is used to generate the appropriate weights of the evidence.

The Evidential-IOWA Parameter
Recently, the IOWA operator has attracted the attention of scholars and is widely used in real applications [63][64][65]. However, there are some problems in using the IOWA operator. For example, the α value of the constraint condition usually depends on the experience of experts, which does not lead to an objective result. In this paper, the α value is derived objectively from the distance of evidence and the belief entropy.

Definition of α in IOWA
The distance of evidence and the belief entropy are jointly considered to determine the α value. Equation (11) defines α in terms of $\alpha_1$ and $\alpha_2$, where $d_g$ is the global distance of evidence, $E_d^g$ is the global belief entropy, and $0 \le d_g \le 1$, $0 \le E_d^g \le 1$, $0.5 \le \alpha \le 1$. $\alpha_1$ is a data-driven value based on the distance of evidence, and $\alpha_2$ is another data-driven value based on the belief entropy.

Definition of α 1 Based on the Distance of Evidence
Assume that there are $n$ pieces of evidence for fault diagnosis. The Jousselme distance $d_{ij}$, $i, j = 1, 2, \cdots, n$, between two pieces of evidence $m_i$ and $m_j$ can be calculated according to Equation (3), and the distance matrix (DM) is defined as follows:
$$DM = \begin{bmatrix} 0 & d_{12} & \cdots & d_{1n} \\ d_{21} & 0 & \cdots & d_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ d_{n1} & d_{n2} & \cdots & 0 \end{bmatrix}. \tag{12}$$
The average distance of the evidence $m_i$, $i = 1, 2, \cdots, n$, with respect to the other pieces of evidence, denoted as $d_i$, is defined as follows:
$$d_i = \frac{1}{n-1} \sum_{j=1,\, j \neq i}^{n} d_{ij}. \tag{13}$$
Then, the global distance of evidence among all the pieces of evidence, $d_g$, is defined as follows:
$$d_g = \frac{1}{n} \sum_{i=1}^{n} d_i. \tag{14}$$
The larger the global distance of evidence $d_g$, the smaller the global similarity degree among the diagnosed results, and the smaller the credibility degree of each sensor. In other words, the weight gap among the BPAs is smaller and the weight distribution is more average, which means that the value of α is closer to 0.5. If $d_g = 1$, the fault types diagnosed by the sensors are entirely different; in this case, all pieces of evidence have the same credibility degree. Thus, the evidence should be assigned the same weight; namely, the weight vector is $W = (1/n, 1/n, \cdots, 1/n)^T$ and $\alpha = 0.5$.
Conversely, the smaller the value of $d_g$, the greater the global similarity degree of the diagnosed results, so the BPAs can be represented approximately by fewer BPAs, or even one BPA, with a high credibility degree. That is to say, a BPA with a high credibility degree is given a greater weight and a BPA with a low credibility degree is given a smaller weight. Thus, the smaller the value of $d_g$, the more unequal the weight distribution, which means that the value of α is closer to 1. If $d_g = 0$, the diagnosed results are the same, so the BPAs can be represented by any one of them. Considering the consistency of the algorithm, the initial weight is assigned as $W = (1, 0, \cdots, 0)^T$ and $\alpha = 1$.
Based on the above analysis, Equation (15) defines the degree of orness $\alpha_1$ as a decreasing function of the global distance of evidence $d_g$, with $\alpha_1 = 1$ when $d_g = 0$ and $\alpha_1 = 0.5$ when $d_g = 1$, where $0 \le d_g \le 1$ and $0.5 \le \alpha_1 \le 1$.
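The average and global distances of evidence can be sketched as follows, starting from a precomputed symmetric distance matrix. The averaging over the other n − 1 pieces of evidence and over all n average distances is this sketch's reading of the definitions in the text, and the matrix values are hypothetical.

```python
def average_distances(dm):
    """Average distance d_i of each piece of evidence to the other n - 1."""
    n = len(dm)
    return [sum(dm[i][j] for j in range(n) if j != i) / (n - 1)
            for i in range(n)]


def global_distance(dm):
    """Global distance d_g: mean of the average distances d_i."""
    d = average_distances(dm)
    return sum(d) / len(d)


# Hypothetical distance matrix for three pieces of evidence;
# the second one is far from the other two
dm = [
    [0.0, 0.8, 0.2],
    [0.8, 0.0, 0.7],
    [0.2, 0.7, 0.0],
]
d = average_distances(dm)   # approximately [0.5, 0.75, 0.45]
d_g = global_distance(dm)   # approximately 0.5667
```

In this assumed example the second piece of evidence has the largest average distance, which is the kind of signal that flags a possibly abnormal sensor report.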

Definition of α 2 Based on the Belief Entropy
Deng entropy is an efficient tool to measure uncertainty, not only under the situation where the uncertainty is represented by a probability distribution, but also under the situation where the uncertainty is represented by the BPAs. Thus, this entropy is used to determine the α value.
The global belief entropy $E_d^g$ is defined as follows:
$$E_d^g = \frac{1}{n} \sum_{i=1}^{n} \frac{E_{di}}{(E_d)_{\max}}, \tag{16}$$
where $E_{di}$ is the belief entropy of the evidence $m_i$, and $(E_d)_{\max}$ is the maximum belief entropy on the frame of discernment $X$, which is defined as:
$$(E_d)_{\max} = \log_2 \sum_{B \subseteq X} (2^{|B|} - 1). \tag{17}$$
The greater the global belief entropy $E_d^g$, the greater the global uncertainty of the diagnosed faults. Therefore, the weight distribution should be more average, and the α value is closer to 0.5. If $E_d^g = 1$, the diagnosed faults are entirely uncertain, so all pieces of evidence should be assigned the same weight, that is, $\alpha = 0.5$.
The smaller the global belief entropy $E_d^g$, the smaller the global uncertainty of the diagnosed faults. Then, the BPAs can be represented approximately by a few BPAs, or even one BPA, of relatively small uncertainty. Therefore, the smaller the value of $E_d^g$, the more unequal the weight distribution and the closer α is to 1. If $E_d^g = 0$, the BPAs can be represented by any one of them; that is to say, the weight vector is $W = (1, 0, \cdots, 0)^T$ and $\alpha = 1$.
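Based on the above analysis, Equation (18) defines the degree of orness $\alpha_2$ as a decreasing function of the global belief entropy $E_d^g$, with $\alpha_2 = 1$ when $E_d^g = 0$ and $\alpha_2 = 0.5$ when $E_d^g = 1$, where $0 \le E_d^g \le 1$ and $0.5 \le \alpha_2 \le 1$.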
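The global belief entropy can be checked numerically with a short sketch. It normalises each belief entropy by the maximum Deng entropy of the frame and averages the results. The function names are illustrative. The input values are the five belief entropies reported for the numerical example with three fault types, and the sketch reproduces the reported global belief entropy of 0.3534.

```python
import math


def max_deng_entropy(frame_size):
    """log2 of the sum of (2^|B| - 1) over all non-empty subsets B."""
    total = sum(math.comb(frame_size, k) * (2 ** k - 1)
                for k in range(1, frame_size + 1))
    return math.log2(total)


def global_belief_entropy(entropies, frame_size):
    """Mean of the belief entropies, each normalised by the maximum."""
    e_max = max_deng_entropy(frame_size)
    return sum(e / e_max for e in entropies) / len(entropies)


# Belief entropies of the five sensor reports in the numerical example
entropies = [1.5664, 0.4690, 1.8092, 1.8914, 1.7710]
e_g = global_belief_entropy(entropies, frame_size=3)
print(round(e_g, 4))  # 0.3534
```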

The Weight Vector of IOWA
After obtaining the parameter α, the weight vector $W = (w_1, w_2, \cdots, w_n)^T$ can be obtained according to the MEM model. Assume that there are $n$ BPAs $m_i$, $i = 1, 2, \cdots, n$. The weight vector $W = (w_1, w_2, \cdots, w_n)^T$ can be calculated according to the following steps:
Step 1 According to Equations (14) and (15), the global distance of evidence $d_g$ and the $\alpha_1$ value can be calculated, respectively.
Step 2 The global belief entropy $E_d^g$ and the $\alpha_2$ value are obtained by Equations (16) and (18), respectively.
Step 3 The α value and the weight vector $W$ are calculated based on Equations (11) and (10), respectively.

Multi-Evidential Fusion Model
After getting an appropriate weight vector, the evidence can be modified before using Dempster's combination rule. The pieces of evidence are reordered according to the IOWA operator. Assume there are $n$ BPAs, denoted as $m_i$, $i = 1, 2, \cdots, n$. The steps of ordering and evidence fusion are defined as follows:
Step 1 Construct the inducing variable $S_i$ (Equation (19)) from the average distance of evidence $d_i$ obtained by Equation (13).
Step 2 Obtain the OWA pairs $\langle S_i, M_i \rangle$, $i = 1, 2, \cdots, n$, where $M_i$ is the argument variable, namely, the BPA of the evidence $m_i$.
Step 3 According to Equation (8), calculate the weighted average evidence.
Step 4 Combine the new evidence with Dempster's combination rule $(n - 1)$ times.
With the fusion results, decisions can be made based on the maximum principle of BPAs. An illustrative explanation of the new method is presented in Figure 3. Firstly, the degree of orness α is computed based on the distance of evidence and the belief entropy. Secondly, the weight vector $W = (w_1, w_2, \cdots, w_n)^T$ is obtained based on the MEM model. Thirdly, the corresponding inducing variable is constructed. Fourthly, the evidence is modified and fused. Finally, decision-making in fault diagnosis is based on the fused results.
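The ordering, averaging and fusion steps can be sketched end to end in Python. The sketch takes the inducing variables and the weight vector as given inputs, because their defining equations are not reproduced here, and every numeric value in the example is hypothetical. Combining "(n − 1) times" is read as fusing the weighted average evidence with itself n − 1 times.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Dempster's rule for BPAs stored as dict: frozenset -> mass."""
    combined, conflict = {}, 0.0
    for (b, vb), (c, vc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + vb * vc
        else:
            conflict += vb * vc
    if conflict >= 1.0:
        raise ValueError("total conflict: the rule is undefined")
    return {a: v / (1.0 - conflict) for a, v in combined.items()}


def iowa_average(bpas, inducing, weights):
    """Weighted average evidence: BPAs ordered by decreasing inducing value."""
    order = sorted(range(len(bpas)), key=lambda i: inducing[i], reverse=True)
    avg = {}
    for w, i in zip(weights, order):
        for a, v in bpas[i].items():
            avg[a] = avg.get(a, 0.0) + w * v
    return avg


def fuse(bpas, inducing, weights):
    """Combine the weighted average evidence with itself (n - 1) times."""
    avg = iowa_average(bpas, inducing, weights)
    fused = avg
    for _ in range(len(bpas) - 1):
        fused = dempster_combine(fused, avg)
    return fused


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
bpas = [
    {A: 0.7, B: 0.2, C: 0.1},
    {B: 0.9, C: 0.1},            # hypothetical abnormal sensor report
    {A: 0.6, B: 0.3, C: 0.1},
]
inducing = [0.8, 0.3, 0.9]       # hypothetical inducing variables S_i
weights = [0.5, 0.3, 0.2]        # hypothetical IOWA weight vector W
fused = fuse(bpas, inducing, weights)
decision = max(fused, key=fused.get)
```

In this assumed example the conflicting second report receives the smallest weight because it has the smallest inducing value, and the decision by the maximum principle is fault type A.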

[Figure 3: flowchart of the new method. Start; calculate the average distance and the global distance based on the distance of evidence; calculate the belief entropy and the global belief entropy; obtain the α value of IOWA; obtain the weighting vector based on the MEM model; construct the inducing variable of IOWA; modify the BPAs based on the IOWA operator and get the weighted average evidence; evidence fusion and decision-making in fault diagnosis; End.]

Experiment with Artificial Data
This numerical example illustrates how to apply the proposed method in fault diagnosis. Assume that, in a case of motor rotor fault diagnosis, the vibration signal is collected by five sensors. There are three fault types in the motor rotor, denoted as A, B and C, which represent unbalance, misalignment and pedestal looseness, respectively. The BPAs based on these sensors are assumed to be independent, and there are abnormal sensor reports, as shown in Table 1. Intuitively, $m_2$ comes from an abnormal sensor report. Since evidence modelling is another open issue in Dempster-Shafer evidence theory, we do not discuss how to model data with BPAs in this paper. For more detail on how to generate BPAs, please refer to related work such as [45,46,49].
According to the new method shown in Figure 3, firstly, with Equations (13) and (14), the average distance of evidence $d_i$, $i = 1, 2, \cdots, 5$, and the global distance of evidence $d_g$ are calculated, respectively. With Equations (5) and (16), the belief entropy $E_{di}$, $i = 1, 2, \cdots, 5$, and the global belief entropy $E_d^g$ are calculated, respectively, and the results are: $E_{d1} = 1.5664$, $E_{d2} = 0.4690$, $E_{d3} = 1.8092$, $E_{d4} = 1.8914$, $E_{d5} = 1.7710$ and $E_d^g = 0.3534$. Secondly, the degree of orness α is calculated by Equation (11), the weight vector $W = (w_1, w_2, w_3, w_4, w_5)^T$ is calculated according to Equation (10), and the inducing variables $S_i$, $i = 1, 2, \cdots, 5$, are calculated according to Equation (19).
In Table 2, we compare the results of several existing methods. The table also shows the process of locating the fault type. With the new method, the belief that A is the fault type is 99.14%, which is not lower than that of the other methods.
However, Murphy's method is only a simple arithmetic mean, which does not consider the differences among the pieces of evidence, while Deng et al.'s method ignores the influence of the evidence itself in generating the weight factor. The proposed method takes more of the available information into consideration before data fusion and fault diagnosis, e.g., the distance of evidence and the belief entropy.

A Case Study
In order to verify the effectiveness and success of the proposed evidential-IOWA sensor data fusion approach, the new method is applied to a case study adopted from [66].
Recall the fault diagnosis problem in [66]. Three potential fault types are denoted as $F_1$, $F_2$ and $F_3$; thus, the fault hypothesis set is $\Theta = \{F_1, F_2, F_3\}$. Three sensors report the diagnosis results independently. The diagnosis results are modelled as three bodies of evidence, denoted as $E_1$, $E_2$ and $E_3$, and the BPAs of the diagnosis results are shown in Table 3. Intuitively, $F_1$ is the fault type because both $E_1$ and $E_3$ assign a belief of more than 60% to the fault type $F_1$, while $E_2$ may come from an abnormal sensor in comparison with the other two bodies of evidence. This is a challenge for data fusion, especially for conventional combination rules such as Dempster's rule of combination. The proposed method is applied to solve this problem. According to the proposed method shown in Figure 3, the first step is to calculate the average distance and the global distance of the evidence $E_1$, $E_2$ and $E_3$. Based on Equations (13) and (14), the average distance of each piece of evidence, denoted as $d_i^{(E_i)}$, $i = 1, 2, 3$, and the global distance, denoted as $d_g^{(E_i)}$, $i = 1, 2, 3$, are shown in Table 4.
Then, based on Equations (5) and (16), the corresponding belief entropy, denoted as $E_{di}^{(E_i)}$, $i = 1, 2, 3$, and the global belief entropy, denoted as $E_d^{g(E_i)}$, $i = 1, 2, 3$, are calculated, as shown in Table 5.
Table 5. The belief entropy and global belief entropy of $E_i$ ($i = 1, 2, 3$).
With Equation (11), the degree of orness α of the case study, denoted as $\alpha^{(E_i)}$, is then calculated.

According to the Maximum Entropy Method defined in Equation (10), the weight vector of the evidence, denoted as $W^{(E_i)} = (w_1, w_2, w_3)^T$, can be calculated. The inducing variables, denoted as $S_i^{(E_i)}$ ($i = 1, 2, 3$), can be calculated based on Equation (19) and the parameters in Table 4. Combining the inducing variables with the parameters in Table 5, the OWA pairs $\langle S_i^{(E_i)}, E_i \rangle$, $i = 1, 2, 3$, are ordered. The BPAs in Table 3 can then be modified according to Equation (8) to obtain the weighted average evidence. Finally, the weighted average evidence is combined with Dempster's combination rule four times to obtain the fusion results. The fused results of the proposed method are compared with those of the method in [66], from which this case study is taken, and the comparison is shown in Table 6. It can be concluded from Table 6 that the proposed method gives the most distinguishable fusion results on the sensor reports, which provides a clear indication of the most probable fault type. The highest belief degree on fault type $F_1$ is 91.23%, which is higher than that of Fan et al.'s method by more than 10%. This is helpful for decision-making in real applications. In contrast, the fusion results for fault types $F_1$ and $F_2$ obtained with the conventional Dempster's rule of combination are close to each other, so it is hard to judge which fault has occurred. The case study verifies the effectiveness of the proposed method. In addition, the case study indicates a better performance of the proposed method in comparison with some of the existing methods.

Discussion
The effectiveness of the proposed method is verified according to the applications based on both artificial data and the experiment adopted from the literature.
A few reasons contribute to the success of the new method. Firstly, not only the distance of evidence but also the belief entropy and the IOWA operator are taken into consideration, which means that more of the available information is used during information processing. Thus, information loss is decreased. Secondly, the degree of orness α of IOWA is obtained in a data-driven way (based on the belief entropy and the distance of evidence), which is more reliable than subjective methods. Finally, the final fusion is based on Dempster's rule of combination. The merits of Dempster's rule of combination, such as commutativity and associativity, contribute to the effectiveness of the proposed method.
In the fault diagnosis (FD) research area, an FD technique is considered good if it guarantees no false alarms, no missed detections and full detection for all considered faulty scenarios [67,68]. Ongoing work on the proposed method should focus on this requirement. In future work, the following situations should be addressed:
• FD without a fault, to make sure that the proposed solution does not give a false alarm,
• FD with a misalignment fault, to show that this fault is detected well,
• FD with a pedestal fault.

Conclusions
In this paper, in the frame of Dempster-Shafer evidence theory, an improved evidential-IOWA sensor data fusion approach is proposed for multi-source data-based fault diagnosis. Before sensor data fusion is applied for final decision-making, the sensor data, which come from different independent sources and are modelled as BPAs, are pre-processed to avoid unreasonable fusion results that may be caused by conflicting evidence. In the new method, the IOWA operator is used to determine the weights of the different sensor data sources, and the parameter of the IOWA operator is based on the distance of evidence and the belief entropy. The proposed method performs better in conflict management and fault diagnosis because the information volume of each piece of evidence is taken into consideration. In the applications presented, the proposed method outperforms the other methods.
Ongoing work on the proposed method will focus on some basic rules of fault diagnosis in industrial scenarios; e.g., no missed detection and full detection for all considered faulty scenarios should be strictly guaranteed when applying the fault diagnosis technique.