Intelligent wear mode identification system for marine diesel engines based on multi-level belief rule base methodology

Wear faults are among the chief causes of main-engine damage, significantly influencing the secure and economical operation of ships. It is difficult for engineers to utilize multi-source information to identify wear modes, so an intelligent wear mode identification model needs to be developed to assist engineers in diagnosing wear faults in diesel engines. For this purpose, a multi-level belief rule base (BBRB) system is proposed in this paper. The BBRB system consists of two-level belief rule bases, and the 2D and 3D characteristics of wear particles are used as antecedent attributes on each level. Quantitative and qualitative wear information with uncertainties can be processed simultaneously by the BBRB system. In order to enhance the efficiency of the BBRB, the silhouette value is adopted to determine referential points and the fuzzy c-means clustering algorithm is used to transform input wear information into belief degrees. In addition, the initial parameters of the BBRB system are constructed on the basis of expert-domain knowledge and then optimized by the genetic algorithm to ensure the robustness of the system. To verify the validity of the BBRB system, experimental data acquired from real-world diesel engines are analyzed. Five-fold cross-validation is conducted on the experimental data and the BBRB is compared with the other four models in the cross-validation. In addition, a verification dataset containing different wear particles is used to highlight the effectiveness of the BBRB system in wear mode identification. The verification results demonstrate that the proposed BBRB is effective and efficient for wear mode identification with better performance and stability than competing systems.


Introduction
The marine diesel engine is the main power source in almost 99% of ships all over the world [1], and therefore, its reliability is extremely crucial for the safe and economical operation of ships. As reported in Main Engine Damage, a study published by The Swedish Club [2], main-engine damage contributed to 34.3% of the total marine machinery claims in 2012-2014 and accounted for 46% of the total cost of marine machinery claims with an average claim per vessel of USD 545 000, remaining the most expensive category in hull and machinery claims [2]. Wear faults are among the chief causes of main-engine damage, accounting for almost 50% of faults in main diesel engines [3]. Therefore, it is necessary to diagnose wear faults in main diesel engines to guarantee ship safety and reliability.
To date, many techniques have been developed for mechanical wear fault diagnosis, such as vibration analysis [4], acoustic emission [5], oil monitoring [6], and surface texture analysis [7]. Each technique has its own advantages and limitations. When applied to marine diesel engines, vibration analysis and acoustic emission measurement are subject to background noise, and abnormal vibration or acoustic signals may only become perceptible under severe damage conditions. Moreover, it is difficult to acquire the surface texture of friction pairs in marine diesel engines, because disassembling a marine engine for surface texture analysis is costly or even impossible. In contrast, oil monitoring is an appropriate technique for identifying wear faults of marine diesel engines. Engine lubricant oil carries debris that can give information about engine wear, and oil samples can be collected without destroying any structure of the engine. Therefore, oil monitoring is easy and reliable in operation.
Generally, wear particle debris analysis and physicochemical property analysis are used in oil monitoring [8]. The wear modes of friction pairs can be identified by using morphological characteristics extracted from wear particles, and then the causes of abnormal wear can be determined. Developing an intelligent wear mode identification system holds great promise for the condition monitoring and fault diagnosis of marine diesel engines [9][10][11]. With an intelligent system, the wear states of marine diesel engines can be evaluated without stopping or dismantling them, and early wear faults can be detected for just-in-time maintenance. Moreover, an intelligent wear mode identification system can assist inexperienced engineers in evaluating the engine's health state. It is worth noting that a crew with insufficient experience is listed by The Swedish Club as one of the top six causes of main-engine damage [2].
Wear characteristic extraction is the basis of developing an intelligent wear mode identification system. The characteristics can be divided into 2D characteristics, such as particle shape and size, and 3D characteristics, such as surface topography. Two-dimensional characteristics are extracted from 2D images acquired by optical microscopes, while 3D characteristics are extracted from 3D images acquired by laser scanning confocal microscopes (LSCMs) or atomic force microscopes [12]. Two-dimensional characteristics are widely used in identifying cutting wear particles [13]. However, 3D characteristics become necessary when wear particles have similar sizes or shapes. In recent years, 3D characteristic extraction has attracted much attention. Podsiadlo and Stachowiak [14] developed a modified partitioned iterated function system and used this system to obtain a full description of the topography of wear particles and surfaces. Yuan [15] proposed a method for characterizing the roughness of engineering surfaces and the surfaces of small wear particles in accordance with wavelet theory. These 3D characteristics make it possible to improve the accuracy of wear mode identification.
Many intelligent algorithms have been applied in wear mode identification. The most representative ones include artificial neural networks (ANNs), support vector machines (SVMs) and Grey models. Myshkin [16] evaluated the possibility of employing an ANN for the classification of debris. Peng [17] and Xu [18] combined an ANN and a knowledge-based expert system for the analysis of microscopic wear particles. Gwidon [19] developed an identification model based on an SVM to classify particles into fatigue, adhesive and abrasive wear particles. Wang [20] used principal component analysis to optimize the characteristic parameters of wear particles and then distinguished wear particles with Grey relational analysis. Other methods have also been used in this area, such as classification and regression trees [11], deterministic tourist walks [21], the AdaBoost algorithm [22], and extreme learning machines [23]. To the best of our knowledge, most identification models for wear particles are based on 2D characteristics, and very few studies have used 3D characteristics to identify categories of wear particles [24]. Moreover, most wear mode identification models have been developed on the basis of data-driven methods, such as ANNs and SVMs, and expert-domain knowledge cannot be incorporated into these models. How to integrate expert-domain knowledge into an intelligent wear identification system, so that the system can process quantitative and qualitative information simultaneously, remains a challenging task.
The belief rule base (BRB) is a semi-quantitative method that can address this challenge. The BRB is developed on the basis of the D-S theory of evidence, decision theory and traditional if-then rules [25]. This method can not only use different types of information (i.e. quantitative information and qualitative information) with uncertainties but also represent the relationship between input and output in a transparent and interpretable way [26]. The initial BRB model can be built from expert-domain knowledge and then optimized by intelligent algorithms [27]. The BRB has shown excellent performance in many fields, including medical care, consumer behavior prediction and safety assessment [28][29][30]. Therefore, it is worth evaluating the performance of the BRB in the wear mode identification of marine diesel engines. When applying the BRB in wear mode identification, one should pay attention to several issues. First, the number of antecedent attributes (i.e. characteristics of wear particles) and referential points should be appropriately determined, because it directly affects the number of rules and parameters in the BRB. In wear mode identification, not all wear particles need to be identified by using 2D and 3D characteristics together. Some wear particles (e.g. cutting wear particles and spherical particles) can be well identified by their shape and size. Second, the initial BRB model should be optimized by a proper optimization method to ensure a robust system with good performance. However, studies addressing these two issues in constructing a BRB wear mode identification system have not yet been conducted.
To address the aforementioned problems, a multi-level BRB (BBRB) method is developed in this paper to identify the wear modes of marine diesel engines. The multi-level structure is used to assign an appropriate number of antecedent attributes to each level, so that the complexity of the BRB and the computational burden are reduced significantly. The silhouette value and fuzzy c-means clustering are employed to determine referential points in the BBRB and transform input sets into belief distributions. The genetic algorithm (GA) is adopted to optimize the BBRB parameters. A series of friction-wear experiments on the cylinder liner-piston ring of marine diesel engines is carried out to evaluate the BBRB method.

Structure of the proposed BBRB system
Wear particles have a direct relationship with the wear modes. Through wear particle identification, the wear modes can be determined and the engine condition can be further predicted. A literature review indicates that wear modes generally include abrasive wear, fatigue wear, cutting wear and adhesive wear [8]. Specifically, abrasive wear and cutting wear mainly produce cutting wear particles (C); fatigue wear produces laminar particles (L), spherical particles (SP) and fatigue spall particles (FS); and adhesive wear produces severe sliding wear particles (SSL) [31,32].
Let $X = [x_1, x_2, \ldots, x_M]$ denote the characteristics of wear particles, which are extracted from wear particle images acquired by an LSCM. Here $x_i$ ($i = 1, 2, \ldots, M$) represents an attribute and $M$ is the number of attributes. We further assume that $D = [D_1, D_2, \ldots, D_N]$ is the output of the identification model and $P$ is the corresponding parameter vector, where $N$ is the number of wear particle categories. In order to develop an intelligent wear mode identification system using the BRB technique, the first step is to establish a causal relationship between $X$ and $D$, the second step is to determine the referential points for $X$ to avoid a combinatorial explosion of belief rules, and the third step is to optimize the model parameter vector $P$ to achieve high identification accuracy.
As indicated in the literature, fatigue spall particles, severe sliding wear particles and laminar particles are difficult to distinguish by 2D characteristics. To solve this problem, a two-level BRB (BBRB) model is designed such that each level can separately process the 2D and 3D characteristics of wear particles. Figure 1 shows the structure of the proposed BBRB system for wear mode identification.
In figure 1, $X^1 = [x^1_1, x^1_2, \ldots, x^1_{M_1}]$ and $X^2 = [x^2_1, x^2_2, \ldots, x^2_{M_2}]$ denote the input 2D characteristics for the first BRB level and the 3D characteristics for the second BRB level, respectively. $D^1$ and $D^2$ are the outputs of the two levels, which can be represented by the belief distributions as follows:

$$D^1 = \{(D^1_i, \beta^1_i),\ i = 1, 2, 3\} \qquad (1)$$

$$D^2 = \{(D^2_i, \beta^2_i),\ i = 1, 2, 3\} \qquad (2)$$

where $\beta^j_i$ ($j = 1, 2$; $i = 1, 2, 3$) is the belief degree of every wear particle category and satisfies the constraints $\sum_{i=1}^{3} \beta^j_i = 1$ and $0 \le \beta^j_i \le 1$ ($i = 1, 2, 3$; $j = 1, 2$). In the output $D^1$, severe sliding wear particles, fatigue spall particles and laminar particles are integrated into one category $D^1_3$, as they are difficult to classify by their 2D characteristics. The output attribute with the maximum belief degree is considered as the wear particle type, i.e. $\hat{S}^j = \arg\max\{\beta^j_1, \beta^j_2, \beta^j_3\}$ ($j = 1, 2$), where $\hat{S}^j$ is the predicted category on the $j$th level. However, if $\beta^1_3$ is the maximum belief degree, the level-2 BRB is activated. In the level-2 BRB, the three kinds of wear particles that are difficult to identify in the level-1 BRB are classified by the output attribute with the maximum belief degree.
As can be seen in figure 1, each level in the BBRB system is an independent BRB system; therefore, the inference process and optimization process of each level are conducted independently. In the optimization process, the predicted category of the wear particle $\hat{S}^j$ ($j = 1, 2$) is compared with the real type $S^j$ ($j = 1, 2$), and $(1 - UA_j)$ ($j = 1, 2$) is used as the objective to be minimized with the GA method, where $UA_j$ ($j = 1, 2$) is the user accuracy on the $j$th level.
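The $k$th belief rule in the $i$th BRB ($i = 1, 2$) is described by equation (3):

$$R^i_k:\ \text{IF } x^i_1 \text{ is } A^{i,k}_1 \wedge x^i_2 \text{ is } A^{i,k}_2 \wedge \cdots \wedge x^i_{M_i} \text{ is } A^{i,k}_{M_i}\ \text{THEN } \{(D^i_1, \beta^i_{1k}), (D^i_2, \beta^i_{2k}), (D^i_3, \beta^i_{3k})\} \qquad (3)$$

with rule weight $\theta^i_k$ and antecedent attribute weights $\delta^i_1, \ldots, \delta^i_{M_i}$, where $A^{i,k}_j$ ($k = 1, \ldots, L_i$; $j = 1, \ldots, M_i$) denotes the referential value of the $j$th antecedent attribute in the $k$th rule, and $L_i$ is the number of rules in the $i$th BRB.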
The number of rules in the $i$th BRB, $L_i$, is determined by equation (4) when the BRB is built by the exhaustive method:

$$L_i = \prod_{j=1}^{M_i} T^i_j \qquad (4)$$

where $T^i_j$ ($j = 1, \ldots, M_i$) is the number of referential points for the $j$th antecedent attribute. As indicated in equation (4), the size of a BRB is directly determined by $M_i$ and $T^i_j$ ($j = 1, \ldots, M_i$). Therefore, appropriate antecedent attributes and referential points should be used so that the BRB system can achieve satisfactory precision with a proper model structure.
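To make the exhaustive method concrete, the following minimal Python sketch enumerates every combination of referential points for a small rule base. The function name `enumerate_antecedents`, the attribute names and the semantic labels are hypothetical and chosen only for illustration; the sketch simply shows that the number of antecedent combinations equals the product of the numbers of referential points.

```python
from itertools import product

def enumerate_antecedents(referential_terms):
    """Build one antecedent combination per rule (exhaustive method).

    referential_terms maps each antecedent attribute to its list of
    referential points; every combination yields one rule antecedent.
    """
    names = list(referential_terms)
    combos = product(*(referential_terms[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

# Hypothetical example: two attributes with 2 and 4 referential points.
terms = {"x1": ["L", "H"], "x2": ["VL", "L", "M", "H"]}
rules = enumerate_antecedents(terms)
rule_total = len(rules)  # 2 * 4 combinations
```

Adding a third attribute with three referential points would triple the list, which is the growth that the multi-level structure is meant to contain.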

Determination of referential points and input transformation.
To avoid an overlarge BRB and increased calculation difficulty, the number of referential points should not be too large. Moreover, the referential values of each antecedent attribute should be surrounded by adequate points [30]. Often, the number and values of referential points are determined by experts subjectively and therefore turn out to be inaccurate. In this paper, a new approach based on the k-means clustering algorithm and average silhouette value is proposed to determine referential points for every antecedent attribute. Specifically, the silhouette value s(i) (i = 1, ..., n) for every data point measures the similarity of one point to other points in its own cluster compared to its similarity to other points in other clusters [33,34], with a value range of [−1, 1]. A high s(i) indicates the ith data point is clustered reasonably, and a negative value represents a bad partition, which should be avoided.
Every antecedent attribute in the BRB system for wear mode identification is considered to be independent of the others and of equal importance. The number and values of referential points are therefore determined separately for each attribute from its input data $x = (x_1, x_2, \ldots, x_n)$, where $n$ is the number of data points. The flowchart for determining referential points, shown in figure 2, consists of five steps:
Step 1: Initialize the number of clusters $k = 2$, the maximum number of clusters $K$ (here $K = 5$ to avoid an overlarge BRB), and the set of average silhouette values $S$.
Step 2: Partition $x$ into $k$ clusters and record the cluster centers. Based on the clustering result, calculate the silhouette value of each point, $s_k(i)$ ($i = 1, \ldots, n$); these values form the set $S_k$.
Step 3: Traverse the whole silhouette value set $S_k$. If all values in $S_k$ are greater than 0, calculate the average silhouette value $\bar{s}(k)$, add it to the set $S$, and set $k = k + 1$. Otherwise, set $k = k + 1$ directly without updating $S$. Then return to Step 2.
Step 4: Repeat Step 2 and Step 3 until $k$ exceeds $K$. Then find the maximum average silhouette value $s_{best}$ in $S$. The corresponding number of clusters $k_{best}$ is the number of referential points. Sort the cluster centers obtained for $k_{best}$ to form the referential values $C$.
Step 5: Output the number and values of referential points, $k_{best}$ and $C$.
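The five steps can be sketched in plain Python for a single attribute. This is an illustrative sketch rather than the authors' implementation: the function names, the deterministic quantile-based initialization of the 1D k-means, and the convention that a singleton cluster has a silhouette value of 0 are all assumptions made here.

```python
from statistics import mean

def kmeans_1d(x, k, iters=100):
    """Lloyd's algorithm in 1D with quantile initialization (assumption)."""
    xs = sorted(x)
    centers = [xs[int((i + 0.5) * len(xs) / k)] for i in range(k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in x]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(x, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = mean(members)
    return centers, labels

def silhouette_values(x, labels):
    """Silhouette value s(i) of every point, each in [-1, 1]."""
    values = []
    for i, (v, own) in enumerate(zip(x, labels)):
        same = [abs(v - w) for j, (w, lab) in enumerate(zip(x, labels))
                if lab == own and j != i]
        others = [c for c in set(labels) if c != own]
        if not same or not others:
            values.append(0.0)  # singleton or single-cluster convention
            continue
        a = mean(same)
        b = min(mean([abs(v - w) for w, lab in zip(x, labels) if lab == c])
                for c in others)
        values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return values

def select_referential_points(x, k_max=5):
    """Steps 1-5: pick k with the largest average silhouette value."""
    best = None
    for k in range(2, k_max + 1):
        centers, labels = kmeans_1d(x, k)
        s_k = silhouette_values(x, labels)
        if all(s > 0 for s in s_k):  # Step 3: skip partitions with bad points
            avg = mean(s_k)
            if best is None or avg > best[0]:
                best = (avg, sorted(centers))
    if best is None:
        raise ValueError("no partition with all-positive silhouette values")
    return len(best[1]), best[1]

# Hypothetical attribute values forming two well-separated groups.
data = [1.0, 1.1, 0.9, 1.2, 5.0, 5.1, 4.9, 5.2]
k_best, referential_values = select_referential_points(data)
```

For the hypothetical data above, two clusters give the largest average silhouette value, so the attribute receives two referential points located at the two cluster centers.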
Once the referential points are determined, the input is transformed into belief degrees; that is, it is matched to the referential points using the fuzzy c-means clustering algorithm. For every antecedent attribute, the belief degree $\alpha_{ij}$ ($1 \le i \le k_{best}$, $1 \le j \le n$) to which $x_j$ matches the referential value $C^i_{best}$ ($1 \le i \le k_{best}$) is calculated according to equations (5)-(7). In equation (6), $m$ is the weight exponent, which controls the relative weight placed on each of the distances from one data point to the referential points $C^i_{best}$. As suggested in reference [35], $1.5 \le m \le 3$ generally gives good results; $m = 2$ is used in this paper. An input data point $x_j$ ($j = 1, \ldots, n$) can thus be represented by a belief distribution over the referential points.
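As an illustration of the input transformation, the sketch below uses the standard fuzzy c-means membership formula, $\alpha_{ij} = 1 / \sum_{l} (d_{ij}/d_{lj})^{2/(m-1)}$ with $d_{ij} = |x_j - C^i_{best}|$. It is assumed here that this standard form corresponds to the paper's equations (5)-(7); the function name and the example values are hypothetical.

```python
def fcm_belief_degrees(x, referential_values, m=2.0):
    """Belief degrees of one input value with respect to each referential value.

    Uses the standard fuzzy c-means membership update (assumed form).
    """
    distances = [abs(x - c) for c in referential_values]
    if 0.0 in distances:
        # The input coincides with a referential value: full belief there.
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_l) ** exponent for d_l in distances)
            for d_i in distances]

# Hypothetical attribute with referential values 1.0 and 2.0.
alpha = fcm_belief_degrees(1.25, [1.0, 2.0], m=2.0)
```

With $m = 2$, an input of 1.25 lies three times closer to the first referential value than to the second, so the belief degrees are 0.9 and 0.1, and they always sum to 1.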

Rule inference with evidential reasoning (ER) algorithm.
On each level, the activation weight of the $k$th rule, $\omega_k$, is calculated by equation (8), in which $\{\delta_i\}$ is the normalized antecedent attribute weight. $\omega_k$ indicates the degree to which the $k$th rule is activated. If $\omega_k = 0$, the $k$th rule is not activated [24]. The ER analytical algorithm is used to aggregate all the activated rules and generate the final conclusion $\{(D_i, \hat{\beta}_i),\ i = 1, \ldots, N\}$, where $\hat{\beta}_i$ ($i = 1, \ldots, N$) is the predicted belief degree of $D_i$, which is acquired by equations (9) and (10). In these equations, $\beta^k_i$ is the belief degree assigned to $D_i$ in the $k$th rule based on expert experience or statistical analysis in the initial BRB model.
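The inference step can be sketched as follows. The code uses the activation weight and analytical ER formulas as they are commonly stated in the BRB literature; it is assumed, not confirmed by the text above, that these match the paper's equations (8)-(10). Function names and example values are hypothetical.

```python
from math import prod

def activation_weights(rule_weights, matching_degrees, attribute_weights):
    """omega_k = theta_k * prod_i(alpha_ki ** delta_i), normalized over rules.

    matching_degrees[k][i] is the belief degree to which attribute i of the
    input matches the referential value used in rule k.
    """
    max_w = max(attribute_weights)
    norm = [w / max_w for w in attribute_weights]  # normalized attribute weights
    raw = [theta * prod(a ** d for a, d in zip(alphas, norm))
           for theta, alphas in zip(rule_weights, matching_degrees)]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else [0.0] * len(raw)

def er_aggregate(omega, beliefs):
    """Analytical ER aggregation; beliefs[k][i] is the belief in D_i in rule k."""
    n = len(beliefs[0])
    rule_sums = [sum(b) for b in beliefs]
    per_category = [
        prod(w * b[i] + 1 - w * s for w, b, s in zip(omega, beliefs, rule_sums))
        for i in range(n)
    ]
    remainder = prod(1 - w * s for w, s in zip(omega, rule_sums))
    inactive = prod(1 - w for w in omega)
    mu = 1.0 / (sum(per_category) - (n - 1) * remainder)
    return [mu * (p - remainder) / (1 - mu * inactive) for p in per_category]

# Hypothetical two-rule example with two antecedent attributes.
omega = activation_weights([1.0, 1.0], [[0.9, 0.5], [0.1, 0.5]], [1.0, 1.0])
output = er_aggregate(omega, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
```

In this example the first rule receives an activation weight of 0.9, so the aggregated belief distribution is dominated by that rule's consequent, which is the transparency property discussed later in the paper.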

Optimization of BBRB parameters based on GA.
The initial parameters of the BBRB on each level are given according to expert-domain experience, which may not be accurate. It is therefore necessary to fine-tune the parameters of a BRB model by using an optimization algorithm to improve the model performance. As indicated in figure 1, model optimization on each level is conducted independently. On each level, $P = [\beta_{11}, \ldots, \beta_{NL}, \theta_1, \ldots, \theta_L]$ represents the parameters to be adjusted, and the total number of parameters to be optimized equals $N \times L + L$. Since wear mode identification is a classification problem, the misclassification rate should be as low as possible. Therefore, $1 - UA$ is used as the optimization objective function, where $UA$ is the ratio of the number of correctly identified samples to the total number of samples, as indicated in equation (11):

$$UA = \frac{n_c}{n} \qquad (11)$$

where $n_c$ is the number of correctly identified samples and $n$ is the total number of training samples. The optimization model, which minimizes $1 - UA$ under the constraints on the model parameters, is defined by equations (12) and (13). The parameters of the BBRB model are optimized by the GA. In the optimization, expert experience in wear mode identification should be embedded in the initial population. Therefore, in this paper, the initial population consists of two parts: individuals provided by experts and randomly generated individuals. The optimization objective function $f(x)$ in equation (12) is used as the fitness function. The optimization process can be conducted by using the Global Optimization Toolbox in MATLAB.
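The objective function and the two-part initial population can be illustrated with a short Python sketch. This is not the MATLAB toolbox workflow mentioned above; the function names are hypothetical, and constraint handling (for example, keeping the belief degrees of each rule within their bounds) is deliberately left out.

```python
import random

def user_accuracy(predicted, actual):
    """UA = n_c / n: share of correctly identified samples."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

def objective(predicted, actual):
    """Fitness to be minimized by the GA: 1 - UA."""
    return 1.0 - user_accuracy(predicted, actual)

def initial_population(expert_individuals, size, n_params, rng):
    """Expert-provided individuals first, then random ones up to `size`.

    Random individuals are drawn uniformly from [0, 1]; enforcing the
    belief-degree constraints is omitted in this sketch (assumption).
    """
    population = [list(ind) for ind in expert_individuals]
    while len(population) < size:
        population.append([rng.random() for _ in range(n_params)])
    return population

# Hypothetical usage: one expert individual for a BRB with 48 parameters.
rng = random.Random(0)
population = initial_population([[0.5] * 48], size=50, n_params=48, rng=rng)
fitness = objective(["C", "SP", "C"], ["C", "SP", "L"])
```

Seeding the population this way lets the search start from the expert rule base while the random individuals preserve diversity.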

Establish the BBRB system using experimental data
Experimental datasets on wear particles of diesel engines were used in this study to build the proposed BBRB system for wear mode identification. The experimental wear particles were generated from an EQD 210-20 diesel engine, a ZH 1115 diesel engine and an abrasion testing machine. A total of 150 samples, containing cutting wear particles, spherical particles, fatigue spall particles, laminar particles and severe sliding wear particles, were obtained and prepared for analysis using the BBRB system.

Determination of referential points
Five characteristics of the wear particle morphology were chosen as antecedent attributes of the multi-level BRB system for wear mode identification based on analysis of the characteristics of various wear particles [19, 36-38]. These characteristics included three 2D characteristics and two 3D characteristics: the aspect ratio (AR), equivalent diameter ($D_e$) and roundness ($R$) for the level-1 BRB, and the roughness average ($S_a$) and texture direction index ($S_{tdi}$) for the level-2 BRB. These five antecedent attributes are described as follows [39,40]:
(1) The AR is the ratio between the length and breadth of a wear particle and is often greater than or equal to 1.
(2) $D_e$ is the diameter of a circle having an area equal to the area enclosed by the shape contour of a wear particle.
(3) $R$ is the roundness, which describes the shape resemblance of a wear particle to a circle; in its definition, $D_{max}$ indicates the maximum diameter of the minimal enclosing circle.
(4) $S_a$ is the arithmetical mean of the absolute ordinate values within a definition area.
(5) $S_{tdi}$ is defined as the average amplitude sum divided by the amplitude sum of the dominant direction, where the amplitude sum $A(\alpha)$ in direction $\alpha$ is acquired by calculation of the Fourier power spectrum. $S_{tdi}$ is always between 0 and 1. A surface with very dominant directions has an $S_{tdi}$ value close to 0, and if the amplitude sums of all directions are similar, $S_{tdi}$ is close to 1.
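The verbal definitions above can be turned into a small Python sketch. The formulas for $D_e$ and $S_{tdi}$ follow directly from the wording; the roundness formula $4A/(\pi D_{max}^2)$ and the use of heights measured from the mean level for $S_a$ are common conventions assumed here, not forms confirmed by the text. All function names are hypothetical.

```python
from math import pi, sqrt

def aspect_ratio(length, breadth):
    """AR: ratio between the length and breadth of a particle."""
    return length / breadth

def equivalent_diameter(area):
    """D_e: diameter of the circle whose area equals the particle area."""
    return sqrt(4.0 * area / pi)

def roundness(area, d_max):
    """R: assumed form 4A / (pi * D_max^2); equals 1 for a perfect circle."""
    return 4.0 * area / (pi * d_max ** 2)

def roughness_average(heights):
    """S_a: mean absolute height, measured from the mean level (assumption)."""
    mean_height = sum(heights) / len(heights)
    return sum(abs(h - mean_height) for h in heights) / len(heights)

def texture_direction_index(amplitude_sums):
    """S_tdi: average amplitude sum divided by the dominant amplitude sum."""
    average = sum(amplitude_sums) / len(amplitude_sums)
    return average / max(amplitude_sums)

# Hypothetical checks: a unit-radius circle and an isotropic surface.
circle_diameter = equivalent_diameter(pi)
circle_roundness = roundness(pi, 2.0)
isotropic_index = texture_direction_index([3.0, 3.0, 3.0, 3.0])
```

The texture direction index approaches 0 only when one direction dominates many others, which agrees with the limiting behavior described in item (5).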
The referential points of every antecedent attribute are described by semantic terms and the corresponding referential values, as shown in table 1. From table 1, it can be seen that the antecedent attributes AR, $D_e$, $R$, $S_a$, and $S_{tdi}$ are described by two, two, three, two and four semantic terms, respectively. The semantic terms for referential points are very low (VL), low (L), middle (M) and high (H).
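Every input needs to be transformed in terms of the referential points defined in table 1 and represented by belief degrees indicating how well the input matches the referential points. Using equations (5)-(7), the input of the BBRB can be transformed into a belief distribution. For example, the AR, i.e. …
$D^1_1$, $D^1_2$ and $D^1_3$ represent the cutting wear particles, spherical particles, and SBL particles (a generic term for severe sliding wear particles, fatigue spall particles, and laminar particles) in the level-1 BRB, and $D^2_1$, $D^2_2$ and $D^2_3$ represent the severe sliding wear particles, fatigue spall particles and laminar particles in the level-2 BRB. Belief rules in the level-1 BRB and level-2 BRB are represented by (18a) and (18b), respectively:

$$R^1_k:\ \text{IF } AR \text{ is } A^{1,k}_1 \wedge D_e \text{ is } A^{1,k}_2 \wedge R \text{ is } A^{1,k}_3\ \text{THEN } \{(D^1_1, \beta^1_{1k}), (D^1_2, \beta^1_{2k}), (D^1_3, \beta^1_{3k})\} \qquad (18a)$$

where $\sum_{i=1}^{3} \beta^1_{ik} \le 1$, with rule weight $\theta^1_k$, and

$$R^2_k:\ \text{IF } S_a \text{ is } A^{2,k}_1 \wedge S_{tdi} \text{ is } A^{2,k}_2\ \text{THEN } \{(D^2_1, \beta^2_{1k}), (D^2_2, \beta^2_{2k}), (D^2_3, \beta^2_{3k})\} \qquad (18b)$$

where $\sum_{i=1}^{3} \beta^2_{ik} \le 1$, with rule weight $\theta^2_k$. Here $A^{i,k}_j$ is the referential value of the $j$th antecedent attribute in the $k$th rule of the level-$i$ BRB. According to equation (4), 12 rules and eight rules are generated in the level-1 BRB and level-2 BRB, respectively. On each level, we assume that all rules have the same weight of 1 and that the weight of every antecedent attribute is 1. The belief degree assigned to every kind of wear particle is based on expert knowledge and statistical analysis of historical data.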
The inference process of the initial BRB system for wear mode identification is conducted based on equations (8)-(10) with the assumed initial parameters. If the maximum belief degree of the output in the level-1 BRB corresponds to SBL particles, the input sample is further identified by the level-2 BRB and assigned to the wear particle category with the maximum belief degree.
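The two-level decision logic can be sketched as follows. The function name `identify`, the category labels and the example belief degrees are hypothetical; the level-2 inference is passed in as a callable so that it only runs when the level-1 result is SBL, mirroring the activation condition described above.

```python
LEVEL1_CATEGORIES = ["C", "SP", "SBL"]   # cutting, spherical, SBL group
LEVEL2_CATEGORIES = ["SSL", "FS", "L"]   # severe sliding, fatigue spall, laminar

def identify(level1_beliefs, level2_inference):
    """Return (category, belief degree) for one sample.

    level1_beliefs: belief degrees from the level-1 BRB.
    level2_inference: zero-argument callable returning the level-2 belief
    degrees; it is called only if the level-1 result is SBL.
    """
    i = max(range(len(level1_beliefs)), key=lambda n: level1_beliefs[n])
    if LEVEL1_CATEGORIES[i] != "SBL":
        return LEVEL1_CATEGORIES[i], level1_beliefs[i]
    level2_beliefs = level2_inference()
    j = max(range(len(level2_beliefs)), key=lambda n: level2_beliefs[n])
    return LEVEL2_CATEGORIES[j], level2_beliefs[j]

# Hypothetical belief degrees for a sample routed to the level-2 BRB.
result = identify([0.10, 0.20, 0.70], lambda: [0.669, 0.200, 0.131])
```

Because the level-2 BRB is evaluated lazily, samples that are cutting or spherical particles never require 3D characteristics.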

Optimization of the BBRB system
As shown by the structure of the multi-level BRB system, the BRB model on each level is optimized independently. In the level-1 BRB, $N_1 = 3$ and $L_1 = 12$, so the number of parameters optimized in $P_1$ is 48; in the level-2 BRB, $N_2 = 3$ and $L_2 = 8$, so the number of parameters optimized in $P_2$ is 32.
To acquire more information from limited samples and reduce variability, five-fold cross-validation is performed. Specifically, the original dataset is randomly partitioned into five equal-sized sub-datasets. Of the five sub-datasets, a single sub-dataset containing 30 samples is retained as validation data for testing the model, and the remaining four sub-datasets with 120 samples are used as training data. The cross-validation process is then conducted five times and the validation results are averaged over the five rounds. The initial parameters of all BRB systems are the same in every cross-validation process. Specifically, the two initial population sets for the BRBs on the two levels consist of 50 individuals each; the maximum evolutionary generation for each BRB is 100. After the initial model is tuned, optimized parameter sets $P_1$ and $P_2$ are obtained, which are further used in wear mode identification. The optimized parameter sets of each BRB level in the five-fold cross-validation are shown in appendix A. Take the sample ($x_1$: 1.83, $x_2$: 88.86, $x_3$: 0.37, $x_4$: 0.423, $x_5$: 0.0833) as an example and use the parameters of the BRB system in the first cross-validation to determine the type of wear particle. The inference result indicates that this sample belongs to the category of severe sliding wear particles with a belief degree of 0.669. A specific input set can activate several rules simultaneously, but generally only one or two rules with greater activation weights play dominant roles in generating the output. In the above example, although the input set activates five rules on each BRB level, the activation weight of the fourth rule in the level-1 BRB and that of the fifth rule in the level-2 BRB are significantly greater than those of the other activated rules. It can be concluded that these two rules are crucial to making the final determination.
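The partitioning scheme described above can be sketched in a few lines of Python. The function name, the fixed random seed and the assumption that the sample count is divisible by the number of folds are choices made for this illustration only.

```python
import random

def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) for each cross-validation round.

    Assumes n_samples is divisible by n_folds (150 / 5 = 30 in the paper).
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    fold_size = n_samples // n_folds
    folds = [indices[i * fold_size:(i + 1) * fold_size] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(five_fold_splits(150))
```

Each round trains on 120 samples and tests on the remaining 30, and every sample is used for testing exactly once across the five rounds.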

Cross-validation
Validation results.
The BBRB system for wear mode identification was evaluated by five-fold cross-validation using the experimental datasets in section 3. The UA of the testing dataset acquired in each cross-validation trial was used as the evaluation index. The final validation result was the average value of the UA obtained in the five cross-validation trials. The performance of the BBRB was compared with that of four other models: a bi-level BRB without the GA (INI-BBRB), a single-level BRB system using only 2D wear characteristics (SBRB-2D), a single-level BRB system using 2D characteristics and 3D characteristics (SBRB-2D&3D), and a bi-level ANN (BANN) system with 2D characteristics and 3D characteristics as the antecedent attributes. The referential points for the single-level BRB systems (i.e. SBRB-2D and SBRB-2D&3D) are shown in table 2. Consequently, there are 12 rules in SBRB-2D and 96 rules in SBRB-2D&3D if the rule bases are built according to equation (4). To avoid knowledge gaps in the 96-rule base and reduce the optimization burden, only rules that could be clearly decided by expert experience from the experimental data were considered in SBRB-2D&3D, which finally gave 20 rules. As for the BANN model, the structure of the level-1 ANN was 3-4-3, and the level-2 ANN was designed as 2-3-3.
Figure 3 shows the average UA value in the five-fold cross-validation, and figure 4 shows the UA values of the individual wear modes. It can be seen in figure 3 that the UA of the INI-BBRB is 78%, which is lower than the 89.3% of the proposed BBRB. As shown in figure 4, the INI-BBRB performs poorly on SBL particles, especially on laminar particles, with only 43.33% accuracy. This is because the model parameters of the INI-BBRB are determined by experts alone and cannot fit the specific data characteristics in the training dataset. Moreover, since only a few studies have focused on 3D characteristics, there is insufficient knowledge on the 3D characteristics of different wear particles to draw on when building initial BRBs. As a result, the performance of the INI-BBRB is lower than that of the BBRB. Thus, it is necessary to optimize the BRB parameters to enhance the precision of wear mode identification.
As can be seen in figure 3, SBRB-2D provides the lowest average UA value among the five models, i.e. 67.33%. From figure 4 it can be seen that SBRB-2D performs well on identifying cutting wear particles and spherical particles, but extremely poorly on identifying severe sliding wear particles, fatigue spall particles, and laminar particles. The UA values of the other three modes are below 60%. These observations confirm that it is necessary to use 3D characteristics to distinguish wear particles that have similar values in size and shape. Further, cutting wear particles and spherical particles can be accurately identified by SBRB-2D because of their distinguishable 2D characteristics.
As indicated in figures 3 and 4, the performance of SBRB-2D&3D is quite similar to that of the BBRB model. Although SBRB-2D&3D is inferior to the BBRB model in identifying fatigue spall particles and laminar particles, it achieves satisfactory accuracy on the other three wear modes. However, it should be noted that the rule base of SBRB-2D&3D in the analysis is not complete but only covers rules that can be recognized from the experimental data. If a sample does not match any rule of the current rule base, no rule is activated, and the SBRB-2D&3D model may fail to work. More importantly, the SBRB-2D&3D model is more complex than the BBRB in terms of both the number of rules and the number of optimized parameters. When a rule base is constructed by the exhaustive method, the number of rules is determined by equation (4). Any additional antecedent attributes or new referential points will increase the size of the rule base sharply. For example, the numbers of referential points of the five antecedent attributes in SBRB-2D&3D are 2, 2, 3, 2, and 4; therefore, there are 96 rules to be determined and 576 parameters to be optimized. It is difficult for experts to construct the initial rule base, and it is difficult or even impossible to optimize the identification model with insufficient datasets. In contrast to that in the SBRB-2D&3D model, the total number of rules in the BBRB is calculated by equation (19), where the addition operator decreases the model complexity significantly:

$$L_{total} = \sum_{i=1}^{2} \prod_{j=1}^{M_i} T^i_j \qquad (19)$$

As for the BBRB model in this paper, the total number of rules is 20 ($L_{total} = 2 \times 2 \times 3 + 2 \times 4$) and the numbers of parameters optimized on the two levels are 48 and 32, respectively.
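The complexity comparison can be checked numerically with the short Python sketch below. It assumes referential point counts of 2, 2, 3, 2 and 4 for the five attributes (the counts that reproduce the 96 rules stated in the text) and five output categories for the single-level model (inferred from the 576 parameters); the function name is hypothetical.

```python
from math import prod

def parameter_count(n_categories, n_rules):
    """N belief degrees per rule plus one rule weight per rule: N*L + L."""
    return n_categories * n_rules + n_rules

# Single-level model: all five attributes multiply together.
single_rules = prod([2, 2, 3, 2, 4])
single_params = parameter_count(5, single_rules)

# Bi-level model: the rule counts of the two levels are added.
level1_rules = prod([2, 2, 3])
level2_rules = prod([2, 4])
bilevel_rules = level1_rules + level2_rules
bilevel_params = parameter_count(3, level1_rules) + parameter_count(3, level2_rules)
```

Under these assumptions the single-level model needs 96 rules and 576 parameters, whereas the bi-level model needs 20 rules and 80 parameters in total (48 on level 1 and 32 on level 2).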
Since an ANN has remarkable ability to approximate the relationships in a dataset with sufficient data samples, the BANN provides the best average UA value, 92%, among the five models as shown in figure 3. The BANN performs well on identifying laminar particles with a UA value of 93.3%, compared to the 76.7% achieved by the BBRB, but it generates the worst UA value on identifying cutting wear particles (C), as shown in figure 4. As a data-driven method, the BANN has several inherent limitations, but these limitations can be overcome by the proposed BBRB model owing to its embedded expert-domain knowledge and transparent inference process.
Firstly, the BANN model is a black-box simulator. It is known that the BANN model is a non-linear combination of neurons, but it is difficult for engineers to explain what each neuron does, and engineers do not know which rules are applied when they use the BANN model to identify wear modes. In contrast, each parameter of the BBRB model has a real meaning. The intermediate parameter ω (activated rule weight) shows which rules are activated by a specific input set and aggregated in the BBRB model by the analytic ER algorithm, and which rules contribute more to the final output. Additionally, relationships between the characteristics of wear particles and wear particle categories are represented by the final BRB directly and transparently. Every rule in the BRB illustrates one kind of input-output mapping. Because the model is transparent, engineers can understand how the BBRB model reaches its determination, and all belief rules in the BBRB model can be checked by experts before it is put into practice so that irrational rules can be avoided.
Secondly, the performance of the BANN model varies noticeably across training datasets. Table 3 shows the performance of the five models in every round of the five cross-validation trials. From table 3, it can be seen that the performance of the BBRB is the most stable in the five-fold cross-validation, indicating that the BBRB model can identify different wear particles in a robust way. Compared with that of the models developed on the basis of the BRB inference method, the performance of the BANN varies more from one fold to another: the range of UA of the BANN is 0.167 while that of the BBRB is only 0.067. Furthermore, the parameters of the BBRB model change only slightly from fold to fold, as shown in appendix A, which further supports the robustness and credibility of the BBRB model. This may be partly because expert-domain knowledge has been embedded into the BBRB model.
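To illustrate why the activated rule weight ω is directly interpretable, the following minimal sketch computes activation weights using the formulation commonly found in the BRB literature, which may differ in detail from the definition given earlier in this paper. The function name, the rule weights, the attribute weights and the matching degrees are all hypothetical values chosen for illustration, not parameters of the BBRB model.

```python
from math import prod


def activation_weights(rule_weights, match_degrees, attr_weights):
    """Normalized activation weight of each belief rule (common BRB form).

    rule_weights  : relative weight of each rule (hypothetical values)
    match_degrees : per rule, the degree to which the input matches the
                    referential value of each antecedent attribute
    attr_weights  : relative weight of each antecedent attribute
    """
    # Attribute weights are scaled by the largest one, so the most
    # important attribute has exponent 1.
    max_delta = max(attr_weights)
    exponents = [d / max_delta for d in attr_weights]

    # Unnormalized activation: rule weight times the weighted product
    # of the matching degrees of the rule's antecedents.
    raw = [
        theta * prod(a ** e for a, e in zip(alphas, exponents))
        for theta, alphas in zip(rule_weights, match_degrees)
    ]
    total = sum(raw)
    if total == 0:
        raise ValueError("no rule is activated by this input")
    return [r / total for r in raw]


# Three illustrative rules with two antecedent attributes each.
weights = activation_weights(
    rule_weights=[1.0, 1.0, 0.5],
    match_degrees=[[0.8, 0.6], [0.2, 0.4], [0.0, 1.0]],
    attr_weights=[1.0, 1.0],
)
```

In this example the third rule receives zero weight because one of its antecedents is not matched at all, and the first rule dominates the output. An engineer can make this kind of reading directly from ω without inspecting hidden units.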

Discussion
Considering the performance of the BBRB model in the five-fold cross-validation, the parameters in the first fold are selected for level 1 of the BBRB model and the second-fold parameters for level 2. The two BRBs after optimization are listed in appendix B. Figure 5 shows the identification results for the whole dataset based on the final-optimized BBRB model. It can be seen from figure 5 that two cutting wear particles are misidentified as laminar wear particles, one fatigue spall particle and six laminar particles are misrecognized as spherical particles, and five fatigue spall particles are misclassified as laminar particles. Table 4 is the confusion matrix of the final-optimized BBRB model, which clearly reflects the performance of the BBRB model in wear mode identification. In the confusion matrix, TP, FN, and FP represent the number of true positive, false negative and false positive results, respectively. The true positive rate (TPR) of the final-optimized BBRB on the experimental wear particles is at least 80% for every category, and all positive predictive values (PPVs) exceed 80% except that for laminar particles (77.4%). Both figure 5 and table 4 indicate that the proposed BBRB model can obtain satisfactory results in wear mode identification.
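The TPR and PPV values can be checked directly from the counts in table 4. The sketch below is only an illustration: the function name is hypothetical, and the ordering of the identified categories (C, SP, SSL, FS, L) is assumed from the misclassifications described above.

```python
# Counts from table 4: rows are actual categories, columns are the
# categories identified by the final-optimized BBRB model.
# The column order is an assumption consistent with the text.
labels = ["C", "SP", "SSL", "FS", "L"]
confusion = [
    [28, 0, 0, 0, 2],   # cutting
    [0, 30, 0, 0, 0],   # spherical
    [0, 0, 30, 0, 0],   # severe sliding
    [0, 1, 0, 24, 5],   # fatigue spall
    [0, 6, 0, 0, 24],   # laminar
]


def tpr_and_ppv(matrix):
    """Return per-category TPR and PPV, both in percent."""
    n = len(matrix)
    tpr, ppv = [], []
    for i in range(n):
        tp = matrix[i][i]
        actual_total = sum(matrix[i])                        # TP + FN
        predicted_total = sum(matrix[r][i] for r in range(n))  # TP + FP
        tpr.append(100.0 * tp / actual_total)
        ppv.append(100.0 * tp / predicted_total)
    return tpr, ppv


tpr, ppv = tpr_and_ppv(confusion)
for name, t, p in zip(labels, tpr, ppv):
    print(f"{name}: TPR = {t:.1f}%, PPV = {p:.1f}%")
```

The laminar PPV falls below 80% because 31 particles are identified as laminar but only 24 of them actually are.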

Additional validation using filtergrams
In order to highlight the effectiveness of the proposed BBRB model, a verification dataset entirely independent of the training and testing datasets was used to further verify its performance. The samples were collected during operation of a ZH1115 diesel engine with four cylinder liners of different surface textures: an original cylinder liner, a concave cylinder liner, a groove cylinder liner, and a concave-and-groove cylinder liner. The engine operated for five cycles with each cylinder liner, and in one cycle the engine operated at 200 r min⁻¹ for 2 h, 400 r min⁻¹ for 2 h and 800 r min⁻¹ for another 2 h. An oil tube was connected to the hole on the cylinder to collect the oil samples. These oil samples were used to make filtergrams to observe the wear particle morphology. The Scanning Probe Image Processor was used to analyze the characteristics of the wear particles.
Thirty-nine samples were acquired from hundreds of particle images after excluding invalid images and normal sliding wear particle images. During the operation of the diesel engine, fatigue wear was the dominant wear mode. As a result, fatigue spall particles made up the majority. Typical filtergrams of each type of wear particle are shown in figure 6. Figure 7 is the distributed output given by the final BBRB model, showing the categories of wear particles in the verification dataset. As indicated by the rectangle in figure 7(a), two cutting wear particles are misclassified as SBL particles, and two severe sliding wear particles and two fatigue spall particles are misrecognized as laminar particles, as shown in figure 7(b). Since the output of the final BANN model (i.e. the model acquired in the fifth trial) can only be discrete values without belief degrees, the categories of wear particles are represented by 1, 2 and 3, as described in figure 8. In figure 8(a), one cutting wear particle and two spherical particles are misidentified as SBL particles, and two severe sliding wear particles, four fatigue spall wear particles and three laminar particles are misclassified, as indicated in figure 8(b). Consequently, the accuracies of the BBRB and BANN are 0.846 and 0.692, respectively (33 and 27 of the 39 samples correctly identified), indicating that the final BBRB wear mode identification model is more flexible when applied to unseen samples.
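The two accuracy values follow from the misclassification counts reported for figures 7 and 8. A minimal arithmetic check, with a hypothetical helper function, is:

```python
def accuracy(n_samples, misclassified_counts):
    """Fraction of samples identified correctly, given error counts."""
    errors = sum(misclassified_counts)
    return (n_samples - errors) / n_samples


# BBRB: 2 cutting, 2 severe sliding and 2 fatigue spall particles misclassified.
bbrb_accuracy = accuracy(39, [2, 2, 2])

# BANN: 1 cutting, 2 spherical, 2 severe sliding, 4 fatigue spall
# and 3 laminar particles misclassified.
bann_accuracy = accuracy(39, [1, 2, 2, 4, 3])

print(f"BBRB: {bbrb_accuracy:.3f}, BANN: {bann_accuracy:.3f}")
```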
There is some debate as to whether it is necessary to develop an intelligent model for wear mode identification. As indicated in [18], experts are good at utilizing all types of information to assess wear particles, but when information is limited (such as when only numerical information or 3D information is available), experts may have difficulties in making a correct decision. At the same time, experts can perform well with typical particles, such as the cutting wear particle shown in figure 6(b), but they may make mistakes on wear particles that have similar 2D or 3D characteristics, just as the INI-BBRB and SBRB-2D models might. Additionally, it is time-consuming and subjective for experts to process wear particle filtergrams, especially for new engineers with little experience. In this situation, an intelligent model for wear mode identification is both helpful and necessary.

Table 4. Confusion matrix of the final-optimized BBRB model (rows: actual category; columns: identified category).

Actual      C     SP    SSL   FS    L     TP + FN   TPR (%)
C           28    0     0     0     2     30        93.3
SP          0     30    0     0     0     30        100
SSL         0     0     30    0     0     30        100
FS          0     1     0     24    5     30        80
L           0     6     0     0     24    30        80
TP + FP     28    37    30    24    31    150
PPV (%)     100   81.1  100   100   77.4

Figure 6. Filtergrams of the five typical wear particles: (a) severe sliding wear particle, (b) cutting wear particle, (c) fatigue spall particle, (d) laminar particle, and (e) spherical particle.

Conclusions
This paper is the first to propose a multi-level BRB system to categorize wear particles, for the purpose of identifying the wear modes of marine diesel engines. To reduce model complexity and the calculation burden, two individual levels are designed to deal with the 2D and 3D characteristics of wear particles. The model referential points and parameters are determined by the silhouette value, fuzzy c-means clustering, and the genetic algorithm (GA). Experimental datasets are used to verify the effectiveness of the BBRB model. The analysis results demonstrate that (1) the BBRB model can achieve promising results in wear mode identification; (2) 3D characteristics are essential to distinguishing wear particles, especially wear particles with similar 2D characteristics; (3) dividing a whole BRB model into several levels and determining referential points by using clustering algorithms are effective for reducing the size of rule bases; and (4) since expert-domain knowledge has been embedded into the proposed BBRB, the model exhibits stable performance in five-fold cross-validation. Although the BBRB model for wear mode identification has been verified by experimental data, its capability still needs to be enhanced and tested on wear particles collected from marine diesel engines in real-world ships. Currently, we are collecting oil samples from a real ship, named Changjing 2, in the Yangtze River, China, to evaluate the performance of the proposed BBRB system. In addition, our experimental analysis suggests that if new rules could be added to a rule base automatically on the basis of new samples, the performance of the BBRB would be further enhanced. Our future work will investigate self-adaptive BRB models.