Health index extraction for power-shift steering transmission using selected oil field data

The concentration of wear debris in used lubricating oil is generally observed from spectral oil analysis. The element concentration data are one of the commonly used oil field data for degradation evaluation of power-shift steering transmission. However, in practical applications, the underlying degradation degree of the power-shift steering transmission is difficult to evaluate due to the complexity of element concentration data. To solve this problem, we propose a health index extraction methodology using a weighted average method to better characterize the underlying degradation, which leads to an accurate estimation of the residual life before the power-shift steering transmission needs to be maintained. The extracted health index not only maximizes the monotonic trend of underlying degradation but also minimizes the failure threshold variance between different power-shift steering transmissions. The method includes element concentration data modification, data selection, and data fusion steps that result in a reasonable power-shift steering transmission degradation model. Finally, a case study is provided to illustrate the proposed method. The results show that the extracted health index outperforms each selected element concentration data.


Introduction
Power-shift steering transmission (PSST) is one of the vital components widely used in many heavy tracked vehicles. As a PSST operates, metal wear debris spalling from each mechanical pair mixes evenly into the lubricating oil and its concentration increases, which is a slow degradation process. Metal wear debris mixed in the lubricating oil accelerates the wear and tear of every mechanical component, consequently leading to the degradation of the PSST system. 1,2 After a certain degree of degradation, the PSST system will fail to perform normal operations. For mechanical powertrains, faults caused by oil contamination are the primary failure modes, with more than 50% associated with metal wear debris. 1 Failure of the PSST results in production downtime and delays, which often lead to economic losses and even safety problems that, in turn, may have catastrophic consequences. Thus, the PSST system should be regularly monitored, and the occurrence time of a system soft failure should be accurately estimated to perform predictive maintenance (PM). To address this issue, lubricant condition monitoring (LCM) technology has been used to observe oil field data for oil-lubricated machines. 3,4 The purpose of this article is to address a health index (HI) extraction problem for PSST with the observed oil field data to determine the expected residual life (RL) before the PSST needs to be maintained.
In the LCM literature, 5-7 the amount of wear debris is one of the most common types of degradation data used to assess the severity of system degradation. The amount of wear debris is sampled by oil spectral analysis using MOA II (atomic emission spectroscopy) during the oil sampling period. 8 Using this method, 15 groups of element concentration data samples can be obtained. With the sampled element concentration data, prognostic analysis can then be performed to estimate the RL of an operating PSST. 9,10 In the existing literature, many techniques and methods have been used to model the relationship between the system degradation of oil-lubricated machines and element concentration data. 11-16 For example, Wang et al. described the analysis procedure of oil field data in the study of Wang and Burairah 17 and developed a machine degradation model using a stochastic diffusion process in their study 18 by assuming that the oil field data are a valuable condition indicator for characterizing the underlying degradation. In this article, the rationality of using element concentration data for degradation modeling was investigated. Recently, by modeling the increasing trend of wear particles in oil, Vališ and Žák 19 built a system degradation model for an internal combustion engine to determine the expected moment when a soft failure occurred. It is noted that the expected moment can be used as the system RL and further be set as the time to perform planned PM. Most recently, Yan et al. 20 presented a condition-based maintenance (CBM) problem with selected oil field data to determine the optimal time of machine maintenance. In addition, the work of Yan et al. 20 presents a framework for using sampled oil field data for maintenance optimization. A review of the application of oil field data for LCM and CBM can be found in Wakiru et al. 21 and the references therein.
Most of the existing literature simply uses single-element data (e.g. Fe, Cu, and Mo 9-12 ) in wear debris concentrations to establish degradation models. In practice, however, many studies have shown that a degradation model based on single-element concentration data is problematic, 22,23 which may cause inaccuracy in system RL estimation.
To our knowledge, no HI extraction method has been developed in the literature that can be utilized for the fusion of oil field data. Therefore, this article seeks to fill this gap by fusing multiple element concentration data to extract an HI that characterizes the system degradation process and can be used for RL estimation. To address this issue, a unified degradation model method has been proposed in our previous research 24 in which a copula function is utilized to deal with multiple oil field data. However, it is often challenging to determine an appropriate copula function, especially when dealing with more oil field datasets. Unlike the existing works, this article proposes an HI extraction methodology by developing a quadratic programming problem that can simultaneously maximize the HI monotonicity and minimize the failure threshold variance of different units. With the proposed efforts, we expect to attain more accurate RL estimation.
The remaining parts are organized as follows. The ''Overview of the element concentration data'' section describes the PSST system and the oil field data used in this article. The ''Development of an HI extracting method'' section describes the key steps associated with the preprocessing of selected oil field data, including element concentration modification for lubricating oil supplement and element concentration data selection, and develops explicit formulas for extracting an HI based on quadratic programming for degradation modeling and RL estimation. The ''Case study'' section provides an illustrative case study. Finally, the ''Discussion and conclusion'' section provides the conclusion and future research.

System model description
This article considers a PSST system monitored using regular oil spectral analysis to implement LCM and predict its soft failure occurrence. The concerned PSST system combines a multi-speed shifting system with an infinite steering system, and is widely used in heavy industrial applications such as heavy-duty tracked vehicles. Figure 1 shows a sketch diagram of a PSST, and   Table 1, and detailed sampling principles can be found in Yan et al. 7,8

Dataset description
The element concentration dataset used herein includes 20 training units and 5 test units. Each unit was run to failure due to severe wear under the same operating conditions. Time was measured in inspection periods, in this case, 5 Mh. All of the collected samples were analyzed immediately. The concentration data, in parts per thousand, of 15 types of elements related to the degradation process of the operating PSST were obtained using MOA II. 7,8 The element concentration data of one unit are shown in Table 2.
As mentioned above, modeling the degradation mechanism and representing the degradation state of the operating PSST are difficult due to the information redundancy of multiple element concentration data. 20 Thus, to characterize the degradation level of the operating PSST, in the next section, a composite HI will be extracted based on the fusion of the element concentration data.

Development of an HI extracting method
In this section, several key steps associated with the preprocessing of the element concentration data and the construction of a composite HI are discussed.

Dataset preprocessing
Modification of element concentration. Each time a certain volume of used lubricating oil is sampled for LCM, an equal amount of unpolluted oil is injected. Consequently, the wear particle concentrations measured in the subsequent inspection periods are not ideal; specifically, they are lower than the actual concentrations. Thus, the first step toward an accurate characterization of the degradation condition is to modify the original element concentration data during preprocessing. A function, equation (1), was proposed to modify the element concentration data, where $x_{i,j}$ and $x^{*}_{i,j}$ are the measured value of the wear particle concentrations and the actual concentrations for unit i in the inspection period j; $x_0$ represents the element concentration data of the unpolluted oil; $V_0$ represents the initial volume of the lubricating oil; and $V_s$ is the sampling size. To eliminate the influence of initial wear particles, the initial concentration value is then subtracted from the modified data.
Selection of element concentration data. Recall that 15 types of major element concentration data are obtained from oil spectral analysis using MOA II. Therefore, it is necessary to determine which element concentration data are used as the input of our HI extraction method. In engineering practice, time-series data that exhibit a significant increasing or decreasing trend during operation are of interest. Based on this criterion, the element concentration data are selected if the last sample is larger or smaller than the initial sample. 25
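The preprocessing idea in this subsection can be illustrated with a short Python sketch. It is not the article's equation (1); it assumes a simple mass balance in which every sampling removes a volume `vs` at the measured concentration and refills it with unpolluted oil at concentration `x0`, so that the debris lost in earlier periods is added back. The function names and the numbers are hypothetical.

```python
def correct_for_replenishment(measured, x0, v0, vs):
    """Return dilution-corrected concentrations under a mass-balance assumption.

    measured: concentrations of one element for one unit, in inspection order
    x0: concentration of the element in unpolluted oil
    v0: total oil volume; vs: volume sampled (and refilled) at each inspection
    """
    corrected = []
    removed = 0.0  # cumulative concentration lost to earlier sampling and refill
    for x in measured:
        corrected.append(x + removed)
        # Removing vs at concentration x and refilling with oil at x0
        # lowers the sump concentration by (vs / v0) * (x - x0).
        removed += (vs / v0) * (x - x0)
    return corrected


def remove_initial_level(series):
    """Subtract the first value so that the series starts at zero."""
    return [x - series[0] for x in series]


if __name__ == "__main__":
    fe = [12.0, 15.0, 19.0, 26.0]  # hypothetical measured Fe concentrations
    fixed = correct_for_replenishment(fe, x0=2.0, v0=40.0, vs=0.4)
    print(remove_initial_level(fixed))
```

Under this assumption the correction grows with the number of earlier samples, which matches the observation that later measurements underestimate the actual concentration.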

Methodology development
Researchers usually use weighted average functions to fuse multiple time-series data and extract indicators that can characterize the implicit information. 22,26 Therefore, to fuse the selected element concentration data, we use the weighted average function to formulate the HI extracting method, represented as

$$d_{i,j} = X_{i,j} v$$

where $v \in R^{s \times 1}$ represents the vector of weight coefficients that is used for fusing multiple element concentration data at each inspection period of each unit, and $s$ represents the number of selected element concentration data; $d_{i,j}$ and $X_{i,j}$ are the values of the extracted HI and the vector of the selected element concentration data for unit i in inspection period j; and the weights satisfy $v' M' \mathbf{1} = 1$, where $M \in R^{s \times s}$ is a diagonal matrix used for representing the degradation trend information, of which the diagonal element is 1 (-1) when the corresponding selected element concentration data have an increasing (decreasing) trend.
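As a minimal sketch of the weighted average fusion described above, the following Python functions compute one HI value and rescale a raw weight vector so that the signed weights sum to 1. The function names and the example numbers are hypothetical, and the rescaling step is only one simple way to satisfy the normalization constraint.

```python
def health_index(x_row, weights):
    """Weighted average d = X v for one unit and one inspection period."""
    if len(x_row) != len(weights):
        raise ValueError("one weight per selected element is required")
    return sum(x * w for x, w in zip(x_row, weights))


def normalize_weights(raw, trend_signs):
    """Rescale raw weights so that v' M' 1 = 1, with M = diag(trend_signs).

    trend_signs holds +1 for an increasing element and -1 for a decreasing one.
    """
    total = sum(sign * w for sign, w in zip(trend_signs, raw))
    if total == 0:
        raise ValueError("signed weights sum to zero; cannot normalize")
    return [w / total for w in raw]


if __name__ == "__main__":
    signs = [1, 1, -1]                       # hypothetical trend directions
    v = normalize_weights([2.0, 1.0, -1.0], signs)
    print(v, health_index([10.0, 4.0, 8.0], v))
```

Giving a decreasing element a negative weight makes its contribution to the HI increase over time, which is the role the diagonal matrix M plays in the constraint.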
Desirable properties for extracting the HI. To guarantee the obtained HI is more suitable for LCM and RL estimation than the element concentration data, a few desirable properties for the HI are defined to enhance its effectiveness for reasonable degradation modeling and successful RL estimation. The element concentration data are always nonmonotonic due to measurement errors; thus, an HI that has a monotonic trend should be extracted. Furthermore, for given element concentration data, the failure thresholds between different units are always different, so the obtained HI should have little variation in the failure threshold. Considering these two requirements, the two basic properties, which were proposed by Saxena et al. 27 to make the condition monitoring data have a better application in degradation modeling and RL prediction, are adopted for the extracted HI to have a better LCM application.

Property 1: The degradation trend of the extracted HI should be monotonic with the PSST system degradation.
Property 2: The variance in the failure threshold of different PSST systems should be minimal under the same unit and operating condition.
Methodology optimization. Since each PSST system is assumed to fail due to wear debris accumulation under the same operating condition, we expect to extract an HI that has a similar degradation pattern for all systems. This expectation can be achieved by jointly programming the two properties when extracting the HI. Therefore, the objective function should consist of two parts: the weighted amount of nonmonotonicity for Property 1 and the variance of the failure threshold for Property 2. For Property 1, a slack variable is used to compensate for monotonic conflictions. In the case where the value of the HI is less than that in the previous inspection period, the difference between the HI and the reasonable value is measured using a slack variable. Specifically, $e_{i,j} = \max(d_{i,j-1} - d_{i,j}, 0)$. Then, the total weighted amount of confliction in monotonicity is obtained by

$$\sum_{i=1}^{m} \sum_{j} c_{i,j} e_{i,j}$$

where the inner sum runs over the $n_i - 1$ slack variables of unit i; $e_{i,j}$ represents a slack variable used to quantify the degree of confliction in monotonicity of unit i in inspection period j; $c_{i,j}$ represents the weight coefficient of the slack variable $e_{i,j}$; $m$ represents the number of training units; and $n_i$ represents the total number of inspection periods of unit i. For Property 2, the variance of the failure threshold of the HI is measured using an unbiased sample variance, which can be formulated as

$$v' Y' D Y v$$

where $Y \in R^{m \times s}$ is a matrix in which rows represent each unit and columns represent each selected element concentration data, and $D \in R^{m \times m}$ represents a symmetric matrix of the form $D = (I - O/m)/(m - 1)$, where $O$ is a matrix with all entries equal to 1, and $I$ is an identity matrix.
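The two parts of the objective can be evaluated for a given HI with a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and it assumes that each row of Y holds the element concentrations of one unit at its last inspection before failure.

```python
from statistics import variance


def slack_values(hi):
    """e = max(d_prev - d_cur, 0) for each consecutive pair of HI values."""
    return [max(prev - cur, 0.0) for prev, cur in zip(hi, hi[1:])]


def monotonicity_penalty(hi_by_unit, c_by_unit):
    """Sum of c * e over all units and all slack variables."""
    total = 0.0
    for hi, c in zip(hi_by_unit, c_by_unit):
        total += sum(cj * ej for cj, ej in zip(c, slack_values(hi)))
    return total


def threshold_variance(hi_by_unit):
    """Unbiased sample variance of the last HI value of each unit."""
    return variance([hi[-1] for hi in hi_by_unit])


def threshold_variance_quadratic(Y, v):
    """The same variance written as v' Y' D Y v with D = (I - O/m)/(m - 1)."""
    m = len(Y)
    t = [sum(y * w for y, w in zip(row, v)) for row in Y]  # t = Y v
    mean_t = sum(t) / m
    d_t = [(ti - mean_t) / (m - 1) for ti in t]            # D t
    return sum(ti * di for ti, di in zip(t, d_t))          # t' D t


if __name__ == "__main__":
    hi = [[0.0, 2.0, 1.0, 4.0], [0.0, 1.0, 3.0, 3.5]]  # hypothetical HI paths
    c = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
    print(monotonicity_penalty(hi, c), threshold_variance(hi))
```

Multiplying by D centers the vector of failure-time HI values and divides by m - 1, so the quadratic form reduces to the usual unbiased sample variance.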
Above all, by maximizing the monotonic trend and minimizing the threshold variance between different units, the method optimization model is formulated as the quadratic programming problem in equation (5), where $r$ is a tuning parameter used to measure the relative importance of these two properties in the optimization model, and $E_i \in R^{(n_i - 1) \times (s + n_i - 1)}$ is a matrix used to maximize the monotonicity for unit i. The matrix $E_i$ is constructed from $x_{i,k,j}$, the kth type of element concentration data of unit i in inspection period j, and $v_i \in R^{(s + n_i - 1) \times 1}$ is a vector that needs to be determined to ensure that the extracted HI is monotonic after introducing the slack variable $e_{i,j}$. The vector $v_i$ can be formulated as

$$v_i = [v', e_{i,1}, \ldots, e_{i,n_i - 1}]'$$

Parameter setting. This section discusses how the crucial parameters in our optimization model are set such that the obtained HI is reasonable for the operating conditions and degradation mechanism. Specifically, we focus on setting the weight coefficients $c_{i,j}$ and the tuning parameter $r$.

Weight coefficient setting:
Recall that, for each slack variable $e_{i,j}$, $c_{i,j}$ is the corresponding weight coefficient as unit i degrades, and $c_{i,j} e_{i,j}$ measures the weighted amount of confliction in monotonicity for unit i in inspection period j. As a unit degrades, the accuracy of degradation modeling and RL prediction becomes increasingly sensitive to the number of monotonic conflictions of the HI. Therefore, as the inspection period j increases, the slack variable $e_{i,j}$ is assigned a higher weight; that is, the weight coefficient $c_{i,j}$ must increase with j. The values of the weight coefficients $\{c_{i,j}\}$ are further assumed to follow an arithmetic series ($2c_{i,j} = c_{i,j+1} + c_{i,j-1}$). Since more weight is assigned to $e_{i,j}$ as the inspection period j increases, $v$, the estimated weight coefficient vector, becomes increasingly dominated by the element concentration data at inspection periods close to the failure time. Note that the arithmetic series assumption may not be suitable in some applications; thus, a geometric series ($c_{i,j}^2 = c_{i,j+1} c_{i,j-1}$) or other assumptions may be used to initialize the optimization model, depending on the emphasis placed on the monotonic confliction.
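The two weighting schemes mentioned above can be generated as follows. This sketch only enforces the stated series properties and an increasing order; normalizing the weights to sum to 1 is an added assumption, and the function names and default values are hypothetical.

```python
def arithmetic_weights(n, first=1.0, step=1.0):
    """Increasing weights with 2*c[j] = c[j+1] + c[j-1], scaled to sum to 1."""
    if n < 1 or first <= 0 or step <= 0:
        raise ValueError("need n >= 1 and positive first and step")
    raw = [first + step * j for j in range(n)]
    total = sum(raw)
    return [c / total for c in raw]


def geometric_weights(n, first=1.0, ratio=1.5):
    """Increasing weights with c[j]**2 = c[j+1] * c[j-1], scaled to sum to 1."""
    if n < 1 or first <= 0 or ratio <= 1:
        raise ValueError("need n >= 1, positive first, and ratio > 1")
    raw = [first * ratio ** j for j in range(n)]
    total = sum(raw)
    return [c / total for c in raw]


if __name__ == "__main__":
    print(arithmetic_weights(5))
    print(geometric_weights(5))
```

Scaling by a common factor preserves both series relations, so the normalization does not change which scheme is in use; the geometric option concentrates more of the penalty on the periods closest to failure.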

Tuning parameter setting:
The relative importance of the two terms in the optimization model in equation (5), that is, confliction in monotonicity and variance in the failure threshold, is controlled by the tuning parameter $r$. We can use cross-validation to obtain the optimal value of $r$. Details of the calculation process can be found in Kohavi. 28 Increasing $r$ puts more emphasis on reducing the confliction of the monotonicity property of the extracted HI at the expense of increasing the variance of the failure threshold. This multi-objective optimization problem is often solved by plotting the efficient frontier with regard to these two items of the optimization function in equation (5). In practice, we set the optimal value of the tuning parameter $r$ based on the importance assigned to the two items of the optimization function. Figure 3 shows the flowchart of the proposed HI extraction method. Using this method, the optimal values of $v$ and $r$, represented by $v^{*}$ and $r^{*}$, can be obtained. Then, the HI for LCM and RL prediction can further be extracted based on $v^{*}$ and the selected element concentration data.
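The trade-off controlled by the tuning parameter can be explored with a small brute-force sketch in Python. It does not reproduce equation (5): it assumes an objective of the form r times the monotonicity penalty plus the failure-threshold variance, restricts the weights to a nonnegative grid that sums to 1 (all trends increasing), and replaces the quadratic programming solver with a grid search. All names and data are hypothetical.

```python
def hi_series(unit, v):
    """HI path of one unit: weighted average of each inspection row."""
    return [sum(x * w for x, w in zip(row, v)) for row in unit]


def objective_terms(units, v):
    """Return (monotonicity penalty, failure-threshold variance) for weights v."""
    penalty, last = 0.0, []
    for unit in units:
        d = hi_series(unit, v)
        n = len(d)
        for j in range(1, n):
            c = 2.0 * j / (n * (n - 1))  # increasing weights that sum to 1
            penalty += c * max(d[j - 1] - d[j], 0.0)
        last.append(d[-1])
    mean = sum(last) / len(last)
    var = sum((t - mean) ** 2 for t in last) / (len(last) - 1)
    return penalty, var


def simplex_grid(s, steps):
    """Nonnegative weight vectors of length s on a regular grid, summing to 1."""
    def compositions(parts, total):
        if parts == 1:
            yield (total,)
            return
        for k in range(total + 1):
            for rest in compositions(parts - 1, total - k):
                yield (k,) + rest
    return [[k / steps for k in comp] for comp in compositions(s, steps)]


def best_weights(units, r, steps=20):
    """Grid-search minimizer of r * penalty + variance (assumed objective)."""
    def cost(v):
        penalty, var = objective_terms(units, v)
        return r * penalty + var
    return min(simplex_grid(len(units[0][0]), steps), key=cost)


if __name__ == "__main__":
    # Two hypothetical units, two elements, four inspection periods each.
    units = [
        [[0, 0], [1, 3], [2, 2], [3, 6]],
        [[0, 0], [2, 2], [4, 5], [6, 6]],
    ]
    for r in (0.0, 0.5, 5.0):
        v = best_weights(units, r)
        print(r, v, objective_terms(units, v))
```

Listing the penalty and variance pairs for a range of r values gives the points of the efficient frontier; a dedicated quadratic programming solver would replace the grid search for realistic problem sizes.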

Case study
In this section, we provide a numerical study using element concentration data collected for inspections from each PSST system in the ''Overview of the element concentration data'' section to illustrate the HI extraction procedure. To challenge the developed method, we further investigated the performance of our proposed method and compared it with the existing single-element concentration data-based degradation modeling method. Specifically, we compare the accuracy of the RL prediction by using the constructed HI and using each selected element concentration data based on the same Wiener process (WP)-based degradation model.
Figure 3. Flowchart of the HI extracting method.

Dataset preprocessing and HI extraction
Concentration modification and data selection. After the wear particle concentration is modified using equation (1), the element concentration data used for extracting the HI are then selected. The selection is based on the modified element concentration data showing a consistent decreasing or increasing trend for all units. Among the 15 types of element concentration data shown in Table 2, six (i.e. $s = 6$) element concentrations are selected for HI extraction and degradation modeling, namely, Fe, Cr, Cu, Ni, Mn, and Mo. As a result, the corresponding diagonal elements of $M$ are set as $[1, 1, 1, 1, 1, 1]'$.
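The selection rule above (the same trend direction in every unit, judged by comparing the last sample with the first) can be sketched in Python as follows. The function names, the element names other than Fe, and the data are hypothetical.

```python
def trend_sign(series):
    """Return +1 if the last sample exceeds the first, -1 if lower, else 0."""
    if series[-1] > series[0]:
        return 1
    if series[-1] < series[0]:
        return -1
    return 0


def select_elements(units):
    """Keep elements whose trend sign is the same and nonzero in every unit.

    units: list of dicts mapping element name -> concentration series.
    Returns a dict element -> sign, usable as the diagonal of M.
    """
    selected = {}
    for element in units[0]:
        signs = {trend_sign(unit[element]) for unit in units}
        if len(signs) == 1 and 0 not in signs:
            selected[element] = signs.pop()
    return selected


if __name__ == "__main__":
    units = [
        {"Fe": [1, 2, 5], "Si": [3, 2, 4], "Na": [5, 4, 1]},
        {"Fe": [0, 3, 4], "Si": [4, 5, 2], "Na": [6, 3, 2]},
    ]
    print(select_elements(units))
```

An element that rises in one unit and falls in another is dropped, because a single sign on the diagonal of M could not describe it for all units.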
Weight coefficient and tuning parameter setting.
1. Weight coefficient setting: Since we assume in the ''Development of an HI extracting method'' section that the weight coefficient increases linearly, we adopt an arithmetic series for $\{c_{i,j}\}$. These weight coefficients penalize the confliction in monotonicity for Property 1. Furthermore, we select the weighted average function to demonstrate the HI extraction method.
2. Tuning parameter setting: In this case study, the tuning parameter was set as $r = 0.5$ to illustrate the proposed method. The optimization model in equation (5) is solved based on the selected element concentration data. Table 3 shows the optimal weight $v^{*}$. Note that the value of $r$ can be selected based on the importance assigned to each property when extracting the HI.

WP-based degradation modeling
Degradation model development. The PSST system initially operates in a healthy state with unpolluted oil and is subject to wear particles obtained from regular inspections. The selected six element concentration data are shown in Figure 4.
As the plots in Figure 4 show, the element concentration data exhibit random diffusion forms. Therefore, the WP-based degradation model proposed in Si et al. 29 is utilized for evaluating the performance of the extracted HI when used for RL prediction. The degradation model is given by

$$X(t) = X(0) + ut + sB(t)$$

where the degradation process $\{X(t), t \ge 0\}$ is driven by a standard Brownian motion $\{B(t), t \ge 0\}$; $s$ is the diffusion parameter; $u$ is an unknown drift coefficient characterizing the degradation rate; and $sB(t) \sim N(0, s^2 t)$ denotes the randomness and time-varying uncertainty of the degradation process. It is assumed that a PSST system is identified as failed and needs to be replaced when $\{X(t), t \ge 0\}$ exceeds the failure threshold. Therefore, the RL of the PSST system is defined based on the first hitting time (FHT) of the degradation process. Given the failure threshold $v$, the RL $L_k$ at time $t_k$, with $x_k = X(t_k)$, can be formulated as

$$L_k = \inf\{l_k : X(t_k + l_k) \ge v \mid x_k < v\}$$

Based on the Markov property of the WP, the degradation track of the PSST system after time $t_k$ is given as follows

$$X(t) = x_k + u(t - t_k) + s[B(t) - B(t_k)]$$

According to the above description of the degradation model, if time $t$ is the FHT of the degradation process $\{X(t), t \ge 0\}$, then the RL of the PSST system at time $t_k$ can be represented as $t - t_k$. Let $l_k = t - t_k$, $X(l_k) = X(t) - x_k$, and $w_k = v - x_k$; then the RL of the PSST system at time $t_k$ is equal to the FHT of the degradation process $\{X(l_k), l_k \ge 0\}$

$$L_k = \inf\{l_k : X(l_k) \ge w_k\}$$

Under the above construction, the FHT of the WP follows an inverse Gaussian distribution. Thus, the probability density function of the RL of the PSST system at time $t_k$ is given as follows

$$f_{L_k}(l_k) = \frac{w_k}{\sqrt{2\pi s^2 l_k^3}} \exp\left(-\frac{(w_k - u l_k)^2}{2 s^2 l_k}\right)$$

Parameter estimation. To initialize the model defined in equation (8), we estimate the model parameters $s^2$ and $u$ using maximum likelihood estimation (MLE) based on the element concentration data of the training units. The element concentration data of the ith training unit at time $t_j$ are denoted as $x_{i,j}$, and the entire dataset is $\{X_i(t_j) = x_{i,j}, i = 1, \ldots, N, j = 1, \ldots, M\}$.
The degradation model parameter vector is denoted as $Q = (s^2, u)'$. Then, the likelihood function of all element concentration data histories is denoted as $j(Q \mid X)$, and the MLE of $s^2$ and $u$ can be obtained by maximizing $j(Q \mid X)$. A detailed description of the MLE method can be found in Pecht and Jaai. 30 Table 4 shows the estimated value of $\hat{s}^2$ for all selected element concentration data and the extracted HI, from which we can see that the extracted HI fits the degradation process better than each original element concentration data.
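For a Wiener process observed at a fixed interval, the increments are independent normal variables, which leads to closed-form maximum likelihood estimates. The Python sketch below assumes that all training units share the same drift and diffusion parameters and the same inspection interval `dt`; these assumptions and the function name are ours and may differ from the estimation details in the article.

```python
def wiener_mle(paths, dt):
    """Closed-form MLE of drift u and diffusion s^2 from equally spaced paths.

    Each increment dx is modeled as N(u * dt, s^2 * dt), so
    u_hat = sum(dx) / (n * dt) and s2_hat = sum((dx - u_hat*dt)^2) / (n * dt).
    """
    increments = [b - a for path in paths for a, b in zip(path, path[1:])]
    n = len(increments)
    if n == 0:
        raise ValueError("at least two observations are required")
    u_hat = sum(increments) / (n * dt)
    s2_hat = sum((dx - u_hat * dt) ** 2 for dx in increments) / (n * dt)
    return u_hat, s2_hat


if __name__ == "__main__":
    hi_paths = [[0.0, 1.0, 3.0], [0.0, 2.0, 3.0]]  # hypothetical HI histories
    print(wiener_mle(hi_paths, dt=1.0))
```

A smaller estimated diffusion parameter means the path stays closer to a straight line with slope u, which is how the comparison in Table 4 can be read.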
The dataset of the last inspection period before failure in all training units for element concentration data k is denoted as $X_{m,k,n} = [X_{1,k,n_1}, \ldots, X_{M,k,n_M}]$. The variance of the failure threshold for element concentration data k is denoted as $v_f^k$, as shown in Table 5. We can see that the $v_f^k$ of the extracted HI is smaller than that of any other selected element concentration data. Taking $\hat{s}^2$ and $v_f^k$ into consideration, the extracted HI conforms to the two desirable properties.
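The RL distribution used by the WP-based model can also be checked numerically. The following Python sketch evaluates the inverse Gaussian density of the first hitting time for a remaining distance `w` to the threshold, drift `u`, and diffusion `s2`, and verifies by trapezoidal integration that it integrates to about 1 with mean `w / u`. The parameter values are hypothetical.

```python
import math


def rl_pdf(l, w, u, s2):
    """Inverse Gaussian density of the first hitting time of level w > 0."""
    if l <= 0:
        return 0.0
    scale = w / math.sqrt(2.0 * math.pi * s2 * l ** 3)
    return scale * math.exp(-((w - u * l) ** 2) / (2.0 * s2 * l))


def mean_rl(w, u):
    """Expected first hitting time for positive drift u."""
    if u <= 0:
        raise ValueError("the mean is finite only for positive drift")
    return w / u


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


if __name__ == "__main__":
    w, u, s2 = 10.0, 1.0, 1.0  # hypothetical distance to threshold and parameters
    area = trapezoid(lambda l: rl_pdf(l, w, u, s2), 0.0, 60.0, 6000)
    mean = trapezoid(lambda l: l * rl_pdf(l, w, u, s2), 0.0, 60.0, 6000)
    print(area, mean, mean_rl(w, u))
```

The ratio `w / u` gives a point estimate of the RL, while the full density describes the uncertainty around it.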

Prediction results
Using the estimated parameters, the degradation model can be initialized to predict the RL of the testing units. Figure 5 shows the degradation state of a random testing unit. The FHT of the extracted HI is 180 Mh, which represents the failure period of PSST system degradation and provides a reference for formulating the maintenance policy. In addition, Figure 6 shows the predicted conditional distribution of the degradation state at several inspection periods, which characterizes the uncertainty of the PSST system degradation.
The relative error between the RL prediction and the actual RL is calculated, and the comparison considers two cases: (1) the RL prediction based on the extracted HI and (2) the RL prediction based on each selected element concentration data. Specifically, the relative error, $err_{i,k}$, is defined as the relative difference between the RL prediction values and the actual RLs for unit i and element concentration data k, where $T_i$ represents the actual RL for testing unit i, $\hat{T}_{i,k}$ represents the RL prediction for unit i and element concentration data k, $n_i$ represents the number of inspection periods of unit i, and $t_s$ represents the inspection interval (i.e. $t_s$ = 5 Mh). Figure 7 shows the absolute value of the relative error using each selected element concentration data and the extracted HI at different quantiles of the machine operating time. Figure 7 shows that (1) compared with each selected element concentration data, the extracted HI provides the best prognostic result due to the control of the two desirable properties; in other words, the monotonic property is maximized and the variance in the failure threshold is minimized when extracting the HI; and (2) the RL prediction using the extracted HI becomes increasingly accurate as the unit operates, due to the control of the weight coefficient $c_{i,j}$. Specifically, with the arithmetic series for $\{c_{i,j}\}$, larger penalties are given to the slack variables $\{e_{i,j}\}$ in the inspection periods closer to failure. These useful characteristics have practical applications when dealing with equipment that has high reliability requirements.
In addition to the relative errors shown in Figure 7, researchers are often interested in comparing the root mean square error (RMSE) 31 of the RL predictions and actual RLs. A small RMSE indicates a better RL prediction with less absolute error. The RMSE values for all selected element concentration data and the HI are shown in Table 6. Based on Table 6, the HI extracted using our proposed method has the smallest RMSE compared with using each selected element concentration data. The extracted HI results in more accurate RL predictions, which provides a useful foundation for an optimal PM strategy for oil-lubricated systems.
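The two accuracy measures can be computed as in the Python sketch below. The RMSE follows its standard definition; the relative error is written here as the prediction error divided by the actual RL, which is an assumed form and may differ from the normalization used in the article. The function names and numbers are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and actual RL values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def relative_error(predicted, actual):
    """Assumed form: (prediction - actual) / actual for one RL value."""
    if actual == 0:
        raise ValueError("actual RL must be nonzero")
    return (predicted - actual) / actual


if __name__ == "__main__":
    actual_rl = [100.0, 75.0, 50.0, 25.0]   # hypothetical RLs in Mh
    hi_based = [92.0, 70.0, 52.0, 24.0]     # hypothetical predictions
    print(rmse(hi_based, actual_rl))
    print([abs(relative_error(p, a)) for p, a in zip(hi_based, actual_rl)])
```

The RMSE summarizes the absolute error over all prediction points of a unit, whereas the relative error shows how the accuracy changes as the unit approaches failure.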

Discussion and conclusion
This article provides a systematic methodology that includes concentration modification, data selection, data generalization, and data fusion procedures combining element concentration data (e.g. Fe, Cu) obtained for inspections from a PSST system to extract an HI that accurately characterizes the system degradation condition. The novelty of this method is integrating multi-dimensional element concentration data in a unified HI. The constructed method is advanced in that it can maximize the monotonicity of the indicator and minimize the variance in the failure threshold simultaneously. The developed method was tested and validated using element concentration data from several PSST systems. The WP-based degradation model was utilized to evaluate the validity of the extracted HI by estimating the RL of each oil-lubricated system in time.
The results show the improved performance of the extracted HI compared with each selected original element concentration data.
There are several important directions deserving further study. First, more oil field data (e.g. ferrography, viscosity, and acidity) tailored to LCM are necessary. Second, kernel methods that can fuse nonlinear time-series data should be investigated. Third, an effective maintenance strategy optimization method based on the extracted HI should be developed.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is partially supported by the National Natural Science Foundation of China (NSFC) under grant numbers 51475044 and 51975047, and partially supported by the China Scholarship Council under grant number 201806030083.