On Non-Homogeneous Markov Reward Model to Availability and Importance Analysis for CNC Machine Tools

This paper applies the Markov reward method to the availability and importance evaluation of CNC machine tools. A CNC machine tool is regarded as a multi-state system whose state transitions are caused by failures of elements and subsystems and by the corresponding maintenance activities during its lifetime. To account for the time-varying failure rates of the subsystems, a non-homogeneous Markov reward model is introduced to evaluate availability and to identify subsystem importance for aging multi-state CNC machine tools under minimal repair. Procedures for defining the failure rates, the state transition matrix, and the reward matrix are suggested for the availability and importance measures. A numerical example illustrates the approach.

International Journal of Mathematical, Engineering and Management Sciences, Vol. 6, No. 1, 30-43, 2021. https://doi.org/10.33889/IJMEMS.2021.6.1.004

Keywords: CNC machine tools, Availability, Non-homogeneous Markov reward model, Time-varying failure rates, Minimal repair, Subsystem importance


Introduction
Facing fierce market competition, manufacturers increasingly value the reliability of CNC machine tools. Availability, a crucial attribute of CNC machine tools, indicates whether the system can be used when it is needed. Combining the definitions of the IEC international vocabulary (IEC 2001) and the IFIP WG10.4 working group (Avizienis et al., 2004), availability can be defined as the ability to be in a state to perform a required function, under given conditions, at a given instant of time, or after enough time has elapsed, assuming that the required external resources are provided. Although there are many studies and applications of system availability (Trivedi and Bobbio, 2017), research on the availability of CNC machine tools is rare. Most work focuses on specific aspects of reliability or maintainability, including functional component models, component-based system models, and fault-based dynamic system models (Ran et al., 2018).

Keller et al. (1982) were the first to study the reliability and maintainability of CNC machine tools systematically, using the Weibull and lognormal distributions to analyze the mean time between failures (MTBF) and mean time to repair (MTTR) based on field fault data collected from 35 CNC machine tools over 3 years. Kim et al. (2005) developed a web-based analysis system to establish a reliability assessment model of machine tools, which could be used to study the failure modes of system components. Sung and Lee (2011) presented failure analysis results and major reliability parameters for improving the reliability of the tool-changing servo motor of a machine tool. Lanza et al. (2009) presented a method that calculates the optimal time for preventive maintenance and spare part provision using a stochastic optimization algorithm based on a load-dependent reliability model. Shen et al. (2017a, b) explored the relationships between CNC machine tools and their subsystems, and presented the maintenance characteristics of the system from the perspective of independent and correlated subsystems, respectively. Zhang et al. (2019) proposed a fault diagnosis strategy based on cascading failure to ensure the safe operation of CNC machine tools. Yang et al. (2011, 2017) established a comprehensive reliability distribution model for CNC machine tools, which combined interval analysis, fuzzy comprehensive evaluation and the analytic hierarchy process to study the reliability evaluation, allocation and maintainability of the system. Based on stress-strength interference theory, Huang et al. (2018) proposed a general method for modeling and analyzing the imprecise reliability of machine tool components, aimed at the uncertainty analysis of heavy-duty CNC machine tools. Moreover, Ran et al. (2016, 2017) proposed a meta-action decomposition method for reliability modeling and quality characteristic correlation analysis of the CNC machine tool assembly process.

Currently, most research on the availability of CNC machine tools concentrates on the inherent availability, which is calculated from the mean time between failures (MTBF) and the mean time to repair (MTTR). Some researchers have also contributed to the study of the instantaneous availability of CNC machine tools. Wei et al. (2011) used a Petri net to model the states of CNC machine tools and the corresponding dynamic process, and then simulated the system availability by the Monte Carlo method. Li et al. (2014) proposed a linear model of the instantaneous availability of CNC machine tools based on functional data analysis. Zhang et al. (2016) took the customer's perspective and considered the correlation between the user's requirements for the availability of CNC machine tools (precision accuracy, failure rate, difficulty of diagnosis, maintenance difficulty, maintenance cost, etc.) and market competition factors to determine the importance of each demand indicator.

This paper therefore proposes a novel and concise approach to evaluating the instantaneous availability of CNC machine tools. The system is treated as a multi-state system (MSS) whose states arise from changes in the states of its subsystems, so the state change process of CNC machine tools can be regarded as a discrete-state continuous-time (DSCT) process. Additionally, to incorporate the time-varying failure rates of the subsystems, the non-homogeneous Markov reward model (NHMRM) and the Howard differential equations are introduced to determine the availability of CNC machine tools under minimal repair. Moreover, a sensitivity analysis is performed to determine the failure criticality importance (FCI) of each subsystem. A numerical example is presented to illustrate the approach.

CNC Machine Tools Availability Evaluation Procedure Based on NHMRM
CNC machine tools are complex electromechanical products. To evaluate availability, the system is divided into subsystems according to function, principle and structural features, as shown in Figure 1. The system state changes from normal to faulty when any subsystem fails and returns from faulty to normal once the corresponding repair is completed. The CNC machine tool is therefore treated as a multi-state system throughout its life cycle. Because the failure rates of the subsystems vary with time (Wang et al., 2001), the failure events of CNC machine tools are described by a non-homogeneous Poisson process. Accordingly, this paper introduces the non-homogeneous Markov reward model to determine the instantaneous availability of CNC machine tools. The specific steps are as follows.

Construction of State Transition Matrix
In this paper, the normal operation of CNC machine tools is defined as an acceptable state, and a system shutdown caused by a subsystem failure is defined as an unacceptable state. The following assumptions are made:
• Failures of the machine tool subsystems are independent; that is, each system shutdown is caused by the fault of a single subsystem.
• Each subsystem is a binary-state system (normal operation or failure), which means that maintenance begins as soon as a system failure is discovered.
• Because a CNC machine tool is a complex system composed of a large number of components, the maintenance effect is defined as "minimal repair".
• The probability of failure and the repair rate of the subsystems are independent.
Thus, as shown in Figure 1, the CNC machine tool is divided into $n$ subsystems. When a subsystem failure shuts the system down, the machine tool is considered to move from the acceptable state to the unacceptable state associated with that subsystem. Conversely, after successful maintenance, the machine tool moves from the unacceptable state back to the acceptable state. The operation process of the CNC machine tool can therefore be divided into $(n + 1)$ states. The system state transition process is shown in Figure 2.

The state transition process depends only on the current state and not on the historical states, so the machine tool can be regarded as a discrete-state continuous-time system. Based on how system performance changes with time, and on practical experience, the failure rates are taken to be time-varying functions, denoted $\lambda_i(t)$, and the repair time of each subsystem follows an exponential distribution with constant repair rate $\mu_i$. The transition rates between states $i$ and $j$ ($i, j = 0, 1, 2, \ldots, n$) can then be expressed by the corresponding failure and repair rates of the system (Lisnianski et al., 2010), where state 0 denotes the acceptable state. Consequently, according to Gertsbakh (2000) and Xie et al. (2004), the state transition process of CNC machine tools can be regarded as a Markov model with time-varying transition intensities $a_{ij}(t)$, namely the non-homogeneous Poisson process. The state transition intensity matrix of the CNC machine tool is therefore

$$a(t) = [a_{ij}(t)], \quad i, j = 0, 1, 2, \ldots, n.$$

Substituting the time-varying failure rates $\lambda_{0i}(t)$ and the repair rates $\mu_{i0}$ into $a(t)$, dropping the subscript 0 for convenience so that they are written $\lambda_i(t)$ and $\mu_i$, and following the mathematical theory of reliability (Cao and Chen, 2006), the state transition intensity matrix is obtained as

$$a(t) = \begin{bmatrix} -\sum_{i=1}^{n} \lambda_i(t) & \lambda_1(t) & \lambda_2(t) & \cdots & \lambda_n(t) \\ \mu_1 & -\mu_1 & 0 & \cdots & 0 \\ \mu_2 & 0 & -\mu_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mu_n & 0 & 0 & \cdots & -\mu_n \end{bmatrix} \tag{1}$$
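As an illustration, the structure of such a transition intensity matrix can be sketched in Python. The Weibull (power-law) form of the failure rate, the function names and the parameter values in the usage note are assumptions made for this sketch and are not taken from the paper. State 0 is the acceptable state and state i is the down state caused by subsystem i.

```python
def weibull_failure_rate(t, alpha, beta):
    """Weibull hazard rate (beta/alpha) * (t/alpha)**(beta - 1).

    Assumed form of the time-varying failure rate; requires t > 0 when beta < 1.
    """
    return (beta / alpha) * (t / alpha) ** (beta - 1.0)


def transition_matrix(t, weibull_params, repair_rates):
    """Build the (n + 1) x (n + 1) transition intensity matrix at time t.

    weibull_params: list of (alpha, beta) pairs, one per subsystem (hypothetical inputs).
    repair_rates:   list of constant repair rates, one per subsystem.
    """
    n = len(weibull_params)
    lam = [weibull_failure_rate(t, a, b) for a, b in weibull_params]
    matrix = [[0.0] * (n + 1) for _ in range(n + 1)]
    matrix[0][0] = -sum(lam)              # total rate of leaving the acceptable state
    for i in range(1, n + 1):
        matrix[0][i] = lam[i - 1]         # failure of subsystem i: state 0 -> state i
        matrix[i][0] = repair_rates[i - 1]    # repair of subsystem i: state i -> state 0
        matrix[i][i] = -repair_rates[i - 1]   # each row of an intensity matrix sums to zero
    return matrix
```

For example, `transition_matrix(100.0, [(1000.0, 1.5), (800.0, 1.0)], [0.5, 0.25])` returns a 3 x 3 matrix for two hypothetical subsystems evaluated at t = 100 hours.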

Key Subsystems Identification and Corresponding Failure Rates Calculation
Owing to the complexity of their structure and function, CNC machine tools exhibit a variety of fault modes and causes. To save experimental and computational cost, the key subsystems of the machine tool are determined according to the frequency of subsystem faults, and the failure rates of these key subsystems are then determined. Subsystem failure modeling comprises three steps: fault data rank correction, parameter estimation and hypothesis testing, as follows.

(1) Subsystem Fault Data Rank Correction
To save test time and cost, the reliability tests of CNC machine tools are time-truncated, so each test produces one truncated observation at its end. Furthermore, when a subsystem fails, a failure time is generated for that subsystem and the running times of all other subsystems are truncated at that instant. Because these multiple truncated data change the rank order of the subsystem failures, a rank increment is introduced to correct the ranks, using the Johnson method.

The new rank $m_i$ of the $i$th failure of any subsystem is calculated as follows:

$$m_i = m_{i-1} + \frac{N + 1 - m_{i-1}}{N + 2 - j}, \quad m_0 = 0, \tag{2}$$

where $N$ is the total number of observations (failures and truncations) and $j$ is the position of the $i$th failure when all observations are arranged in ascending order of time.

(2) Parameter Estimation
In this paper, the two-parameter Weibull distribution is taken as the hypothesized model, and the model parameters of each subsystem are estimated by the least-squares method from the fault data obtained in the reliability test.
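The rank correction of step (1) can be sketched as follows, before the fit itself. This is a minimal sketch of Johnson's method in its usual textbook form, which is assumed here to match the variant used in the paper. The function name, the record layout and the use of Benard's median-rank approximation (rank - 0.3) / (N + 0.4) to turn a corrected rank into an estimate of F(t) are assumptions made for illustration.

```python
def johnson_ranks(records):
    """Corrected ranks for failure times in a sample that contains truncated data.

    records: list of (time, is_failure) pairs; is_failure is False for truncated data.
    Returns a list of (time, corrected_rank, median_rank) for the failures only.
    """
    n_total = len(records)
    # Ascending order of time; on ties, failures are placed before truncations
    # (a common convention, assumed here).
    ordered = sorted(records, key=lambda rec: (rec[0], not rec[1]))
    result = []
    previous_rank = 0.0
    for position, (time, is_failure) in enumerate(ordered, start=1):
        if not is_failure:
            continue  # truncated observations only influence later increments
        increment = (n_total + 1 - previous_rank) / (n_total + 2 - position)
        rank = previous_rank + increment
        median_rank = (rank - 0.3) / (n_total + 0.4)  # Benard's approximation
        result.append((time, rank, median_rank))
        previous_rank = rank
    return result
```

With no truncated data the corrected ranks reduce to 1, 2, ..., N, which is a quick way to check an implementation.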
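The two-parameter Weibull distribution function is

$$F(t) = 1 - \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right], \quad t \ge 0, \tag{3}$$

where $\alpha$ and $\beta$ are the scale parameter and shape parameter, respectively. The data are fitted by the straight line

$$\hat{y} = A + Bx, \tag{4}$$

where $A$ is the intercept of the straight line and $B$ is the slope. The estimated values of $A$ and $B$ based on the least-squares method are as follows:

$$\hat{B} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \quad \hat{A} = \bar{y} - \hat{B}\bar{x}. \tag{5}$$

Rearranging formula (3) and taking logarithms of both sides twice gives

$$\ln \ln \frac{1}{1 - F(t)} = \beta \ln t - \beta \ln \alpha. \tag{6}$$

Setting $y = \ln \ln [1 / (1 - F(t))]$ and $x = \ln t$, and combining (4), (5) and (6), the parameter estimates of the Weibull distribution model are

$$\hat{\beta} = \hat{B}, \quad \hat{\alpha} = \exp\left(-\frac{\hat{A}}{\hat{B}}\right). \tag{7}$$

(3) Hypothesis Test
In this paper, the Kolmogorov-Smirnov test is used to check the goodness of fit of the distribution function fitted to the times between failures of each subsystem.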
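Steps (2) and (3) can be sketched together in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the inputs are failure times paired with estimates of F(t) (for example from corrected ranks), and the statistic is the usual two-sided Kolmogorov-Smirnov distance, which compares the hypothesized distribution with the empirical step function just before and just after each data point.

```python
import math


def fit_weibull_least_squares(times, cdf_estimates):
    """Least-squares fit of y = A + B x with x = ln t and y = ln ln(1 / (1 - F)).

    Returns (alpha, beta), the Weibull scale and shape estimates.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(math.log(1.0 / (1.0 - f))) for f in cdf_estimates]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    beta = slope                           # shape parameter equals the slope B
    alpha = math.exp(-intercept / slope)   # from A = -beta * ln(alpha)
    return alpha, beta


def weibull_cdf(t, alpha, beta):
    """Two-parameter Weibull distribution function."""
    return 1.0 - math.exp(-((t / alpha) ** beta))


def ks_statistic(times, alpha, beta):
    """Largest distance between the fitted Weibull CDF and the empirical CDF."""
    ordered = sorted(times)
    size = len(ordered)
    distance = 0.0
    for i, t in enumerate(ordered, start=1):
        f0 = weibull_cdf(t, alpha, beta)
        # The empirical CDF jumps from (i - 1)/size to i/size at the i-th point.
        distance = max(distance, abs(f0 - i / size), abs(f0 - (i - 1) / size))
    return distance
```

The hypothesized distribution is retained when the returned distance does not exceed the tabulated critical value for the chosen significance level and sample size.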
The Kolmogorov-Smirnov test arranges the test data in ascending order. Under the hypothesized distribution, the value $F_0(x_i)$ is calculated for each data point and compared with the empirical distribution function $F_N(x_i)$. The maximum absolute difference is the observed value of the test statistic $D_N$, which is then compared with the critical value $D_{N,\alpha}$. If the following condition is satisfied, the null hypothesis is accepted; otherwise it is rejected:

$$D_N = \max_{1 \le i \le N} \left| F_0(x_i) - F_N(x_i) \right| \le D_{N,\alpha}, \tag{8}$$

where $D_{N,\alpha}$ is the critical value (the subscript $\alpha$ here denotes the significance level, not the Weibull scale parameter), $F_0(x)$ is the hypothesized distribution function and $F_N(x)$ is the empirical distribution function of a sample of size $N$, that is,

$$F_N(x) = \begin{cases} 0, & x < x_1 \\ i/N, & x_i \le x < x_{i+1} \\ 1, & x \ge x_N \end{cases} \tag{9}$$

Howard (1960) proposed the Markov reward model, which has been widely used in theoretical and practical studies (Reibman et al., 1989; Trivedi and Bobbio, 2017). The core idea is as follows. Consider a continuous-time Markov chain with $K$ different states and transition intensity matrix $a = [a_{ij}]$, $i, j = 1, 2, \ldots, K$. It is assumed that if the process stays in any state $i$ during a unit of time, a certain reward $r_{ii}$ is gained. Similarly, each time the process transits from state $i$ to state $j$, an additional reward $r_{ij}$ is gained. A reward may also be negative when it characterizes a loss or penalty. For such processes, a reward matrix $r = [r_{ij}]$, $i, j = 1, 2, \ldots, K$, must therefore be determined in addition to the transition intensity matrix. Many important reliability indexes can then be determined from these two matrices during the operation of the system. Because the state transition process of the machine tool is a non-homogeneous Poisson process, the Markov reward model associated with its state transitions is called the non-homogeneous Markov reward model.

Build Reward Matrix and Calculate the Availability of CNC Machine Tools
Let $V_i(t)$ be the expected total reward accumulated up to time $t$, given that the process is in state $i$ at time instant $t = 0$. To find the total expected rewards, the Howard differential equations with time-varying transition intensities must be solved under specified initial conditions:

$$\frac{dV_i(t)}{dt} = r_{ii} + \sum_{j=1,\, j \ne i}^{K} a_{ij}(t)\, r_{ij} + \sum_{j=1}^{K} a_{ij}(t)\, V_j(t), \quad i = 1, 2, \ldots, K. \tag{10}$$

In the most common case, the MSS begins to accumulate reward after time instant $t = 0$, so the initial conditions are

$$V_i(0) = 0, \quad i = 1, 2, \ldots, K. \tag{11}$$

For instance, if the state $K$ with the highest performance level is defined as the initial state, the value $V_K(t)$ should be found as a solution of system (10).
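A system of this kind is linear in the unknowns and can be integrated numerically with any standard ODE scheme. The sketch below uses a fixed-step classical Runge-Kutta method in plain Python; the function names, the step count and the two-state check are assumptions made for illustration (the paper itself reports using MATLAB). States are indexed from 0 in the code.

```python
import math


def howard_rhs(t, rewards_so_far, intensity_at, reward):
    """Right-hand side of the Howard equations at time t."""
    a = intensity_at(t)
    size = len(rewards_so_far)
    derivative = []
    for i in range(size):
        # Reward rate in state i plus rewards collected on transitions out of i.
        gain = reward[i][i] + sum(a[i][j] * reward[i][j] for j in range(size) if j != i)
        derivative.append(gain + sum(a[i][j] * rewards_so_far[j] for j in range(size)))
    return derivative


def solve_howard(intensity_at, reward, t_end, steps):
    """Integrate from V_i(0) = 0 to t_end with fixed-step RK4; returns all V_i(t_end)."""
    h = t_end / steps
    v = [0.0] * len(reward)
    t = 0.0
    for _ in range(steps):
        k1 = howard_rhs(t, v, intensity_at, reward)
        k2 = howard_rhs(t + h / 2, [x + h / 2 * k for x, k in zip(v, k1)], intensity_at, reward)
        k3 = howard_rhs(t + h / 2, [x + h / 2 * k for x, k in zip(v, k2)], intensity_at, reward)
        k4 = howard_rhs(t + h, [x + h * k for x, k in zip(v, k3)], intensity_at, reward)
        v = [x + h / 6 * (p + 2 * q + 2 * s + w)
             for x, p, q, s, w in zip(v, k1, k2, k3, k4)]
        t += h
    return v


def two_state_reference(lam, mu, t):
    """Closed-form mean up time of a two-state system with constant rates (check only)."""
    total = lam + mu
    return mu * t / total + lam / total ** 2 * (1.0 - math.exp(-total * t))
```

For constant rates, `solve_howard(lambda t: [[-lam, lam], [mu, -mu]], [[1.0, 0.0], [0.0, 0.0]], t_end, steps)[0]` can be compared with `two_state_reference(lam, mu, t_end)` as a sanity check before moving to time-varying intensities.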
Therefore, to determine the availability of CNC machine tools, the reward matrix $r$ must be built. The rewards in $r$ are determined as follows:
(1) The reward associated with each acceptable state is defined as 1.
(2) The rewards associated with all unacceptable states are set to zero, as are the rewards associated with all transitions.

Thus, the reward matrix of the CNC machine tool depicted in Figure 2 has $r_{00} = 1$ and all other entries equal to zero:

$$r = [r_{ij}] = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}, \quad i, j = 0, 1, \ldots, n.$$

In this paper, the cumulative reward $V_0(t)$ of the CNC machine tool in the interval $[0, t]$ is the mean time that the system spends in the set of acceptable states when state 0 is the initial state. Once $V_0(t)$ has been obtained, the instantaneous availability $A_0(t)$ of the CNC machine tool is determined from it according to formula (12).
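As a reading aid, and not as a restatement of formula (12), the standard relations between this cumulative reward and the two usual notions of availability can be written out. The symbol $P_0(s)$, the probability that the system is in state 0 at time $s$ given that it starts in state 0, is introduced here for illustration only.

```latex
% Assumption: r_{00} = 1 and every other reward is zero, as in rules (1) and (2).
\begin{align*}
V_0(t) &= \int_0^t P_0(s)\,ds
  && \text{(mean time spent in the acceptable state during } [0, t]\text{)} \\
\frac{dV_0(t)}{dt} &= P_0(t)
  && \text{(point availability at time } t\text{)} \\
\frac{V_0(t)}{t} &= \frac{1}{t}\int_0^t P_0(s)\,ds
  && \text{(mean availability over } [0, t]\text{)}
\end{align*}
```

The first line holds because the expected accumulated reward is the integral of the expected reward rate; the other two follow from it directly.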

Reward Matrix Creation for the Calculation of FCI of Subsystems
After the system availability has been determined, it is important to compare the contribution of each unacceptable state to the system downtime, which helps optimize production cost. The failure criticality importance (FCI) (Wang et al., 2004; Toledano et al., 2016) is therefore introduced to calculate the share of each failure state in the system downtime. The FCI is obtained as follows:

$$I_{FCI}(j; t) = \frac{\text{number of system failures caused by failure state } j \text{ in } [0, t]}{\text{number of system failures in } [0, t]}, \tag{13}$$

where $I_{FCI}(j; t)$ is the failure criticality importance of the $j$th failure state in $[0, t]$. The key is therefore to evaluate the number of system failures and the number of entrances of the system into each down state, both of which can be obtained through the Markov reward approach. In this case, the mean accumulated reward $V_i(t)$ obtained by solving (10) provides the mean number of entrances into the down state during $[0, t]$, with the reward matrix formulated under the following rules:
(1) Each reward associated with a transition from the acceptable state to the unacceptable state under consideration is defined as 1;
(2) All other rewards in the matrix are defined as 0.

The mean number of system transitions from the normal state to the unacceptable state caused by subsystem $j$ can then be obtained through the Howard differential equations and is denoted $N_{0j}(t)$, $j = 1, 2, \ldots, n$. Correspondingly, the criticality importance of the $j$th subsystem is defined as follows:

$$I_{FCI}(j; t) = \frac{N_{0j}(t)}{\sum_{k=1}^{n} N_{0k}(t)}, \quad j = 1, 2, \ldots, n. \tag{14}$$
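The counting procedure can be sketched end to end in plain Python. This is a hedged illustration, not the authors' MATLAB code: the Weibull form of the failure rate, the function names, the fixed-step Runge-Kutta integrator and all parameter values are assumptions. For each subsystem the reward on the transition from state 0 to that subsystem's down state is set to 1, the resulting Howard equations are integrated, and the shares are formed from the resulting mean failure counts. Subsystems are indexed from 0 in the code.

```python
def failure_rate(t, alpha, beta):
    """Assumed Weibull hazard rate; beta >= 1 keeps it finite at t = 0."""
    return (beta / alpha) * (t / alpha) ** (beta - 1.0)


def rk4(rhs, y, t_end, steps):
    """Classical fixed-step Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * s + w)
             for a, p, q, s, w in zip(y, k1, k2, k3, k4)]
        t += h
    return y


def mean_failures(j, weibull_params, repair_rates, t_end, steps=2000):
    """Mean number of system failures caused by subsystem j in [0, t_end]."""
    n = len(weibull_params)

    def rhs(t, v):
        lam = [failure_rate(t, a, b) for a, b in weibull_params]
        # State 0: unit reward on each 0 -> j transition, so the gain term is lam[j].
        dv0 = lam[j] - sum(lam) * v[0] + sum(l * v[k + 1] for k, l in enumerate(lam))
        # Down states: no rewards, only the repair transition back to state 0.
        return [dv0] + [repair_rates[k] * (v[0] - v[k + 1]) for k in range(n)]

    return rk4(rhs, [0.0] * (n + 1), t_end, steps)[0]


def failure_criticality_importance(weibull_params, repair_rates, t_end, steps=2000):
    """Share of system failures attributed to each subsystem in [0, t_end]."""
    counts = [mean_failures(j, weibull_params, repair_rates, t_end, steps)
              for j in range(len(weibull_params))]
    total = sum(counts)
    return [c / total for c in counts]
```

A call such as `failure_criticality_importance([(900.0, 1.3), (700.0, 1.1), (1500.0, 1.8)], [0.4, 0.6, 0.2], 500.0)` uses made-up parameters for three subsystems. With constant failure rates (shape parameter 1) the shares reduce to the ratio of the rates, which gives a simple correctness check.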

Numerical Example
The data used in this paper were obtained from a field test of 67 CNC machine tools, with the time of the last fault occurrence recorded as the truncation time. The times between failures were recorded, and the faults of this type of CNC machine tool were found to be concentrated mainly in the feed system, the tool magazine and the spindle system. The mean time to repair of each subsystem was also collected in the field. The three key subsystems and the corresponding data are shown in Table 1. The failure rate functions of the key subsystems were established using the Weibull distribution, the linear correlation coefficient was calculated, and the K-S test was carried out. The results are shown in Table 2 and Figure 3, respectively.

According to the identified key subsystems, the state transition process of the CNC machine tool can be determined as shown in Figure 4. The cumulative reward differential equations associated with the availability are:

$$\begin{cases} \dfrac{dV_0(t)}{dt} = 1 - [\lambda_1(t) + \lambda_2(t) + \lambda_3(t)]\, V_0(t) + \lambda_1(t) V_1(t) + \lambda_2(t) V_2(t) + \lambda_3(t) V_3(t) \\[2mm] \dfrac{dV_i(t)}{dt} = \mu_i V_0(t) - \mu_i V_i(t), \quad i = 1, 2, 3 \end{cases}$$

where $\lambda_i(t)$ and $\mu_i$ are the failure rate and repair rate of key subsystem $i$. According to formula (12), the availability of this type of CNC machine tool can be obtained by solving these equations in MATLAB®, and the resulting curve is shown in Figure 5.
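Taking the feed system as an example, the reward matrix has $r_{01} = 1$ and all other entries equal to zero, and the Howard differential equations are as follows:

$$\begin{cases} \dfrac{dV_0(t)}{dt} = \lambda_1(t) - [\lambda_1(t) + \lambda_2(t) + \lambda_3(t)]\, V_0(t) + \lambda_1(t) V_1(t) + \lambda_2(t) V_2(t) + \lambda_3(t) V_3(t) \\[2mm] \dfrac{dV_i(t)}{dt} = \mu_i V_0(t) - \mu_i V_i(t), \quad i = 1, 2, 3 \end{cases}$$

Solving this system in MATLAB® under zero initial conditions gives the mean number of failures $N_{01}(t) = V_0(t)$ of the feed system.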
Therefore, according to formula (14), the failure criticality importance of the feed system, the tool magazine and the spindle system is obtained as shown in Figure 6. The importance of the tool magazine decreases over time, the importance of the spindle system is mostly increasing, and the importance of the feed system increases over time. During the first 300 hours of use, the tool magazine has greater importance than the feed system; after 300 hours, the importance of the feed system becomes greater.

Conclusions
(1) The CNC machine tool over its life cycle was treated as a discrete-state continuous-time process from the perspective of the state transitions of its subsystems. By incorporating the time-varying failure rates of the subsystems, a non-homogeneous Markov reward model was developed to calculate the instantaneous availability of aging multi-state CNC machine tools. Furthermore, the failure criticality importance (FCI) of the subsystems was evaluated by the proposed approach, which identifies the contribution of each subsystem to system downtime.
(2) The method proposed in this paper is a concise and rational solution for the non-homogeneous Poisson process in the field of CNC machine tool reliability research. More importantly, the suggested approach is well formalized and suitable for practical application in reliability engineering. In principle, it can be applied to other repairable systems that satisfy the same modeling assumptions. A numerical example was presented to illustrate the suggested approach.