Joint optimization strategy of condition-based preventive replacement and spare parts ordering for multi-unit systems

To solve the joint optimization problem of condition-based maintenance and spare parts ordering for multi-unit systems, an exact formulation based on the Markov decision process is proposed. The condition of components is described by a continuous process, i.e., the Wiener process. The system is inspected periodically and the remaining useful life is updated based on components’ condition. Through the value iteration algorithm, the optimal policy is obtained by minimizing the average maintenance and ordering cost. A numerical investigation with a two-unit system is conducted to validate the effectiveness of our policy by comparison with the threshold strategy.


Introduction
In the manufacturing field, maintenance-related expenses can account for 15-70% of total production expenses [1]. The optimization of maintenance policies is critical to the reduction of total costs. Prevalent maintenance policies generally include two types: the time-based maintenance (TBM) and the condition-based maintenance (CBM). TBM determines the replacing and ordering of components based on the lifetime or usage time. Compared with TBM, CBM can predict the remaining useful life of components through key deterioration parameters (voltage, speed, etc.) and schedule maintenance actions more precisely. With the development of monitoring technology, CBM has gradually become the mainstream of maintenance policies. Regarding the maintenance studies, Li et al. [2] developed an age-based replacement policy of a single-unit system considering production wait time. Yi et al. [3] studied the transmission loss and maintenance of a bus performance sharing system. Duan et al. [4] studied a two-level condition-based maintenance (CBM) policy for a ship pump considering stochastic maintenance quality and formulated the problem in a semi-Markov decision process.
The effectiveness of maintenance policies is closely tied to the availability of spare parts. In recent years, the joint optimization of the two aspects has received extensive attention. Armstrong and Atkins [5] studied the joint optimization problem of a single-unit system and proposed a degradation model based on usage time to derive optimal thresholds for replacing and ordering units. Wang et al. [6] considered a similar problem but assumed that the deterioration of components follows the Wiener process. The joint optimization of multi-unit systems has also been studied by many scholars. Wang et al. [7] studied the joint optimization problem of a multi-unit system through a fixed-threshold replacement strategy and an $(s, S)$ ordering policy, where the parameters were optimized through a genetic algorithm and Monte Carlo simulation. Zhang and Zeng [8] proposed a state-space partitioning method with a genetic algorithm to optimize the preventive threshold and safety stock level simultaneously under CBM for multi-unit systems.
The majority of studies on joint optimization problems adopt a threshold control strategy. Under the threshold strategy, actions are performed only when indicative values rise above or fall below predefined thresholds. For example, a component is replaced only when its degradation status reaches a fixed threshold. Nevertheless, the threshold control strategy is not optimal in many cases. When it is cost-effective to replace multiple components together, the total maintenance cost can be reduced by replacing components that are close to the replacement threshold at the same time as those that exceed it [9]. Several studies adopt the Markov decision process, where thresholds are established based on components' real-time conditions [10,11]. However, these studies assumed that components followed discrete deterioration processes, and did not consider the failure of components between two inspection points.
In this paper, we consider a system composed of multiple components that follow the Wiener process and can fail between two inspection points. The joint problem of replacement and ordering is formulated through a Markov decision process. Based on the model, the value iteration algorithm is applied to optimize the joint decisions. A two-unit system is presented to analyze the effectiveness of our policy. Finally, the impact of different parameters on the strategy is studied through sensitivity analysis.
The remainder of our study is organized as follows. In Section 2, we introduce our problem. In Section 3, we formulate our problem as a Markov decision process (MDP) and introduce the value iteration algorithm. An investigation of a two-unit system and a sensitivity analysis are conducted in Section 4. Finally, we conclude our work in Section 5.

Problem description
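We consider a manufacturing system composed of $N$ identical components. The components deteriorate independently of each other, and the degradation of each component follows a Wiener process starting at 0:

$$X_i(t) = \mu t + \sigma B(t), \quad \mu > 0,\ \sigma > 0,\ t \ge 0, \qquad (1)$$

where $X_i(t)$ is the degradation value of component $i$ at time $t$, $\mu$ and $\sigma$ represent the drift parameter and the diffusion parameter respectively, and $B(t)$ represents the standard Brownian motion. A component deteriorates until its degradation value reaches the failure threshold $H$, at which point a failure occurs. The remaining useful life (RUL) of component $i$ is the time required to reach the threshold for the first time. Given the current degradation value $x_i$, the cumulative distribution function (CDF) and the probability density function (PDF) of the RUL based on the Wiener process can be expressed as follows according to [12]:

$$F(t; x_i) = \Phi(u) + \exp\!\left(\frac{2\mu (H - x_i)}{\sigma^2}\right) \Phi(v), \qquad (2)$$

$$f(t; x_i) = \frac{H - x_i}{\sqrt{2\pi\sigma^2 t^3}} \exp\!\left(-\frac{(H - x_i - \mu t)^2}{2\sigma^2 t}\right), \qquad (3)$$

where $u = \dfrac{\mu t - (H - x_i)}{\sigma\sqrt{t}}$ and $v = -\dfrac{\mu t + (H - x_i)}{\sigma\sqrt{t}}$. $\Phi$ is the CDF of a standard normally distributed variable.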
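To evaluate the RUL distribution numerically, the following is a minimal Python sketch of the standard first-passage-time (inverse Gaussian) CDF and PDF of a Wiener process with positive drift. The function names and the arguments `x` (current degradation value), `mu`, `sigma` and `H` (failure threshold) are illustrative choices and are not notation confirmed by the source.

```python
import math


def norm_cdf(z: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def rul_cdf(t: float, x: float, mu: float, sigma: float, H: float) -> float:
    """P(RUL <= t): probability that a Wiener process now at x first hits H by time t."""
    if t <= 0.0:
        return 0.0
    d = H - x  # remaining distance to the failure threshold
    if d <= 0.0:
        return 1.0
    s = sigma * math.sqrt(t)
    return (norm_cdf((mu * t - d) / s)
            + math.exp(2.0 * mu * d / sigma**2) * norm_cdf(-(mu * t + d) / s))


def rul_pdf(t: float, x: float, mu: float, sigma: float, H: float) -> float:
    """Density of the RUL (inverse Gaussian first-passage density)."""
    if t <= 0.0:
        return 0.0
    d = H - x
    return (d / math.sqrt(2.0 * math.pi * sigma**2 * t**3)
            * math.exp(-(d - mu * t) ** 2 / (2.0 * sigma**2 * t)))
```

A quick sanity check is that integrating `rul_pdf` over [0, t] numerically reproduces `rul_cdf(t, ...)`, and that the mean RUL is close to (H - x) / mu.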
A failed component generates a penalty cost $c_d$ per unit time. The system is inspected periodically, and the condition information of all components is collected at discrete time points $k\tau$, $k = 1, 2, \ldots$, where $\tau$ is the inspection interval. Specifically, we assume the degradation status of a component is hidden and can only be known through inspections. Decisions on replacing components and ordering spare parts are based on the condition information. The replacement of components is divided into two types: preventive and corrective. When a component's degradation value is less than the failure threshold, the replacement is referred to as a preventive replacement. Otherwise, it is a corrective replacement. The cost of replacement includes two parts: the setup cost $c_s$ and a per-component cost of $c_p$ for a preventive replacement and $c_c$ for a corrective replacement. The replacement time is assumed to be negligible. Spare parts share an inventory pool and are ordered from a supplier with a fixed lead time $L$. The fixed cost of each order is $c_o$ and the price of a spare part is $c_u$. If spare parts on hand are not used immediately, the system incurs a holding cost $c_h$ for each spare part per unit time. In this paper, we aim to minimize the average cost of replacing components and ordering spare parts, including the penalty cost, replacement cost, spare parts purchasing cost and holding cost.

Model establishment
This section establishes a joint decision-making model for replacements and spare parts ordering through the Markov decision process (MDP). The value iteration algorithm is then used to solve for the optimal strategy.

Markov decision process
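We introduce the MDP model through five basic elements: decision point, state, action, transition probability and expected cost. The decision points are the discrete time points $k\tau$, $k = 1, 2, \ldots$, at which decisions are executed, where $\tau$ is the inspection interval. The set of possible states of the system at decision points constitutes the state space $S$. Given the system state $s \in S$, the set of all feasible actions is $A(s)$. If action $a \in A(s)$ is taken, the probability of the system transiting to state $s'$ at the next decision point is $P(s' \mid s, a)$, resulting in an expected total cost $c(s, a)$ between the current and the next decision point.

The system state $s = (x_1, x_2, \ldots, x_N, q_1, q_2, \ldots, q_{L-1}, I)$ is composed of the degradation values $x_1, x_2, \ldots, x_N$ of the $N$ components and the spare parts' status $q_1, q_2, \ldots, q_{L-1}, I$. Here $x_i \in [0, H]$ represents the degradation value of component $i$, where $H$ is the failure threshold; $q_j$ ($j = 1, 2, \ldots, L-1$) indicates the number of spare parts ordered $j$ periods before, where $L$ is the lead time; and $I$ is the quantity of on-hand spare parts. We assume the inventory capacity is $C$, and the state should satisfy $\sum_{j=1}^{L-1} q_j + I \le C$.

The action at a decision point is expressed as $a = (r_1, r_2, \ldots, r_N, q_0)$, where $r_i$ indicates whether component $i$ will be replaced ($r_i = 1$) or not ($r_i = 0$) and $q_0$ is the number of ordered spare parts. We assume components transit from $(x_1, x_2, \ldots, x_N)$ to $(x_{1|a}, x_{2|a}, \ldots, x_{N|a})$ after replacement, where $x_{i|a}$ is

$$x_{i|a} = \begin{cases} 0, & r_i = 1, \\ x_i, & r_i = 0. \end{cases} \qquad (4)$$

Since the number of replacements cannot be more than the number of on-hand spare parts, there is $\sum_{i=1}^{N} r_i \le I$.

Given the current state $s$, we assume the system will transit to the state $s' = (x_1', x_2', \ldots, x_N', q_1', q_2', \ldots, q_{L-1}', I')$ at the next decision point with probability $P(s' \mid s, a)$ after performing action $a$. During the transition, the inventory status is determined by

$$(q_1', q_2', \ldots, q_{L-1}', I') = \Big(q_0, q_1, \ldots, q_{L-2},\ I - \sum_{i=1}^{N} r_i + q_{L-1}\Big).$$

The transition of components' degradation states is uncertain. Component $i$ may be in one of two states, failed or operating, at the next decision point. The probability that component $i$ has failed by the next decision point (and is therefore observed with degradation value $H$) is $F(\tau; x_{i|a})$, based on Equation (2). Based on Equation (5), the transition probability for a component is denoted by $P(x_i' \mid x_{i|a})$; it equals $F(\tau; x_{i|a})$ for $x_i' = H$ and, for $x_i' < H$, the probability that the component is still operating with degradation value $x_i'$.

Since components deteriorate independently, the probability that the system transits to $s'$ at the next decision point is

$$P(s' \mid s, a) = \mathbb{1}_{\{\cdot\}} \prod_{i=1}^{N} P(x_i' \mid x_{i|a}),$$

where the indicator function $\mathbb{1}_{\{\cdot\}}$ takes 1 if the inventory status of $s'$ satisfies the inventory transition above, otherwise it takes 0.

The expected cost $c(s, a)$, which is incurred after action $a = (r_1, r_2, \ldots, r_N, q_0)$ is taken in state $s$, is composed of four parts: replacement cost, spare parts ordering cost, holding cost and penalty cost.

1) Replacement cost: We use $n_f$ to represent the number of failed components that are replaced and $n_r$ to represent the number of replaced components, i.e., $n_r = \sum_{i=1}^{N} r_i$. Therefore, the replacement includes $n_f$ corrective replacements and $n_r - n_f$ preventive replacements. The total replacement cost is $c_s \mathbb{1}_{\{n_r > 0\}} + c_c n_f + c_p (n_r - n_f)$, where $c_s$ is the setup cost and $c_c$ and $c_p$ are the costs of a corrective and a preventive replacement.

2) Spare parts ordering cost: The order cost can be expressed as $c_o \mathbb{1}_{\{q_0 > 0\}} + c_u q_0$, where $c_o$ is the fixed cost of an order and $c_u$ is the price of a spare part.

3) Holding cost: After components are replaced, the inventory of the spare parts pool is updated to $I - n_r$ and the holding cost between two decision points is $c_h \tau (I - n_r)$, where $c_h$ is the holding cost of a spare part per unit time.

4) Penalty cost: If the PDF of component $i$'s RUL is $f(t; x_{i|a})$, the expected penalty cost of component $i$ before the next decision point is $c_d \int_0^{\tau} (\tau - t) f(t; x_{i|a})\,\mathrm{d}t$, where $c_d$ is the penalty cost per unit time. The expected total penalty cost of the $N$ components is $\sum_{i=1}^{N} c_d \int_0^{\tau} (\tau - t) f(t; x_{i|a})\,\mathrm{d}t$.

The value function $V_n(s)$ represents the cumulative cost of the system given the initial state $s$ after $n$ iterations. According to the Bellman equation [13], the relationship of cumulative costs between two adjacent decision points is

$$V_n(s) = \min_{a \in A(s)} \Big\{ c(s, a) + \sum_{s' \in S} P(s' \mid s, a)\, V_{n-1}(s') \Big\},$$

where $\sum_{s' \in S} P(s' \mid s, a)\, V_{n-1}(s')$ represents the expectation of the cumulative cost at the next state after performing action $a$.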
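The four cost components listed above can be combined into a single expected one-step cost. The Python sketch below is one possible reading of that description, not the authors' implementation: the parameter names (`c_s`, `c_p`, `c_c`, `c_o`, `c_u`, `c_h`, `c_d`, `tau`, `mu`, `sigma`, `H`) are hypothetical labels, a failed component is taken to be one whose degradation value has reached `H`, and the expected downtime within an inspection interval is computed as the integral of the RUL CDF over the interval, which equals the integral of (tau - t) times the RUL density by integration by parts.

```python
import math


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def rul_cdf(t, x, mu, sigma, H):
    """P(RUL <= t) for a Wiener process currently at x (first passage to H)."""
    d = H - x
    if d <= 0.0:
        return 1.0  # already failed
    if t <= 0.0:
        return 0.0
    s = sigma * math.sqrt(t)
    return (norm_cdf((mu * t - d) / s)
            + math.exp(2.0 * mu * d / sigma**2) * norm_cdf(-(mu * t + d) / s))


def expected_downtime(x, tau, mu, sigma, H, steps=2000):
    """E[max(tau - RUL, 0)], i.e. the integral of the RUL CDF over [0, tau] (midpoint rule)."""
    h = tau / steps
    return h * sum(rul_cdf((k + 0.5) * h, x, mu, sigma, H) for k in range(steps))


def stage_cost(x, on_hand, replace, order_qty, p):
    """Expected cost until the next decision point (assumed cost structure).

    x: degradation values, on_hand: spare parts in stock,
    replace: 0/1 flag per component, order_qty: spare parts ordered now,
    p: dict of hypothetical parameters.
    """
    n_r = sum(replace)
    if n_r > on_hand:
        raise ValueError("cannot replace more components than spare parts on hand")
    # Corrective replacements: replaced components that have already failed.
    n_f = sum(1 for xi, ri in zip(x, replace) if ri and xi >= p["H"])
    replacement = (p["c_s"] if n_r > 0 else 0.0) + p["c_c"] * n_f + p["c_p"] * (n_r - n_f)
    ordering = (p["c_o"] if order_qty > 0 else 0.0) + p["c_u"] * order_qty
    holding = p["c_h"] * p["tau"] * (on_hand - n_r)
    x_after = [0.0 if ri else xi for xi, ri in zip(x, replace)]  # replaced units restart at 0
    penalty = p["c_d"] * sum(
        expected_downtime(xi, p["tau"], p["mu"], p["sigma"], p["H"]) for xi in x_after
    )
    return replacement + ordering + holding + penalty
```

With this form, a failed component that is left in place contributes a full interval of penalty cost, while a new or lightly degraded component contributes almost none.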
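The component-level transition probabilities that enter the Bellman equation have to be evaluated on a grid once the degradation level is discretized. The Python sketch below shows one way this could be done; it is an assumption-laden illustration, not the discretization of [14] and [15]. It uses the standard distribution of a drifted Wiener process that has not yet been absorbed at the threshold, lumps the probability mass into cells centred on the grid points, and treats the failed state as absorbing until replacement. The names `transition_row`, `survival_cdf`, `K` (number of grid steps up to the threshold) and `delta` are hypothetical.

```python
import math


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def survival_cdf(y, x, tau, mu, sigma, H):
    """P(X(tau) <= y and threshold H not reached before tau | X(0) = x), for y <= H."""
    s = sigma * math.sqrt(tau)
    d = H - x
    return (norm_cdf((y - x - mu * tau) / s)
            - math.exp(2.0 * mu * d / sigma**2) * norm_cdf((y - x - 2.0 * d - mu * tau) / s))


def transition_row(k, K, delta, tau, mu, sigma):
    """Transition probabilities from grid level k*delta to levels 0, delta, ..., K*delta.

    Index K stands for the failed state (degradation value H = K*delta).
    """
    if k >= K:
        return [0.0] * K + [1.0]  # a failed component stays failed until it is replaced
    H = K * delta
    x = k * delta
    # Upper edges of the operating cells: cell j collects mass around j*delta,
    # cell 0 also collects everything below delta/2, and the last cell ends at H.
    edges = [(j + 0.5) * delta for j in range(K - 1)] + [H]
    row, prev = [], 0.0
    for edge in edges:
        cur = survival_cdf(edge, x, tau, mu, sigma, H)
        row.append(cur - prev)
        prev = cur
    row.append(1.0 - prev)  # probability of failing before the next decision point
    return row
```

By construction each row sums to one, and its last entry coincides with the RUL CDF evaluated at the inspection interval.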

Value iteration algorithm
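In this paper, we aim to minimize the long-term average cost, and the value iteration algorithm is applied to solve for the optimal strategy. Since the algorithm is only suited to problems with a discrete state space, the discretization method adopted in [14] and [15] is used. Given a discrete interval $\Delta$, the component degradation value is discretized as $0, \Delta, 2\Delta, \ldots, H$, where $H$ is the failure threshold. The steps of the algorithm are as follows:

1) Initialize the termination factor of iteration $\varepsilon > 0$, $n = 1$, and $V_0(s) = 0$, $\forall s \in S$;

2) Calculate $V_n(s)$ and the optimal action $a_n(s)$ as:

$$V_n(s) = \min_{a \in A(s)} \Big\{ c(s, a) + \sum_{s' \in S} P(s' \mid s, a)\, V_{n-1}(s') \Big\}, \quad a_n(s) = \arg\min_{a \in A(s)} \Big\{ c(s, a) + \sum_{s' \in S} P(s' \mid s, a)\, V_{n-1}(s') \Big\};$$

3) Calculate $M_n$ and $m_n$ as:

$$M_n = \max_{s \in S} \{ V_n(s) - V_{n-1}(s) \}; \quad m_n = \min_{s \in S} \{ V_n(s) - V_{n-1}(s) \}. \qquad (10)$$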
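The steps above can be sketched in Python for a generic finite MDP. This is a minimal illustration under stated assumptions rather than the authors' code: the callables `actions`, `cost` and `trans` are hypothetical interfaces, and the stopping rule (stop when the spread between the largest and smallest value increment is at most `eps` times the smallest increment) is a commonly used criterion for average-cost value iteration that is assumed here because the source's termination step is not shown. It requires the smallest increment to become positive, which holds when costs are positive in the long run.

```python
def value_iteration(states, actions, cost, trans, eps=5e-4, max_iter=10_000):
    """Average-cost value iteration on a finite MDP.

    states:  list of hashable states
    actions: function s -> list of feasible actions
    cost:    function (s, a) -> expected one-step cost
    trans:   function (s, a) -> list of (next_state, probability)
    Returns (estimated average cost, policy dict, number of iterations).
    """
    V = {s: 0.0 for s in states}
    for n in range(1, max_iter + 1):
        V_new, policy = {}, {}
        for s in states:
            best_a, best_v = None, float("inf")
            for a in actions(s):
                v = cost(s, a) + sum(prob * V[s2] for s2, prob in trans(s, a))
                if v < best_v:
                    best_a, best_v = a, v
            V_new[s], policy[s] = best_v, best_a
        diffs = [V_new[s] - V[s] for s in states]
        M, m = max(diffs), min(diffs)  # bounds on the average cost per period
        V = V_new
        if M - m <= eps * m:  # assumed relative stopping rule
            return 0.5 * (M + m), policy, n
    raise RuntimeError("value iteration did not converge within max_iter")
```

For the joint replacement and ordering problem, a state would be a tuple of discretized degradation levels and inventory quantities, and `trans` would enumerate the reachable next states with their probabilities.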

Case study
In this section, a system of two components is investigated. The basic parameters are summarized in Table 1, where the cost parameters are taken from [16]. In addition, we set the discrete interval $\Delta$ and the inspection interval $\tau$ to be 1. The inventory capacity is set to be as large as the number of components, and the error for convergence is set to be 0.0005, implying that the gap between the result and the optimal cost is less than 0.05 percent. As shown in Figure 1, the iteration error decreases rapidly as the number of iterations increases, and converges at about 52 iterations in about 3.5 seconds. The corresponding average cost is 45.3.

MEIE 2021
Journal of Physics: Conference Series 1983 (2021) 012120, IOP Publishing, doi:10.1088/1742-6596/1983/1/012120

Figure 2 shows the decisions under the optimal strategy of the two-unit system. Since the lead time is $L = 1$, the system's state is expressed as $(x_1, x_2, I)$, where the on-hand inventory $I$ can take the values 0, 1, 2 as the maximum inventory capacity is 2. For each value of $I$, we use the degradation values $x_1$ and $x_2$ of the two components as the coordinate axes. Each graph in Figure 2 is divided into different areas representing different decisions. Compared with the threshold control strategy, the policy obtained by the value iteration algorithm considers the degradation information and the spare parts information together when deciding on replacing and ordering. Hence, the replacement threshold and the ordering threshold for the optimal strategy are not fixed. The replacement threshold of a component depends on the inventory level and the degradation status of the other component. For example, when $I = 2$, as Figure 2(c) shows, the replacement threshold of component 1 decreases from 9 to 8 as $x_2$ increases. A similar phenomenon is observed for the ordering threshold. For example, the ordering threshold increases as $I$ increases from Figure 2(b) to Figure 2(c). Dynamic thresholds enable planners to adjust maintenance and ordering decisions based on the system status, through which the total cost can be reduced.
We further compare the cost of the proposed policy to the threshold strategy $(s, S, x_p)$ adopted in [7] and [8]. In the $(s, S, x_p)$ strategy, the ordering quantity is $S - I$ if the on-hand inventory $I \le s$, and 0 otherwise, where $s$ and $S$ are the ordering threshold parameters. $x_p$ represents the replacement threshold, implying that a component is replaced when its degradation value reaches $x_p$ and spare parts are available. The optimal parameters $s$, $S$ and $x_p$ are determined by enumerating the feasible values to minimize the average cost. According to the basic parameters in Table 1, the optimal parameters for the $(s, S, x_p)$ strategy are (0, 2, 9), and the corresponding cost is 46.4. Evaluating the $(s, S, x_p)$ strategy under the optimal parameters takes about 2.5 seconds, while searching for the optimal parameters takes more than 10 seconds. Compared with the $(s, S, x_p)$ strategy, our method reduces the total cost by 2.4% with a shorter computation time (about 3.5 seconds for our method versus more than 10 seconds for the $(s, S, x_p)$ strategy).
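For comparison purposes, the benchmark threshold strategy can be written as a small decision rule. The Python sketch below is a hedged illustration, not the implementation of [7] or [8]: the function name `threshold_action` and the argument names `s`, `S` and `x_p` are hypothetical labels for the two ordering thresholds and the replacement threshold, the most degraded components are served first when spare parts are scarce, and the ordering rule is applied to the stock left after replacements. The last two points are assumptions the source does not spell out.

```python
def threshold_action(x, on_hand, s, S, x_p):
    """Fixed-threshold benchmark: returns (replace flags, order quantity).

    x: degradation values of the components
    on_hand: spare parts currently in stock
    s, S: ordering thresholds (order up to S when stock is at or below s)
    x_p: replacement threshold
    """
    replace = [0] * len(x)
    stock = on_hand
    # Assumption: serve the most degraded components first.
    for i in sorted(range(len(x)), key=lambda j: -x[j]):
        if x[i] >= x_p and stock > 0:
            replace[i] = 1
            stock -= 1
    # Assumption: the ordering rule looks at the stock left after replacements.
    order_qty = S - stock if stock <= s else 0
    return replace, order_qty
```

With the parameters (0, 2, 9) reported above, `threshold_action([9.5, 3.0], 1, 0, 2, 9)` replaces the first component and orders two spare parts, whereas the optimal policy may also take the second component's condition into account.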

Sensitivity analysis
In this section, to analyze the effectiveness of our policy, we conduct a sensitivity analysis comparing the proposed strategy with the threshold strategy under different scenarios based on the parameters in Table 1. In the following, we analyze how the holding cost, ordering cost and preventive replacement cost influence the performance of the proposed policy. The results obtained are summarized in Table 2. The influence of each parameter is summarized as follows: 1) In all cases, the cost of the optimal policy proposed in this paper is lower than that of the threshold strategy. Specifically, the cost reduction ranges from 2.2% to 9.1%.
2) As the holding cost increases, the inventory level decreases to reduce the holding costs. Since the optimal policy can adjust the ordering thresholds as shown in Figure 2, it incurs less holding time than the threshold strategy. Hence, the optimal strategy saves more cost as the holding cost increases. As the ordering cost increases, both policies order less frequently, resulting in fewer replacements and spare parts, and the cost gap between the two policies decreases. As the preventive replacement cost increases, planners would perform more corrective replacements than preventive replacements, implying that dynamic thresholds play a less important role, which results in a smaller gap between the two strategies.

Conclusions
To solve the joint optimization of replacing components and ordering spare parts in multi-unit systems, this paper proposes a policy based on the Markov decision process. Through the value iteration algorithm, decisions of replacing components and ordering spare parts are jointly optimized. Compared with the traditional threshold strategy, the proposed policy reduces the average cost since it simultaneously considers the degradation information and the inventory information. Numerical results show that the proposed policy can reduce the cost by up to 9.1%. In the future, the study can be extended in two directions: 1) Some companies order components from multiple suppliers, which may be unreliable. Therefore, it is interesting for planners to investigate how to balance multiple suppliers considering maintenance. 2) Manufacturing systems have different structures where the