A novel modeling framework for a degrading system subject to hierarchical inspection and maintenance policy



Introduction
Most systems, such as infrastructure [1], manufacturing systems [2], and energy assets [3], degrade in function and eventually fail within their designated service time because of influencing factors such as working conditions. Inspection and maintenance (I&M) are key measures for reducing operating costs and maintaining system performance. Modeling and optimizing I&M interventions can provide economic benefits through lower maintenance costs and reduced downtime, which has motivated a significant amount of research over the last few decades [4][5][6].

Research motivation
The selection and arrangement of the I&M policy intend to ensure that the system continues to satisfy the specific requirements. In many cases, especially in on-demand protection systems, dangerous failure modes with respect to system functions are not self-announcing. These hidden failures remain hidden before being revealed by inspection techniques. Let us consider the example of the failure modes 'Closing too slowly' and 'Leakage in the closed position' in safety valves [7] . These failure modes can lead to production loss and cause safety concerns, especially in high-risk industries, such as hydrocarbon gas [8] . Functional tests must be conducted to detect hidden failures to ensure that the functionality of the safety valves is adequate.
The recommended testing strategy for those hidden failures in safety valves is normally a two-level policy consisting of full stroke test (FST) and partial stroke test (PST) [9,10] , whose difference lies within the scope of the test. FST is based on the simulation of the on-demand function of the valves in actual scenarios to manifest hidden failures. The protected production process must be terminated, leading to a high cost burden. On the other hand, PST is performed by partially opening or closing the valve and then returning it to the initial position. The slight movement of the valve barely impacts the process flow or pressure but is still sufficient to reveal some hidden failure causes [7] . Simple preventive maintenance actions, such as lubrication and cleaning, are taken. Considering the difference in the test scope, PST serves as a supplementary measure to demonstrate valve integrity. Thus, system performance can be improved by follow-up maintenance actions, considering the information collected from both FST and PST. The selection of a two-level policy undoubtedly strikes a balance between economic costs and system performance.
Another case is the washing procedure for foulants on blades in compressors. The accumulation of foulants not only reduces the efficiency of the blades, which deteriorates compressor performance, but also damages the blades in extreme operating environments, which can lead to failure [11]. Thus, it is imperative to schedule offline and online washing to remove foulants in a cost-effective manner [12]. Offline washing is a thorough cleansing process but requires the complete stoppage of the turbine, thereby incurring an economic cost. In contrast, online washing is conducted during the normal operation of the compressor through the injection of an atomized cleaning fluid [13][14][15]. Online washing sacrifices cleansing efficiency compared to offline washing in order to keep the compressor running. Online washing is thus expected to be implemented hand-in-hand with offline washing to ensure optimum results in terms of both system efficiency and maintenance cost.
In the aforementioned cases, the I&M policy can be generalized as a two-level hierarchical model with full and partial inspections. Accurate information on the system state can be gathered by full inspections during system shutdown, enabling sophisticated, expensive, and thorough maintenance actions. As the name suggests, partial inspections focus on certain potentially hazardous areas, operations, and conditions in order to reduce uncertainty between two consecutive full inspections. They are less efficient in revealing the system state, and their subsequent maintenance is cheaper and less effective. Thus, this study addresses the issue of scheduling a hierarchical I&M policy.
Existing literature considers imperfect preventive maintenance but does not examine the unequal amounts of system information collected by different inspections. Liu et al. [39] propose a maintenance policy for a degrading system with age- and state-dependent operating costs and then investigate the optimal preventive maintenance policy with a repair-replacement model. Inspections are considered as time windows to make condition-based decisions, including preventive replacement, imperfect repair, or waiting until the next inspection. Huynh [40] proposes a hybrid deterioration-based maintenance model for a system subject to continuous deterioration. Then, the system performance is evaluated considering both non-memoryless imperfect preventive repairs and memoryless perfect replacements. Huynh [41] also studies condition-based maintenance for a continuously degrading system subject to multiple maintenance actions and quantifies the impacts of past-dependent preventive partial repair on economic performance. However, the system state is assumed to be fully known at inspection instants, which may be an overly ideal assumption for partial inspections. In [42], a proportional hazard model is established to estimate system performance and then to seek the optimal inspection interval, incorporating the cost and time of inspections. Truong-Ba et al. [43] propose a maintenance optimization method that considers time-varying economic conditions by combining partial opportunities and condition-based maintenance. These partial opportunities are assumed to arrive randomly, in contrast to hierarchical inspection schemes. Su et al. [44] develop a multi-level decision-making approach for the optimal planning of the maintenance operations of railway infrastructure by considering maintenance and renewal as interventions to respond to the known system state. Maintenance and renewal are considered in the model, which is equivalent to the maintenance at full inspections and the overhaul after a service time. However, their research overlooks the value of the information from partial inspections for improving system performance.
Existing methods overlook the difference in the amount of information that can be retrieved during partial and full inspections, which may lead to different maintenance activities. Complying with the hierarchical I&M policy, three maintenance actions need to be involved: a simple maintenance action at partial inspections, a more effective action at full inspections, and an overhaul after a service time. The preventive maintenance action following a partial inspection improves the system performance to a limited extent. For the full inspections, condition-based maintenance can be deployed by leveraging the state of the system from the collected information. To avoid ambiguity in terminology, two classes of preventive maintenance (PM), PM-I and PM-II, are introduced to describe hierarchical PM actions. Specifically, PM-II is performed preventively at partial inspection instants to guard against system failure, whereas the implementation of PM-I depends on the system state at full inspection instants. If the system state is revealed to be in a relatively acceptable condition, that is, less than an alarm threshold, PM-II is conducted; otherwise, PM-I is conducted.
Another factor of interest is recoverability. Recoverable and unrecoverable degradation differ in whether they can be eliminated by routine maintenance actions. Unrecoverable degradation may accumulate over the usage or service time, depending on the working conditions or the inherent properties of the system, as in the previously presented compressor example [15]. Because of this difference, imperfect maintenance models cannot simply be transplanted to quantify unrecoverable degradation. The effects of potential unrecoverable degradation deserve attention in system performance assessment and are generalized as a time-dependent function in this study.

Main contributions and paper structure
The main objective of this study is to consider a continuously degrading system subject to a hierarchical I&M policy. Considering the potential of unrecoverable degradation, this study aims to explore optimal policies with the objective of minimizing the cost rate function. The potential contributions can be expressed as:
• Proposing a generalized hierarchical inspection and maintenance policy for systems with partial and full inspections;
• Incorporating the collected system information (e.g., working/failure, degradation level) into system performance assessment;
• Building an analytical cost model for the generalized hierarchical maintenance model.
The remainder of this paper is organized as follows. Section 2 formulates the research problem and provides a detailed system description. Section 3 presents the analytical formulas for the system performance and the cost model subject to the hierarchical maintenance model. In Section 4, numerical examples are presented and analyzed to visualize the effects of the maintenance model on system performance. Potential applications and limitations of the proposed model are discussed in Section 5. Section 6 summarizes the study and discusses future perspectives.

Problem statement
Notations used in the following are summarized as follows:
In this study, we consider a degrading system modelled as a continuous stochastic process $X(t)$, $t > 0$. Moreover, the system is subjected to periodic partial and full inspections, which serve as time windows for maintenance interventions.
Each maintenance decision is based on the information collected at inspection instants. The maintenance effects need not be perfect and are quantified by two deterministic functions, denoted as $f_1(t)$ and $f_2(t)$. Figure 1 shows an example of the proposed model. For convenience in the following discussion, the following common assumptions are made:
1. The system starts to work in a new state, $X(0) = 0$. The continuous degradation process is modelled as a homogeneous Gamma process and can be revealed by inspection. The system is subject to periodic partial and full inspections with intervals $\Delta\tau$ and $\tau$, respectively. Moreover, the relationship between $\Delta\tau$ and $\tau$ can be described as $k - 1 < \tau/\Delta\tau \le k$, $k \in \mathbb{N}^+$;
2. The system failure is not self-announcing, and its state is only known at either partial or full inspection instants. The system terminates its service under two conditions: 1) the service time reaches a designated overhaul time $T_0$; 2) the system is revealed to be in a failed state at an inspection instant, which implies that the degradation level has reached a predefined failure threshold $L$ between two successive inspection instants. Denoting the first hitting time of the failure threshold $L$ as $T_F$, the time at which the failure becomes known is the next inspection time $T = \Delta\tau \cdot \lceil T_F/\Delta\tau \rceil$. This means that the renewal cycle length is the minimum of $T$ and $T_0$. Moreover, in both conditions, corrective maintenance (CM) is performed with a maintenance cost $C_{CM}$. For condition 2), in addition to the cost $C_{CM}$, there is a potential downtime-related cost for the time spent in the failed state;
3. The first-class preventive maintenance (PM-I) may be conducted at the instants of full inspection with interval $\tau$, depending on the revealed system state. Generally, more information can be collected during full inspections, and the degradation level at these inspection instants is assumed to be known perfectly. If the degradation level at $n\tau$, $n \in \mathbb{N}^+$, is beyond the PM threshold $M$ ($M \le L$), it is reduced to a specific value on the function $f_1(n)$, as at $2\tau$ and $3\tau$ in Fig. 1. Every single PM-I action induces a cost, denoted as $C_{PM}^{I}$;
4. The second-class preventive maintenance (PM-II) is scheduled at the instants of partial inspections with interval $\Delta\tau$. Through the partial inspections, it is only known whether the system is functioning or not. If the system has failed, the system service terminates, and CM is conducted with maintenance cost $C_{CM}$. If the system is found working at a partial inspection instant, PM-II follows. For example, at the instants $\Delta\tau$ and $2\Delta\tau$ in Fig. 1, the system degradation after PM-II is reduced to $f_2(0, 1)$ and $f_2(0, 2)$, respectively. If the revealed degradation level is less than the PM threshold $M$ at a full inspection instant, such as $\tau$ in Fig. 1, PM-II is conducted to reduce the degradation level to $f_2(0, 3)$. The relevant maintenance cost is $C_{PM}^{II}$. Moreover, it is assumed that $C_{PM}^{II} < C_{PM}^{I} < C_{CM}$;
5. In terms of the deterministic functions of maintenance effects, PM-I leads to better system performance than PM-II; this indicates that the slope of the function $f_2(t)$ is no less than that of $f_1(t)$. Moreover, it is assumed that $f_1(t)$ and $f_2(t)$ remain lower than the preventive maintenance threshold $M$ and the failure threshold $L$ in the finite service time $T_0$;
6. The time spent in maintenance is negligible.
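The timing rules in assumptions 1 and 2 can be illustrated with a short Python sketch. It assumes that the partial inspection interval is called `delta_tau` and the full inspection interval `tau` (names chosen here for illustration) and that inspections fall on a uniform grid of step `delta_tau`. The function names are hypothetical and are not part of the original model description.

```python
import math

def ratio_factor(tau: float, delta_tau: float) -> int:
    """Smallest integer k with k - 1 < tau / delta_tau <= k."""
    return math.ceil(tau / delta_tau)

def revealed_failure_time(t_f: float, delta_tau: float) -> float:
    """Next inspection instant at or after the hidden failure time t_f."""
    return delta_tau * math.ceil(t_f / delta_tau)

def cycle_length(t_f: float, delta_tau: float, t_0: float) -> float:
    """Renewal cycle length: revealed failure time capped by the overhaul time t_0."""
    return min(revealed_failure_time(t_f, delta_tau), t_0)
```

For instance, with `delta_tau = 5`, a hidden failure at time 7.3 is only revealed at the inspection at time 10, so the system spends 2.7 time units in the failed state.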

Analytical formulas
As described in Section 2, the system is subject to replacement either because of failure or because it has reached the designated service time $T_0$. To evaluate the performance of the proposed hierarchical maintenance model, we use the average cost per unit time in the operational phase as the objective function, that is, the ratio of the expected total cost $E[TC]$ to the expected cycle length $E[H]$, where $TC$ is the total inspection and maintenance cost and $H$ is the length of a cycle, with $H = \min(T_0, T)$. The expected total maintenance cost is expressed in terms of $N_{PM}^{I}$, $N_{PM}^{II}$, and $T_d$, which denote the number of PM-I actions, the number of PM-II actions, and the elapsed time in the failed state, respectively.
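A minimal sketch of how such a cost rate could be assembled is given below. The decomposition is an assumption made for illustration: one CM cost per renewal cycle, PM costs proportional to the expected numbers of PM-I and PM-II actions, a hypothetical downtime cost per unit time `c_down`, and a hypothetical lump sum `c_inspections` for inspection costs. None of these parameter names come from the original text.

```python
def expected_total_cost(c_cm, c_pm1, c_pm2, c_down,
                        n_pm1, n_pm2, t_down, c_inspections=0.0):
    """Expected cost of one renewal cycle under the stated assumptions.

    c_cm is charged once per cycle; c_down is an assumed cost per unit of
    downtime; c_inspections is an assumed lump sum for inspection costs.
    """
    return c_cm + c_pm1 * n_pm1 + c_pm2 * n_pm2 + c_down * t_down + c_inspections

def cost_rate(total_cost, cycle_length):
    """Average cost per unit time: expected cycle cost over expected cycle length."""
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    return total_cost / cycle_length
```

With illustrative inputs such as `expected_total_cost(100.0, 20.0, 5.0, 10.0, 1.2, 3.5, 0.4)` and an expected cycle length of 25, the sketch returns a cost rate of about 5.82.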
Let $X(t)$ denote the underlying baseline degradation process. Then, the first hitting time $T_F$ of the degradation process with respect to the failure threshold $L$ is
$$T_F = \inf\{t \ge 0 : X(t) \ge L\}.$$
Thus, the expected length of the renewal cycle is given by
$$E[H] = E[\min(T_0, T)] = \int_0^{T_0} R_T(t)\,dt,$$
where $R_T(t) = P(T > t)$ denotes the survival function of $T$. The survival function of $T_F$, $R_{T_F}$, is derived later.
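Because a failure is only revealed at inspection instants, the survival function of T is piecewise constant between inspections, and the expected cycle length reduces to a finite sum. The sketch below uses the standard identity that the expectation of min(T_0, T) equals the integral of P(T > t) over [0, T_0]. It assumes a uniform inspection grid of step `delta_tau` (a name chosen here) and takes the survival probabilities as given inputs; it does not derive them.

```python
def expected_cycle_length(survival, delta_tau, t_0):
    """Integrate a piecewise-constant survival function of T over [0, t_0].

    survival[j] is P(T > j * delta_tau), assumed constant on
    [j * delta_tau, (j + 1) * delta_tau). The caller must supply enough
    values to cover [0, t_0].
    """
    total = 0.0
    for j, r in enumerate(survival):
        left = j * delta_tau
        if left >= t_0:
            break
        right = min(left + delta_tau, t_0)
        total += r * (right - left)
    return total
```

For example, with survival probabilities 1, 0.5, and 0.25 on a grid of step 5 and an overhaul time of 15, the expected cycle length is 5 + 2.5 + 1.25 = 8.75.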

The vector of maintenance levels
As stated in Section 2, the maintenance decisions during full inspections are based on the system condition. The random variable of the possible maintenance action, $\psi_n$, at instant $n\tau$ ($n \in \mathbb{N}^+$) can be written as
$$\psi_n = \begin{cases} 0, & X(n\tau^-) \ge L, \\ 1, & M \le X(n\tau^-) < L, \\ 2, & X(n\tau^-) < M. \end{cases}$$
For convenience, a vector of maintenance levels (VML), $\psi_{1:n}$, is introduced to define the evolution of the system state with specific maintenance actions $\psi_i$ at the $i$th full inspection, $i = 1, 2, \cdots, n$. The length of the VML reflects the number of full inspections performed in a renewal cycle. A VML $\psi_{1:n}$ with a length of $n$ is considered 'legal' if 0 does not appear or only appears as the last element.
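The coding of maintenance levels and the legality rule can be sketched in Python as follows. The sketch reads 1 as PM-I, 2 as PM-II, and 0 as a revealed failure, which is the interpretation suggested by the threshold conditions in this section; the function names are illustrative.

```python
def maintenance_level(x: float, pm_threshold: float, failure_threshold: float) -> int:
    """Code the outcome of a full inspection that reveals degradation level x."""
    if x >= failure_threshold:
        return 0  # failure revealed
    if x >= pm_threshold:
        return 1  # PM-I
    return 2      # PM-II

def is_legal(vml) -> bool:
    """A VML is legal if 0 is absent or appears only as the last element."""
    return 0 not in vml[:-1]
```

With M = 30 and L = 50, a revealed level of 35 gives level 1, and the sequence (2, 1, 0) is legal whereas (0, 2) is not.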
Considering the aforementioned assumptions and $\psi_n$, at time $\tau$ ($n = 1$), the occurrence of $\psi_1 = 1$ implies that the revealed system state lies between the PM threshold $M$ and the failure threshold $L$, expressed as $M \le X(\tau^-) < L$; $\psi_1 = 2$ indicates that the revealed system state is less than the PM threshold $M$, that is, $X(\tau^-) < M$. Both implicitly require that the system survives until the instant $(k-1)\Delta\tau$, where $\Delta\tau$ is the partial inspection interval. The term for the degradation level $f_1(a)$ is omitted in this case because the system starts to work in a new state. The degradation level immediately after each partial inspection is replaced by the value of the function $f_2(a, b)$, where $a = 0$ and $b$ depends on the partial inspection instant. Two increments are distinguished: the degradation over an entire partial inspection interval and the degradation over the last interval from $(k-1)\Delta\tau$ to $\tau$. The distribution of $\psi_1$ can be expressed as below:
Consider now the joint distribution of a legal VML $\psi_{1:n}$:
$$P[\psi_{1:n}] = \prod_{h=1}^{n} P[\psi_h \mid \psi_{1:h-1}], \qquad (8)$$
where $P[\psi_1 \mid \psi_{1:0}]$ is defined as $P[\psi_1]$. The distribution of $\psi_{1:n}$ depends on the appearance of PM-I. To compute the multiplicand, we introduce the notion of the last PM-I associated with a particular realization of $\psi_{1:n}$. Then, we can obtain $v_{\psi_{1:n}}$ as:
The conditional probability $P[\psi_h \mid \psi_{1:h-1}]$ in Eq. (8) for $\psi_h = 1$ and $\psi_h = 2$ can be calculated respectively as follows:
To facilitate further discussion about $R_{T_F}(t)$ and other quantities, we denote by $A_n$ the set of legal VMLs of length $n$ that do not include 0, and by $B_n$ the set of legal VMLs of length $n$ that end with 0.
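The sets of legal VMLs and the chain-rule factorization of their joint probability can be enumerated directly, as in the following sketch. The conditional probability is passed in as a user-supplied function because its closed form is not reproduced here, and the convention of returning 0 when no PM-I has occurred is an assumption made for this illustration.

```python
from itertools import product

def legal_vmls_without_failure(n: int):
    """A_n: all length-n sequences over {1, 2} (no revealed failure)."""
    return [tuple(p) for p in product((1, 2), repeat=n)]

def legal_vmls_ending_in_failure(n: int):
    """B_n: length-n sequences with entries in {1, 2} followed by a final 0."""
    return [tuple(p) + (0,) for p in product((1, 2), repeat=n - 1)]

def last_pm1_index(vml) -> int:
    """1-based position of the last PM-I in vml, or 0 if none (assumed convention)."""
    for h in range(len(vml), 0, -1):
        if vml[h - 1] == 1:
            return h
    return 0

def joint_probability(vml, conditional) -> float:
    """Chain rule: product over h of conditional(psi_h, (psi_1, ..., psi_{h-1}))."""
    prob = 1.0
    for h, psi in enumerate(vml):
        prob *= conditional(psi, vml[:h])
    return prob
```

Under this enumeration, the set A_n contains 2^n vectors and B_n contains 2^(n-1) vectors, which indicates how quickly the number of terms grows with the number of full inspections.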

Distribution of the first hitting time
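The survival function of $T_F$ at an arbitrary instant $t$ can be computed by conditioning on the nearest time grid point before $t$. Let $\omega = \lfloor t/\tau \rfloor$ and $m = \lfloor (t - \omega\tau)/\Delta\tau \rfloor$, where $\Delta\tau$ is the partial inspection interval. The pair $(\omega, m)$ fully determines the position of $t$: $\omega\tau + m\Delta\tau \le t < \omega\tau + (m+1)\Delta\tau$. Then,
The special case of $R_{T_F}(t)$ where $t = \tau$ has a concise expression: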
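Locating an instant t on the inspection grid amounts to two floor operations, as in the sketch below. It assumes that `omega` counts the full inspection intervals completed before t and that `m` counts the partial inspection intervals completed since the last full inspection; `delta_tau` is the name used here for the partial inspection interval.

```python
import math

def locate(t: float, tau: float, delta_tau: float):
    """Return (omega, m) with omega*tau + m*delta_tau <= t < omega*tau + (m+1)*delta_tau."""
    omega = math.floor(t / tau)
    m = math.floor((t - omega * tau) / delta_tau)
    return omega, m
```

For example, with `tau = 10` and `delta_tau = 5`, the instant t = 17.5 is located at (1, 1), since 15 <= 17.5 < 20.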

Expected number of PM-II actions, $E[N_{PM}^{II}]$
$N_{PM}^{II}$ is related to the values of both $\omega$ and $m$ in $T_0$. The calculation of $N_{PM}^{II}$ is based on the discussion of the first hitting time $T_F$. Assuming $n\tau + (i-1)\Delta\tau < T_F < n\tau + i\Delta\tau$, where $\Delta\tau$ is the partial inspection interval, this is abbreviated as $T_F \in (n, i-1)$.
The first term of the conditional expectation is given by:
$$E\big[N_{PM}^{II} \mid \psi_{1:n-1},\, T_F \in (n-1, i-1)\big] = (k-1)(n-1) + \sum_{j=1}^{n-1} \mathbb{1}(\psi_j = 2) + (i-1),$$
where the term $(k-1)(n-1)$ denotes the number of PM-II inside the first $n-1$ full inspection cycles, $\sum_{j=1}^{n-1} \mathbb{1}(\psi_j = 2)$ denotes the number of PM-II performed at the full inspection instants $j\tau$, and the last term $(i-1)$ denotes the number of PM-II conducted before $T_F \in (n-1, i-1)$. The joint probability of $\psi_{1:n-1}$ and $T_F$ is:
The second conditional expectation is:
The second joint probability is:
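The three-term count described above can be checked on small cases with the following sketch. It assumes that `vml_prefix` holds the maintenance levels at the full inspections already completed, that the failure occurs in the `i`-th partial inspection interval of the current full inspection cycle, and that each completed cycle contains `k - 1` partial inspections.

```python
def pm2_count(vml_prefix, i: int, k: int) -> int:
    """Number of PM-II actions performed before a failure in the i-th partial
    interval of the cycle that follows len(vml_prefix) full inspections."""
    completed = len(vml_prefix)                          # n - 1 full inspections
    inside_cycles = (k - 1) * completed                  # PM-II at partial inspections
    at_full = sum(1 for psi in vml_prefix if psi == 2)   # PM-II at full inspections
    return inside_cycles + at_full + (i - 1)
```

With k = 3, the prefix (2, 1), and a failure in the second partial interval of the third cycle, the count is 4 + 1 + 1 = 6.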

Distribution of downtime
The distribution of downtime depends on the joint probability of $\psi_{1:n-1}$ and $T_F$.
The first summand can be decomposed as:
The second summand is:
The last summand is:

Numerical example
The complexity of the analytical formulas in Section 3 necessitates a numerical example to visualize the proposed maintenance model. This section begins with an illustrative example considering only perfect maintenance, before addressing more comprehensive cases with both analytical formulae and Monte Carlo simulations to validate our results. Several factors in the proposed model, including the partial and full inspection intervals and the PM threshold M, are also explored in the context of the optimization of the cost rate function.

Illustrative example
Consider a deteriorating system whose degradation behavior is assumed to be described by a Gamma process $X(t)$ with shape parameter $\alpha = 3$ and scale parameter $\beta = 0.5$; then, the mean and variance are $\alpha/\beta$ and $\alpha/\beta^2$, respectively. The designated service time (overhaul) is $T_0 = 30$. If the system degradation level exceeds a predefined failure threshold $L = 50$, the system fails. The system is subject to the proposed hierarchical maintenance model, which includes PM-I, PM-II, and CM. Furthermore, each maintenance action can contribute to an improvement of system performance, as described in Section 2. The relevant maintenance costs are presented in Table 1.
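A homogeneous Gamma process of this kind can be sampled with the Python standard library, as sketched below. The text calls β a scale parameter, whereas the stated mean α/β and variance α/β^2 correspond to a rate (inverse-scale) parameterization; the sketch assumes the latter, so the scale passed to the sampler is 1/β. This reading is an assumption, and the function names are illustrative.

```python
import random

def gamma_increment(rng, alpha: float, rate: float, dt: float) -> float:
    """Degradation increment of a homogeneous Gamma process over a time step dt."""
    # random.gammavariate takes (shape, scale); here scale = 1 / rate.
    return rng.gammavariate(alpha * dt, 1.0 / rate)

def sample_path(rng, alpha, rate, dt, n_steps):
    """Degradation levels X(dt), X(2*dt), ..., starting from X(0) = 0."""
    x, path = 0.0, []
    for _ in range(n_steps):
        x += gamma_increment(rng, alpha, rate, dt)
        path.append(x)
    return path

# Empirical check of the per-unit-time mean and variance for alpha = 3, rate = 0.5.
rng = random.Random(0)
samples = [gamma_increment(rng, alpha=3.0, rate=0.5, dt=1.0) for _ in range(20000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
```

Under this parameterization, the sample mean and variance should be close to 6 and 12, respectively, and every sampled path is non-decreasing.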

A special case: Perfect maintenance actions
In the hierarchical model, the deterministic functions $f_1(t)$ and $f_2(t)$ that describe the maintenance results can be generalized as power functions $f_1(t) = a_1 t^{b_1}$ and $f_2(t) = a_2 t^{b_2}$, with $a_1, b_1 \ge 0$ and $a_2, b_2 \ge 0$, respectively. The system is subject to perfect PM actions if the coefficients $a_1 = a_2 = 0$. In this case, the full inspection length $\tau$ may be treated as the cycle length. Then, in the first interval, $\tau$,
It should be noted that when $k = 1$, the policy reduces to time-based maintenance, where the system is repaired with an interval $\tau$ independent of the state. If the state is known, it is typical condition-based maintenance with predefined PM and CM thresholds.
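The power-function form of the maintenance effect is simple to express in code. The following sketch builds such a function from its two coefficients; the helper name `power_effect` and the example coefficient values are illustrative, and non-negative coefficients are assumed.

```python
def power_effect(a: float, b: float):
    """Return f(t) = a * t**b, the post-maintenance degradation level at time t."""
    if a < 0 or b < 0:
        raise ValueError("a and b are assumed non-negative")
    return lambda t: a * t ** b

perfect = power_effect(0.0, 1.0)    # a = 0: degradation reset to zero
linear = power_effect(0.5, 1.0)     # residual degradation grows linearly with time
constant = power_effect(2.0, 0.0)   # b = 0: restoration to a fixed level
```

Here `perfect(10.0)` returns 0.0, `linear(4.0)` returns 2.0, and `constant(7.0)` returns 2.0, which correspond to perfect maintenance, time-dependent residual degradation, and restoration to a fixed level.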
A brief study with M = 30 is conducted here to optimize the partial inspection frequency factor k with respect to the long-run cost rate C, as depicted in Fig. 2. In this case, the optimal value is k = 2. For larger k, the system is subject to excessive partial inspections, so the cost rate increases with k. In contrast, failure-related costs, including $C_{CM}$ and downtime costs, lead to the highest cost rate at k = 1.
Table 2 shows the values of several terms, such as $N_{PM}^{I}$, $E(T_d)$, and $E(H)$, obtained by the analytical formulas in Section 3.3 and by Monte Carlo simulation with $10^5$ episodes. The analytical formulas are validated by the slight differences shown in Table 2. Figure 3 shows the effects of M and k on the cost rate. Specifically, k represents the frequency of PM-II, whereas M represents the intervention opportunity of PM-I. Figure 3 a depicts the tendency of the cost rate with k for a specific value of M; for different values of M, there is a similar tendency in the relationship between the cost rate and k. The cost rate drops with the parameter k and then rises again after a specific optimal value. For k = 1, insufficient inspections fail to reveal the system state and consequently result in the highest cost rate. At k = 1, Fig. 3 b further exhibits a positive correlation between the cost rate and M, indicating the value of an earlier PM action when inspection opportunities are limited.
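A Monte Carlo check of this kind can be sketched as follows. This is not the authors' simulation code: it assumes that the full inspection interval is an exact multiple k of the partial inspection interval, that β acts as a rate parameter, and that the degradation level after maintenance is set to the value of the corresponding maintenance function. It tracks only the cycle length and the PM counts; downtime and costs are omitted because they require the hitting time within an interval and the cost values of Table 1.

```python
import random

def simulate_cycle(rng, alpha, rate, L, M, tau, k, T0, f1, f2):
    """Simulate one renewal cycle; return (H, number of PM-I, number of PM-II)."""
    dt = tau / k                      # partial inspection interval, assuming tau = k * dt
    x, n_pm1, n_pm2, step = 0.0, 0, 0, 0
    while True:
        step += 1
        t = step * dt
        x += rng.gammavariate(alpha * dt, 1.0 / rate)   # Gamma increment over dt
        if t >= T0 - 1e-9:            # overhaul time reached: cycle ends
            return T0, n_pm1, n_pm2
        if x >= L:                    # hidden failure revealed at this inspection
            return t, n_pm1, n_pm2
        if step % k == 0 and x >= M:  # full inspection above the PM threshold
            x, n_pm1 = f1(t), n_pm1 + 1
        else:                         # partial inspection, or full inspection below M
            x, n_pm2 = f2(t), n_pm2 + 1

def estimate(n_cycles, seed=0, **params):
    """Average H, PM-I count, and PM-II count over n_cycles simulated cycles."""
    rng = random.Random(seed)
    runs = [simulate_cycle(rng, **params) for _ in range(n_cycles)]
    return tuple(sum(r[j] for r in runs) / n_cycles for j in range(3))

params = dict(alpha=3.0, rate=0.5, L=50.0, M=30.0, tau=10.0, k=2, T0=30.0,
              f1=lambda t: 0.0, f2=lambda t: 0.0)   # perfect maintenance case
mean_h, mean_pm1, mean_pm2 = estimate(2000, **params)
```

In every simulated cycle, the number of PM actions equals the number of inspections passed before the cycle ends, which gives a simple consistency check on the bookkeeping.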
As k increases, which is equivalent to shorter partial inspection intervals, the cost induced by intensive inspections contributes to an increase in the cost rate. Moreover, for k = 5 and k = 10, it can be seen from Fig. 3 b that the cost rate shows a negative relationship with the PM threshold M. This correlation results from the effect of PM-II. Although a single PM-II is imperfect in mitigating system degradation, more frequent interventions can reduce the possibility of system failure and reduce the reliance on PM-I, which has a higher cost $C_{PM}^{I}$. This finding implies that if the system is subject to sufficient partial inspections, PM-I could be triggered at a worse system state from an economic perspective.
The cost rate reaches a minimum value at a specific k , as depicted in Fig. 3 a through the examples, k = 2 for M = 30 and k = 4 for M = 50 . The corresponding optimum cost rate reflects the balance between the benefits and losses of PM-II: briefly, the former comes from preventing system failures, whereas the latter is related mainly to the economic costs of inspections. The effect of M is further investigated at k = 2 which has an optimal cost rate for both M = 30 and M = 40 . An optimal cost rate is shown in Fig. 3 b, where M is approximately 34. Thus, the optimal maintenance policy with the lowest cost rate C ≈ 2.97 is k = 2 and M = 34 .

Sensitivity analysis
This subsection presents a sensitivity analysis of several key factors intended to provide clues for practitioners to apply this hierarchical maintenance model.

Sensitivity analysis of the maintenance policy
The comprehensive maintenance policy emphasizes the frequency and timing of partial and full inspection actions. Within the designated service time, the maintenance policy is determined by the full inspection length τ and the ratio factor k between the full and partial inspection lengths. The objective is to minimize the cost rate $C_t(\tau, k)$. The main parameters are in accordance with those in Section 4.1: $T_0 = 30$, M = 30. The results of these policies are shown in Fig. 4 a. Given any value of M, it is clear that the cost rate has its maximum at k = 1. When k = 1, no PM-II is conducted between any two successive PM-Is. The tendency for k = 2 is comparable to that for k = 1. Figure 4 b shows that the minimum cost rate in these two cases is reached at τ = 10. The failure-related cost, which results from insufficient inspections, dominates the cost rate function. However, when k = 10, the cost rate is negatively related to τ. The system is exposed to excessive PM-II, which increases the cost rate. The projection curve in Fig. 4 b illustrates the potential optimal (τ, k) for reaching the minimum cost rate. The failure-related cost dominates the cost rate in the zone to the left of the projection curve, whereas the inspection-induced cost dominates in the zone to the right.
τ is further investigated when k takes a specific value. As Fig. 5 shows, there is a similar tendency between the cost rate and the parameter k when τ = 10 and τ = 15 . Specifically, the cost rate first decreases with k and then increases after a certain value. It falls to a low point of approximately 2.99 at k = 2 when τ = 10 . The lowest point shifts to 2.89 at k = 4 when τ = 15 . When k = 1 , the uncertainty regarding the system state leads to the highest cost rate owing to a lack of inspections. The reduction in the cost rate for τ = 10 compared with τ = 15 results from an additional potential inspection opportunity. Then, more frequent partial inspections provide pertinent state information for PM actions, but at a higher related cost. By comparing the two tendency curves, it can be observed that the policy with τ = 15 at k = 4 has a better cost rate than the policy at k = 2 when τ = 10 . These findings provide practitioners with clues concerning decision-making regarding the optimal maintenance policy.

Sensitivity analysis of maintenance efficiencies
Maintenance efficiencies refer to the coefficients of the maintenance functions $f_1(t)$ and $f_2(t)$ and the triggering threshold M of PM-I; these factors interact during the decision-making process. As stated in Section 4.1.1, the system is subject to perfect maintenance if $a_1 = a_2 = 0$. However, if $b_1 = b_2 = 0$, each maintenance action restores the system degradation to a certain fixed level [17]. The impacts of parameters a and b on the cost rate are studied separately in the following. The hierarchical model is assumed with the main parameters $T_0 = 30$ and a full inspection interval τ = 10 equal to twice the partial inspection interval.
The assumption in Section 2 regarding the advantage of PM-I in restoring the system state implies that $a_2 \ge a_1 \ge 0$ if $b_1$ and $b_2$ have the same value (assumed to be $b_1 = b_2 = 1$). As shown in Fig. 6 a, the cost rate reaches a minimum value at M = 30 when $a_1 = a_2 = 0$ and a maximum value when $a_1 = a_2 = 1$. This demonstrates the economic value of better maintenance efficiency. When $a_1$ is fixed, the cost rate increases with $a_2$. Meanwhile, Fig. 6 b shows that the cost rate is independent of the value of $a_1$ when M = 50. This is due to the absence of PM-I when the triggering threshold M of PM-I equals the system failure threshold L. The obvious conclusion is that, when the PM threshold is known, it is better to perform maintenance actions with better outcomes.
Then, the cost rate with parameters $a_1$ and M is studied, as depicted in Fig. 7. As previously discussed and shown in Fig. 6 b, the cost rate is independent of $a_1$ when M = L = 50, which appears as straight lines in Fig. 7. Among the other values of the PM threshold M, the cost rate increases with $a_1$. When $a_1$ takes quite small values, e.g., $a_1 < 0.5$, there is an optimization problem between the cost rate and the PM threshold M. The cost rate decreases with M first, given the intervention of PM-I with quite good outcomes in the restoration of the system state. After a certain low point, the cost rate rises again because of the PM-I-related costs. However, when the effect of PM-I is limited, indicating a higher value of $a_1$, the cost rate shows a reverse tendency as M increases. This is because PM-I offers limited benefits in improving system performance but has a much higher maintenance cost than a single PM-II. This indicates that PM-II actions can be prioritized if maintenance effects are acceptable.
When $a_1 = a_2 = 1$, parameter b is assumed to range from 0.8 to 1 and is used to adjust the impact of parameter a on the cost rate. Compared with the previous result, the cost rate has its maximum value when $b_1 = 1$, owing to the poor effects of maintenance actions. An optimum value exists at M = 40 when $b_1 = 0.8$ (Fig. 8).
These numerical examples clearly show that the proposed hierarchical I&M framework provides an opportunity to incorporate information collected in both full and partial inspections. It can be simplified to a typical time- or condition-based I&M policy that relies on the information collected at inspections if unrecoverable degradation is absent, as stated in Section 4.1.1. The inspection interval length is the main contributor to the cost rate in the system service cycle, which requires a compromise between the inspection-induced cost and the downtime cost.
When partial inspections are introduced too frequently between two consecutive full inspections, the benefit in the improvement of the system performance exceeds the economic cost, which reduces the need for a preventive maintenance threshold at full inspection instants in minimizing the cost rate. The unrecoverable degradation, which correlates with the elapsed service time, deserves more attention given its direct impact on the maintenance cost.

Potential applications and limitations
The proposed model follows the Total Productive Maintenance philosophy in seeking solutions for the I&M interventions of a degrading system. It provides guidance for practitioners, e.g., maintenance managers and reliability engineers, in decision-making that involves multiple factors, including the maintenance threshold, partial and full inspection allocation, and unrecoverable degradation. A hierarchical model is proposed considering the variety of maintenance interventions. PM-II in the proposed model refers to the basic maintenance activities that can be easily implemented with minor cost and a high frequency, such as lubrication, tightening screws, and changing filters. Meanwhile, mechanics, electricians, or other maintenance technicians can perform more sophisticated inspections with techniques such as oil, vibration, and infrared analysis, which correspond to PM-I. These activities require more resources and are carried out less frequently unless certain conditions are met. However, the effects of maintenance interventions are affected by many factors and may be either perfect or imperfect. Unlike the common as-good-as-new PM effect in most existing studies, the proposed model relaxes this assumption and provides a generalized solution to quantify the maintenance effects by introducing the functions $f_1(t)$ and $f_2(t)$. For example, the unrecoverable deterioration caused by certain factors, e.g., aging, which might not be eliminated even under PM-I by experienced technicians, could be considered and quantified with the function $f_1(t)$. The proposed model thus has potential advantages and applicability in actual industrial cases.
It is recommended that practitioners should pay more attention to validating model assumptions rather than calculations, as numerical formulas and simulation approaches are developed in this study. Practitioners should establish the mapping between their practical problem and the model to ensure basic assumptions are met. For instance, in an oil analysis scenario, the degradation could be the amount of water or ferrous wear particles. PM-II may be simple inspections performed by the equipment operator to look at and smell the lubricating oil together with minor corrective actions; PM-I consists of Karl Fischer titration and/or analytical ferrography to determine the water amount or wear particle count followed by preventive/corrective actions such as change of filter or oil replacement. Taking frequent samples and trending the data could be a preliminary for validating the Gamma process assumption. Necessary model modification, e.g., changing the assumed Gamma process to a Wiener process, should be considered ad hoc. Note that the assumption validation process requires enormous teamwork: knowledge and experiences from operators, technicians, engineers, and eventually condition monitoring experts are indispensable.
The first limitation of our model is that it overlooks the randomness of the imperfect repair (IR) effect. In the literature, the degradation reduction is most commonly random and possibly proportional to the degradation level. The deterministic functions $f_1$ and $f_2$ are a simplification, and randomness will be introduced in future work by defining $f_1$ and $f_2$ as the mean of the IR effect. Second, it is assumed that the model parameters are known. This is generally over-optimistic, and parameter estimation should be carefully considered by practitioners using maintenance records and/or expert experience to ensure the accuracy and efficiency of the proposed model.

Concluding remarks
A hierarchical maintenance policy is proposed in this study to incorporate multi-level measures in the operational phase of critical systems, such as compressors and safety valves. Partial inspections determine whether the system is functional and are followed by partial preventive maintenance, while full inspections provide more accurate information about the system's state and allow for condition-based maintenance. Compared to existing works, the proposed model can quantify the unrecoverable degradation that accumulates over time.
Numerical examples are conducted to demonstrate the advantages and usefulness of the proposed model, which is shown to cover both imperfect and perfect maintenance models. The cost rate function is employed as the evaluation criterion for maintenance policy selection. When maintenance actions are perfect, the preventive threshold should be determined by weighing the benefits of PM actions against the associated maintenance costs. For hierarchical models, if the maintenance actions are effective in restoring the system state, the full maintenance action is helpful. However, maintenance actions with limited effects contribute to higher cost rates. In this case, partial inspections can be prioritized.
It would be interesting to consider more practical issues, such as the state-based maintenance cost and duration, for future studies. Other important research directions include validating the proposed model using field data, estimating parameters, and introducing randomness in the maintenance effect.

Data availability
No data was used for the research described in the article.