Sensitivity analysis of repairable redundant system with switching failure and geometric reneging

Article history: Received September 16, 2016; received in revised format October 22, 2016; accepted February 22, 2017; available online February 22, 2017.

This study deals with the performance modeling and reliability analysis of a redundant machining system composed of several functional machines. To analyze more realistic scenarios, the concepts of switching failure and geometric reneging are included. The time-to-breakdown and repair time of operating and standby machines are assumed to follow the exponential distribution. For the quantitative assessment of the machine interference problem, various performance measures such as mean-time-to-failure, reliability, reneging rate, etc. have been formulated. To show the practicability of the developed model, a numerical illustration has been presented. For the practical justification and validity of the results established, the sensitivity analysis of reliability indices has been presented by varying different system descriptors.

© 2017 Growing Science Ltd. All rights reserved.


Introduction
Queueing and reliability models have become popular for the performance analysis of several functional operations of machining systems. Markov modeling of machine repair systems has drawn the attention of many queueing theorists and practitioners because of its effective applicability in reducing the overall functional cost of the organization/machining system and in providing better services to the users. The machine repair problem, which represents a finite population queueing system, has played an important role because of its high practicability in many industrial sectors ranging from car repair workshops to space stations. Wherever there is a machining system, there will be failures associated with it, and they should be tackled in an effective and economic manner so that they do not have a serious adverse impact on the system performance. The machine interference problem becomes particularly critical when the number of functional machines in a system becomes large and the machines fail frequently. The study of finite population queueing systems and the associated machine repair problem and reliability issues has gained importance with the rise of software-embedded machining systems, wherein computer systems are employed for the automatic allocation of different job processes.
In many machining systems, machine failures can lead to several problems such as blocking and delay, a drop in profits, unsatisfactory use, etc. To control this situation, the system needs spare machines along with some maintenance policies. For illustration, we cite the automated manufacturing system (AMS), where machinery parts are manufactured using automated methods such as CNC machines and robotic arms. But, like any other system, the functional machines in an AMS are also susceptible to failure. In case of failure of any machine, the available standby machine is put in place by an automatic switching mechanism. Also, every time a machine fails, it is sent to a repair shop where repairmen are present to repair the machines. But the number of repairmen is also limited. If a large number of machines fail, there will be a queue of failed machines at the repair shop, which can lead to further delay and thus increases the working costs of the organization. Sometimes, as a result of such long queues, the caretaker who is responsible for getting the failed machines repaired might decide to leave the queue after waiting for some time; this type of giving up is called reneging, and it adds to the cost of operating the system. Whenever a machine fails and is replaced with an available standby machine, switching failures may occur during the switchover process. Thus, imperfect switching may happen and in turn affect the redundancy process. In a machining system encountered in production and manufacturing or in any other setting, detecting faults and correcting them is necessary in order to bring down the total cost of maintenance and to increase profits for the organization. The present investigation addresses a generic machine repair problem encountered in a machining system which can operate in a normal mode or a short mode depending on the available number of operating machines. The failure rate of operating machines in short mode, when there are fewer machines than the number required for normal system functioning, is greater than the failure rate in normal mode due to load sharing. In general, this phenomenon is termed the $(m, M)$ policy in queueing theory terminology.
The purpose of developing the Markov model in this paper is threefold. First, we provide steady-state and transient-state performance indices for the redundant machine repair problem with geometric reneging, standby switching failures and multiple repairmen. The system performance measures to be developed include the average number of failed machines, the throughput of the system, the average number of operating/standby machines, the average reneging rate/switching failure rate, machine availability and operative utilization, etc. We also establish some reliability indices, viz. the reliability of the system, MTTF, failure frequency, etc. Secondly, we develop a cost model to give a quick insight into the optimal values of the number of standbys and the number of repairmen simultaneously at a minimum cost. Thirdly, sensitivity analyses are carried out for the reliability characteristics to study the effect of changes in specific values of the system parameters.
The research article is organized as follows. Section 2 provides a brief literature review related to the machine repair problem (MRP) concerned. In Section 3, assumptions and notations are presented to provide a detailed system description. We then formulate the Chapman-Kolmogorov differential-difference equations using a quasi-birth-death (QBD) process. In Section 4, a matrix approach that implements the Runge-Kutta method of order 4 to compute the stationary distribution for the number of failed machines is outlined. In Section 5, various queueing and reliability characteristics for the governing model are proposed. Using these performance indices, a cost structure is also developed. Some numerical results, obtained by taking illustrations of practical situations that fit the model, are given in Section 6. The sensitivity analysis of the reliability measures with respect to different system parameters is also carried out. Finally, Section 7 concludes the research article with a summary that highlights the novel features of the investigation.

Literature Review
Since the inception of the Markov model for the machine interference problem of the textile industry (Palm, 1943), a lot of research work has been reported in survey articles in this area (e.g. Haque & Armstrong, 2007; Jain et al., 2010; Krishnamoorthy et al., 2014; Chandrasekaran et al., 2016). Many researchers have worked on the problem of machining systems with standby machines (cf. Wang & Sivazlian, 1989; Wang & Ke, 2003; Wang et al., 2007; Hajeeh, 2011). Jain and Gupta (2014) suggested an optimal policy for the maintainability of a repairable system by including the realistic features of imperfect fault coverage and multiple vacations.
When faults are found in a machining component, it can be replaced by an available standby component, so that the system can operate in spite of the unexpected failure and the faulty component can be sent for repair. The switching failure that may occur while a standby machine moves to the operational mode to replace a failed operating machine cannot be ignored. The switching failure factor has been considered by a few researchers while developing reliability models for machine repair problems (cf. Kumar & Agarwal, 1980; Lewis, 1996; Wang et al., 2006; Wang et al., 2007; Wang & Chen, 2009). Jain and Rani (2013) studied the performance of a multi-server, multi-component machining system with switching and common cause failures. By constructing the governing differential-difference equations for the redundant $(m, M)$ machining system with switching failure of warm machines, Jain et al. (2014) computed performance measures via the matrix method and successive over-relaxation. Shekhar et al. (2014) evaluated the MTTF and availability of a redundant machine repair problem with switching failure and reboot delay. Kuo and Ke (2016) computed the steady-state availability of a repairable system with standby switching failure. Ke et al. (2016) determined an optimal number of standby machines through the Probabilistic Global Search Lausanne (PGSL) method in order to minimize the cost of a machine repair system with standby switching failure.
In the classical machine repair problem with reneging, it is assumed that the caretaker of the failed machines reneges from the repair facility according to an exponential distribution due to long waiting in the queue (cf. Ke & Wang, 2002; Choudhury & Medhi, 2011). This paper considers probabilistic reneging in a geometric fashion for the analysis of the MRP with imperfect switching and standby support. Dimou and Economou (2013) derived explicit expressions and a computational scheme for the single server queue with catastrophes occurring in Poisson fashion and geometric reneging. Yang et al. (2015) determined the optimal threshold policy for the reliability measures of a multi-component machining system with standby support and an unreliable server, where failed machines may renege sequentially in a geometric fashion. Recently, Shekhar et al. (2016) extended the concept of geometric reneging queues to the machine repair problem with $N$-policy.

Model Description
A Markov model with geometric reneging is developed for a multi-component machining system with the provision of standbys and a multi-repairman facility. For the mathematical formulation of the machine repair problem, the following notations and assumptions are made:
• The machining system consists of $M$ identical operating machines working simultaneously.
• To cope with unexpected failures of the machines, there is a provision of $S$ warm standby machines.
• In normal mode, the system requires $M$ operating machines, but it can also operate in short mode if the system has $m\ (< M)$ operating machines; the system fails completely if fewer than $m$ machines are in the operative state. Also denote $L = M + S - m + 1$.
• The operating and standby machines are subject to failure and have statistically independent, exponentially distributed lifetimes, with rate $\lambda$ for an operating machine and a separate, smaller rate for a warm standby machine. When all the standby machines are used and the system operates in short mode, the operating machines fail with the degraded rate $\lambda_d$ due to overload on the remaining operating machines. Thus, the state-dependent failure rate of the machines is given by
• When an operating machine fails, it is replaced instantaneously by an available warm standby machine if the switchover is perfect. The switched warm standby machine has the same failure characteristics as an operating machine.
• If the switchover is imperfect, the other available standby machines try to switch one by one in a geometric fashion until a perfect switchover takes place or the available standby machines are exhausted. The switching failure probability is $q$.
• On failure, an operating or standby machine is immediately sent for repair to the service station, where the repair is rendered by one of the $R$ repairmen available. The time to repair a failed machine is exponentially distributed with mean rate $\mu$.
• The repairmen serve the failed machines following the first-in, first-out (FIFO) rule.
• Once a failed machine is repaired, it is as good as new and instantly resumes either operating or standby status, depending on whether the system is running in short or normal mode.
• The caretaker of failed machines may decide to renege from the system sequentially, following the geometric distribution with a probabilistic parameter. The time-to-renege also follows an exponential distribution with parameter $r$.
• The stochastic processes, namely the failure and repair of the operating/standby machines, are mutually independent.
For the mathematical representation of the present model at an instant $t$, the states of the system are denoted by the number of failed machines in the system at time $t$, i.e. $N(t)$. Now, the state probabilities are defined by $P_n(t) = \Pr\{N(t) = n\}$, $n = 0, 1, \ldots, L$.

Fig. 1. Transient State transition rate diagram
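The geometric switching rule in the assumptions above can be illustrated with a minimal sketch. It assumes that each switchover attempt fails independently with probability q and that k standby machines are available when an operating machine fails; the function name and the example values are hypothetical and are not taken from the paper.

```python
def switching_outcomes(k, q):
    """Outcome probabilities of sequential switchover attempts.

    Assumes independent attempts, each failing with probability q.
    Returns (probs, p_exhausted), where probs[j] is the probability that
    exactly j attempts fail before a successful switchover (j = 0..k-1)
    and p_exhausted is the probability that all k standbys fail to switch.
    """
    if k < 0 or not 0.0 <= q <= 1.0:
        raise ValueError("need k >= 0 and 0 <= q <= 1")
    probs = [(q ** j) * (1.0 - q) for j in range(k)]
    p_exhausted = q ** k
    return probs, p_exhausted


# Hypothetical example: three available standbys, switching failure probability 0.1.
probs, p_exhausted = switching_outcomes(k=3, q=0.1)
```

Under these assumptions the outcome probabilities sum to one, and the probability of exhausting all standbys shrinks geometrically with the number of standbys available.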
By using the quasi-birth-death process and the underlying notations and assumptions, we construct the Chapman-Kolmogorov equations governing the model as follows:
3.1 System states when the number of failed machines is less than or equal to the number of repairmen $(0 \le n \le R)$: In this situation, the service rate increases continuously with the number of failed machines, since one repairman can be allotted to each failed machine. Thus, for $n$ failed machines, the service rate is $n\mu$, where $\mu$ is the repair rate of one repairman. Now, following the flow balance criterion, we have
3.2 System states when the number of failed machines is more than the number of repairmen but less than or equal to the number of standby machines $(R < n \le S)$: In this case, the service rate is fixed at $R\mu$, as the number of failed machines exceeds the number of repairmen. Hence, we frame the governing equations as follows:
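Separately from the governing equations themselves, the piecewise service rate described in cases 3.1 and 3.2 can be illustrated with a minimal sketch. The symbol mu for the repair rate of a single repairman, the function name and the example values are assumptions made for illustration; departures due to reneging are deliberately left out.

```python
def service_rate(n, R, mu):
    """Total repair rate with n failed machines and R repairmen.

    Each failed machine gets its own repairman while n <= R (rate n * mu);
    beyond that all R repairmen are busy (rate R * mu).
    """
    if n < 0 or R < 0 or mu < 0:
        raise ValueError("n, R and mu must be non-negative")
    return min(n, R) * mu


# Hypothetical example: three repairmen, each repairing at rate 2.0 per unit time.
rates = [service_rate(n, R=3, mu=2.0) for n in range(6)]
```

The list grows linearly up to n = R and stays constant afterwards, which is the behaviour stated in the two cases above.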

Computation of Probabilities
For many complex systems, it is difficult to obtain the analytical solution of the set of differential equations; the same is the case with the present model. In the literature, many researchers have used the Runge-Kutta method to obtain the transient probabilities of system states. For computational purposes, the Runge-Kutta method of fourth order is used here to compute the transient state probabilities.
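As an illustration of the numerical scheme, the following is a minimal fixed-step sketch of the classical fourth-order Runge-Kutta method applied to a linear system dP/dt = A P. It is not the authors' program: MATLAB's ode45 is an adaptive routine based on the Dormand-Prince (4,5) pair, whereas this sketch uses a constant step size. The two-state up/down chain, its rates and all names are hypothetical and serve only to check the method against a known closed form.

```python
import math


def rk4_step(A, p, h):
    """One classical RK4 step for dp/dt = A p (A as a list of rows)."""
    size = len(p)

    def deriv(v):
        return [sum(A[i][j] * v[j] for j in range(size)) for i in range(size)]

    k1 = deriv(p)
    k2 = deriv([p[i] + 0.5 * h * k1[i] for i in range(size)])
    k3 = deriv([p[i] + 0.5 * h * k2[i] for i in range(size)])
    k4 = deriv([p[i] + h * k3[i] for i in range(size)])
    return [p[i] + h / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
            for i in range(size)]


def transient_probabilities(A, p0, t_end, steps):
    """Integrate from t = 0 to t_end with a fixed number of RK4 steps."""
    h = t_end / steps
    p = list(p0)
    for _ in range(steps):
        p = rk4_step(A, p, h)
    return p


# Hypothetical two-state chain: state 0 = up, state 1 = down.
fail, repair = 0.5, 2.0
A = [[-fail, repair],
     [fail, -repair]]          # each column sums to zero
p_t = transient_probabilities(A, p0=[1.0, 0.0], t_end=1.0, steps=100)

# Closed form for this toy chain, used only as a check.
exact_p0 = repair / (fail + repair) + fail / (fail + repair) * math.exp(-(fail + repair) * 1.0)
```

Because the columns of A sum to zero, the total probability is preserved at every step up to rounding error.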
Initially, all machines are in good states, so that the initial condition of the system is given by $P_0(0) = 1;\ P_n(0) = 0,\ n = 1, 2, \ldots, L.$
We represent the system of equations in matrix form. To compute the transient state probabilities using the Runge-Kutta method of fourth order, we develop a MATLAB program using the routine ode45, which proceeds iteratively. As $t \to \infty$, the system attains its steady state, where the vector denoting the state probabilities is $[P_0, P_1, \ldots, P_L]$ and $\mathbf{0}$ is a null vector of size $L + 1$. For computing the steady-state probabilities, we employ the Successive Over-Relaxation (SOR) matrix method with the over-relaxation parameter value 1.25.
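One common way to apply successive over-relaxation to a steady-state problem of the form A P = 0 is to sweep through the balance equations and renormalize after each sweep. The sketch below follows that approach with the relaxation parameter 1.25 mentioned above; whether the authors organized their computation in this way is an assumption. The three-state birth-death chain and all names are hypothetical.

```python
def sor_steady_state(A, omega=1.25, tol=1e-12, max_sweeps=10000):
    """Solve A p = 0 with sum(p) = 1 by SOR sweeps plus renormalization.

    A[i][j] (i != j) is the transition rate from state j to state i, and
    A[j][j] is minus the total outflow rate of state j.
    """
    size = len(A)
    p = [1.0 / size] * size
    for _ in range(max_sweeps):
        old = list(p)
        for i in range(size):
            inflow = sum(A[i][j] * p[j] for j in range(size) if j != i)
            gauss_seidel = inflow / (-A[i][i])
            # Over-relaxed update: blend the old value with the Gauss-Seidel value.
            p[i] = (1.0 - omega) * p[i] + omega * gauss_seidel
        total = sum(p)
        p = [x / total for x in p]
        if max(abs(a - b) for a, b in zip(p, old)) < tol:
            break
    return p


# Hypothetical chain: failures at rate 1 (0->1, 1->2), repairs at rate 2 (1->0, 2->1).
A = [[-1.0, 2.0, 0.0],
     [1.0, -3.0, 2.0],
     [0.0, 1.0, -2.0]]
p_steady = sor_steady_state(A)
```

For this toy chain the exact stationary vector is (4/7, 2/7, 1/7), which gives a simple check of the iteration. Convergence of SOR is not guaranteed for every generator and relaxation parameter, so the sweep limit acts as a safeguard.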

Performance Measures
The performance indices of a machining system can be used not only to judge the efficiency and behavior of the existing system but also for its future design and development. In this section, the transient state probabilities obtained in the previous section are used to develop some performance indices as follows:
• The expected number of failed machines in the system at time $t$
• The throughput of the system at time $t$
• Mean number of standby machines in the system at time $t$
• Mean number of operating machines in the system at time $t$
• Expected number of idle repairmen in the system at time $t$
• Effective carrying load of failed machines in the system at time $t$
• Expected waiting time of the failed operating machines in the system at time $t$
• The expected delay time in service by the repairman at time $t$
• Effective switching failure rate of standby machines at time $t$
• Effective reneging rate of failed machines at time $t$
• Failure frequency of the system at time $t$
• Machine availability in the system at time $t$
• Reliability of the system
To have a quantitative idea about the system performance, we formulate a cost function using various cost elements associated with the performance measures. Now we define the specific cost elements as follows:

Numerical Results and Sensitivity Analysis
The previous section provides several performance descriptors in terms of the expected values of random state variables, obtained from the state probabilities, along with reliability characteristics and the expected total cost. In this section, we display the numerical results in tables and figures to validate the formulae derived and to give a quick insight into the optimal design of the machining system.
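As a reminder of how such expected values follow from the state probabilities, here is a minimal sketch. It assumes that with n failed machines there are S - n standbys left while n <= S, and M + S - n operating machines once the standbys are used up; these expressions, the function name and the probability vector are illustrative assumptions, not the paper's formulae or results.

```python
def performance_measures(p, M, S, R):
    """Expectation-type measures from p[n] = P(n machines have failed)."""
    states = list(enumerate(p))
    return {
        "failed": sum(n * pn for n, pn in states),
        "standby": sum(max(S - n, 0) * pn for n, pn in states),
        "operating": sum(min(M, M + S - n) * pn for n, pn in states),
        "idle_repairmen": sum(max(R - n, 0) * pn for n, pn in states),
    }


# Hypothetical setting: M = 3, S = 2, m = 2, so L = M + S - m + 1 = 4 and n = 0..4.
p = [0.40, 0.30, 0.15, 0.10, 0.05]
measures = performance_measures(p, M=3, S=2, R=2)
```

A quick consistency check is that the expected numbers of failed, standby and operating machines add up to M + S.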
For computation purposes, the default values of the different parameters are set as follows: 15; Apparent results can be observed from Fig. 2, in which the reliability of the system decreases significantly with an increase in the failure rate of operating machines, the failure rate of standby machines and the switching failure probability. The system's reliability can be enhanced with a better service facility, which is clear from Fig. 2(iii). It is also observed that the reliability of the system is an increasing function of the reneging rate $r$ and the reneging probability. This indicates that when failed machines renege from the system to get repaired at an external service facility, at the expense of the system cost, the reliability of the system improves.
In Fig. 3, the sensitivity of the reliability of the system $R_Y(t)$ with respect to various parameters is exhibited using the classical theory of calculus for default values of parameters fixed as 15; Here, the relative sensitivity of the reliability measures the percentage change in $R_Y(t)$ with respect to the percentage change in a system descriptor. The positive or negative sign of the relative sensitivity shows an increment or decrement in the value of $R_Y(t)$ when the system descriptor increases. Its magnitude reveals the intensity of the sensitivity to the relevant system parameter. Fig. 3(i) reveals a trend similar to that noticed in Fig. 2: the reliability of the system decreases with increasing failure rates of the operating and standby machines and increasing $q$, whereas it increases with the repair rate, the reneging rate $r$ and the reneging probability. It is also clearly seen that the reliability of the system is not equally sensitive to all of the system descriptors. From this observation, the system user can infer that preventive maintenance measures should be taken regularly to cope with the failure of the operating machines and the switching failure of standby machines.
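The two notions used above, sensitivity (the derivative of a measure with respect to a parameter) and relative sensitivity (percentage change per percentage change), can be approximated numerically. The sketch below uses a central finite difference, which is a substitute for the analytical differentiation referred to in the text; the toy reliability function exp(-lam * t) and all names are hypothetical.

```python
import math


def sensitivity(f, theta, rel_step=1e-6):
    """Central-difference estimate of df/dtheta at theta."""
    h = rel_step * max(abs(theta), 1.0)
    return (f(theta + h) - f(theta - h)) / (2.0 * h)


def relative_sensitivity(f, theta, rel_step=1e-6):
    """Estimate of (theta / f) * df/dtheta: % change in f per % change in theta."""
    return sensitivity(f, theta, rel_step) * theta / f(theta)


def toy_reliability(lam, t=10.0):
    """Hypothetical single-unit reliability, used only to exercise the functions."""
    return math.exp(-lam * t)


s = sensitivity(toy_reliability, 0.05)
rs = relative_sensitivity(toy_reliability, 0.05)
```

For the toy function the relative sensitivity equals -lam * t, so a negative sign indicates that reliability falls as the failure rate grows, in line with the sign convention described above.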
The variability of an important reliability characteristic, namely the mean time to failure (MTTF), is depicted in Fig. 4 and Fig. 5 with respect to the system parameters for different increments in the default values chosen for Figs. 2 and 3.
Fig. 4 reveals that the MTTF of the system increases with the number of repairmen, at the expense of the expected total cost. It also shows that MTTF decreases significantly with increasing failure rates of the operating and standby machines and increasing $q$. The system MTTF can be enhanced by increasing the repair rate, the reneging rate $r$ and the reneging probability. A similar inference can also be drawn from Fig. 5

It is noticed that MTTF can be made large by providing an appropriate number of standby machines in the system. Table 1 depicts the order in which MTTF is sensitive to the parameters, among them $r$ and $q$. The percentage changes in the performance indices, including SR, RR and FF, are reported for 5%, 10% and 15% changes in the values of the system parameters. This table provides quick information on the expected values of various state variables, viz. the number of failed machines, the number of operating machines, the number of standby machines, the waiting time for repair, etc. A negative sign indicates that a performance index decreases with an increase in the value of the system parameter.
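For a Markov model with an absorbing failure state, the MTTF can be obtained as the expected time to absorption, which solves a linear system built from the transition rates among the working states. The sketch below applies this standard result to a hypothetical two-unit system with one repairman; the rates, state labels and function names are assumptions for illustration and do not reproduce the model or the values of this paper.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mean_time_to_failure(q_transient):
    """Expected absorption times tau solving Q_TT tau = -1 (row-generator form)."""
    return solve_linear(q_transient, [-1.0] * len(q_transient))


# Hypothetical system: state 0 = both units up, state 1 = one unit failed,
# absorbing state = both failed. Failure rate lam per unit, repair rate mu.
lam, mu = 0.1, 1.0
q_tt = [[-2.0 * lam, 2.0 * lam],
        [mu, -(lam + mu)]]
mttf_from_all_up = mean_time_to_failure(q_tt)[0]
```

For this toy system the closed form is (3 lam + mu) / (2 lam^2), which makes it easy to see that a larger repair rate lengthens the MTTF while a larger failure rate shortens it.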

Table 2
Various performance measures for different input parameters

In Fig. 6, the variability of the expected total cost TC with respect to the repair rate for different values of $S$, $R$ and $m$ is shown. Figs. 6(i)-(iii) show that as we increase the service rate, the number of repairmen, the number of standby machines and the threshold on the minimum number of machines required, the expected total cost is likely to increase. Depending on the usability of the system, the analyst can fix the budget for the smooth functioning of the system in advance.

Conclusion
This work complements earlier studies on the machine repair problem with standby machines, which considered either switching failure or reneging separately. The simultaneous inclusion of the realistic features of imperfect switching and geometric reneging extends the applicability of the developed model to a broader range of performance analyses of redundant repairable machining systems operating under the care of a repair facility. Besides the expected total cost, several queueing and reliability characteristics have been established in terms of steady-state and transient-state probabilities. The sensitivity and relative sensitivity of the reliability characteristics discussed may provide valuable insight to industrial engineers and managers for the upgrading of existing systems. This study can be further extended to models with unreliable repairmen and/or server vacations, etc. Another interesting direction for future research is the determination of the optimal number of standby machines/repairmen, the threshold level at which to turn on the repair facility, or the optimal service rate.
3.3 System states when the number of failed machines is more than the number of standby machines $(S < n \le L)$: In this situation, the service rate is fixed at $R\mu$, with $\mu$ the repair rate of one repairman, but the failure rate of working machines becomes the degraded rate $\lambda_d$. Thus, the governing equations are given by
the square coefficient matrix of order $L + 1$

$C_f$: fixed cost per unit time of a failed machine waiting in the system
$C_O$: fixed cost per unit time of an available operating machine in the system
$C_S$: fixed cost per unit time of an available standby machine in the system
$C_I$: fixed cost per unit time of a repairman being idle
$C_{SW}$: fixed switching failure cost per unit time
$C_R$: fixed cost per unit time associated with reneging of a failed machine
$C_m$: fixed cost per unit time of providing repair to the failed machines by the repairman with rate $\mu$
$C_L$: fixed cost per unit time incurred on system capacity
Thus, using the above cost elements, the total cost $TC(t)$ at time $t$ for the transient state can be framed as:
Fig. 2 presents the graph of the variation of the reliability of the system $R_Y(t)$ with respect to time $t$ in the interval [0, 100] unit time for various system parameters. Figs. 2(i)-(vi) show the decrement of $R_Y(t)$ as time grows.

Fig. 3(i) and (ii) present the sensitivity of the reliability and the relative sensitivity of the reliability, respectively. From the knowledge of calculus, we have

Fig. 6. Total cost for varying different parameters
and Table 1, in which the variability of MTTF with system parameters for different numbers of warm standby machines in the system and the sensitivity and relative sensitivity of MTTF are displayed. It can be noted that

Table 1
Sensitivity analysis and relative sensitivity analysis of MTTF of the system with respect to various parameters
Table 2 comprises the percentage change in the value of steady-state performance measures for the default parameters taken the same as in Fig. 2. Table 2 summarizes the percentage change in the performance indices