System Reliability Growth Analysis during Warranty

This paper presents a system reliability growth analysis using actual field failure data. The primary objective of system reliability growth is to improve system reliability performance during the system reliability demonstration, in order to achieve the predicted or contractually required system reliability commitment. An effective reliability growth model can be used to predict when the reliability target can be achieved based on previous reliability performance. In this paper, system reliability growth analysis is illustrated using the Duane and AMSAA reliability growth models to assess their applicability and to aid in choosing between them. The Duane model is a better choice for failure terminated reliability growth, while the AMSAA model is a better choice for time terminated reliability growth. The Duane and AMSAA models are compared by conducting statistical analysis on the observed field failures.

Keywords: Reliability growth modeling, Duane model, AMSAA model, MTBF, FRACAS.


Introduction
System reliability growth is the positive improvement in reliability performance over a period of time due to the implementation of corrective actions to system design, operation, maintenance procedures, or the associated manufacturing process. Refer to reference MIL-HDBK-189C (2011).
The primary objective of system reliability growth is to improve the system reliability performance in order to meet or exceed the predicted or contractual reliability commitment. A corrective action is an improvement to the system design, operation, maintenance procedures, or the manufacturing process for the purpose of improving the system's reliability.
System reliability growth relies on discovering failures which affect reliability performance during the system reliability demonstration period. For most railway and mass transit turnkey systems, the system reliability demonstration period usually covers the testing and commissioning period and the Defect Liability Period (DLP). The DLP is also called the warranty period. It usually commences when the first train is handed over to the client and accepted. The DLP duration depends on the fleet size, so it is not uncommon for the DLP to last for several years. Systematic failures are the main focus of the reliability growth process during the reliability demonstration period, because random failures are intrinsic to the equipment and are difficult to eliminate completely. During the testing and commissioning period, the logged operating time or travelled distance is sometimes too low to be of any use. In that case, the number of failure occurrences should be relied on rather than Mean Time Between Failure (MTBF) or Mean Distance Between Failure (MDBF) values.
System reliability performance is monitored and tracked through an effective reliability growth demonstration program and Failure Reporting Analysis and Corrective Action System (FRACAS) processes and procedures. The underlying failure mechanisms can be investigated through Root Cause Analysis (RCA) to implement subsequent corrective actions and fixes.
Looking back at the history of mathematical modeling for reliability growth, in 1964 J. T. Duane developed his famous Duane plot, which has become one of the most commonly applied reliability growth models across diverse industries. In 1974, L. H. Crow observed that Duane's methodology could be formulated in terms of a Weibull process and developed the Crow-AMSAA model. The applications of the Duane plot and the AMSAA model in reliability growth testing are introduced in MIL-HDBK-781A (1996) and the System Reliability Toolkit (2005). Reliability growth modeling and planning have been studied by many scholars in recent years. Krasich (2016) introduces methods for planning reliability tests: one method is based upon purely mathematical principles, and two additional methods are based on Physics of Failure (PoF) principles. Jin and Li (2016) present a lifecycle reliability growth model in which the reliability growth efforts are seamlessly integrated into the design, manufacturing and post-installation of a new product. Juskowiak et al. (2013) introduce a reliability growth model for early design stages. Kar (2013) studies reliability growth analysis for continuous process plants.

Reliability Growth Program and FRACAS
Reliability growth management is the management process associated with planning for reliability achievement as a function of time and other resources and controlling the ongoing rate of achievement by reallocation of resources based on comparisons between planned and achieved reliability values.
Formulating an effective reliability growth program requires that a rigorous failure root cause analysis and corrective action process be developed. Simply replacing failed components during the reliability growth demonstration, without removing the underlying failure modes, will not result in reliability growth. Understanding the root cause of detectable failure modes and devising effective mitigation strategies is what makes reliability growth achievable. The development of an effective reliability growth program is introduced in Kaminski (2013).
FRACAS is a closed loop process which characterizes and isolates failure modes to a particular root cause. FRACAS processes and procedures ensure failure modes are investigated and resolved with corrective actions to eliminate the failure modes that affect the system's reliability and to reduce the failure mode likelihood, in order to demonstrate reliability growth.

Reliability Growth and Duane Model
In this section we utilize actual field failure data to analyze the system reliability growth.
The relevant failures in Table 1 have been observed in one of the delivered systems, which has been in service for a twelve-month period. Table 1 shows the system monthly operating time (hours) and the monthly observed failures. Cumulative system operating hours, cumulative observed failures and cumulative failure rate are calculated from the monthly operating time and observed failures, in order to carry out a reliability growth analysis. We plotted the cumulative failure rate versus cumulative system operating hours in Figure 1. The x-axis represents cumulative system operating hours while the y-axis represents cumulative failure rate. The plotted curve shows that the failure rate decreases with time, which indicates that the system reliability improves during the demonstration. The transformed curve in Figure 2 indicates that the cumulative failure rate appears linear with respect to the cumulative system operating time on log-log axes. This postulate is known as the Duane model. In 1964, James T. Duane noticed that a system's cumulative failure rate appears linear with respect to cumulative operating time when plotted on log-log axes. The Duane model belongs to the nonhomogeneous Poisson process models, and it is sometimes called the power law process. Refer to Duane (1964).
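The cumulative quantities described above can be sketched in a few lines of Python. The monthly figures below are hypothetical placeholders, not the values of Table 1, and the function name is chosen only for illustration.

```python
from itertools import accumulate

# Hypothetical monthly data (NOT the paper's Table 1).
monthly_hours = [500.0, 520.0, 510.0, 530.0]
monthly_failures = [6, 4, 3, 2]

def cumulative_failure_rate(hours, failures):
    """Return cumulative hours, cumulative failures and cumulative failure rate."""
    cum_hours = list(accumulate(hours))
    cum_failures = list(accumulate(failures))
    # Cumulative failure rate: failures observed so far divided by hours so far.
    cum_rate = [n / t for n, t in zip(cum_failures, cum_hours)]
    return cum_hours, cum_failures, cum_rate

cum_hours, cum_failures, cum_rate = cumulative_failure_rate(
    monthly_hours, monthly_failures
)
for t, n, c in zip(cum_hours, cum_failures, cum_rate):
    print(f"{t:8.0f} h  {n:3d} failures  C(t) = {c:.5f} per hour")
```

With these placeholder numbers the cumulative failure rate falls from month to month, which is the pattern the text associates with reliability growth.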
The dotted trendline in Figure 2 is a straight line and can be expressed as equation (1).
$$\ln C(t) = -\alpha \ln t + \ln \lambda_0 \quad (1)$$
Where: C(t) is the cumulative failure rate, λ0 is a constant equal to the system cumulative failure rate at time t = 1, and α is the magnitude of the (negative) slope, which is defined as the reliability growth rate. The curve can also be divided into two or more segments, each having its own reliability growth rate α.
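As a minimal sketch of how the straight line ln C(t) = −α ln t + ln λ0 could be fitted, the following Python code applies ordinary least squares on the log-log data. The function names and the sample data are illustrative assumptions, not the paper's own implementation or data.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_duane(cum_hours, cum_failures):
    """Estimate (alpha, lambda0) from ln C(t) = -alpha * ln t + ln lambda0."""
    xs = [math.log(t) for t in cum_hours]
    ys = [math.log(n / t) for n, t in zip(cum_failures, cum_hours)]
    slope, intercept = fit_line(xs, ys)
    # The growth rate is the negative of the fitted slope.
    return -slope, math.exp(intercept)

# Hypothetical cumulative data (NOT the paper's Table 1).
cum_hours = [500.0, 1020.0, 1530.0, 2060.0, 2600.0]
cum_failures = [6, 10, 13, 15, 17]
alpha, lambda0 = fit_duane(cum_hours, cum_failures)
print(f"growth rate alpha = {alpha:.3f}, lambda0 = {lambda0:.4f}")
```

The intercept is exponentiated because the fit is carried out on ln C(t), so the fitted intercept estimates ln λ0 rather than λ0 itself.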
The reliability growth rate α is the rate at which a system's reliability improves as a result of corrective actions or fixes. It summarizes the ability of an organization or engineering team to improve a system's reliability. It is partly based on the general effectiveness of corrective actions or fixes.
The reliability growth rate α in the Duane model is typically below 0.6, although in some instances in the automotive and aerospace industries measured values of α have been greater than 0.6. Ranges of α can be interpreted as follows:
[...]: [...] action was taken on important failure modes only.
0.3 ~ 0.4: An above-average reliability improvement program exists. There is a well-managed plan and action on important failure modes.
0.4 ~ 0.6: A top-priority program is in effect to eliminate failure modes. Immediate attention and corrective action prevail.
The cumulative failure rate C(t) at time t is the cumulative number of failures that have occurred up to time t, N(t), divided by t, and can be expressed as equation (2).
$$C(t) = \frac{N(t)}{t} \quad (2)$$
Substituting equation (2) into equation (1) yields:
$$N(t) = \lambda_0 t^{1-\alpha}$$
The failure rate λ(t) at time t is the derivative of the cumulative number of observed failures with respect to time:
$$\lambda(t) = \frac{dN(t)}{dt} = (1-\alpha)\lambda_0 t^{-\alpha}$$
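To make the derivation concrete: combining C(t) = N(t)/t with equation (1) gives N(t) = λ0 t^(1−α), and differentiating gives λ(t) = (1−α) λ0 t^(−α). The sketch below evaluates these expressions. Taking the instantaneous MTBF as the reciprocal of λ(t) is a common convention assumed here, and the function names and parameter values are illustrative only.

```python
def duane_cumulative_failures(t, alpha, lambda0):
    """N(t) = lambda0 * t**(1 - alpha), from C(t) = N(t)/t and equation (1)."""
    return lambda0 * t ** (1.0 - alpha)

def duane_failure_rate(t, alpha, lambda0):
    """lambda(t) = dN/dt = (1 - alpha) * lambda0 * t**(-alpha)."""
    return (1.0 - alpha) * lambda0 * t ** (-alpha)

def duane_instantaneous_mtbf(t, alpha, lambda0):
    """Assumed convention: instantaneous MTBF is the reciprocal of lambda(t)."""
    return 1.0 / duane_failure_rate(t, alpha, lambda0)

# Hypothetical parameter values, for illustration only.
alpha, lambda0 = 0.4, 0.5
for t in (100.0, 1000.0, 5000.0):
    rate = duane_failure_rate(t, alpha, lambda0)
    mtbf = duane_instantaneous_mtbf(t, alpha, lambda0)
    print(f"t = {t:6.0f} h  lambda(t) = {rate:.5f} per hour  MTBF = {mtbf:.1f} h")
```

Under these expressions the instantaneous failure rate is always (1−α) times the cumulative failure rate at the same time, so for 0 < α < 1 the instantaneous figure is the lower, more optimistic of the two.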

Crow-AMSAA Model
The other commonly used reliability growth model is the Crow-AMSAA model. In reference Crow et al (1974), Larry H. Crow noted that the Duane model could be stochastically represented as a Weibull process, allowing for statistical procedures to be used in the application of this model in reliability growth. This statistical extension became what is known as the Crow-AMSAA model. In reference Broemm et al. (2000), the AMSAA Reliability Growth Guide is introduced.
The Army Materiel Systems Analysis Activity (AMSAA) model is a Weibull process used to model reliability growth. The AMSAA model assumes that the observed system failures follow a non-homogeneous Poisson process, which means the failure rate or intensity function may change with respect to time. Ellner and Trapnell (1990) present an AMSAA reliability growth data study. Ellner and Hall (2006) introduce an approach to reliability growth planning based on surfaced failure modes and their correction using the AMSAA methodology. In the AMSAA model, the cumulative number of failures N(t) is expressed as:
$$N(t) = \alpha t^{\beta}$$
Where: α is the scale parameter and β is the shape parameter. Note that α here denotes the AMSAA scale parameter, not the Duane reliability growth rate.
The N(t) equation can be converted to a linear equation by taking logarithms:
$$\ln N(t) = \ln \alpha + \beta \ln t$$
The plot has ln [N(t)] on the y-axis and ln t on the x-axis. The slope of the straight line through the data is the estimated shape parameter β. The scale parameter α is determined from the y-intercept, which equals ln α.
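The graphical estimation described above, with ln N(t) plotted against ln t, can be sketched as a least-squares fit, assuming the power-law form N(t) = α t^β. The function names and sample data are hypothetical and stand in for Table 1.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_amsaa_loglog(cum_hours, cum_failures):
    """Estimate (scale, shape) of N(t) = scale * t**shape on log-log axes."""
    xs = [math.log(t) for t in cum_hours]
    ys = [math.log(n) for n in cum_failures]
    slope, intercept = fit_line(xs, ys)
    # Slope is the shape parameter beta; exp(intercept) is the scale parameter.
    return math.exp(intercept), slope

# Hypothetical cumulative data (NOT the paper's Table 1).
cum_hours = [500.0, 1020.0, 1530.0, 2060.0, 2600.0]
cum_failures = [6, 10, 13, 15, 17]
scale, shape = fit_amsaa_loglog(cum_hours, cum_failures)
print(f"scale = {scale:.4f}, shape beta = {shape:.3f}")
```

A fitted shape parameter below 1 means failures accumulate more slowly than operating time, which corresponds to reliability growth.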

Comparisons of AMSAA Model versus Duane Model
In this section, comparisons of the AMSAA model versus the Duane model are carried out using the relevant field failures shown in Table 1.
We study the reliability growth analysis using the AMSAA model first. In Table 2, the last two columns show the logarithms of cumulative system operating hours and logarithms of cumulative observed failures. The plot in Figure 3 shows ln (Cumulative observed failures) on the y-axis and ln (Cumulative system operating hours) on the x-axis.
Equation (10) indicates that the MTBF, as a function of time, follows the power law.
Now we continue the reliability growth analysis using the Duane model. In Table 3, the last two columns show the logarithms of cumulative system operating hours and cumulative failure rate. The plot in Figure 4 shows ln (Cumulative failure rate) on the y-axis and ln (Cumulative system operating hours) on the x-axis.
The comparison of equation (10) versus equation (11) indicates that the reliability growth function obtained with the AMSAA model is similar to the one obtained with the Duane model; they both follow the power law. R-squared is a statistical measure of how close the data are to the fitted regression line: the higher the R-squared, the better the model fits the data. The AMSAA trendline in Figure 3, with an R-squared of 0.9711, lies closer to the reliability growth curve than the Duane trendline in Figure 4, with an R-squared of 0.9452, which implies that the AMSAA model is a better choice as a reliability growth model for the analyzed field failure data. This supports the view that the AMSAA model is a better choice for time terminated reliability growth.
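The R-squared comparison can be reproduced in outline as follows. This is a sketch on hypothetical data, not a recomputation of the values reported for Figures 3 and 4, and the helper names are assumptions.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual_sum_of_squares(xs, ys):
    """Sum of squared vertical distances from the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

def r_squared(xs, ys):
    """R-squared = 1 - SS_res / SS_tot for a straight-line fit."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - residual_sum_of_squares(xs, ys) / ss_tot

# Hypothetical cumulative data (NOT the paper's Table 1).
cum_hours = [500.0, 1020.0, 1530.0, 2060.0, 2600.0]
cum_failures = [6, 10, 13, 15, 17]
ln_t = [math.log(t) for t in cum_hours]
ln_n = [math.log(n) for n in cum_failures]                        # AMSAA response
ln_c = [math.log(n / t) for n, t in zip(cum_failures, cum_hours)]  # Duane response

r2_amsaa = r_squared(ln_t, ln_n)
r2_duane = r_squared(ln_t, ln_c)
print(f"AMSAA R-squared = {r2_amsaa:.4f}, Duane R-squared = {r2_duane:.4f}")
```

Because ln C(t) = ln N(t) − ln t, the two straight-line fits leave the same residuals on a given data set, so their R-squared values differ only through the spread of the response variable being fitted. This may be worth keeping in mind when interpreting the comparison.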

Reliability Growth Prediction
A system reliability growth model describes how the system reliability improves over time during the reliability testing and demonstration process. As system failures that affect reliability are discovered and analyzed, the underlying faults causing these failures are repaired and fixed, so that the system reliability should improve as system testing and commissioning progress. Reliability growth modeling involves comparing measured reliability at a number of points in time with known functions that show possible changes in reliability. By matching observed reliability growth with one of these functions, it is possible to predict the reliability of the system at some future point in time, assuming that the reliability growth effort continues. Reliability growth models can therefore be used to anticipate when the reliability target can be achieved, in support of project planning. Note that re-using a growth rate α from one project on another assumes that the organizational capability remains the same across projects. The organizational capability depends on factors such as process strength and the maturity and stability of resources.
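As an illustration of such a prediction, the sketch below estimates the cumulative operating time at which a target MTBF would be reached under the Duane model. It assumes the instantaneous MTBF is the reciprocal of λ(t) = (1−α) λ0 t^(−α) and that the fitted growth rate continues to hold. The function name and all numerical values are hypothetical.

```python
def time_to_target_mtbf(target_mtbf, alpha, lambda0):
    """Solve t**alpha / ((1 - alpha) * lambda0) = target_mtbf for t."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if lambda0 <= 0.0 or target_mtbf <= 0.0:
        raise ValueError("lambda0 and target_mtbf must be positive")
    return ((1.0 - alpha) * lambda0 * target_mtbf) ** (1.0 / alpha)

# Hypothetical fitted parameters and contractual target.
alpha, lambda0 = 0.37, 0.12
target = 300.0  # target instantaneous MTBF in hours
t_needed = time_to_target_mtbf(target, alpha, lambda0)
print(f"Target MTBF of {target:.0f} h reached after about {t_needed:.0f} cumulative hours")
```

The result is an extrapolation of the fitted line, so it is only as reliable as the assumption that corrective actions keep being implemented at the same pace.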

Conclusion
In this paper, reliability growth analysis is studied using the actual field failure data of a system which is currently in service. The reliability growth functions are derived from the Duane model and the AMSAA model based on the same field failure data. The AMSAA model is a Weibull process and suits time terminated reliability growth. Using the AMSAA model, the plot of cumulative failures versus cumulative operating time follows a straight line on log-log axes. The shape parameter β is obtained from the slope and the scale parameter α from the y-intercept. The Duane model suits failure terminated reliability growth. Using the Duane model, the plot of cumulative failure rate versus cumulative operating time follows a straight line on log-log axes. The reliability growth rate α is obtained from the slope, and λ0, the cumulative failure rate at time t = 1, from the y-intercept. Finally, we observed that the system reliability growth trend in the Duane model is similar to that in the AMSAA model. Both the Duane model and the AMSAA model are applicable to time-dependent or distance-dependent reliability growth predictions.

Conflict of Interest
The authors confirm that the contents of this article involve no conflict of interest.