The impact of Weibull data and autocorrelation on the performance of the Shewhart and exponentially weighted moving average control charts

Article history: Received 02 February 2011; received in revised form 12 March 2011; accepted 14 March 2011; available online 15 March 2011.

Many real-world processes generate autocorrelated and/or Weibull data. In such cases, the independence and/or normality assumptions underlying the Shewhart and EWMA control charts are invalid. Although data transformations exist, such tools would not normally be understood or employed by naive practitioners. Thus, the question arises, "What are the effects on robustness whenever these charts are used in such applications?" Consequently, this paper examines and compares the performance of these two control charts when they are subjected to autocorrelated and/or Weibull data. A variety of conditions are investigated, related to the magnitudes of the process shift, the autocorrelation coefficient and the Weibull shape parameter. Results indicate that the EWMA chart outperforms the Shewhart chart in 62% of the cases, particularly those cases with low to moderate autocorrelation effects. The Shewhart chart outperforms the EWMA chart in 35% of the cases, particularly those cases with high autocorrelation and zero or high process shift effects.

© 2011 Growing Science Ltd. All rights reserved.


Introduction
Shewhart and exponentially weighted moving average (EWMA) control charts are commonly used to distinguish between random and non-random (assignable-cause) process variability. The underlying assumptions with these charts are that data are independent and identically distributed normal (iidn) about a central mean. "Independent" implies that data taken at different times are unrelated. "Identically distributed" implies the underlying distribution is the same at each time period. "Normal" implies the data follow a normal distribution. Clearly, any process that generates data according to the Weibull distribution and/or generates data that exhibit meandering or autocorrelated behavior will violate these iidn assumptions. However, many physical systems have been observed to generate data following the Weibull distribution (Bloch & Geitner, 1994). For instance, physical models related to the reliability and life cycle of product/machine failure times, bearing lives and environmental concepts such as wind speed are often Weibull distributed. Two parameters (shape and scale) provide the distribution with flexibility to model systems in which the number of events (e.g., failures) increases with time (e.g., product wear), decreases with time (e.g., infant mortality) or remains constant (e.g., failures due to external shocks to the system). Likewise, many real-world processes, ranging from machining to chemical to high technology, exhibit autocorrelated behavior, as cited in research by Vasilopoulos et al. (1978); Ermer et al. (1979); Wardell et al. (1992); Box et al. (1976); Alwan and Bissell (1988); Montgomery and Friedman (1989); Berthouex et al. (1978); MacGregor et al. (1990); and Harris et al. (1991).
Although data transformations exist to dampen the effect of certain violations, such tools would not normally be understood and/or employed by naïve practitioners. Thus, the question arises, "What are the effects on robustness whenever these charts are used in such applications?" Consequently, this paper will examine and compare the performance of these two control charts when subjected to autocorrelated and/or Weibull data. As discussed in Section 2, these two iidn violations have not been previously studied together. Specifically, data will be generated from an AR(1) process (i.e., a first-order autoregressive process) with a Weibull random error component. The experimental design will investigate various degrees of process shift, autocorrelation and Weibull shape values. Results will be plotted on Shewhart and EWMA control charts. Subsequently, the performance of these charts will be compared using average run length (ARL) as the metric. ARL represents the average number of samples that fall within the control limits before an out-of-control condition occurs. The goal of this research is to ascertain which chart is more robust within the various experimental cases investigated.
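As a point of reference for the ARL metric, when observations are independent the run length follows a geometric distribution, so the ARL is simply the reciprocal of the per-sample signal probability. The following is a minimal sketch under assumptions that are not taken from this paper: normally distributed data, symmetric k-sigma limits with k = 3, and illustrative function names.

```python
import math


def signal_probability_normal(k: float = 3.0) -> float:
    """P(|Z| > k) for a standard normal Z, i.e. the chance that one
    independent in-control point falls outside symmetric k-sigma limits."""
    return math.erfc(k / math.sqrt(2.0))


def iid_arl(p_signal: float) -> float:
    """Mean of a geometric run length with per-sample signal probability p."""
    return 1.0 / p_signal


if __name__ == "__main__":
    p = signal_probability_normal(3.0)  # assumed 3-sigma limits
    print(f"signal probability = {p:.6f}, in-control ARL = {iid_arl(p):.1f}")
```

This shortcut relies on independence between successive points. Once observations are autocorrelated, the run length is no longer geometric, and the ARL has to be obtained by other means such as simulation.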

Literature review
The literature is extensive in the recognition of nonrandom behavior in SPC applications, especially autocorrelation. Vasilopoulos et al. (1978); Ermer et al. (1979); and Wardell et al. (1992) observed autocorrelation in machining and forging operations. Box et al. (1976) cite it in the chemical processing industry. Alwan and Bissell (1988) cite it in routine clinical chemistry SPC measurements. Montgomery and Friedman (1989) found it to occur frequently in SPC data from computer-integrated manufacturing environments. Berthouex et al. (1978); MacGregor et al. (1990); and Harris et al. (1991) found it in continuous processes and high technology industries. Woodall et al. (1993) noted that positive autocorrelation is more common in manufacturing applications than negative autocorrelation. An example of negative autocorrelation is an operator over-compensating the process, which can easily be corrected.
Much research has been done analyzing various control charts in the presence of autocorrelation. English et al. (1991) analyzed the performance of the Shewhart control chart using the ARL as the metric. The charts were designed for two cases: case one, in which the autocorrelation was not known, and case two, in which it was known. AR(1) and AR(2) processes were used to generate the observations and to model the data. When the autocorrelation was small, the ARL values were similar in both cases, regardless of the shift in the mean. However, when the autocorrelation was large, case two resulted in significantly fewer false alarms, whereas in case one the ARLs were small (under 50) and fairly constant regardless of the shift in the mean. Alwan (1992) also researched the capability of the Shewhart control chart to detect assignable causes in the presence of autocorrelation. Control limits were fixed, and type I and type II errors were used to analyze chart performance. The study was restricted to a class of stationary ARMA models: AR(1), AR(2) and MA(1). Maragah et al. (1992) also considered the capability of the Shewhart control chart but analyzed just the AR(1) and MA(1) models. These two studies came to the same conclusion: even mild levels of autocorrelation can dramatically affect the Shewhart chart's ability to correctly separate assignable causes from random causes.

Wardell et al. (1992) investigated the performance of the Shewhart and EWMA control charts in the presence of autocorrelation. Data came from an ARMA(1,1) process and the control limits were modified accordingly. Michelson (1994) also reviewed adjusting the control limits on the Shewhart and EWMA control charts. However, these limits were adjusted according to data from an AR(1) process. ARLs were used to analyze the performance of both charts. It was found that negative autocorrelation values yielded high in-control ARLs for both types of charts, and that positive autocorrelation values yielded high in-control ARLs for the Shewhart chart but low in-control ARLs for the EWMA chart. Lu et al. (1999) compared the performance of EWMA charts based on the original observations with that of EWMA charts based on the residuals. The Shewhart chart, based on residuals, was used as a reference point. The fitted model was derived from an AR(1) process. The results indicated that, for fairly high autocorrelation, the time required to detect a shift is significantly longer than for the same shift in the independent (i.e., non-autocorrelated) case. Wieringa (1999) examined the case where autocorrelation is either ignored or unknown. The AR(1) model was used to generate the observations. The Shewhart control chart was first constructed with autocorrelation present. It was found that the control limits were too tight with positive autocorrelation, thus resulting in larger numbers of false alarms, and that the limits were too wide with negative autocorrelation, thus causing the chart to be insensitive to changes in the process mean. The same analysis was performed on the EWMA control chart. It was found that autocorrelation affected the EWMA chart to a greater extent than the Shewhart chart. The EWMA control chart was very insensitive to changes in the process mean when autocorrelation was negative. Furthermore, when autocorrelation was positive, the EWMA control chart produced slightly more false alarms than the Shewhart chart. Ramjee (2000) also analyzed the performance of Shewhart and EWMA control charts in the presence of correlated data. However, the data were from an ARFIMA model. The study showed that these charts do not perform well at detecting process shifts and, thus, a new type of control chart, the hyperbolic weighted moving average (HWMA) control chart, was proposed. English et al. (2000) also compared the effectiveness of Shewhart and EWMA control charts under autocorrelated data. This study analyzed the residuals when subjected to shifts in the mean and the autocorrelation. Schmid et al. (1997) proved that, in the presence of autocorrelation, the run length of EWMA control charts will always be greater than or equal to the run length of an independent process, provided all autocovariances are non-negative. Thus, the true ARL will be underestimated if one falsely assumes an independent process. Barr (1993) studied the Shewhart, EWMA and CUSUM control charts to determine their effectiveness for monitoring processes with data from AR(1), MA(1) and ARMA(1,1) processes. Control limits were adjusted and the ARL was used as the performance metric. Parameter settings were determined for both the EWMA and CUSUM control charts for a range of shift sizes and in-control ARLs. It was concluded that the process can be monitored effectively if correlation is recognized and the control limits are adjusted accordingly. Tseng et al. (1994) reviewed the performance of the Shewhart and EWMA charts based on an EWMA forecast and an AR(1) model. The study concluded that if an EWMA forecast is used, then the Shewhart control chart should be used since it is less sensitive to violations of the independence assumption.

Jarrett and Pan (2007) combined multivariate control charts for independent processes and univariate control charts for autocorrelated processes into the form of a vector autoregressive (VAR) control chart for multivariate autocorrelated processes. The effects of parameter shifts and the feasibility of VAR control charts are also discussed. Chen and Cheng (2007) proposed an economic cost model to determine optimal design parameters for an x-bar chart under Weibull shock models. Such design parameters include sample size, time between samples and number of standard deviations from the centerline. A sensitivity analysis showed that non-normality has a significant effect on the design parameters. Sensitivity to Weibull shape and process shift values was also examined. Erto et al. (2008) proposed a new Shewhart-style control chart for Weibull data based on a new DT (data technology) approach. Bayesian estimators are used to construct the chart and to allow it to be used with various types of data (statistical and non-statistical).

Various tests and transformations exist for non-normality and/or autocorrelation. For example, the Durbin-Watson test is a well-known test for first-order autocorrelation (Durbin & Watson, 1950 and 1951). Moreover, the Chi-Square Goodness of Fit, Kolmogorov-Smirnov (K-S) and Anderson-Darling (A-D) tests can detect non-normality (Law & Kelton, 1999). Once detected, various data transformations exist to dampen the effects of the violations. Examples include Robust Box-Cox transformations (Marazzi & Yohai, 2006) and the Hildreth-Lu Macro (Pindyck & Rubinfeld, 2000). However, many real-world practitioners ignore (intentionally or unintentionally) or "assume away" such complications when employing statistical tools due to their levels of statistical competence.
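To make the Durbin-Watson statistic mentioned above concrete, the sketch below computes it for a sequence of residuals. It is an illustration only: the function name and the toy residual sequences are assumptions, and the critical values needed for a formal test are not included.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals. Values near 2 suggest no
    first-order autocorrelation, values well below 2 suggest positive
    autocorrelation, and values well above 2 suggest negative autocorrelation."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


if __name__ == "__main__":
    slowly_varying = [1.0, 1.0, -1.0, -1.0]  # neighbours tend to agree
    alternating = [1.0, -1.0, 1.0, -1.0]     # neighbours always disagree
    print(durbin_watson(slowly_varying), durbin_watson(alternating))
```

In practice the residuals would come from a fitted model, or from deviations about the sample mean, rather than from hand-picked values.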
In summary, a number of studies have been conducted to compare Shewhart and EWMA control charts in the presence of autocorrelation. Various autoregressive models have been used. Ignoring autocorrelation, plotting residuals and adjusting control limits are the most frequently used approaches. However, violations of the normality assumption in light of Weibull data have not been investigated in conjunction with autocorrelation. As discussed in Section 1, many physical systems and processes follow the Weibull distribution and exhibit autocorrelation. Consequently, this paper will study both violations together using Shewhart and EWMA charts to assess their robustness under various cases in the typical situation in which a naive practitioner fails to explicitly remedy such violations.

Experimental design and methodology
The experimental objective is to assess and compare the performance of the Shewhart and EWMA control charts when used with Weibull and autocorrelated data. Specifically, how quickly can these control charts detect the introduction of a special cause (i.e., a process shift)? Large-scale simulation experimentation will be used to conduct the study. The basic experimentation consists of 100 cases formed from the following factors:
• The underlying distribution, characterized by the Weibull shape parameter γ. When γ = 1, the Weibull distribution reduces to the exponential distribution. As γ increases, the Weibull approaches the normal distribution. Thus, the distributions studied span the continuum between exponential and normal distributions.
• Five values for the process shift coefficient ν (0, 0.5, 1.0, 1.5, 2.0)
• Five values for the autocorrelation coefficient Φ (0, 0.2, 0.4, 0.6, 0.8). Woodall et al. (1993) found positive autocorrelation to be much more prominent in industry since it exists in processes that are "drifting" in some way (e.g., when assignable-cause variability exists). Thus, this study only considers positive autocorrelation. Although negative autocorrelation can exist in compensatory processes, it is usually easily identifiable and correctable.

Each of these 100 cases will consist of a retrospective and a prospective stage. In the retrospective stage, data will be generated from which the control limits are computed. In the prospective stage, additional data will be generated until an out-of-control observation occurs, thus providing the performance metrics needed to assess control chart performance across the various cases. Further details appear in the following paragraphs.

In the retrospective stage, 50,000 observations y_i will be generated from the AR(1) model shown below. This sample size was determined in pilot experimentation to provide a precise estimate of the standard deviation and, thus, precision in the calculated control limits.
$$y_i = \Phi\, y_{i-1} + \varepsilon \tag{1}$$

where $y_{i-1}$ is the previous observation, $\Phi$ is the autocorrelation coefficient and $\varepsilon$ is a random error component. For the Weibull distributions considered, the random error $\varepsilon$ is generated using the following Weibull cumulative distribution function:

$$F(\varepsilon) = 1 - \exp\left[-\left(\frac{\varepsilon}{\delta}\right)^{\gamma}\right], \quad \varepsilon \ge 0 \tag{2}$$

where $\gamma$ and $\delta$ are the shape and scale parameters, respectively, under consideration. To obtain a random Weibull value $\varepsilon$, Eq. (2) is rewritten by substituting a uniformly distributed random variable $U$ on (0, 1) for $F(\varepsilon)$ and then re-expressing in terms of $\varepsilon$, which yields:

$$\varepsilon = \delta\left[-\ln(1 - U)\right]^{1/\gamma} \tag{3}$$

After the 50,000 $y_i$ observations have been generated, the sample mean and standard deviation will be computed. These statistics will be used to compute control limits for the Shewhart (x-bar) and EWMA charts using their typical control limit formulae. At this point, the retrospective stage ends and the prospective stage begins for each of the 100 major cases studied.

The prospective stage will generate further $y_i$ observations until such time that an out-of-control condition occurs. These observations will be generated in a similar manner to those above, except now the process mean will be shifted by a multiple $\nu$ (0, 0.5, 1.0, 1.5 or 2.0) of the true standard deviation of the Weibull distribution, $\sigma_w$. Thus, the revised AR(1) model is:

$$y_i = \Phi\, y_{i-1} + \varepsilon + \nu\,\sigma_w \tag{4}$$

where $\varepsilon$ is computed from Eq. (3). $\sigma_w$ is computed using the expression for the standard deviation of a Weibull random variable:

$$\sigma_w = \delta\sqrt{\Gamma\left(1 + \frac{2}{\gamma}\right) - \Gamma^2\left(1 + \frac{1}{\gamma}\right)} \tag{5}$$

Using the above procedure, $y_i$ values will be repeatedly generated until a value falls outside of the control limits. At this time, the run length (i.e., the number of observations until an out-of-control point occurs) is recorded. This process will be repeated 10,000 times for each case to yield 10,000 run length values, at which time the average run length (ARL) and the standard deviation of the run lengths (SRL) will be computed. The rationale for 10,000 replications was based on pilot experimentation examining simulations with 1,000, 4,000, 10,000 and 100,000 replications. The ARL values became more consistent between runs when 10,000 replications were used. No major difference was found at 100,000 replications (except increased computer time). Thus, 10,000 replications appeared to be a good compromise between response consistency and computational efficiency.

The above procedure (retrospective and prospective stages) will be repeated for each of the 100 experimental cases, thus providing ARL and SRL values for each case. In turn, these metrics will be used to assess and compare control chart performance across the various cases. Specifically, to determine if there exists a significant difference in how quickly the Shewhart and EWMA control charts can detect an out-of-control condition within a particular case, a 95% confidence interval will be computed on the difference between the two charts' ARLs as follows:

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}, \quad \alpha = 0.05 \tag{6}$$

where $\bar{x}_1$ is the ARL from the Shewhart chart, $\bar{x}_2$ is the ARL from the EWMA chart, $s_1^2$ is the square of the SRL from the Shewhart chart, $s_2^2$ is the square of the SRL from the EWMA chart, and $n_1$ and $n_2$ are each 10,000. If zero is not contained within the confidence interval, we will conclude that a particular chart is better than the other. Further details appear in Section 4.
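The two-stage procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the 3-sigma limit width `k`, the EWMA weight `lam = 0.2`, the asymptotic form of the EWMA limits, the starting values of the series, the reduced sample sizes in the demonstration and all function names are choices made here, since the text refers only to "typical control limit formulae".

```python
import math
import random
import statistics


def weibull_error(rng, shape, scale):
    """Inverse-transform draw: set U = F(eps) and solve for eps."""
    u = rng.random()  # uniform on [0, 1), so 1 - u is never zero
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


def weibull_sd(shape, scale):
    """Standard deviation of a Weibull(shape, scale) random variable."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return scale * math.sqrt(g2 - g1 ** 2)


def retrospective_limits(rng, n, phi, shape, scale, k=3.0, lam=0.2):
    """Generate n in-control AR(1) observations and return
    (centre, Shewhart limits, EWMA limits). k and lam are assumed values."""
    # Assumption: the series starts at its stationary mean E[eps] / (1 - phi).
    y = scale * math.gamma(1.0 + 1.0 / shape) / (1.0 - phi)
    observations = []
    for _ in range(n):
        y = phi * y + weibull_error(rng, shape, scale)
        observations.append(y)
    centre = statistics.fmean(observations)
    sd = statistics.stdev(observations)
    half_ewma = k * sd * math.sqrt(lam / (2.0 - lam))  # asymptotic EWMA width
    return (centre,
            (centre - k * sd, centre + k * sd),
            (centre - half_ewma, centre + half_ewma))


def run_length(rng, phi, shape, scale, nu, centre, limits, lam=None,
               max_steps=100_000):
    """Number of observations until the plotted statistic leaves the limits.
    lam=None plots raw observations (Shewhart); otherwise an EWMA is plotted."""
    shift = nu * weibull_sd(shape, scale)
    lower, upper = limits
    y = z = centre  # assumption: process and EWMA both start at the centre line
    for step in range(1, max_steps + 1):
        y = phi * y + weibull_error(rng, shape, scale) + shift
        z = y if lam is None else lam * y + (1.0 - lam) * z
        if z < lower or z > upper:
            return step
    return max_steps  # censored: no signal within max_steps


def arl_difference_ci(arl1, srl1, n1, arl2, srl2, n2, z=1.96):
    """Approximate 95% confidence interval on the difference of two ARLs."""
    half = z * math.sqrt(srl1 ** 2 / n1 + srl2 ** 2 / n2)
    return arl1 - arl2 - half, arl1 - arl2 + half


if __name__ == "__main__":
    rng = random.Random(2011)
    # Hypothetical case settings; sample sizes are far smaller than in the study.
    phi, shape, scale, nu, reps = 0.4, 2.0, 100.0, 1.0, 500
    centre, shewhart, ewma = retrospective_limits(rng, 5000, phi, shape, scale)
    rl_s = [run_length(rng, phi, shape, scale, nu, centre, shewhart)
            for _ in range(reps)]
    rl_e = [run_length(rng, phi, shape, scale, nu, centre, ewma, lam=0.2)
            for _ in range(reps)]
    ci = arl_difference_ci(statistics.fmean(rl_s), statistics.stdev(rl_s), reps,
                           statistics.fmean(rl_e), statistics.stdev(rl_e), reps)
    print("Shewhart ARL:", statistics.fmean(rl_s))
    print("EWMA ARL:", statistics.fmean(rl_e))
    print("95% CI on the difference:", ci)
```

The demonstration uses 5,000 retrospective observations and 500 replications only to keep the run time short. Because the EWMA weight and limit width are assumed here, the numbers it prints are not comparable with the study's results.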

Results and discussion
Table 1 shows the experimental results in terms of ARL values for the Shewhart and EWMA charts in each of the 100 cases. Confidence intervals (95%) are shown for the difference between the two charts' ARLs. When no process shift occurred [...] but no process shift occurs, the Shewhart chart outperformed the EWMA chart. In the 16 cases where both autocorrelation and a process shift occurred, the EWMA chart was better in 10 cases. In general, the frequency of the Shewhart chart being the preferred chart tended to increase as both the process shift and autocorrelation values increased together. The remainder of Table 1 [...] At extreme process shift values (0 and 2.0), the Shewhart chart was better at a few more moderate autocorrelation values. Again, the ARL decreased as the process shift and/or autocorrelation increased.
In summary, the EWMA chart outperformed the Shewhart chart in 62 of 100 cases. The Shewhart chart was better in 35 cases, and the remaining three cases were inconclusive. Specifically, the EWMA chart was better in 19 of the 20 cases with zero autocorrelation. It was also better in the four cases with zero process shift and zero autocorrelation. The Shewhart chart was better in 26 of the 40 cases where the autocorrelation coefficient was high (0.6, 0.8), particularly those cases where the process shift was either zero or high (1.5 or 2.0). These results suggest that, in general, the EWMA chart is safer to use whenever the underlying distribution is Weibull and autocorrelated data are suspected, unless of course specific process shift and autocorrelation values are known. In that case, the aforementioned results can be utilized for the specific case under consideration.

Conclusion
As previously discussed, many research papers have cited the presence of Weibull data or autocorrelated data in real-world processes. However, none of these papers has examined the performance of Shewhart and EWMA charts when subjected to both violations together. It is to this end that the research for this paper has been undertaken. The objective of this paper was to determine which chart performed better under a variety of conditions related to the magnitude of the process shift, autocorrelation and Weibull shape parameter. It was found that the EWMA chart outperformed the Shewhart chart in 62% of the cases, particularly those cases with low to moderate autocorrelation values. The Shewhart chart outperformed the EWMA chart in 35% of the cases, particularly those cases where autocorrelation was high and the process shift was either zero or high. The remaining 3% of the cases were inconclusive.

Future research into this topic could take several forms. For instance, autoregressive models other than just the AR(1) model could be examined. Additional Weibull shape and/or scale values could be studied. Moreover, trends in the process shift could be analyzed. It is common for the process mean of an autocorrelated process to exhibit a trend. Finally, non-linear process shifts could be studied. These future directions, although not exhaustive, do appear to be logical starting points for subsequent research.
[...] distribution to derive baseline results. Pilot experiments found the results to be insensitive to the scale parameter δ and, thus, a single scale value of 100 is used (further details regarding pilot experimentation are presented in Section 3).

When no process shift occurred (ν = 0), high ARL values are desirable. When a process shift occurred (ν > 0), low ARL values are desirable. Cases where the Shewhart chart outperformed the EWMA chart are shown by italics for the confidence interval values. Cases where the EWMA chart outperformed the Shewhart chart are shown in bold-face. Three of the 100 confidence intervals (i.e., cases) were inconclusive and are shown within a box.

Table 1
Summary of Results of the Experimental Cases

The simulated value of 346.49 is fairly close to 333.33 and, thus, the realized ARL appears to be valid. As expected, ARL values tend to decrease as the process shift and/or autocorrelation increases. The EWMA chart outperformed the Shewhart chart in cases where [...] Shewhart chart in 18 of 25 cases. The EWMA chart outperformed the Shewhart chart in four of five cases where no autocorrelation existed. The Shewhart chart was better primarily in those cases in which autocorrelation was high and the process shift ν was zero or high (1.5 or 2.0). Again, note the general trend whereby ARL values decrease as the process shift and/or autocorrelation increase.