USE OF CONTROL CHARTS FOR MULTI-TEMPORAL ANALYSIS OF GEODETIC AUSCULTATION DATA FROM DAMS

Geodetic auscultation can be used to monitor the movement of dam structures by measuring, at different epochs, the distances from fixed positions (pillars) to other positions (targets). It is important to identify the targets that present atypical measurements so that managers can take corrective actions. After a model is fitted by the Least Squares Method (LSM), the residuals are expected to display random behavior. Multivariate control charts are then applied to the residuals of the fitted model, using data from geodetic survey campaigns conducted at different epochs. Control charts have been widely applied in fields other than production processes, such as public health, marketing, and services. The results show that multivariate control charts can be used to monitor multi-temporal stability. The method provides information complementary to that of the classical univariate statistical analysis.


INTRODUCTION
Geodetic auscultation of dams is a method of understanding and controlling the movement of the dam structure. Distance measurements are regularly carried out over different epochs from fixed positions, such as pillars, to other positions, which are usually called targets. These distances (or angles) should not vary significantly over time, and the variations that do appear in these measurements, as measured by the residuals, should be small (near zero) and random. It is extremely important to identify the targets that present atypical measurements because these must be explained and their causes identified and, when possible, eliminated.
The objective of this article is to present a proposal for monitoring multi-temporal data derived from geodetic survey campaigns by applying multivariate control charts. Although this quality tool has usually been employed to monitor production processes in industrial fields, its use has expanded to other areas, such as public health, marketing, and services.
Multivariate control charts are built with the residuals of a model fitted by the Least Squares Method (LSM) at different measurement points over time, i.e., using every individual survey campaign (each conducted at a particular time of a particular year) but treating the campaigns jointly.
This article is organized as follows. Section 2 presents a literature review of multivariate and univariate control charts. Section 3 follows with a data analysis, and conclusions are presented in Section 4.

Ordinary Tests
The statistical techniques commonly used in the literature refer to both the analysis of residuals and to the analysis of the stability of the estimated parameters of the fitted model.
After the model has been fitted, the following tests are usually applied (SEVILLA et al. 1990): the Chi-Square statistic to verify the quality of the residuals (and goodness of fit); the F statistic to test the unit variance; Student's t-test to identify systematic errors; and Pope's test to detect outliers.
These univariate tests comprise the usual statistical tests for inference on a single mean, a single variance, and the equality of two variances. They are applied to each set of results from each survey campaign. For inference on the mean, Student's t-test is used to check whether the expected mean value remains equal to a known value. Similarly, the Chi-Square statistic ($\chi^2$) is used to verify the stability of the variance.
In any least squares adjustment, the first test concerns the variance factor. This test verifies whether the residuals are consistent with the long-run precision of the measurements. In practice, the test assesses whether the residuals are those expected, given the precision of the measurements and the reliability of the mathematical model.
In a model fitted by LSM (GEMAEL, 1994), the sample variance, usually denoted $S_0^2$ and based on the variance of the residuals, is the statistic used to check whether the variance remains equal to a reference variance, which is initially assumed to be equal to one (the a priori variance). Rejection of this null hypothesis may indicate that gross errors ("blunders") are present or that the fitted model is inadequate. In this case, the residuals do not follow a normal distribution, and some values appear inconsistent with the measurements.
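The variance factor test described above can be sketched as a two-sided Chi-Square test. This is only an illustration under stated assumptions: the function names are hypothetical, the degrees of freedom are taken as n - 1 (in a real adjustment they would be the redundancy of the network), and the Chi-Square quantiles come from the Wilson-Hilferty approximation so that only the standard library is needed.

```python
from statistics import NormalDist


def chi2_quantile(prob, dof):
    """Approximate Chi-Square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(prob)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3


def global_variance_test(residuals, sigma0_sq=1.0, alpha=0.05):
    """Two-sided test of H0: variance == sigma0_sq.

    Assumption: dof = n - 1 and the sample variance stands in for S0^2.
    Returns the statistic, the non-rejection interval, and a reject flag.
    """
    n = len(residuals)
    dof = n - 1
    mean = sum(residuals) / n
    s0_sq = sum((r - mean) ** 2 for r in residuals) / dof  # sample variance
    stat = dof * s0_sq / sigma0_sq
    lower = chi2_quantile(alpha / 2.0, dof)
    upper = chi2_quantile(1.0 - alpha / 2.0, dof)
    reject = not (lower <= stat <= upper)
    return stat, (lower, upper), reject


# Bounds for 76 degrees of freedom at a 5% type I error.
lower_76 = chi2_quantile(0.025, 76)
upper_76 = chi2_quantile(0.975, 76)

# Illustrative residuals with variance close to one (not real survey data).
toy_residuals = [(-1.0) ** i for i in range(77)]
toy_stat, toy_interval, toy_reject = global_variance_test(toy_residuals)
```

With 76 degrees of freedom the approximate bounds fall close to the interval [53.782; 101.999] quoted later in the data analysis. Where SciPy is available, `scipy.stats.chi2.ppf` gives the exact quantiles.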
Once the variance factor test has been performed, the residuals should be analyzed individually to verify whether they constitute gross errors. Although this procedure is strictly necessary only when the global test ($S_0^2$) is rejected, it is always applied because acceptance of the global test does not guarantee the absence of gross errors. If the null hypothesis of the global test is not rejected, the sample is considered representative of the population; otherwise, an approximation and the standardized residuals are used.
Moreover, it is important to detect outliers in a network adjustment. Pope (1976) used the $\tau$ (Tau) test for this purpose. Other methods exist to evaluate data, including, for example, the data snooping technique (BAARDA, 1968) and the iterative reweighting method developed by Krarup (CASPARY, 1987).
The stability of the parameters is verified by comparing the results measured in two epochs using the so-called Global Congruency Test (GCT), as described in Niemeier (1981), Caspary et al. (1990), Denli and Deniz (2003), and Marotta et al. (2013). This test aims to determine the temporal stability of the points of a deformation and/or structural control network; it identifies the points that remain stable and can therefore be used in the determination of the targets, which are the objects of the deformation analysis.
According to Giles (2000), the Anderson-Darling test is widely used and displays good properties. It is a nonparametric test of the null hypothesis that a sample of size n taken from a population (with unknown mean and variance) follows a specific distribution function. In the current case, it is assumed that the data (the residuals) follow a normal distribution. In its usual form, the test statistic is
$$A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln F(x_{(i)};\theta) + \ln\left(1 - F(x_{(n+1-i)};\theta)\right)\right]$$
where $x_{(1)} \le \dots \le x_{(n)}$ are the ordered observations, $\theta$ is the vector of the parameters (when they are completely or partially unspecified, they should be replaced by their estimates $\hat{\theta}$), and $F(\cdot)$ denotes the cumulative distribution function.
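As a rough illustration of the Anderson-Darling test for normality, the sketch below computes the usual computational form of the $A^2$ statistic with the mean and standard deviation estimated from the sample. The function name and the two toy samples are assumptions made for this example only. The sketch returns the statistic and leaves the comparison with tabulated critical points to the reader.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """A^2 statistic for normality, with parameters estimated from the sample."""
    x = sorted(sample)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))  # theta replaced by its estimates
    cdf = [fitted.cdf(v) for v in x]
    total = 0.0
    for i in range(1, n + 1):
        # cdf[i - 1] is F(x_(i)); cdf[n - i] is F(x_(n + 1 - i)).
        total += (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1.0 - cdf[n - i]))
    return -n - total / n


# Toy data: evenly spaced normal quantiles versus a strongly skewed sample.
near_normal = [NormalDist().inv_cdf((k + 0.5) / 50) for k in range(50)]
skewed = [math.exp(v) for v in near_normal]

a2_normal = anderson_darling_normal(near_normal)
a2_skewed = anderson_darling_normal(skewed)
```

A larger $A^2$ means stronger evidence against normality, so the skewed sample yields a much larger value than the near-normal one.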

Control charts
In general, companies have made efforts to improve the quality of their products by implementing quality-control programs. Most of these efforts focus on Statistical Process Control (SPC), an important part of quality management that aims to intervene in processes to prevent the production of many non-conforming items and of items that exceed specification limits. Control charts have been used to detect deviations in the stability of specific characteristics in a production process. Shewhart introduced these charts in the 1920s to monitor production processes (MONTGOMERY, 2005). However, their application has recently expanded, and the literature shows the variety of research in which control charts have been employed. The references cited are not comprehensive but are meant simply to illustrate the applicability of control charts in various fields of scientific research. For example, Marshall et al. (2004); Benneyan, Lloyd and Plsek (2003); Arantes et al. (2003); Mohammed, Worthington and Woodall (2008); and Gustafson (2000) have used control charts for monitoring public health, hospital infections, and service processes. Woodall (2006) offers a good review of the use of control charts in public health surveillance. Ning, Shang and Tsung (2009) present a review of the use of control charts in service processes. In civil engineering, Kullaa (2003) used control charts to detect damage to bridges in Sweden; Kano and Nakagawa (2008) have applied them to evaluate steel production. MacMarthy and Wasusri (2002) conducted a survey of the application of control charts in fields other than production processes.
In summary, a control chart is a progressive graphic representation of a quality characteristic of interest (attribute or variable), or of a statistic calculated from samples collected periodically from a process over time, drawn with the following reference lines:
$$UCL = \mu_D + d\,\sigma_D, \qquad CL = \mu_D, \qquad LCL = \mu_D - d\,\sigma_D$$
where D is a statistic associated with a quality characteristic, d is a predetermined constant, and $\mu_D$ and $\sigma_D$ are the mean and the standard deviation of D, respectively. If the value of the statistic D falls within the control limits, the process is said to be under statistical control (no action has to be taken); otherwise, the process is said to be out of control (actions should be taken, such as searches for specific causes, stoppages of production, etc.).
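The reference lines of such a chart can be sketched in a few lines of code. The function names are hypothetical, and the default d = 3 (the common "three-sigma" choice) is an assumption, since the text leaves d as a design constant.

```python
def control_limits(mu_d, sigma_d, d=3.0):
    """Return (LCL, CL, UCL) for a statistic D with mean mu_d and std sigma_d."""
    return mu_d - d * sigma_d, mu_d, mu_d + d * sigma_d


def out_of_control(values, mu_d, sigma_d, d=3.0):
    """Indices of the plotted values that fall outside the control limits."""
    lcl, _, ucl = control_limits(mu_d, sigma_d, d)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Toy example: mean 10, standard deviation 2, three-sigma limits.
limits = control_limits(10.0, 2.0)
signals = out_of_control([9.5, 10.2, 17.1, 3.0, 11.0], 10.0, 2.0)
```

Here the limits are 4, 10 and 16, so the third and fourth values (indices 2 and 3) would signal an out-of-control condition.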
It is typically assumed that items are produced one by one and that samples of fixed size n are collected at regular intervals of duration h. The design of a control chart is completely specified by the values of n, d and h, which are chosen according to statistical and economic criteria. Statistical criteria are associated with the probability of a false alarm or with the average number of samples between the occurrence of a deviation and its detection, while economic criteria seek to optimize an objective function usually associated with operational costs (inspection, false alarm, adjustment, rework, etc.) per produced unit.
In manufacturing industries, it is common practice to inspect produced items frequently in order to measure the quality of the production process. The control chart is a simple tool that supports efficient decision-making using the information from the inspection. There are several types of control charts, as listed in Champ and Woodall (1987); this article uses Shewhart-type control charts, whose distinguishing feature is that decisions on the process are made considering only the data from the last sample. Shewhart-type control charts may be split into two types: control charts for attributes and control charts for variables. The 'np' or 'p' charts (for monitoring the non-conforming fraction) and the 'u' or 'c' charts (for monitoring the defect rate per produced item) belong to the former group; the x-bar chart (for monitoring the process mean) and the 'R', 'S' or '$S^2$' charts (for monitoring the variance) belong to the latter group (for additional details, see MONTGOMERY, 2005). Figure 1 is an example of an x-bar control chart for monitoring a process mean. The literature shows that Shewhart-type control charts are efficient for detecting large deviations from the target value (GOLDSMITH and WHITFIELD, 1961; CHAMP and WOODALL, 1987; MONFARED and LAK, 2013). To detect small deviations from the target value, EWMA-type and CUSUM-type control charts are more efficient and are recommended because they take past data into account in the decision-making process (MONTGOMERY, 2005; REYNOLDS JR and STOUMBOS, 2010).
However, in practice, there is typically interest in monitoring more than one quality characteristic. Multivariate control charts were therefore developed, in which the stability of several characteristics is assessed through a single chart that takes into account the dependence structure among the p characteristics, instead of drawing a separate control chart for each characteristic of interest. Literature reviews on multivariate control charts may be found in Alt and Smith (1988) and Bersimis, Psarakis, and Panaretos (2007). There are two common types of multivariate control charts: one monitors the stability of the vector of means and the other monitors the stability of the variance-covariance matrix. When a sample of size n is taken at regular intervals and p characteristics are measured on each sampled item, Hotelling's $T^2$ chart is typically used to monitor the stability of the vector of means. The statistic used in the $T^2$ control chart is given in (2):
$$T^2 = n\,(\bar{\mathbf{X}} - \boldsymbol{\mu}_0)'\,\boldsymbol{\Sigma}_0^{-1}\,(\bar{\mathbf{X}} - \boldsymbol{\mu}_0) \qquad (2)$$
where $\bar{\mathbf{X}}$ is the vector of the sample means of the p characteristics; $\boldsymbol{\mu}_0$ is the vector of the means (target values) of the p quality characteristics; $\boldsymbol{\Sigma}_0$ is the variance-covariance matrix of the p quality characteristics; and the lower control limit (LCL) and upper control limit (UCL) are equal to zero and $\chi^2_{p,\alpha}$, respectively. When values are above the UCL, the process is said to be out of control and corrective actions should be taken.
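A minimal sketch of the $T^2$ computation follows, assuming the usual known-parameter form $T^2 = n(\bar{x} - \mu_0)'\Sigma_0^{-1}(\bar{x} - \mu_0)$ and NumPy. The function names, the target vector, the covariance matrix and the two toy samples are hypothetical. For p = 2 the Chi-Square upper limit has the closed form $-2\ln\alpha$, which is used here to avoid any extra dependency.

```python
import math

import numpy as np


def hotelling_t2(sample, mu0, sigma0):
    """T^2 for one sample (rows = items, columns = p characteristics)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.shape[0]
    diff = sample.mean(axis=0) - np.asarray(mu0, dtype=float)
    # Solve Sigma0 * y = diff instead of inverting Sigma0 explicitly.
    y = np.linalg.solve(np.asarray(sigma0, dtype=float), diff)
    return float(n * diff @ y)


def chi2_ucl_p2(alpha):
    """Upper limit for p = 2 only: the Chi-Square(2) quantile at 1 - alpha."""
    return -2.0 * math.log(alpha)


mu0 = [0.0, 0.0]
sigma0 = [[1.0, 0.5], [0.5, 1.0]]  # positively correlated characteristics

# Both means shifted by +1, in line with the positive correlation.
same_direction = [[1.2, 0.8], [0.8, 1.2], [1.0, 1.0], [1.0, 1.0]]
# Means shifted in opposite directions, against the correlation.
opposite_direction = [[1.0, -1.0]] * 4

ucl = chi2_ucl_p2(0.05)
t2_same = hotelling_t2(same_direction, mu0, sigma0)
t2_opposite = hotelling_t2(opposite_direction, mu0, sigma0)
```

The two samples have shifts of the same size in each characteristic, yet only the second exceeds the limit (about 5.99 for a 5% false-alarm probability). This is the dependence structure at work: a shift that contradicts the correlation is far more unusual than one that follows it.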
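To monitor the variance-covariance matrix, two types of charts may be used. One is based on the W statistic expressed in (3):
$$W = -pn + pn\ln(n) - n\ln\left(\frac{|\mathbf{A}|}{|\boldsymbol{\Sigma}_0|}\right) + \mathrm{tr}\left(\boldsymbol{\Sigma}_0^{-1}\mathbf{A}\right) \qquad (3)$$
where $\mathbf{A} = (n-1)\mathbf{S}$, $\mathbf{S}$ is the sample variance-covariance matrix, tr(.) denotes the trace of matrix (.), n is the sample size, and p is the number of monitored quality characteristics; $|\mathbf{A}|$ denotes the determinant of the matrix $\mathbf{A}$. The LCL and UCL are equal to zero and $\chi^2_{p(p+1)/2,\alpha}$, respectively. The other chart is constructed from the generalized variance $|\mathbf{S}|$, the determinant of the sample variance-covariance matrix. In the case of p = 2, it is known that $2(n-1)|\mathbf{S}|^{1/2}/|\boldsymbol{\Sigma}_0|^{1/2}$ follows a Chi-Square distribution with (2n-4) degrees of freedom. In the case of p > 2, using asymptotic properties of $|\mathbf{S}|$, control charts can be created using $E(|\mathbf{S}|) \pm 3\sqrt{V(|\mathbf{S}|)}$ as control limits, with
$$E(|\mathbf{S}|) = b_1|\boldsymbol{\Sigma}_0|; \qquad V(|\mathbf{S}|) = b_2|\boldsymbol{\Sigma}_0|^2;$$
$$b_1 = \frac{1}{(n-1)^p}\prod_{i=1}^{p}(n-i); \qquad b_2 = \frac{1}{(n-1)^{2p}}\prod_{i=1}^{p}(n-i)\left[\prod_{j=1}^{p}(n-j+2) - \prod_{j=1}^{p}(n-j)\right] \qquad (4)$$
For processes whose behavior is not known or that are still in the development stage, i.e., whose parameters are unknown, unbiased and consistent estimators may be used in place of the parameters.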
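The control limits of the generalized variance chart can be sketched as follows, assuming the standard textbook constants $b_1 = (n-1)^{-p}\prod_{i=1}^{p}(n-i)$ and $b_2 = (n-1)^{-2p}\prod_{i=1}^{p}(n-i)\left[\prod_{j=1}^{p}(n-j+2) - \prod_{j=1}^{p}(n-j)\right]$, with limits $|\Sigma_0|(b_1 \pm 3\sqrt{b_2})$. The function name is hypothetical, and truncating a negative lower limit to zero is a common convention adopted here as an assumption.

```python
import math


def generalized_variance_limits(det_sigma0, n, p):
    """Return (LCL, CL, UCL, b1, b2) for the |S| chart.

    det_sigma0 is the determinant of the reference covariance matrix,
    n the sample size and p the number of quality characteristics.
    """
    prod_n_minus_i = math.prod(n - i for i in range(1, p + 1))
    prod_n_minus_j_plus_2 = math.prod(n - j + 2 for j in range(1, p + 1))
    b1 = prod_n_minus_i / (n - 1) ** p
    b2 = prod_n_minus_i * (prod_n_minus_j_plus_2 - prod_n_minus_i) / (n - 1) ** (2 * p)
    center = det_sigma0 * b1
    half_width = 3.0 * det_sigma0 * math.sqrt(b2)
    # A determinant cannot be negative, so the lower limit is floored at zero.
    lcl = max(0.0, center - half_width)
    return lcl, center, center + half_width, b1, b2


# Toy example: samples of size 10, two characteristics, |Sigma0| = 1.
lcl, cl, ucl, b1, b2 = generalized_variance_limits(1.0, n=10, p=2)
```

For n = 10 and p = 2 this gives $b_1 = 72/81 \approx 0.889$ and $b_2 = 2736/6561 \approx 0.417$, so the lower limit is truncated to zero and the upper limit is about 2.83 times $|\Sigma_0|$.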

DATA ANALYSIS
For the geodetic auscultation of a specific dam, 77 distance measurements from pillars (fixed positions) to other positions (target points) were carried out sequentially over five epochs. These distances should vary little over time, and the variation, as measured by the residuals, should be small and random. The nominal accuracy of the equipment used to measure the distances and the distribution of the targets in the dam are left undisclosed due to confidentiality. An LSM adjustment is used to fit the model to the distances, and the standardized residuals should follow a standard normal distribution (mean equal to zero and variance equal to one). For more details, see Gemael (1994).
To test the null hypothesis that the standardized and raw residuals follow a normal distribution, the Anderson-Darling test is applied. Table 1 displays the p-values, where $R_i$ and $Z_i$ represent the raw and standardized residuals of the i-th survey campaign, respectively. The null hypothesis is rejected for all raw residuals and for the standardized residuals of the first and third survey campaigns. Table 2 shows the sample means and variances of the raw and standardized residuals of the five survey campaigns. According to Table 2, the null hypothesis of zero mean is not rejected for either the raw or the standardized residuals. Table 3 shows the values of the statistic used to test the null hypothesis that the variance is equal to one. It can be concluded that this null hypothesis is rejected for all standardized residuals. With a type I error of 5%, the null hypothesis is not rejected when the statistic lies within [53.782; 101.999] and is rejected otherwise. The rejection of the null hypothesis of normality may indicate that there are gross errors (outliers) in the measurements (in particular, standardized residuals that seem inconsistent with the measured values) or that the fitted model is inadequate. For this reason, multivariate control charts are used to identify the atypical points by jointly comparing the different survey campaigns in a single chart.
Figures 2A and 2B represent the $T^2$ control charts of the standardized residuals and the raw residuals, respectively. Similarly, Figures 3A and 3B show the generalized variance control charts of the standardized residuals and the raw residuals, respectively. Several data points lie outside the control limits in Figures 2A-2B and in Figures 3A-3B. These data points may have led to the rejection of the previously tested null hypotheses.
For processes whose behavior is unknown or that are in the development stage (i.e., with unknown parameters), observed values that fall outside the control limits are typically removed to obtain more stable parameter estimates. With this aim, samples 10, 12, 39, 41, 43 and 44 (flagged by the $T^2$ control charts) and observations 12, 14 and 20 (flagged by the generalized variance control chart) are excluded from the analysis.
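The screening step described above can be illustrated with a short sketch. It assumes individual multivariate observations, a mean vector and covariance matrix estimated from the data themselves, and a user-supplied upper control limit; the function name, the toy data and the limit of 5.99 are hypothetical and are not the values used in the study.

```python
import numpy as np


def screen_out_of_control(observations, ucl):
    """Compute T^2 per observation and flag the rows within the limit.

    Rows are observations, columns are the p monitored characteristics.
    The mean vector and covariance matrix are estimated from all rows.
    """
    x = np.asarray(observations, dtype=float)
    diff = x - x.mean(axis=0)
    cov = np.cov(x, rowvar=False)  # unbiased sample covariance
    solved = np.linalg.solve(cov, diff.T).T  # rows are cov^{-1} (x_i - mean)
    t2 = np.einsum("ij,ij->i", diff, solved)
    keep = t2 <= ucl
    return t2, keep


# Ten well-behaved toy points plus one gross error in the last row.
toy = [
    [1, 0], [-1, 0], [0, 1], [0, -1], [1, 1],
    [-1, -1], [1, -1], [-1, 1], [0, 0], [0, 0],
    [10, 10],
]
t2, keep = screen_out_of_control(toy, ucl=5.99)
cleaned = np.asarray(toy, dtype=float)[keep]
```

Only the last row exceeds the limit in this toy case, so `cleaned` holds the ten remaining points, from which the parameters and charts would then be rebuilt. Because a gross error inflates the estimated covariance, the screening may need to be repeated after each removal.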
New Anderson-Darling tests are run on the remaining data, and the p-values are summarized in Table 4. There is a significant improvement in the results, in particular for the raw residuals. However, both the raw and the standardized residuals of the first and third survey campaigns still do not follow a normal distribution. The $T^2$ and generalized variance control charts are rebuilt after excluding the data points outside the control limits. Figures 4A-4B present the $T^2$ control chart and the generalized variance chart, respectively, for the standardized residuals. The charts prepared with the raw residuals present a similar pattern and are therefore not presented, to avoid repetition. The stability of both the mean and the variance is shown in Figures 4A-4B. According to the preliminary analysis, data from survey campaign 3 do not follow the normal distribution. These data are excluded because data normality is one of the assumptions of these control charts. However, excluding the data from survey campaign 3 and redrawing the control charts had a negligible effect, as shown in Figures 5A-5B.

CONCLUSIONS AND RECOMMENDATIONS
This study proposed a method for monitoring multi-temporal data derived from geodetic survey campaigns by applying multivariate control charts. This quality tool has usually been employed to monitor production processes in industrial fields, and its use has expanded to other areas, such as public health, marketing, and services.
The proposed method is simple and generates additional information beyond that obtained from traditional tests, as it enables the analysis of individual survey campaigns together with a temporal view across multiple survey campaigns.

The distribution of the Anderson-Darling test statistic $A^2$ is complex, and Anderson and Darling (in addition to other authors) have used simulations to calculate the critical points of this distribution at different significance levels (1%, 5% and 10%).

Figure 1 - Example of a control chart for monitoring the process mean.

Figure 4A - $T^2$ control chart - standardized residuals after exclusion of the data points outside the control limits.

Figure 4B - Generalized Variance control chart - standardized residuals after exclusion of the data points outside the control limits.

Figure 5A - $T^2$ control chart - standardized residuals after exclusion of the data points from survey campaign 3.

Figure 5B - Generalized Variance control chart - standardized residuals after exclusion of the data points from survey campaign 3.

Table 1 - P-values (descriptive levels) of the Anderson-Darling test.

Table 2 - Sample mean and variance of the raw and standardized residuals.

Table 4 - P-values of the Anderson-Darling test after exclusion of the data points outside the control limits.