Case study using analysis of variance to determine groups' variations

This paper presents the analysis of a part manufactured in three shifts, which has a specific characteristic dimension, using the DFSS (Design for Six Sigma) ANOVA (Analysis of Variance) method. In every shift, the significant characteristic (“SC”) dimension should be produced within the given tolerance. The question that arises is: “Does the shift have any influence on the realization of the “SC” dimension?” By using the one-way ANOVA method, one can observe the variation between the means of the three shifts. Afterwards, specific action can be undertaken to adjust, if necessary, the differences between the shifts.


About ANOVA
ANOVA, the abbreviation of "analysis of variance", tests the hypothesis that the means of three or more populations are equal. ANOVA assesses the importance of one or more factors by comparing the response variable means at the different factor levels. The null hypothesis states that all population means (factor level means) are equal, while the alternative hypothesis states that at least one is different, http://support.minitab.com [1].
We are using the one-way ANOVA method because the sample data are separated into groups according to a single characteristic (factor), here the shift in which the "SC" dimension was produced. Rather than referring to the main objective of testing for equal means, the term analysis of variance refers to the method we use, which is based on an analysis of sample variances, M. F. Triola [3].
What we are doing is comparing the variance between samples with the variance within samples.

$$F = \frac{\text{variance between samples}}{\text{variance within samples}} \qquad (1)$$

The p-value and the test statistic F of one-way ANOVA are related: larger values of F give smaller p-values, which is why the ANOVA test is right-tailed. Conversely, smaller values of F give larger p-values.
The p-value is the probability, assuming the null hypothesis is true, of obtaining a sample statistic with a value as extreme as, or more extreme than, the one determined by the sample data.
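The relation between F and the p-value can be illustrated numerically. The following is a minimal sketch, not the method used in this study: it relies on the closed form of the upper-tail probability of the F distribution that holds only when the numerator has 2 degrees of freedom (three groups). The function name `p_value_df1_2` is hypothetical, and the 42 within-group degrees of freedom are assumed from three groups of 15 values.

```python
def p_value_df1_2(f_stat, df_within):
    """Right-tail p-value of an F statistic with 2 numerator degrees of freedom.

    For F(2, df_within) the upper-tail probability has the closed form
    (1 + 2*F/df_within) ** (-df_within / 2). It is NOT valid for other
    numerator degrees of freedom.
    """
    if f_stat < 0 or df_within <= 0:
        raise ValueError("need f_stat >= 0 and df_within > 0")
    return (1.0 + 2.0 * f_stat / df_within) ** (-df_within / 2.0)


if __name__ == "__main__":
    # Assumed setting: three groups of 15 values, so 45 - 3 = 42 within-group df.
    for f_stat in (0.0, 0.5, 1.0, 3.0, 6.0):
        print(f"F = {f_stat:3.1f}  ->  p = {p_value_df1_2(f_stat, 42):.4f}")
```

The printed p-values decrease as F grows, which is the right-tailed behaviour described above.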
Fig. 1, below, shows the relationship between the F test statistic and the p-value.

Fig. 1. Relation between F test statistic and p-value

Data under ANOVA analysis
The part under observation is an assembly component for which the design specifies a specific characteristic dimension. This dimension has a major impact on the functionality of the part and, for that reason, systematic measurements are taken during every shift.
Fig. 2 presents a detail from the drawing of the part, with the specific characteristic dimension marked on it. This dimension is Φ4.15±0.025 and it represents the diameter of the hole. The geometrical tolerance, circularity, also presented in this figure, is not considered for this observation even though it carries the SC specification.
The coaxiality tolerance is also not part of the present observation. The first thing that influences the quality of the part is the significant dimension, namely the diameter of the hole. All other information is to be considered in the next phases of observation and control.

MATEC Web of Conferences 126, 04008 (2017) DOI: 10.1051/matecconf/201712604008

Fig. 2. Drawing of the part
These measurements are used to determine, by the ANOVA method, the differences between the means of the shifts with regard to this specific dimension. For that, 15 measurements are taken in every shift.
A measurement report is given and it includes the 15 measurements of the specific characteristic dimension for every shift. These measured values can be seen in table 1. First of all, we can observe from this table a small variation of the mean of the dimension across the three shifts. Thus we can say that every shift produces good parts with regard to the upper and lower tolerances.
To form an overall view of the part variation we can use the one-way ANOVA method.

Using one-way ANOVA to compare the means of the three shifts, DfSS training materials [4]
One-way ANOVA is used to compare the means of at least three conditions in an experiment; here every shift is considered a group.
The data for this experiment are given in table 1. We have three groups (representing the three shifts) and 15 measurements of the specific dimension for every group.
To begin with the ANOVA method, the first thing we have to do is to state the null and alternative hypotheses.

Hypotheses of one-way ANOVA
Statistical tests attempt to establish whether assumptions, statements or hypotheses about unknown populations are valid or not. The goal is to be able to decide whether one or more factors have a significant influence on a certain response. Sometimes the differences are significant, other times they are not, and they have to be observed.
The hypothesis tests help us decide, with a certain confidence, whether a statistical difference between samples exists or not. We always assume that the null hypothesis ($H_0$) is true unless we find strong evidence otherwise. The alternative hypothesis ($H_a$) is adopted when strong evidence is found, in which case we reject the null hypothesis.
In our case, the null hypothesis states that there is no difference between the means of the shifts. If we fail to reject $H_0$, the data give no evidence that the factor shift has an influence on the functional results.
The alternative hypothesis, $H_a$, states that at least one mean differs from the others, that is, the factor has an influence on the functional results.
The two statements are written as below:

$$H_0: \mu_1 = \mu_2 = \mu_3$$
$$H_a: \text{at least one mean differs from the others}$$

We will use a significance level of α = 0.05. By definition, the alpha level is the probability of rejecting the null hypothesis when the null hypothesis is true, http://support.minitab.com [1].
Using the standard α = 0.05 cutoff, the null hypothesis is rejected when p < 0.05 and not rejected when p > 0.05. The p-value does not, in itself, support reasoning about the probabilities of hypotheses but is only a tool for deciding whether to reject the null hypothesis, R. L. Wasserstein, N. A. Lazar [5].
Fig. 3 illustrates the p-value and shows where and how it influences the decision to reject or fail to reject the null hypothesis.

Analyzing the degrees of freedom to determine the critical value
The degrees of freedom ($df$) determine the critical value with which the test statistic will be compared. There are two kinds of degrees of freedom to calculate: between the groups and within the groups.
For the $df$ between the groups:

$$df_{between} = k - 1$$

where k represents the number of conditions (we have three groups, so there are three conditions). It follows that $df_{between}$ is 2.
For the $df$ within the groups:

$$df_{within} = N - k = 45 - 3 = 42$$

where N represents the total number of values in the groups. Then, the total degrees of freedom will be:

$$df_{total} = df_{between} + df_{within} = 2 + 42 = 44$$

F-critical value
Using the F table (or the FINV function in Excel, e.g. =FINV(0.05,2,42), fig. 2), we get $F_{crit} = 3.22$.
The mean of all groups (the grand mean) is

$$\bar{x} = \frac{G}{N}$$

where G is the sum of all values across the groups and N is the total number of values.
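The critical value of 3.22 quoted above can be cross-checked without a table. The sketch below is an illustration under stated assumptions, not the calculation performed in the study: it inverts the closed-form upper-tail probability of the F distribution, which is available only when the numerator has 2 degrees of freedom. The function name `f_critical_df1_2` is hypothetical.

```python
def f_critical_df1_2(alpha, df_within):
    """Upper-tail critical value of F(2, df_within).

    Solves (1 + 2*F/df_within) ** (-df_within / 2) = alpha for F.
    Valid only for 2 numerator degrees of freedom (three groups).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    if df_within <= 0:
        raise ValueError("df_within must be positive")
    return (df_within / 2.0) * (alpha ** (-2.0 / df_within) - 1.0)


k = 3                      # number of groups (shifts)
n_total = 45               # 15 measurements in each of the 3 shifts
df_between = k - 1         # 2
df_within = n_total - k    # 42
f_crit = f_critical_df1_2(0.05, df_within)
print(f"df = ({df_between}, {df_within}), F crit = {f_crit:.2f}")  # F crit = 3.22
```

For a different number of groups the closed form no longer applies, and a statistical table or a function such as FINV is needed instead.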

Sum of squares total
The total sum of squares is calculated as per the following equation:
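As a hedged illustration, a standard way of writing the total sum of squares is given below. The notation is an assumption introduced here, not taken from the measurement report: $x_{ij}$ is the $j$-th measurement of shift $i$, $n_i$ is the number of measurements in shift $i$, $k$ is the number of shifts, and $\bar{x} = G/N$ is the grand mean.

```latex
SS_{total} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( x_{ij} - \bar{x} \right)^2
           = \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}^2 - \frac{G^2}{N}
```

The second form needs only the sum of the squared values together with G and N, which is convenient for hand calculation.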

Sum of squares between the groups
From the equation below, we find that $SS_{between}$ is 0.000138
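For reference, the between-groups and within-groups sums of squares are commonly written as follows. This is a sketch in assumed notation, not a transcription of the study's own equation: $\bar{x}_i$ is the mean of shift $i$, $n_i$ its number of measurements (15 here), $\bar{x}$ the grand mean and $k = 3$ the number of shifts.

```latex
\begin{align*}
SS_{between} &= \sum_{i=1}^{k} n_i \left( \bar{x}_i - \bar{x} \right)^2 \\
SS_{within}  &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( x_{ij} - \bar{x}_i \right)^2
              = SS_{total} - SS_{between}
\end{align*}
```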

Calculation of the variance between and the variance within
First, we calculate the mean square between, which is:

$$MS_{between} = \frac{SS_{between}}{df_{between}} = \frac{0.000138}{2} = 0.000069$$

Second, the mean square within:

$$MS_{within} = \frac{SS_{within}}{df_{within}} \qquad (11)$$

Calculate the F-value for the specific set
The F-value is compared with $F_{crit}$ in order to decide whether or not to reject the $H_0$ hypothesis. The F-value is calculated as the ratio between $MS_{between}$ and $MS_{within}$. If the result is less than $F_{crit}$, we fail to reject the null hypothesis, which means that the means of the groups are almost the same and there is no significant difference between the three groups. A result greater than $F_{crit}$ indicates a significant difference and gives us the possibility to reject the null hypothesis. As can be observed from the measurement table, there is no big difference between the shifts, even though the 3rd shift has a larger mean than the other two. The difference is small and no further action needs to be taken.
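The steps above (sums of squares, mean squares, F-value and comparison with the critical value) can be sketched in a few lines. This is a minimal illustration, not the study's calculation: the function name `one_way_anova` is hypothetical, the diameters are made-up values chosen inside the Φ4.15±0.025 tolerance and are NOT the values from the measurement report, and the critical value and p-value use closed forms that hold only for three groups.

```python
def one_way_anova(groups):
    """Compute the one-way ANOVA quantities for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total  # G / N
    means = [sum(g) / len(g) for g in groups]

    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_value": ms_between / ms_within,
    }


# Hypothetical diameters for three shifts, NOT the measured data.
shifts = [
    [4.14, 4.16, 4.15, 4.13, 4.17],
    [4.15, 4.14, 4.16, 4.15, 4.15],
    [4.16, 4.15, 4.17, 4.14, 4.16],
]
result = one_way_anova(shifts)

# Closed forms valid only when df_between = 2 (three groups).
alpha = 0.05
dfw = result["df_within"]
f_crit = (dfw / 2.0) * (alpha ** (-2.0 / dfw) - 1.0)
p_value = (1.0 + 2.0 * result["f_value"] / dfw) ** (-dfw / 2.0)

decision = "reject H0" if result["f_value"] > f_crit else "fail to reject H0"
print(f"F = {result['f_value']:.3f}, F crit = {f_crit:.3f}, p = {p_value:.3f}: {decision}")
```

With these made-up values the F-value stays below the critical value, so the sketch ends in "fail to reject H0"; real conclusions require the measured data.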

Using the Excel add-in "Data analysis"
Excel has an add-in that allows easy calculation of various statistical quantities. Among its functions, one-way ANOVA is available, and it gives us all the values we calculated before.
For the three groups that we have evaluated, the Excel data analysis add-in shows the following results.
Table 2 shows the main information regarding the group structure: count of observations, sum, average and variance. The ANOVA results are given in table 3. Here we can find the sum of squares (within, between and total), the degrees of freedom, the mean squares, the F-value, the p-value and also F-critical. With a calculated p-value of 0.66, larger than the α level of 0.05 chosen at the beginning of the process (which corresponds to a 95% confidence level, using a one-tailed test), we again fail to reject the null hypothesis: there are no significant differences between the means of the part dimension across the three shifts. In other words, even if the third shift differs slightly from the other two, the difference is too small to influence the quality of the part. The significant characteristic dimension is respected, the variation not being too influential.
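A group summary of the kind described for table 2 can be reproduced with the Python standard library. The sketch below is illustrative only: the function name `group_summary` is hypothetical, the diameters are made-up values rather than the measured data, and it is assumed that the variance reported in the summary is the sample variance (n - 1 denominator).

```python
from statistics import mean, variance


def group_summary(groups):
    """Per-group count, sum, average and sample variance (n - 1 denominator)."""
    return [
        {
            "group": name,
            "count": len(values),
            "sum": sum(values),
            "average": mean(values),
            "variance": variance(values),
        }
        for name, values in groups.items()
    ]


# Hypothetical diameters, NOT the values from the measurement report.
shifts = {
    "Shift 1": [4.14, 4.16, 4.15, 4.13, 4.17],
    "Shift 2": [4.15, 4.14, 4.16, 4.15, 4.15],
    "Shift 3": [4.16, 4.15, 4.17, 4.14, 4.16],
}
summary = group_summary(shifts)
for row in summary:
    print(f"{row['group']}: n = {row['count']}, sum = {row['sum']:.2f}, "
          f"average = {row['average']:.4f}, variance = {row['variance']:.6f}")

# The within-groups sum of squares follows from the summary alone:
# SS_within = sum over groups of (n_i - 1) * s_i^2.
ss_within = sum((row["count"] - 1) * row["variance"] for row in summary)
print(f"SS within = {ss_within:.6f}")
```

The last lines show how the summary connects to the ANOVA table: the per-group variances are enough to recover the within-groups sum of squares.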

Conclusion
We saw that by using the one-way ANOVA method, we were able to determine whether there is a statistically significant difference among the groups that is not attributable to sampling error. If we find that there is a difference, then we need to examine where the groups' differences lie and undertake proper measures. For the analysis presented here, we observed that there is no significant difference between the groups (shifts): we failed to reject the null hypothesis, so no further action needs to be taken.

Fig. 3. P-value, on the left- and right-tailed graphs, with null hypothesis

Table 1. Values from measurement report
From this table we can see that the mean of every single shift is given in the last row.

Table 2. Summary of the main information regarding the groups under analysis

Table 3. Main information of ANOVA