ANOVA
This page is devoted to ANOVA. Don't be put off by this very technical term: ANOVA means "Analysis of Variance". Variance is a feature of a data set that measures the variability within the data, so we are looking at how much the data vary. You may have two data sets with the same average but different variances. For instance, examine the data sets D1 and D2:
       D1: 8 10 12
       D2: 1 10 19
Both of these have an average of 10, but clearly there is a marked difference in the data. With a cursory glance you would hazard a guess that D2 has greater variability and you are quite right. All the statistical analysis will do is to confirm your suspicions. 
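If you would like to check these figures outside a spreadsheet, here is a minimal sketch in Python using only the standard library. It assumes the sample variance (dividing by n - 1), which is what Excel's ANOVA summary reports; the variable names are my own.

```python
from statistics import mean, variance

# The two data sets from the text
d1 = [8, 10, 12]
d2 = [1, 10, 19]

# Sample variance divides the sum of squared deviations by n - 1
for name, data in (("D1", d1), ("D2", d2)):
    print(name, "mean =", mean(data), "variance =", variance(data))
```

Both means come out as 10, while the variance of D2 is roughly twenty times that of D1, which is what the cursory glance suggested.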

Use Microsoft Excel again to analyse the data. Put the values 8, 10 and 12 in column A and 1, 10 and 19 in column B. Use the technique shown in the last article to create the statistics: Tools > Data Analysis, but this time ask for the ANOVA with the Data Range $A$1:$B$3. Microsoft Excel returns the averages, or means, as 10 for both columns, and also the variances. Clearly there is a difference in the variances. Many times it is not so easy to see the differences in the data set itself, and so you have to rely on the results to tell you the secrets.
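For readers who want to see what lies behind the ANOVA table, the following is a rough sketch of the one-factor calculation in Python. It is not Excel's own code, only the textbook sums of squares, and the function name one_way_anova is my own invention.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_stat)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means about the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value about its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

# Columns A and B from the spreadsheet
result = one_way_anova([[8, 10, 12], [1, 10, 19]])
print(result)
```

Because both columns have the same mean, the between-groups sum of squares is zero and so is the F ratio. The ANOVA table itself compares the means; the difference in spread between D1 and D2 is visible only in the variances of the summary.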

Before we proceed, let us clarify the terminology. In ANOVA, we are usually dealing with a block design, as we saw earlier. The variables that inhabit the row and column headings are called FACTORS. If there are only columns, then there is only one factor, although this one factor may have many LEVELS. In the exam data that we analysed before, the design is one factor with two levels, the levels being English and Mathematics. If there are three columns and no rows, then there is one factor but there are three levels. If you have both rows and columns, say two rows and three columns, then you have a 2 x 3 block design. The design can be represented as R(A x B), where A and B are the factors, and you typically use subscripts to indicate the levels. I am restricted here from doing so by the software I am using, but that is normally how it is done. It is possible to have more complex groupings, which create more complex designs and thus more complicated analyses, but ANOVA is a very powerful technique and can handle just about all the complexities that you may come up with. I recommend that you look for the experiment of Child as an example of a complex experimental design used in the past.
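To make the idea of factors and levels concrete, here is a small sketch that lists the cells of a 2 x 3 design. The level names A1, A2, B1, B2 and B3 are placeholders of my own, standing in for the subscripted levels.

```python
from itertools import product

# Factor A has two levels (rows); factor B has three levels (columns)
factor_a = ["A1", "A2"]
factor_b = ["B1", "B2", "B3"]

# Each cell of the design is one combination of a level of A with a level of B
cells = list(product(factor_a, factor_b))

for cell in cells:
    print(cell)
```

The six combinations correspond to the six cells of the table, and the repeated measures R are the observations that fall inside each cell.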

ANOVA, like the t-test and regression analysis, depends on the same sums of squared deviations, and all three by and large give the same results in slightly different formats. An appreciation of the variances and the use of the ANOVA give you another very powerful tool at your disposal. The ANOVA is really an extension of the simple comparison that you used to compare the means of two sets of data. With ANOVA, you can compare the means of several sets of data simultaneously. Clearly then, we can perform a one-way ANOVA for two data sets, and the results should be similar to what we had before for the paired t-test. We can also perform paired comparisons of the means of several groups simultaneously. Let us look now at a simple example. Consider the two sets of examination scores for English and Mathematics. We saw that this design is a one-factor design with two levels. This can be summarised as R(A), where A has two levels, A1 and A2, and R represents the repeated occurrences of the measure. This is a simple design and we have already seen what the results look like using a paired t-test. Now let us see what the ANOVA does for us.
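The link between the two techniques can be checked numerically. The sketch below uses made-up numbers, not the examination scores, and it uses the independent-samples t-test with a pooled variance rather than the paired test from the earlier article, because that is the version whose squared t statistic equals the one-way ANOVA F ratio exactly when there are two groups.

```python
from math import sqrt
from statistics import mean, variance

# Made-up illustrative data, NOT the examination scores from the article
group_1 = [2, 4, 6]
group_2 = [5, 7, 9]
n1, n2 = len(group_1), len(group_2)

# Independent-samples t statistic with pooled variance
pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
t_stat = (mean(group_1) - mean(group_2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-way ANOVA F ratio for the same two groups
grand_mean = mean(group_1 + group_2)
ss_between = (n1 * (mean(group_1) - grand_mean) ** 2
              + n2 * (mean(group_2) - grand_mean) ** 2)
ss_within = (sum((x - mean(group_1)) ** 2 for x in group_1)
             + sum((x - mean(group_2)) ** 2 for x in group_2))
f_stat = (ss_between / 1) / (ss_within / (n1 + n2 - 2))

print("t squared =", t_stat ** 2)
print("F         =", f_stat)
```

The two printed values agree apart from rounding, which is why the two analyses lead to the same decision in a two-level, one-factor design.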

Examine this output. Observe that the variances for the data in the columns, shown in the SUMMARY, are exactly the same as those reported in the paired t-test. Now let us see whether the ANOVA leads to the same conclusion as the previous analysis. Recall that for this data we had said that there is evidence to suggest that the Null Hypothesis be discounted. In the ANOVA table below the SUMMARY, we have the p-value of 0.000542. This is extremely small, indicating that a difference between the means as large as the one observed would be very unlikely if the means were truly equal. We are forced to reject the Null Hypothesis, as we had done before. Of course, while you can appreciate that the analyses are very similar in this simple design, the ANOVA is capable of returning results on very complex designs.

The whole question of design of experiments, while not complex, is conceptually challenging, to say the least, until your mind adapts to this type of thinking. I can recall that after many years of studying statistics, I had not really got into the design aspects until I went to Toronto in 1990. There, under Professor Burrill, I became aware of the full potential of design and analysis. Sometimes we just need a helping hand to open our eyes to the marvels that exist about us.

Another remarkable tool is regression. We shall deal with that in the next article.
Further Reading
Block Designs
Name:
Robert Anthony Geofroy
E-Mail:
[email protected]