Week 10: March 5 - 9, 2001
 
     
 


This week was primarily spent on report writing, reading and statistical analysis.

As SAS is sometimes difficult to use for statistical analysis, we tried an alternative program, Statistica. However, Statistica was not able to import our data properly, so the sometimes bothersome analysis with SAS continued.

A few meetings were held regarding the opening of the new climate chambers (March 19th). Most people at the centre will take part in this event. Our responsibility will be to assist with the serving of beverages/food, guide the guests, prepare a demonstration test set-up concerning air velocity in one of the climate chambers, and assist Jørn with his presentation of our results. On the one hand we would very much like to take part in the preparations and the opening festivities; on the other hand we would like to minimize the time spent on these activities, as our master's thesis has very high priority. Do not forget that we have applied to postpone the deadline for our report.

The statistical analysis continued and we sought more advice from Henrik Spliid. Below is a list of points concerning our statistical analysis:

1. Non-parametric statistical tests are frequently used (among other places at our centre), because they do not place strict requirements on the data set with regard to normal distribution. However, it is much better to check the data, and if necessary transform them, so that they fulfil the requirements for parametric tests, as these tests are much more powerful than the non-parametric tests. The disadvantage of non-parametric tests is that they discard much of the information in a data set. This makes it difficult to obtain significant results; only in the best case will a non-parametric test yield results as significant as those of the corresponding parametric test. Moreover, the frequently used tests for comparison of means require that the data in each of the two data sets are normally distributed and of equal variance. This requirement is often neglected when the test is applied and results are presented.

2. In order to test whether a data set is normally distributed, one must make a sorted probability plot of the residuals. If the plot does not fit a normal distribution, the residuals of the logarithm of the results can be used instead. The logarithm is taken because the effect (e.g. performance during text typing) is increased/decreased by a percentage (e.g. 5%) instead of by a constant additive amount (e.g. an additional page typed, no matter how much time is spent).
For the data to be normally distributed, the residuals in the probability plot must lie on a straight line; moreover, they must be evenly distributed around the middle with only a few outliers. If an outlier is found, it may be removed, provided an explanation exists of why that specific data point differs from the rest of the data. Some sort of transformation was mentioned as a method to streamline the data, but we do not expect to go into this level of detail.

3. The GLM (general linear model) does not require the experiments to be balanced, i.e. the amounts of data for each condition need not be equal.

4. The GLM yields two outputs, type I and type III. It is the type III output that should be used to determine the significance of each factor.

5. The “estimate coefficients” that compare the levels of a class variable (e.g. groups) with one another are of little use; the computed p-values cannot be used without transformation and can be obtained in other ways.

6. For the Newman-Keuls test, care must be taken when comparing variables that are nested. For example, the variable “subjectno” is nested under “groups” and “sex”. The variance between subjects (subjectno) is, for instance, much greater than the variance between groups (group), which intuitively makes it difficult to determine whether there is a marked difference between groups. SAS incorrectly tests the group variation against the mean square of the error. However, in this case it is not the mean square of the error but rather the much larger mean square of the subjects that should be used as the divisor when determining the F-value and the subsequent p-value of the group variation.
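The choice of divisor in point 6 can be illustrated with a small sketch. It is not our SAS analysis: the data are invented, the design is balanced, only subjects nested under groups are considered (sex is left out), and the function name nested_f_ratios is hypothetical. It only shows how much the F-value for groups changes when the group mean square is divided by the subject mean square instead of the error mean square.

```python
from statistics import mean

# Hypothetical scores: group -> subject -> repeated measurements.
# Invented numbers, chosen so that subjects differ much more than groups.
data = {
    "A": {"s1": [10.0, 11.0, 10.5], "s2": [14.0, 15.0, 14.5], "s3": [12.0, 12.5, 13.0]},
    "B": {"s4": [13.0, 13.5, 14.0], "s5": [17.0, 16.0, 16.5], "s6": [11.0, 11.5, 12.0]},
}

def nested_f_ratios(data):
    """Return the F-value for groups computed with two different divisors."""
    all_values = [x for subjects in data.values()
                  for obs in subjects.values() for x in obs]
    grand_mean = mean(all_values)

    ss_group = ss_subject = ss_error = 0.0
    n_subjects = 0
    for subjects in data.values():
        group_values = [x for obs in subjects.values() for x in obs]
        group_mean = mean(group_values)
        # Variation of the group mean around the grand mean
        ss_group += len(group_values) * (group_mean - grand_mean) ** 2
        for obs in subjects.values():
            subject_mean = mean(obs)
            # Variation of the subject mean around its own group mean
            ss_subject += len(obs) * (subject_mean - group_mean) ** 2
            # Variation of single measurements around the subject mean
            ss_error += sum((x - subject_mean) ** 2 for x in obs)
            n_subjects += 1

    ms_group = ss_group / (len(data) - 1)
    ms_subject = ss_subject / (n_subjects - len(data))
    ms_error = ss_error / (len(all_values) - n_subjects)

    return {
        "f_vs_error": ms_group / ms_error,       # divisor criticised in point 6
        "f_vs_subjects": ms_group / ms_subject,  # divisor recommended in point 6
    }

ratios = nested_f_ratios(data)
print(ratios)
```

With these invented numbers the F-value against the error mean square is far larger than the F-value against the subject mean square, so the first divisor would suggest a group difference that the second does not support. The p-values are left out because the F distribution is not part of the Python standard library.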

To summarise, Henrik Spliid thought that our model and experimental design looked fine. The slightly unbalanced design was not expected to have a significant effect on the outcome of the results. The preliminary analysis of the addition and text typing data indicated an acceptable normal distribution of the data.
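The normality check described in point 2 can be sketched roughly as follows. This is only an illustration under several assumptions: the scores are invented, the residuals are taken around the mean of a single condition rather than from the fitted GLM, the plotting positions (i - 0.5)/n are one common convention among several, and the helper names are hypothetical. Instead of drawing the plot, the sketch measures how close the plot points lie to a straight line.

```python
import math
from statistics import NormalDist, mean

def probability_plot_points(residuals):
    """Pair each theoretical standard normal quantile with a sorted residual."""
    ordered = sorted(residuals)
    n = len(ordered)
    # Plotting positions (i - 0.5) / n: an assumed, commonly used convention
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

def straightness(points):
    """Pearson correlation of the plot points; values near 1 mean a nearly straight line."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical, right-skewed performance scores for one condition
scores = [52.0, 55.0, 58.0, 60.0, 63.0, 67.0, 72.0, 80.0, 95.0, 130.0]

# Residuals of the raw scores
raw_mean = mean(scores)
raw_residuals = [s - raw_mean for s in scores]

# Residuals of the logarithm of the scores
log_scores = [math.log(s) for s in scores]
log_mean = mean(log_scores)
log_residuals = [v - log_mean for v in log_scores]

r_raw = straightness(probability_plot_points(raw_residuals))
r_log = straightness(probability_plot_points(log_residuals))
print(f"raw: {r_raw:.3f}  log: {r_log:.3f}")
```

For these invented scores the log residuals lie closer to a straight line than the raw residuals, which is the situation in which point 2 suggests using the logarithm. The correlation is only a rough numerical stand-in for inspecting the plot; it does not replace looking at the outliers.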

 

 
 
           