The Purpose of Randomization

The type of statistical inference that can be made from a body of data depends on the nature of the data. It is easy to conduct an experiment in such a way that no useful inferences can be made, and many of the experiments brought to the statistician, particularly in earlier years, have been of this type. To take a simple example, suppose that in the comparison of the calculating machines (see earlier page) each sum of squares had been computed first on machine A and then on machine B. Now it is quite possible that increased familiarity with the data will enable the second computation to be done faster than the first. The advantage is unlikely to be great in such an easy calculation; it could be so if the computation were more difficult. (In the experiment this advantage was estimated at about 4 seconds, though the confidence limits for the true advantage are rather far apart.)

If the experiment is conducted in this way, the observed difference in speed (B - A) is an estimate of the true difference, plus the unknown difference in speed between a second calculation and a first. Confidence limits set up by statistical techniques apply not to the true difference between B and A but to the true difference plus this unknown advantage. Consequently, the limits tell us nothing definite about the true difference between B and A. The interpretation of tests of significance also becomes confused. If A is found significantly faster than B by ordinary statistical tests, we can be confident that there is a real difference in favor of A, since A was handicapped in the course of the experiment. But if B is found significantly faster than A, we do not know what to conclude. In this case we are dealing with a bias whose nature can be anticipated before the experiment has started. In other experiments where less is known about the type of variability that is present, similar biases that are quite unexpected can occur from some apparently innocuous rule about the way in which different treatments are handled.
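The confounding described above can be made concrete with a small simulation. This is only a sketch under stated assumptions: the two machines are given identical true speed, the second calculation is given a 4-second advantage (the estimate quoted earlier), and the 60-second baseline, the noise level, and the function name `mean_observed_difference` are hypothetical choices made for illustration.

```python
import random

def mean_observed_difference(randomize, n_trials=20000, true_diff=0.0,
                             second_run_advantage=4.0, noise_sd=2.0, seed=1):
    """Average observed (B - A) time difference, in seconds, over many trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # Fixed order always runs A first; randomized order tosses a fair coin.
        a_first = (rng.random() < 0.5) if randomize else True
        # Assumed 60 s baseline with Gaussian noise (illustrative values only).
        time_a = 60.0 + rng.gauss(0.0, noise_sd)
        time_b = 60.0 + true_diff + rng.gauss(0.0, noise_sd)
        # Whichever machine goes second benefits from familiarity with the data.
        if a_first:
            time_b -= second_run_advantage
        else:
            time_a -= second_run_advantage
        total += time_b - time_a
    return total / n_trials

fixed_order = mean_observed_difference(randomize=False)
random_order = mean_observed_difference(randomize=True)
print(f"A always first:  mean (B - A) = {fixed_order:+.2f} s")
print(f"coin-toss order: mean (B - A) = {random_order:+.2f} s")
```

With A always first, the average difference settles near the order advantage rather than near the true difference of zero; with a coin toss deciding the order, the order advantage averages out.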

In order to avoid these biases we need some means of ensuring that a treatment will not be continually favored or handicapped in successive replications by some extraneous source of variation, known or unknown. This is done by the device known as randomization, due to Fisher. Instead of performing every calculation first on machine A, we apply the principle of randomization by tossing a coin to determine whether A or B shall be used first in any trial. The decision is made independently in each trial. The effect is that in any trial each machine has an equal chance of being tested under the more favorable conditions. Of course, the result of any specific randomization may favor one or the other treatment. But this happens only to an extent that is allowed for in the calculations that are used for tests of significance and confidence limits. This important result has been illustrated in detail by Fisher, who has shown how tests of significance and confidence limits can be constructed, using only the fact that randomization has been properly applied in the experiment. Randomization is one of the few characteristics of modern experimental design that appear to be really modern. One can find experiments made 100 or 150 years ago that embody the principles that are now regarded as sound, with the conspicuous exception of randomization.
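The coin-tossing rule can be sketched in a few lines, with a pseudo-random generator standing in for the physical coin. The function name `toss_for_order`, the number of trials, and the seed are assumptions introduced for illustration; the seed is fixed only so that the plan can be reproduced.

```python
import random

def toss_for_order(n_trials, seed=None):
    """Return, for each trial, which machine is used first and which second."""
    rng = random.Random(seed)
    orders = []
    for _ in range(n_trials):
        # One independent fair coin toss per trial: heads means A goes first.
        heads = rng.random() < 0.5
        orders.append(("A", "B") if heads else ("B", "A"))
    return orders

orders = toss_for_order(10, seed=1957)
for trial, (first, second) in enumerate(orders, start=1):
    print(f"trial {trial:2d}: {first} first, then {second}")
```

Because each toss is independent, a particular plan may still put one machine first more often than the other; that is the chance imbalance which the significance calculations allow for.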

The occasions on which randomization is required vary with the type of experiment and must be left to the judgment of the experimenter. One occasion arises when the treatments are allotted to the experimental material. Suppose that the effects of different diets on the heights and weights of children are to be ascertained. Since different children grow at different rates, a treatment that happens to be assigned to a group of fast-growing youngsters is favored. Consequently, we allot the diets in any replication at random to the children who are to receive them, with a new random allotment in each replication. Similarly, if 4 different oven temperatures for the cooking of roasts are under comparison, the 4 temperatures are assigned at random to the 4 roasts which form the material for any replication. Sometimes this is the only randomization required, but frequently other operations that are carried out in the course of the experiment are also potential sources of bias. With a repetitious operation the order of events may be important, either because a learning process is involved which tends to make later operations better than the earlier ones, or because fatigue tends in the opposite direction. Systematic biases may be guarded against by randomizing the order in which the operation is performed on the different treatments in a replication. In other cases the equipment that is used introduces variation. For example, if 4 ovens are available to cook the 4 roasts, we would not always use the same oven for the same temperature, in case biases should be introduced because of systematic differences among the ovens. Instead, the temperatures could be assigned at random to the ovens in each replication. Thus we have two randomizations, one to assign the temperatures to the roasts and one to assign them to the ovens. If, however, we decide, before randomizing, which roast is to go in which oven, the two randomizations can be reduced to one. This method can always be used, if convenient, to cut down the number of randomizations that must be made.
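The roast example can be sketched as follows. The roast-and-oven pairing is decided before randomizing, so a single shuffle per replication assigns each temperature to a roast and an oven at once. The temperature labels `T1` to `T4`, the unit names, the number of replications, and the seed are placeholders assumed for illustration.

```python
import random

def randomize_replication(temperatures, units, rng):
    """Assign each temperature to exactly one unit, in random order."""
    shuffled = list(temperatures)
    rng.shuffle(shuffled)  # every permutation of the temperatures is equally likely
    return dict(zip(units, shuffled))

temperatures = ["T1", "T2", "T3", "T4"]
# Each unit is a (roast, oven) pair fixed in advance of the randomization.
units = [("roast 1", "oven 1"), ("roast 2", "oven 2"),
         ("roast 3", "oven 3"), ("roast 4", "oven 4")]

rng = random.Random(42)
# A new, independent random allotment is drawn for each replication.
plan = [randomize_replication(temperatures, units, rng) for _ in range(3)]

for rep, allotment in enumerate(plan, start=1):
    print(f"replication {rep}:")
    for (roast, oven), temp in allotment.items():
        print(f"  {roast} in {oven}: {temp}")
```

Drawing a fresh shuffle for every replication keeps any one temperature from being tied to the same roast position or oven throughout the experiment.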

Randomization is somewhat analogous to insurance, in that it is a precaution against disturbances that may or may not occur and that may or may not be serious if they do occur. It is generally advisable to take the trouble to randomize even when it is not expected that there will be any serious bias from failure to randomize. The experimenter is thus protected against unusual events that upset his expectations. Of course, in experiments where a great number of physical operations are involved, the application of randomization to every operation becomes time-consuming, and the experimenter may use his judgment in omitting randomization where there is real knowledge that the results will not be vitiated. It should be realized, however, that failure to randomize at any stage may introduce bias unless either the variation introduced in that stage is negligible or the experiment effectively randomizes itself.

Adapted from:

Cochran, William G., and Cox, Gertrude M., "Experimental Designs", John Wiley & Sons Inc, 1957, pp. 6-8.
