In reconciling thus, the two desiderata of the reduction of error and of the valid estimation of error, ... no principle is in the smallest degree compromised. An experiment either admits of a valid estimate of error, or it does not; whether it does so, or not, depends not on the actual arrangement of plots, but only on the way in which that arrangement was arrived at. (Fisher, 1926)
Replication of the experiment provides the data to estimate the experimental error variance. Blocking provides a means to reduce experimental error. However, replication and blocking alone do not guarantee valid estimates of experimental error variance or valid estimates of treatment comparisons.
Fisher (1926) was making the point that it is randomization, and randomization alone, that provides the valid estimate of error variance on which justifiable statistical inference, both estimation and tests of hypotheses, depends. Randomization is the random assignment of treatments to experimental units.
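As a concrete illustration of random assignment, the following minimal sketch allocates treatments to experimental units in a completely randomized layout. The function name `randomize_treatments`, the treatment labels, the number of replicates, and the seed are all hypothetical choices made for this example; they do not come from the text.

```python
import random


def randomize_treatments(treatments, reps, seed=None):
    """Randomly assign each treatment to `reps` experimental units.

    Returns a dict mapping unit number (1-based) to treatment label.
    The names and layout here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # One label per experimental unit: each treatment repeated `reps` times.
    labels = [t for t in treatments for _ in range(reps)]
    # Shuffling makes every arrangement of labels over units equally likely.
    rng.shuffle(labels)
    return {unit: label for unit, label in enumerate(labels, start=1)}


# Hypothetical example: three treatments, four replicates, twelve plots.
plan = randomize_treatments(["A", "B", "C"], reps=4, seed=1926)
for unit, treatment in plan.items():
    print(f"plot {unit}: treatment {treatment}")
```

Fixing the seed only makes the plan reproducible for record keeping. What matters for the validity of the error estimate is that the arrangement was arrived at by a chance mechanism, not which arrangement happens to result.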
Our analysis of data from experiments assumes the observations constitute a random sample from a normally distributed population. This assumption is plausible for comparative observational studies that use random samples of the available observation units from different treatment populations. However, whether experimental units can be considered a random sample is questionable when they are carefully selected, controlled, and monitored in experiments.
Independent observations are critical for estimation and tests of hypotheses because they provide valid estimates of experimental error variance. But the assumption of independence among the experimental units cannot be justified when relationships exist among them. For example, it is well known that field plots tend to respond more similarly when they are adjacent. Any type of proximity can produce correlated responses whether it be physical location of units or temporal performance of tasks on the units.
Fisher (1926) recognized these potential difficulties with field plot experiments and justified random assignment of treatments to experimental units as the means to obtain valid estimates of experimental error variance. In a more detailed discussion, Fisher (1935) showed that randomization provided appropriate reference populations for statistical inferences free of any assumptions about the distribution of the observations. He showed that significance tests could be based on the distribution created by randomization and that the normal theory tests provided reasonable approximations to these test results. Thus, the random allocation of treatments to the experimental units simulates the effect of independence and permits us to proceed as if the observations are independent and normally distributed. These randomization tests, illustrated in this section, form the basis for valid statistical inferences in properly randomized experiments.
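The idea of a significance test based on the distribution created by randomization can be sketched in a few lines. The example below assumes a completely randomized experiment with two treatments and uses the absolute difference in treatment means as the test statistic; the function name `randomization_test` and the response values are hypothetical and are not taken from the text.

```python
from itertools import combinations
from statistics import mean


def randomization_test(group_a, group_b):
    """Exact two-sided randomization test for a difference in means.

    Enumerates every way the pooled responses could have been split
    into groups of the observed sizes, which is the reference
    distribution created by random assignment of two treatments.
    Returns (observed absolute difference, p-value).
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    as_extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            as_extreme += 1
        total += 1
    return observed, as_extreme / total


# Hypothetical responses from eight units, four per treatment.
treated = [29.1, 31.4, 30.2, 32.0]
control = [27.5, 28.9, 26.8, 29.6]
diff, p_value = randomization_test(treated, control)
print(f"observed difference = {diff:.3f}, randomization p-value = {p_value:.4f}")
```

The p-value is the proportion of possible assignments whose difference in means is at least as large as the one observed. No assumption about the distribution of the responses is used. Full enumeration is only practical for small experiments; for larger ones a random sample of assignments is a common substitute.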
Further justification for randomization (Cochran & Cox, 1957; Greenberg, 1951; Ostle & Mensing, 1975) is based on the need to eliminate biases in the comparison of treatments that arise through systematic assignment of treatments to experimental units. If, for example, procedure A is always performed before procedure B, any systematic variation over time will bias the resulting comparisons between A and B. Randomization over these potential systematic sources of variation ensures that estimates of treatment means differ from their true values only by random variation.
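A small simulation can make the bias from systematic assignment visible. The sketch below assumes a linear drift over time, normally distributed noise, and no true difference between procedures A and B; the function name `estimated_difference` and every numeric setting are hypothetical choices for illustration only.

```python
import random
from statistics import mean


def estimated_difference(order, rng, drift=0.5, noise_sd=1.0):
    """Estimate mean(B) - mean(A) when the true treatment effect is zero.

    The run at time position t picks up a systematic drift of
    drift * t plus random noise (both are assumptions of this sketch).
    """
    responses = {"A": [], "B": []}
    for t, treatment in enumerate(order):
        responses[treatment].append(drift * t + rng.gauss(0.0, noise_sd))
    return mean(responses["B"]) - mean(responses["A"])


rng = random.Random(1957)
n_per_treatment = 10
# Systematic order: procedure A is always performed before procedure B.
systematic = ["A"] * n_per_treatment + ["B"] * n_per_treatment

systematic_estimates = []
randomized_estimates = []
for _ in range(2000):
    systematic_estimates.append(estimated_difference(systematic, rng))
    order = systematic[:]
    rng.shuffle(order)  # a fresh random run order for each experiment
    randomized_estimates.append(estimated_difference(order, rng))

print(f"systematic order: average estimate = {mean(systematic_estimates):.2f}")
print(f"randomized order: average estimate = {mean(randomized_estimates):.2f}")
```

Under the systematic order, the drift is confounded with the treatments, so the estimates center on a nonzero value even though no treatment effect exists. Under randomized orders, individual estimates still vary, but they center on the true value of zero.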
Excerpted from: Kuehl, Robert O., "Design of experiments: statistical principles of research design and analysis", Brooks/Cole, Thomson Learning, 2000, pp. 20-21.