Data Collection
In the first article in this series, we looked broadly at the research process. "Research" activity must be made more accessible, for it involves finding out and assessing situations and forming conclusions based on sound methods and analysis. Most students can be involved in research once they have been exposed to a few simple computational techniques. To my mind, the two crucial steps once the problem has been properly articulated are the research design (the design of the experiment) and the analysis and interpretation of the data. We looked at two simple experimental designs with different applications, one in education and the other in agriculture.

The design is critical because it determines the methodology used in the research, including the data collection instruments employed, the size of the sample required, and the type of analysis to be applied to the data. In the comparison of the Mathematics (average 82.5) and English (average 89.4) marks, it looks as though the English marks are better, but delving into the analysis a bit more reveals that this experiment cannot really lead to this conclusion. There are weaknesses! We can conclude that for these three students the English performance seems better, but reflect again on the fact that you have just three students, and even if these scores were based on tests given, you are still just taking a snapshot of the situation at a particular instant. You cannot draw any general inferences about performance in the two subjects.

If the hypothesis is that "For these three students, there is no difference in the level of performance in English and Mathematics", then you can say that the performance in the English examination was better, but statistically there is no reason to suggest that "performance in English is better than performance in Mathematics". Moreover, you cannot draw any conclusions about any wider population. The narrower your population and the more restricted your hypothesis, the less useful your research is, even if it does lead to a conclusive result. For instance, if we increase the number of students here to, say, 26 and perform the same test, this can lead to a conclusive and statistically sound result, but we still cannot make any inferences about, say, the population of students in the country. To be able to do that involves a technique called "sampling".
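
The effect of the number of students can be seen in a small calculation. The sketch below assumes that the test in question is a paired t-test on each student's two marks (the article does not name the test), and the marks are made up purely for illustration. The function name paired_t_statistic is likewise hypothetical.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for paired scores from the same students."""
    if len(first) != len(second):
        raise ValueError("each student needs a score in both subjects")
    n = len(first)
    if n < 2:
        raise ValueError("at least two students are needed")
    # Work with each student's difference, second subject minus first.
    differences = [b - a for a, b in zip(first, second)]
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    # t compares the mean difference with its standard error.
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up marks for three students, for illustration only.
maths = [78, 85, 84]
english = [90, 86, 92]
t, df = paired_t_statistic(maths, english)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

With three students there are only 2 degrees of freedom, and the two-sided 5% critical value of t is then about 4.30, so even a sizeable gap in the averages can fail to be significant. With 26 students there are 25 degrees of freedom and the critical value drops to about 2.06.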

In the design, we need to specify what the "population" is. This is the wider body of students about which we would like to make inferences. It is usually impossible or impractical to test each member of the population, so we select a sample, a smaller set which represents the population. The experiment done on the sample will enable us to make inferences about the population, and thus make the research project worthwhile. The data collection is run on the sample. Moreover, the data collection must be done in a manner that assures the integrity and accuracy of the situation we are trying to "look at". It is important to ask relevant and pertinent questions, and not to interfere too much in the actual situation. There is a principle in Physics, often called the observer effect, which says that measurements made to determine the state of a system interfere with the system and thus do not reveal a true picture of its state.
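
One common way of selecting a sample is a simple random sample, in which every member of the population has the same chance of being chosen. The sketch below is only an illustration of that one technique: the population of 500 numbered students, the sample size of 26 and the function name are all assumptions, not part of the study described here.

```python
import random


def draw_simple_random_sample(population_ids, sample_size, seed=None):
    """Pick sample_size distinct members, each equally likely to be chosen."""
    members = list(population_ids)
    if sample_size > len(members):
        raise ValueError("the sample cannot be larger than the population")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(members, sample_size)


# Hypothetical population: 500 students numbered 1 to 500.
population = range(1, 501)
sample = draw_simple_random_sample(population, 26, seed=2024)
print(sorted(sample))
```

Recording the seed alongside the study notes lets someone else reproduce exactly which students were selected.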

The data collection instruments may be simple tests or questionnaires. They may be just checklists or even a plain piece of paper, depending on what you want to find out. They may be simple paper forms, and need not be sophisticated ASP-based computerised forms. Thus, the technology need not be a hindrance in creating a sound data collection instrument. I have also found that, although my own need to collect data across different countries has forced me to resort to the latter, there is nothing better than getting the group together, distributing a set of forms and collecting the data at the end. This way you are sure that you have your data. Of course, you realise that this physical grouping of the sample also "interferes" with the actuality of the situation.

In both the experiments described above, the data collection instruments would be simple. In the case of the tests, a sound teacher-made test would return the scores in English and Mathematics that the comparison requires. In the case of the pigeon peas, a simple form recording the dates of harvest and the weight or volume of produce harvested would suffice. I have inserted a sample form on the left. If the experimental design is even a bit more complex, as in a determination of the relationship between violence and levels of education across different locations, then the form must contain some method of recording the location as well as collecting data on both educational levels and tendency for violent action. I am using one such form in my own study to try to establish a correlation between these two, hoping to suggest that by making education more readily available, there will be a reduction in the levels of violence. Of course, this is begging the question. At least it does not form the main hypothesis of the study, so I feel comfortable using it.
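
Whatever the medium, a form of this kind amounts to one record per sample item with a fixed set of fields. A minimal sketch of such a record follows. The field names, the education categories and the 0 to 5 scale for violent tendency are assumptions made for illustration and are not taken from the actual form.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class SurveyRecord:
    respondent_id: int     # number assigned to the sample item
    location: str          # where the respondent was surveyed
    education_level: str   # e.g. "primary", "secondary", "tertiary"
    violence_score: int    # hypothetical 0-5 rating of tendency for violent action


def records_to_csv(records):
    """Tabulate the records as CSV text, one row per respondent."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(SurveyRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Made-up entries, for illustration only.
collected = [
    SurveyRecord(1, "Village A", "secondary", 2),
    SurveyRecord(2, "Town B", "primary", 4),
]
csv_text = records_to_csv(collected)
print(csv_text)
```

Keeping the location as its own column is what later allows the same relationship to be examined separately for each location.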

There are various technologies, such as Active Server Pages (ASP) and PHP (Hypertext Preprocessor), which could be employed to facilitate the data collection process. In order to create ASP forms, you would need to be competent with the technology and have a server onto which you can load your forms and make them accessible. If you do want to use these technologies, I would offer my services to create these forms. You would, however, have to send me the set of questions. I have found that in the case of questionnaires, the simpler the design the better. Also, you should make it quite clear on the questionnaire what the purpose of the form is and how the data will be used. This is especially necessary in cases where your investigations may cause suspicion about your intentions! Do not discount the use of a simple form created and distributed via e-mail. This can also be an effective data collection method, and one that does not demand too much technological know-how. The main idea here is that you need to be able to collect data from your sample.
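
As an illustration of how little technology a simple e-mail form needs, the sketch below builds a plain-text questionnaire that states its purpose and the intended use of the data before the questions. The function name, the wording and the questions are all hypothetical, and the text it produces would simply be pasted into an e-mail message.

```python
def render_questionnaire(title, purpose, data_use, questions):
    """Return a plain-text form: title, purpose, use of data, numbered questions."""
    lines = [
        title,
        "=" * len(title),
        "",
        f"Purpose: {purpose}",
        f"Use of data: {data_use}",
        "",
    ]
    for number, question in enumerate(questions, start=1):
        lines.append(f"{number}. {question}")
        lines.append("   Answer: ____")
        lines.append("")
    return "\n".join(lines)


# Hypothetical wording, for illustration only.
form_text = render_questionnaire(
    title="Study Habits Questionnaire",
    purpose="To find out how much time students spend on homework.",
    data_use="Answers will be summarised as totals; no names will be published.",
    questions=[
        "How many hours of homework do you do each week?",
        "Which subject takes the most time?",
    ],
)
print(form_text)
```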

Needless to say, the questions asked on the data collection form need to relate to the central issues of the study, and moreover need to be put in such a way that they neither lead the person answering the questions nor put them in a compromising position. The data should include some reference to the sample item, whether it is just a number assigned or, in less sensitive situations, the person's name if you want it to be more personal. At any rate, you need to be able to justify that data was collected for all the items in the sample, or else you would have no right to make any inferences about the population and your study would not be worth the effort you are putting into it.
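
Checking that data was collected for every item in the sample is a matter of comparing the reference numbers assigned to the sample with those on the returned forms. A minimal sketch, with a hypothetical function name and made-up reference numbers:

```python
def find_missing_items(sample_ids, response_ids):
    """Compare assigned sample numbers with the numbers on returned forms."""
    expected = set(sample_ids)
    received = set(response_ids)
    missing = sorted(expected - received)      # sampled, but no form returned
    unexpected = sorted(received - expected)   # form returned, but not in the sample
    return missing, unexpected


# Made-up reference numbers, for illustration only.
sample_ids = [1, 2, 3, 4, 5]
returned_ids = [1, 2, 4, 7]
missing, unexpected = find_missing_items(sample_ids, returned_ids)
print("No data yet for:", missing)
print("Not part of the sample:", unexpected)
```

An empty "missing" list is the evidence that every sample item is accounted for. Anything in the "unexpected" list should be investigated before the data is analysed.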

Once the data has been collected, collated and tabulated, the next step is to analyse it. I know that this is a veritable nightmare for many a good student. This is probably because the level of statistical knowledge and technique required is beyond the average student at secondary level, and beyond that level students in their specialisations tend not to take any more statistics courses. At any rate, there is no need to worry, as there is sufficient knowledge and help around for analysis and interpretation. In the next article I shall deal with some aspects of analysis and the interpretation of its results. Once the statistical analysis has revealed its secrets, the experimenter then has to make sense of the results in the context of the study, draw conclusions and make recommendations to suit.
Example (PHP)
Name:
Robert Anthony Geofroy
E-Mail:
[email protected]